I'm new to Pig Latin and I'm trying to replicate a simple SQL query. The sample input table looks like this:
**A B C**
1 3 $5
2 4 $6
2 5 $7
For each value of A, I want to count the rows in column B and sum column C, so that the result is:
**A Count(B) Sum(C)**
1 1 $5
2 2 $13
Or, in SQL:
Select A, count(B), Sum(C)
From Data
Group by A
How can I do this in Pig?
Posted on 2015-08-05 05:53:13
Pig script:
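-- Load the comma-separated input; the schema names the three columns.
input_data = LOAD 'input.csv'
    USING PigStorage(',')
    AS (A:long, B:long, C:long);
-- Group rows by A (the equivalent of SQL's GROUP BY A).
input_data_grp_by_A = GROUP input_data BY A;
-- For each group, count the non-null B values and sum the C values.
required_stats = FOREACH input_data_grp_by_A
    GENERATE group AS A,
        COUNT(input_data.B) AS COUNT_B,
        SUM(input_data.C) AS SUM_C;
Input: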
1,3,5
2,4,6
2,5,7
Output (required_stats):
(1,1,5)
(2,2,13)
https://stackoverflow.com/questions/31823002
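For readers who know SQL but not Pig, it may help to see what GROUP and FOREACH ... GENERATE do to the data. The following is a minimal plain-Python sketch of those semantics, not Pig code and not how Pig executes the script. The variable names `rows` and `groups` are illustrative assumptions, and the data is the sample input with the `$` signs already stripped so that column C is numeric.

```python
from collections import defaultdict

# Rows of (A, B, C), matching the sample input used with the Pig script.
rows = [(1, 3, 5), (2, 4, 6), (2, 5, 7)]

# GROUP input_data BY A: collect every row that shares a key into one "bag".
groups = defaultdict(list)
for a, b, c in rows:
    groups[a].append((a, b, c))

# FOREACH ... GENERATE: emit one output tuple per group.
required_stats = [
    (
        key,
        sum(1 for _, b, _ in bag if b is not None),  # like COUNT: skips nulls
        sum(c for _, _, c in bag if c is not None),  # like SUM: skips nulls
    )
    for key, bag in sorted(groups.items())
]

print(required_stats)  # [(1, 1, 5), (2, 2, 13)]
```

The point of the sketch is that after grouping, each key carries a bag of full rows. That is why the Pig script refers to `input_data.B` and `input_data.C` inside the FOREACH rather than to `B` and `C` directly.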