我试着用AVG,MIN,MAX在猪身上。MIN和MAX函数在执行过程中被卡住,AVG函数抛出一个错误。但计数函数工作正常。
org.apache.pig.backend.executionengine.ExecException:错误0:标量在输出中有多行。1:(2年级教师,{(65587.90)}),2:(四年级教师,{(56567.24)})
我的代码:
register 'pig/contrib/piggybank/java/piggybank.jar';
define Replace org.apache.pig.piggybank.evaluation.string.REPLACE();
A = LOAD '/user/hduser/salaryTravel.csv' using org.apache.pig.piggybank.storage. CSVLoader() AS (name:chararray,job:chararray,salary:chararray,TA:chararray,type:chararray,org:chararray,year:int);
B = foreach A generate name,job,REPLACE(salary,',','') as salary:float, REPLACE(TA,',','') as TA:float, type, org, year;
C = filter B by type=='LBOE';
D = filter C by year==2010;
E = group D by job;
number = foreach E generate group,COUNT(D.salary);
average = foreach E genetate group,AVG(D.salary);
minim = foreach E genetate group,MIN(D.salary);
maxim = foreach E genetate group,MAX(D.salary);样本数据
(ABBOTT,DEEDEE W,GRADES 9-12 TEACHER,52,122.10,0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010)
(ABBOTT,RYAN V,GRADE 4 TEACHER,56,567.24,0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010)
(ABBOUD,CLAUDIA MORA,GRADES K-5 TEACHER,63,957.50,0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010)
(ABDUL-JABBAR,KHADEEJA ,GRADES 9-12 TEACHER,16,791.73,0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010)
(ABDUL-RAZACQ,SALAHUD-DIN ,INSTRUCTIONAL SPECIALIST P-8,45,832.92,0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010)
(ABDULLAH,DIANA ,SPECIAL ED PARAPRO/AIDE,10,934.94,0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010)
(ABDULLAH,NADIYAH W,GRADES 6-8 TEACHER,75,109.92,0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010)
(ABDULLAH,RHONDALYN Y,SPECIAL ED PARAPRO/AIDE,28,649.34,0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010)
(OSBORNE,CHRISTINE L,INSTRUCTIONAL SUPERVISOR,78,875.59,3,265.71,LBOE,COBB COUNTY SCHOOL DISTRICT,2010)
(OSBORNE,DORIS A,OCCUPATIONAL THERAPIST ,65,421.79,1,156.05,LBOE,COBB COUNTY SCHOOL DISTRICT,2010)第7行组操作后的示例数据。
(GRADE 2 TEACHER,{(OSBORNE,VIRGINIA E,GRADE 2 TEACHER,65587.90,0,LBOE,COBB COUNTY SCHOOL DISTRICT,2010)})
(GRADE 4 TEACHER,{(ABBOTT,RYAN V,GRADE 4 TEACHER,56567.24,0,LBOE,ATLANTA INDEPENDENT SCHOOL SYSTEM,2010)})
(MAINTENANCE PERSONNEL,{(BROOKS,RICHARD M,MAINTENANCE PERSONNEL,72655.52,0,LBOE,FULTON COUNTY BOARD OF EDUCATION,2010),(SUMNER,ROBERT O,MAINTENANCE PERSONNEL,72655.53,0,LBOE,FULTON COUNTY BOARD OF EDUCATION,2010),(MCCULLOUGH,ALVIN J,MAINTENANCE PERSONNEL,72655.52,0,LBOE,FULTON COUNTY BOARD OF EDUCATION,2010),(DALTON,JAMES E,MAINTENANCE PERSONNEL,72655.52,2124.60,LBOE,FULTON COUNTY BOARD OF EDUCATION,2010),(SMITH,KEVIN W,MAINTENANCE PERSONNEL,72655.52,0,LBOE,FULTON COUNTY BOARD OF EDUCATION,2010),(MANGHAM,LARRY G,MAINTENANCE PERSONNEL,72655.52,0,LBOE,FULTON COUNTY BOARD OF EDUCATION,2010)})是猪身上的虫子吗?请帮帮我。
发布于 2015-09-02 08:45:31
这是更新的猪脚本。
register 'pig/contrib/piggybank/java/piggybank.jar';
define Replace org.apache.pig.piggybank.evaluation.string.REPLACE();
A = LOAD '/user/hduser/salaryTravel.csv' using org.apache.pig.piggybank.storage. CSVLoader() AS (name:chararray,job:chararray,salary:chararray,TA:chararray,type:chararray,org:chararray,year:int);
B = foreach A generate name,job,REPLACE(salary,',','') as salary, REPLACE(TA,',','') as TA, type, org, year;
B1 = foreach B generate name, job, (double)salary, (double)TA, type, org, year;
C = filter B1 by type=='LBOE';
D = filter C by year==2010;
E = group D by job;
number = foreach E generate group,COUNT(D.salary);
average = foreach E generate group,AVG(D.salary);
minim = foreach E generate group,MIN(D.salary);
maxim = foreach E generate group,MAX(D.salary);问题是,您需要向salary和TA属性提供显式转换。
https://stackoverflow.com/questions/32344934
复制相似问题