Suppose the dataset contains two fields: fields and question time.
fields question time
php,error,gd,image-processing 1235000501
php,error,gd,image-processing 1235000551
lisp,scheme,subjective,clojure 1235000177
lisp,scheme,subjective,clojure 1235001545
lisp,scheme,subjective,clojure 1235002457
lisp,scheme,subjective,clojure 1235002809
lisp,scheme,subjective,clojure 1235003266
lisp,scheme,subjective,clojure 1235007817
lisp,scheme,subjective,clojure 1235007913
lisp,scheme,subjective,clojure 1235020626
lisp,scheme,subjective,clojure 1235040652
I tried the code below:
DEFINE UnixToISO org.apache.pig.piggybank.evaluation.datetime.convert.UnixToISO();
A = LOAD '/user/home/book3.csv' using PigStorage() as (fields:chararray,question_time:long);
B = foreach A generate fields,UnixToISO(question_time * 1000) as temp;
DUMP B;
No change; the output is the same as the input.
C= foreach B generate fields, ToDate(temp) as date_time;
DUMP C;
No change; the output is the same as the input.
D= foreach C generate fields, GetHour(date_time) as hour;
DUMP D;
Again no change; the output is the same as the input. What is wrong in my code?
Posted on 2016-02-12 06:43:29
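I think the sample data is not formatted with the delimiter that the load statement expects between the two columns. PigStorage() with no argument splits on tabs, so a line whose columns are separated by spaces is read as a single field. I took the sample data, removed the spaces between the columns, and replaced them with a tab. With that change I was able to load both columns, convert the timestamp, and get the hour. See the script below.
Script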
A = LOAD 'test4.txt' using PigStorage('\t') AS (fields:chararray,question_time:long);
B = foreach A generate fields,ToDate(question_time * 1000 ) as temp;
C = foreach B generate fields, GetHour(temp) as hour;
DUMP C;
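If editing the input file is not practical, a variant of the same idea is to tell PigStorage to split on a space instead. The following is only a sketch: it assumes the two columns are separated by exactly one space, that the tag list itself never contains a space, and that a header line may be present. The path is the one from the question.

```pig
-- Sketch only: assumes exactly one space between the two columns
A = LOAD '/user/home/book3.csv' USING PigStorage(' ') AS (fields:chararray, question_time:long);
-- Rows whose second column is not a number (such as a header line) load as null; drop them
B = FILTER A BY question_time IS NOT NULL;
-- ToDate expects milliseconds, so the Unix timestamp in seconds is multiplied by 1000
C = FOREACH B GENERATE fields, GetHour(ToDate(question_time * 1000)) AS hour;
DUMP C;
```

The hour that GetHour returns depends on the time zone Pig is configured to use, so the values can differ between clusters.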


https://stackoverflow.com/questions/35347157