Error when reading JSON with elephantbird and Pig

Stack Overflow user
Asked 2012-02-25 06:24:45

I am unable to read a JSON file using ElephantBird and Pig, and I would like to know where I went wrong.

Data:

{ "nrcpts": "1",
  "src": "info@example.com",
  "sendmailid": "p6D0r0u1006229",
  "relay": "app03.example.com",
  "classnumber": "0",
  "msgid": "WARQZCXAEMSSVWPPOOYZXR
LQIKMFUY.155763@example.com",
  "pid": "6229",
  "month": "Jul",
  "time": "20:53:00",
  "day": "12",
  "mailserver": "mail5",
  "size": "57395"
}

Code:

json1 = load '/user/hdetl/funnel/uetsample.dat' using com.twitter.elephantbird.pig.load.JsonLoader();

dat   = FOREACH json1 GENERATE $0#'mailserver' AS mailserver;
dump dat;

Error:

Input(s):
Failed to read data from "/user/hdetl/funnel/uetsample.dat"

detailed error :
Pig Stack Trace
---------------
ERROR 2997: Unable to recreate exception from backed error: Error: in

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias dat. Backend error : Unable to recreate exception from backed error: Error: in
        at org.apache.pig.PigServer.openIterator(PigServer.java:891)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:655)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
        at org.apache.pig.Main.run(Main.java:495)
        at org.apache.pig.Main.main(Main.java:111)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backed error: Error: in
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:221)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:151)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:344)
        at org.apache.pig.PigServer.launchPlan(PigServer.java:1314)
        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1299)
        at org.apache.pig.PigServer.storeEx(PigServer.java:996)
        at org.apache.pig.PigServer.store(PigServer.java:963)
        at org.apache.pig.PigServer.openIterator(PigServer.java:876)

1 Answer

Stack Overflow user

Answered 2014-10-03 20:32:47

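This is a very old post, but some people may run into a similar problem.

I created an input file from the data provided in the question.

I could not load the file because of an unnecessary line break (a stray Enter) inside this value: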

"msgid": "WARQZCXAEMSSVWPPOOYZXR
LQIKMFUY.155763@example.com",
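
One way to spot this kind of problem before running the Pig job is to check whether each line of the input parses as a complete JSON object on its own. The sketch below assumes the loader expects one JSON object per line, which is consistent with the behaviour described in this answer. `find_bad_lines` and the shortened `sample` string are hypothetical names used for illustration only; they are not part of elephant-bird or Pig.

```python
import json


def find_bad_lines(text):
    """Return (line_number, line) pairs for non-empty lines that are
    not a complete JSON object on their own."""
    bad = []
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            value = json.loads(line)
        except json.JSONDecodeError:
            bad.append((number, line))
            continue
        if not isinstance(value, dict):
            # Valid JSON, but not an object (e.g. a bare string or number).
            bad.append((number, line))
    return bad


# Shortened version of the record from the question, including the
# stray line break inside the "msgid" value.
sample = (
    '{ "nrcpts": "1",\n'
    '  "msgid": "WARQZCXAEMSSVWPPOOYZXR\n'
    'LQIKMFUY.155763@example.com",\n'
    '  "mailserver": "mail5"\n'
    '}\n'
)

for number, line in find_bad_lines(sample):
    print(number, repr(line))
```

For a pretty-printed record like the one in the question, every physical line is reported, because none of them is a full object by itself.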

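Fixing that alone did not give the expected result, though. I then removed all line breaks from the file, so that the whole record ended up on a single line.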
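
For more than a handful of records, the same clean-up could be scripted instead of done by hand. This is only a sketch under the assumption that the file holds one or more pretty-printed JSON objects back to back; `to_json_lines` is a hypothetical helper name and `sample` is a shortened copy of the record from the question. Every CR/LF is dropped, including one inside a string value, so the broken `msgid` is joined without a space, as in the manual fix.

```python
import json


def to_json_lines(text):
    """Rewrite concatenated, pretty-printed JSON objects as one compact
    object per line."""
    # Drop every line break, including a stray one inside a string value.
    flat = text.replace("\r", "").replace("\n", "")
    decoder = json.JSONDecoder()
    lines = []
    pos = 0
    while pos < len(flat):
        # raw_decode does not skip leading whitespace, so do it here.
        if flat[pos].isspace():
            pos += 1
            continue
        record, pos = decoder.raw_decode(flat, pos)
        lines.append(json.dumps(record, separators=(",", ":")))
    return "\n".join(lines) + "\n" if lines else ""


sample = (
    '{ "msgid": "WARQZCXAEMSSVWPPOOYZXR\n'
    'LQIKMFUY.155763@example.com",\n'
    '  "mailserver": "mail5"\n'
    '}\n'
)

print(to_json_lines(sample), end="")
```

The output can be written to a new file and that file passed to the `load` statement in place of the original.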

The file then loaded:

dump json1
([time#20:53:00,msgid#WARQZCXAEMSSVWPPOOYZXRLQIKMFUY.155763@example.com,relay#app03.example.com,mailserver#mail5,month#Jul,pid#6229,classnumber#0,day#12,src#info@example.com,sendmailid#p6D0r0u1006229,nrcpts#1,size#57395])

And your FOREACH works:

dat   = FOREACH json1 GENERATE $0#'mailserver' AS mailserver;
dump dat

(mail5)
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/9438934
