
Adding extra arguments to HadoopJarStepConfig fails

Stack Overflow user
Asked 2013-04-27 00:55:04
1 answer, 672 views, 0 followers, 0 votes

I am trying to run this command through the AWS SDK:

hadoop jar /home/hadoop/contrib/streaming/hadoop-streaming.jar -input hdfs:///logs/ -output hdfs:///no_dups -mapper dedup_mapper.py -reducer dedup_reducer.py -file deduplication.py dedup_mapper.py dedup_reducer.py timber.py signature_v4.py

My Java code is:

HadoopJarStepConfig config = new StreamingStep()
        .withInputs("hdfs:///logs")
        .withOutput("hdfs:///no_dups")
        .withMapper("dedup_mapper.py")
        .withReducer("dedup_reducer.py")
        .toHadoopJarStepConfig();

Collection<String> aggs = config.getArgs();
aggs.add("-file deduplication.py timber.py dedup_mapper.py dedup_reducer.py signature_v4.py");
config.setArgs(aggs);

When toString() is called on it, this produces the following AddJobFlowStepsRequest:

{JobFlowId: j-3TDECOMCOO8HE, Steps: [{Name: DeDup, ActionOnFailure: CONTINUE, HadoopJarStep: {Properties: [], Jar: /home/hadoop/contrib/streaming/hadoop-streaming.jar, Args: [-input, hdfs:///logs, -output, hdfs:///no_dups, -mapper, dedup_mapper.py, -reducer, dedup_reducer.py, -file deduplication.py timber.py dedup_mapper.py dedup_reducer.py signature_v4.py], }, }], }

Finally, this is the error I see on the master node:

2013-04-26 16:43:48,116 ERROR org.apache.hadoop.streaming.StreamJob (main): Unrecognized option: -file deduplication.py timber.py dedup_mapper.py dedup_reducer.py signature_v4.p

Oddly, the error log goes on to list the available options, and -file is one of them. Has anyone else run into this problem?

More logs:

2013-04-26T16:43:46.105Z INFO Fetching jar file.

2013-04-26T16:43:47.609Z INFO Working dir /mnt/var/lib/hadoop/steps/9

2013-04-26T16:43:47.609Z INFO Executing /usr/lib/jvm/java-6-sun/bin/java -cp /home/hadoop/conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/home/hadoop:/home/hadoop/hadoop-core-1.0.3.jar:/home/hadoop/hadoop-tools.jar:/home/hadoop/hadoop-tools-1.0.3.jar:/home/hadoop/hadoop-core.jar:/home/hadoop/lib/*:/home/hadoop/lib/jetty-ext/* -Xmx1000m -Dhadoop.log.dir=/mnt/var/log/hadoop/steps/9 -Dhadoop.log.file=syslog -Dhadoop.home.dir=/home/hadoop -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,DRFA -Djava.io.tmpdir=/mnt/var/lib/hadoop/steps/9/tmp -Djava.library.path=/home/hadoop/native/Linux-amd64-64 org.apache.hadoop.util.RunJar /home/hadoop/contrib/streaming/hadoop-streaming.jar -input hdfs:///logs -output hdfs:///no_dups -mapper dedup_mapper.py -reducer dedup_reducer.py -file deduplication.py timber.py dedup_mapper.py dedup_reducer.py signature_v4.py

2013-04-26T16:43:48.611Z INFO Execution ended with ret val 1

2013-04-26T16:43:48.612Z WARN Step failed with bad retval

1 Answer

Stack Overflow user

Accepted answer

Answered 2013-04-29 17:41:14

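The error occurs because the entire string, option and file names together, is passed as a single argument, so Hadoop streaming tries to read all of it as one option name.

The fix is to add the option and its value as separate list entries, like this: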

args.add("-file");
args.add("myfile.txt");
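To illustrate the cause in plain Java, without the AWS SDK: each element of the args list reaches the streaming jar as exactly one command-line token, whereas a shell would have split the same text on whitespace. That is presumably also why the logged command line in the question looks correct, since the log prints the tokens joined by spaces. The class and method names below are hypothetical and exist only for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the AWS SDK or Hadoop.
final class ArgTokens {
    // What the question's code does: option and values kept in ONE list element.
    static List<String> asSingleElement(String text) {
        List<String> args = new ArrayList<>();
        args.add(text);
        return args;
    }

    // Roughly what a shell does with an unquoted command line: split on whitespace.
    static List<String> splitOnWhitespace(String text) {
        return new ArrayList<>(Arrays.asList(text.trim().split("\\s+")));
    }

    public static void main(String[] unused) {
        String text = "-file deduplication.py timber.py";
        System.out.println(asSingleElement(text).size());    // one token
        System.out.println(splitOnWhitespace(text).size());  // three tokens
    }
}
```

The option parser only ever sees the first form, so it looks for an option literally named "-file deduplication.py timber.py" and rejects it.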

If you want to add multiple files, repeat the option once per file:

args.add("-file");
args.add("myfile.txt");
args.add("-file");
args.add("myfile2.txt");

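If you pass the files as a space-separated list in a single argument, the whole string is interpreted as one file name and will likely cause an error.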
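For a longer list of files, the repeated pairs can be generated in a loop. The following is a minimal sketch using only the standard library; `StreamingFileArgs` and `withFiles` are hypothetical names, not part of the AWS SDK.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of the AWS SDK.
final class StreamingFileArgs {
    // Returns a copy of baseArgs with one "-file", name pair appended per file.
    static List<String> withFiles(Collection<String> baseArgs, String... files) {
        List<String> args = new ArrayList<>(baseArgs);
        for (String file : files) {
            args.add("-file"); // the option is its own token
            args.add(file);    // the file name is the next token
        }
        return args;
    }
}
```

Applied to the question's code, the result could then be handed back to the step config, for example `config.setArgs(StreamingFileArgs.withFiles(config.getArgs(), "deduplication.py", "timber.py"))`, assuming `getArgs()` and `setArgs()` behave as they are used in the question.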

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/16241520
