Finding the path of a jar file in GCP

Stack Overflow user
Asked on 2019-11-04 03:22:28
Answers: 1 · Views: 440 · Followers: 0 · Votes: 0

I need to find the path of the hadoop-streaming-1.2.1.jar file on Google Cloud Platform.

https://github.com/devangpatel01/TF-IDF-implementation-using-map-reduce-Hadoop-python-

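I am trying to run this MapReduce job with Hadoop on GCP, but I cannot find the path to hadoop-streaming-1.2.1.jar. I tried downloading the jar manually, uploading it to Hadoop, and then running mapper1.py, but I get an error saying the path is wrong. The program above runs on my local machine. How should I edit the command so that it runs on GCP?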

hdfs://cluster-29-m/input_prgs/input_prgs/mapper1.py hdfs://cluster-29-m/input_prgs/input_prgs/input1/000000_0 jar hdfs://cluster-29-m/input_prgs/input_prgs/output1 hdfs://cluster-29-m/input_prgs/input_prgs/mapper1.py hdfs://cluster-29-m/input_prgs/input_prgs/reducer1.py -input -output hadoop -mapper -reducer hadoop


1 Answer

Stack Overflow user

Answered on 2019-11-14 19:22:23

I used a different mapper-reducer program and was able to run MapReduce. I took the code from https://github.com/SatishUC15/TFIDF-HadoopMapReduce#tfidf-hadoop and ran the following commands on my GCP cluster.

> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -file /home/kirthyodackal/MapperPhaseOne.py /home/kirthyodackal/ReducerPhaseOne.py -mapper "python MapperPhaseOne.py" -reducer "python ReducerPhaseOne.py" -input hdfs://cluster-3299-m/mapinput/inputfile -output hdfs://cluster-3299-m/mappred1

> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -file /home/kirthyodackal/MapperPhaseTwo.py /home/kirthyodackal/ReducerPhaseTwo.py -mapper "python MapperPhaseTwo.py" -reducer "python ReducerPhaseTwo.py" -input hdfs://cluster-3299-m/mappred1/part-00000 hdfs://cluster-3299-m/mappred1/part-00001 hdfs://cluster-3299-m/mappred1/part-00002 hdfs://cluster-3299-m/mappred1/part-00003 hdfs://cluster-3299-m/mappred1/part-00004  -output hdfs://cluster-3299-m/mappred2

> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -file /home/kirthyodackal/MapperPhaseThree.py /home/kirthyodackal/ReducerPhaseThree.py -mapper "python MapperPhaseThree.py" -reducer "python ReducerPhaseThree.py" -input hdfs://cluster-3299-m/mappred2/part-00000 hdfs://cluster-3299-m/mappred2/part-00001 hdfs://cluster-3299-m/mappred2/part-00002 hdfs://cluster-3299-m/mappred2/part-00003 hdfs://cluster-3299-m/mappred2/part-00004  -output hdfs://cluster-3299-m/mappredf
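The commands above point `hadoop jar` at /usr/lib/hadoop-mapreduce/hadoop-streaming.jar on the cluster node. If the jar lives somewhere else on your image, a search like the one below may help locate it. This is only a sketch: the helper name `find_streaming_jar` is hypothetical, and the default search root /usr/lib is an assumption taken from the path used above.

```shell
# Sketch: list Hadoop streaming jars under a directory tree.
# find_streaming_jar is a hypothetical helper name; the default root
# /usr/lib is an assumption based on the jar path used in the commands above.
find_streaming_jar() {
  local root="${1:-/usr/lib}"
  # The pattern matches hadoop-streaming.jar as well as versioned names
  # such as hadoop-streaming-1.2.1.jar. Errors from unreadable
  # directories are discarded.
  find "$root" -name 'hadoop-streaming*.jar' 2>/dev/null | sort
}
```

Run `find_streaming_jar` on the master node (or `find_streaming_jar /some/other/root`) and pass one of the printed paths to `hadoop jar`. As far as I know, `hadoop jar` expects the jar itself on the node's local filesystem, so giving it an hdfs:// URI for the jar is one possible cause of a "wrong path" error.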

The following link outlines how I used MapReduce on GCP: https://github.com/kirthy21/Data-Analysis-Stack-Exchange-Hadoop-Pig-Hive-MapReduce-TFIDF
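The three streaming commands in this answer follow one pattern: each phase reads what the previous phase wrote. Hadoop streaming's `-input` also accepts a directory, so a later phase can be given the previous output directory rather than each part-0000N file, assuming the listed part files are that phase's entire output. The sketch below only prints the commands so they can be reviewed before running; `streaming_cmd` is a hypothetical helper, the paths are copied from the commands in this answer, and it uses one `-file` flag per script.

```shell
# Sketch: print (not run) one streaming command per phase, feeding each
# phase the previous phase's output directory.
# JAR, SCRIPTS and FS are copied from the commands in this answer;
# streaming_cmd is a hypothetical helper name.
JAR=/usr/lib/hadoop-mapreduce/hadoop-streaming.jar
SCRIPTS=/home/kirthyodackal
FS=hdfs://cluster-3299-m

streaming_cmd() {
  # $1 = phase name (One, Two, Three), $2 = input path, $3 = output path
  printf 'hadoop jar %s -file %s/MapperPhase%s.py -file %s/ReducerPhase%s.py -mapper "python MapperPhase%s.py" -reducer "python ReducerPhase%s.py" -input %s -output %s\n' \
    "$JAR" "$SCRIPTS" "$1" "$SCRIPTS" "$1" "$1" "$1" "$2" "$3"
}

streaming_cmd One   "$FS/mapinput/inputfile" "$FS/mappred1"
streaming_cmd Two   "$FS/mappred1"           "$FS/mappred2"
streaming_cmd Three "$FS/mappred2"           "$FS/mappredf"
```

Each printed line can be pasted into the shell on the master node once the paths have been adjusted to your own cluster name and home directory.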

Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/58683808
