Q: Spark/Yarn: FileNotFoundException

Stack Overflow user

Asked 2018-08-22 16:43:58

Answers 1 · Views 807 · Followers 0 · Votes 1

I am running the following code in Spark.

scala> import com.databricks.spark.xml.XmlInputFormat
scala> import org.apache.hadoop.io._
scala> sc.hadoopConfiguration.set(XmlInputFormat.START_TAG_KEY, "<mytag>")
scala> sc.hadoopConfiguration.set(XmlInputFormat.END_TAG_KEY, "</mytag>")
scala> sc.hadoopConfiguration.set(XmlInputFormat.ENCODING_KEY, "utf-8")
scala> val record1 = sc.newAPIHadoopFile("file:///home/myuser/myfile.xml", classOf[XmlInputFormat], classOf[LongWritable], classOf[Text])

When I run it with the master set to local, it works fine.

spark2-shell --jars spark-xml_2.10-0.4.1.jar --master local[*]

But when I try to run it on YARN, it fails with java.io.FileNotFoundException: File file:/home/myuser/myfile.xml does not exist.

spark2-shell --jars spark-xml_2.10-0.4.1.jar --master yarn

I tried adding --deploy-mode, set to both client and cluster, but neither worked.

apache-spark

hadoop-yarn

filenotfoundexception

1 Answer

Stack Overflow user

Answered 2018-08-23 02:30:46

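The file file:///home/myuser/myfile.xml appears to be accessible only on your driver, not on your executors. Either A) manually put the file on HDFS and read it from there,

or B) use the --files option of spark2-shell, which will automatically upload the file to HDFS for you: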

spark2-shell --files /home/myuser/myfile.xml --jars spark-xml_2.10-0.4.1.jar --master yarn

Then:

val record1 = sc.newAPIHadoopFile(org.apache.spark.SparkFiles.get("myfile.xml"), classOf[XmlInputFormat], classOf[LongWritable],classOf[Text])
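
For option A, here is a minimal sketch to paste into spark2-shell. It assumes the shell runs in client mode on the machine that holds the file, and that the cluster's default filesystem (fs.defaultFS) is HDFS. The destination /user/myuser/myfile.xml is a hypothetical path, so substitute a directory you can write to.

```scala
import com.databricks.spark.xml.XmlInputFormat
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.io.{LongWritable, Text}

// Local source on the driver machine (from the question).
val localSrc = new Path("/home/myuser/myfile.xml")
// Hypothetical destination on the cluster filesystem; adjust as needed.
val hdfsDst = new Path("/user/myuser/myfile.xml")

// Copy the file from the driver's local disk to the default filesystem,
// so that every executor can open it.
val fs = FileSystem.get(sc.hadoopConfiguration)
fs.copyFromLocalFile(localSrc, hdfsDst)

sc.hadoopConfiguration.set(XmlInputFormat.START_TAG_KEY, "<mytag>")
sc.hadoopConfiguration.set(XmlInputFormat.END_TAG_KEY, "</mytag>")
sc.hadoopConfiguration.set(XmlInputFormat.ENCODING_KEY, "utf-8")

// Read via the fully qualified URI (scheme and authority included)
// instead of a file:// path that exists only on the driver.
val record1 = sc.newAPIHadoopFile(
  fs.makeQualified(hdfsDst).toString,
  classOf[XmlInputFormat],
  classOf[LongWritable],
  classOf[Text])
```

The copy can also be done outside the shell with the hdfs dfs -put command. Either way, the point is that the path passed to newAPIHadoopFile must resolve on the executors, not just on the driver.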

Votes 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/51962954

