For the shell command, I want to specify a file for the `spark.es.query` parameter:
$SPARK_HOME/bin/spark-shell --master local[4] \
--jars ~/spark/jars/elasticsearch-spark-20_2.11-5.1.2.jar \
--conf spark.es.nodes="localhost" --conf spark.es.resource="myindex/mytype" \
--conf spark.es.query="/home/pat/spark/myquery.json"

In the shell:
scala> import org.elasticsearch.spark._
scala> val es_rdd = sc.esRDD("myindex/mytype")
scala> es_rdd.first()

The output I get:
17/02/04 07:41:31 ERROR TaskContextImpl: Error in TaskCompletionListener
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot determine
specified query - doesn't appear to be URI or JSON based and location
[/home/pat/spark/myquery.json] cannot be opened

The file certainly exists at that path. Is this a good way to specify a query file?
Posted on 2017-02-04 13:11:25
You get this error because Spark and the es-hadoop connector expect the file path to be passed as a URI:
$SPARK_HOME/bin/spark-shell --master local[4] \
--jars ~/spark/jars/elasticsearch-spark-20_2.11-5.1.2.jar \
--conf spark.es.nodes="localhost" \
--conf spark.es.resource="myindex/mytype" \
--conf spark.es.query="file:///home/pat/spark/myquery.json"

https://stackoverflow.com/questions/42040585
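As a sanity check, the standard `java.nio.file.Path#toUri` method produces exactly this `file://` form from a plain local path, so you can derive the value programmatically rather than typing it by hand. A minimal sketch (the path is the one from the question; the output shown assumes a Unix filesystem):

```scala
import java.nio.file.Paths

object QueryUri {
  def main(args: Array[String]): Unit = {
    // Convert the local path from the question into the URI form
    // that es.query accepts when pointing at an external resource.
    val uri = Paths.get("/home/pat/spark/myquery.json").toUri.toString
    println(uri) // file:///home/pat/spark/myquery.json
  }
}
```

Using the `file://` scheme tells the connector the value is an external resource rather than an inline query string or Query DSL JSON, which is why the bare path in the original command could not be classified and raised `EsHadoopIllegalArgumentException`.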