
Running the Python example for Apache Spark Streaming

Stack Overflow user
Asked 2017-09-20 20:39:47
Answers 1 | Views 1.2K | Followers 0 | Votes 1

I am trying to run the Python Spark Streaming job that ships in the examples directory:

https://spark.apache.org/docs/2.1.1/streaming-programming-guide.html

"""
 Counts words in UTF8 encoded, '\n' delimited text received from the network every second.
 Usage: kafka_wordcount.py <zk> <topic>
 To run this on your local machine, you need to setup Kafka and create a producer first, see
 http://kafka.apache.org/documentation.html#quickstart

 and then run the example
    `$ bin/spark-submit --jars \
      external/kafka-assembly/target/scala-*/spark-streaming-kafka-assembly-*.jar \
      examples/src/main/python/streaming/kafka_wordcount.py \
      localhost:2181 test`
"""
from __future__ import print_function

import sys

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: kafka_wordcount.py <zk> <topic>", file=sys.stderr)
        sys.exit(-1)

    sc = SparkContext(appName="PythonStreamingKafkaWordCount")
    ssc = StreamingContext(sc, 1)

    zkQuorum, topic = sys.argv[1:]
    kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", {topic: 1})
    lines = kvs.map(lambda x: x[1])
    counts = lines.flatMap(lambda line: line.split(" ")) \
        .map(lambda word: (word, 1)) \
        .reduceByKey(lambda a, b: a+b)
    # counts.pprint()

    ssc.start()
    ssc.awaitTermination()

I downloaded spark-streaming-kafka-0-8_2.11-2.1.0.jar to a local directory and ran my spark-submit command:

bin/spark-submit --jars ../external/spark-streaming-kafka*.jar examples/src/main/python/streaming/kafka_wordcount.py localhost:2181 test

I got the following error:

Exception in thread "Thread-3" java.lang.NoClassDefFoundError: kafka/common/TopicAndPartition

1 Answer

Stack Overflow user

Accepted answer

Answered 2017-09-20 22:38:27

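You need to use the spark-streaming-kafka-assembly jar, not the plain spark-streaming-kafka jar. The assembly jar bundles all of the dependencies (including the Kafka client), which is where the missing kafka/common/TopicAndPartition class lives.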
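
If you want to check which jar you actually have, here is a minimal sketch. A jar is just a zip archive, so Python's standard zipfile module can tell you whether a given jar contains the class named in the NoClassDefFoundError. The helper name jar_contains_class is hypothetical (it is not part of Spark or Kafka), and the jar paths you pass in are whatever you downloaded locally.

```python
import zipfile

# Class reported as missing in the NoClassDefFoundError above.
MISSING_CLASS = "kafka/common/TopicAndPartition"


def jar_contains_class(jar_path, class_name=MISSING_CLASS):
    """Return True if the jar at jar_path holds the given class.

    class_name may be written with dots or slashes, e.g.
    "kafka.common.TopicAndPartition" or "kafka/common/TopicAndPartition".
    Hypothetical helper for illustration only.
    """
    # Classes are stored inside a jar as <package path>/<ClassName>.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Calling jar_contains_class on the jar you pass to --jars should return False for a jar that lacks the Kafka client classes and True for one that bundles them. This only inspects the archive contents; it does not verify version compatibility with your Spark build.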

Votes 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/46331401
