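I have installed DataStax Enterprise 4.6 on my cluster, but I can't figure out why pyspark throws this error. The Scala interface works fine, but the Python one does not. Does anyone know how to fix this?
Python 2.6.6, CentOS 6.5
Cheers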
bash-4.1$ dse pyspark --master spark://IP:7077
Python 2.6.6 (r266:84292, Jan 22 2014, 01:49:05)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
  File "/usr/share/dse/spark/python/pyspark/shell.py", line 33, in <module>
    import pyspark
  File "/usr/share/dse/spark/python/pyspark/__init__.py", line 63, in <module>
    from pyspark.context import SparkContext
  File "/usr/share/dse/spark/python/pyspark/context.py", line 34, in <module>
    from pyspark import rdd
  File "/usr/share/dse/spark/python/pyspark/rdd.py", line 1972
    return {convertColumnValue(v) for v in columnValue}
                                    ^
SyntaxError: invalid syntax
>>>
Posted on 2015-01-14 00:19:44
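The PySpark support included in DSE 4.6 requires Python 2.7.x and will throw the error you are seeing on Python 2.6.x. The failing line in rdd.py is a set comprehension, a syntax introduced in Python 2.7, which is why the 2.6 parser rejects it. An upcoming patch release should fix the issue for Python 2.6.x. There is no specific date yet.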
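To illustrate the incompatibility, here is a minimal sketch of the construct the traceback points at, next to a spelling that Python 2.6 also accepts. It is an illustration only, not the DSE patch: `convert_column_value`, `column_value` and `python_is_supported` are hypothetical stand-ins and are not taken from the DSE sources.

```python
import sys


def convert_column_value(v):
    # Hypothetical stand-in for the convertColumnValue helper seen in the traceback.
    return str(v)


def python_is_supported(version_info=sys.version_info):
    # Set comprehensions need Python 2.7 or later.
    return tuple(version_info[:2]) >= (2, 7)


column_value = [1, 2, 2, 3]

# Python 2.7+ only; on 2.6 this line is a SyntaxError at import time:
#   converted = {convert_column_value(v) for v in column_value}

# Equivalent form that also parses on Python 2.6: set() over a generator expression.
converted = set(convert_column_value(v) for v in column_value)

print(sorted(converted))
print(python_is_supported())
```

Because the error is raised while the module is being parsed, no runtime check inside rdd.py can avoid it. On an unpatched DSE 4.6 the interpreter itself has to be 2.7.x.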
https://stackoverflow.com/questions/27921270