PySpark works fine for me with Python 2.7. I installed Python 3.5.1 (built from source), and when I run pyspark in the terminal I get this error:
Python 3.5.1 (default, Apr 25 2016, 12:41:28)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/shell.py", line 30, in <module>
import pyspark
File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/__init__.py", line 41, in <module>
from pyspark.context import SparkContext
File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/context.py", line 28, in <module>
from pyspark import accumulators
File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/accumulators.py", line 98, in <module>
from pyspark.serializers import read_int, PickleSerializer
File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/serializers.py", line 58, in <module>
import zlib
ImportError: No module named 'zlib'我试过python 3.4.3,它也工作得很好
Posted on 2016-05-05 07:45:54
Have you checked to make sure zlib is actually present in your Python installation? It should be there by default, but strange things do happen.
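A quick way to check, plus a rebuild sketch in case the module is missing. CPython built from source skips the zlib module when the zlib headers are not installed at compile time; the package name below is the Debian/Ubuntu one, and the python3.5 binary name and Python-3.5.1 source directory are assumptions based on the question:

# Check whether the interpreter PySpark uses can import zlib at all
python3.5 -c "import zlib; print(zlib.ZLIB_VERSION)"

# If the import fails, install the zlib headers and rebuild Python
sudo apt-get install zlib1g-dev
cd Python-3.5.1
./configure
make
sudo make install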
Posted on 2017-08-02 20:57:50
Did you put the exact path of your system Python 3.5.1 into PYSPARK_PYTHON in your .bashrc file?
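For example, something like this in ~/.bashrc (the interpreter path is an assumption; point it at wherever your 3.5.1 build was actually installed):

export PYSPARK_PYTHON=/usr/local/bin/python3.5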
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.1.1
      /_/

Using Python version 3.6.1 (default, Jun 23 2017 16:20:09)
SparkSession available as 'spark'.

This is what my PySpark prompt shows. The Apache Spark version is 2.1.1.
P.S.: I use Anaconda3 (Python 3.6.1) for my day-to-day PySpark code, with PYSPARK_DRIVER_PYTHON set to 'jupyter'.
The example above uses my default system Python 3.6.
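For reference, that driver-side setup would look something like this in ~/.bashrc. The PYSPARK_DRIVER_PYTHON_OPTS line and the Anaconda install path are assumptions, shown only as a common pairing with the jupyter driver:

export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook
export PYSPARK_PYTHON=$HOME/anaconda3/bin/python3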
Posted on 2018-05-15 23:14:16
Try conda install -c conda-forge pyspark. In case the problem still persists, you may need to change your ~/.bashrc.
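After installing, a quick sanity check that the conda-provided package is the one being picked up (assuming the conda environment's python is first on your PATH):

python -c "import pyspark; print(pyspark.__version__)"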
https://stackoverflow.com/questions/36834732