
pandas version not updated after installing a new version on Databricks

Asked by a Stack Overflow user on 2020-09-10 09:35:33
1 answer · 1.9K views · 0 followers · 5 votes

I am trying to resolve a pandas problem that appears when I run Python 3.7 code on Databricks.

The error is:

 ImportError: cannot import name 'roperator' from 'pandas.core.ops' (/databricks/python/lib/python3.7/site-packages/pandas/core/ops.py)

The pandas version is:

pd.__version__
0.24.2

I can run

 from pandas.core.ops import roperator

on my laptop, which has

pandas 0.25.1

So I tried to upgrade pandas on Databricks.

%sh pip uninstall -y pandas
Successfully uninstalled pandas-1.1.2

%sh pip install pandas==0.25.1
 Collecting pandas==0.25.1
 Downloading pandas-0.25.1-cp37-cp37m-manylinux1_x86_64.whl (10.4 MB)
 Requirement already satisfied: python-dateutil>=2.6.1 in /databricks/conda/envs/databricks-ml/lib/python3.7/site-packages (from pandas==0.25.1) (2.8.0)
 Requirement already satisfied: numpy>=1.13.3 in /databricks/conda/envs/databricks-ml/lib/python3.7/site-packages (from pandas==0.25.1) (1.16.2)
 Requirement already satisfied: pytz>=2017.2 in /databricks/conda/envs/databricks-ml/lib/python3.7/site-packages (from pandas==0.25.1) (2018.9)
 Requirement already satisfied: six>=1.5 in /databricks/conda/envs/databricks-ml/lib/python3.7/site-packages (from python-dateutil>=2.6.1->pandas==0.25.1) (1.12.0)
 Installing collected packages: pandas
 ERROR: After October 2020 you may experience errors when installing or updating packages. 
  This is because pip will change the way that it resolves dependency conflicts.

  We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.

  mlflow 1.8.0 requires alembic, which is not installed.
  mlflow 1.8.0 requires prometheus-flask-exporter, which is not installed.
  mlflow 1.8.0 requires sqlalchemy<=1.3.13, which is not installed.
  sklearn-pandas 2.0.1 requires numpy>=1.18.1, but you'll have numpy 1.16.2 which is incompatible.
   sklearn-pandas 2.0.1 requires pandas>=1.0.5, but you'll have pandas 0.25.1 which is incompatible.
   sklearn-pandas 2.0.1 requires scikit-learn>=0.23.0, but you'll have scikit-learn 0.20.3 which is incompatible.
   sklearn-pandas 2.0.1 requires scipy>=1.4.1, but you'll have scipy 1.2.1 which is incompatible.
   Successfully installed pandas-0.25.1

When I run:

import pandas as pd
pd.__version__

it is still:

 0.24.2

Am I missing something?

Thanks


1 Answer

Stack Overflow user

Accepted answer

Answered 2020-09-10 17:19:17

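Installing libraries through a cluster initialization script is strongly recommended. The %sh command runs only on the driver node, not on the executor nodes, and it does not affect a Python instance that is already running.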
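One way to see this from a notebook is to check which interpreter the notebook is running and where the module was actually loaded from. In the question, pip reports packages under /databricks/conda/envs/databricks-ml/ while the traceback points at /databricks/python/lib/python3.7/site-packages, which may indicate that the two are not using the same environment. The helper below is a minimal sketch; `describe_module` is a hypothetical name, not part of Databricks or pandas.

```python
import importlib
import sys


def describe_module(name):
    """Report the running interpreter and where `name` was imported from."""
    module = importlib.import_module(name)
    return {
        "executable": sys.executable,
        # Not every module defines __version__, so fall back to None.
        "version": getattr(module, "__version__", None),
        "file": getattr(module, "__file__", None),
    }


# In a notebook cell one would call describe_module("pandas").
# "json" is used here only so the sketch runs without pandas installed.
info = describe_module("json")
print(info["executable"])
print(info["file"])
```

If the reported file path does not sit under the environment that pip installed into, the notebook is importing a different copy of the package.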

The correct solution is to use the dbutils.library commands, like this:

dbutils.library.installPyPI("pandas", "1.0.1")
dbutils.library.restartPython()

This installs the library everywhere, but Python must be restarted to pick up the new library.
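The need for a restart follows from how Python caches imports: once a module has been imported, later `import` statements return the cached object from `sys.modules`, even if the files on disk have changed. The sketch below illustrates this with a throwaway module; `fake_pandas_demo` is a hypothetical stand-in for pandas, and clearing `sys.modules` only approximates a real interpreter restart.

```python
import importlib
import os
import sys
import tempfile

# Avoid writing .pyc files so the demo always reads the source on disk.
sys.dont_write_bytecode = True

name = "fake_pandas_demo"  # hypothetical stand-in for pandas
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, name + ".py")

# "Install" the old version and import it.
with open(path, "w") as handle:
    handle.write('__version__ = "0.24.2"\n')
sys.path.insert(0, workdir)
before = importlib.import_module(name).__version__

# "Upgrade" the package on disk while the interpreter keeps running.
with open(path, "w") as handle:
    handle.write('__version__ = "0.25.1"  # upgraded\n')

# Importing again returns the cached module, so the old version remains.
after_reimport = importlib.import_module(name).__version__

# Dropping the cache entry approximates a fresh Python process.
del sys.modules[name]
after_restart = importlib.import_module(name).__version__

print(before, after_reimport, after_restart)
```

For a package as large as pandas, removing entries from `sys.modules` by hand is not a reliable substitute for restarting Python, which is why the answer restarts the interpreter instead.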

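Also, although you can specify only the package name, it is better to specify the version explicitly, because some library versions may be incompatible with the runtime. Also consider using a newer runtime that already ships updated library versions. Check the release notes for runtimes to see which library versions are installed out of the box.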
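After pinning a version, it can help to fail early if the notebook still sees an older one. The sketch below compares dotted version strings; `version_tuple` and `is_at_least` are hypothetical helpers, and the 0.25.0 minimum is an assumption drawn from the question (the import works on 0.25.1 and fails on 0.24.2), not a documented requirement. The parsing is deliberately simple, and the `packaging` library handles pre-release and other version formats more robustly.

```python
import re


def version_tuple(version):
    """Turn '0.25.1' into (0, 25, 1), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def _pad(parts, width):
    """Right-pad a version tuple with zeros so '1.0' compares equal to '1.0.0'."""
    return parts + (0,) * (width - len(parts))


def is_at_least(installed, minimum):
    """Return True if `installed` is the same as or newer than `minimum`."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    return _pad(a, width) >= _pad(b, width)


# Assumed minimum, inferred from the question rather than from pandas docs.
ASSUMED_MINIMUM = "0.25.0"

for seen in ("0.24.2", "0.25.1", "1.0.1"):
    print(seen, is_at_least(seen, ASSUMED_MINIMUM))
```

In a notebook, the string passed as `installed` would be `pd.__version__`.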

On newer Databricks runtimes you can use the new magic commands %pip and %conda to install dependencies. See the documentation for more details.

Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/63821633
