Pandas Modin: Ray fails to start

Stack Overflow user
Asked 2022-02-09 22:43:22
Answers: 1 · Views: 935 · Followers: 0 · Votes: 0

I am trying to speed up my pandas data processing with Modin.

import os
os.environ["MODIN_ENGINE"] = "ray"
import modin.pandas as pd

df = pd.read_csv(r"C:\Users\Harshad\Documents\Files\Data\Pre-processed\data.csv", low_memory=False)

I get the following warning and error:

UserWarning: Ray execution environment not yet initialized. Initializing...
To remove this warning, run the following python code before doing dataframe operations:

    import ray
    ray.init()

Traceback (most recent call last):
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\node.py", line 240, in __init__
    self.redis_password)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\_private\services.py", line 328, in wait_for_node
    raise TimeoutError("Timed out while waiting for node to startup.")
TimeoutError: Timed out while waiting for node to startup.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/Harshad/Documents/Code/data.py", line 18, in <module>
    low_memory=False)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\io.py", line 135, in read_csv
    return _read(**kwargs)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\io.py", line 58, in _read
    Engine.subscribe(_update_engine)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\config\pubsub.py", line 213, in subscribe
    callback(cls)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\__init__.py", line 127, in _update_engine
    initialize_ray()
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\core\execution\ray\common\utils.py", line 185, in initialize_ray
    ray.init(**ray_init_kwargs)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\_private\client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\worker.py", line 922, in init
    ray_params=ray_params)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\node.py", line 243, in __init__
    "The current node has not been updated within 30 "
Exception: The current node has not been updated within 30 seconds, this could happen because of some of the Ray processes failed to startup.

This happens even though I clearly waited more than 30 seconds between reruns of the code.

The first time I ran it after installing modin and ray, it worked fairly well, with only the following warning:

UserWarning: Ray execution environment not yet initialized. Initializing...
To remove this warning, run the following python code before doing dataframe operations:

    import ray
    ray.init()

I then changed the code to:

import os
os.environ["MODIN_ENGINE"] = "ray"
import modin.pandas as pd
import ray
ray.init()
df = pd.read_csv(r"C:\Users\Harshad\Documents\Files\Data\Pre-processed\data.csv", low_memory=False)

I get this error:

Traceback (most recent call last):
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\node.py", line 240, in __init__
    self.redis_password)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\_private\services.py", line 328, in wait_for_node
    raise TimeoutError("Timed out while waiting for node to startup.")
TimeoutError: Timed out while waiting for node to startup.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/Harshad/Documents/Code/data.py", line 18, in <module>
    low_memory=False)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\io.py", line 135, in read_csv
    return _read(**kwargs)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\io.py", line 58, in _read
    Engine.subscribe(_update_engine)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\config\pubsub.py", line 213, in subscribe
    callback(cls)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\pandas\__init__.py", line 127, in _update_engine
    initialize_ray()
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\modin\core\execution\ray\common\utils.py", line 185, in initialize_ray
    ray.init(**ray_init_kwargs)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\_private\client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\worker.py", line 922, in init
    ray_params=ray_params)
  File "C:\Users\Harshad\Documents\pythonProject\venv\lib\site-packages\ray\node.py", line 243, in __init__
    "The current node has not been updated within 30 "
Exception: The current node has not been updated within 30 seconds, this could happen because of some of the Ray processes failed to startup

When I looked up this issue on GitHub, I found that it is reported as a bug.

How can I resolve these warnings and errors?

Edit: I restarted my PyCharm environment, and that allowed one successful rerun. Does this suggest it is a PyCharm/environment issue?

How can I fix this?

1 Answer

Stack Overflow user

Accepted answer

Posted on 2022-02-09 23:35:10

Try initializing Ray before importing Modin:

import os
os.environ["MODIN_ENGINE"] = "ray"
import ray
ray.init()
import modin.pandas as pd
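
The ordering above can also be wrapped in a small helper. This is only a sketch, not part of the original answer: `select_ray_engine` and `load_modin_with_ray` are hypothetical names, and the `ray.is_initialized()` guard assumes that part of the trouble comes from a Ray session left over in a reused interpreter or console, which the question's PyCharm restart hints at but does not confirm.

```python
import os


def select_ray_engine() -> str:
    """Tell Modin to use Ray; call this before importing modin.pandas."""
    os.environ["MODIN_ENGINE"] = "ray"
    return os.environ["MODIN_ENGINE"]


def load_modin_with_ray():
    """Hypothetical helper: start Ray first, then import Modin.

    The imports are deferred so nothing is started until this is called.
    """
    select_ray_engine()
    import ray

    # Skip ray.init() if this interpreter already has a Ray session
    # (for example, a console that was reused between runs).
    if not ray.is_initialized():
        ray.init()

    import modin.pandas as pd
    return pd, ray


# Usage (requires modin and ray to be installed):
#   pd, ray = load_modin_with_ray()
#   df = pd.read_csv("data.csv")   # hypothetical path
#   ray.shutdown()                 # release Ray's processes when done
```

Calling `ray.shutdown()` at the end of a script may help avoid stale Ray processes between runs, though whether that addresses the timeout in the question is an assumption.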

Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/71057731
