When I run Luigi tasks, the framework sometimes crashes, and every task after that fails. Here is the error log:
2017-10-05 22:02:02,564 luigi-interface WARNING Failed pinging scheduler
2017-10-05 22:02:03,129 requests.packages.urllib3.connectionpool INFO Starting new HTTP connection (126): localhost
2017-10-05 22:02:03,130 luigi-interface ERROR Failed connecting to remote scheduler 'http://localhost:8082'
Traceback (most recent call last):
...
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 585, in send
r = adapter.send(request, **kwargs)
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/requests/adapters.py", line 467, in send
raise ConnectionError(e, request=request)
ConnectionError: HTTPConnectionPool(host='localhost', port=8082): Max retries exceeded with url: /api/add_worker (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f15128cb3d0>: Failed to establish a new connection: [Errno 111] Connection refused',))
2017-10-05 22:02:03,180 luigi-interface INFO Worker Worker(salt=150908931, workers=3, host=etl2, username=develop, pid=18019) was stopped. Shutting down Keep-Alive thread
Traceback (most recent call last):
File "app_metadata.py", line 1567, in <module>
luigi.run()
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/interface.py", line 210, in run
return _run(*args, **kwargs)['success']
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/interface.py", line 238, in _run
return _schedule_and_run([cp.get_task_obj()], worker_scheduler_factory)
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/interface.py", line 197, in _schedule_and_run
success &= worker.run()
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/worker.py", line 867, in run
self._add_worker()
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/worker.py", line 652, in _add_worker
self._scheduler.add_worker(self._id, self._worker_info)
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/rpc.py", line 219, in add_worker
return self._request('/api/add_worker', {'worker': worker, 'info': info})
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/rpc.py", line 146, in _request
page = self._fetch(url, body, log_exceptions, attempts)
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/rpc.py", line 138, in _fetch
last_exception
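luigi.rpc.RPCError: Errors (3 attempts) when connecting to remote scheduler 'http://localhost:8082'
It looks like the worker tried to ping the central scheduler, failed, and then crashed. After that, all later tasks are blocked and cannot run successfully.
Others have hit a similar error, but the solution posted there does not work for me: GitHub - Failed connecting to remote scheduler #1894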
Answered 2017-10-11 00:09:12
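If your central scheduler is overloaded, I would try making the connection timeout a bit longer. You can also increase the number of retry attempts and the wait between retries.
In luigi.cfg:
[core]
# default is 10.0
rpc-connect-timeout=60.0
# default is 3
rpc-retry-attempts=10
# default is 30
rpc-retry-wait=60
You may also want to add a watchdog so that the scheduler process is restarted automatically if it crashes.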
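A watchdog of the kind suggested above could be sketched roughly as follows. This is only an illustration, not part of Luigi: the function names `scheduler_is_up` and `watch`, the 30-second polling interval, and the `luigid --port 8082` start command are all assumptions chosen to match the `localhost:8082` address in the log. In production a process supervisor would usually be a better fit than a hand-written loop.

```python
import socket
import subprocess
import time


def scheduler_is_up(host="localhost", port=8082, timeout=2.0):
    """Return True if a TCP connection to the scheduler port succeeds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Covers "[Errno 111] Connection refused" as seen in the log.
        return False
    sock.close()
    return True


def watch(start_cmd, host="localhost", port=8082, interval=30.0,
          max_checks=None, is_up=scheduler_is_up,
          start=subprocess.Popen, sleep=time.sleep):
    """Poll the scheduler port and run start_cmd whenever it is unreachable.

    max_checks=None polls forever; a number limits the loop (handy for tests).
    Returns how many times start_cmd was launched.
    """
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if not is_up(host, port):
            start(start_cmd)  # hypothetical start command, see lead-in
            restarts += 1
        checks += 1
        sleep(interval)
    return restarts


# Example call (assumed command line; runs until interrupted):
# watch(["luigid", "--port", "8082"])
```

The probe, the launcher, and the sleep function are passed in as parameters so that the loop can be exercised without a real scheduler or real delays.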
Answered 2019-10-16 21:46:02
Have you configured the central scheduler correctly? See the docs: scheduler.html
If not, try using the local scheduler by passing --local-scheduler on the command line.
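As a rough illustration of the command-line option, the sketch below builds the argument list for a script that calls `luigi.run()` (such as `app_metadata.py` in the traceback) and appends `--local-scheduler` only when the central scheduler cannot be reached. The helper names and the task name `MyTask` are hypothetical, and falling back automatically is an assumption about what you might want, not something Luigi does for you.

```python
import socket


def central_scheduler_reachable(host="localhost", port=8082, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    sock.close()
    return True


def luigi_command(script, task, reachable):
    """Build the argv for a script whose entry point is luigi.run()."""
    argv = ["python", script, task]
    if not reachable:
        # Use the in-process scheduler instead of the central one.
        argv.append("--local-scheduler")
    return argv


# Example (MyTask is a placeholder task name):
# argv = luigi_command("app_metadata.py", "MyTask",
#                      central_scheduler_reachable())
```

Keep in mind that the local scheduler only coordinates tasks within a single process, so it does not stop two separate runs from working on the same task at the same time.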
https://stackoverflow.com/questions/46660372