Apache Beam Python SDK upgrade to 2.11.0 issue
I am upgrading the SDK from 2.4.0 to 2.11.0 using requirements.txt. It lists the following dependencies:
apache_beam==2.11.0
google-cloud-dataflow==2.4.0
httplib2==0.11.3
google-cloud==0.27.0
google-cloud-storage==1.3.0
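workflow
We use this txt file to manage dependencies in the Beam pipeline. There are two VM instances on Google Compute Engine: one is the master instance and the other is the worker instance. Both instances install every package listed in requirements.txt.
The jobs are run through DataflowRunner. The code is run manually with the following command:
python code.py --project --setup_file path --requirements_file path --worker_machine_type n1-standard-8 --runner DataflowRunner
Instead of upgrading the version to 2.11.0, the job fails. The error message shown in the Stackdriver logs is: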
2019-03-26 19:02:02.000 IST
Failed to install packages: failed to install requirements: exit status 1
{
insertId: "27857323862365974846:1225647:0:438995"
jsonPayload: {
line: "boot.go:144"
message: "Failed to install packages: failed to install requirements: exit status 1"
}
labels: {
compute.googleapis.com/resource_id: "278567544395974846"
compute.googleapis.com/resource_name: "icf-20190334132038-03260625-b9fa-harness-gtml"
compute.googleapis.com/resource_type: "instance"
dataflow.googleapis.com/job_id: "2019-03-26_06_25_16-6068768320191854196"
dataflow.googleapis.com/job_name: "icf-20190326132038"
dataflow.googleapis.com/region: "global"
}
logName: "projects/project-id/logs/dataflow.googleapis.com%2Fworker-startup"
receiveTimestamp: "2019-03-26T13:32:07.627920858Z"
resource: {
labels: {
job_id: "2019-03-26_06_25_16-6068768320191854196"
job_name: "icf-20190326132038"
project_id: "project-id"
region: "global"
step_id: ""
}
type: "dataflow_step"
}
severity: "CRITICAL"
timestamp: "2019-03-26T13:32:02Z"
}
Note: the code does run when pip install apache-beam==2.11.0 is run on the worker and the master.
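Answered on 2019-03-30 07:23:05
I can't be sure without seeing the rest of the logs, but the most likely problem here is incompatible dependencies. Are you able to run the pipeline locally and check whether there are any dependency issues?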
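One quick local check, as a minimal sketch rather than a confirmed diagnosis: the requirements file pins apache_beam at 2.11.0 but google-cloud-dataflow at 2.4.0. This assumes that google-cloud-dataflow releases depend on a matching apache-beam version, which I have not verified for these exact versions, so treat a mismatch as a lead to investigate. The helper names parse_pins and beam_dataflow_mismatch are hypothetical, and the pins are copied from the question.

```python
import re

# Pins copied verbatim from the requirements.txt in the question.
REQUIREMENTS = """\
apache_beam==2.11.0
google-cloud-dataflow==2.4.0
httplib2==0.11.3
google-cloud==0.27.0
google-cloud-storage==1.3.0
"""


def normalize(name):
    # PEP 503 name normalization: runs of "-", "_" and "." collapse to "-",
    # so "apache_beam" and "apache-beam" are treated as the same project.
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(text):
    # Collect only exact "name==version" pins; comments and other lines are skipped.
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[normalize(name.strip())] = version.strip()
    return pins


def beam_dataflow_mismatch(pins):
    # Assumption (not verified here): google-cloud-dataflow expects the
    # apache-beam release with the same version number.
    beam = pins.get("apache-beam")
    dataflow = pins.get("google-cloud-dataflow")
    return beam is not None and dataflow is not None and beam != dataflow


pins = parse_pins(REQUIREMENTS)
if beam_dataflow_mismatch(pins):
    print("Possible conflict: apache-beam==%s vs google-cloud-dataflow==%s"
          % (pins["apache-beam"], pins["google-cloud-dataflow"]))
```

If the two pins do conflict, installing requirements.txt into a fresh virtualenv locally should fail the same way the worker does, and pip's own output will name the conflicting package.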
https://stackoverflow.com/questions/55363542