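I want to put some constants in one Python file and import them into another Python file. I created two files, one holding the constants and one importing them, and everything works fine locally: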
constants.py
CONST = "hi guy"
test_constants.py
from constants import CONST
import sys
for line in sys.stdin:
    print(CONST)
Local test
$ echo "dummy" | python test_constants.py
hi guy
Hive (Beeline) test
hive> add file hdfs://path/.../test_constants.py;
No rows affected (0.191 seconds)
hive> add file hdfs://path/.../constants.py;
No rows affected (0.049 seconds)
hive> list files;
resource
/tmp/bb09f878-7e36-4aa2-8566-a30950072bcb_resources/test_constants.py
/tmp/bb09f878-7e36-4aa2-8566-a30950072bcb_resources/constants.py
2 rows selected (0.179 seconds)
hive> with t as (select 1 as dummy)
select transform (dummy)
using 'python test_constants.py'
as dummy_out
from t;
Error: org.apache.hive.service.cli.HiveSQLException:
Error while processing statement: FAILED:
Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
Vertex failed, vertexName=Map 1, vertexId=vertex_1535407036047_170618_1_00, diagnostics=[Task failed, taskId=task_1535407036047_170618_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1535407036047_170618_1_00_000000_0:
java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
The logs look like this:
Log Type: stderr
Log Upload Time: Mon Oct 29 15:50:42 -0700 2018
Log Length: 251
2018-10-29 15:45:16 Starting to run new task attempt: attempt_1535407036047_170618_1_00_000000_3
Traceback (most recent call last):
  File "test_constants.py", line 1, in <module>
    from constants import CONST
ImportError: No module named constants
Both files appear to be in the same folder, so the import looks like it should work, but it does not.
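One way to check whether the two files really sit in the same directory from Python's point of view is to log the import context to stderr, which is where the task log above already collects the traceback. The sketch below is only a diagnostic aid: the helper names `describe_import_context` and `log_import_context` are hypothetical, and the idea that the script directory may differ from the working directory on the cluster is an assumption, not something the logs above confirm.

```python
import os
import sys


def describe_import_context():
    """Collect the facts that decide whether `from constants import CONST` can succeed."""
    return {
        # Directory the task was started in.
        "cwd": os.getcwd(),
        # Directory Python derived from the script path; imports are searched here first.
        "script_dir": sys.path[0] if sys.path else None,
        # Entries visible in the working directory (may be symlinks on a cluster).
        "cwd_entries": sorted(os.listdir(os.getcwd())),
        "sys_path": list(sys.path),
    }


def log_import_context(stream=sys.stderr):
    """Write the context to stderr so stdout stays clean for the transform output."""
    for key, value in describe_import_context().items():
        stream.write("%s = %r\n" % (key, value))
```

Calling `log_import_context()` at the top of test_constants.py, before the `from constants import CONST` line, would show whether `script_dir` and `cwd` differ and whether constants.py is visible in either.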
2018-10-30:
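The answer from @serge_k works. I did have trouble at first, though, because my Python path was initially not available. After moving all the files to /tmp on HDFS, everything worked as expected.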
hive> add file hdfs://dev/tmp/transforms;
No rows affected (0.108 seconds)
hive> list files;
resource
/tmp/61ecb363-ead6-4679-8f58-3611db9487b2_resources/transforms
1 row selected (0.202 seconds)
hive> select transform (col) using 'python transforms/test_constants.py' as dummy_out from dummy.test;
dummy_out
hi guy
hi guy
hi guy
hi guy
hi guy
hi guy
hi guy
hi guy
hi guy
hi guy
10 rows selected (63.734 seconds)
Posted on 2018-10-30 08:23:03
Put your Python scripts in a folder, e.g. files, add the whole folder to the distributed cache, and invoke the script as python files/script_name.py:
hive> add file ./files;
Added resources: [./files]
hive> with t as (select 1 as dummy) select transform (dummy)
using 'python files/test_constants.py' as dummy_out from t;
OK
hi guy
https://stackoverflow.com/questions/53055491
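A plausible explanation for why the folder approach works, while adding the two files separately does not, can be reproduced locally. The sketch below assumes (this is not confirmed by the thread) that each separately added file is stored in its own cache directory and only symlinked into the task's working directory. Python resolves symlinks when it computes the script directory for imports, so the script ends up looking for constants.py in a directory that does not contain it. The directory names `cache0`, `cache1`, `files_real` and the helpers `write`, `run`, `demo` are hypothetical, and the demo needs a platform with symlink support.

```python
import os
import subprocess
import sys
import tempfile

CONSTANTS_SRC = 'CONST = "hi guy"\n'
SCRIPT_SRC = (
    "from constants import CONST\n"
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(CONST)\n"
)
SOURCES = [("test_constants.py", SCRIPT_SRC), ("constants.py", CONSTANTS_SRC)]


def write(path, text):
    with open(path, "w") as handle:
        handle.write(text)


def run(script, cwd):
    """Run the script in cwd with one input row; return (exit code, stdout)."""
    proc = subprocess.run(
        [sys.executable, script],
        cwd=cwd, input="dummy\n", capture_output=True, text=True,
    )
    return proc.returncode, proc.stdout.strip()


def demo():
    with tempfile.TemporaryDirectory() as root:
        work = os.path.join(root, "work")
        os.mkdir(work)

        # Case 1 (assumed layout): every file lives in its own cache
        # directory and is symlinked into the working directory.
        for index, (name, source) in enumerate(SOURCES):
            cache_dir = os.path.join(root, "cache%d" % index)
            os.mkdir(cache_dir)
            write(os.path.join(cache_dir, name), source)
            os.symlink(os.path.join(cache_dir, name), os.path.join(work, name))
        separate = run("test_constants.py", work)

        # Case 2: both files share one real folder, and only the folder
        # is symlinked, mirroring `add file ./files`.
        folder = os.path.join(root, "files_real")
        os.mkdir(folder)
        for name, source in SOURCES:
            write(os.path.join(folder, name), source)
        os.symlink(folder, os.path.join(work, "files"))
        together = run(os.path.join("files", "test_constants.py"), work)

        return separate, together


if __name__ == "__main__":
    print(demo())
```

In the first case the import fails even though both names are visible in the working directory; in the second case the script and constants.py share a real directory, so the import succeeds.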