
Hive merge query - Error evaluating cardinality_violation(_col0,_col1)

Stack Overflow user
Asked on 2021-02-24 04:36:57
1 answer · 518 views · 0 followers · score 1

I am trying to run a Hive query. It fails with the following error.

Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":{"transactionid":0,"bucketid":-1,"rowid":1},"_col1":"2020-10-28"},"value":{"_col0":1}}
        at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":{"transactionid":0,"bucketid":-1,"rowid":1},"_col1":"2020-10-28"},"value":{"_col0":1}}
        at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
        ... 7 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cardinality_violation(_col0,_col1)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:86)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
        at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:122)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1022)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:827)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:701)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:767)
        at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
        ... 7 more
Caused by: java.lang.RuntimeException: Cardinality Violation in Merge statement: [0, -1, 1],2020-10-12
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDFCardinalityViolation.evaluate(GenericUDFCardinalityViolation.java:56)
        at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:186)
        at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
        at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:81)
        ... 15 more

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

Below is the query:

MERGE INTO TABLE1 A
using  (select * from TABLE2) B
ON
LOWER(TRIM(A.A)) = LOWER(TRIM(B.A)) AND
LOWER(TRIM(A.B)) = LOWER(TRIM(B.B))
WHEN MATCHED AND LOWER(TRIM(A.C)) = LOWER(TRIM(B.C))  OR TRIM(A.D)= TRIM(B.D)
THEN
UPDATE SET
A= regexp_replace(A,"[^ ']","#"),
B= regexp_replace(B,"[^@.]","#"),
C= regexp_replace(C,"[^.-]","#"),
D= regexp_replace(D, "[^ ']","#"),
E= regexp_replace(E, "[^ ']","#" ),
F= regexp_replace(F, "[^ .+-]","#"),
G= regexp_replace(G,"[^ ']","#"),
H= regexp_replace(H,"[^ ']","#"),
I= regexp_replace(I,"[^ ']","#"),
J= regexp_replace(J,"[^ ']","#"),
K= regexp_replace(K,"[^ ']","#"),
L= regexp_replace(L,"[^ .+-]","#"),
M= regexp_replace(M,"[^ ']","#"),
N= regexp_replace(N,"[^ ']","#"),
O= regexp_replace(O,"[^ ']","#"),
P= regexp_replace(P,"[^ ']","#"),
Q= regexp_replace(Q,"[^ ']","#"),
R= regexp_replace(R,"[^ .+-]","#"),
S= regexp_replace(S,"[^ ']","#"),
T= regexp_replace(T,"[^ +-.]","#");

I tried toggling the cardinality check, but that also failed, with an array index out of bounds exception.

Please share any information, insight, or solution you may have.

I checked several related Stack Overflow threads and found no clues relevant to this issue.

Thanks in advance.


1 Answer

Stack Overflow user

Accepted answer

Posted on 2021-02-24 05:06:26

Toggling the cardinality check (hive.merge.cardinality.check=false) will result in some data corruption, if it works at all.
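For completeness, that toggle is a session-level Hive setting (again, not recommended, since it only hides the duplicate-match problem):

```sql
-- Not recommended: disables the safeguard that rejects MERGE statements
-- where one target row is matched by more than one source row.
SET hive.merge.cardinality.check=false;
```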

Check your data and fix the issue. The problem is that more than one row in TABLE2 matches the same row in TABLE1. The join key may contain duplicates; you can fix that with a row_number filter or distinct, etc., or fix your ON clause by adding more keys to make the match unique.
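Using the table and column names from the question (assumptions, since the real schema is not shown), the two steps above could be sketched as: first find the duplicate join keys, then deduplicate the source with row_number before merging.

```sql
-- Sketch: find join keys in TABLE2 that occur more than once,
-- using the same normalized keys as the MERGE's ON clause.
SELECT LOWER(TRIM(A)) AS key_a, LOWER(TRIM(B)) AS key_b, COUNT(*) AS cnt
FROM TABLE2
GROUP BY LOWER(TRIM(A)), LOWER(TRIM(B))
HAVING COUNT(*) > 1;

-- Sketch: keep one row per join key via row_number() and use that
-- deduplicated set as the MERGE source instead of TABLE2 directly.
MERGE INTO TABLE1 A
USING (
  SELECT * FROM (
    SELECT t.*,
           row_number() OVER (PARTITION BY LOWER(TRIM(A)), LOWER(TRIM(B))
                              ORDER BY A) AS rn
    FROM TABLE2 t
  ) s
  WHERE rn = 1
) B
ON  LOWER(TRIM(A.A)) = LOWER(TRIM(B.A))
AND LOWER(TRIM(A.B)) = LOWER(TRIM(B.B))
WHEN MATCHED THEN UPDATE SET
  -- same SET clause as in the question
  A = regexp_replace(A, "[^ ']", "#");
```

The ORDER BY inside row_number() is arbitrary here; if one of the duplicate rows is the "correct" one (e.g. the most recent), order by that column instead. Alternatively, adding more columns to the ON clause so the key is unique achieves the same effect without changing the source.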

Score 1
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/66340742
