How to query the data of a specific bucket in Hive

Stack Overflow user
Asked 2020-04-23 17:13:58

I created a bucketed table in Hive with the following schema:

CREATE TABLE Songs_data_bucket (
Song_id STRING,
artist_id STRING,
album_name STRING,
song_views INT,
song_rating FLOAT)
CLUSTERED BY(song_rating) 
INTO 4 BUCKETS  
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

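Here the table is clustered on the song_rating column, and the data is divided into 4 buckets. Now, when I try to look at the contents of only the first bucket with the following command: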

SELECT * FROM Songs_data_bucket TABLESAMPLE(BUCKET 0 out of 4 on song_rating )

I get this error:

14:40:46.835 [cf87ec7a-8910-453c-92ea-4aa98426a8f7 main] ERROR org.apache.hadoop.hive.ql.parse.CalcitePlanner - CBO failed, skipping CBO.
org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException: Table Sample specified for songs_data_bucket. Currently we don't support Table Sample clauses in CBO, turn off cbo for queries on tableSamples.
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genTableLogicalPlan(CalcitePlanner.java:1660) ~[hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3116) ~[hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:939) ~[hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:893) ~[hive-exec-2.1.0.jar:2.1.0]
        at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113) ~[calcite-core-1.6.0.jar:1.6.0]
        at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:969) ~[calcite-core-1.6.0.jar:1.6.0]
        at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:149) ~[calcite-core-1.6.0.jar:1.6.0]
        at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:106) ~[calcite-core-1.6.0.jar:1.6.0]
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:712) ~[hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:280) [hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10755) [hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:239) [hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250) [hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:437) [hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:329) [hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1158) [hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1145) [hive-exec-2.1.0.jar:2.1.0]
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:184) [hive-service-2.1.0.jar:2.1.0]
        at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:269) [hive-service-2.1.0.jar:2.1.0]
        at org.apache.hive.service.cli.operation.Operation.run(Operation.java:324) [hive-service-2.1.0.jar:2.1.0]
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:460) [hive-service-2.1.0.jar:2.1.0]
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:447) [hive-service-2.1.0.jar:2.1.0]
        at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:294) [hive-service-2.1.0.jar:2.1.0]
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:497) [hive-service-2.1.0.jar:2.1.0]
        at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) ~[?:?]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_231]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_231]
        at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1426) [hive-jdbc-2.1.0.jar:2.1.0]
        at com.sun.proxy.$Proxy23.ExecuteStatement(Unknown Source) [?:?]
        at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:308) [hive-jdbc-2.1.0.jar:2.1.0]
        at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:250) [hive-jdbc-2.1.0.jar:2.1.0]
        at org.apache.hive.beeline.Commands.executeInternal(Commands.java:977) [hive-beeline-2.1.0.jar:2.1.0]
        at org.apache.hive.beeline.Commands.execute(Commands.java:1148) [hive-beeline-2.1.0.jar:2.1.0]
        at org.apache.hive.beeline.Commands.sql(Commands.java:1063) [hive-beeline-2.1.0.jar:2.1.0]
        at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1137) [hive-beeline-2.1.0.jar:2.1.0]
        at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:965) [hive-beeline-2.1.0.jar:2.1.0]
        at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:875) [hive-beeline-2.1.0.jar:2.1.0]
        at org.apache.hive.beeline.cli.HiveCli.runWithArgs(HiveCli.java:35) [hive-beeline-2.1.0.jar:2.1.0]
        at org.apache.hive.beeline.cli.HiveCli.main(HiveCli.java:29) [hive-beeline-2.1.0.jar:2.1.0]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_231]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_231]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_231]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_231]
        at org.apache.hadoop.util.RunJar.run(RunJar.java:244) [hadoop-common-2.10.0.jar:?]
        at org.apache.hadoop.util.RunJar.main(RunJar.java:158) [hadoop-common-2.10.0.jar:?]
OK
No rows selected (0.491 seconds)

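From the log, it looks as if Hive no longer supports TABLESAMPLE. Is there a way to query the data of a specific bucket without using the command above, or am I missing something in the command?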

Please help with this query.


1 Answer

Stack Overflow user

Answered 2020-04-23 19:15:45

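After reading the log carefully, I found that setting the property hive.cbo.enable to false solved my problem. This looks like a side effect of an optimization added by the Hive team (the log says the cost-based optimizer, CBO, does not support TABLESAMPLE clauses), but either way, it solved my problem.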
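
For reference, here is a minimal sketch of what that fix might look like in a Beeline or Hive CLI session. It assumes the setting is applied per session with SET rather than in hive-site.xml, and it assumes the goal is the first bucket: the Hive sampling documentation numbers buckets from 1 through y, so the sketch uses BUCKET 1 rather than the BUCKET 0 from the question. Note also that the log in the question ends with "CBO failed, skipping CBO" followed by OK, which suggests the query still ran without CBO and simply matched no rows.

```sql
-- Turn off the cost-based optimizer for the current session only,
-- since the log reports that CBO does not support TABLESAMPLE.
SET hive.cbo.enable=false;

-- Assumption: "first bucket" means bucket 1, because Hive numbers
-- TABLESAMPLE buckets from 1 through y (here y = 4).
SELECT *
FROM Songs_data_bucket
TABLESAMPLE(BUCKET 1 OUT OF 4 ON song_rating);
```

Changing the bucket number to 2, 3, or 4 would select the other buckets in the same way.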

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/61383522
