
How can I make Hive run MapReduce jobs concurrently?

Stack Overflow user
Asked 2012-01-15 15:24:04
Answers: 1 · Views: 10.2K · Followers: 0 · Votes: 6

I'm new to Hive and I've run into a problem.

I have a table in Hive, defined like this:

create table td(id int, time string, ip string, v1 bigint, v2 int, v3 int,
v4 int, v5 bigint, v6 int)  PARTITIONED BY(dt STRING)
ROW FORMAT DELIMITED FIELDS
TERMINATED BY ','  lines TERMINATED BY '\n' ;  

I run the following query:

from td
INSERT OVERWRITE  DIRECTORY '/tmp/total.out' select count(v1)
INSERT OVERWRITE  DIRECTORY '/tmp/totaldistinct.out' select count(distinct v1)
INSERT OVERWRITE  DIRECTORY '/tmp/distinctuin.out' select distinct v1

INSERT OVERWRITE  DIRECTORY '/tmp/v4.out' select v4 , count(v1), count(distinct v1) group by v4
INSERT OVERWRITE  DIRECTORY '/tmp/v3v4.out' select v3, v4 , count(v1), count(distinct v1) group by v3, v4

INSERT OVERWRITE  DIRECTORY '/tmp/v426.out' select count(v1), count(distinct v1)  where v4=2 or v4=6
INSERT OVERWRITE  DIRECTORY '/tmp/v3v426.out' select v3, count(v1), count(distinct v1) where v4=2 or v4=6 group by v3

INSERT OVERWRITE  DIRECTORY '/tmp/v415.out' select count(v1), count(distinct v1)  where v4=1 or v4=5
INSERT OVERWRITE  DIRECTORY '/tmp/v3v415.out' select v3, count(v1), count(distinct v1) where v4=1 or v4=5 group by v3

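It works, and the output is what I want.

But there is one problem: Hive generates 9 MapReduce jobs and runs them one after another.

I ran explain on this query and got the following output: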

STAGE DEPENDENCIES:
  Stage-9 is a root stage
  Stage-0 depends on stages: Stage-9
  Stage-10 depends on stages: Stage-9
  Stage-1 depends on stages: Stage-10
  Stage-11 depends on stages: Stage-9
  Stage-2 depends on stages: Stage-11
  Stage-12 depends on stages: Stage-9
  Stage-3 depends on stages: Stage-12
  Stage-13 depends on stages: Stage-9
  Stage-4 depends on stages: Stage-13
  Stage-14 depends on stages: Stage-9
  Stage-5 depends on stages: Stage-14
  Stage-15 depends on stages: Stage-9
  Stage-6 depends on stages: Stage-15
  Stage-16 depends on stages: Stage-9
  Stage-7 depends on stages: Stage-16
  Stage-17 depends on stages: Stage-9
  Stage-8 depends on stages: Stage-17

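Stages 9-17 appear to correspond to MapReduce jobs 0-8.

However, according to the explain output above, stages 10-17 depend only on stage 9.

So my question is: why can't jobs 1-8 run concurrently?

Or, how can I make jobs 1-8 run concurrently?

Thanks a lot for your help!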


1 Answer

Stack Overflow user

Accepted answer

Posted 2012-01-17 12:26:01

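hive-default.xml defines a property named "hive.exec.parallel" that enables parallel execution of jobs. Its default value is "false"; set it to "true" to enable this. A second property, "hive.exec.parallel.thread.number", controls the maximum number of jobs that can run in parallel.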
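
As a minimal sketch, both properties can also be set for the current session with Hive's SET command before running the query, instead of editing the configuration file. The value 8 below is only an illustrative choice and does not come from the answer, and the query is a shortened form of the one in the question.

```sql
-- Let independent stages of a query run in parallel (current session only).
SET hive.exec.parallel=true;
-- Upper bound on the number of jobs run at once; 8 is an illustrative value.
SET hive.exec.parallel.thread.number=8;

-- Then run the multi-insert query as before (shortened here to two outputs).
FROM td
INSERT OVERWRITE DIRECTORY '/tmp/total.out' SELECT count(v1)
INSERT OVERWRITE DIRECTORY '/tmp/v4.out' SELECT v4, count(v1), count(distinct v1) GROUP BY v4;
```

Whether the jobs actually overlap will also depend on how much capacity the cluster has free.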

For more details, see: https://issues.apache.org/jira/browse/HIVE-549

Votes: 5
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/8868186
