首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >如何在hive分区数据中创建表?

如何在hive分区数据中创建表?
EN

Stack Overflow用户
提问于 2018-03-29 19:44:14
回答 2查看 32.1K关注 0票数 1
代码语言:javascript
复制
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/_impala_insert_staging
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:18 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI
[mgupta@sjc-dev-binn01 ~]$ hadoop fs -ls /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI
Found 27 items
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201601
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201602
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201603
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201604
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201605
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201606
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201607
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201608
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:17 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201609
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:17 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201610
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:17 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201611
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:17 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201612
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:17 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201701
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:17 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201702
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:17 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201703
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:17 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201704
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:17 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201705
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:17 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201706
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:17 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201707
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:18 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201708
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:18 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201709
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:18 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201710
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:18 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201711
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:18 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201712
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:18 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201801
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:18 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201802
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:18 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201803
[mgupta@sjc-dev-binn01 ~]$ hadoop fs -ls /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201601
Found 3 items
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201601/company_sid=0
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201601/company_sid=38527
drwxr-xr-x   - mgupta supergroup          0 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201601/company_sid=__HIVE_DEFAULT_PARTITION__
[mgupta@sjc-dev-binn01 ~]$ hadoop fs -ls /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201601/company_sid=0
Found 1 items
-rw-r--r--   3 mgupta supergroup    2069014 2018-03-26 22:16 /kylin/retailer/qi_basket_brand_bucket_fact/product_hierarchy_type=CI/month_id=201601/company_sid=0/f9466a0068b906cf-6ace7f8500000049_294515768_data.0.parq
[mgupta@sjc-dev-binn01 ~]$
EN

回答 2

Stack Overflow用户

回答已采纳

发布于 2018-03-29 21:35:03

您可以尝试下面给出的步骤。

方法1

  1. 识别模式(列名和类型,包括已分区的列)
  2. 创建一个配置单元分区表(请确保将分区列和分隔符
  3. 数据添加到分区表中。在这种情况下,加载文件将没有分区列,因为您将通过load命令对其进行硬编码)

创建表 (col1 data_type1,col2 data_type2..)按(Part_col data_type3)行格式分隔字段,以'‘结尾的行格式分隔字段将数据inpath '/hdfs/loc/file1’加载到表分区(=‘201601’);将数据inpath '/hdfs/loc/file1‘加载到表分区(=’201602‘)将数据inpath '/hdfs/loc/file1’加载到表分区(=‘201603’)

以此类推。

方法2

使用与主表相同的模式创建一个临时表(临时表),但不使用任何分区将整个数据插入到此表中(请确保将‘

  • partitions
  • Load ’作为这些文件中的一个字段)

  • 使用动态分区插入将数据从临时表加载到主表。

创建表 (col1 data_type1,col2 data_type2..)以'‘create table (col1 data_type1,col2 data_type2..)结尾的行格式分隔字段由(part_col目录)分区;将数据inpath‘/hdfs/loc/data_type3/’加载到表中;设置hive.exec.dynamic.partition=true;设置hive.exec.dynamic.partition.mode=nonstrict;插入到表分区(Part_col)中从part_col选择col1,col2,....part_col

方法2的关键方面是:

  • 使'part_col‘在最后的insert语句中作为加载文件
  • 中的一个字段可用,获取'part_col’作为from select子句的最后一个字段。
票数 8
EN

Stack Overflow用户

发布于 2021-03-03 00:49:34

让我们创建一个包含年和月分区的表,并在表中添加时间戳:

代码语言:javascript
复制
CREATE TABLE `mypart_p`(
   `id` bigint, 
   `open_ts` string 
)
PARTITIONED BY (YEAR INT, MONTH INT)

现在我必须修改表格了。

代码语言:javascript
复制
ALTER TABLE mypart_p ADD PARTITION (YEAR=2020, MONTH=1)

我必须每年和每个月都这样做,用python循环地做。现在让我们用数据填充它,并指定该数据属于哪个分区:

代码语言:javascript
复制
INSERT into mypart_p PARTITION (YEAR=2020, MONTH=1)

select id,
open_ts

FROM some_other_table

WHERE substring(open_ts,0,4) = '2020'
AND substring(open_ts,6,2) = '01'
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/49555189

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档