Version 2.1.1
Problem description: values terminated by the collection-items delimiter are inserted as map keys
Hive table:
CREATE TABLE profiles(
id int,
name struct<first_name: string, middle_name: string, last_name: string>,
phone struct<home: string, office: string>,
address map<string,struct<streat:string, appartment:int, zip:string>>
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
COLLECTION ITEMS TERMINATED BY '-'
MAP KEYS TERMINATED BY '='
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
Data:
1000,Suresh--S,1234567890-1234567890,home=Venkatapuram1-2020-500001
1001,Mahesh-X-M,1234567890-1234567890,home=Venkatapuram2-2021-500001
Data load:
load data inpath '/handson/profiles_data.txt' overwrite into table profiles;
Actual data from the select statement:
SELECT * FROM profiles;
1000
{"first_name":"Suresh","middle_name":"","last_name":"S"}
{"home":"1234567890","office":"1234567890"}
{"home":{"streat":"Venkatapuram1","appartment":null,"zip":null},"2020":null,"500001":null}
1001
{"first_name":"Mahesh","middle_name":"X","last_name":"M"}
{"home":"1234567890","office":"1234567890"}
{"home":{"streat":"Venkatapuram2","appartment":null,"zip":null},"2021":null,"500001":null}
Expected:
1000
{"first_name":"Suresh","middle_name":"","last_name":"S"}
{"home":"1234567890","office":"1234567890"}
{"home":{"streat":"Venkatapuram1","appartment":2020,"zip":"500001"}}
1001
{"first_name":"Mahesh","middle_name":"X","last_name":"M"}
{"home":"1234567890","office":"1234567890"}
{"home":{"streat":"Venkatapuram2","appartment":2021,"zip":"500001"}}
Posted on 2017-08-13 10:07:59
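As answered in "HIVE nested ARRAY in MAP data type", you can only override the first three delimiters in Hive, even though Hive actually supports 8. In nested data structures, each nesting level uses its corresponding delimiter.
In your Hive table, the delimiter between the fields of the struct inside the address map is \u0004 (Unicode 4), which cannot be overridden.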
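To make the nesting-level rule concrete, here is a minimal Python sketch of how the address column gets split. It is a simplified model written for illustration, not Hive's actual parsing code: the function name `parse_address` and the `SEPARATORS` list are assumptions, and type conversion (for example `appartment` to int) is left out.

```python
# Simplified model of delimiter selection by nesting level in a Hive text table.
# Illustration only; this is not Hive's real SerDe implementation.
SEPARATORS = [",", "-", "=", "\x04"]  # level 0..3 for the table in the question

FIELD_NAMES = ["streat", "appartment", "zip"]  # spelled as in the table definition


def parse_address(column: str) -> dict:
    """Parse a map<string, struct<streat, appartment, zip>> column.

    Map entries are split on the level-1 delimiter ('-'), key and value on the
    level-2 delimiter ('='), and struct fields on the level-3 delimiter (\\x04).
    """
    result = {}
    for entry in column.split(SEPARATORS[1]):
        key, sep, value = entry.partition(SEPARATORS[2])
        if not sep:
            # No '=' in this entry: it becomes a key with a null value.
            result[key] = None
            continue
        parts = value.split(SEPARATORS[3])
        # Missing struct fields are filled with nulls.
        parts += [None] * (len(FIELD_NAMES) - len(parts))
        result[key] = dict(zip(FIELD_NAMES, parts))
    return result


original = "home=Venkatapuram1-2020-500001"
fixed = "home=Venkatapuram1\x042020\x04500001"

print(parse_address(original))  # '2020' and '500001' end up as extra map keys
print(parse_address(fixed))     # one 'home' entry with all three struct fields
```

In this model the '-' characters in the original row are consumed at the map-entry level, which is why `2020` and `500001` show up as keys with null values instead of struct fields.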
You should change the input to:
1000,Suresh--S,1234567890-1234567890,home=Venkatapuram1\u00042020\u0004500001
1001,Mahesh-X-M,1234567890-1234567890,home=Venkatapuram2\u00042021\u0004500001
https://stackoverflow.com/questions/45647111
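The \u0004 in the rows above presumably stands for the single control character U+0004, not the six literal characters backslash, u, 0, 0, 0, 4, and that character is awkward to type in an editor. A minimal Python sketch for generating the file follows. The helper names `format_row` and `write_profiles` and the temporary output directory are assumptions made for illustration, and the sketch handles only one map entry per row, as in the sample data.

```python
import os
import tempfile

STRUCT_SEP = "\x04"  # the control character written as \u0004 in the answer

# (id, name struct, phone struct, (map key, address struct)), from the question's data
ROWS = [
    (1000, ("Suresh", "", "S"), ("1234567890", "1234567890"),
     ("home", ("Venkatapuram1", 2020, "500001"))),
    (1001, ("Mahesh", "X", "M"), ("1234567890", "1234567890"),
     ("home", ("Venkatapuram2", 2021, "500001"))),
]


def format_row(row_id, name, phone, address):
    """Build one text line using ',', '-', '=' and the \\x04 struct delimiter."""
    key, struct_fields = address
    address_col = key + "=" + STRUCT_SEP.join(str(f) for f in struct_fields)
    return ",".join([str(row_id), "-".join(name), "-".join(phone), address_col])


def write_profiles(path, rows):
    """Write all rows to path, one per line, terminated by '\\n'."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for row in rows:
            f.write(format_row(*row) + "\n")


# Hypothetical local output location; the file name matches the LOAD DATA statement.
out_path = os.path.join(tempfile.mkdtemp(), "profiles_data.txt")
write_profiles(out_path, ROWS)
print(out_path)
```

The generated file would still need to be copied to the HDFS path used in the LOAD DATA statement (/handson/profiles_data.txt) before loading it into the table.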