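Loading Data with Dynamic Partitions

Without dynamic partitioning

Partitions are loaded one at a time, and each insert statement must name exactly one target partition.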

hive (zmgdb)> insert overwrite table p_t3 partition (city='ningbo')
            > select name,post,address from p_t1 where city='ningbo';
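The statement above only works if p_t1 and p_t3 are both partitioned by city and share the same data columns. The original DDL is not shown in this post, so the following is only a sketch: the column names are taken from the query output later in the post, while the string types, the tab delimiter, and the use of "create table ... like" for p_t3 are assumptions.

```sql
-- Hypothetical DDL: types and field delimiter are assumed, not taken from the real tables.
create table if not exists p_t1 (
  name    string,
  post    string,
  address string
)
partitioned by (city string)
row format delimited fields terminated by '\t';

-- Assumed: p_t3 copies the schema of p_t1 (including the partition column) but none of its data.
create table if not exists p_t3 like p_t1;

-- Static partition insert: the partition value is fixed in the partition clause,
-- so the select list must not include the city column.
insert overwrite table p_t3 partition (city='ningbo')
select name, post, address from p_t1 where city='ningbo';
```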

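This launches a MapReduce job to perform the load. In my case the job stalled at the "kill job_xxxx" line and made no progress for a long time. The cause was that too little memory had been allocated to YARN, and the fix is to adjust the configuration in yarn-site.xml.

See this blog post: http://blog.csdn.net/zengmingen/article/details/52609873

(Even after applying the configuration from that post, the job sometimes runs and sometimes does not. If the cluster is built on VM virtual machines, try restarting them.)

For a table with many partitions, for example one partitioned by every city in the country, loading the partitions one by one is slow and tedious.

What is needed is a way to route each row to its matching partition automatically, which is exactly what dynamic partition loading does.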

With dynamic partitioning enabled

hive> set hive.exec.dynamic.partition=true;
hive> set hive.exec.dynamic.partition.mode=nonstrict;
hive> set hive.exec.max.dynamic.partitions.pernode=1000;
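As a rough guide to what these settings do, here is an annotated version that could be placed at the top of a script. The last setting, hive.exec.max.dynamic.partitions, does not appear in the original session. It is included because it is the job-wide counterpart of the per-node limit, and its value here is illustrative rather than a recommendation.

```sql
-- Allow Hive to create partitions from values found in the data.
set hive.exec.dynamic.partition=true;

-- nonstrict: every partition column may be dynamic.
-- strict mode would require at least one static partition column in the partition clause.
set hive.exec.dynamic.partition.mode=nonstrict;

-- Upper bound on the dynamic partitions a single mapper or reducer may create.
set hive.exec.max.dynamic.partitions.pernode=1000;

-- Not in the original session: upper bound on dynamic partitions for the whole statement.
set hive.exec.max.dynamic.partitions=1000;
```

These "set" commands apply only to the current session, so they need to be repeated (or placed in a startup script) each time.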

hive (zmgdb)> create table p_t4 like p_t1;

hive (zmgdb)> insert overwrite table p_t4 partition (city)
            > select * from p_t1;

Hive launches a MapReduce job to perform the load.
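Hive matches dynamic partition columns by position, not by name: the trailing column(s) of the select list supply the partition values. "select *" works here because p_t4 was created like p_t1, so city is the last column in both tables. A more explicit version of the same load, as a sketch, would be:

```sql
-- The partition column (city) must come last in the select list;
-- each distinct city value ends up in its own partition of p_t4.
insert overwrite table p_t4 partition (city)
select name, post, address, city
from p_t1;
```

Listing the columns explicitly is a little more verbose, but it keeps the statement correct if the source table's column order ever differs from the target's.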

hive (zmgdb)> select * from p_t4;
OK
p_t4.name       p_t4.post       p_t4.address    p_t4.city
1       dddd    dddd    beijing
2       www     www     beijing
3       eeee    wwww    beijing
4       tttt    cccc    beijing
5       yyycc   dddd    beijing
1       dddd    dddd    ningbo
2       www     www     ningbo
3       eeee    wwww    ningbo
4       tttt    cccc    ningbo
5       yyycc   dddd    ningbo
1       dddd    dddd    taizhou
2       www     www     taizhou
3       eeee    wwww    taizhou
4       tttt    cccc    taizhou
5       yyycc   dddd    taizhou
Time taken: 0.155 seconds, Fetched: 15 row(s)
hive (zmgdb)>
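Besides selecting all rows, a quicker way to confirm that the partitions were created is to list them and count rows per partition. This is a sketch and not part of the original session. Going by the output above, it should show one partition each for beijing, ningbo, and taizhou, with 5 rows apiece.

```sql
-- List the partitions that the dynamic insert created in p_t4.
show partitions p_t4;

-- Count the rows that landed in each partition.
select city, count(*) as row_count
from p_t4
group by city;
```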

table p_t1

table p_t4
