1. Use case: structured data, that is, data with a clear and regular layout.
2. CLASS_PATH (the SerDe class) can be one of the following:

Option 1: CSV [simple types]
Sample data:
"1","2","Football"
"2","2","Soccer"
"3","2","Baseball & Softball"
Code:
create table if not exists TABLE_NAME(
    id string,
    page string,
    word string
)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties(
    'separatorChar'=',',
    'quoteChar'='"',
    'escapeChar'='\\'
)

Option 2: regex [regular expression]
Sample data:
123,張三,16853210211116,true,26238.5,閱讀;跑步;唱歌,java:98;mysql:54,province:南京;city:江寧
Code:
create table if not exists TABLE_NAME(
    id int,
    name string,
    time bigint,
    isPartyMember boolean,
    hobby array<string>,
    scores map<string,int>,
    address struct<province:string,city:string>
)
row format serde 'org.apache.hadoop.hive.serde2.RegexSerDe'
with serdeproperties(
    'input.regex'='^(\\d+),(.*?),(\\d+),(true|false),(\\d+\\.?\\d+?)$'
)
Note: RegexSerDe maps capture groups to columns in order, so the pattern needs one group per column. As written, this pattern covers only the first five of the eight fields and has to be extended before it matches the sample line.

Option 3: JsonSerDe
Sample data:
{"name":"henry","age":22,"gender":"male","phone":"18014499655"}
Code:
create table if not exists json(
    name string,
    age int,
    gender string,
    phone string
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
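The regex for option 2 can be checked offline before the table is created. The sketch below is only a rough check in Python, on the assumption that Python's `re` module and the Java regex engine used by Hive agree on these simple constructs. The eight-group pattern is a hypothetical extension and is not from the original text. The sketch checks the pattern only, not whether the SerDe accepts the declared column types.

```python
import re

# Sample line from option 2.
line = "123,張三,16853210211116,true,26238.5,閱讀;跑步;唱歌,java:98;mysql:54,province:南京;city:江寧"

# The pattern as the regex engine sees it once Hive has unescaped '\\d' to '\d'.
# It has five capture groups, while the table declares eight columns.
five_groups = re.compile(r"^(\d+),(.*?),(\d+),(true|false),(\d+\.?\d+?)$")

# Hypothetical extension (an assumption, not the author's pattern):
# one capture group per declared column.
eight_groups = re.compile(
    r"^(\d+),(.*?),(\d+),(true|false),(\d+\.?\d*),(.*?),(.*?),(.*)$"
)


def capture(pattern, text):
    """Return the captured groups as a list, or None if the line does not match."""
    match = pattern.match(text)
    return list(match.groups()) if match else None


short = capture(five_groups, line)
full = capture(eight_groups, line)

print("five-group pattern :", short)
print("eight-group pattern:", full)
```

The five-group pattern is anchored with `$`, so it fails on the full line, while the extended pattern yields one string per column.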
* Additional handling
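1. stored as [storage format]
Basic syntax: stored as FILE_FORMAT
Storage formats: textfile, orc, parquet, sequencefile, ...
Example: stored as textfile
2. tblproperties [table properties] (general-purpose):
Example [adapt to the actual situation]:
tblproperties(
    'skip.header.line.count'='1'    -- skip the header, i.e. the first line
    ...
)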
Case 1:
/*
1|henry|1.81|1995-03-18|江蘇,南京,玄武,北京東路68號|logicjava:88,javaoop:76,mysql:80,ssm:82|beauty,money,joke
2|arill|1.59|1996-7-30|安徽,蕪湖,南山,西湖東路68號|logicjava:79,javaoop:58,mysql:65,ssm:85|beauty,power,sleeping
3|mary|1.72|1995-09-02|山東,青島,長虹,天山東路68
*/
drop table if exists students;
create table if not exists students(
    number int,
    name string,
    height decimal(3,2),
    birthday date,
    house struct<province:string,city:string,district:string,street:string>,
    scores map<string,int>,
    hobby array<string>
)
row format delimited
fields terminated by "|"
collection items terminated by ","
map keys terminated by ":"
stored as textfile;
load data inpath '/zhou/students.txt' overwrite into table zhou.students;

Case 2:
/*
user_id,auction_id,cat_id,cat1,property,buy_mount,day
786295544,41098319944,50014866,50022520,21458:86755362;13023209:3593274;10984217:21985;122217965:3227750;21477:28695579;22061:30912;122217803:3230095,2,123434123
*/
drop table if exists sam_mum_baby_trade;
create external table if not exists sam_mum_baby_trade(
    user_id bigint,
    auction_id bigint,
    cat_id bigint,
    cat1 bigint,
    property map<bigint,bigint>,
    buy_mount int,
    day bigint
)
row format delimited
fields terminated by ","
collection items terminated by ";"
map keys terminated by ":"
stored as textfile
tblproperties ('skip.header.line.count'='1');
load data inpath '/zhou/sam_mum_baby_trade.csv' into table zhou.sam_mum_baby_trade;

Case 3:
/*
"1","2","Football"
"2","2","Soccer"
"3","2","Baseball & Softball"
*/
drop table if exists categories;
create table if not exists categories(
    id string,
    page string,
    word string
)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties(
    'separatorChar'=',',
    'quoteChar'='"',
    'escapeChar'='\\'
)
stored as textfile;
load data inpath '/zhou/categories.csv' overwrite into table zhou.categories;
select * from categories;

Case 4:
/*
{"name":"henry","age":22,"gender":"male","phone":"18014499655"}
*/
-- JSON
drop table if exists json;
create table if not exists json(
    name string,
    age int,
    gender string,
    phone string
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
stored as textfile;
load data inpath '/zhou/json.log' overwrite into table zhou.json;

Case 5:
/*
125;男;2015-9-7 1:52:22;1521.84
883;男;2014-9-18 5:24:42;6391.45
652;女;2014-5-4 5:56:45;9603.79
*/
create external table if not exists test1w(
    user_id int,
    user_gender string,
    order_time timestamp,
    order_amount decimal(6,2)
)
row format serde 'org.apache.hadoop.hive.serde2.RegexSerDe'
with serdeproperties(
    'input.regex'='(\\d+);(.*?);(\\d{4}-\\d{1,2}-\\d{1,2} \\d{1,2}:\\d{1,2}:\\d{1,2});(\\d+\\.?\\d+?)'
)
stored as textfile
location '/zhou/test1w';
select * from test1w;
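To make the three delimiters in case 1 concrete, here is a rough Python sketch of how one row breaks down into the struct, map and array columns. It only mimics the splitting rules stated in the DDL (`|` between fields, `,` between collection items, `:` between map keys and values). It is not how Hive implements them, and the function name `parse_student` is made up for illustration.

```python
from decimal import Decimal

# First sample row of case 1.
row = ("1|henry|1.81|1995-03-18|江蘇,南京,玄武,北京東路68號|"
       "logicjava:88,javaoop:76,mysql:80,ssm:82|beauty,money,joke")


def parse_student(line):
    """Split one row using the three delimiters declared in the DDL."""
    # fields terminated by "|"
    number, name, height, birthday, house, scores, hobby = line.split("|")
    # collection items terminated by ","  (struct members, in declared order)
    province, city, district, street = house.split(",")
    return {
        "number": int(number),
        "name": name,
        "height": Decimal(height),
        "birthday": birthday,
        "house": {
            "province": province,
            "city": city,
            "district": district,
            "street": street,
        },
        # map entries split on ",", then key and value split on ":"
        "scores": {
            key: int(value)
            for key, value in (item.split(":") for item in scores.split(","))
        },
        # array items split on ","
        "hobby": hobby.split(","),
    }


student = parse_student(row)
print(student["house"]["city"], student["scores"]["mysql"], student["hobby"])
```

The same reading applies to case 2, where `;` separates the map entries and `:` separates each key from its value.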
II: Hive table creation [advanced syntax]
1: CTAS
[Essence]: run a query against an existing table and create a new table from the result.
Basic syntax: create table if not exists NEW_TABLE_NAME as select ... from OLD_TABLE_NAME ...
Example:
Existing table: hive_ext_regex_test1w
Statement:
create table if not exists hive_ext_test_before2015 as
select * from hive_ext_regex_test1w
where year(order_time)<=2015;
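The CTAS pattern can be tried without a Hive cluster. The sketch below uses SQLite through Python's `sqlite3` module purely as a stand-in: SQLite is not Hive, it has no `year()` function (so `strftime('%Y', ...)` is used instead), and the table names and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table with illustrative rows (not the author's data).
conn.execute(
    "create table orders (user_id integer, order_time text, order_amount real)"
)
conn.executemany(
    "insert into orders values (?, ?, ?)",
    [
        (125, "2015-09-07 01:52:22", 1521.84),
        (883, "2014-09-18 05:24:42", 6391.45),
        (652, "2016-05-04 05:56:45", 9603.79),
    ],
)

ctas = """
create table if not exists orders_before2015 as
select * from orders
where cast(strftime('%Y', order_time) as integer) <= 2015
"""

# The first run creates and fills the new table from the query result.
conn.execute(ctas)
# The second run is a no-op because of "if not exists".
conn.execute(ctas)

rows = conn.execute(
    "select user_id from orders_before2015 order by user_id"
).fetchall()
print(rows)
```

The new table takes its columns from the select list, so no column definitions are written by hand.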