1. 系統(tǒng)內置函數(shù)
查看系統(tǒng)內置函數(shù):show functions ;
顯示內置函數(shù)的用法: desc function lag; – lag為函數(shù)名
顯示詳細的內置函數(shù)用法: desc function extended lag;
1.1 行轉列
行轉列是指多行數(shù)據(jù)轉換為一個列的字段。
Hive行轉列用到的函數(shù)concat(str1,str2,...)
字段或字符串拼接concat_ws('分割符',str1,str2,...)
將字段或字符串按分割符號拼接collect_set(column1)
, 收集某個字段的值,進行去重匯總,產(chǎn)生Array類型,即將某列數(shù)據(jù)轉換成數(shù)組
行轉列函數(shù)的應用:將星座和血型相同的人歸類到一起
原數(shù)據(jù)
腳本
with t1 as (
select name, concat(constellation,',', blood_type) as cb
from constellation
)
select cb,concat_ws(',',collect_set(name)) as names from t1
group by cb;
行轉列后的結果
1.2 列轉行
explode(col)
:將hive一列中復雜的array或者map結構拆分成多行。explode(ARRAY)
數(shù)組的每個元素生成一行explode(MAP)
map中每個key-value對,生成一行,key為一列,value為一列
腳本
select explode(names) name from constellation_01;
列轉行后執(zhí)行結果lateral view
和 split, explode等UDTF一起使用。explode能夠將一列數(shù)據(jù)拆分成多行,形成一張臨時表,與原表進行聚合
select cb,name
from constellation_01
lateral view explode(names) constellation_01_temp as name;
1.3 窗口函數(shù)
不僅展示窗口計算的字段,也展示原字段
源數(shù)據(jù)
jack,2017-01-01,10
tony,2017-01-02,15
jack,2017-02-03,23
tony,2017-01-04,29
jack,2017-01-05,46
jack,2017-04-06,42
tony,2017-01-07,50
jack,2017-01-08,55
mart,2017-04-08,62
mart,2017-04-09,68
neil,2017-05-10,12
mart,2017-04-11,75
neil,2017-06-12,80
mart,2017-04-13,94
執(zhí)行腳本
create table business(
name string,
orderdate string,
cost int
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
load data local inpath "/opt/module/datas/business.txt" into table business;
執(zhí)行結果
1.3.1 聚合開窗函數(shù)
窗口函數(shù)和聚合函數(shù)的結合使用
執(zhí)行腳本
select name,orderdate,cost,
count(*) over(partition by month(orderdate)),-- 顧客信息及每月總購買人數(shù)
sum(cost) over(partition by month(orderdate)), -- 顧客信息及每月總購買金額
sum(cost) over(partition by month(orderdate) order by orderdate rows between unbounded preceding and current row), -- 起點到當前行的累加金額
sum(cost) over(partition by month(orderdate) order by orderdate rows between 1 preceding and current row), -- 前一行 和 當前行
sum(cost) over(partition by month(orderdate) order by orderdate rows between current row and 1 following), -- 當前行 和 后一行
sum(cost) over(partition by month(orderdate) order by orderdate rows between 1 preceding and 1 following), -- 前一行 到 后一行
sum(cost) over(partition by month(orderdate) order by orderdate rows between current row and unbounded following) -- 當前行到后面所有行
from business
計算結果
1.3.2 排序開窗函數(shù)
ROW_NUMBER()
從1開始,按照順序,生成分組內記錄的序列RANK()
生成數(shù)據(jù)項在分組中的排名,排名相等會在名次中留下空位DENSE_RANK()
生成數(shù)據(jù)項在分組中的排名,排名相等會在名次中不會留下空位
執(zhí)行腳本
select name,orderdate,cost,
row_number() over (partition by month(orderdate) order by cost desc), -- 按花費金額由多到少排序,依次編號
rank() over (partition by month(orderdate) order by cost desc), -- 按花費金額由多到少排序,相等的排名會留下空位
dense_rank() over (partition by month(orderdate) order by cost desc) -- 按花費金額由多到少排序,相等的排名不會留下空位
from business;
計算結果
1.3.3 lag和lead函數(shù)
LAG(col,n)
: 往前第n行第col列的數(shù)據(jù)LEAD(col,n)
:往后第n行第col列的數(shù)據(jù)
執(zhí)行腳本
select name,orderdate,cost,
lag(orderdate,1) over(partition by month(orderdate) order by orderdate), -- 上一次購買日期(往前第n行數(shù)據(jù))
lead(orderdate,1)over(partition by month(orderdate) order by orderdate) -- 下一次購買日期(往后第n行數(shù)據(jù))
from business;
計算結果
1.4 JSON解析函數(shù)
hive中內置的json_tuple()函數(shù),可以將json數(shù)據(jù)解析成普通的結構化數(shù)據(jù)表
源數(shù)據(jù)
{"movie":"1193","rate":"5","timeStamp":"978300760","uid":"1"}
{"movie":"661","rate":"3","timeStamp":"978302109","uid":"1"}
{"movie":"914","rate":"3","timeStamp":"978301968","uid":"1"}
{"movie":"3408","rate":"4","timeStamp":"978300275","uid":"1"}
{"movie":"2355","rate":"5","timeStamp":"978824291","uid":"1"}
{"movie":"1197","rate":"3","timeStamp":"978302268","uid":"1"}
執(zhí)行腳本文章來源:http://www.zghlxwxcb.cn/news/detail-662953.html
create table t_json(json string);
load data local inpath "/export/data/datawarehouse/movie.txt" overwrite into table t_json;
create table movie_rate
as
select json_tuple(json,'movie','rate','timeStamp','uid') as (movie,rate,ts,uid) from t_json;
執(zhí)行結果文章來源地址http://www.zghlxwxcb.cn/news/detail-662953.html
到了這里,關于Hive的窗口函數(shù)與行列轉換函數(shù)及JSON解析函數(shù)的文章就介紹完了。如果您還想了解更多內容,請在右上角搜索TOY模板網(wǎng)以前的文章或繼續(xù)瀏覽下面的相關文章,希望大家以后多多支持TOY模板網(wǎng)!