Hive之explode()函数和posexplode()函数和lateral view函数
目录1、explode()函数2、posexplode()函数3、lateral view函数1、explode()函数英文释义:explode() takes in an array (or a map) as an input and outputs the elements of the array (map) as separate rows. UDTFs can be used in t
目录
1、explode()函数
英文释义:
explode() takes in an array (or a map) as an input and outputs the elements of the array (map) as separate rows. UDTFs can be used in the SELECT expression list and as a part of LATERAL VIEW.
As an example of using explode() in the SELECT expression list, consider a table named myTable that has a single column (myCol) and two rows:
总结:explode就是将hive一行中复杂的array或者map结构拆分成多行
举例:
hive>with temp as
( select '2:00,3:00,4:00,5:00' as examp_data)
select
data
from temp
lateral view explode(split(examp_data,',')) view1 as data;
2:00
3:00
4:00
5:00
思考:现在是一列,如果是两列呢?
hive>with temp as
( select 'a,b,c,d' as examp_data1,'2:00,3:00,4:00,5:00' as examp_data2)
select
data1
,data2
from temp
lateral view explode(split(examp_data1,',')) view1 as data1
lateral view explode(split(examp_data2,',')) view1 as data2
data1,data2
a,2:00
a,3:00
a,4:00
a,5:00
b,2:00
b,3:00
b,4:00
b,5:00
c,2:00
c,3:00
c,4:00
c,5:00
d,2:00
d,3:00
d,4:00
d,5:00
显然和我们想象的是有出入的,我们想让’a,b,c,d’和’2:00,3:00,4:00,5:00’一一对应该如何实现呢?
2、posexplode()函数
特点是不仅炸裂出数值,还附带索引,实现多列进行多行转换;
上述的例子可以通过posexplode()函数实现;
示例如下:
hive>with temp as
( select 'a,b,c,d' as examp_data1,'2:00,3:00,4:00,5:00' as examp_data2)
select
data1
,data2
from temp
lateral view posexplode(split(examp_data1,',')) view1 as index1,data1
lateral view posexplode(split(examp_data2,',')) view1 as index2,data2
where index1=index2
data1,data2
a,2:00
b,3:00
c,4:00
d,5:00
3、lateral view函数
英文解释:
Lateral View Syntax
lateralView: LATERAL VIEW udtf(expression) tableAlias AS columnAlias (’,’ columnAlias)*
fromClause: FROM baseTable (lateralView)*
Description
Lateral view is used in conjunction with user-defined table generating functions such as explode(). As mentioned in Built-in Table-Generating Functions, a UDTF generates zero or more output rows for each input row. A lateral view first applies the UDTF to each row of base table and then joins resulting output rows to the input rows to form a virtual table having the supplied table alias.
lateral view用于和split, explode等UDTF一起使用,它能够将一列数据拆成多行数据,在此基础上可以对拆分后的数据进行聚合;
1、Lateral View用于和UDTF函数(explode、split)结合来使用
2、首先通过UDTF函数拆分成多行,再将多行结果组合成一个支持别名的虚拟表
3、主要解决在select使用UDTF做查询过程中,查询只能包含单个UDTF,不能包含其他字段、以及多个UDTF的问题
4、语法:LATERAL VIEW udtf(expression) tableAlias AS columnAlias (‘,’ columnAlias)
上面已举例,不在重复;
4、lateral view 和 lateral view outer 的区别
lateral view explode 炸裂的数组中如果存在null,则这一条数据舍弃;lateral view ourer explode 炸裂的数组中如果存在null,则这一条数据保留,炸裂的字段值填充为null ;
测试:
with user_view as ( --构造虚拟表
select 'xiaohong' name, '中级|黄金' level,'15' num,'看电视,跳舞' favs
union all
select 'xiaohei' name,'中级|黄金' level,'25' num, null as favs
)
select
name
,column1
,num
,column2
from user_view
lateral view explode (split(level,'\\|')) table1 as column1
lateral view explode (split(favs,'\\,')) table2 as column2
结果如下:
with user_view as ( --构造虚拟表
select 'xiaohong' name, '中级|黄金' level,'15' num,'看电视,跳舞' favs
union all
select 'xiaohei' name,'中级|黄金' level,'25' num, null as favs
)
select
name
,column1
,num
,column2
from user_view
lateral view outer explode (split(level,'\\|')) table1 as column1
lateral view outer explode (split(favs,'\\,')) table2 as column2
结果如下:
5、lateral view 后面跟where 限制条件
select o.*, table_view.new_col
from table_origin o
lateral view UDTF(expression) table_view as new_col
where day ='2024-01-01'
注意:
lateral view的位置是from后where条件前
生成的虚拟表的表名不可省略
from后可带多个lateral view
如果要拆分的字段有null值,需要使用lateral view outer 替代,避免数据缺失
6、explode+str_to_map
with temp as (
select '小编' as name,32 as math,88 as english
union all
select '建国' as name,95 as math,93 as english
)
select table_4.name,
a.item,
a.score
from table_4
lateral view explode(
str_to_map(concat('math=',math,'&english=',english),'&','=')
) a as item,score;
name | item | score |
---|---|---|
小编 | math | 32 |
小编 | english | 88 |
建国 | math | 95 |
建国 | english | 93 |
更多推荐
所有评论(0)