Hive: Explode the Array of Struct key: value
Here is my Hive table:
CREATE EXTERNAL TABLE IF NOT EXISTS SampleTable
(
USER_ID string,
DETAIL_DATA array<struct<key:string,value:string>>
)
Here is the data in the table above:
11111 [{"key":"client_status","value":"ACTIVE"},{"key":"name","value":"Jane Doe"}]
Is there any way to get the following output using HiveQL?
**client_status** | **name**
-------------------+----------------
ACTIVE Jane Doe
I tried using explode(), but the result I get looks like this:
SELECT details
FROM sample_table
lateral view explode(DETAIL_DATA) exploded_table as details;
**details**
-------------------------------------------+
{"key":"client_status","value":"ACTIVE"}
------------------------------------------+
{"key":"name","value":"Jane Doe"}
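Use lateral view [outer] inline
to get the struct elements already extracted as columns, then use conditional aggregation with group by USER_ID to collect the values for the keys you need into a single row per user.
Demo: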
with sample_table as (--This is your data example
select '11111' USER_ID,
array(named_struct('key','client_status','value','ACTIVE'),named_struct('key','name','value','Jane Doe')) DETAIL_DATA
)
SELECT max(case when e.key='name' then e.value end) as name,
max(case when e.key='client_status' then e.value end) as status
FROM sample_table
lateral view inline(DETAIL_DATA) e as key, value
group by USER_ID
Result:
name status
------------------------
Jane Doe ACTIVE
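If some users have a NULL or empty DETAIL_DATA, a plain lateral view drops their rows. Here is a sketch of the outer variant, assuming you also want USER_ID in the output and one row per user. Such users should come back with NULL in both value columns. The alias s is only for illustration.

```sql
-- Sketch: OUTER keeps rows whose DETAIL_DATA is NULL or empty,
-- filling key and value with NULL for those rows.
SELECT s.USER_ID,
       max(case when e.key = 'client_status' then e.value end) as client_status,
       max(case when e.key = 'name' then e.value end) as name
FROM sample_table s
lateral view outer inline(s.DETAIL_DATA) e as key, value
GROUP BY s.USER_ID;
```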
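If you can guarantee the order of the structs in the array (the one with the status always comes first), you can address the nested elements directly: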
SELECT detail_data[0].value as client_status,
detail_data[1].value as name
from sample_table
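Before relying on positions, it may be worth checking that the ordering assumption holds for the data you currently have. A minimal sketch, assuming client_status is expected at index 0 and name at index 1. The column name rows_out_of_order is made up for this example. A count of 0 only says the existing rows follow the order, not that future rows will.

```sql
-- Sketch: count rows that violate the assumed order.
-- coalesce turns a missing element or NULL key into '' so it is counted too.
SELECT count(*) as rows_out_of_order
FROM sample_table
WHERE coalesce(DETAIL_DATA[0].key, '') <> 'client_status'
   OR coalesce(DETAIL_DATA[1].key, '') <> 'name';
```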
Another approach: if you do not know the order in the array, but the array always has size 2, CASE expressions without explode will give better performance:
SELECT case when DETAIL_DATA[0].key='name' then DETAIL_DATA[0].value else DETAIL_DATA[1].value end as name,
case when DETAIL_DATA[0].key='client_status' then DETAIL_DATA[0].value else DETAIL_DATA[1].value end as status
FROM sample_table
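For completeness, the explode() attempt from the question can also be finished with the same conditional aggregation, since each exploded element is a struct whose fields are reachable as details.key and details.value. This is a sketch reusing the question's aliases. It should be equivalent to the inline version for this table, just with one extra level of field access.

```sql
-- Sketch: explode yields one struct per row; pivot it with max(case ...).
SELECT USER_ID,
       max(case when details.key = 'client_status' then details.value end) as client_status,
       max(case when details.key = 'name' then details.value end) as name
FROM sample_table
lateral view explode(DETAIL_DATA) exploded_table as details
GROUP BY USER_ID;
```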