Hive SQL: include zero-count rows in GROUP BY result
I have the following table in Hive:
account day type
a 1 X
b 1 Y
c 1 Z
a 2 Y
b 2 Z
c 3 Z
If I run the following SQL, I can generate this table:
SELECT day, type, count(distinct account) as account_count from my_table group by day, type
day type account_count
1 X 1
1 Y 1
1 Z 1
2 Y 1
2 Z 1
3 Z 1
However, I also want to produce rows where the count is zero, so that the table has the following structure:
day type account_count
1 X 1
1 Y 1
1 Z 1
2 X 0
2 Y 1
2 Z 1
3 X 0
3 Y 0
3 Z 1
Is it possible to generate this table structure?
Yes. Use a cross join to generate every (day, type) combination, then a left join to fill in the values in the last column:
-- count(distinct mt.account) is 0 for combinations with no matching row
select d.day, t.type, count(distinct mt.account) as account_count
from (select distinct day from my_table) d cross join
     (select distinct type from my_table) t left join
     my_table mt
     on mt.day = d.day and mt.type = t.type
group by d.day, t.type;
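If the count is always 0 or 1 (i.e., each day and type pair appears at most once, with no duplicates), then this version skips the aggregation and should be more efficient: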
-- no group by needed: each (day, type) pair matches at most one row
select d.day, t.type, (case when mt.type is null then 0 else 1 end) as account_count
from (select distinct day from my_table) d cross join
     (select distinct type from my_table) t left join
     my_table mt
     on mt.day = d.day and mt.type = t.type;
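To sanity-check the cross join plus left join approach without a Hive cluster, here is a minimal sketch that runs the aggregate query against the sample data from the question. It uses Python's built-in `sqlite3` module purely as a stand-in engine, on the assumption that SQLite and Hive behave the same for these joins and for `count(distinct ...)` over NULLs. The names `ROWS`, `QUERY`, and `zero_filled_counts` are made up for illustration, and the `order by` is added only to make the output deterministic.

```python
import sqlite3

# Sample rows from the question: (account, day, type).
ROWS = [
    ("a", 1, "X"), ("b", 1, "Y"), ("c", 1, "Z"),
    ("a", 2, "Y"), ("b", 2, "Z"), ("c", 3, "Z"),
]

# Cross join builds every (day, type) pair; the left join attaches
# matching rows, and count(distinct ...) ignores the NULLs it leaves.
QUERY = """
select d.day, t.type, count(distinct mt.account) as account_count
from (select distinct day from my_table) d
cross join (select distinct type from my_table) t
left join my_table mt
  on mt.day = d.day and mt.type = t.type
group by d.day, t.type
order by d.day, t.type
"""


def zero_filled_counts(rows):
    """Load rows into an in-memory SQLite table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table my_table (account text, day integer, type text)"
        )
        conn.executemany("insert into my_table values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for day, type_, account_count in zero_filled_counts(ROWS):
        print(day, type_, account_count)
```

With the six sample rows this yields nine rows, one per day and type combination, including the zero-count rows for (2, X), (3, X), and (3, Y). Behavior on a real Hive deployment may still differ in details such as row ordering, so treat this as a check of the query logic only.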