Hive SQL: include 0 values in GROUP BY result

I have the following table in Hive:

account  day  type
a        1    X
b        1    Y
c        1    Z
a        2    Y
b        2    Z
c        3    Z

If I run the following SQL, I get this table:

SELECT day, type, count(distinct account) as account_count
from my_table
group by day, type;

day  type  account_count
1    X     1
1    Y     1
1    Z     1
2    Y     1
2    Z     1
3    Z     1

However, I also want rows where the count is zero, so that the table looks like this:

day  type  account_count
1    X     1
1    Y     1
1    Z     1
2    X     0
2    Y     1
2    Z     1
3    X     0
3    Y     0
3    Z     1

Is it possible to generate this table?

Yes. Use a cross join to generate every (day, type) combination, then a left join to fill in the value in the last column:

select d.day, t.type, count(distinct mt.account) as account_count
from (select distinct day from my_table) d cross join
     (select distinct type from my_table) t left join
     my_table mt
     on mt.day = d.day and mt.type = t.type
group by d.day, t.type;

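If the count is always 0 or 1 (i.e., each day/type combination appears at most once), then this is more efficient because it avoids the aggregation: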

select d.day, t.type,
       (case when mt.type is null then 0 else 1 end) as account_count
from (select distinct day from my_table) d cross join
     (select distinct type from my_table) t left join
     my_table mt
     on mt.day = d.day and mt.type = t.type;
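
To try the cross join / left join approach without creating a table, the sample rows from the question can be inlined as a CTE. This is only a sketch: the CTE name sample_rows and the added order by are my own choices for illustration, and it assumes a Hive version that supports CTEs and SELECT without FROM (0.13 or later, as far as I know).

```sql
-- sample_rows is a hypothetical stand-in for my_table, holding the six rows from the question.
with sample_rows as (
  select 'a' as account, 1 as day, 'X' as type union all
  select 'b' as account, 1 as day, 'Y' as type union all
  select 'c' as account, 1 as day, 'Z' as type union all
  select 'a' as account, 2 as day, 'Y' as type union all
  select 'b' as account, 2 as day, 'Z' as type union all
  select 'c' as account, 3 as day, 'Z' as type
)
-- Build every (day, type) pair, then attach matching rows; pairs with no match
-- get a NULL account, which count(distinct ...) counts as 0.
select d.day as day, t.type as type, count(distinct s.account) as account_count
from (select distinct day from sample_rows) d cross join
     (select distinct type from sample_rows) t left join
     sample_rows s
     on s.day = d.day and s.type = t.type
group by d.day, t.type
order by day, type;
```

With 3 distinct days and 3 distinct types this should return nine rows, with 0 for (2, X), (3, X) and (3, Y). Note that the day and type lists come from the data itself, so a day or type that never appears in the table will not show up in the result.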