How to combine two tables to get a single table in Hive

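I have the following tables and need to merge them in Hive.

Can anyone help me achieve this? I tried coalesce for the date part and it works fine, but I cannot get the fam part merged into a single column.

Any help is much appreciated.

Thanks, Babu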

Use a full outer join in two steps, for example:

with join1 as (
  -- step 1: merge table1 and table2 on (date, fam)
  select coalesce(t1.date, t2.date) as date
       , coalesce(fam1, fam2) as fam
       , coalesce(famcnt1, 0) as famcnt1
       , coalesce(famcnt2, 0) as famcnt2
    from table1 as t1
    full outer join table2 as t2
      on (t1.date = t2.date and fam1 = fam2)
)
-- step 2: merge the intermediate result with table3
select coalesce(t1.date, t3.date) as date
     , coalesce(fam, fam3) as fam
     , coalesce(famcnt1, 0) as famcnt1
     , coalesce(famcnt2, 0) as famcnt2
     , coalesce(famcnt3, 0) as famcnt3
from join1 as t1
  full outer join table3 as t3
    on (t1.date = t3.date and fam = fam3)
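
To try the query above, something like the following setup could be used. The original tables are not shown in the thread, so the schema here is only inferred from the column names in the answer (date, fam1/famcnt1, and so on), and the sample rows are entirely made up. Some Hive versions treat date as a reserved word, in which case it has to be wrapped in backticks, as done here.

```sql
-- Assumed schema, inferred from the column names used in the answer.
create table table1 (`date` string, fam1 string, famcnt1 int);
create table table2 (`date` string, fam2 string, famcnt2 int);
create table table3 (`date` string, fam3 string, famcnt3 int);

-- Hypothetical sample rows, for illustration only.
insert into table table1 values ('2020-01-01', 'A', 2), ('2020-01-02', 'B', 1);
insert into table table2 values ('2020-01-01', 'A', 5), ('2020-01-01', 'C', 4);
insert into table table3 values ('2020-01-02', 'B', 7), ('2020-01-03', 'D', 9);

-- With these rows, the merged result should contain one row per
-- distinct (date, fam) pair, in no guaranteed order:
--   2020-01-01  A  2  5  0
--   2020-01-01  C  0  4  0
--   2020-01-02  B  1  0  7
--   2020-01-03  D  0  0  9
```

Each (date, fam) pair appears once, and a count is 0 wherever the pair is missing from that table.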

You can use full outer join. However, left join combined with union often looks cleaner:

select df.date, df.fam,
       coalesce(t1.famcnt1, 0) as famcnt1,
       coalesce(t2.famcnt2, 0) as famcnt2,
       coalesce(t3.famcnt3, 0) as famcnt3
from (select date, fam1 as fam from table1
      union   -- on purpose to remove duplicates
      select date, fam2 as fam from table2
      union   -- on purpose to remove duplicates
      select date, fam3 as fam from table3
     ) df left join
     table1 t1
     on t1.date = df.date and t1.fam1 = df.fam left join
     table2 t2
     on t2.date = df.date and t2.fam2 = df.fam left join
     table3 t3
     on t3.date = df.date and t3.fam3 = df.fam;

If you are happy with NULL instead of 0, then COALESCE() is not needed at all.
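
As a rough sketch of that variant, the count columns can be selected directly, so that a missing (date, fam) pair shows up as NULL. The table and column names (table1 with date, fam1, famcnt1, and so on) are assumed from the answers in this thread, and date is backtick-quoted in case the Hive version in use treats it as a reserved word.

```sql
-- Same left join + union shape, but without COALESCE():
-- counts stay NULL where a (date, fam) pair is absent from a table.
select df.`date`, df.fam,
       t1.famcnt1,
       t2.famcnt2,
       t3.famcnt3
from (select `date`, fam1 as fam from table1
      union   -- plain union removes duplicate (date, fam) pairs
      select `date`, fam2 as fam from table2
      union
      select `date`, fam3 as fam from table3
     ) df
left join table1 t1 on t1.`date` = df.`date` and t1.fam1 = df.fam
left join table2 t2 on t2.`date` = df.`date` and t2.fam2 = df.fam
left join table3 t3 on t3.`date` = df.`date` and t3.fam3 = df.fam;
```

Keeping NULL makes it possible to tell "no row in that table" apart from a stored count of 0, which may or may not matter for the use case.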