Coalesce null across multiple rows
I have a table of this form:
id_customer__ | status | time_stmpd_at | idx
---------------+-----------------------+-----------------------+-----
112220 | enabled____________at | 2017-12-13 16:12:42.0 | 1
112220 | sale_locked_at__ | 2017-12-13 14:52:43.0 | 2
112220 | qual_sale_at | 2017-12-06 12:22:50.0 | 3
112220 | quality_control___at | 2017-11-28 18:22:02.0 | 4
112220 | returned__at | 2017-10-12 23:02:41.0 | 5
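I want the status where idx = 2 and the time_stmpd_at where idx = 1, and I need to be able to do this for every customer ID.
I tried putting the conditions into the select statement like this: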
select
id_customer__,
if(idx=2, status, NULL) as previous_status,
if(idx=1, time_stmpd_at, NULL) as time_stmpd_at
from htable
But this leaves me with:
id_customer__ | previous_status | time_stmpd_at
---------------+------------------+-----------------------
119650 | NULL | 2017-12-13 16:12:42.0
119650 | sale_locked_at__ | NULL
119650 | NULL | NULL
119650 | NULL | NULL
119650 | NULL | NULL
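Next I would have to coalesce these fields into a single row, but I feel there must be a better way. Any suggestions on the overall approach?
You can do this with conditional aggregation.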
select
id_customer__,
max(case when idx=2 then status end) as previous_status,
max(case when idx=1 then time_stmpd_at end) as time_stmpd_at
from htable
group by id_customer__
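To see why this collapses the rows, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. SQLite is only a stand-in for the asker's engine, and the second customer (112221) is made-up sample data. MAX skips NULLs, so each group keeps the single non-NULL value produced by its CASE expression.

```python
import sqlite3

# In-memory SQLite stand-in for the asker's table (the real engine may differ).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE htable (id_customer__ INTEGER, status TEXT, time_stmpd_at TEXT, idx INTEGER)"
)
rows = [
    (112220, "enabled____________at", "2017-12-13 16:12:42.0", 1),
    (112220, "sale_locked_at__", "2017-12-13 14:52:43.0", 2),
    (112220, "qual_sale_at", "2017-12-06 12:22:50.0", 3),
    # Hypothetical second customer, added only to show the per-customer grouping.
    (112221, "sale_locked_at__", "2017-11-02 09:00:00.0", 1),
    (112221, "qual_sale_at", "2017-11-01 08:30:00.0", 2),
]
conn.executemany("INSERT INTO htable VALUES (?, ?, ?, ?)", rows)

# CASE without ELSE yields NULL for non-matching rows; MAX ignores those NULLs.
query = """
SELECT id_customer__,
       MAX(CASE WHEN idx = 2 THEN status END) AS previous_status,
       MAX(CASE WHEN idx = 1 THEN time_stmpd_at END) AS time_stmpd_at
FROM htable
GROUP BY id_customer__
ORDER BY id_customer__
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each customer comes back as exactly one row, with the idx = 2 status and the idx = 1 timestamp side by side.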
You can use MAX and restrict the table to only the indexes you want (you don't have to, but why process irrelevant rows?):
SELECT id_customer__,
MAX(CASE WHEN idx=1 THEN time_stmpd_at ELSE NULL END) time_stmpd_at,
MAX(CASE WHEN idx=2 THEN status ELSE NULL END) status
FROM htable
WHERE idx IN (1,2)
GROUP BY id_customer__
Or you can pull those indexes out separately and join them on id_customer__:
SELECT h1.id_customer__, h1.time_stmpd_at , h2.status
FROM
(SELECT * FROM htable WHERE idx=1) h1 INNER JOIN
(SELECT * FROM htable WHERE idx=2) h2 ON h1.id_customer__ = h2.id_customer__
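One difference between the two queries is worth checking on your own data: the INNER JOIN only returns customers that have both an idx = 1 and an idx = 2 row, while the MAX version keeps every customer and leaves the missing side NULL. The sketch below shows this on an in-memory SQLite database, used here only as a stand-in engine; customer 999999 is hypothetical and has a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE htable (id_customer__ INTEGER, status TEXT, time_stmpd_at TEXT, idx INTEGER)"
)
conn.executemany(
    "INSERT INTO htable VALUES (?, ?, ?, ?)",
    [
        (112220, "enabled____________at", "2017-12-13 16:12:42.0", 1),
        (112220, "sale_locked_at__", "2017-12-13 14:52:43.0", 2),
        # Hypothetical customer with only one row, so no idx = 2 exists.
        (999999, "enabled____________at", "2017-12-01 10:00:00.0", 1),
    ],
)

# Join version: customers without an idx = 2 row drop out of the result.
join_sql = """
SELECT h1.id_customer__, h1.time_stmpd_at, h2.status
FROM (SELECT * FROM htable WHERE idx = 1) h1
INNER JOIN (SELECT * FROM htable WHERE idx = 2) h2
        ON h1.id_customer__ = h2.id_customer__
ORDER BY h1.id_customer__
"""

# Aggregation version: every customer with an idx 1 or 2 row is kept.
agg_sql = """
SELECT id_customer__,
       MAX(CASE WHEN idx = 1 THEN time_stmpd_at END) AS time_stmpd_at,
       MAX(CASE WHEN idx = 2 THEN status END) AS status
FROM htable
WHERE idx IN (1, 2)
GROUP BY id_customer__
ORDER BY id_customer__
"""

join_rows = conn.execute(join_sql).fetchall()
agg_rows = conn.execute(agg_sql).fetchall()
print("join:", join_rows)
print("aggregation:", agg_rows)
```

If you want the join form but still need customers with no previous status, a LEFT JOIN from the idx = 1 subquery would keep them.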
(Based on @VamsiPrabhala's answer, but using the arbitrary aggregation instead.)
I suggest using the arbitrary aggregation (rather than max), because it expresses the intent better:
select id_customer__,
arbitrary(status) filter(where idx=2) as previous_status,
arbitrary(time_stmpd_at) filter(where idx=1) as time_stmpd_at
from htable
group by id_customer__
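There are two reasons to use arbitrary:
- arbitrary signals that you are not actually doing a max aggregation (which helps anyone reading this query later)
- no aggregation exists that rejects multiple values. If one did, I would recommend it over arbitrary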
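Since no aggregate will reject multiple values for you, one possible workaround is a separate sanity check that lists customers with more than one row for idx = 1 or idx = 2. This is only a sketch of that idea, not part of the answer above: it uses an in-memory SQLite database as a stand-in engine, and customer 555555 is invented to trigger the check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE htable (id_customer__ INTEGER, status TEXT, time_stmpd_at TEXT, idx INTEGER)"
)
conn.executemany(
    "INSERT INTO htable VALUES (?, ?, ?, ?)",
    [
        (112220, "enabled____________at", "2017-12-13 16:12:42.0", 1),
        (112220, "sale_locked_at__", "2017-12-13 14:52:43.0", 2),
        # Hypothetical customer with a duplicated idx = 2 row.
        (555555, "enabled____________at", "2017-12-10 08:00:00.0", 1),
        (555555, "sale_locked_at__", "2017-12-09 08:00:00.0", 2),
        (555555, "qual_sale_at", "2017-12-08 08:00:00.0", 2),
    ],
)

# COUNT(expr) counts only non-NULL values, so each CASE counts matching rows.
check_sql = """
SELECT id_customer__,
       COUNT(CASE WHEN idx = 1 THEN 1 END) AS n_idx1,
       COUNT(CASE WHEN idx = 2 THEN 1 END) AS n_idx2
FROM htable
GROUP BY id_customer__
HAVING COUNT(CASE WHEN idx = 1 THEN 1 END) > 1
    OR COUNT(CASE WHEN idx = 2 THEN 1 END) > 1
ORDER BY id_customer__
"""
ambiguous = conn.execute(check_sql).fetchall()
print(ambiguous)
```

An empty result means each customer has at most one candidate row per index, so arbitrary and max would return the same value.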