Returning the row with the most recent timestamp from each group
I have a table (Postgres 9.3) defined as follows:
CREATE TABLE tsrs (
id SERIAL PRIMARY KEY,
customer_id INTEGER NOT NULL REFERENCES customers,
timestamp TIMESTAMP WITHOUT TIME ZONE,
licensekeys_checksum VARCHAR(32));
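The relevant columns here are customer_id, timestamp, and licensekeys_checksum. There can be multiple entries with the same customer_id, some of which may have matching licensekeys_checksum values and some of which may differ. There will never be rows with the same checksum and the same timestamp.
I want to return a table containing one row for each group of rows with matching licensekeys_checksum values. The row returned for each group should be the one with the most recent timestamp.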
Sample input:
1, 2, 2014-08-21 16:03:35, 3FF2561A
2, 2, 2014-08-22 10:00:41, 3FF2561A
2, 2, 2014-06-10 10:00:41, 081AB3CA
3, 5, 2014-02-01 12:03:23, 299AFF90
4, 5, 2013-12-13 08:14:26, 299AFF90
5, 6, 2013-09-09 18:21:53, 49FFA891
Desired output:
2, 2, 2014-08-22 10:00:41, 3FF2561A
2, 2, 2014-06-10 10:00:41, 081AB3CA
3, 5, 2014-02-01 12:03:23, 299AFF90
5, 6, 2013-09-09 18:21:53, 49FFA891
I've managed to piece together a query based on the comments below and a few hours of searching the internet. :)
select * from tsrs
inner join (
select licensekeys_checksum, max(timestamp) as mts
from tsrs
group by licensekeys_checksum
) x on x.licensekeys_checksum = tsrs.licensekeys_checksum
and x.mts = tsrs.timestamp;
It seems to work, but I'm not sure. Am I on the right track?
Try this:
select *
from tsrs
where (timestamp,licensekeys_checksum) in (
select max(timestamp)
,licensekeys_checksum
from tsrs
group by licensekeys_checksum)
or
with cte as (
select id
,customer_id
,timestamp
,licensekeys_checksum
,row_number () over (partition by licensekeys_checksum ORDER BY timestamp DESC) as rk
from tsrs)
select id
,customer_id
,timestamp
,licensekeys_checksum
from cte where rk=1 order by id
Reference: Window Functions, row_number(), and CTE
An alternative way to de-duplicate, using NOT EXISTS(...):
SELECT *
FROM tsrs t
WHERE NOT EXISTS (
SELECT *
FROM tsrs x
WHERE x.customer_id = t.customer_id -- same customer
AND x.licensekeys_checksum = t.licensekeys_checksum -- same checksum
AND x.timestamp > t.timestamp -- but more recent
);
The query in your question should perform better than the queries in the (previously) accepted answer. Test with EXPLAIN ANALYZE.
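As a rough sketch of that comparison, you could run both variants under EXPLAIN ANALYZE and compare the reported plans and execution times. The queries are the ones already shown in this thread; actual timings depend on your data volume and indexes, so no output is shown here.

```sql
-- Join against the per-checksum maximum (the query from the question)
EXPLAIN ANALYZE
SELECT tsrs.*
FROM   tsrs
JOIN  (
   SELECT licensekeys_checksum, max(timestamp) AS mts
   FROM   tsrs
   GROUP  BY licensekeys_checksum
   ) x ON x.licensekeys_checksum = tsrs.licensekeys_checksum
      AND x.mts = tsrs.timestamp;

-- Row-value IN variant (from the earlier answer)
EXPLAIN ANALYZE
SELECT *
FROM   tsrs
WHERE  (timestamp, licensekeys_checksum) IN (
   SELECT max(timestamp), licensekeys_checksum
   FROM   tsrs
   GROUP  BY licensekeys_checksum);
```

Note that EXPLAIN ANALYZE actually executes the statement, so run it a few times to smooth out caching effects.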
DISTINCT ON is typically simpler and faster:
SELECT DISTINCT ON (licensekeys_checksum) *
FROM tsrs
ORDER BY licensekeys_checksum, timestamp DESC NULLS LAST;
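If the table grows large, a multicolumn index that matches the sort order of the query above may help. This is only a sketch: the index name is made up, and whether the planner uses the index depends on your data distribution, so verify with EXPLAIN ANALYZE.

```sql
-- Hypothetical index name; the column order and sort direction
-- mirror the ORDER BY of the DISTINCT ON query
CREATE INDEX tsrs_checksum_ts_idx
ON tsrs (licensekeys_checksum, timestamp DESC NULLS LAST);
```

NULLS LAST is included because the timestamp column is declared without NOT NULL; if the column never holds NULL, a plain DESC in both the query and the index would do.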
Detailed explanation:
- Select first row in each GROUP BY group?