PostgreSQL: How to select last balance for each account on each day in a given date range?
I am running PostgreSQL 9.3 and have a table that looks like this:
entry_date | account_id | balance
---------------------+------------+---------
2016-02-01 00:00:00 | 123 | 100
2016-02-01 06:00:00 | 123 | 200
2016-02-01 12:00:00 | 123 | 300
2016-02-01 18:00:00 | 123 | 250
2016-02-01 00:00:00 | 456 | 400
2016-02-01 06:00:00 | 456 | 300
2016-02-01 12:00:00 | 456 | 200
2016-02-01 18:00:00 | 456 | 299
2016-02-02 00:00:00 | 123 | 250
2016-02-02 06:00:00 | 123 | 300
2016-02-02 12:00:00 | 123 | 400
2016-02-02 18:00:00 | 123 | 450
2016-02-02 00:00:00 | 456 | 299
2016-02-02 06:00:00 | 456 | 200
2016-02-02 12:00:00 | 456 | 100
2016-02-02 18:00:00 | 456 | 0
(16 rows)
My goal is to retrieve the final balance for each account on each day in a given date range. So the result I want is:
entry_date | account_id | balance
---------------------+------------+---------
2016-02-01 18:00:00 | 123 | 250
2016-02-01 18:00:00 | 456 | 299
2016-02-02 18:00:00 | 123 | 450
2016-02-02 18:00:00 | 456 | 0
(4 rows)
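Note that the timestamps in my example are much tidier than the real ones... I can't always rely on 18:00 being the last time of each day.
How would I write this SQL query?
I have tried variations of this: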
SELECT max(entry_date), account_id, max(balance)
FROM ledger
WHERE entry_date BETWEEN '2016-02-01'::timestamp AND '2016-02-02'::timestamp
GROUP BY account_id, entry_date;
The schema is as follows:
CREATE TABLE ledger (
entry_date timestamp(3),
account_id int,
balance int
);
INSERT INTO ledger VALUES ('2016-02-01T00:00:00.000Z', 123, 100);
INSERT INTO ledger VALUES ('2016-02-01T06:00:00.000Z', 123, 200);
INSERT INTO ledger VALUES ('2016-02-01T12:00:00.000Z', 123, 300);
INSERT INTO ledger VALUES ('2016-02-01T18:00:00.000Z', 123, 250);
INSERT INTO ledger VALUES ('2016-02-01T00:00:00.000Z', 456, 400);
INSERT INTO ledger VALUES ('2016-02-01T06:00:00.000Z', 456, 300);
INSERT INTO ledger VALUES ('2016-02-01T12:00:00.000Z', 456, 200);
INSERT INTO ledger VALUES ('2016-02-01T18:00:00.000Z', 456, 299);
INSERT INTO ledger VALUES ('2016-02-02T00:00:00.000Z', 123, 250);
INSERT INTO ledger VALUES ('2016-02-02T06:00:00.000Z', 123, 300);
INSERT INTO ledger VALUES ('2016-02-02T12:00:00.000Z', 123, 400);
INSERT INTO ledger VALUES ('2016-02-02T18:00:00.000Z', 123, 450);
INSERT INTO ledger VALUES ('2016-02-02T00:00:00.000Z', 456, 299);
INSERT INTO ledger VALUES ('2016-02-02T06:00:00.000Z', 456, 200);
INSERT INTO ledger VALUES ('2016-02-02T12:00:00.000Z', 456, 100);
INSERT INTO ledger VALUES ('2016-02-02T18:00:00.000Z', 456, 0);
Here is a SQLFiddle: http://sqlfiddle.com/#!15/56886
Thanks in advance!
In Postgres, I think the simplest approach is DISTINCT ON:
SELECT DISTINCT ON (account_id) l.*
FROM ledger l
WHERE entry_date BETWEEN '2016-02-01'::timestamp AND '2016-02-02'::timestamp
ORDER BY account_id, entry_date DESC;
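DISTINCT ON sorts the data by the keys in the ORDER BY. It then keeps one row for each unique combination of the keys in the ON list, choosing the first row encountered. The ON keys must be the leading expressions of the ORDER BY.
Edit:
Exactly the same idea works for one record per account per day; I simply misread the original requirement: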
SELECT DISTINCT ON (account_id, date_trunc('day', entry_date)) l.*
FROM ledger l
-- half-open range so that all of 2016-02-02 is included, not just midnight
WHERE entry_date >= '2016-02-01'::timestamp AND entry_date < '2016-02-03'::timestamp
ORDER BY account_id, date_trunc('day', entry_date), entry_date DESC;
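To make the "sort, then keep the first row per key" behaviour concrete, here is a rough Python sketch that emulates it on the sample rows from the question. The function name last_balance_per_day is made up for illustration, and this only mimics the semantics; it is not how Postgres executes the query.

```python
from datetime import datetime
from itertools import groupby

# Sample rows copied from the question: (entry_date, account_id, balance).
ROWS = [
    (datetime(2016, 2, 1, 0), 123, 100),
    (datetime(2016, 2, 1, 6), 123, 200),
    (datetime(2016, 2, 1, 12), 123, 300),
    (datetime(2016, 2, 1, 18), 123, 250),
    (datetime(2016, 2, 1, 0), 456, 400),
    (datetime(2016, 2, 1, 6), 456, 300),
    (datetime(2016, 2, 1, 12), 456, 200),
    (datetime(2016, 2, 1, 18), 456, 299),
    (datetime(2016, 2, 2, 0), 123, 250),
    (datetime(2016, 2, 2, 6), 123, 300),
    (datetime(2016, 2, 2, 12), 123, 400),
    (datetime(2016, 2, 2, 18), 123, 450),
    (datetime(2016, 2, 2, 0), 456, 299),
    (datetime(2016, 2, 2, 6), 456, 200),
    (datetime(2016, 2, 2, 12), 456, 100),
    (datetime(2016, 2, 2, 18), 456, 0),
]


def day_key(row):
    """The DISTINCT ON key: (account_id, calendar day of entry_date)."""
    entry_date, account_id, _balance = row
    return (account_id, entry_date.date())


def last_balance_per_day(rows):
    """Hypothetical helper: keep the newest row per (account_id, day)."""
    # ORDER BY ..., entry_date DESC: newest rows first.
    ordered = sorted(rows, key=lambda r: r[0], reverse=True)
    # ORDER BY account_id, day: the sort is stable, so rows that share
    # a key stay newest-first.
    ordered.sort(key=day_key)
    # DISTINCT ON: take the first row encountered for each key.
    return [next(group) for _key, group in groupby(ordered, key=day_key)]


if __name__ == "__main__":
    for entry_date, account_id, balance in last_balance_per_day(ROWS):
        print(entry_date, account_id, balance)
```

The rows come back ordered by account and then day, matching the ORDER BY in the query above.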
You can use ROW_NUMBER with PARTITION BY:
SELECT entry_date, account_id, balance
FROM (
SELECT entry_date, account_id, balance,
ROW_NUMBER() OVER (PARTITION BY account_id, entry_date::date
ORDER BY entry_date DESC) AS rn
FROM ledger
-- half-open range so that all of 2016-02-02 is included, not just midnight
WHERE entry_date >= '2016-02-01'::timestamp
AND entry_date < '2016-02-03'::timestamp) AS t
WHERE t.rn = 1;
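PARTITION BY creates one slice per account_id per day, because entry_date is also used in the same clause after being cast to a date. Each slice is ordered by entry_date descending, so ROW_NUMBER = 1 corresponds to the last record of that day.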
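If you want to try the partitioning idea without a Postgres server, a self-contained sketch using Python's built-in sqlite3 module is shown below. Several things here are assumptions rather than part of the answer: it runs on SQLite (which needs version 3.25 or newer for window functions), it stores timestamps as text, and it uses SQLite's date() in place of the entry_date::date cast. The name run_demo is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question, with timestamps stored as text
# (an assumption made for SQLite; the original column is timestamp(3)).
ROWS = [
    ("2016-02-01 00:00:00", 123, 100),
    ("2016-02-01 06:00:00", 123, 200),
    ("2016-02-01 12:00:00", 123, 300),
    ("2016-02-01 18:00:00", 123, 250),
    ("2016-02-01 00:00:00", 456, 400),
    ("2016-02-01 06:00:00", 456, 300),
    ("2016-02-01 12:00:00", 456, 200),
    ("2016-02-01 18:00:00", 456, 299),
    ("2016-02-02 00:00:00", 123, 250),
    ("2016-02-02 06:00:00", 123, 300),
    ("2016-02-02 12:00:00", 123, 400),
    ("2016-02-02 18:00:00", 123, 450),
    ("2016-02-02 00:00:00", 456, 299),
    ("2016-02-02 06:00:00", 456, 200),
    ("2016-02-02 12:00:00", 456, 100),
    ("2016-02-02 18:00:00", 456, 0),
]

# SQLite dialect: date(entry_date) stands in for entry_date::date.
QUERY = """
SELECT entry_date, account_id, balance
FROM (
    SELECT entry_date, account_id, balance,
           ROW_NUMBER() OVER (PARTITION BY account_id, date(entry_date)
                              ORDER BY entry_date DESC) AS rn
    FROM ledger
    WHERE entry_date >= '2016-02-01' AND entry_date < '2016-02-03'
) AS t
WHERE rn = 1
ORDER BY entry_date, account_id
"""


def run_demo():
    """Hypothetical helper: load the sample rows and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE ledger (entry_date TEXT, account_id INT, balance INT)"
        )
        conn.executemany("INSERT INTO ledger VALUES (?, ?, ?)", ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

Each (account_id, day) slice contributes exactly the row numbered 1, so the four rows returned are the 18:00 entries from the desired result in the question.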