How do I count occurrences with conditions in PostgreSQL?
I'm working in PostgreSQL. Suppose I have this person table:
| id | time | name | log_type |
----------------------------------------------------
| 1 | 2022-04-25 07:49:58.0 | Brian | Rejection 1 |
| 2 | 2022-04-25 07:49:58.0 | Brian | Rejection 2 |
| 3 | 2022-04-27 13:05:51.0 | Fredd | Rejection 1 |
| 4 | 2022-05-01 02:13:44.0 | Janet | Rejection 1 |
| 5 | 2022-05-01 03:45:06.0 | Janet | Rejection 2 |
| 6 | 2022-05-01 08:01:34.0 | Peter | Approval |
| 7 | 2022-05-01 12:12:53.0 | Frank | Rejection 2 |
| 8 | 2022-05-02 01:26:38.0 | Frank | Approval |
Note: there are 2 rejection types, Rejection 1 and Rejection 2.
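I want to query the number of rejections and the number of approvals for each name. However, if there are 2 rejections at the same time for the same name, as in the first two rows of the example, they should be counted only once.
To add to that: for the same name, there can be one of each rejection type at the same time, but there cannot be two rejections of the same type at the same time.
This is the result I expect: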
| name | approvals | rejections |
----------------------------------
| Brian | 0 | 1 |
| Fredd | 0 | 1 |
| Janet | 0 | 2 |
| Peter | 1 | 0 |
| Frank | 1 | 1 |
The closest I could get is:
SELECT
name,
COALESCE(SUM(CASE WHEN log_type = 'Approval' THEN 1 ELSE 0 END), 0) approvals,
COALESCE(SUM(CASE WHEN log_type = 'Rejection 1' OR log_type = 'Rejection 2' THEN 1 ELSE 0 END), 0) rejections
FROM
person
GROUP BY
name
The problem with this is that it counts two rejections with the same time and name as 2 instead of 1.
Use ROW_NUMBER to remove the duplicates, then find the counts with a simple counting query:
SELECT
name,
COUNT(*) FILTER (WHERE log_type = 'Approval') approvals,
COUNT(*) FILTER (WHERE log_type LIKE 'Rejection%') rejections
FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY time, name, SUBSTRING(log_type FROM '\w+')) rn
FROM person
) t
WHERE rn = 1
GROUP BY name;
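To see what the deduplication step does before the outer count runs, it can help to look at the row numbers themselves. The following is a small sketch, not part of the original answer. It inlines four of the sample rows in a CTE named person so that it runs on its own (an assumption, with time typed as timestamp), and it exposes the partition key as a hypothetical column called kind.

```sql
-- Sketch only: the CTE stands in for the real person table (assumed timestamp column).
WITH person (id, time, name, log_type) AS (
    VALUES
        (1, TIMESTAMP '2022-04-25 07:49:58', 'Brian', 'Rejection 1'),
        (2, TIMESTAMP '2022-04-25 07:49:58', 'Brian', 'Rejection 2'),
        (4, TIMESTAMP '2022-05-01 02:13:44', 'Janet', 'Rejection 1'),
        (5, TIMESTAMP '2022-05-01 03:45:06', 'Janet', 'Rejection 2')
)
SELECT
    id,
    name,
    time,
    -- First run of word characters: 'Rejection' for both rejection types.
    SUBSTRING(log_type FROM '\w+') AS kind,
    -- Rows that share (time, name, kind) are numbered 1, 2, ... within their partition.
    ROW_NUMBER() OVER (
        PARTITION BY time, name, SUBSTRING(log_type FROM '\w+')
    ) AS rn
FROM person
ORDER BY id;
```

Brian's two rows fall into the same partition, so one of them gets rn = 2 and is dropped by WHERE rn = 1. Which of the two keeps rn = 1 is not determined, because the window has no ORDER BY, but that does not affect the count. Janet's rows have different times, so each gets rn = 1 and both are counted.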
If the rejection values of log_type always have the form 'Rejection X', you can use DISTINCT inside COUNT() to count the distinct time values:
SELECT name,
COUNT(CASE WHEN log_type = 'Approval' THEN 1 END) approvals,
COUNT(DISTINCT CASE WHEN log_type IN ('Rejection 1', 'Rejection 2') THEN time END) rejections
FROM person
GROUP BY name;
See the demo.
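The same idea can also be written with PostgreSQL's FILTER clause instead of a CASE expression inside the aggregate. This is a sketch under assumptions rather than the answerer's query: the CTE named person inlines the sample data from the question so the snippet runs on its own, and time is assumed to be a timestamp.

```sql
-- Sketch only: the CTE stands in for the real person table (assumed timestamp column).
WITH person (id, time, name, log_type) AS (
    VALUES
        (1, TIMESTAMP '2022-04-25 07:49:58', 'Brian', 'Rejection 1'),
        (2, TIMESTAMP '2022-04-25 07:49:58', 'Brian', 'Rejection 2'),
        (3, TIMESTAMP '2022-04-27 13:05:51', 'Fredd', 'Rejection 1'),
        (4, TIMESTAMP '2022-05-01 02:13:44', 'Janet', 'Rejection 1'),
        (5, TIMESTAMP '2022-05-01 03:45:06', 'Janet', 'Rejection 2'),
        (6, TIMESTAMP '2022-05-01 08:01:34', 'Peter', 'Approval'),
        (7, TIMESTAMP '2022-05-01 12:12:53', 'Frank', 'Rejection 2'),
        (8, TIMESTAMP '2022-05-02 01:26:38', 'Frank', 'Approval')
)
SELECT
    name,
    -- Every approval row counts.
    COUNT(*) FILTER (WHERE log_type = 'Approval') AS approvals,
    -- Rejections that share a timestamp for the same name count once.
    COUNT(DISTINCT time) FILTER (WHERE log_type LIKE 'Rejection%') AS rejections
FROM person
GROUP BY name
ORDER BY name;
```

With this data the query returns 1 rejection for Brian and 2 for Janet, which matches the result the question asks for. FILTER keeps the condition separate from the counted expression, which some readers find easier to scan than a nested CASE.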
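We can pick out the date in a CASE expression and then use COUNT(DISTINCT ...), which ignores null values.
I have included a first query that shows the intermediate result, and counts with and without DISTINCT, to show what it is doing. I have grouped the 2 rejection types using the test LEFT(log_type,6) = 'Reject'.
I would suggest rounding the time, so that 2 rejections that occur close together are treated as duplicates. With a timestamp column, the current query treats events that differ by 1 second as different rejections. Note that the demo table below declares time as date, so the time of day is discarded and rejections on the same day are counted once (Janet shows 1 distinct rejection rather than 2).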
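One possible way to coarsen the time is date_trunc, which truncates rather than rounds. The sketch below assumes time is a timestamp and uses a one-minute granularity. Both the granularity and the three sample rows are hypothetical, chosen only to show two rejections one second apart collapsing into one.

```sql
-- Sketch only: hypothetical rows, assumed timestamp column, assumed 1-minute window.
WITH person (id, time, name, log_type) AS (
    VALUES
        (1, TIMESTAMP '2022-04-25 07:49:58', 'Brian', 'Rejection 1'),
        (2, TIMESTAMP '2022-04-25 07:49:59', 'Brian', 'Rejection 2'),
        (3, TIMESTAMP '2022-04-25 09:10:00', 'Brian', 'Rejection 1')
)
SELECT
    name,
    -- Truncate to the minute so that 07:49:58 and 07:49:59 become the same value.
    COUNT(DISTINCT CASE
        WHEN LEFT(log_type, 6) = 'Reject' THEN date_trunc('minute', time)
    END) AS rejections
FROM person
GROUP BY name;
```

Here Brian ends up with 2 rejections instead of 3. Truncation still separates events that straddle a minute boundary (for example 07:49:59 and 07:50:00), so the granularity needs to be chosen with the real data in mind.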
create table person(
id int,
time date,
name varchar(20),
log_type varchar(20));
insert into person values
( 1,'2022-04-25 07:49:58.0','Brian','Rejection 1'),
( 2,'2022-04-25 07:49:58.0','Brian','Rejection 2'),
( 3,'2022-04-27 13:05:51.0','Fredd','Rejection 1'),
( 4,'2022-05-01 02:13:44.0','Janet','Rejection 1'),
( 5,'2022-05-01 03:45:06.0','Janet','Rejection 2'),
( 6,'2022-05-01 08:01:34.0','Peter','Approval'),
( 7,'2022-05-01 12:12:53.0','Frank','Rejection 2'),
( 8,'2022-05-02 01:26:38.0','Frank','Approval');
8 rows affected
SELECT
name,
CASE WHEN LEFT(log_type,6) = 'Reject' THEN time END R,
CASE WHEN log_type = 'Approval' THEN time END A
FROM person;
name | r | a
:---- | :--------- | :---------
Brian | 2022-04-25 | null
Brian | 2022-04-25 | null
Fredd | 2022-04-27 | null
Janet | 2022-05-01 | null
Janet | 2022-05-01 | null
Peter | null | 2022-05-01
Frank | 2022-05-01 | null
Frank | null | 2022-05-02
SELECT
name,
COUNT(CASE WHEN LEFT(log_type,6) = 'Reject' THEN time END) all_rejections,
COUNT(CASE WHEN log_type = 'Approval' THEN time END) all_approvals,
COUNT(DISTINCT CASE WHEN LEFT(log_type,6) = 'Reject' THEN time END) distinct_rejections,
COUNT(DISTINCT CASE WHEN log_type = 'Approval' THEN time END) distinct_approvals
FROM person
GROUP BY name;
name | all_rejections | all_approvals | distinct_rejections | distinct_approvals
:---- | -------------: | ------------: | ------------------: | -----------------:
Brian | 2 | 0 | 1 | 0
Frank | 1 | 1 | 1 | 1
Fredd | 1 | 0 | 1 | 0
Janet | 2 | 0 | 1 | 0
Peter | 0 | 1 | 0 | 1
db<>fiddle here