SQL get counts of unique values

Question

How can I get the different "count" columns for this in-memory database example using sqlite3? I am using version 3.27.2.

Example database

CREATE TABLE events (
    id1, 
    id2, 
    id3, 
    PRIMARY KEY (id1, id2)
);

INSERT INTO events (id1, id2, id3)
VALUES 
   (1,1,99),
   (1,2,99),
   (1,3,52),
   (2,1,6),
   (2,2,7),
   (2,3,8)
;

.mode columns
.header on
SELECT * FROM events;

Desired printout

Partial success: the following works for the first two new columns.

SELECT id1, count(id3) AS total_count, count(DISTINCT id3) AS unique_count
FROM events
GROUP BY id1;

What is the best way to get the last column? The following returns "Error: no such column: total_count":

SELECT id1, count(id3) AS total_count, count(DISTINCT id3) AS unique_count, (total_count - unique_count) AS repeated_count
FROM events
GROUP BY id1;

Answer 1

Maybe try a CTE. I have not verified the syntax, but judging from the SQLite documentation it looks like a valid option.

WITH X AS
(
SELECT id1, count(id3) AS total_count, count(DISTINCT id3) AS unique_count
FROM events
GROUP BY id1
)
SELECT id1, total_count, unique_count, (total_count - unique_count) AS repeated_count
FROM X;
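
Since the answer says the syntax was not verified, here is a small check, assuming Python's standard sqlite3 module as a stand-in for the sqlite3 shell. The variable names (conn, cte_query, cte_rows) are only illustrative. Note that the CTE body must not contain a semicolon before the closing parenthesis.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id1, id2, id3, PRIMARY KEY (id1, id2));
INSERT INTO events (id1, id2, id3) VALUES
    (1,1,99), (1,2,99), (1,3,52),
    (2,1,6),  (2,2,7),  (2,3,8);
""")

# The CTE computes both counts once; the outer query can then
# refer to them by name, which a single SELECT list cannot do.
cte_query = """
WITH x AS (
    SELECT id1, count(id3) AS total_count, count(DISTINCT id3) AS unique_count
    FROM events
    GROUP BY id1
)
SELECT id1, total_count, unique_count,
       (total_count - unique_count) AS repeated_count
FROM x
ORDER BY id1
"""
cte_rows = conn.execute(cte_query).fetchall()
for row in cte_rows:
    print(row)
# (1, 3, 2, 1)
# (2, 3, 3, 0)
```

The subtraction gives the number of surplus rows, which matches the number of repeated id3 values only when no value occurs more than twice.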

Answer 2

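It's not that easy :-)

In

(1,1,99), (1,2,99), (1,3,52)

there is one repeated ID (99).

In

(1,1,99), (1,2,99), (1,3,52), (1,4,99)

there is still one repeated ID (again 99).

In

(1,1,99), (1,2,99), (1,3,52), (1,4,52)

there are two repeated IDs (52 and 99).

When you aggregate by id1 only, you lose that knowledge. You see how many rows there are and how many distinct id3 values there are, but not which of those id3 values are repeated. This means you need an intermediate step: a pre-aggregation before the final aggregation.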

select
  id1,
  sum(cnt) as total_count,  -- count(*) here would count distinct id3 values, not rows
  count(distinct id3) as unique_count,
  count(case when cnt > 1 then 1 end) as repeated_count
from
(
  select id1, id3, count(*) as cnt
  from events
  group by id1, id3
) pre_aggregated
group by id1
order by id1;
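
To make the three cases above concrete, here is a rough sketch, again assuming Python's sqlite3 module. The helper repeated_counts is a hypothetical name made up for this illustration. It compares the plain subtraction (total minus unique) with the pre-aggregated count for a single id1.

```python
import sqlite3

def repeated_counts(id3_values):
    """Return (total - unique, number of id3 values that occur more than once)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id1, id2, id3, PRIMARY KEY (id1, id2))")
    # All rows share id1 = 1; id2 is just a running number.
    conn.executemany(
        "INSERT INTO events (id1, id2, id3) VALUES (1, ?, ?)",
        list(enumerate(id3_values, start=1)),
    )
    by_subtraction = conn.execute(
        "SELECT count(id3) - count(DISTINCT id3) FROM events"
    ).fetchone()[0]
    by_pre_aggregation = conn.execute("""
        SELECT count(CASE WHEN cnt > 1 THEN 1 END)
        FROM (SELECT id3, count(*) AS cnt FROM events GROUP BY id3) AS pre_aggregated
    """).fetchone()[0]
    conn.close()
    return by_subtraction, by_pre_aggregation

samples = [[99, 99, 52], [99, 99, 52, 99], [99, 99, 52, 52]]
results = [repeated_counts(values) for values in samples]
for values, result in zip(samples, results):
    print(values, result)
# [99, 99, 52] (1, 1)
# [99, 99, 52, 99] (2, 1)
# [99, 99, 52, 52] (2, 2)
```

The second case is where the two approaches disagree: 99 occurs three times, so the subtraction reports 2 although only one ID is repeated.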

Answer 3

If you group by id1, id3 like this:

SELECT id1, id3, COUNT(*) counter
FROM events
GROUP BY id1, id3;

you get the number of rows for each combination of id1, id3:

id1	id3	counter
1	52	1
1	99	2
2	6	1
2	7	1
2	8	1

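Now, all you have to do is:

sum the column counter for each id1 to get the column total_count
count the number of rows for each id1 to get the column unique_count
count the number of rows for each id1 where the column counter is > 1 to get the column repeated_id3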
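
The three steps above can be spelled out by hand as a sanity check. This is only a sketch, assuming Python's sqlite3 module; the names grouped and summary are illustrative. The grouping is done in SQL and the per-id1 bookkeeping in plain Python.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id1, id2, id3, PRIMARY KEY (id1, id2));
INSERT INTO events (id1, id2, id3) VALUES
    (1,1,99), (1,2,99), (1,3,52),
    (2,1,6),  (2,2,7),  (2,3,8);
""")

# One row per (id1, id3) combination with its row count.
grouped = conn.execute(
    "SELECT id1, id3, COUNT(*) AS counter FROM events GROUP BY id1, id3"
).fetchall()

summary = defaultdict(
    lambda: {"total_count": 0, "unique_count": 0, "repeated_id3": 0}
)
for id1, id3, counter in grouped:
    entry = summary[id1]
    entry["total_count"] += counter           # step 1: sum of counter
    entry["unique_count"] += 1                # step 2: number of grouped rows
    entry["repeated_id3"] += int(counter > 1) # step 3: rows with counter > 1

for id1 in sorted(summary):
    print(id1, summary[id1])
# 1 {'total_count': 3, 'unique_count': 2, 'repeated_id3': 1}
# 2 {'total_count': 3, 'unique_count': 3, 'repeated_id3': 0}
```
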

You can do this with the SUM() and COUNT() window functions:

SELECT DISTINCT id1, 
       SUM(COUNT(*)) OVER (PARTITION BY id1) AS total_count, 
       COUNT(*) OVER (PARTITION BY id1) AS unique_count,
       SUM(COUNT(*) > 1) OVER (PARTITION BY id1) repeated_id3
FROM events
GROUP BY id1, id3;

See the demo.
Result:

id1	total_count	unique_count	repeated_id3
1	3	2	1
2	3	3	0
