Performance of mysql counting rows in a big table

Question

This rather obvious question has almost no reliable answers (I could not find any).

I am doing a simple select from a table with 2 million rows.

select count(id) as total from big_table

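On any machine I have tried this query on, it usually takes at least 5 seconds to complete. This is unacceptable for real-time queries.

The reason I need the exact number of rows is so that I can do precise statistical calculations later on.

Unfortunately, using the last auto-increment value is not an option, because rows also get deleted periodically.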

Answer 1

Do you have an index?

ALTER TABLE big_table ADD INDEX id (id);

You can check whether one exists and try adding this.
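To check before adding anything, something along these lines should work. It is only a sketch that reuses the big_table and id names from the question; the output depends on your schema and MySQL version.

```sql
-- List the indexes that already exist on the table.
show index from big_table;

-- Show which index (if any) the optimizer picks for the count.
explain select count(id) as total from big_table;
```

If id is already the primary key, an extra index on it is unlikely to change much, so it is worth looking at the EXPLAIN output first.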

Answer 2

This is indeed slow when running on the InnoDB engine. As stated in section 14.24 of the MySQL 5.7 Reference Manual, “InnoDB Restrictions and Limitations”, 3rd bullet point:

InnoDB does not keep an internal count of rows in a table because concurrent transactions might “see” different numbers of rows at the same time. Consequently, SELECT COUNT(*) statements only count rows visible to the current transaction.

For information about how InnoDB processes SELECT COUNT(*) statements, refer to the COUNT() description in Section 12.20.1, “Aggregate Function Descriptions”.

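The suggested solution is a counter table. This is a separate table with one row and one column, holding the current record count. It can be kept up to date via triggers. Something like this: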

create table big_table_count (rec_count int default 0);
-- one-shot initialisation:
insert into big_table_count select count(*) from big_table;

create trigger big_insert after insert on big_table
    for each row
    update big_table_count set rec_count = rec_count + 1;

create trigger big_delete after delete on big_table
    for each row
    update big_table_count set rec_count = rec_count - 1;

You can see a fiddle here, where you should change the insert/delete statements in the build section to see the effect on:

select rec_count from big_table_count;

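You could extend this to several tables, either by creating such a table for each of them, or by reserving one row per table in the above counter table. It would then be keyed by a column "table_name".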
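A minimal sketch of the keyed variant could look like this. The table name table_counts and the column sizes are my own assumptions for illustration, and it assumes the earlier big_insert/big_delete triggers have not been created (or have been dropped first).

```sql
-- Hypothetical layout: one counter row per tracked table, keyed by table_name.
create table table_counts (
    table_name varchar(64) not null primary key,
    rec_count  bigint not null default 0
);

-- one-shot initialisation for big_table:
insert into table_counts (table_name, rec_count)
    select 'big_table', count(*) from big_table;

-- Each trigger touches only the row that belongs to its own table.
create trigger big_insert after insert on big_table
    for each row
    update table_counts set rec_count = rec_count + 1
     where table_name = 'big_table';

create trigger big_delete after delete on big_table
    for each row
    update table_counts set rec_count = rec_count - 1
     where table_name = 'big_table';

-- Reading the count is a primary-key lookup:
select rec_count from table_counts where table_name = 'big_table';
```

Every additional table needs its own initialisation row and its own pair of triggers with the matching table_name value.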

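Improving concurrency

The above method does have an impact if you have many concurrent sessions inserting or deleting records, because they need to wait for each other to complete the update of the counter.

A solution is to not let the triggers update the same single record, but to let them insert a new record instead, like this: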

create trigger big_insert after insert on big_table
    for each row
    insert into big_table_count (rec_count) values (1);

create trigger big_delete after delete on big_table
    for each row
    insert into big_table_count (rec_count) values (-1);

The way to get the count then becomes:

select sum(rec_count) from big_table_count;

Then, once in a while (e.g. daily), you should re-initialise the counter table to keep it small:

truncate table big_table_count;
insert into big_table_count select count(*) from big_table;
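If you want to sanity-check that the counter has not drifted, a query like the following puts both numbers side by side. This is just a sketch: it runs the slow count(*) itself, so it is meant for occasional verification and not for the real-time path.

```sql
-- Compare the maintained counter with a full count of the table.
select (select sum(rec_count) from big_table_count) as counted,
       (select count(*) from big_table)             as actual;
```

Note that rows inserted or deleted between the truncate and the re-initialising insert could make the two values differ, so it may be safer to run the re-initialisation at a quiet moment.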
