SQL - is COUNT still fast with nested queries?

Suppose I have the following query:

SELECT message.mid
FROM message
WHERE message.mid <= 100

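As far as I know, if I change the query to the following, it will execute much faster, because the column values are not expanded into a result set.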

SELECT COUNT(message.mid)
FROM message
WHERE message.mid <= 100

But does the following query get the same benefit? Will it still be just as fast?

SELECT COUNT(*)
FROM (
    SELECT message.mid,
           message.something,
           message.something2,
           message.something3
    FROM message
    WHERE message.mid <= 100
) AS A

We can ask MySQL what it will do. This is 5.7.

mysql> explain SELECT COUNT(*) FROM (     SELECT message.mid     FROM message     WHERE message.mid <= 100 ) AS A;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | message | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL |  100 |   100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)


mysql> explain SELECT count(message.mid)     FROM message     WHERE message.mid <= 100;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | message | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL |  100 |   100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

The two plans are identical. MySQL has optimized the subquery away.
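The "1 warning" in the output above is worth a look. In MySQL 5.7, EXPLAIN records the statement as the optimizer rewrote it, and SHOW WARNINGS displays that text. A minimal sketch, assuming the same message table and the alias A from the question:

```sql
-- Run the EXPLAIN first; the rewritten statement is attached to it as a note.
EXPLAIN
SELECT COUNT(*)
FROM (
    SELECT message.mid
    FROM message
    WHERE message.mid <= 100
) AS A;

-- Show the rewritten statement. If the derived table was merged,
-- the text should no longer contain a separate subquery.
SHOW WARNINGS;
```

The exact wording of the note varies by version, so treat it as a diagnostic aid rather than something to parse.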

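Here is an example of what we see when MySQL does not optimize the subquery away. The GROUP BY prevents the derived table from being merged, so it shows up as a separate DERIVED step.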

mysql> explain SELECT * FROM (     SELECT message.mid     FROM message where mid < 100 group by mid) m;
+----+-------------+------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table      | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL   | NULL          | NULL    | NULL    | NULL |   99 |   100.00 | NULL                     |
|  2 | DERIVED     | message    | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL |   99 |   100.00 | Using where; Using index |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+

"Optimizing Derived Tables and View References" in the MySQL manual has examples of how this optimization works.

Example 1:

SELECT * FROM (SELECT * FROM t1) AS derived_t1;

With merging, that query is executed similar to:

SELECT * FROM t1;

That page outlines many other optimization techniques MySQL uses to make subqueries more efficient.
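To see what the plan looks like without merging, the derived_merge flag of the optimizer_switch system variable (available in 5.7) can be turned off for the current session. This is only a sketch for experimenting, assuming the message table from the question; it is not a recommended production setting:

```sql
-- Disable derived-table merging for this session only.
SET SESSION optimizer_switch = 'derived_merge=off';

-- With merging off, the plan would be expected to show a <derived2> row
-- plus a DERIVED row, similar to the GROUP BY example above.
EXPLAIN
SELECT COUNT(*)
FROM (
    SELECT message.mid
    FROM message
    WHERE message.mid <= 100
) AS A;

-- Restore the flag to its default value.
SET SESSION optimizer_switch = 'derived_merge=default';
```

Comparing the two plans side by side makes it easier to tell whether a given nested query is being materialized on your version.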

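COUNT(*) means count the number of rows.

COUNT(x) means count the number of rows where x IS NOT NULL. So it is slightly slower, and it may give a different answer if x can be NULL.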
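A small illustration of the difference, using a hypothetical scratch table (count_demo and its columns are made up for this example, not part of the question):

```sql
-- Hypothetical scratch table with one NULL in column x.
CREATE TABLE count_demo (
    id INT PRIMARY KEY,
    x  INT NULL
);

INSERT INTO count_demo (id, x) VALUES (1, 10), (2, NULL), (3, 30);

SELECT COUNT(*) AS all_rows,    -- counts every row: 3
       COUNT(x) AS non_null_x   -- skips the row where x IS NULL: 2
FROM count_demo;

DROP TABLE count_demo;
```

For a PRIMARY KEY column such as mid in the question, NULL is not allowed, so COUNT(mid) and COUNT(*) return the same number.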

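SELECT mid (as opposed to SELECT COUNT(...)) -- slower and bulkier. It returns every value of mid, not just a single number.

SELECT COUNT(..) FROM ( SELECT ... ) -- much slower (in older MySQL versions), because it has to build a temporary table from the subquery's result. Also, COUNT collects just one simple number, while the subquery collects many rows.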

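If mid is indexed (including as the PRIMARY KEY), then WHERE mid <= 100 is a "range" scan of the index (or table). That is, it touches only some of the rows.

If mid is not indexed, the entire table will be scanned -- hence slower.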
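One way to check which case applies, sketched against the message table from the question. The index name idx_message_mid is hypothetical, and the ALTER TABLE is only relevant if mid is not already indexed:

```sql
-- List the indexes on the table; look for one whose first column is mid.
SHOW INDEX FROM message;

-- Check the access type: "range" means only part of the index is read,
-- "ALL" means the whole table is scanned.
EXPLAIN SELECT COUNT(*) FROM message WHERE mid <= 100;

-- Only if mid has no index: add a secondary index on it.
-- idx_message_mid is a made-up name for illustration.
ALTER TABLE message ADD INDEX idx_message_mid (mid);
```

In the EXPLAIN output shown earlier, type is range and key is PRIMARY, so the question's table already falls in the indexed case.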