SQL - is COUNT still fast with nested queries?
Suppose I have the following query:
SELECT message.mid
FROM message
WHERE message.mid <= 100
As far as I know, changing the query to the following makes it run much faster, because the columns are not expanded.
SELECT COUNT(message.mid)
FROM message
WHERE message.mid <= 100
But does the following query get the same benefit? Will it be just as fast?
SELECT COUNT(*)
FROM (
SELECT message.mid,
message.something,
message.something2,
message.something3
FROM message
WHERE message.mid <= 100
) AS A
We can ask MySQL what it will do. This is MySQL 5.7.
mysql> explain SELECT COUNT(*) FROM ( SELECT message.mid FROM message WHERE message.mid <= 100 ) AS A;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | message | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 100 | 100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
mysql> explain SELECT count(message.mid) FROM message WHERE message.mid <= 100;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | message | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 100 | 100.00 | Using where; Using index |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)
The two plans look the same. MySQL has optimized the subquery away.
Here is an example of what we would see when MySQL does not optimize the subquery.
mysql> explain SELECT * FROM ( SELECT message.mid FROM message where mid < 100 group by mid) m;
+----+-------------+------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 99 | 100.00 | NULL |
| 2 | DERIVED | message | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 99 | 100.00 | Using where; Using index |
+----+-------------+------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
"Optimizing Derived Tables and View References" in the MySQL manual has examples of how this optimization works.
Example 1:
SELECT * FROM (SELECT * FROM t1) AS derived_t1;
With merging, that query is executed similar to:
SELECT * FROM t1;
That page outlines many other optimization tricks MySQL uses to make subqueries more efficient.
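Merging a derived table is only valid because the nested and flat forms count the same rows. As a rough sanity check, here is a minimal sketch that runs the question's three query shapes and compares the results. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in, so it says nothing about MySQL's plan or timing; the table contents and the `something` column values are made up for illustration.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (mid INTEGER PRIMARY KEY, something TEXT)")
conn.executemany(
    "INSERT INTO message (mid, something) VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 251)],
)

# Flat count, as in the question's second query.
flat = conn.execute(
    "SELECT COUNT(mid) FROM message WHERE mid <= 100"
).fetchone()[0]

# Count over a derived table, as in the question's third query.
nested = conn.execute(
    "SELECT COUNT(*) FROM ("
    " SELECT mid, something FROM message WHERE mid <= 100"
    ") AS a"
).fetchone()[0]

# The first query returns one row per match instead of a single number.
rows = conn.execute("SELECT mid FROM message WHERE mid <= 100").fetchall()

print(flat, nested, len(rows))
```

All three numbers should agree; what differs between the forms is how much work and data transfer the server needs to produce them.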
COUNT(*) means count the rows. COUNT(x) means count the rows where x IS NOT NULL, so it is slightly slower and may give a different answer.
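The "different answer" case only shows up when the counted column contains NULLs. A minimal sketch of that, again using Python's sqlite3 as a stand-in for MySQL (the NULL-skipping behavior of COUNT(x) is standard SQL); the nullable `something` column and its values are assumptions for illustration:

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (mid INTEGER PRIMARY KEY, something TEXT)")
conn.executemany(
    "INSERT INTO message (mid, something) VALUES (?, ?)",
    [(1, "a"), (2, None), (3, "c"), (4, None)],
)

# COUNT(*) counts rows; COUNT(col) skips rows where col IS NULL.
count_star, count_mid, count_something = conn.execute(
    "SELECT COUNT(*), COUNT(mid), COUNT(something) FROM message"
).fetchone()

print(count_star, count_mid, count_something)  # 4 4 2
```

Because mid is the primary key and can never be NULL, COUNT(mid) and COUNT(*) agree here; only the nullable column gives a smaller count.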
SELECT mid (as opposed to SELECT COUNT(...)) is slower and bulkier: it returns every value of mid rather than a single number.
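SELECT COUNT(..) FROM ( SELECT ... ) is much slower (in older MySQL versions), because it has to build a temporary table from the subquery's result. Also, COUNT collects just a single number, while the subquery collects a lot of rows.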
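If mid is indexed (including as the PRIMARY KEY), then WHERE mid <= 100 is a "range" scan of the index (or table). That is, it touches only some of the rows.
If mid is not indexed, the whole table is scanned, so the query is slower.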
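The indexed versus unindexed difference can be observed by asking the engine for its plan. The sketch below does this with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; this is a stand-in, not MySQL's EXPLAIN, and the exact plan text varies between SQLite versions. The `other_id` column and the `plan` helper are hypothetical names introduced for the comparison.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; other_id is a
# hypothetical unindexed column added only for comparison.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (mid INTEGER PRIMARY KEY, other_id INTEGER)")
conn.executemany(
    "INSERT INTO message (mid, other_id) VALUES (?, ?)",
    [(i, i) for i in range(1, 1001)],
)


def plan(sql):
    """Return the plan's detail text (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(str(row[-1]) for row in rows)


# Range condition on the primary key: the engine can seek into the key.
indexed_plan = plan("SELECT COUNT(*) FROM message WHERE mid <= 100")

# Same condition on an unindexed column: every row must be examined.
unindexed_plan = plan("SELECT COUNT(*) FROM message WHERE other_id <= 100")

print(indexed_plan)
print(unindexed_plan)
```

In SQLite the first plan is reported as a SEARCH using the primary key and the second as a SCAN of the whole table, which corresponds roughly to `type: range` versus `type: ALL` in MySQL's EXPLAIN output.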