MySQL subquery always doing filesort

I have a table gamesplatform_pricehistory with an index on (id_app, country, dateup).

Doing this:

explain select dateup from gamesplatform_pricehistory
    where id_app=1 and country=1
    order by dateup desc limit 1

shows "Using where; Using index".

But with a subquery:

explain select app.id, (select dateup from gamesplatform_pricehistory
                           where id_app=app.id and country=1
                           order by dateup desc limit 1)
      from app where id > 0;

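shows "Using where; Using index; Using filesort".

Here is an sqlfiddle that shows the problem directly: http://sqlfiddle.com/#!2/034bc/1

Benchmark with millions of rows (the table games_platform is the same as app):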

SELECT sql_no_cache thepricehistory.dateup
    FROM games_platform
    LEFT JOIN (SELECT max(dateup) as dateup, id_app
                   FROM gamesplatform_pricehistory
                   WHERE country='229' GROUP BY id_app
              ) thepricehistory
                         ON thepricehistory.id_app =games_platform.id
    WHERE games_platform.id=2 

Execution time: 0.8s

SELECT sql_no_cache ( SELECT dateup FROM gamesplatform_pricehistory
                        WHERE id_app= games_platform.id AND country='229'
                        ORDER BY dateup DESC LIMIT 1
                    ) AS dateup
    FROM games_platform
    WHERE games_platform.id=2 

Execution time: 0.0003s

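Using filesort is not necessarily a bad thing. The name is a bit misleading: although it contains "file", it does not mean the data is written to disk. The sort is done in memory as long as it fits in the sort buffer; MySQL only falls back to temporary files for larger sorts.

From the manual: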

MySQL must do an extra pass to find out how to retrieve the rows in sorted order. The sort is done by going through all rows according to the join type and storing the sort key and pointer to the row for all rows that match the WHERE clause. The keys then are sorted and the rows are retrieved in sorted order. See Section 8.2.1.11, “ORDER BY Optimization”.
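
Whether a particular filesort stayed in memory can be checked from the session's sort counters. A minimal sketch, assuming a session on the same server and the query from the question; the counters are cumulative for the session, so compare the values before and after:

```sql
-- Size of the in-memory sort buffer for this session.
SHOW VARIABLES LIKE 'sort_buffer_size';

-- Read the sort counters before running the query.
SHOW SESSION STATUS LIKE 'Sort%';

SELECT app.id, (SELECT dateup FROM gamesplatform_pricehistory
                 WHERE id_app = app.id AND country = 1
                 ORDER BY dateup DESC LIMIT 1) AS dateup
  FROM app WHERE id > 0;

-- Read them again. If Sort_merge_passes did not increase,
-- the sort fit in the sort buffer and no temporary files were merged.
SHOW SESSION STATUS LIKE 'Sort%';
```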

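You understand why this happens in your query, right? Using this kind of subquery is bad style because it is a dependent subquery: it is executed once for every row in the app table. That is bad. Use a join instead.

Rewritten query:
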
select app.id,
gp.dateup
from app 
join gamesplatform_pricehistory gp on gp.id_app = app.id
where app.id > 0
and gp.country = 1
and gp.dateup = (SELECT MAX(dateup) FROM gamesplatform_pricehistory smgp WHERE smgp.id_app = gp.id_app AND smgp.country = 1)
;

This still uses a dependent subquery, but the EXPLAIN looks much better:

| id |        select_type | table |  type | possible_keys |     key | key_len |                        ref | rows |                    Extra |
|----|--------------------|-------|-------|---------------|---------|---------|----------------------------|------|--------------------------|
|  1 |            PRIMARY |   app | index |       PRIMARY | PRIMARY |       4 |                     (null) |    2 | Using where; Using index |
|  1 |            PRIMARY |    gp |   ref |        id_app |  id_app |       5 |    db_2_034bc.app.id,const |    1 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY |  smgp |   ref |        id_app |  id_app |       5 | db_2_034bc.gp.id_app,const |    1 |              Using index |

Another way to rewrite it is:

select app.id,
gp.dateup
from app 
LEFT join 
(SELECT id_app, MAX(dateup) AS dateup 
 FROM gamesplatform_pricehistory
 WHERE country = 1
 GROUP BY id_app
)gp on gp.id_app = app.id
where app.id > 0
;

The EXPLAIN looks even better:

| id | select_type |                      table |  type | possible_keys |     key | key_len |    ref | rows |                    Extra |
|----|-------------|----------------------------|-------|---------------|---------|---------|--------|------|--------------------------|
|  1 |     PRIMARY |                        app | index |       PRIMARY | PRIMARY |       4 | (null) |    2 | Using where; Using index |
|  1 |     PRIMARY |                 <derived2> |   ALL |        (null) |  (null) |  (null) | (null) |    2 |                          |
|  2 |     DERIVED | gamesplatform_pricehistory | index |        (null) |  id_app |      13 | (null) |    2 | Using where; Using index |

And here is a version with no dependent subquery at all:

select app.id,
gp.dateup
from app 
left join gamesplatform_pricehistory gp on gp.id_app = app.id and gp.country = 1
left join gamesplatform_pricehistory gp2 on gp2.id_app = app.id and gp2.country = 1 and gp.dateup < gp2.dateup
where app.id > 0
and gp2.dateup is null
;

It works like this: when gp.dateup is the maximum, there is no gp2 row with a later dateup, so gp2.dateup is NULL.
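
The self-join trick can be seen on a tiny example. This is only a sketch: the table ph_demo and its two rows are made up for illustration, and the column types are assumptions.

```sql
-- Hypothetical demo table with two price-history rows for one app.
-- (A regular table is used because MySQL cannot reference a TEMPORARY
-- table twice in the same query.)
CREATE TABLE ph_demo (
  id_app  INT      NOT NULL,
  country INT      NOT NULL,
  dateup  DATETIME NOT NULL
);
INSERT INTO ph_demo VALUES
  (1, 1, '2014-01-01 00:00:00'),
  (1, 1, '2014-02-01 00:00:00');

-- Without the IS NULL filter: each row is paired with every later row.
-- The 2014-01-01 row pairs with 2014-02-01; the 2014-02-01 row has no
-- later row, so its gp2 columns are NULL.
SELECT gp.dateup AS gp_dateup, gp2.dateup AS gp2_dateup
  FROM ph_demo gp
  LEFT JOIN ph_demo gp2
    ON gp2.id_app = gp.id_app
   AND gp2.country = gp.country
   AND gp.dateup < gp2.dateup;

-- With the filter only the row that has no later row survives: 2014-02-01.
SELECT gp.dateup
  FROM ph_demo gp
  LEFT JOIN ph_demo gp2
    ON gp2.id_app = gp.id_app
   AND gp2.country = gp.country
   AND gp.dateup < gp2.dateup
 WHERE gp2.dateup IS NULL;

DROP TABLE ph_demo;
```

Note that if two rows share the same maximum dateup, both survive the filter, so this form can return more than one row per app.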

Please provide SHOW CREATE TABLE.

One of these composite indexes may help:

INDEX(id_app, country, dateup)
INDEX(country, id_app, dateup)
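
Since the first index already exists according to the question, trying the second one might look like the sketch below. The index name idx_country_app_dateup is hypothetical, and whether the optimizer actually picks the new index should be confirmed with EXPLAIN on real data.

```sql
-- Add the alternative composite index (name is an assumption).
ALTER TABLE gamesplatform_pricehistory
  ADD INDEX idx_country_app_dateup (country, id_app, dateup);

-- Check which index the grouped query now uses.
EXPLAIN SELECT id_app, MAX(dateup) AS dateup
  FROM gamesplatform_pricehistory
 WHERE country = 1
 GROUP BY id_app;
```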