MySQL row count

Question

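I have a very large table (~1 000 000 rows) and a complex query with unions, joins and WHERE clauses (the user can select different ORDER BY columns and directions). I need the row count for pagination. If I run the query without counting rows, it completes quickly. What is the fastest way to implement pagination? I tried EXPLAIN SELECT and SHOW TABLE STATUS to get an approximate row count, but the result differs greatly from the real row count. My query looks like this (simplified):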

SELECT * FROM (
    (   
        SELECT * FROM table_1 
        LEFT JOIN `table_a` ON table_1.record_id = table_a.id 
        LEFT JOIN `table_b` ON table_a.id = table_b.record_id 
        WHERE table_1.a > 10 AND table_a.b < 500 AND table_b.c = 1 
        ORDER BY x ASC
        LIMIT 0, 10        
    )      
    UNION        
    (   
        SELECT * FROM table_2
        LEFT JOIN `table_a` ON table_2.record_id = table_a.id 
        LEFT JOIN `table_b` ON table_a.id = table_b.record_id 
        WHERE table_2.d < 10 AND table_a.e > 500 AND table_b.f = 1 
        ORDER BY x ASC
        LIMIT 0, 10                                 
    )                 
) tbl ORDER BY x ASC LIMIT 0, 10

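Without the limits, the query returns about 100 000 rows. How can I get this approximate count in the fastest way? An example of my production query looks like this: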

SELECT SQL_CALC_FOUND_ROWS * FROM (
    (   
        SELECT
          articles_log.id AS log_id, articles_log.source_table,
          articles_log.record_id AS id, articles_log.dat AS view_dat, 
          articles_log.lang AS view_lang, '1' AS view_count, '1' AS unique_view_count,
          articles_log.user_agent, articles_log.ref, articles_log.ip,
          articles_log.ses_id, articles_log.bot, articles_log.source_type, articles_log.link,   
          articles_log.user_country, articles_log.user_platform,
          articles_log.user_os, articles_log.user_browser,                                          
          `contents`.dat AS source_dat, `contents_trans`.header, `contents_trans`.custom_text 
        FROM articles_log 
        INNER JOIN `contents` ON articles_log.record_id = `contents`.id
                             AND articles_log.source_table = 'contents'  
        INNER JOIN `contents_trans` ON `contents`.id = `contents_trans`.record_id
                                   AND `contents_trans`.lang='lv' 
        WHERE articles_log.dat > 0
          AND articles_log.dat >= 1488319200
          AND articles_log.dat <= 1489355999
          AND articles_log.bot = '0'
          AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404'
               OR articles_log.source_table <> 'contents') 
    )      
    UNION        
    (   
        SELECT
          articles_log.id AS log_id, articles_log.source_table,
          articles_log.record_id AS id, articles_log.dat AS view_dat, 
          articles_log.lang AS view_lang, '1' AS view_count, '1' AS unique_view_count,
          articles_log.user_agent, articles_log.ref, articles_log.ip,
          articles_log.ses_id, articles_log.bot,
          articles_log.source_type, articles_log.link,   
          articles_log.user_country, articles_log.user_platform,
          articles_log.user_os, articles_log.user_browser,                                          
        `news`.dat AS source_dat, `news_trans`.header, `news_trans`.custom_text 
        FROM articles_log 
        INNER JOIN `news` ON articles_log.record_id = `news`.id
                         AND articles_log.source_table = 'news'  
        INNER JOIN `news_trans` ON `news`.id = `news_trans`.record_id
                         AND `news_trans`.lang='lv' 
        WHERE articles_log.dat > 0 
          AND articles_log.dat >= 1488319200
          AND articles_log.dat <= 1489355999
          AND articles_log.bot = '0'
          AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404'
               OR articles_log.source_table <> 'contents') 
    )      
) tbl ORDER BY view_dat ASC LIMIT 0, 10

Thanks a lot!

Answer 1

You can get the count while running the query by using SQL_CALC_FOUND_ROWS, as explained in the documentation:

select SQL_CALC_FOUND_ROWS *
. . .

Then run:

select FOUND_ROWS()

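However, the first statement still has to generate all the data. And with your query as written, you would get a count of at most 20 rows; I don't think it sees past the LIMIT in the subqueries.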
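Applied to the simplified query from the question, the two statements might look like the sketch below. It assumes the inner LIMIT clauses are dropped so the count is not capped, and it selects only record_id and x (assumed here to be an unambiguous column) instead of *; both are illustrative choices, not part of the original query. Note also that SQL_CALC_FOUND_ROWS and FOUND_ROWS() are deprecated as of MySQL 8.0.17.

```sql
-- Sketch only: table and column names follow the simplified query in the question.
-- Assumption: the inner LIMITs are removed so the count is not capped at 20 rows.
SELECT SQL_CALC_FOUND_ROWS * FROM (
    (
        SELECT table_1.record_id, x FROM table_1
        LEFT JOIN `table_a` ON table_1.record_id = table_a.id
        LEFT JOIN `table_b` ON table_a.id = table_b.record_id
        WHERE table_1.a > 10 AND table_a.b < 500 AND table_b.c = 1
    )
    UNION
    (
        SELECT table_2.record_id, x FROM table_2
        LEFT JOIN `table_a` ON table_2.record_id = table_a.id
        LEFT JOIN `table_b` ON table_a.id = table_b.record_id
        WHERE table_2.d < 10 AND table_a.e > 500 AND table_b.f = 1
    )
) tbl ORDER BY x ASC LIMIT 0, 10;

-- Must run on the same connection, directly after the statement above.
-- Returns the number of rows the outer query would have produced without its LIMIT.
SELECT FOUND_ROWS() AS total_rows;
```

The server still has to build the full UNION result to produce the count, so this is convenient rather than fast.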

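Given the structure of your query and what you want to do, I would first think about optimizing the query. For instance, is UNION really needed (removing duplicates incurs overhead)? As pointed out in a comment, your joins are really inner joins disguised as outer joins. Indexes might also improve performance.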
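To illustrate the remark about the joins: the WHERE clause filters on table_a.b and table_b.c, and a comparison against NULL is never true, so the NULL-extended rows that a LEFT JOIN would add are discarded anyway. A sketch of the first branch of the simplified query written with explicit inner joins, which should return the same rows:

```sql
-- Sketch: first branch of the simplified query from the question.
-- The WHERE conditions on table_a and table_b reject rows where those
-- columns are NULL, so LEFT JOIN and INNER JOIN give the same result here.
SELECT * FROM table_1
INNER JOIN `table_a` ON table_1.record_id = table_a.id
INNER JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_1.a > 10 AND table_a.b < 500 AND table_b.c = 1
ORDER BY x ASC
LIMIT 0, 10;
```

Writing the joins as INNER JOIN makes the intent explicit and leaves the optimizer free to pick the join order.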

You might want to ask another question, providing sample data and desired results, to get advice on such issues.

Answer 2

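If you can use UNION ALL instead of UNION (which is shorthand for UNION DISTINCT), in other words, if you don't need to remove duplicates, you can try adding up the counts of the two subqueries: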

SELECT 
    (   
        SELECT COUNT(*) FROM table_1 
        LEFT JOIN `table_a` ON table_1.record_id = table_a.id 
        LEFT JOIN `table_b` ON table_a.id = table_b.record_id 
        WHERE table_1.a > 10 AND table_a.b < 500 AND table_b.c = 1      
    )      
    +
    (   
        SELECT COUNT(*) FROM table_2
        LEFT JOIN `table_a` ON table_2.record_id = table_a.id 
        LEFT JOIN `table_b` ON table_a.id = table_b.record_id 
        WHERE table_2.d < 10 AND table_a.e > 500 AND table_b.f = 1                              
    ) 
    AS cnt

Without ORDER BY and UNION, the engine might not need to create a huge temporary table.

Update

For your original query, try the following:

Select only COUNT(*).
Remove OR articles_log.source_table <> 'contents' from the first part (contents), because we know it is never true.
Remove AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404' OR articles_log.source_table <> 'contents') from the second part (news), because we know it is always true, since OR articles_log.source_table <> 'contents' is always true there.
Remove the joins with contents and news. You can join the *_trans tables directly using record_id.
Remove articles_log.dat > 0, because it is redundant with articles_log.dat >= 1488319200.

Resulting query:

SELECT (   
    SELECT COUNT(*)
    FROM articles_log 
    INNER JOIN `contents_trans`
      ON `contents_trans`.record_id = articles_log.record_id
      AND `contents_trans`.lang='lv' 
    WHERE articles_log.bot = '0'
      AND articles_log.dat >= 1488319200
      AND articles_log.dat <= 1489355999
      AND articles_log.record_id NOT LIKE '%\_404'
      AND articles_log.record_id <> '404'
) + (   
    SELECT COUNT(*)
    FROM articles_log 
    INNER JOIN `news_trans`
      ON  `news_trans`.record_id = articles_log.record_id
      AND `news_trans`.lang='lv' 
    WHERE articles_log.bot = '0'
      AND articles_log.dat >= 1488319200
      AND articles_log.dat <= 1489355999
) AS cnt
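With the count obtained separately, the page itself can be fetched without SQL_CALC_FOUND_ROWS. The sketch below mirrors the simplified count query above (including its assumptions about the data) and pushes ORDER BY and LIMIT into each branch, as the simplified query in the question does. The shortened column list, the log_id tie-breaker, and the example page (offset 20, 10 rows per page) are illustrative choices, not part of the original query.

```sql
-- Sketch: fetch page 3 (offset 20, page size 10).
-- Each branch only needs its first offset + page_size = 30 rows;
-- the outer query then merges both branches and applies the real offset.
SELECT * FROM (
    (
        SELECT articles_log.id AS log_id, articles_log.dat AS view_dat,
               `contents_trans`.header
        FROM articles_log
        INNER JOIN `contents_trans`
          ON `contents_trans`.record_id = articles_log.record_id
          AND `contents_trans`.lang = 'lv'
        WHERE articles_log.bot = '0'
          AND articles_log.dat >= 1488319200
          AND articles_log.dat <= 1489355999
          AND articles_log.record_id NOT LIKE '%\_404'
          AND articles_log.record_id <> '404'
        ORDER BY view_dat ASC, log_id ASC
        LIMIT 30
    )
    UNION ALL
    (
        SELECT articles_log.id AS log_id, articles_log.dat AS view_dat,
               `news_trans`.header
        FROM articles_log
        INNER JOIN `news_trans`
          ON `news_trans`.record_id = articles_log.record_id
          AND `news_trans`.lang = 'lv'
        WHERE articles_log.bot = '0'
          AND articles_log.dat >= 1488319200
          AND articles_log.dat <= 1489355999
        ORDER BY view_dat ASC, log_id ASC
        LIMIT 30
    )
) tbl ORDER BY view_dat ASC, log_id ASC LIMIT 20, 10;
```

The log_id tie-breaker keeps the ordering stable between pages when several rows share the same view_dat. For deep pages the per-branch LIMIT grows with the offset, so the benefit shrinks.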

Try the following index combinations:

articles_log(bot, dat, record_id)
contents_trans(lang, record_id)
news_trans(lang, record_id)

or

contents_trans(lang, record_id)
news_trans(lang, record_id)
articles_log(record_id, bot, dat)

Which combination works better depends on the data.
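The index lists above could be created roughly as follows. The index names (idx_*) are made up for this sketch, and both articles_log indexes can coexist while testing, so that EXPLAIN shows which one the optimizer prefers.

```sql
-- Shared by both combinations: look up translations by language and record.
CREATE INDEX idx_ct_lang_rec ON contents_trans (lang, record_id);
CREATE INDEX idx_nt_lang_rec ON news_trans (lang, record_id);

-- Combination 1: intended for reading articles_log first,
-- filtering on bot and the dat range.
CREATE INDEX idx_log_bot_dat_rec ON articles_log (bot, dat, record_id);

-- Combination 2: intended for looking up articles_log rows by record_id
-- when a *_trans table is read first.
CREATE INDEX idx_log_rec_bot_dat ON articles_log (record_id, bot, dat);

-- Check which index and join order the optimizer picks for one branch.
EXPLAIN
SELECT COUNT(*)
FROM articles_log
INNER JOIN `news_trans`
  ON  `news_trans`.record_id = articles_log.record_id
  AND `news_trans`.lang = 'lv'
WHERE articles_log.bot = '0'
  AND articles_log.dat >= 1488319200
  AND articles_log.dat <= 1489355999;
```

Once one combination proves faster on real data, the unused articles_log index can be dropped to avoid extra write overhead.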

I might be wrong on one or more points, because I don't know your data and business logic. If so, try to adjust the others.


mysql

count

query-optimization