MySQL row count
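I have a very large table (~1,000,000 rows) and a complex query with unions, joins, and WHERE clauses (the user can choose different ORDER BY columns and directions). I need the row count for pagination. If I run the query without counting rows, it finishes quickly. What is the fastest way to implement pagination?
I tried using EXPLAIN SELECT and SHOW TABLE STATUS to get an approximate row count, but it differs greatly from the actual row count.
My query looks like this (simplified):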
SELECT * FROM (
(
SELECT * FROM table_1
LEFT JOIN `table_a` ON table_1.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_1.a > 10 AND table_a.b < 500 AND table_b.c = 1
ORDER BY x ASC
LIMIT 0, 10
)
UNION
(
SELECT * FROM table_2
LEFT JOIN `table_a` ON table_2.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_2.d < 10 AND table_a.e > 500 AND table_b.f = 1
ORDER BY x ASC
LIMIT 0, 10
)
) tbl ORDER BY x ASC LIMIT 0, 10
Without the LIMIT, the query returns about 100,000 rows. How can I get this approximate count in the fastest way?
My production query looks like this:
SELECT SQL_CALC_FOUND_ROWS * FROM (
(
SELECT
articles_log.id AS log_id, articles_log.source_table,
articles_log.record_id AS id, articles_log.dat AS view_dat,
articles_log.lang AS view_lang, '1' AS view_count, '1' AS unique_view_count,
articles_log.user_agent, articles_log.ref, articles_log.ip,
articles_log.ses_id, articles_log.bot, articles_log.source_type, articles_log.link,
articles_log.user_country, articles_log.user_platform,
articles_log.user_os, articles_log.user_browser,
`contents`.dat AS source_dat, `contents_trans`.header, `contents_trans`.custom_text
FROM articles_log
INNER JOIN `contents` ON articles_log.record_id = `contents`.id
AND articles_log.source_table = 'contents'
INNER JOIN `contents_trans` ON `contents`.id = `contents_trans`.record_id
AND `contents_trans`.lang='lv'
WHERE articles_log.dat > 0
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
AND articles_log.bot = '0'
AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404'
OR articles_log.source_table <> 'contents')
)
UNION
(
SELECT
articles_log.id AS log_id, articles_log.source_table,
articles_log.record_id AS id, articles_log.dat AS view_dat,
articles_log.lang AS view_lang, '1' AS view_count, '1' AS unique_view_count,
articles_log.user_agent, articles_log.ref, articles_log.ip,
articles_log.ses_id, articles_log.bot,
articles_log.source_type, articles_log.link,
articles_log.user_country, articles_log.user_platform,
articles_log.user_os, articles_log.user_browser,
`news`.dat AS source_dat, `news_trans`.header, `news_trans`.custom_text
FROM articles_log
INNER JOIN `news` ON articles_log.record_id = `news`.id
AND articles_log.source_table = 'news'
INNER JOIN `news_trans` ON `news`.id = `news_trans`.record_id
AND `news_trans`.lang='lv'
WHERE articles_log.dat > 0
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
AND articles_log.bot = '0'
AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404'
OR articles_log.source_table <> 'contents')
)
) tbl ORDER BY view_dat ASC LIMIT 0, 10
Thanks a lot!
You can have the count calculated while the query runs by using SQL_CALC_FOUND_ROWS, as described in the documentation:
select SQL_CALC_FOUND_ROWS *
. . .
Then run:
select FOUND_ROWS()
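However, the first query still has to generate all of the data, and with the simplified query you would get a count of at most 20 rows: I don't think SQL_CALC_FOUND_ROWS looks past the LIMIT inside the subqueries.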
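Given the structure of your query and what you want to do, I would first think about optimizing the query. For example, is the UNION really needed (removing duplicates incurs overhead)? As pointed out in a comment, your joins are really inner joins disguised as outer joins, because the WHERE clause filters on columns of the joined tables. Indexes might also improve performance.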
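To illustrate the point about disguised inner joins, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the tiny table_1 / table_a contents are made up for the demonstration. A WHERE condition on a column of the outer-joined table rejects the NULL-extended rows, so the LEFT JOIN returns the same rows as an INNER JOIN.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (record_id INTEGER, a INTEGER);
    CREATE TABLE table_a (id INTEGER, b INTEGER);
    INSERT INTO table_1 VALUES (1, 20), (2, 30), (3, 40);
    -- record_id 3 has no matching row in table_a
    INSERT INTO table_a VALUES (1, 100), (2, 900);
""")

# LEFT JOIN, but the WHERE clause tests a column of the joined table.
left_join_rows = conn.execute("""
    SELECT table_1.record_id FROM table_1
    LEFT JOIN table_a ON table_1.record_id = table_a.id
    WHERE table_1.a > 10 AND table_a.b < 500
    ORDER BY table_1.record_id
""").fetchall()

# The same query written as an INNER JOIN.
inner_join_rows = conn.execute("""
    SELECT table_1.record_id FROM table_1
    INNER JOIN table_a ON table_1.record_id = table_a.id
    WHERE table_1.a > 10 AND table_a.b < 500
    ORDER BY table_1.record_id
""").fetchall()

print(left_join_rows, inner_join_rows)
```

For record_id 3 the LEFT JOIN produces a row with table_a.b set to NULL, and NULL < 500 is not true, so that row is dropped by the WHERE clause anyway.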
You might want to ask another question, providing sample data and desired results, to get advice on such issues.
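If you can use UNION ALL instead of UNION (which is a shortcut for UNION DISTINCT), in other words, if you don't need to remove duplicates, you can try adding up the counts of the two subqueries: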
SELECT
(
SELECT COUNT(*) FROM table_1
LEFT JOIN `table_a` ON table_1.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_1.a > 10 AND table_a.b < 500 AND table_b.c = 1
)
+
(
SELECT COUNT(*) FROM table_2
LEFT JOIN `table_a` ON table_2.record_id = table_a.id
LEFT JOIN `table_b` ON table_a.id = table_b.record_id
WHERE table_2.d < 10 AND table_a.e > 500 AND table_b.f = 1
)
AS cnt
Without ORDER BY and UNION, the engine probably won't need to create a huge temp table.
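A small sketch of why this only works when duplicates don't need to be removed. It again uses Python's sqlite3 as a stand-in for MySQL, with two made-up single-column tables. The sum of the two counts equals the UNION ALL count, but it overcounts relative to UNION when the two branches share a row.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (x INTEGER);
    CREATE TABLE table_2 (x INTEGER);
    INSERT INTO table_1 VALUES (1), (2), (3);
    INSERT INTO table_2 VALUES (3), (4);  -- the value 3 appears in both tables
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# Adding the two subquery counts, as suggested above.
summed = scalar(
    "SELECT (SELECT COUNT(*) FROM table_1) + (SELECT COUNT(*) FROM table_2) AS cnt"
)
# Counting the rows of the combined result, with and without duplicate removal.
union_all = scalar(
    "SELECT COUNT(*) FROM (SELECT x FROM table_1 UNION ALL SELECT x FROM table_2) AS t"
)
union_distinct = scalar(
    "SELECT COUNT(*) FROM (SELECT x FROM table_1 UNION SELECT x FROM table_2) AS t"
)

print(summed, union_all, union_distinct)
```

If the two branches can never produce identical rows (for example because each branch selects a different source_table value), the summed count is also the exact UNION count.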
Update
For your original query, try the following:
- Select only count(*).
- Remove OR articles_log.source_table <> 'contents' from the first part (contents), because we know it is never true there.
- Remove AND (articles_log.record_id NOT LIKE '%\_404' AND articles_log.record_id <> '404' OR articles_log.source_table <> 'contents') from the second part (news), because we know it is always true there, since OR articles_log.source_table <> 'contents' is always true.
- Remove the joins with contents and news. You can join the *_trans tables directly using record_id.
- Remove articles_log.dat > 0, because it is redundant with articles_log.dat >= 1488319200.
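The two predicate simplifications in the list can be checked with a small sketch. This is plain Python rather than SQL: the function names and sample record_id values are made up, it assumes record_id and source_table are never NULL, and it treats LIKE '%\_404' as "ends with the literal text _404".

```python
def original_filter(record_id, source_table):
    # In SQL, AND binds tighter than OR, so the clause reads (A AND B) OR C.
    return (
        not record_id.endswith("_404") and record_id != "404"
    ) or source_table != "contents"

def contents_filter(record_id):
    # First part: source_table is always 'contents', so the OR branch is never true.
    return not record_id.endswith("_404") and record_id != "404"

def news_filter(record_id):
    # Second part: source_table is always 'news', so the OR branch is always true.
    return True

# Hypothetical sample values covering each case of the record_id conditions.
record_ids = ["12", "404", "12_404", "x404"]

mismatches = []
for record_id in record_ids:
    if original_filter(record_id, "contents") != contents_filter(record_id):
        mismatches.append(("contents", record_id))
    if original_filter(record_id, "news") != news_filter(record_id):
        mismatches.append(("news", record_id))

print(mismatches)
```

An empty list means the simplified conditions agree with the original one in both parts for these samples.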
结果查询:
SELECT (
SELECT COUNT(*)
FROM articles_log
INNER JOIN `contents_trans`
ON `contents_trans`.record_id = articles_log.record_id
AND `contents_trans`.lang='lv'
WHERE articles_log.source_table = 'contents' -- kept from the original join condition
AND articles_log.bot = '0'
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
AND articles_log.record_id NOT LIKE '%\_404'
AND articles_log.record_id <> '404'
) + (
SELECT COUNT(*)
FROM articles_log
INNER JOIN `news_trans`
ON `news_trans`.record_id = articles_log.record_id
AND `news_trans`.lang='lv'
WHERE articles_log.source_table = 'news' -- kept from the original join condition
AND articles_log.bot = '0'
AND articles_log.dat >= 1488319200
AND articles_log.dat <= 1489355999
) AS cnt
Try the following index combinations:
articles_log(bot, dat, record_id)
contents_trans(lang, record_id)
news_trans(lang, record_id)
or
contents_trans(lang, record_id)
news_trans(lang, record_id)
articles_log(record_id, bot, dat)
Which combination is better depends on the data.
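As a rough sketch of how the first combination could be created and inspected: the index names and column types below are invented for illustration, and Python's sqlite3 again stands in for MySQL. The CREATE INDEX statements use syntax that MySQL also accepts; in MySQL you would compare the two combinations with EXPLAIN rather than SQLite's EXPLAIN QUERY PLAN.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; column types are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles_log (id INTEGER PRIMARY KEY, record_id TEXT, bot TEXT, dat INTEGER);
    CREATE TABLE contents_trans (record_id TEXT, lang TEXT);
    CREATE TABLE news_trans (record_id TEXT, lang TEXT);
""")

# First index combination from the answer; the index names are made up.
index_ddl = [
    "CREATE INDEX idx_log_bot_dat_record ON articles_log (bot, dat, record_id)",
    "CREATE INDEX idx_contents_trans_lang_record ON contents_trans (lang, record_id)",
    "CREATE INDEX idx_news_trans_lang_record ON news_trans (lang, record_id)",
]
for ddl in index_ddl:
    conn.execute(ddl)

# List the indexes that now exist.
created = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)

# Show how the engine plans the first count subquery (output varies by version).
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT COUNT(*)
    FROM articles_log
    INNER JOIN contents_trans
        ON contents_trans.record_id = articles_log.record_id
        AND contents_trans.lang = 'lv'
    WHERE articles_log.bot = '0'
      AND articles_log.dat >= 1488319200
      AND articles_log.dat <= 1489355999
""").fetchall()
for row in plan:
    print(row)
```

To try the second combination, swap the articles_log index for one on (record_id, bot, dat) and compare the plans and timings on real data.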
I might be wrong on one or more points, because I don't know your data and business logic. If so, try to adjust the other ones.