Poor UNION ALL performance in MySQL
I have a database containing rows like the following:
+------------+---------+------------+-------+
| continent | country | city | value |
+------------+---------+------------+-------+
| Asia | China | Beijing | 3 |
| ... | ... | ... | ... |
| N. America | USA | D.C | 7 |
| .... | .... | .... | .... |
To generate a treemap visualization, I need to transform it into a table with the following shape:
+-----+------------+-------+
| uid | parent-uid | value |
+-----+------------+-------+
In this case, Asia is the "parent" of China, which in turn is the "parent" of Beijing. So for these three, you would have something like:
+---------+--------+-----+
| Beijing | China | 3 |
| China | Asia | ... |
| Asia | global | ... |
+---------+--------+-----+
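The "value" for China needs to be the sum of all of its children's values. Likewise, the value for Asia needs to be the sum of all of its children's values.
To do this entirely in SQL, I created the following three queries and combined them with UNION ALL: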
# City-level:
SELECT
CONCAT(continent, "-", country, "-", city) as uid,
CONCAT(continent, "-", country) as parentuid,
value
FROM
table
UNION ALL
# Country-level
SELECT
CONCAT(continent, "-", country) as uid,
continent as parentuid,
SUM(value) as value
FROM
table
GROUP BY
continent, country
UNION ALL
# Continent-level
SELECT
continent as uid,
"global" as parentuid,
SUM(value) as value
FROM
table
GROUP BY
continent
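Each individual query completes within a few milliseconds: the city-level, country-level, and continent-level queries all return results in < 0.01 seconds.
When I UNION them together, it suddenly takes 8 seconds to get results!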
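I tried Googling the problem, but everything just says "Use UNION ALL instead of UNION" (which I already am).
I thought it might not have enough RAM to build the temporary result table, so it is thrashing the disk, but I don't know how to increase the memory limit. I tried increasing innodb_buffer_pool_size to 1GB (1073741824), but that didn't help.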
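For the first select, selecting all rows in the table and then fetching only the first row is very fast, but fetching all of the rows will take a lot of time (by default, MySQL Workbench appends limit 1000 to the end of the query).
To test whether fetching all rows takes more time, try the following query and tell us how long it takes: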
select * from (
SELECT
CONCAT(continent, "-", country, "-", city) as uid,
CONCAT(continent, "-", country) as parentuid,
value
FROM
table
) t1;
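If it takes nearly 8 seconds, then there is nothing wrong with your UNION. To improve performance, you will have to limit the number of rows with a WHERE clause.
Hope this helps.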
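One way to separate query time from row-transfer time, as a rough sketch: wrap the combined query in a derived table and count its rows, so the server has to evaluate the whole UNION ALL while only a single row travels back to the client. The alias combined is made up for illustration, `table` is the placeholder name from the question (backquoted here because TABLE is a reserved word), and grouping the country level by continent as well is an assumption about the intended result. The Created_tmp% status counters can then hint at whether temporary tables were written to disk.

```sql
-- Force full evaluation of the combined query, but return a single row.
SELECT COUNT(*)
FROM (
  -- City level
  SELECT CONCAT(continent, '-', country, '-', city) AS uid,
         CONCAT(continent, '-', country) AS parentuid,
         value
  FROM `table`
  UNION ALL
  -- Country level (assumption: group by continent and country)
  SELECT CONCAT(continent, '-', country), continent, SUM(value)
  FROM `table`
  GROUP BY continent, country
  UNION ALL
  -- Continent level
  SELECT continent, 'global', SUM(value)
  FROM `table`
  GROUP BY continent
) AS combined;

-- Session counters for temporary tables, including those created on disk.
SHOW SESSION STATUS LIKE 'Created_tmp%';
```

If the count comes back quickly while the full result set is slow in the client, the time is probably going into fetching and displaying rows rather than into the UNION ALL itself.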
I guess my question is: what's wrong with WITH ROLLUP?
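SELECT
CONCAT_WS('-',continent,country,city) as uid,
-- Rollup rows carry NULL in the rolled-up columns, so pick the parent by level
CASE
WHEN country IS NULL THEN 'global'
WHEN city IS NULL THEN continent
ELSE CONCAT_WS('-',continent,country)
END as parentuid,
value
FROM (
SELECT continent, country, city, SUM(value) as value
FROM table
GROUP BY continent, country, city WITH ROLLUP
) t1
WHERE t1.continent IS NOT NULL;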
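I may not be calling CONCAT_WS() correctly, especially if you have cities or countries named '', but I think this will be faster. The WHERE clause is only there to remove the grand-total row.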
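If the data can contain genuine NULL values in country or city, a NULL alone cannot tell a subtotal row from a real row. A possible variant, sketched here under the assumption of MySQL 8.0.1 or later (where the GROUPING() function is available), asks the server directly which columns were rolled up. As before, `table` is the placeholder name from the question, backquoted because TABLE is a reserved word.

```sql
SELECT
  CONCAT_WS('-', continent, country, city) AS uid,
  CASE
    WHEN GROUPING(country) = 1 THEN 'global'   -- continent-level subtotal
    WHEN GROUPING(city) = 1 THEN continent     -- country-level subtotal
    ELSE CONCAT_WS('-', continent, country)    -- ordinary city row
  END AS parentuid,
  SUM(value) AS value
FROM `table`
GROUP BY continent, country, city WITH ROLLUP
-- Drop the grand-total row, where even continent is rolled up.
HAVING GROUPING(continent) = 0;
```

GROUPING(col) returns 1 only on rows where col was aggregated away by ROLLUP, so a city stored as NULL still counts as a city-level row.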
Here is the WITH ROLLUP example from the MySQL documentation, to help explain what it does:
mysql> SELECT year, country, product, SUM(profit)
-> FROM sales
-> GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+-------------+
| year | country | product | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | Finland | NULL | 1600 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
| 2000 | India | NULL | 1350 |
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2000 | USA | NULL | 1575 |
| 2000 | NULL | NULL | 4525 |
| 2001 | Finland | Phone | 10 |
| 2001 | Finland | NULL | 10 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 2700 |
| 2001 | USA | TV | 250 |
| 2001 | USA | NULL | 3000 |
| 2001 | NULL | NULL | 3010 |
| NULL | NULL | NULL | 7535 |
+------+---------+------------+-------------+