MySQL 查询子查询花费的时间比应该的长
MySQL Query with Subqueries taking longer than it should
我一直在努力找出查询速度变慢的原因。该查询最初是一个 DELETE 查询,但我一直在使用 SELECT * 来自
这是有问题的查询
SELECT * FROM table1
where table1.id IN (
#Per friends suggestion I wrapped the subquery in a subquery (yo dawg) to "cache" it, it works on other queries, but not on this time.
SELECT id FROM (
(
SELECT id FROM (
SELECT table1.id FROM table1
LEFT JOIN table2 ON table2.id = table1.salesperson_id
LEFT JOIN table3 ON table3.id = table2.user_id
LEFT JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 25 OR table4.parent_id =25
LIMIT 500
) AS ids )
) AS moreIds
)
有问题的 table 是 16 场演出。
运行 所针对的服务器足够强大,不会成为瓶颈。
字段id,salesperson_id和type都是indexed.Checked它的5倍。
子查询本身运行速度极快。子查询:
SELECT id FROM (
SELECT table1.id FROM table1
LEFT JOIN table2 ON table2.id = table1.salesperson_id
LEFT JOIN table3 ON table3.id = table2.user_id
LEFT JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 25 OR table4.parent_id =25
LIMIT 500
)
在进程列表中,查询卡在 "SENDING DATA" 的状态。但是Workbench表示查询还是运行.
这是查询的解释 SELECT
'1', 'PRIMARY', 'table1', 'index', NULL, 'SALES_FK_ON_SALES_STATE', '5', NULL, '36688459', 'Using where; Using index'
'2', 'DEPENDENT SUBQUERY', '<derived3>', 'ALL', NULL, NULL, NULL, NULL, '500', 'Using where'
'3', 'DERIVED', '<derived4>', 'ALL', NULL, NULL, NULL, NULL, '500', ''
'4', 'DERIVED', 'table4', 'index_merge', 'PRIMARY,IDX_9F61CEFC727ACA70', 'PRIMARY,IDX_9F61CEFC727ACA70', '4,5', NULL, '67', 'Using union(PRIMARY,IDX_9F61CEFC727ACA70); Using where; Using index'
'4', 'DERIVED', 'table3', 'ref', 'PRIMARY,IDX_C077730FFFA0C224', 'IDX_C077730FFFA0C224', '5', 'hugeDb.table4.id', '381', 'Using where; Using index'
'4', 'DERIVED', 'table2', 'ref', 'PRIMARY,UNIQ_36E3BDB1A76ED395', 'UNIQ_36E3BDB1A76ED395', '5', 'hugeDb.table3.id', '1', 'Using where; Using index'
'4', 'DERIVED', 'table1', 'ref', 'SALESPERSON,SALES_FK_ON_SALES_STATE', 'SALES_FK_ON_SALES_STATE', '5', 'hugeDb.table2.id', '115', 'Using where'
这里是 SHOW CREATE TABLES
CREATE TABLE `table4` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`logo_file_id` int(11) DEFAULT NULL,
`contact_address_id` int(11) DEFAULT NULL,
`billing_address_id` int(11) DEFAULT NULL,
`parent_id` int(11) DEFAULT NULL,
`name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`url` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`fax` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`contact_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
`license_number` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`list_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`email` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`routing_address_id` int(11) DEFAULT NULL,
`billed_separately` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_9F61CEFCA7E1931C` (`logo_file_id`),
KEY `IDX_9F61CEFC320EF6E2` (`contact_address_id`),
KEY `IDX_9F61CEFC79D0C0E4` (`billing_address_id`),
KEY `IDX_9F61CEFC727ACA70` (`parent_id`),
KEY `IDX_9F61CEFC40F0487C` (`routing_address_id`),
-- CONSTRAINT `FK_9F61CEFC320EF6E2` FOREIGN KEY (`contact_address_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_9F61CEFC79D0C0E4` FOREIGN KEY (`billing_address_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_9F61CEFCA7E1931C` FOREIGN KEY (`logo_file_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_9F61CEFCE346079F` FOREIGN KEY (`routing_address_id`) REFERENCES `other_irrelevant_table` (`id`),
CONSTRAINT `FK_9F61CEFC727ACA70` FOREIGN KEY (`parent_id`) REFERENCES `table4` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=750 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `table3` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`office_id` int(11) DEFAULT NULL,
`user_id` int(11) DEFAULT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
`profile_id` int(11) DEFAULT NULL,
`deleted` tinyint(1) NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_C077730FFFA0C224` (`office_id`),
KEY `IDX_C077730FA76ED395` (`user_id`),
KEY `IDX_C077730FCCFA12B8` (`profile_id`),
-- CONSTRAINT `FK_C077730FA76ED395` FOREIGN KEY (`user_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_C077730FCCFA12B8` FOREIGN KEY (`profile_id`) REFERENCES `other_irrelevant_table` (`id`),
CONSTRAINT `FK_C077730FFFA0C224` FOREIGN KEY (`office_id`) REFERENCES `table4` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=382425 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `table2` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_36E3BDB1A76ED395` (`user_id`),
CONSTRAINT `FK_36E3BDB1A76ED395` FOREIGN KEY (`user_id`) REFERENCES `table3` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=174049 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `table1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`salesperson_id` int(11) DEFAULT NULL,
`count_active_contracts` int(11) NOT NULL,
`average_initial_price` decimal(12,2) NOT NULL,
`average_contract_value` decimal(12,2) NOT NULL,
`total_sold` int(11) NOT NULL,
`total_active` int(11) NOT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
`type` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`services_scheduled_today` int(11) NOT NULL,
`services_scheduled_week` int(11) NOT NULL,
`services_scheduled_month` int(11) NOT NULL,
`services_scheduled_summer` int(11) NOT NULL,
`serviced_today` int(11) NOT NULL,
`serviced_this_week` int(11) NOT NULL,
`serviced_this_month` int(11) NOT NULL,
`serviced_this_summer` int(11) NOT NULL,
`autopay_account_percentage` decimal(3,2) NOT NULL,
`value_per_door` decimal(12,2) NOT NULL,
`total_paid` int(11) NOT NULL,
`sales_status_summary` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`total_serviced` int(11) NOT NULL,
`services_scheduled_year` int(11) NOT NULL,
`serviced_this_year` int(11) NOT NULL,
`services_scheduled_yesterday` int(11) NOT NULL,
`serviced_yesterday` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `SALESPERSON` (`type`),
KEY `SALES_FK_ON_SALES_STATE` (`salesperson_id`),
CONSTRAINT `SALES_FK_ON_SALES_STATE` FOREIGN KEY (`salesperson_id`) REFERENCES `table2` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=181662521 DEFAULT CHARSET=utf8;
当您在解释中看到 "DEPENDENT SUBQUERY" 时,它没有缓存子查询的结果。它多次重新执行子查询(对最外层查询中的每个不同值执行一次)。我在解释中看到您的最外层查询正在检查 3600 万行。所以这大概是运行次子查询很多,很多次。
此处记录:https://dev.mysql.com/doc/refman/5.7/en/explain-output.html
For DEPENDENT SUBQUERY, the subquery is re-evaluated only once for each set of different values of the variables from its outer context. For UNCACHEABLE SUBQUERY, the subquery is re-evaluated for each row of the outer context.
避免这种情况的一种方法是使用子查询作为 派生的 table 而不是作为 IN()
谓词的参数。这是像您一样进行半连接的更好方法。
SELECT ... FROM TableA
WHERE TableA.id IN (SELECT id FROM ...)
应该等同于:
SELECT ... FROM TableA
JOIN (SELECT DISTINCT id FROM ...) AS TableB
ON TableA.id = TableB.id
在子查询中使用 DISTINCT 意味着子查询返回的每个 id 只有一行,因此如果有多个匹配项,连接不会乘以 TableA 中的行数。这使它成为 半连接。
以下应该做得更好:
SELECT table1.*
FROM table1
JOIN (
SELECT table1.id FROM table1
LEFT JOIN table2 ON table2.id = table1.salesperson_id
LEFT JOIN table3 ON table3.id = table2.user_id
LEFT JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = 'Snapshot'
AND table4.id = 25 OR table4.parent_id =25
LIMIT 500
) AS ids ON table1.id = ids.id;
您也可以尝试摆脱 index_merge。您得到它是因为您在 table4 中对两个不同的索引列使用了 OR
。它使用两个索引,然后将它们结合起来。有时† 显式使用两个子查询的联合会更好,而不是依赖 index_merge.
SELECT table1.*
FROM table1
JOIN (
SELECT table1.id FROM table1
JOIN table2 ON table2.id = table1.salesperson_id
JOIN table3 ON table3.id = table2.user_id
JOIN (
SELECT id FROM table4 WHERE id=25
UNION
SELECT id FROM table4 WHERE parent_id=25
) AS t4 ON table3.office_id = t4.id
WHERE table1.type = 'Snapshot'
LIMIT 500
) AS ids ON table1.id = ids.id;
您还不必要地使用了 LEFT JOIN,因此我将其替换为 JOIN。 MySQL 优化器会自动将其转换为内部联接,但我认为您应该研究 LEFT JOIN 的含义,并在需要时使用它。
† 我说 "sometimes" 因为哪种方法最好可能取决于您的数据,因此您应该两种方式都进行测试。
由于我需要使用连接来限制删除查询(这在 mysql 中是不可能的),所以还有另一种选择。这绝不是更好的(Can't beat Bill's answer)。
但它确实有效,而且查询非常快,尽管不是很灵活。因为它有一个可以拉取的最小行数,对于 38M 行 table 是 575k(不知道为什么)
但是这里是:
SELECT COUNT(*) FROM table1
JOIN table2 ON table2.id = table1.salesperson_id
JOIN table3 ON table3.id = table2.user_id
JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 113 OR table4.parent_id =113
AND RAND()<=0.001;
但比尔的回答对每个人来说应该绰绰有余。
P.S。我将在 Where 子句中询问有关 RAND() 的问题,并会在此处 post link 。也许它会在 2025 年帮助一些绝望的开发者。
你被嵌套等冲昏头脑了
SELECT table1.*
FROM
(
SELECT table1.id
FROM table1
JOIN table2 ON table2.id = table1.salesperson_id
JOIN table3 ON table3.id = table2.user_id
JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 25
OR table4.parent_id =25
LIMIT 500
) AS ids
JOIN table1 USING(id)
一些讨论:
最好找到 500 个 id 并将它们放入 tmp table,而不是绕过 table1.*
的所有列。因此带有 LIMIT 500
.
的子查询
Bill 的 UNION
似乎是不必要的,因为优化器决定使用 "index merge union"。这可能是我第二次看到使用该功能!
IN ( SELECT ... )
可能永远不会比等效的 JOIN
或 EXISTS
快,以适当者为准。 (JOIN
适合您的情况。)
对于table4
,你在logo_file_id
中有一个完美的'自然PK',为什么不摆脱id
并将其提升为PK? (类似 table2
。)
Aarrgghh...按照我之前的建议,你可以绕过table2
!
table1
有 181M 行? INT
总是 4 个字节。你有很多听起来像小计数器的列;考虑使用 TINYINT UNSIGNED
(1 个字节;范围:0..255)或 SMALLINT UNSIGNED
。这应该会显着缩小 table 的大小,从而在一定程度上加快可缓存性和 table 的使用。
我一直在努力找出查询速度变慢的原因。该查询最初是一个 DELETE 查询,但我一直在使用 SELECT * 来自
这是有问题的查询
SELECT * FROM table1
where table1.id IN (
#Per friends suggestion I wrapped the subquery in a subquery (yo dawg) to "cache" it, it works on other queries, but not on this time.
SELECT id FROM (
(
SELECT id FROM (
SELECT table1.id FROM table1
LEFT JOIN table2 ON table2.id = table1.salesperson_id
LEFT JOIN table3 ON table3.id = table2.user_id
LEFT JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 25 OR table4.parent_id =25
LIMIT 500
) AS ids )
) AS moreIds
)
有问题的 table 是 16 场演出。
运行 所针对的服务器足够强大,不会成为瓶颈。
字段id,salesperson_id和type都是indexed.Checked它的5倍。
子查询本身运行速度极快。子查询:
SELECT id FROM (
SELECT table1.id FROM table1
LEFT JOIN table2 ON table2.id = table1.salesperson_id
LEFT JOIN table3 ON table3.id = table2.user_id
LEFT JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 25 OR table4.parent_id =25
LIMIT 500
)
在进程列表中,查询卡在 "SENDING DATA" 的状态。但是Workbench表示查询还是运行.
这是查询的解释 SELECT
'1', 'PRIMARY', 'table1', 'index', NULL, 'SALES_FK_ON_SALES_STATE', '5', NULL, '36688459', 'Using where; Using index'
'2', 'DEPENDENT SUBQUERY', '<derived3>', 'ALL', NULL, NULL, NULL, NULL, '500', 'Using where'
'3', 'DERIVED', '<derived4>', 'ALL', NULL, NULL, NULL, NULL, '500', ''
'4', 'DERIVED', 'table4', 'index_merge', 'PRIMARY,IDX_9F61CEFC727ACA70', 'PRIMARY,IDX_9F61CEFC727ACA70', '4,5', NULL, '67', 'Using union(PRIMARY,IDX_9F61CEFC727ACA70); Using where; Using index'
'4', 'DERIVED', 'table3', 'ref', 'PRIMARY,IDX_C077730FFFA0C224', 'IDX_C077730FFFA0C224', '5', 'hugeDb.table4.id', '381', 'Using where; Using index'
'4', 'DERIVED', 'table2', 'ref', 'PRIMARY,UNIQ_36E3BDB1A76ED395', 'UNIQ_36E3BDB1A76ED395', '5', 'hugeDb.table3.id', '1', 'Using where; Using index'
'4', 'DERIVED', 'table1', 'ref', 'SALESPERSON,SALES_FK_ON_SALES_STATE', 'SALES_FK_ON_SALES_STATE', '5', 'hugeDb.table2.id', '115', 'Using where'
这里是 SHOW CREATE TABLES
CREATE TABLE `table4` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`logo_file_id` int(11) DEFAULT NULL,
`contact_address_id` int(11) DEFAULT NULL,
`billing_address_id` int(11) DEFAULT NULL,
`parent_id` int(11) DEFAULT NULL,
`name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`url` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`fax` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`contact_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
`license_number` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`list_name` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`email` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`routing_address_id` int(11) DEFAULT NULL,
`billed_separately` tinyint(1) DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_9F61CEFCA7E1931C` (`logo_file_id`),
KEY `IDX_9F61CEFC320EF6E2` (`contact_address_id`),
KEY `IDX_9F61CEFC79D0C0E4` (`billing_address_id`),
KEY `IDX_9F61CEFC727ACA70` (`parent_id`),
KEY `IDX_9F61CEFC40F0487C` (`routing_address_id`),
-- CONSTRAINT `FK_9F61CEFC320EF6E2` FOREIGN KEY (`contact_address_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_9F61CEFC79D0C0E4` FOREIGN KEY (`billing_address_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_9F61CEFCA7E1931C` FOREIGN KEY (`logo_file_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_9F61CEFCE346079F` FOREIGN KEY (`routing_address_id`) REFERENCES `other_irrelevant_table` (`id`),
CONSTRAINT `FK_9F61CEFC727ACA70` FOREIGN KEY (`parent_id`) REFERENCES `table4` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=750 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `table3` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`office_id` int(11) DEFAULT NULL,
`user_id` int(11) DEFAULT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
`profile_id` int(11) DEFAULT NULL,
`deleted` tinyint(1) NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_C077730FFFA0C224` (`office_id`),
KEY `IDX_C077730FA76ED395` (`user_id`),
KEY `IDX_C077730FCCFA12B8` (`profile_id`),
-- CONSTRAINT `FK_C077730FA76ED395` FOREIGN KEY (`user_id`) REFERENCES `other_irrelevant_table` (`id`),
-- CONSTRAINT `FK_C077730FCCFA12B8` FOREIGN KEY (`profile_id`) REFERENCES `other_irrelevant_table` (`id`),
CONSTRAINT `FK_C077730FFFA0C224` FOREIGN KEY (`office_id`) REFERENCES `table4` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=382425 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `table2` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQ_36E3BDB1A76ED395` (`user_id`),
CONSTRAINT `FK_36E3BDB1A76ED395` FOREIGN KEY (`user_id`) REFERENCES `table3` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=174049 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `table1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`salesperson_id` int(11) DEFAULT NULL,
`count_active_contracts` int(11) NOT NULL,
`average_initial_price` decimal(12,2) NOT NULL,
`average_contract_value` decimal(12,2) NOT NULL,
`total_sold` int(11) NOT NULL,
`total_active` int(11) NOT NULL,
`active` tinyint(1) NOT NULL,
`date_modified` datetime DEFAULT NULL,
`date_created` datetime NOT NULL,
`type` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`services_scheduled_today` int(11) NOT NULL,
`services_scheduled_week` int(11) NOT NULL,
`services_scheduled_month` int(11) NOT NULL,
`services_scheduled_summer` int(11) NOT NULL,
`serviced_today` int(11) NOT NULL,
`serviced_this_week` int(11) NOT NULL,
`serviced_this_month` int(11) NOT NULL,
`serviced_this_summer` int(11) NOT NULL,
`autopay_account_percentage` decimal(3,2) NOT NULL,
`value_per_door` decimal(12,2) NOT NULL,
`total_paid` int(11) NOT NULL,
`sales_status_summary` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`total_serviced` int(11) NOT NULL,
`services_scheduled_year` int(11) NOT NULL,
`serviced_this_year` int(11) NOT NULL,
`services_scheduled_yesterday` int(11) NOT NULL,
`serviced_yesterday` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `SALESPERSON` (`type`),
KEY `SALES_FK_ON_SALES_STATE` (`salesperson_id`),
CONSTRAINT `SALES_FK_ON_SALES_STATE` FOREIGN KEY (`salesperson_id`) REFERENCES `table2` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=181662521 DEFAULT CHARSET=utf8;
当您在解释中看到 "DEPENDENT SUBQUERY" 时,它没有缓存子查询的结果。它多次重新执行子查询(对最外层查询中的每个不同值执行一次)。我在解释中看到您的最外层查询正在检查 3600 万行。所以这大概是运行次子查询很多,很多次。
此处记录:https://dev.mysql.com/doc/refman/5.7/en/explain-output.html
For DEPENDENT SUBQUERY, the subquery is re-evaluated only once for each set of different values of the variables from its outer context. For UNCACHEABLE SUBQUERY, the subquery is re-evaluated for each row of the outer context.
避免这种情况的一种方法是使用子查询作为 派生的 table 而不是作为 IN()
谓词的参数。这是像您一样进行半连接的更好方法。
SELECT ... FROM TableA
WHERE TableA.id IN (SELECT id FROM ...)
应该等同于:
SELECT ... FROM TableA
JOIN (SELECT DISTINCT id FROM ...) AS TableB
ON TableA.id = TableB.id
在子查询中使用 DISTINCT 意味着子查询返回的每个 id 只有一行,因此如果有多个匹配项,连接不会乘以 TableA 中的行数。这使它成为 半连接。
以下应该做得更好:
SELECT table1.*
FROM table1
JOIN (
SELECT table1.id FROM table1
LEFT JOIN table2 ON table2.id = table1.salesperson_id
LEFT JOIN table3 ON table3.id = table2.user_id
LEFT JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = 'Snapshot'
AND table4.id = 25 OR table4.parent_id =25
LIMIT 500
) AS ids ON table1.id = ids.id;
您也可以尝试摆脱 index_merge。您得到它是因为您在 table4 中对两个不同的索引列使用了 OR
。它使用两个索引,然后将它们结合起来。有时† 显式使用两个子查询的联合会更好,而不是依赖 index_merge.
SELECT table1.*
FROM table1
JOIN (
SELECT table1.id FROM table1
JOIN table2 ON table2.id = table1.salesperson_id
JOIN table3 ON table3.id = table2.user_id
JOIN (
SELECT id FROM table4 WHERE id=25
UNION
SELECT id FROM table4 WHERE parent_id=25
) AS t4 ON table3.office_id = t4.id
WHERE table1.type = 'Snapshot'
LIMIT 500
) AS ids ON table1.id = ids.id;
您还不必要地使用了 LEFT JOIN,因此我将其替换为 JOIN。 MySQL 优化器会自动将其转换为内部联接,但我认为您应该研究 LEFT JOIN 的含义,并在需要时使用它。
† 我说 "sometimes" 因为哪种方法最好可能取决于您的数据,因此您应该两种方式都进行测试。
由于我需要使用连接来限制删除查询(这在 mysql 中是不可能的),所以还有另一种选择。这绝不是更好的(Can't beat Bill's answer)。
但它确实有效,而且查询非常快,尽管不是很灵活。因为它有一个可以拉取的最小行数,对于 38M 行 table 是 575k(不知道为什么)
但是这里是:
SELECT COUNT(*) FROM table1
JOIN table2 ON table2.id = table1.salesperson_id
JOIN table3 ON table3.id = table2.user_id
JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 113 OR table4.parent_id =113
AND RAND()<=0.001;
但比尔的回答对每个人来说应该绰绰有余。 P.S。我将在 Where 子句中询问有关 RAND() 的问题,并会在此处 post link 。也许它会在 2025 年帮助一些绝望的开发者。
你被嵌套等冲昏头脑了
SELECT table1.*
FROM
(
SELECT table1.id
FROM table1
JOIN table2 ON table2.id = table1.salesperson_id
JOIN table3 ON table3.id = table2.user_id
JOIN table4 ON table3.office_id = table4.id
WHERE table1.type = "Snapshot"
AND table4.id = 25
OR table4.parent_id =25
LIMIT 500
) AS ids
JOIN table1 USING(id)
一些讨论:
最好找到 500 个 id 并将它们放入 tmp table,而不是绕过
table1.*
的所有列。因此带有LIMIT 500
. 的子查询
Bill 的
UNION
似乎是不必要的,因为优化器决定使用 "index merge union"。这可能是我第二次看到使用该功能!IN ( SELECT ... )
可能永远不会比等效的JOIN
或EXISTS
快,以适当者为准。 (JOIN
适合您的情况。)对于
table4
,你在logo_file_id
中有一个完美的'自然PK',为什么不摆脱id
并将其提升为PK? (类似table2
。)Aarrgghh...按照我之前的建议,你可以绕过
table2
!table1
有 181M 行?INT
总是 4 个字节。你有很多听起来像小计数器的列;考虑使用TINYINT UNSIGNED
(1 个字节;范围:0..255)或SMALLINT UNSIGNED
。这应该会显着缩小 table 的大小,从而在一定程度上加快可缓存性和 table 的使用。