Optimize MySQL self-join query
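I have a c_regs table that contains duplicate rows. I have created indexes on the form_number and property_name columns. Unfortunately, this query still takes a long time to complete, especially since I added the t10 and t11 joins. Is there a way to optimize it? Thanks.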
select
ifnull(x.form_datetime,'') reg_date,
ifnull(x.property_value,'') amg_id,
x.form_number,
x.form_name,
x.form_version,
ifnull(t1.property_value,'') first_name,
ifnull(t2.property_value,'') last_name,
ifnull(t3.property_value,'') address,
ifnull(t4.property_value,'') address_2,
ifnull(t5.property_value,'') city,
ifnull(t6.property_value,'') state_code,
ifnull(t7.property_value,'') zip,
ifnull(t8.property_value,'') phone,
ifnull(t9.property_value,'') email,
ifnull(t10.property_value,'') registrant_type,
t11.property_value auth_type_code
from
(select distinct form_datetime, form_number, form_name, form_version, property_value from c_regs where property_name = 'field.frm_personID') as x
inner join (select distinct * from c_regs) as t1 on t1.form_number = x.form_number and t1.property_name = 'field.frm_firstName'
inner join (select distinct * from c_regs) as t2 on t2.form_number = x.form_number and t2.property_name = 'field.frm_lastName'
inner join (select distinct * from c_regs) as t3 on t3.form_number = x.form_number and t3.property_name = 'field.frm_address'
left join (select distinct * from c_regs) as t4 on t4.form_number = x.form_number and t4.property_name = 'field.frm_address2'
inner join (select distinct * from c_regs) as t5 on t5.form_number = x.form_number and t5.property_name = 'field.frm_city'
inner join (select distinct * from c_regs) as t6 on t6.form_number = x.form_number and t6.property_name = 'field.frm_state'
inner join (select distinct * from c_regs) as t7 on t7.form_number = x.form_number and t7.property_name = 'field.frm_zip'
inner join (select distinct * from c_regs) as t8 on t8.form_number = x.form_number and t8.property_name = 'field.frm_phone'
inner join (select distinct * from c_regs) as t9 on t9.form_number = x.form_number and t9.property_name = 'field.frm_emailAddress'
left join (select distinct * from c_regs) as t10 on t10.form_number = x.form_number and t10.property_name = 'field.frm_youAre'
inner join (select distinct * from c_regs) as t11 on t11.form_number = x.form_number and t11.property_name = 'field.frm_authType'
;
You don't need all of those joins. With my optimization, the data is returned as rows instead of columns.
(I have not run this, so test it first.)
SELECT
ifnull(x.form_datetime,'') reg_date,
ifnull(x.property_value,'') amg_id,
x.form_number,
x.form_name,
x.form_version,
x.property_name,
x.property_value
FROM c_regs x
WHERE x.property_name IN (
'field.frm_firstName',
'field.frm_lastName',
'field.frm_address',
...
)
AND x.form_number = 'the form id'
GROUP BY x.form_number, x.property_name
ORDER BY x.form_number ASC;
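The AND condition on form_number is only needed if you want one specific form rather than all forms (which is what I would suggest).
Also ask yourself whether the field names need to be in the condition at all. You can use my query as a sub-query and then merge each field into a column as before, without the other joins.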
Try adding a UNION clause to your code, like this:
SELECT ID, NAME, AMOUNT, DATE
FROM CUSTOMERS
LEFT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID
UNION
SELECT ID, NAME, AMOUNT, DATE
FROM CUSTOMERS
RIGHT JOIN ORDERS
ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID;
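You shouldn't use SELECT DISTINCT all the time. Keep in mind that if the select-list includes any unique constraint, DISTINCT is bound to be a no-op, so it is probably unnecessary. If there are duplicates, DISTINCT is expensive, because it sorts the table so that duplicates line up next to each other and can be de-duped.
You also shouldn't do so many self-joins for this kind of data. Each subquery in the self-join reads the entire table.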
SELECT form_number,
MAX(form_datetime) AS reg_date,
MAX(form_name) AS form_name,
MAX(form_version) AS form_version,
MAX(CASE property_name WHEN 'field.frm_personID' THEN property_value END) AS amg_id,
MAX(CASE property_name WHEN 'field.frm_firstName' THEN property_value END) AS first_name,
MAX(CASE property_name WHEN 'field.frm_lastName' THEN property_value END) AS last_name,
MAX(CASE property_name WHEN 'field.frm_address' THEN property_value END) AS address,
MAX(CASE property_name WHEN 'field.frm_address2' THEN property_value END) AS address_2,
MAX(CASE property_name WHEN 'field.frm_city' THEN property_value END) AS city,
MAX(CASE property_name WHEN 'field.frm_state' THEN property_value END) AS state_code,
MAX(CASE property_name WHEN 'field.frm_zip' THEN property_value END) AS zip,
MAX(CASE property_name WHEN 'field.frm_phone' THEN property_value END) AS phone,
MAX(CASE property_name WHEN 'field.frm_emailAddress' THEN property_value END) AS email,
MAX(CASE property_name WHEN 'field.frm_youAre' THEN property_value END) AS registrant_type,
MAX(CASE property_name WHEN 'field.frm_authType' THEN property_value END) AS auth_type_code
FROM c_regs
GROUP BY form_number;
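Explanation: GROUP BY causes all rows for a given form_number to be treated as one group, and the result has one row per group.
All other columns not named in the GROUP BY must be inside grouping functions. I chose MAX(). I am assuming that the form's datetime, name, and version each have only one distinct value per group.
For the properties, we put an expression inside MAX() that returns a value only on rows where the property has a specific name. On all other rows the expression is NULL, which MAX() ignores.
This way you get the result you want without any self-joins or DISTINCT modifiers. The query scans the table only once and should be faster.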
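To see why MAX() can collapse the rows, it may help to look at the CASE expressions before aggregation. This is only a sketch: the form_number value 'F-1001' is hypothetical (it does not come from the question), and only two of the properties are shown.

```sql
-- Hypothetical form_number; substitute a real one from c_regs.
SELECT form_number,
       property_name,
       -- Non-NULL only on the row whose property_name is 'field.frm_city'
       CASE property_name WHEN 'field.frm_city' THEN property_value END AS city,
       -- Non-NULL only on the row whose property_name is 'field.frm_zip'
       CASE property_name WHEN 'field.frm_zip' THEN property_value END AS zip
FROM c_regs
WHERE form_number = 'F-1001'
  AND property_name IN ('field.frm_city', 'field.frm_zip');
```

Each returned row has at most one non-NULL value among city and zip. Wrapping each CASE in MAX() and grouping by form_number folds those rows into a single row per form.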
BK's assertion that lots of self-joins are harmful is misleading.
Consider an EAV data set of 10,000 entities, each with 12 attributes, as follows:
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(entity INT NOT NULL
,attribute INT NOT NULL
,value INT NOT NULL
,PRIMARY KEY(entity,attribute)
);
INSERT INTO my_table VALUES
(1,101,RAND()*100),
(1,102,RAND()*100),
(1,103,RAND()*100),
(1,104,RAND()*100),
(1,105,RAND()*100),
(1,106,RAND()*100),
(1,107,RAND()*100),
(1,108,RAND()*100),
(1,109,RAND()*100),
(1,110,RAND()*100),
(1,111,RAND()*100),
(1,112,RAND()*100);
With this initial seed, I can use a table of integers (0-9) to rapidly populate the rest of the table...
INSERT IGNORE INTO my_table SELECT i4.i*1000+i3.i*100+i2.i*10+i1.i+1, attribute, RAND()*100 FROM my_table,ints i1, ints i2, ints i3, ints i4;
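The ints helper table used in that statement is not defined in this answer. A minimal sketch of what it presumably looks like, assuming a single column named i holding the digits 0 through 9 (the column type and key are assumptions); it would have to exist before the INSERT IGNORE above is run:

```sql
-- Assumed definition of the helper table referenced as ints i1 ... i4.
CREATE TABLE ints (
  i TINYINT NOT NULL PRIMARY KEY
);

-- Ten rows, one per digit; four cross-joined copies yield 0..9999.
INSERT INTO ints VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
```

With that table, i4.i*1000+i3.i*100+i2.i*10+i1.i+1 produces entity numbers 1 through 10000, and INSERT IGNORE skips the rows for entity 1 that the seed already inserted.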
Bill's query...
SELECT SQL_NO_CACHE a.entity
, MAX(CASE WHEN attribute = 101 THEN value END) x101
, MAX(CASE WHEN attribute = 102 THEN value END) x102
, MAX(CASE WHEN attribute = 103 THEN value END) x103
, MAX(CASE WHEN attribute = 104 THEN value END) x104
, MAX(CASE WHEN attribute = 105 THEN value END) x105
, MAX(CASE WHEN attribute = 106 THEN value END) x106
, MAX(CASE WHEN attribute = 107 THEN value END) x107
, MAX(CASE WHEN attribute = 108 THEN value END) x108
, MAX(CASE WHEN attribute = 109 THEN value END) x109
, MAX(CASE WHEN attribute = 110 THEN value END) x110
, MAX(CASE WHEN attribute = 111 THEN value END) x111
, MAX(CASE WHEN attribute = 112 THEN value END) x112
FROM my_table a
GROUP
BY a.entity;
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
| entity | x101 | x102 | x103 | x104 | x105 | x106 | x107 | x108 | x109 | x110 | x111 | x112 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
| 1 | 78 | 8 | 4 | 95 | 66 | 43 | 16 | 51 | 9 | 89 | 20 | 33 |
...
| 9998 | 61 | 72 | 67 | 20 | 23 | 10 | 31 | 37 | 69 | 18 | 24 | 32 |
| 9999 | 67 | 91 | 32 | 58 | 77 | 81 | 61 | 22 | 75 | 65 | 91 | 42 |
| 10000 | 52 | 38 | 56 | 32 | 14 | 77 | 10 | 99 | 70 | 70 | 82 | 13 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
10000 rows in set (0.20 sec)
The alternative...
SELECT SQL_NO_CACHE a.entity
, a.value x101
, b.value x102
, c.value x103
, d.value x104
, e.value x105
, f.value x106
, g.value x107
, h.value x108
, i.value x109
, j.value x110
, k.value x111
, l.value x112
FROM my_table a
LEFT JOIN my_table b ON b.entity = a.entity AND b.attribute = 102
LEFT JOIN my_table c ON c.entity = a.entity AND c.attribute = 103
LEFT JOIN my_table d ON d.entity = a.entity AND d.attribute = 104
LEFT JOIN my_table e ON e.entity = a.entity AND e.attribute = 105
LEFT JOIN my_table f ON f.entity = a.entity AND f.attribute = 106
LEFT JOIN my_table g ON g.entity = a.entity AND g.attribute = 107
LEFT JOIN my_table h ON h.entity = a.entity AND h.attribute = 108
LEFT JOIN my_table i ON i.entity = a.entity AND i.attribute = 109
LEFT JOIN my_table j ON j.entity = a.entity AND j.attribute = 110
LEFT JOIN my_table k ON k.entity = a.entity AND k.attribute = 111
LEFT JOIN my_table l ON l.entity = a.entity AND l.attribute = 112
WHERE a.attribute = 101;
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
| entity | x101 | x102 | x103 | x104 | x105 | x106 | x107 | x108 | x109 | x110 | x111 | x112 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
| 1 | 78 | 8 | 4 | 95 | 66 | 43 | 16 | 51 | 9 | 89 | 20 | 33 |
...
| 9998 | 61 | 72 | 67 | 20 | 23 | 10 | 31 | 37 | 69 | 18 | 24 | 32 |
| 9999 | 67 | 91 | 32 | 58 | 77 | 81 | 61 | 22 | 75 | 65 | 91 | 42 |
| 10000 | 52 | 38 | 56 | 32 | 14 | 77 | 10 | 99 | 70 | 70 | 82 | 13 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
10000 rows in set (0.23 sec)
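So Bill's query is marginally faster. However, as soon as you reduce the number of entities searched for (while keeping the same number of attributes, and therefore the same number of joins), the alternative query can outpace Bill's by a similar kind of margin...
Bill's query with WHERE a.entity <= 5000 added: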
| 4998 | 59 | 55 | 93 | 48 | 72 | 32 | 38 | 36 | 6 | 82 | 23 | 62 |
| 4999 | 23 | 10 | 11 | 29 | 69 | 67 | 92 | 72 | 25 | 49 | 79 | 48 |
| 5000 | 39 | 86 | 77 | 0 | 30 | 38 | 48 | 54 | 9 | 97 | 25 | 54 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
5000 rows in set (0.12 sec)
The alternative with WHERE a.entity <= 5000 added:
| 4998 | 59 | 55 | 93 | 48 | 72 | 32 | 38 | 36 | 6 | 82 | 23 | 62 |
| 4999 | 23 | 10 | 11 | 29 | 69 | 67 | 92 | 72 | 25 | 49 | 79 | 48 |
| 5000 | 39 | 86 | 77 | 0 | 30 | 38 | 48 | 54 | 9 | 97 | 25 | 54 |
+--------+------+------+------+------+------+------+------+------+------+------+------+------+
5000 rows in set (0.11 sec)
So what really makes the difference between a slow query and a fast one is not the number of joins but the consistent use of indexes.
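One way to check whether the joins actually use an index is to prefix the query with EXPLAIN. The following is a sketch on the my_table test data above, shortened to a single join for readability; the column aliases are the ones used earlier.

```sql
-- Ask the optimizer for its plan instead of running the query.
EXPLAIN
SELECT a.entity,
       a.value AS x101,
       b.value AS x102
FROM my_table a
LEFT JOIN my_table b
       ON b.entity = a.entity   -- first column of the primary key
      AND b.attribute = 102     -- second column of the primary key
WHERE a.attribute = 101;
```

In the plan, the key column for table b shows which index (if any) is chosen. Because the join condition covers both columns of PRIMARY KEY(entity,attribute), a primary-key lookup per row would be expected here. A NULL key on a joined table would indicate a scan.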
This is terrible:
inner join (select distinct * from c_regs) as t7
on t7.form_number = x.form_number and t7.property_name = 'field.frm_zip'
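It scans the entire c_regs table, removes the duplicate rows, and copies the de-duplicated rows into a temporary table that has no index. Then it rummages through that table looking for what may (or may not) be a single row.
Note that DISTINCT does not guarantee that at most one row will be returned. (I will ignore the multi-row issue.)
It would be much better to do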
inner join c_regs AS t7 ON
t7.form_number = x.form_number and t7.property_name = 'field.frm_zip'
But that also needs INDEX(form_number, property_name). Even better would be to have the PRIMARY KEY start with those two columns, as discussed here: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#speeding_up_wp_postmeta
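A minimal sketch of adding that composite index. The index name idx_form_prop is made up for illustration, and the statement assumes both columns have types that can be indexed in full (a long VARCHAR or TEXT column might need a prefix length instead).

```sql
-- Composite index so each join can look up (form_number, property_name)
-- directly instead of scanning c_regs.
ALTER TABLE c_regs
  ADD INDEX idx_form_prop (form_number, property_name);
```

Two separate single-column indexes on form_number and property_name, as described in the question, are not equivalent to this one composite index for a lookup on both columns together.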
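Meanwhile, there is no need for the extra layer of SELECT after the first FROM.
Meanwhile, you should work on getting rid of the duplicates in c_regs, and on preventing them from coming back! A suitable natural PRIMARY KEY would quite likely solve the problem. (Again, see my link.)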
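One possible way to do that cleanup is sketched below, under several assumptions that are not confirmed by the question: the duplicates are exact copies of whole rows, c_regs has no primary key yet, each form has at most one row per property_name, and both key columns are NOT NULL and indexable in full. The names c_regs_dedup and c_regs_old are hypothetical.

```sql
-- Build an empty copy with the same columns and indexes.
CREATE TABLE c_regs_dedup LIKE c_regs;

-- Copy each distinct row once.
INSERT INTO c_regs_dedup
SELECT DISTINCT * FROM c_regs;

-- Natural key: this fails if any (form_number, property_name) pair
-- still occurs more than once, which is itself a useful check.
ALTER TABLE c_regs_dedup
  ADD PRIMARY KEY (form_number, property_name);

-- Swap the tables, keeping the original as a backup.
RENAME TABLE c_regs TO c_regs_old,
             c_regs_dedup TO c_regs;
```

Once the primary key is in place, later attempts to insert a duplicate pair are rejected, so the DISTINCT subqueries in the original query are no longer needed.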