Why is this MySQL query slower after replacing a full table scan with a scan of `type=ref`?
In my MySQL database (v5.7.31-34, using InnoDB), I am trying to improve the performance of the following query:
SELECT DISTINCT SQL_CALC_FOUND_ROWS `story_allocation_days`.*
FROM `story_allocation_days`
INNER JOIN `workspaces`
ON `workspaces`.`id` = `story_allocation_days`.`workspace_id`
INNER JOIN `participations`
ON `participations`.`workspace_id` = `workspaces`.`id`
INNER JOIN `assignments`
ON `assignments`.`id` = `story_allocation_days`.`assignment_id`
WHERE (story_allocation_days.deleted_at is null)
AND `workspaces`.`is_budgeted` = TRUE
AND (participations.account_id = 5071)
AND (participations.access_integer >= 30)
AND (participations.type = 'MavenParticipation' OR workspaces.account_id = 5071)
AND (assignments.assignee_id IS NOT NULL);
99822 rows in set (5.35 sec)
The EXPLAIN for the above:
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
| 1 | SIMPLE | story_allocation_days | NULL | ALL | index_story_allocation_days_on_workspace_id,index_story_allocation_days_on_assignment_id_and_date,index_story_allocation_days_on_assignment_id | NULL | NULL | NULL | 430531 | 10.00 | Using where; Using temporary |
| 1 | SIMPLE | workspaces | NULL | eq_ref | PRIMARY,index_workspaces_on_account_id | PRIMARY | 4 | bm_rpm.story_allocation_days.workspace_id | 1 | 10.00 | Using where; Distinct |
| 1 | SIMPLE | assignments | NULL | eq_ref | PRIMARY,index_assignments_on_assignee_id | PRIMARY | 4 | bm_rpm.story_allocation_days.assignment_id | 1 | 50.00 | Using where; Distinct |
| 1 | SIMPLE | participations | NULL | ref | index_participations_on_account_id_and_workspace_id,index_participations_on_workspace_id_and_user_id,index_participations_for_sad_api_index,index_participations_on_account_id,index_participations_on_workspace_id | index_participations_for_sad_api_index | 10 | bm_rpm.story_allocation_days.workspace_id,const | 7 | 33.33 | Using where; Using index; Distinct |
Here is the FORMAT=JSON version of the above:
| {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "153230.79"
},
"duplicates_removal": {
"using_temporary_table": true,
"using_filesort": false,
"nested_loop": [
{
"table": {
"table_name": "story_allocation_days",
"access_type": "ALL",
"possible_keys": [
"index_story_allocation_days_on_workspace_id",
"index_story_allocation_days_on_assignment_id_and_date",
"index_story_allocation_days_on_assignment_id"
],
"rows_examined_per_scan": 430268,
"rows_produced_per_join": 43026,
"filtered": "10.00",
"cost_info": {
"read_cost": "80548.24",
"eval_cost": "8605.36",
"prefix_cost": "89153.60",
"data_read_per_join": "3M"
},
"used_columns": [
"id",
"assignment_id",
"story_id",
"workspace_id",
"account_id",
"current",
"date",
"minutes",
"created_at",
"updated_at",
"deleted_at",
"cost_amount_in_cents",
"bill_amount_in_cents",
"cost_rate_in_cents",
"bill_rate_in_cents"
],
"attached_condition": "isnull(`bm_rpm`.`story_allocation_days`.`deleted_at`)"
}
},
{
"table": {
"table_name": "workspaces",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_workspaces_on_account_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.workspace_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 4302,
"filtered": "10.00",
"distinct": true,
"cost_info": {
"read_cost": "43026.80",
"eval_cost": "860.54",
"prefix_cost": "140785.76",
"data_read_per_join": "21M"
},
"used_columns": [
"id",
"is_budgeted",
"account_id"
],
"attached_condition": "(`bm_rpm`.`workspaces`.`is_budgeted` = TRUE)"
}
},
{
"table": {
"table_name": "assignments",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_assignments_on_assignee_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.assignment_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 2151,
"filtered": "50.00",
"distinct": true,
"cost_info": {
"read_cost": "4302.68",
"eval_cost": "430.27",
"prefix_cost": "145948.98",
"data_read_per_join": "134K"
},
"used_columns": [
"id",
"assignee_id"
],
"attached_condition": "(`bm_rpm`.`assignments`.`assignee_id` is not null)"
}
},
{
"table": {
"table_name": "participations",
"access_type": "ref",
"possible_keys": [
"index_participations_on_account_id_and_workspace_id",
"index_participations_on_workspace_id_and_user_id",
"index_participations_for_sad_api_index",
"index_participations_on_account_id",
"index_participations_on_workspace_id"
],
"key": "index_participations_for_sad_api_index",
"used_key_parts": [
"workspace_id",
"account_id"
],
"key_length": "10",
"ref": [
"bm_rpm.story_allocation_days.workspace_id",
"const"
],
"rows_examined_per_scan": 7,
"rows_produced_per_join": 5537,
"filtered": "33.33",
"using_index": true,
"distinct": true,
"cost_info": {
"read_cost": "3959.11",
"eval_cost": "1107.46",
"prefix_cost": "153230.79",
"data_read_per_join": "11M"
},
"used_columns": [
"id",
"workspace_id",
"type",
"account_id",
"access_integer"
],
"attached_condition": "((`bm_rpm`.`participations`.`access_integer` >= 30) and ((`bm_rpm`.`participations`.`type` = 'MavenParticipation') or (`bm_rpm`.`workspaces`.`account_id` = 5071)))"
}
}
]
}
}
} |
I tried to find an index that would change this from a full table scan to a more efficient access type. Here are the indexes:
mysql> show indexes in story_allocation_days;
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
| story_allocation_days | 0 | PRIMARY | 1 | id | A | 418853 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_workspace_id | 1 | workspace_id | A | 295 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_assignment_id_and_date | 1 | assignment_id | A | 42533 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_assignment_id_and_date | 2 | date | A | 418853 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_account_id | 1 | account_id | A | 41 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_story_id_and_date | 1 | story_id | A | 2519 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_story_id_and_date | 2 | date | A | 83246 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_date | 1 | date | A | 740 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_created_at | 1 | created_at | A | 6977 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_updated_at | 1 | updated_at | A | 7681 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_assignment_id | 1 | assignment_id | A | 43891 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_story_id | 1 | story_id | A | 2494 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | sad_deleted_at | 1 | deleted_at | A | 225 | NULL | NULL | YES | BTREE | | |
I tried USE INDEX(index_story_allocation_days_on_assignment_id) on story_allocation_days, but this was actually slightly slower than the original query:
99822 rows in set (5.99 sec)
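On a hunch, I thought WHERE (story_allocation_days.deleted_at is null) might be tripping up the optimizer, so I tried creating the sad_deleted_at index and updating the deleted_at column of every row that held NULL to 1970-01-01 00:00:00, since index performance can suffer from the presence of NULL values. However, this also performed worse than the original: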
SELECT DISTINCT SQL_CALC_FOUND_ROWS `story_allocation_days`.*
FROM `story_allocation_days`
INNER JOIN `workspaces`
ON `workspaces`.`id` = `story_allocation_days`.`workspace_id`
INNER JOIN `participations`
ON `participations`.`workspace_id` = `workspaces`.`id`
INNER JOIN `assignments`
ON `assignments`.`id` = `story_allocation_days`.`assignment_id`
WHERE (story_allocation_days.deleted_at = '1970-01-01 00:00:00')
AND `workspaces`.`is_budgeted` = TRUE
AND (participations.account_id = 5071)
AND (participations.access_integer >= 30)
AND (participations.type = 'MavenParticipation' OR workspaces.account_id = 5071)
AND (assignments.assignee_id IS NOT NULL);
99822 rows in set (6.12 sec)
Here is the corresponding EXPLAIN output:
| 1 | SIMPLE | story_allocation_days | NULL | ref | index_story_allocation_days_on_workspace_id,index_story_allocation_days_on_assignment_id_and_date,index_story_allocation_days_on_assignment_id,sad_deleted_at | sad_deleted_at | 6 | const | 210391 | 100.00 | Using temporary |
| 1 | SIMPLE | workspaces | NULL | eq_ref | PRIMARY,index_workspaces_on_account_id | PRIMARY | 4 | bm_rpm.story_allocation_days.workspace_id | 1 | 10.00 | Using where; Distinct |
| 1 | SIMPLE | assignments | NULL | eq_ref | PRIMARY,index_assignments_on_assignee_id | PRIMARY | 4 | bm_rpm.story_allocation_days.assignment_id | 1 | 50.00 | Using where; Distinct |
| 1 | SIMPLE | participations | NULL | ref | index_participations_on_account_id_and_workspace_id,index_participations_on_workspace_id_and_user_id,index_participations_for_sad_api_index,index_participations_on_account_id,index_participations_on_workspace_id | index_participations_for_sad_api_index | 10 | bm_rpm.story_allocation_days.workspace_id,const | 7 | 33.33 | Using where; Using index; Distinct |
Again, the FORMAT=JSON version of the above:
# NEW:
| {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "366895.66"
},
"duplicates_removal": {
"using_temporary_table": true,
"using_filesort": false,
"nested_loop": [
{
"table": {
"table_name": "story_allocation_days",
"access_type": "ref",
"possible_keys": [
"index_story_allocation_days_on_workspace_id",
"index_story_allocation_days_on_assignment_id_and_date",
"index_story_allocation_days_on_assignment_id",
"sad_deleted_at"
],
"key": "sad_deleted_at",
"used_key_parts": [
"deleted_at"
],
"key_length": "6",
"ref": [
"const"
],
"rows_examined_per_scan": 210552,
"rows_produced_per_join": 210552,
"filtered": "100.00",
"cost_info": {
"read_cost": "11223.00",
"eval_cost": "42110.40",
"prefix_cost": "53333.40",
"data_read_per_join": "16M"
},
"used_columns": [
"id",
"assignment_id",
"story_id",
"workspace_id",
"account_id",
"current",
"date",
"minutes",
"created_at",
"updated_at",
"deleted_at",
"cost_amount_in_cents",
"bill_amount_in_cents",
"cost_rate_in_cents",
"bill_rate_in_cents"
]
}
},
{
"table": {
"table_name": "workspaces",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_workspaces_on_account_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.workspace_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 21055,
"filtered": "10.00",
"distinct": true,
"cost_info": {
"read_cost": "210552.00",
"eval_cost": "4211.04",
"prefix_cost": "305995.80",
"data_read_per_join": "105M"
},
"used_columns": [
"id",
"is_budgeted",
"account_id"
],
"attached_condition": "(`bm_rpm`.`workspaces`.`is_budgeted` = TRUE)"
}
},
{
"table": {
"table_name": "assignments",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_assignments_on_assignee_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.assignment_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 10527,
"filtered": "50.00",
"distinct": true,
"cost_info": {
"read_cost": "21055.20",
"eval_cost": "2105.52",
"prefix_cost": "331262.04",
"data_read_per_join": "657K"
},
"used_columns": [
"id",
"assignee_id"
],
"attached_condition": "(`bm_rpm`.`assignments`.`assignee_id` is not null)"
}
},
{
"table": {
"table_name": "participations",
"access_type": "ref",
"possible_keys": [
"index_participations_on_account_id_and_workspace_id",
"index_participations_on_workspace_id_and_user_id",
"index_participations_for_sad_api_index",
"index_participations_on_account_id",
"index_participations_on_workspace_id"
],
"key": "index_participations_for_sad_api_index",
"used_key_parts": [
"workspace_id",
"account_id"
],
"key_length": "10",
"ref": [
"bm_rpm.story_allocation_days.workspace_id",
"const"
],
"rows_examined_per_scan": 7,
"rows_produced_per_join": 27096,
"filtered": "33.33",
"using_index": true,
"distinct": true,
"cost_info": {
"read_cost": "19373.95",
"eval_cost": "5419.35",
"prefix_cost": "366895.66",
"data_read_per_join": "55M"
},
"used_columns": [
"id",
"workspace_id",
"type",
"account_id",
"access_integer"
],
"attached_condition": "((`bm_rpm`.`participations`.`access_integer` >= 30) and ((`bm_rpm`.`participations`.`type` = 'MavenParticipation') or (`bm_rpm`.`workspaces`.`account_id` = 5071)))"
}
}
]
}
}
} |
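I am confused as to why the query time did not improve (and in fact got worse), even though the filtered column for story_allocation_days went from 10.0 to 100.0 and the number of rows examined dropped from 400,000+ to roughly 200,000.
Can anyone see why the index did not shorten the query time?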
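Edit:
I tried removing SQL_CALC_FOUND_ROWS from the original query, but it still takes 5.51 seconds for the 99k rows. The EXPLAIN is below, in JSON format (because I am running out of characters for this question):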
| {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "153230.79"
},
"duplicates_removal": {
"using_temporary_table": true,
"using_filesort": false,
"nested_loop": [
{
"table": {
"table_name": "story_allocation_days",
"access_type": "ALL",
"possible_keys": [
"index_story_allocation_days_on_workspace_id",
"index_story_allocation_days_on_assignment_id_and_date",
"index_story_allocation_days_on_assignment_id"
],
"rows_examined_per_scan": 430268,
"rows_produced_per_join": 43026,
"filtered": "10.00",
"cost_info": {
"read_cost": "80548.24",
"eval_cost": "8605.36",
"prefix_cost": "89153.60",
"data_read_per_join": "3M"
},
"used_columns": [
"id",
"assignment_id",
"story_id",
"workspace_id",
"account_id",
"current",
"date",
"minutes",
"created_at",
"updated_at",
"deleted_at",
"cost_amount_in_cents",
"bill_amount_in_cents",
"cost_rate_in_cents",
"bill_rate_in_cents"
],
"attached_condition": "isnull(`bm_rpm`.`story_allocation_days`.`deleted_at`)"
}
},
{
"table": {
"table_name": "workspaces",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_workspaces_on_account_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.workspace_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 4302,
"filtered": "10.00",
"distinct": true,
"cost_info": {
"read_cost": "43026.80",
"eval_cost": "860.54",
"prefix_cost": "140785.76",
"data_read_per_join": "21M"
},
"used_columns": [
"id",
"is_budgeted",
"account_id"
],
"attached_condition": "(`bm_rpm`.`workspaces`.`is_budgeted` = TRUE)"
}
},
{
"table": {
"table_name": "assignments",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_assignments_on_assignee_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.assignment_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 2151,
"filtered": "50.00",
"distinct": true,
"cost_info": {
"read_cost": "4302.68",
"eval_cost": "430.27",
"prefix_cost": "145948.98",
"data_read_per_join": "134K"
},
"used_columns": [
"id",
"assignee_id"
],
"attached_condition": "(`bm_rpm`.`assignments`.`assignee_id` is not null)"
}
},
{
"table": {
"table_name": "participations",
"access_type": "ref",
"possible_keys": [
"index_participations_on_account_id_and_workspace_id",
"index_participations_on_workspace_id_and_user_id",
"index_participations_for_sad_api_index",
"index_participations_on_account_id",
"index_participations_on_workspace_id"
],
"key": "index_participations_for_sad_api_index",
"used_key_parts": [
"workspace_id",
"account_id"
],
"key_length": "10",
"ref": [
"bm_rpm.story_allocation_days.workspace_id",
"const"
],
"rows_examined_per_scan": 7,
"rows_produced_per_join": 5537,
"filtered": "33.33",
"using_index": true,
"distinct": true,
"cost_info": {
"read_cost": "3959.11",
"eval_cost": "1107.46",
"prefix_cost": "153230.79",
"data_read_per_join": "11M"
},
"used_columns": [
"id",
"workspace_id",
"type",
"account_id",
"access_integer"
],
"attached_condition": "((`bm_rpm`.`participations`.`access_integer` >= 30) and ((`bm_rpm`.`participations`.`type` = 'MavenParticipation') or (`bm_rpm`.`workspaces`.`account_id` = 5071)))"
}
}
]
}
}
} |
Edit #2:
Here is the SHOW CREATE TABLE for story_allocation_days:
| story_allocation_days | CREATE TABLE `story_allocation_days` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`assignment_id` int(11) NOT NULL,
`story_id` int(11) NOT NULL,
`workspace_id` int(11) NOT NULL,
`account_id` int(11) NOT NULL,
`current` tinyint(1) NOT NULL,
`date` date NOT NULL,
`minutes` int(11) NOT NULL DEFAULT '0',
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`deleted_at` datetime DEFAULT NULL,
`cost_amount_in_cents` bigint(20) DEFAULT NULL,
`bill_amount_in_cents` bigint(20) DEFAULT NULL,
`cost_rate_in_cents` bigint(20) DEFAULT NULL,
`bill_rate_in_cents` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_story_allocation_days_on_workspace_id` (`workspace_id`),
KEY `index_story_allocation_days_on_assignment_id_and_date` (`assignment_id`,`date`),
KEY `index_story_allocation_days_on_account_id` (`account_id`),
KEY `index_story_allocation_days_on_story_id_and_date` (`story_id`,`date`),
KEY `index_story_allocation_days_on_date` (`date`),
KEY `index_story_allocation_days_on_created_at` (`created_at`),
KEY `index_story_allocation_days_on_updated_at` (`updated_at`),
KEY `index_story_allocation_days_on_assignment_id` (`assignment_id`),
KEY `index_story_allocation_days_on_story_id` (`story_id`)
) ENGINE=InnoDB AUTO_INCREMENT=9646396 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
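I would offer the following. Rewrite the query so that it starts from participations, since that is the granularity you are really starting from, rather than from all "SAD" (story_allocation_days) entries. Adding the MySQL keyword "STRAIGHT_JOIN" tells the engine: don't think for me, join in the order I wrote. Because you are doing DISTINCT, selecting "*" probably forces uniqueness to be evaluated across all columns. If you select only "sad.id" instead (assuming each table has "id" as its own primary key; I wrote [whateverItsUniqueKeyIdIs] below, but it is probably just sad.id, so change it accordingly), and that value is unique per row, the engine does not need to go to the actual data pages to qualify uniqueness.
Next, the indexes. Multi-column composite indexes that cover both the join columns and the WHERE criteria also help the engine answer the query directly from the index instead of going to the raw data pages. Do not rely on single-column indexes alone, because an index on one column will not optimize as well as one that covers all the criteria you are looking for.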
Table                  Index
participations         ( account_id, access_integer, type, workspace_id )
workspaces             ( id, is_budgeted, account_id )
story_allocation_days  ( workspace_id, deleted_at, assignment_id, [whateverItsUniqueKeyIdIs] )
assignments            ( id, assignee_id )
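The index suggestions in the table above could be created roughly as follows. This is only a sketch: the index names are hypothetical, and it assumes the unique key of story_allocation_days is id, which the SHOW CREATE TABLE output in the question supports (PRIMARY KEY (`id`)).

```sql
-- Hypothetical index names; column order follows the table above.
ALTER TABLE participations
  ADD INDEX idx_p_account_access_type_ws (account_id, access_integer, `type`, workspace_id);

ALTER TABLE workspaces
  ADD INDEX idx_ws_id_budgeted_account (id, is_budgeted, account_id);

-- Assumes id is the unique key of story_allocation_days.
ALTER TABLE story_allocation_days
  ADD INDEX idx_sad_ws_deleted_assignment (workspace_id, deleted_at, assignment_id, id);

ALTER TABLE assignments
  ADD INDEX idx_a_id_assignee (id, assignee_id);
```

InnoDB appends the primary key columns to every secondary index anyway, so listing id last in the story_allocation_days index is mostly a matter of making the intent explicit.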
Finally, the query itself.
SELECT STRAIGHT_JOIN DISTINCT
SQL_CALC_FOUND_ROWS sad.[whateverItsUniqueKeyIdIs]
FROM
participations p
INNER JOIN workspaces ws
ON p.workspace_id = ws.id
AND ws.is_budgeted = TRUE
INNER JOIN story_allocation_days sad
ON p.workspace_id = sad.workspace_id
AND sad.deleted_at is null
INNER JOIN assignments a
ON sad.assignment_id = a.id
AND a.assignee_id IS NOT NULL
where
p.account_id = 5071
AND p.access_integer >= 30
AND ( p.type = 'MavenParticipation'
OR ws.account_id = 5071)
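The only possible performance killer is the "OR ws.account_id = 5071", but it should still be fast, because the primary basis is the participations criteria rather than all accounts.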
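Reply to comment
On intuition about queries: databases can be full of bloat, and what you are searching for is sometimes hidden inside it. As the person behind the database, you know the granularity of the data better than anyone.
I try to look at the root "what do I want" of the query. In your case, you want all participations for a specific account (with some other minor criteria). Once I have that in mind, it becomes my first table. Next, I look at the index that helps me get that component first, and only that. Then I join outward to the other pieces, such as your workspaces, story_allocation_days and assignments. Because of the context of each comparison, I apply their restricting criteria directly in their JOIN conditions, with the exception of the "OR" on ws.account_id, but the core is always p.account_id = 5071 AND p.access_integer >= 30.
Having proper, effective indexes is another matter, and getting the columns in the correct order can make a huge difference: not many indexes with one field each, but fewer, more effective indexes with multiple columns, based on the type of queries you run most often.
As I have described in other answers, and as applies to your data scenario: you originally started with the story_allocation_days table. The engine had to go through everything, for every account, before it even got to the participations table, and then on to all the workspaces and assignments. The engine does not know what you want, which is why effective indexes matter.
Start with the key table in the question, participations, knowing that you need a specific account_id and a range of access_integer values; that is your starting point. Picture a room full of boxes that hold all the participations data. Each box is labeled on its side with an "Account_ID", and all the boxes are arranged in order. You could have 1,000 boxes and still run straight to the one account ID you want. You have eliminated every other box in the room. Now you open that box, and inside, the contents are pre-sorted by access_integer from 1-??. So you flip to 30, take everything from there on, and you are done.
Only then do you have the short list you need to look up the rest of the details, and you have not even looked at the raw data pages. At the top of each page is a note of the type and workspace ID, because that is how I suggested building the index. Again, there is no need to flip through the documents to find the type and workspace; they are written at the top of the page for quick reference. These are the fields used to join to the next level of tables. All of this gets done without fetching the raw data of the underlying records.
By using the STRAIGHT_JOIN clause, you have taken the heavy lifting of planning the query away from the engine; now it only has to do simple joins to those secondary-level tables using the appropriate indexes. HTH.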
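One way to check whether the rewritten join order and the composite indexes are actually picked up is to run EXPLAIN on the rewritten query. The following is a sketch only: it assumes sad.id is the unique key (as the SHOW CREATE TABLE output in the question suggests) and leaves out SQL_CALC_FOUND_ROWS to keep the plan simple.

```sql
-- Sketch: inspect the plan of the participations-first rewrite.
-- Assumption: sad.id stands in for [whateverItsUniqueKeyIdIs].
EXPLAIN
SELECT DISTINCT STRAIGHT_JOIN sad.id
FROM participations p
  INNER JOIN workspaces ws
    ON p.workspace_id = ws.id
   AND ws.is_budgeted = TRUE
  INNER JOIN story_allocation_days sad
    ON p.workspace_id = sad.workspace_id
   AND sad.deleted_at IS NULL
  INNER JOIN assignments a
    ON sad.assignment_id = a.id
   AND a.assignee_id IS NOT NULL
WHERE p.account_id = 5071
  AND p.access_integer >= 30
  AND (p.type = 'MavenParticipation' OR ws.account_id = 5071);
```

In the output, the things worth looking at are whether participations is listed first, which key is chosen for each table, and whether "Using index" appears in the Extra column for the tables that are meant to be served from a covering index.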
OR (often) hurts performance. UNION is a workaround:
( SELECT sad.*
FROM `story_allocation_days` AS sad
INNER JOIN `workspaces` AS w ON w.`id` = sad.`workspace_id`
INNER JOIN `participations` AS p ON p.`workspace_id` = w.`id`
INNER JOIN `assignments` AS a ON a.`id` = sad.`assignment_id`
WHERE sad.deleted_at = '1970-01-01'
AND w.`is_budgeted` = TRUE
AND p.account_id = 5071
AND p.access_integer >= 30
AND p.type = 'MavenParticipation'
AND a.assignee_id IS NOT NULL
)
UNION DISTINCT
( SELECT sad.*
FROM `story_allocation_days` AS sad
INNER JOIN `workspaces` AS w ON w.`id` = sad.`workspace_id`
INNER JOIN `participations` AS p ON p.`workspace_id` = w.`id`
INNER JOIN `assignments` AS a ON a.`id` = sad.`assignment_id`
WHERE sad.deleted_at = '1970-01-01'
AND w.`is_budgeted` = TRUE
AND w.account_id = 5071
AND p.account_id = 5071
AND p.access_integer >= 30
AND a.assignee_id IS NOT NULL
)
Those queries will then need the following indexes. The order of the columns in each index matters. (I am assuming each table has PRIMARY KEY(id).)
w: INDEX(account_id, is_budgeted)
p: INDEX(account_id, type, access_integer, workspace_id)
p: INDEX(workspace_id, account_id, access_integer)
sad: INDEX(workspace_id, deleted_at)
sad: INDEX(assignment_id, deleted_at)
Some of these indexes are extras, because I do not know in which order the optimizer will prefer to access the tables. (It depends on the data.)
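As a sketch, the five indexes listed above could be added like this. The index names are hypothetical, and w, p and sad are taken to mean workspaces, participations and story_allocation_days, as in the UNION query.

```sql
-- Hypothetical index names; column order is copied from the list above.
ALTER TABLE workspaces
  ADD INDEX idx_w_account_budgeted (account_id, is_budgeted);

ALTER TABLE participations
  ADD INDEX idx_p_account_type_access_ws (account_id, `type`, access_integer, workspace_id),
  ADD INDEX idx_p_ws_account_access (workspace_id, account_id, access_integer);

ALTER TABLE story_allocation_days
  ADD INDEX idx_sad_ws_deleted (workspace_id, deleted_at),
  ADD INDEX idx_sad_assignment_deleted (assignment_id, deleted_at);
```

Some of the existing indexes may already overlap with these (for example, the EXPLAIN output in the question shows index_participations_for_sad_api_index being used on workspace_id and account_id), so it is worth comparing against SHOW INDEXES before adding all of them.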
I removed SQL_CALC_FOUND_ROWS because it is unnecessary without a LIMIT.
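Let me know whether it is common for the same row to show up in both SELECTs, and whether there are many large columns (such as TEXT or BLOB) in the selected table (sad). In that case, I would be tempted to rework the query to avoid hauling around so much data: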
SELECT sad2.*
FROM ( ( SELECT sad.id
FROM ... (1st 4-way join plus WHERE)
)
UNION DISTINCT
( SELECT sad.id
FROM ... (2nd 4-way join plus WHERE)
) ) AS u
JOIN `story_allocation_days` AS sad2 USING(id);
This may be faster because only id needs to be hauled around until the UNION is complete.
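Spelled out with the two 4-way joins from the UNION query earlier in this answer, the id-only variant might look like the sketch below. It keeps that query's deleted_at = '1970-01-01' filter (swap in IS NULL if the column was not rewritten), and it drops the inner parentheses around each SELECT, which are optional here because neither branch has its own ORDER BY or LIMIT.

```sql
-- Sketch: deduplicate on id only, then fetch the full rows once.
SELECT sad2.*
FROM (
    SELECT sad.id
      FROM story_allocation_days AS sad
      INNER JOIN workspaces     AS w ON w.id = sad.workspace_id
      INNER JOIN participations AS p ON p.workspace_id = w.id
      INNER JOIN assignments    AS a ON a.id = sad.assignment_id
     WHERE sad.deleted_at = '1970-01-01'
       AND w.is_budgeted = TRUE
       AND p.account_id = 5071
       AND p.access_integer >= 30
       AND p.type = 'MavenParticipation'
       AND a.assignee_id IS NOT NULL
    UNION DISTINCT
    SELECT sad.id
      FROM story_allocation_days AS sad
      INNER JOIN workspaces     AS w ON w.id = sad.workspace_id
      INNER JOIN participations AS p ON p.workspace_id = w.id
      INNER JOIN assignments    AS a ON a.id = sad.assignment_id
     WHERE sad.deleted_at = '1970-01-01'
       AND w.is_budgeted = TRUE
       AND w.account_id = 5071
       AND p.account_id = 5071
       AND p.access_integer >= 30
       AND a.assignee_id IS NOT NULL
) AS u
JOIN story_allocation_days AS sad2 USING (id);
```

The temporary table built for the UNION then holds a single integer per row instead of all fifteen columns, and the wide rows are read only once, through the primary key, in the final join.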
在我的 MySQL 数据库(v5.7.31-34,使用 InnoDB)中,我试图简化以下查询的性能:
SELECT DISTINCT SQL_CALC_FOUND_ROWS `story_allocation_days`.*
FROM `story_allocation_days`
INNER JOIN `workspaces`
ON `workspaces`.`id` = `story_allocation_days`.`workspace_id`
INNER JOIN `participations`
ON `participations`.`workspace_id` = `workspaces`.`id`
INNER JOIN `assignments`
ON `assignments`.`id` = `story_allocation_days`.`assignment_id`
WHERE (story_allocation_days.deleted_at is null)
AND `workspaces`.`is_budgeted` = TRUE
AND (participations.account_id = 5071)
AND (participations.access_integer >= 30)
AND (participations.type = 'MavenParticipation' OR workspaces.account_id = 5071)
AND (assignments.assignee_id IS NOT NULL);
99822 rows in set (5.35 sec)
上面的EXPLAIN
:
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
| 1 | SIMPLE | story_allocation_days | NULL | ALL | index_story_allocation_days_on_workspace_id,index_story_allocation_days_on_assignment_id_and_date,index_story_allocation_days_on_assignment_id | NULL | NULL | NULL | 430531 | 10.00 | Using where; Using temporary |
| 1 | SIMPLE | workspaces | NULL | eq_ref | PRIMARY,index_workspaces_on_account_id | PRIMARY | 4 | bm_rpm.story_allocation_days.workspace_id | 1 | 10.00 | Using where; Distinct |
| 1 | SIMPLE | assignments | NULL | eq_ref | PRIMARY,index_assignments_on_assignee_id | PRIMARY | 4 | bm_rpm.story_allocation_days.assignment_id | 1 | 50.00 | Using where; Distinct |
| 1 | SIMPLE | participations | NULL | ref | index_participations_on_account_id_and_workspace_id,index_participations_on_workspace_id_and_user_id,index_participations_for_sad_api_index,index_participations_on_account_id,index_participations_on_workspace_id | index_participations_for_sad_api_index | 10 | bm_rpm.story_allocation_days.workspace_id,const | 7 | 33.33 | Using where; Using index; Distinct |
这是上面的 FORMAT=JSON
版本:
| {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "153230.79"
},
"duplicates_removal": {
"using_temporary_table": true,
"using_filesort": false,
"nested_loop": [
{
"table": {
"table_name": "story_allocation_days",
"access_type": "ALL",
"possible_keys": [
"index_story_allocation_days_on_workspace_id",
"index_story_allocation_days_on_assignment_id_and_date",
"index_story_allocation_days_on_assignment_id"
],
"rows_examined_per_scan": 430268,
"rows_produced_per_join": 43026,
"filtered": "10.00",
"cost_info": {
"read_cost": "80548.24",
"eval_cost": "8605.36",
"prefix_cost": "89153.60",
"data_read_per_join": "3M"
},
"used_columns": [
"id",
"assignment_id",
"story_id",
"workspace_id",
"account_id",
"current",
"date",
"minutes",
"created_at",
"updated_at",
"deleted_at",
"cost_amount_in_cents",
"bill_amount_in_cents",
"cost_rate_in_cents",
"bill_rate_in_cents"
],
"attached_condition": "isnull(`bm_rpm`.`story_allocation_days`.`deleted_at`)"
}
},
{
"table": {
"table_name": "workspaces",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_workspaces_on_account_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.workspace_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 4302,
"filtered": "10.00",
"distinct": true,
"cost_info": {
"read_cost": "43026.80",
"eval_cost": "860.54",
"prefix_cost": "140785.76",
"data_read_per_join": "21M"
},
"used_columns": [
"id",
"is_budgeted",
"account_id"
],
"attached_condition": "(`bm_rpm`.`workspaces`.`is_budgeted` = TRUE)"
}
},
{
"table": {
"table_name": "assignments",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_assignments_on_assignee_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.assignment_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 2151,
"filtered": "50.00",
"distinct": true,
"cost_info": {
"read_cost": "4302.68",
"eval_cost": "430.27",
"prefix_cost": "145948.98",
"data_read_per_join": "134K"
},
"used_columns": [
"id",
"assignee_id"
],
"attached_condition": "(`bm_rpm`.`assignments`.`assignee_id` is not null)"
}
},
{
"table": {
"table_name": "participations",
"access_type": "ref",
"possible_keys": [
"index_participations_on_account_id_and_workspace_id",
"index_participations_on_workspace_id_and_user_id",
"index_participations_for_sad_api_index",
"index_participations_on_account_id",
"index_participations_on_workspace_id"
],
"key": "index_participations_for_sad_api_index",
"used_key_parts": [
"workspace_id",
"account_id"
],
"key_length": "10",
"ref": [
"bm_rpm.story_allocation_days.workspace_id",
"const"
],
"rows_examined_per_scan": 7,
"rows_produced_per_join": 5537,
"filtered": "33.33",
"using_index": true,
"distinct": true,
"cost_info": {
"read_cost": "3959.11",
"eval_cost": "1107.46",
"prefix_cost": "153230.79",
"data_read_per_join": "11M"
},
"used_columns": [
"id",
"workspace_id",
"type",
"account_id",
"access_integer"
],
"attached_condition": "((`bm_rpm`.`participations`.`access_integer` >= 30) and ((`bm_rpm`.`participations`.`type` = 'MavenParticipation') or (`bm_rpm`.`workspaces`.`account_id` = 5071)))"
}
}
]
}
}
} |
我试图找到一个索引,将其从完整 table 扫描更改为更有效的搜索类型。以下是索引:
mysql> show indexes in story_allocation_days;
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
| story_allocation_days | 0 | PRIMARY | 1 | id | A | 418853 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_workspace_id | 1 | workspace_id | A | 295 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_assignment_id_and_date | 1 | assignment_id | A | 42533 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_assignment_id_and_date | 2 | date | A | 418853 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_account_id | 1 | account_id | A | 41 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_story_id_and_date | 1 | story_id | A | 2519 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_story_id_and_date | 2 | date | A | 83246 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_date | 1 | date | A | 740 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_created_at | 1 | created_at | A | 6977 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_updated_at | 1 | updated_at | A | 7681 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_assignment_id | 1 | assignment_id | A | 43891 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | index_story_allocation_days_on_story_id | 1 | story_id | A | 2494 | NULL | NULL | | BTREE | | |
| story_allocation_days | 1 | sad_deleted_at | 1 | deleted_at | A | 225 | NULL | NULL | YES | BTREE | | |
我在 story_allocation_days
上尝试了 USE INDEX(index_story_allocation_days_on_assignment_id)
,但这实际上比原始查询慢了一点:
99822 rows in set (5.99 sec)
直觉,我认为 WHERE (story_allocation_days.deleted_at is null)
可能会破坏优化器,所以我尝试创建 sad_deleted_at
索引并使用 NULL
更新所有行的 deleted_at
列到 1970-01-01 00:00:00
,因为索引性能可能因 NULL
值的存在而受到影响。但是,这导致性能也比原来的差:
SELECT DISTINCT SQL_CALC_FOUND_ROWS `story_allocation_days`.*
FROM `story_allocation_days`
INNER JOIN `workspaces`
ON `workspaces`.`id` = `story_allocation_days`.`workspace_id`
INNER JOIN `participations`
ON `participations`.`workspace_id` = `workspaces`.`id`
INNER JOIN `assignments`
ON `assignments`.`id` = `story_allocation_days`.`assignment_id`
WHERE (story_allocation_days.deleted_at = '1970-01-01 00:00:00')
AND `workspaces`.`is_budgeted` = TRUE
AND (participations.account_id = 5071)
AND (participations.access_integer >= 30)
AND (participations.type = 'MavenParticipation' OR workspaces.account_id = 5071)
AND (assignments.assignee_id IS NOT NULL);
99822 rows in set (6.12 sec)
这是相应的 EXPLAIN
语句:
| 1 | SIMPLE | story_allocation_days | NULL | ref | index_story_allocation_days_on_workspace_id,index_story_allocation_days_on_assignment_id_and_date,index_story_allocation_days_on_assignment_id,sad_deleted_at | sad_deleted_at | 6 | const | 210391 | 100.00 | Using temporary |
| 1 | SIMPLE | workspaces | NULL | eq_ref | PRIMARY,index_workspaces_on_account_id | PRIMARY | 4 | bm_rpm.story_allocation_days.workspace_id | 1 | 10.00 | Using where; Distinct |
| 1 | SIMPLE | assignments | NULL | eq_ref | PRIMARY,index_assignments_on_assignee_id | PRIMARY | 4 | bm_rpm.story_allocation_days.assignment_id | 1 | 50.00 | Using where; Distinct |
| 1 | SIMPLE | participations | NULL | ref | index_participations_on_account_id_and_workspace_id,index_participations_on_workspace_id_and_user_id,index_participations_for_sad_api_index,index_participations_on_account_id,index_participations_on_workspace_id | index_participations_for_sad_api_index | 10 | bm_rpm.story_allocation_days.workspace_id,const | 7 | 33.33 | Using where; Using index; Distinct |
同样,上面的 FORMAT=JSON
版本:
# NEW:
| {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "366895.66"
},
"duplicates_removal": {
"using_temporary_table": true,
"using_filesort": false,
"nested_loop": [
{
"table": {
"table_name": "story_allocation_days",
"access_type": "ref",
"possible_keys": [
"index_story_allocation_days_on_workspace_id",
"index_story_allocation_days_on_assignment_id_and_date",
"index_story_allocation_days_on_assignment_id",
"sad_deleted_at"
],
"key": "sad_deleted_at",
"used_key_parts": [
"deleted_at"
],
"key_length": "6",
"ref": [
"const"
],
"rows_examined_per_scan": 210552,
"rows_produced_per_join": 210552,
"filtered": "100.00",
"cost_info": {
"read_cost": "11223.00",
"eval_cost": "42110.40",
"prefix_cost": "53333.40",
"data_read_per_join": "16M"
},
"used_columns": [
"id",
"assignment_id",
"story_id",
"workspace_id",
"account_id",
"current",
"date",
"minutes",
"created_at",
"updated_at",
"deleted_at",
"cost_amount_in_cents",
"bill_amount_in_cents",
"cost_rate_in_cents",
"bill_rate_in_cents"
]
}
},
{
"table": {
"table_name": "workspaces",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_workspaces_on_account_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.workspace_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 21055,
"filtered": "10.00",
"distinct": true,
"cost_info": {
"read_cost": "210552.00",
"eval_cost": "4211.04",
"prefix_cost": "305995.80",
"data_read_per_join": "105M"
},
"used_columns": [
"id",
"is_budgeted",
"account_id"
],
"attached_condition": "(`bm_rpm`.`workspaces`.`is_budgeted` = TRUE)"
}
},
{
"table": {
"table_name": "assignments",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_assignments_on_assignee_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.assignment_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 10527,
"filtered": "50.00",
"distinct": true,
"cost_info": {
"read_cost": "21055.20",
"eval_cost": "2105.52",
"prefix_cost": "331262.04",
"data_read_per_join": "657K"
},
"used_columns": [
"id",
"assignee_id"
],
"attached_condition": "(`bm_rpm`.`assignments`.`assignee_id` is not null)"
}
},
{
"table": {
"table_name": "participations",
"access_type": "ref",
"possible_keys": [
"index_participations_on_account_id_and_workspace_id",
"index_participations_on_workspace_id_and_user_id",
"index_participations_for_sad_api_index",
"index_participations_on_account_id",
"index_participations_on_workspace_id"
],
"key": "index_participations_for_sad_api_index",
"used_key_parts": [
"workspace_id",
"account_id"
],
"key_length": "10",
"ref": [
"bm_rpm.story_allocation_days.workspace_id",
"const"
],
"rows_examined_per_scan": 7,
"rows_produced_per_join": 27096,
"filtered": "33.33",
"using_index": true,
"distinct": true,
"cost_info": {
"read_cost": "19373.95",
"eval_cost": "5419.35",
"prefix_cost": "366895.66",
"data_read_per_join": "55M"
},
"used_columns": [
"id",
"workspace_id",
"type",
"account_id",
"access_integer"
],
"attached_condition": "((`bm_rpm`.`participations`.`access_integer` >= 30) and ((`bm_rpm`.`participations`.`type` = 'MavenParticipation') or (`bm_rpm`.`workspaces`.`account_id` = 5071)))"
}
}
]
}
}
} |
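I'm confused about why the query time did not improve (it actually got worse), even though the `filtered` column for `story_allocation_days` went from 10.0 to 100.0 and the number of rows examined dropped from 400,000+ to roughly 200,000.
Can anyone see why the index didn't reduce the query time?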
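Edit:
I tried removing `SQL_CALC_FOUND_ROWS` from the original query, but the 99k rows still take 5.51 seconds. The `EXPLAIN` follows, in JSON format only (because I ran out of characters for this question):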
| {
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "153230.79"
},
"duplicates_removal": {
"using_temporary_table": true,
"using_filesort": false,
"nested_loop": [
{
"table": {
"table_name": "story_allocation_days",
"access_type": "ALL",
"possible_keys": [
"index_story_allocation_days_on_workspace_id",
"index_story_allocation_days_on_assignment_id_and_date",
"index_story_allocation_days_on_assignment_id"
],
"rows_examined_per_scan": 430268,
"rows_produced_per_join": 43026,
"filtered": "10.00",
"cost_info": {
"read_cost": "80548.24",
"eval_cost": "8605.36",
"prefix_cost": "89153.60",
"data_read_per_join": "3M"
},
"used_columns": [
"id",
"assignment_id",
"story_id",
"workspace_id",
"account_id",
"current",
"date",
"minutes",
"created_at",
"updated_at",
"deleted_at",
"cost_amount_in_cents",
"bill_amount_in_cents",
"cost_rate_in_cents",
"bill_rate_in_cents"
],
"attached_condition": "isnull(`bm_rpm`.`story_allocation_days`.`deleted_at`)"
}
},
{
"table": {
"table_name": "workspaces",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_workspaces_on_account_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.workspace_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 4302,
"filtered": "10.00",
"distinct": true,
"cost_info": {
"read_cost": "43026.80",
"eval_cost": "860.54",
"prefix_cost": "140785.76",
"data_read_per_join": "21M"
},
"used_columns": [
"id",
"is_budgeted",
"account_id"
],
"attached_condition": "(`bm_rpm`.`workspaces`.`is_budgeted` = TRUE)"
}
},
{
"table": {
"table_name": "assignments",
"access_type": "eq_ref",
"possible_keys": [
"PRIMARY",
"index_assignments_on_assignee_id"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"bm_rpm.story_allocation_days.assignment_id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 2151,
"filtered": "50.00",
"distinct": true,
"cost_info": {
"read_cost": "4302.68",
"eval_cost": "430.27",
"prefix_cost": "145948.98",
"data_read_per_join": "134K"
},
"used_columns": [
"id",
"assignee_id"
],
"attached_condition": "(`bm_rpm`.`assignments`.`assignee_id` is not null)"
}
},
{
"table": {
"table_name": "participations",
"access_type": "ref",
"possible_keys": [
"index_participations_on_account_id_and_workspace_id",
"index_participations_on_workspace_id_and_user_id",
"index_participations_for_sad_api_index",
"index_participations_on_account_id",
"index_participations_on_workspace_id"
],
"key": "index_participations_for_sad_api_index",
"used_key_parts": [
"workspace_id",
"account_id"
],
"key_length": "10",
"ref": [
"bm_rpm.story_allocation_days.workspace_id",
"const"
],
"rows_examined_per_scan": 7,
"rows_produced_per_join": 5537,
"filtered": "33.33",
"using_index": true,
"distinct": true,
"cost_info": {
"read_cost": "3959.11",
"eval_cost": "1107.46",
"prefix_cost": "153230.79",
"data_read_per_join": "11M"
},
"used_columns": [
"id",
"workspace_id",
"type",
"account_id",
"access_integer"
],
"attached_condition": "((`bm_rpm`.`participations`.`access_integer` >= 30) and ((`bm_rpm`.`participations`.`type` = 'MavenParticipation') or (`bm_rpm`.`workspaces`.`account_id` = 5071)))"
}
}
]
}
}
} |
Edit #2:
Here is the `SHOW CREATE TABLE` output for `story_allocation_days`:
| story_allocation_days | CREATE TABLE `story_allocation_days` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`assignment_id` int(11) NOT NULL,
`story_id` int(11) NOT NULL,
`workspace_id` int(11) NOT NULL,
`account_id` int(11) NOT NULL,
`current` tinyint(1) NOT NULL,
`date` date NOT NULL,
`minutes` int(11) NOT NULL DEFAULT '0',
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`deleted_at` datetime DEFAULT NULL,
`cost_amount_in_cents` bigint(20) DEFAULT NULL,
`bill_amount_in_cents` bigint(20) DEFAULT NULL,
`cost_rate_in_cents` bigint(20) DEFAULT NULL,
`bill_rate_in_cents` bigint(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_story_allocation_days_on_workspace_id` (`workspace_id`),
KEY `index_story_allocation_days_on_assignment_id_and_date` (`assignment_id`,`date`),
KEY `index_story_allocation_days_on_account_id` (`account_id`),
KEY `index_story_allocation_days_on_story_id_and_date` (`story_id`,`date`),
KEY `index_story_allocation_days_on_date` (`date`),
KEY `index_story_allocation_days_on_created_at` (`created_at`),
KEY `index_story_allocation_days_on_updated_at` (`updated_at`),
KEY `index_story_allocation_days_on_assignment_id` (`assignment_id`),
KEY `index_story_allocation_days_on_story_id` (`story_id`)
) ENGINE=InnoDB AUTO_INCREMENT=9646396 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
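I would offer the following. Rewrite the query to start from participations, since that is the granularity you are really starting from, not all "sad" (`story_allocation_days`) entries. Adding the MySQL keyword `STRAIGHT_JOIN` tells the engine: don't think for me, do it as I wrote it. Since you are doing `DISTINCT`, selecting `*` probably means evaluating all columns for uniqueness. You could select only `sad.id` instead (assuming each table has `id` as its own primary key). I've put `[whateverItsUniqueKeyIdIs]` below, but I think it may simply be `sad.id`; change it accordingly. If that is the unique value for the row, the engine does not need to go to the actual data row pages to qualify uniqueness.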
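Next, the indexes. Multi-column composite indexes that cover both the join columns and the WHERE criteria also help the engine answer the query directly from the index instead of going to the raw data pages. Don't rely on a table having separate indexes on individual columns: a single-column index won't be as well optimized as one that covers the criteria you are looking for.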
Table                   Index
participations          ( account_id, access_integer, type, workspace_id )
workspaces              ( id, is_budgeted, account_id )
story_allocation_days   ( workspace_id, deleted_at, assignment_id, [whateverItsUniqueKeyIdIs] )
assignments             ( id, assignee_id )
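As a rough sketch, the composite indexes listed above could be created along these lines. The index names are made up for illustration, and the unique key of `story_allocation_days` is assumed to be `id`, which is what its `SHOW CREATE TABLE` output suggests.

```sql
-- Hypothetical index names; the column order follows the list above.
ALTER TABLE participations
  ADD INDEX idx_p_account_access_type_ws (account_id, access_integer, `type`, workspace_id);

ALTER TABLE workspaces
  ADD INDEX idx_ws_id_budgeted_account (id, is_budgeted, account_id);

-- Assumption: the unique key is `id`. InnoDB secondary indexes already
-- carry the primary key, so listing `id` explicitly is optional.
ALTER TABLE story_allocation_days
  ADD INDEX idx_sad_ws_deleted_assignment (workspace_id, deleted_at, assignment_id, id);

ALTER TABLE assignments
  ADD INDEX idx_a_id_assignee (id, assignee_id);
```

Building these on large tables takes time and adds write overhead, so it is worth checking with `EXPLAIN` which ones the query actually uses before keeping all of them.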
Finally, the query itself.
SELECT STRAIGHT_JOIN DISTINCT
SQL_CALC_FOUND_ROWS sad.[whateverItsUniqueKeyIdIs]
FROM
participations p
INNER JOIN workspaces ws
ON p.workspace_id = ws.id
AND ws.is_budgeted = TRUE
INNER JOIN story_allocation_days sad
ON p.workspace_id = sad.workspace_id
AND sad.deleted_at is null
INNER JOIN assignments a
ON sad.assignment_id = a.id
AND a.assignee_id IS NOT NULL
where
p.account_id = 5071
AND p.access_integer >= 30
AND ( p.type = 'MavenParticipation'
OR ws.account_id = 5071)
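The only possible killer drag is the `OR ws.account_id = 5071`, but it should still be fast, because the primary basis is the participations limiting criteria rather than all accounts.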
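To check that the engine really follows the written join order, the plan can be inspected the same way as in the question. This sketch assumes the placeholder resolves to `sad.id` and leaves out `SQL_CALC_FOUND_ROWS` for brevity; with `STRAIGHT_JOIN`, the first entry under `nested_loop` should be `participations` rather than `story_allocation_days`.

```sql
EXPLAIN FORMAT=JSON
SELECT STRAIGHT_JOIN DISTINCT sad.id   -- assumption: `id` is the unique key
FROM participations p
INNER JOIN workspaces ws
        ON p.workspace_id = ws.id
       AND ws.is_budgeted = TRUE
INNER JOIN story_allocation_days sad
        ON p.workspace_id = sad.workspace_id
       AND sad.deleted_at IS NULL
INNER JOIN assignments a
        ON sad.assignment_id = a.id
       AND a.assignee_id IS NOT NULL
WHERE p.account_id = 5071
  AND p.access_integer >= 30
  AND (p.type = 'MavenParticipation' OR ws.account_id = 5071);
```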
Response to comment
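About the intuition behind the query. Databases can be full of bloat, and what you are searching for is sometimes buried in it. As the person behind the database, you know the granularity of the data better than the engine does.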
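I try to look at the root "what do I want" of a query. In your case, you want all participations for a specific account (with some minor other criteria). Once I have that in my head, it becomes my first table. Next, I look at the indexes that help me get that component first, and only that. Then I join outward to the other parts, such as your `workspaces`, `story_allocation_days`, and `assignments`. Because of the context of each comparison, I apply the limiting criteria directly in the JOIN conditions, with the exception of the "OR" on `ws.account_id`. The core is still always `p.account_id = 5071 AND p.access_integer >= 30`.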
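Having proper, efficient indexes is another matter, and the order and position of their columns can make a huge difference. The goal is not many indexes of one field each, but fewer, more effective indexes with multiple columns, based on the types of queries you run most often.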
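As I have described in answers to other posts, and applied to your data scenario: you originally started with the `story_allocation_days` table. The engine had to go through everything, for every account, before it even got to the participations table, and then on to all the workspaces and assignments. The engine doesn't know what you want, hence the need for effective indexes.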
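Start with the key table in question, `participations`. Knowing that you need a specific `account_id` and a range of `access_integer` values, that is your starting point. Imagine a room full of boxes that hold all the participation data. First and foremost, each box has an "Account_ID" written on its side, and all the boxes are in order. You could have 1,000 boxes and still run directly to the one account ID you want. You have eliminated all the other boxes in the rest of the room. Now you open that box, and inside, the contents are pre-sorted by `access_integer` from 1 to ??. So you flip to 30 and take everything from there on, and you are done.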
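Only then do you have the short list needed to look up the rest of the details, and you haven't even looked at the raw data pages yet. At the top of each page there is a note of the type and workspace ID, because that is how I suggested building the index. Again, there is no need to open the document and flip through pages to see the type and workspace; they are written at the top of the page for quick reference. These are the fields used to join to the next level of tables. All of this is done without fetching the raw data of the underlying records.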
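The box analogy corresponds to a lookup that the suggested participations index can answer on its own. A minimal sketch, assuming an index on `(account_id, access_integer, type, workspace_id)` exists on `participations`; if it is used as a covering index, the `Extra` column of the plan would be expected to include `Using index`.

```sql
EXPLAIN
SELECT p.workspace_id, p.`type`
FROM participations p
WHERE p.account_id = 5071      -- run straight to the one "box"
  AND p.access_integer >= 30;  -- skip ahead inside the pre-sorted box
```

Both selected columns are part of the assumed index, which is what lets the engine skip the underlying rows.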
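By using the `STRAIGHT_JOIN` clause, you have taken all the heavy lifting of planning the query away from the engine. It now just performs simple joins, using the proper indexes, to those secondary-level tables. HTH.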
`OR` (often) kills performance. `UNION` is a workaround:
( SELECT sad.*
FROM `story_allocation_days` AS sad
INNER JOIN `workspaces` AS w ON w.`id` = sad.`workspace_id`
INNER JOIN `participations` AS p ON p.`workspace_id` = w.`id`
INNER JOIN `assignments` AS a ON a.`id` = sad.`assignment_id`
WHERE sad.deleted_at = '1970-01-01'
AND w.`is_budgeted` = TRUE
AND p.account_id = 5071
AND p.access_integer >= 30
AND p.type = 'MavenParticipation'
AND a.assignee_id IS NOT NULL
)
UNION DISTINCT
( SELECT sad.*
FROM `story_allocation_days` AS sad
INNER JOIN `workspaces` AS w ON w.`id` = sad.`workspace_id`
INNER JOIN `participations` AS p ON p.`workspace_id` = w.`id`
INNER JOIN `assignments` AS a ON a.`id` = sad.`assignment_id`
WHERE sad.deleted_at = '1970-01-01'
AND w.`is_budgeted` = TRUE
AND w.account_id = 5071
AND p.account_id = 5071
AND p.access_integer >= 30
AND a.assignee_id IS NOT NULL
)
That will need the following indexes. The order of the columns in these indexes is important. (I assume each table has `PRIMARY KEY(id)`.)
w: INDEX(account_id, is_budgeted)
p: INDEX(account_id, type, access_integer, workspace_id)
p: INDEX(workspace_id, account_id, access_integer)
sad: INDEX(workspace_id, deleted_at)
sad: INDEX(assignment_id, deleted_at)
There are some extra indexes because I don't know which order the optimizer will prefer for accessing the tables. (It depends on the data.)
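Written out as DDL, the index list above might look like the sketch below. The index names are hypothetical; only the tables and column orders come from the list.

```sql
-- Hypothetical index names; column order as listed above.
ALTER TABLE workspaces
  ADD INDEX idx_w_account_budgeted (account_id, is_budgeted);

ALTER TABLE participations
  ADD INDEX idx_p_account_type_access_ws (account_id, `type`, access_integer, workspace_id),
  ADD INDEX idx_p_ws_account_access (workspace_id, account_id, access_integer);

ALTER TABLE story_allocation_days
  ADD INDEX idx_sad_ws_deleted (workspace_id, deleted_at),
  ADD INDEX idx_sad_assignment_deleted (assignment_id, deleted_at);
```

Once `EXPLAIN` shows which access order the optimizer settles on, the indexes it never picks can be dropped again.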
I removed `SQL_CALC_FOUND_ROWS` because it is unnecessary without a `LIMIT`.
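For context, `SQL_CALC_FOUND_ROWS` only pays off together with `LIMIT`: it lets a follow-up `SELECT FOUND_ROWS()` report how many rows the statement would have returned without the limit. A small sketch; the workspace filter and the page size of 50 are arbitrary, hypothetical values.

```sql
SELECT SQL_CALC_FOUND_ROWS sad.id
FROM story_allocation_days AS sad
WHERE sad.workspace_id = 1   -- hypothetical filter
ORDER BY sad.id
LIMIT 50;                    -- hypothetical page size

SELECT FOUND_ROWS();         -- total matching rows, ignoring the LIMIT
```

Without a `LIMIT`, the client already receives every row and can count them itself.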
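Let me know if it is common for the same row to show up in both `SELECTs`, and whether there are a lot of big columns (e.g. `TEXT` or `BLOB`) in `sad`. In that case, I would want to rework the query to avoid hauling around so much data: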
SELECT sad2.*
FROM ( ( SELECT sad.id
FROM ... (1st 4-way join plus WHERE)
)
UNION DISTINCT
( SELECT sad.id
FROM ... (2nd 4-way join plus WHERE)
) ) AS u
JOIN `story_allocation_days` AS sad2 USING(id);
This is likely to be faster because only `id` has to be hauled around until the `UNION` is finished.
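Spelled out, the id-only variant might look like the sketch below. The two four-way joins and their WHERE clauses are copied from the `UNION` query earlier in this answer, including its `deleted_at` predicate, and nothing else is changed.

```sql
SELECT sad2.*
FROM (
    ( SELECT sad.id
        FROM story_allocation_days AS sad
        INNER JOIN workspaces     AS w ON w.id = sad.workspace_id
        INNER JOIN participations AS p ON p.workspace_id = w.id
        INNER JOIN assignments    AS a ON a.id = sad.assignment_id
       WHERE sad.deleted_at = '1970-01-01'
         AND w.is_budgeted = TRUE
         AND p.account_id = 5071
         AND p.access_integer >= 30
         AND p.`type` = 'MavenParticipation'
         AND a.assignee_id IS NOT NULL )
    UNION DISTINCT
    ( SELECT sad.id
        FROM story_allocation_days AS sad
        INNER JOIN workspaces     AS w ON w.id = sad.workspace_id
        INNER JOIN participations AS p ON p.workspace_id = w.id
        INNER JOIN assignments    AS a ON a.id = sad.assignment_id
       WHERE sad.deleted_at = '1970-01-01'
         AND w.is_budgeted = TRUE
         AND w.account_id = 5071
         AND p.account_id = 5071
         AND p.access_integer >= 30
         AND a.assignee_id IS NOT NULL )
) AS u
JOIN story_allocation_days AS sad2 USING (id);  -- fetch full rows once, by primary key
```

The temporary table built for `UNION DISTINCT` then holds only integer ids, and the wide rows are read a single time in the final join.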