Why is this MySQL query slower after replacing a full table scan with a scan of `type=ref`?

In my MySQL database (v5.7.31-34, using InnoDB), I am trying to improve the performance of the following query:

SELECT DISTINCT SQL_CALC_FOUND_ROWS `story_allocation_days`.* 
FROM `story_allocation_days` 
INNER JOIN `workspaces` 
ON `workspaces`.`id` = `story_allocation_days`.`workspace_id` 
INNER JOIN `participations` 
ON `participations`.`workspace_id` = `workspaces`.`id` 
INNER JOIN `assignments` 
ON `assignments`.`id` = `story_allocation_days`.`assignment_id` 
WHERE (story_allocation_days.deleted_at is null) 
AND `workspaces`.`is_budgeted` = TRUE 
AND (participations.account_id = 5071) 
AND (participations.access_integer >= 30) 
AND (participations.type = 'MavenParticipation' OR workspaces.account_id = 5071) 
AND (assignments.assignee_id IS NOT NULL);
99822 rows in set (5.35 sec)

EXPLAIN for the above:

| id | select_type | table                 | partitions | type   | possible_keys                                                                                                                                                                                                       | key                                    | key_len | ref                                                   | rows   | filtered | Extra                              |
|  1 | SIMPLE      | story_allocation_days | NULL       | ALL    | index_story_allocation_days_on_workspace_id,index_story_allocation_days_on_assignment_id_and_date,index_story_allocation_days_on_assignment_id                                                                      | NULL                                   | NULL    | NULL                                                  | 430531 |    10.00 | Using where; Using temporary       |
|  1 | SIMPLE      | workspaces            | NULL       | eq_ref | PRIMARY,index_workspaces_on_account_id                                                                                                                                                                              | PRIMARY                                | 4       | bm_rpm.story_allocation_days.workspace_id       |      1 |    10.00 | Using where; Distinct              |
|  1 | SIMPLE      | assignments           | NULL       | eq_ref | PRIMARY,index_assignments_on_assignee_id                                                                                                                                                                            | PRIMARY                                | 4       | bm_rpm.story_allocation_days.assignment_id      |      1 |    50.00 | Using where; Distinct              |
|  1 | SIMPLE      | participations        | NULL       | ref    | index_participations_on_account_id_and_workspace_id,index_participations_on_workspace_id_and_user_id,index_participations_for_sad_api_index,index_participations_on_account_id,index_participations_on_workspace_id | index_participations_for_sad_api_index | 10      | bm_rpm.story_allocation_days.workspace_id,const |      7 |    33.33 | Using where; Using index; Distinct |

Here is the FORMAT=JSON version of the above:

| {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "153230.79"
    },
    "duplicates_removal": {
      "using_temporary_table": true,
      "using_filesort": false,
      "nested_loop": [
        {
          "table": {
            "table_name": "story_allocation_days",
            "access_type": "ALL",
            "possible_keys": [
              "index_story_allocation_days_on_workspace_id",
              "index_story_allocation_days_on_assignment_id_and_date",
              "index_story_allocation_days_on_assignment_id"
            ],
            "rows_examined_per_scan": 430268,
            "rows_produced_per_join": 43026,
            "filtered": "10.00",
            "cost_info": {
              "read_cost": "80548.24",
              "eval_cost": "8605.36",
              "prefix_cost": "89153.60",
              "data_read_per_join": "3M"
            },
            "used_columns": [
              "id",
              "assignment_id",
              "story_id",
              "workspace_id",
              "account_id",
              "current",
              "date",
              "minutes",
              "created_at",
              "updated_at",
              "deleted_at",
              "cost_amount_in_cents",
              "bill_amount_in_cents",
              "cost_rate_in_cents",
              "bill_rate_in_cents"
            ],
            "attached_condition": "isnull(`bm_rpm`.`story_allocation_days`.`deleted_at`)"
          }
        },
        {
          "table": {
            "table_name": "workspaces",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY",
              "index_workspaces_on_account_id"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "bm_rpm.story_allocation_days.workspace_id"
            ],
            "rows_examined_per_scan": 1,
            "rows_produced_per_join": 4302,
            "filtered": "10.00",
            "distinct": true,
            "cost_info": {
              "read_cost": "43026.80",
              "eval_cost": "860.54",
              "prefix_cost": "140785.76",
              "data_read_per_join": "21M"
            },
            "used_columns": [
              "id",
              "is_budgeted",
              "account_id"
            ],
            "attached_condition": "(`bm_rpm`.`workspaces`.`is_budgeted` = TRUE)"
          }
        },
        {
          "table": {
            "table_name": "assignments",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY",
              "index_assignments_on_assignee_id"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "bm_rpm.story_allocation_days.assignment_id"
            ],
            "rows_examined_per_scan": 1,
            "rows_produced_per_join": 2151,
            "filtered": "50.00",
            "distinct": true,
            "cost_info": {
              "read_cost": "4302.68",
              "eval_cost": "430.27",
              "prefix_cost": "145948.98",
              "data_read_per_join": "134K"
            },
            "used_columns": [
              "id",
              "assignee_id"
            ],
            "attached_condition": "(`bm_rpm`.`assignments`.`assignee_id` is not null)"
          }
        },
        {
          "table": {
            "table_name": "participations",
            "access_type": "ref",
            "possible_keys": [
              "index_participations_on_account_id_and_workspace_id",
              "index_participations_on_workspace_id_and_user_id",
              "index_participations_for_sad_api_index",
              "index_participations_on_account_id",
              "index_participations_on_workspace_id"
            ],
            "key": "index_participations_for_sad_api_index",
            "used_key_parts": [
              "workspace_id",
              "account_id"
            ],
            "key_length": "10",
            "ref": [
              "bm_rpm.story_allocation_days.workspace_id",
              "const"
            ],
            "rows_examined_per_scan": 7,
            "rows_produced_per_join": 5537,
            "filtered": "33.33",
            "using_index": true,
            "distinct": true,
            "cost_info": {
              "read_cost": "3959.11",
              "eval_cost": "1107.46",
              "prefix_cost": "153230.79",
              "data_read_per_join": "11M"
            },
            "used_columns": [
              "id",
              "workspace_id",
              "type",
              "account_id",
              "access_integer"
            ],
            "attached_condition": "((`bm_rpm`.`participations`.`access_integer` >= 30) and ((`bm_rpm`.`participations`.`type` = 'MavenParticipation') or (`bm_rpm`.`workspaces`.`account_id` = 5071)))"
          }
        }
      ]
    }
  }
} |

I tried to find an index that would change this from a full table scan to a more efficient access type. Here are the indexes:

mysql> show indexes in story_allocation_days;
| Table                 | Non_unique | Key_name                                              | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
| story_allocation_days |          0 | PRIMARY                                               |            1 | id            | A         |      418853 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | index_story_allocation_days_on_workspace_id           |            1 | workspace_id  | A         |         295 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | index_story_allocation_days_on_assignment_id_and_date |            1 | assignment_id | A         |       42533 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | index_story_allocation_days_on_assignment_id_and_date |            2 | date          | A         |      418853 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | index_story_allocation_days_on_account_id             |            1 | account_id    | A         |          41 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | index_story_allocation_days_on_story_id_and_date      |            1 | story_id      | A         |        2519 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | index_story_allocation_days_on_story_id_and_date      |            2 | date          | A         |       83246 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | index_story_allocation_days_on_date                   |            1 | date          | A         |         740 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | index_story_allocation_days_on_created_at             |            1 | created_at    | A         |        6977 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | index_story_allocation_days_on_updated_at             |            1 | updated_at    | A         |        7681 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | index_story_allocation_days_on_assignment_id          |            1 | assignment_id | A         |       43891 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | index_story_allocation_days_on_story_id               |            1 | story_id      | A         |        2494 |     NULL | NULL   |      | BTREE      |         |               |
| story_allocation_days |          1 | sad_deleted_at                                        |            1 | deleted_at    | A         |         225 |     NULL | NULL   | YES  | BTREE      |         |               |

I tried USE INDEX(index_story_allocation_days_on_assignment_id) on story_allocation_days, but that was actually slightly slower than the original query:

99822 rows in set (5.99 sec)

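On a hunch, I thought WHERE (story_allocation_days.deleted_at is null) might be throwing off the optimizer, so I created the sad_deleted_at index and updated every row whose deleted_at was NULL to 1970-01-01 00:00:00, since index performance can suffer when NULL values are present. However, this also performed worse than the original: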

SELECT DISTINCT SQL_CALC_FOUND_ROWS `story_allocation_days`.* 
FROM `story_allocation_days` 
INNER JOIN `workspaces` 
ON `workspaces`.`id` = `story_allocation_days`.`workspace_id` 
INNER JOIN `participations` 
ON `participations`.`workspace_id` = `workspaces`.`id` 
INNER JOIN `assignments` 
ON `assignments`.`id` = `story_allocation_days`.`assignment_id` 
WHERE (story_allocation_days.deleted_at = '1970-01-01 00:00:00') 
AND `workspaces`.`is_budgeted` = TRUE 
AND (participations.account_id = 5071) 
AND (participations.access_integer >= 30) 
AND (participations.type = 'MavenParticipation' OR workspaces.account_id = 5071) 
AND (assignments.assignee_id IS NOT NULL);
99822 rows in set (6.12 sec)

Here is the corresponding EXPLAIN:

|  1 | SIMPLE      | story_allocation_days | NULL       | ref    | index_story_allocation_days_on_workspace_id,index_story_allocation_days_on_assignment_id_and_date,index_story_allocation_days_on_assignment_id,sad_deleted_at                                                       | sad_deleted_at                         | 6       | const                                                 | 210391 |   100.00 | Using temporary                    |
|  1 | SIMPLE      | workspaces            | NULL       | eq_ref | PRIMARY,index_workspaces_on_account_id                                                                                                                                                                              | PRIMARY                                | 4       | bm_rpm.story_allocation_days.workspace_id       |      1 |    10.00 | Using where; Distinct              |
|  1 | SIMPLE      | assignments           | NULL       | eq_ref | PRIMARY,index_assignments_on_assignee_id                                                                                                                                                                            | PRIMARY                                | 4       | bm_rpm.story_allocation_days.assignment_id      |      1 |    50.00 | Using where; Distinct              |
|  1 | SIMPLE      | participations        | NULL       | ref    | index_participations_on_account_id_and_workspace_id,index_participations_on_workspace_id_and_user_id,index_participations_for_sad_api_index,index_participations_on_account_id,index_participations_on_workspace_id | index_participations_for_sad_api_index | 10      | bm_rpm.story_allocation_days.workspace_id,const |      7 |    33.33 | Using where; Using index; Distinct |

Again, the FORMAT=JSON version of the above:

# NEW:
| {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "366895.66"
    },
    "duplicates_removal": {
      "using_temporary_table": true,
      "using_filesort": false,
      "nested_loop": [
        {
          "table": {
            "table_name": "story_allocation_days",
            "access_type": "ref",
            "possible_keys": [
              "index_story_allocation_days_on_workspace_id",
              "index_story_allocation_days_on_assignment_id_and_date",
              "index_story_allocation_days_on_assignment_id",
              "sad_deleted_at"
            ],
            "key": "sad_deleted_at",
            "used_key_parts": [
              "deleted_at"
            ],
            "key_length": "6",
            "ref": [
              "const"
            ],
            "rows_examined_per_scan": 210552,
            "rows_produced_per_join": 210552,
            "filtered": "100.00",
            "cost_info": {
              "read_cost": "11223.00",
              "eval_cost": "42110.40",
              "prefix_cost": "53333.40",
              "data_read_per_join": "16M"
            },
            "used_columns": [
              "id",
              "assignment_id",
              "story_id",
              "workspace_id",
              "account_id",
              "current",
              "date",
              "minutes",
              "created_at",
              "updated_at",
              "deleted_at",
              "cost_amount_in_cents",
              "bill_amount_in_cents",
              "cost_rate_in_cents",
              "bill_rate_in_cents"
            ]
          }
        },
        {
          "table": {
            "table_name": "workspaces",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY",
              "index_workspaces_on_account_id"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "bm_rpm.story_allocation_days.workspace_id"
            ],
            "rows_examined_per_scan": 1,
            "rows_produced_per_join": 21055,
            "filtered": "10.00",
            "distinct": true,
            "cost_info": {
              "read_cost": "210552.00",
              "eval_cost": "4211.04",
              "prefix_cost": "305995.80",
              "data_read_per_join": "105M"
            },
            "used_columns": [
              "id",
              "is_budgeted",
              "account_id"
            ],
            "attached_condition": "(`bm_rpm`.`workspaces`.`is_budgeted` = TRUE)"
          }
        },
        {
          "table": {
            "table_name": "assignments",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY",
              "index_assignments_on_assignee_id"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "bm_rpm.story_allocation_days.assignment_id"
            ],
            "rows_examined_per_scan": 1,
            "rows_produced_per_join": 10527,
            "filtered": "50.00",
            "distinct": true,
            "cost_info": {
              "read_cost": "21055.20",
              "eval_cost": "2105.52",
              "prefix_cost": "331262.04",
              "data_read_per_join": "657K"
            },
            "used_columns": [
              "id",
              "assignee_id"
            ],
            "attached_condition": "(`bm_rpm`.`assignments`.`assignee_id` is not null)"
          }
        },
        {
          "table": {
            "table_name": "participations",
            "access_type": "ref",
            "possible_keys": [
              "index_participations_on_account_id_and_workspace_id",
              "index_participations_on_workspace_id_and_user_id",
              "index_participations_for_sad_api_index",
              "index_participations_on_account_id",
              "index_participations_on_workspace_id"
            ],
            "key": "index_participations_for_sad_api_index",
            "used_key_parts": [
              "workspace_id",
              "account_id"
            ],
            "key_length": "10",
            "ref": [
              "bm_rpm.story_allocation_days.workspace_id",
              "const"
            ],
            "rows_examined_per_scan": 7,
            "rows_produced_per_join": 27096,
            "filtered": "33.33",
            "using_index": true,
            "distinct": true,
            "cost_info": {
              "read_cost": "19373.95",
              "eval_cost": "5419.35",
              "prefix_cost": "366895.66",
              "data_read_per_join": "55M"
            },
            "used_columns": [
              "id",
              "workspace_id",
              "type",
              "account_id",
              "access_integer"
            ],
            "attached_condition": "((`bm_rpm`.`participations`.`access_integer` >= 30) and ((`bm_rpm`.`participations`.`type` = 'MavenParticipation') or (`bm_rpm`.`workspaces`.`account_id` = 5071)))"
          }
        }
      ]
    }
  }
} |

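I'm confused about why the query time did not improve (and actually got worse), even though the filtered column for story_allocation_days went from 10.0 to 100.0 and the number of rows examined went from 400,000+ to 200,000-ish.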

Can anyone see why the index did not reduce the query time?

EDIT:

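I tried removing SQL_CALC_FOUND_ROWS from the original query, but it still took 5.51 seconds for the 99k rows. The EXPLAIN is below, in JSON format only (because I am running out of characters in this question):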

| {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "153230.79"
    },
    "duplicates_removal": {
      "using_temporary_table": true,
      "using_filesort": false,
      "nested_loop": [
        {
          "table": {
            "table_name": "story_allocation_days",
            "access_type": "ALL",
            "possible_keys": [
              "index_story_allocation_days_on_workspace_id",
              "index_story_allocation_days_on_assignment_id_and_date",
              "index_story_allocation_days_on_assignment_id"
            ],
            "rows_examined_per_scan": 430268,
            "rows_produced_per_join": 43026,
            "filtered": "10.00",
            "cost_info": {
              "read_cost": "80548.24",
              "eval_cost": "8605.36",
              "prefix_cost": "89153.60",
              "data_read_per_join": "3M"
            },
            "used_columns": [
              "id",
              "assignment_id",
              "story_id",
              "workspace_id",
              "account_id",
              "current",
              "date",
              "minutes",
              "created_at",
              "updated_at",
              "deleted_at",
              "cost_amount_in_cents",
              "bill_amount_in_cents",
              "cost_rate_in_cents",
              "bill_rate_in_cents"
            ],
            "attached_condition": "isnull(`bm_rpm`.`story_allocation_days`.`deleted_at`)"
          }
        },
        {
          "table": {
            "table_name": "workspaces",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY",
              "index_workspaces_on_account_id"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "bm_rpm.story_allocation_days.workspace_id"
            ],
            "rows_examined_per_scan": 1,
            "rows_produced_per_join": 4302,
            "filtered": "10.00",
            "distinct": true,
            "cost_info": {
              "read_cost": "43026.80",
              "eval_cost": "860.54",
              "prefix_cost": "140785.76",
              "data_read_per_join": "21M"
            },
            "used_columns": [
              "id",
              "is_budgeted",
              "account_id"
            ],
            "attached_condition": "(`bm_rpm`.`workspaces`.`is_budgeted` = TRUE)"
          }
        },
        {
          "table": {
            "table_name": "assignments",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY",
              "index_assignments_on_assignee_id"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "bm_rpm.story_allocation_days.assignment_id"
            ],
            "rows_examined_per_scan": 1,
            "rows_produced_per_join": 2151,
            "filtered": "50.00",
            "distinct": true,
            "cost_info": {
              "read_cost": "4302.68",
              "eval_cost": "430.27",
              "prefix_cost": "145948.98",
              "data_read_per_join": "134K"
            },
            "used_columns": [
              "id",
              "assignee_id"
            ],
            "attached_condition": "(`bm_rpm`.`assignments`.`assignee_id` is not null)"
          }
        },
        {
          "table": {
            "table_name": "participations",
            "access_type": "ref",
            "possible_keys": [
              "index_participations_on_account_id_and_workspace_id",
              "index_participations_on_workspace_id_and_user_id",
              "index_participations_for_sad_api_index",
              "index_participations_on_account_id",
              "index_participations_on_workspace_id"
            ],
            "key": "index_participations_for_sad_api_index",
            "used_key_parts": [
              "workspace_id",
              "account_id"
            ],
            "key_length": "10",
            "ref": [
              "bm_rpm.story_allocation_days.workspace_id",
              "const"
            ],
            "rows_examined_per_scan": 7,
            "rows_produced_per_join": 5537,
            "filtered": "33.33",
            "using_index": true,
            "distinct": true,
            "cost_info": {
              "read_cost": "3959.11",
              "eval_cost": "1107.46",
              "prefix_cost": "153230.79",
              "data_read_per_join": "11M"
            },
            "used_columns": [
              "id",
              "workspace_id",
              "type",
              "account_id",
              "access_integer"
            ],
            "attached_condition": "((`bm_rpm`.`participations`.`access_integer` >= 30) and ((`bm_rpm`.`participations`.`type` = 'MavenParticipation') or (`bm_rpm`.`workspaces`.`account_id` = 5071)))"
          }
        }
      ]
    }
  }
} |

EDIT #2:

Here is the SHOW CREATE TABLE for story_allocation_days:

| story_allocation_days | CREATE TABLE `story_allocation_days` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `assignment_id` int(11) NOT NULL,
  `story_id` int(11) NOT NULL,
  `workspace_id` int(11) NOT NULL,
  `account_id` int(11) NOT NULL,
  `current` tinyint(1) NOT NULL,
  `date` date NOT NULL,
  `minutes` int(11) NOT NULL DEFAULT '0',
  `created_at` datetime NOT NULL,
  `updated_at` datetime NOT NULL,
  `deleted_at` datetime DEFAULT NULL,
  `cost_amount_in_cents` bigint(20) DEFAULT NULL,
  `bill_amount_in_cents` bigint(20) DEFAULT NULL,
  `cost_rate_in_cents` bigint(20) DEFAULT NULL,
  `bill_rate_in_cents` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index_story_allocation_days_on_workspace_id` (`workspace_id`),
  KEY `index_story_allocation_days_on_assignment_id_and_date` (`assignment_id`,`date`),
  KEY `index_story_allocation_days_on_account_id` (`account_id`),
  KEY `index_story_allocation_days_on_story_id_and_date` (`story_id`,`date`),
  KEY `index_story_allocation_days_on_date` (`date`),
  KEY `index_story_allocation_days_on_created_at` (`created_at`),
  KEY `index_story_allocation_days_on_updated_at` (`updated_at`),
  KEY `index_story_allocation_days_on_assignment_id` (`assignment_id`),
  KEY `index_story_allocation_days_on_story_id` (`story_id`)
) ENGINE=InnoDB AUTO_INCREMENT=9646396 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |

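I would offer the following. Rewrite the query to start from participations, since that is the granularity you are really starting with, not all "SAD" (story_allocation_days) entries. Adding the MySQL keyword "STRAIGHT_JOIN" tells the engine: don't think for me, do it in the order I wrote. Because you are doing DISTINCT, selecting "*" probably means evaluating ALL columns for uniqueness. If you select only "sad.id" instead, and that is the unique value for the row, the engine does not need to go to the actual data pages to qualify uniqueness. (I am assuming each table has "id" as its own primary key. I wrote [whateverItsUniqueKeyIdIs] below, but it is probably just sad.id, so change it accordingly.)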

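Next, indexes. Multi-column composite indexes that cover both the join columns and the WHERE criteria will also help the engine answer the query directly from the index rather than going to the raw data pages. Don't rely on the tables having indexes on individual columns, since a single-column index will not optimize as well as one that covers the criteria you are looking for.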

Table                  Index
participations         ( account_id, access_integer, type, workspace_id )
workspaces             ( id, is_budgeted, account_id )
story_allocation_days  ( workspace_id, deleted_at, assignment_id, [whateverItsUniqueKeyIdIs] )
assignments            ( id, assignee_id )
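As a rough sketch, the composite indexes in the table above could be created as follows. The index names are made up for illustration. The story_allocation_days index assumes the unique key is id, as the SHOW CREATE TABLE in the question suggests; InnoDB stores the primary key in every secondary index, so it is not listed explicitly.

```sql
-- Hypothetical index names; column order follows the table above.
ALTER TABLE participations
  ADD INDEX idx_p_account_access_type_ws (account_id, access_integer, `type`, workspace_id);

ALTER TABLE workspaces
  ADD INDEX idx_ws_id_budgeted_account (id, is_budgeted, account_id);

-- Assumes PRIMARY KEY (id); InnoDB carries it in each secondary index entry.
ALTER TABLE story_allocation_days
  ADD INDEX idx_sad_ws_deleted_assignment (workspace_id, deleted_at, assignment_id);

ALTER TABLE assignments
  ADD INDEX idx_a_id_assignee (id, assignee_id);
```

Each extra index adds write overhead, so it may be worth adding them one at a time and checking EXPLAIN after each.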

Finally, the query itself.

SELECT STRAIGHT_JOIN DISTINCT 
        SQL_CALC_FOUND_ROWS sad.[whateverItsUniqueKeyIdIs]
    FROM 
        participations p
            INNER JOIN workspaces ws
                ON p.workspace_id = ws.id
                AND ws.is_budgeted = TRUE 
            INNER JOIN story_allocation_days sad
                ON p.workspace_id = sad.workspace_id
                AND sad.deleted_at is null
                INNER JOIN assignments a
                    ON sad.assignment_id = a.id  
                    AND a.assignee_id IS NOT NULL
    where
            p.account_id = 5071
        AND p.access_integer >= 30
        AND ( p.type = 'MavenParticipation' 
            OR ws.account_id = 5071)

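The only possible performance killer is the "OR ws.account_id = 5071", but it should still be fast because the primary basis is the participations criteria rather than all accounts.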
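To check that the engine really follows the written order, one option is to put EXPLAIN in front of the rewritten query and confirm that participations is the first table in nested_loop. The sketch below uses sad.id in place of [whateverItsUniqueKeyIdIs], which is an assumption, and leaves out SQL_CALC_FOUND_ROWS since only the plan is of interest.

```sql
-- Same joins and filters as the query above; sad.id is assumed to be the unique key.
EXPLAIN FORMAT=JSON
SELECT STRAIGHT_JOIN DISTINCT sad.id
    FROM participations p
        INNER JOIN workspaces ws
            ON p.workspace_id = ws.id
            AND ws.is_budgeted = TRUE
        INNER JOIN story_allocation_days sad
            ON p.workspace_id = sad.workspace_id
            AND sad.deleted_at IS NULL
        INNER JOIN assignments a
            ON sad.assignment_id = a.id
            AND a.assignee_id IS NOT NULL
    WHERE p.account_id = 5071
      AND p.access_integer >= 30
      AND ( p.type = 'MavenParticipation'
         OR ws.account_id = 5071 );
```

Comparing query_cost and rows_examined_per_scan against the plans in the question gives a first impression, but the wall-clock time of the real query is what counts.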

Response to comments

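Regarding intuition for queries: databases can be full of bloat that sometimes hides what you are actually searching for. As the person behind the database, you know the granularity of the data better than the engine does.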

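I try to look at the root "what do I want" of a query. In your case, you want all participations for a specific account (with some minor additional criteria). Once I have that in mind, it becomes my first table. Then I look at the indexes that help me get that component first, and only that. From there I join outward to the other parts, such as your workspaces, story_allocation_days and assignments. Because of the context of each comparison, I applied the restricting criteria directly in the JOIN conditions, except for the "OR" on ws.account_id, but the core is still always p.account_id = 5071 AND p.access_integer >= 30.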

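Having proper, effective indexes is another matter, and putting their columns in the right order can make a huge difference. Rather than many indexes of one field each, prefer fewer indexes with multiple columns, based on the types of queries you run most often.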

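As I have described in other answers, and applied to your data scenario: you originally started from the story_allocation_days table. The engine had to go through everything, for every account, before it even got to the participations table, and then on to all the workspaces and assignments. The engine does not know what you want, which is why effective indexes matter.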

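Start from the key table in question, participations. Knowing that you want a specific account_id and a range of access_integer values, that is your starting point. Picture a room full of boxes that hold all the participation data. Every box has an "Account_ID" written on its side, and the boxes are arranged in order. There could be 1,000 boxes, and you can run directly to the one account ID you want. You have eliminated every other box in the room. Now you open that box, and inside, everything is pre-sorted by access_integer from 1-??. So you flip to 30, take everything from there onward, and you are done.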

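Only then do you have a short list for finding the rest of the details, and you have not even looked at the raw data pages yet. At the top of each page is a note of the type and workspace ID, because that is how I suggested building the index. Again, there is no need to flip through the documents to see the type and workspace; they are written at the top of the page for quick reference. Those are the fields used to join to the next level of tables. All of this is done without fetching the raw data of the underlying records.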
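One way to see this "answered from the index alone" behaviour is to EXPLAIN a query that touches only the indexed columns. This is a minimal sketch that assumes a composite index on participations (account_id, access_integer, type, workspace_id) has already been created.

```sql
-- Only columns contained in the assumed composite index are referenced,
-- so the engine has the option of never reading the table rows themselves.
EXPLAIN
SELECT p.workspace_id, p.`type`
    FROM participations p
    WHERE p.account_id = 5071
      AND p.access_integer >= 30;
```

When MySQL resolves a query from an index without reading the rows, the Extra column of EXPLAIN includes "Using index". Whether the optimizer actually picks this index depends on the data and on the other indexes available.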

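By using the STRAIGHT_JOIN clause, you have taken the heavy lifting of planning the query away from the engine, and now it only has to do simple joins to the secondary-level tables using the proper indexes. HTH.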

OR (usually) kills performance. UNION is a workaround:

( SELECT   sad.*
    FROM  `story_allocation_days` AS sad
    INNER JOIN  `workspaces` AS w  ON w.`id` = sad.`workspace_id`
    INNER JOIN  `participations` AS p  ON p.`workspace_id` = w.`id`
    INNER JOIN  `assignments` AS a  ON a.`id` = sad.`assignment_id`
    WHERE  sad.deleted_at = '1970-01-01'
      AND  w.`is_budgeted` = TRUE
      AND  p.account_id = 5071
      AND  p.access_integer >= 30
      AND  p.type = 'MavenParticipation'
      AND  a.assignee_id IS NOT NULL
)
UNION DISTINCT
( SELECT   sad.*
    FROM  `story_allocation_days` AS sad
    INNER JOIN  `workspaces` AS w  ON w.`id` = sad.`workspace_id`
    INNER JOIN  `participations` AS p  ON p.`workspace_id` = w.`id`
    INNER JOIN  `assignments` AS a  ON a.`id` = sad.`assignment_id`
    WHERE  sad.deleted_at = '1970-01-01'
      AND  w.`is_budgeted` = TRUE
      AND  w.account_id = 5071
      AND  p.account_id = 5071
      AND  p.access_integer >= 30
      AND  a.assignee_id IS NOT NULL
)

That would then need the following indexes. The order of the columns in these indexes matters. (I assume each table has PRIMARY KEY(id).)

w:  INDEX(account_id, is_budgeted)
p:  INDEX(account_id, type, access_integer, workspace_id)
p:  INDEX(workspace_id, account_id, access_integer)
sad:  INDEX(workspace_id, deleted_at)
sad:  INDEX(assignment_id, deleted_at)

Some of these indexes are extras, because I don't know in what order the optimizer will prefer to access the tables. (It depends on the data.)
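Written out as DDL, the list above might look like this. It is only a sketch: the index names are invented for illustration, and the table names behind the aliases w, p and sad are taken from the query.

```sql
-- Hypothetical index names; column order is exactly as listed above.
ALTER TABLE workspaces
  ADD INDEX idx_w_account_budgeted (account_id, is_budgeted);

ALTER TABLE participations
  ADD INDEX idx_p_account_type_access_ws (account_id, `type`, access_integer, workspace_id),
  ADD INDEX idx_p_ws_account_access (workspace_id, account_id, access_integer);

ALTER TABLE story_allocation_days
  ADD INDEX idx_sad_ws_deleted (workspace_id, deleted_at),
  ADD INDEX idx_sad_assignment_deleted (assignment_id, deleted_at);
```

After running EXPLAIN on the UNION, any of these that the optimizer never picks can be dropped again.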

I removed SQL_CALC_FOUND_ROWS because it is unnecessary without a LIMIT.
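For contrast, here is a small sketch of the case where SQL_CALC_FOUND_ROWS does something useful, namely a paginated query. The page size of 50 and the simplified WHERE clause are illustrative assumptions, not part of the original query.

```sql
-- Fetch one page; SQL_CALC_FOUND_ROWS asks the server to also count
-- how many rows would have matched without the LIMIT.
SELECT SQL_CALC_FOUND_ROWS sad.id
    FROM story_allocation_days AS sad
    WHERE sad.deleted_at IS NULL
    ORDER BY sad.id
    LIMIT 50;

-- Returns the un-LIMITed row count of the statement just executed.
SELECT FOUND_ROWS();
```

Without a LIMIT the client already receives every row, so the count is simply the size of the result set.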

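Let me know whether it is common for the same row to show up in both SELECTs, and whether there are a lot of bulky columns (e.g. TEXT or BLOB) in `sad`. In that case, I would want to modify the query to avoid hauling so much data around: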

SELECT sad2.*
    FROM ( ( SELECT sad.id
               FROM ... (1st 4-way join plus WHERE) )
           UNION DISTINCT
           ( SELECT sad.id
               FROM ... (2nd 4-way join plus WHERE) )
         )  AS u
    JOIN `story_allocation_days` AS sad2  USING(id);

This may be faster because only id needs to be hauled around until the UNION is finished.
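Spelled out, that id-only variant might look like the sketch below. It is assembled from the two SELECTs of the UNION above, so it carries the same assumption that deleted_at uses the '1970-01-01' sentinel. The parentheses around each branch are dropped here, which is allowed because neither branch has its own ORDER BY or LIMIT.

```sql
-- Only sad.id travels through the UNION; the wide rows are fetched once at the end.
SELECT sad2.*
    FROM (
        SELECT sad.id
            FROM story_allocation_days AS sad
            INNER JOIN workspaces     AS w  ON w.id = sad.workspace_id
            INNER JOIN participations AS p  ON p.workspace_id = w.id
            INNER JOIN assignments    AS a  ON a.id = sad.assignment_id
            WHERE sad.deleted_at = '1970-01-01'
              AND w.is_budgeted = TRUE
              AND p.account_id = 5071
              AND p.access_integer >= 30
              AND p.type = 'MavenParticipation'
              AND a.assignee_id IS NOT NULL
        UNION DISTINCT
        SELECT sad.id
            FROM story_allocation_days AS sad
            INNER JOIN workspaces     AS w  ON w.id = sad.workspace_id
            INNER JOIN participations AS p  ON p.workspace_id = w.id
            INNER JOIN assignments    AS a  ON a.id = sad.assignment_id
            WHERE sad.deleted_at = '1970-01-01'
              AND w.is_budgeted = TRUE
              AND w.account_id = 5071
              AND p.account_id = 5071
              AND p.access_integer >= 30
              AND a.assignee_id IS NOT NULL
    ) AS u
    JOIN story_allocation_days AS sad2 USING (id);
```

Because UNION DISTINCT already removes duplicate ids and id is the primary key, the outer SELECT needs no DISTINCT of its own.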