Improving slow MariaDB query performance

I have what seems like a fairly simple query, but it is very slow, and I'd like to improve its performance if I can.

SELECT `contacts`.`unit_id`, `contacts`.`owner_id`, `units`.`description`, 
  `units`.`address`, `owners`.`name`, `owners`.`email`, COUNT(*) AS contact_count
FROM `contacts`
LEFT JOIN `units` ON `contacts`.`unit_id` = `units`.`id`
LEFT JOIN `owners` ON `contacts`.`owner_id` = `owners`.`id`
WHERE `owners`.`group_id` = 6
  AND `contacts`.`checkin` BETWEEN '2021-10-01 00:00:00' AND '2021-10-31 23:59:59'
GROUP BY `units`.`id`
ORDER BY `contact_count` DESC
LIMIT 20;

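I'm just trying to get the units that have the most contacts in a given date range and belong to a particular group of owners. This is the EXPLAIN output: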

+------+-------------+----------+--------+--------------------------------------------------+---------------------------+---------+-------------------------+------+---------------------------------+
| id   | select_type | table    | type   | possible_keys                                    | key                       | key_len | ref                     | rows | Extra                           |
+------+-------------+----------+--------+--------------------------------------------------+---------------------------+---------+-------------------------+------+---------------------------------+
|    1 | SIMPLE      | owners   | ref    | PRIMARY,owners_group_id_foreign                  | owners_group_id_foreign   | 4       | const                   | 1133 | Using temporary; Using filesort |
|    1 | SIMPLE      | contacts | ref    | contacts_checkin_index,contacts_owner_id_foreign | contacts_owner_id_foreign | 4       | appdb.owners.id         | 1145 | Using where                     |
|    1 | SIMPLE      | units    | eq_ref | PRIMARY                                          | PRIMARY                   | 4       | appdb.contacts.unit_id  |    1 |                                 |
+------+-------------+----------+--------+--------------------------------------------------+---------------------------+---------+-------------------------+------+---------------------------------+

As far as I can tell, everything that should be indexed is:

CREATE TABLE `contacts` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `owner_id` int(10) unsigned NOT NULL,
  `unit_id` int(10) unsigned NOT NULL,
  `terminal_id` int(10) unsigned NOT NULL,
  `checkin` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `contacts_checkin_index` (`checkin`),
  KEY `contacts_unit_id_foreign` (`unit_id`),
  KEY `contacts_terminal_id_foreign` (`terminal_id`),
  KEY `contacts_owner_id_foreign` (`owner_id`),
  CONSTRAINT `contacts_unit_id_foreign` FOREIGN KEY (`unit_id`) REFERENCES `units` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `contacts_terminal_id_foreign` FOREIGN KEY (`terminal_id`) REFERENCES `terminals` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `contacts_owner_id_foreign` FOREIGN KEY (`owner_id`) REFERENCES `owners` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=25528530 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

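The contacts table currently has about 10 million rows, and this query takes about 4 minutes to run. Is there anything here that can be significantly improved, or am I just hitting hardware limits at this point?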

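I haven't tested this, so I can't be sure, but I think the main reason for the slowness is the number of rows that take part in the GROUP BY.

So you could try something like the following to reduce the number of rows. (Since I can't test it, I'm not sure this query runs correctly. I just want to show you the approach.)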

SELECT B.*, `owners`.`name`, `owners`.`email`
FROM (
   SELECT `units`.`id`, MAX(`contacts`.`owner_id`) AS owner_id, `units`.`description`, `units`.`address`, COUNT(*) AS contact_count
   FROM (
        SELECT *
        FROM `contacts`
        WHERE `contacts`.`checkin` BETWEEN '2021-10-01 00:00:00' AND '2021-10-31 23:59:59') as A
   LEFT JOIN `units` ON A.`unit_id` = `units`.`id`
   GROUP BY `units`.`id`) AS B
LEFT JOIN `owners` ON B.`owner_id` = `owners`.`id` AND `owners`.`group_id` = 6
ORDER BY `contact_count` DESC
LIMIT 20

I had a similar experience before, when I had to check ad views and page visits by date and time range, and this approach reduced the running time.

SELECT  sub.unit_id, sub.owner_id, u.`description`, u.`address`,
        sub.name, sub.email,
        sub.contact_count
    FROM  
        ( SELECT  c.`unit_id`, c.`owner_id`,
                  o.`name`, o.`email`,
                  COUNT(*) AS contact_count
            FROM  `contacts` AS c
            JOIN  `owners` AS o  ON c.`owner_id` = o.`id`
            WHERE  o.`group_id` = 6
              AND  c.`checkin` >= '2021-10-01'
              AND  c.`checkin` <  '2021-10-01' + INTERVAL 1 MONTH
            GROUP BY  c.`unit_id`
            ORDER BY  `contact_count` DESC
            LIMIT  20 
        ) AS sub
    LEFT JOIN  `units` AS u  ON sub.`unit_id` = u.`id`
    ORDER BY  `contact_count` DESC, sub.unit_id DESC;

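Notes:

  • To hit units only 20 times, I turned the query inside out.
  • JOIN owners cannot be a LEFT JOIN, because the WHERE clause filters on owners, so I changed it.
  • I changed the GROUP BY to avoid touching units prematurely.
  • Possibly the GROUP BY is now redundant.
  • I changed the date range to make it easier to generalize.
  • I extended the ORDER BY to make it deterministic when counts are tied.
  • Note the role of the "composite" indexes below.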
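As a small illustration of the date-range note, the half-open form (`>=` the start, `<` the start plus one month) lets the server compute the end of the range, so the query does not need to know how many days the month has. A minimal sketch that runs without any tables; the February date is only an example value, not something from the question:

```sql
-- Each expression yields the exclusive upper bound of a one-month range.
SELECT '2021-10-01' + INTERVAL 1 MONTH AS october_upper_bound,   -- first day of November 2021
       '2022-02-01' + INTERVAL 1 MONTH AS february_upper_bound;  -- first day of March 2022
```

The same pattern works unchanged if checkin ever stores fractional seconds, which a `BETWEEN ... AND '... 23:59:59'` range could miss.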

Indexes that may help:

contacts:  INDEX(checkin, unit_id, owner_id)
contacts:  INDEX(owner_id, checkin, unit_id)
owners:  INDEX(group_id, id,  name, email)

When adding these, drop any existing INDEXes that start with the same columns. Example: contacts: INDEX(checkin)
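A sketch of how these suggestions might be applied as DDL. The `idx_...` index names are made up for illustration, and the owners table definition is not shown in the question, so the statement for it assumes that `group_id`, `name` and `email` exist there with types that fit within the index key length limit:

```sql
-- Composite indexes suggested above (index names are hypothetical).
ALTER TABLE contacts
  ADD INDEX idx_contacts_checkin_unit_owner (checkin, unit_id, owner_id),
  ADD INDEX idx_contacts_owner_checkin_unit (owner_id, checkin, unit_id);

-- Assumes owners has these columns; its definition is not in the question.
ALTER TABLE owners
  ADD INDEX idx_owners_group_id_name_email (group_id, id, name, email);

-- The single-column index on checkin is now a prefix of a composite index.
ALTER TABLE contacts
  DROP INDEX contacts_checkin_index;

-- contacts_owner_id_foreign also starts with the same column as a new index,
-- but it backs a foreign key, so test dropping it on a copy of the table first.
```

On a table of about 10 million rows these statements can take a while. Re-running EXPLAIN on the query afterwards shows whether the optimizer actually picks the new indexes.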