MySQL self join to find only the next revision
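I'm actually working with WordPress. I want to write a self join, or something similar, that pairs each revision of a post with the following revision of the same post.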
create table wp_posts (post_id int, revision_id int);
INSERT INTO wp_posts(post_id, revision_id) VALUES (1, 1);
INSERT INTO wp_posts(post_id, revision_id) VALUES (1, 2);
INSERT INTO wp_posts(post_id, revision_id) VALUES (1, 3);
INSERT INTO wp_posts(post_id, revision_id) VALUES (2, 11);
INSERT INTO wp_posts(post_id, revision_id) VALUES (2, 12);
INSERT INTO wp_posts(post_id, revision_id) VALUES (2, 13);
SELECT a.post_id, a.revision_id "PreviousRevision", b.revision_id "FollowingRevision"
FROM `wp_posts` a
JOIN `wp_posts` b
ON a.post_id = b.post_id #the id of every revision of a post is different but the post_id is the same
WHERE a.revision_id < b.revision_id
AND a.revision_id != b.revision_id
https://www.db-fiddle.com/f/eHnwYABYrVVQAhn8xLJ77q/1
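The query above doesn't work because, for each row of a, it returns every later revision of the post rather than only the next one.
This is what I get; I have struck out the rows I don't want. I only need the parent-child rows.
How can I get only that one row per revision?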
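To make the problem concrete, here is a minimal sketch that runs the question's join against the sample data and prints every pair it produces. It uses Python's built-in sqlite3 module with an in-memory SQLite database standing in for MySQL, which is an assumption made only so the snippet is self-contained; the variable names are illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL so this runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wp_posts (post_id INT, revision_id INT)")
conn.executemany(
    "INSERT INTO wp_posts (post_id, revision_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 11), (2, 12), (2, 13)],
)

# The join from the question (the != condition is dropped because
# a.revision_id < b.revision_id already implies it).
all_pairs = conn.execute(
    """
    SELECT a.post_id, a.revision_id, b.revision_id
    FROM wp_posts a
    JOIN wp_posts b ON a.post_id = b.post_id
    WHERE a.revision_id < b.revision_id
    ORDER BY a.post_id, a.revision_id, b.revision_id
    """
).fetchall()

# Each revision is paired with every later revision, not just the next one.
for post_id, prev_rev, next_rev in all_pairs:
    print(post_id, prev_rev, next_rev)
```

With the sample data this yields six rows. The pairs (1, 1, 3) and (2, 11, 13) skip over an intermediate revision, and those are the rows that should not appear.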
In MySQL 8+, you would use window functions:
SELECT p.post_id, p.revision_id,
lag(p.revision_id) over (partition by p.post_id order by p.revision_id) as prev_revision_id,
lead(p.revision_id) over (partition by p.post_id order by p.revision_id) as next_revision_id
FROM `wp_posts` p;
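The window-function query returns one row per revision, including the last revision of each post, whose next_revision_id is NULL. If only the (previous, following) pairs are wanted, one option is to wrap the LEAD call in a derived table and filter out the NULLs. The sketch below assumes an in-memory SQLite database (3.25 or later, for window-function support) as a stand-in for MySQL 8; the derived-table alias t and the Python variable names are illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+) stands in for MySQL 8.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wp_posts (post_id INT, revision_id INT)")
conn.executemany(
    "INSERT INTO wp_posts (post_id, revision_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 11), (2, 12), (2, 13)],
)

# LEAD() looks one row ahead within each post, ordered by revision_id.
# The outer filter drops the latest revision of each post, which has
# no following revision.
next_pairs = conn.execute(
    """
    SELECT post_id, revision_id, next_revision_id
    FROM (
        SELECT p.post_id,
               p.revision_id,
               LEAD(p.revision_id) OVER (
                   PARTITION BY p.post_id ORDER BY p.revision_id
               ) AS next_revision_id
        FROM wp_posts p
    ) AS t
    WHERE next_revision_id IS NOT NULL
    ORDER BY post_id, revision_id
    """
).fetchall()

for row in next_pairs:
    print(row)
```

The filter has to sit in an outer query because a window function cannot be referenced in the WHERE clause of the same SELECT that computes it.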
In earlier versions, I would use correlated subqueries:
select p.post_id, p.revision_id,
(select max(p2.revision_id)
from wp_posts p2
where p2.post_id = p.post_id and p2.revision_id < p.revision_id
) as prev_revision_id,
(select min(p2.revision_id)
from wp_posts p2
where p2.post_id = p.post_id and p2.revision_id > p.revision_id
) as next_revision_id
from wp_posts p;
For this sample data, you need the maximum a.revision_id for each combination of a.post_id, b.revision_id:
SELECT
a.post_id,
MAX(a.revision_id) "PreviousRevision",
b.revision_id "FollowingRevision"
FROM `wp_posts` a JOIN `wp_posts` b
ON a.post_id = b.post_id
WHERE a.revision_id < b.revision_id
GROUP BY a.post_id, b.revision_id
Also, the condition a.revision_id != b.revision_id is not needed, because you already have a.revision_id < b.revision_id.
See the demo.
Results:
| post_id | PreviousRevision | FollowingRevision |
| ------- | ---------------- | ----------------- |
| 1 | 1 | 2 |
| 1 | 2 | 3 |
| 2 | 11 | 12 |
| 2 | 12 | 13 |
I would not use the GROUP BY query from @forpas, because I don't like how it is executed (temporary + filesort).
In this situation I usually do this:
SELECT
a.post_id
, a.revision_id "PrevRevision"
, b.revision_id "NextRevision"
FROM
`wp_posts` AS a
INNER JOIN `wp_posts` AS b ON (
b.post_id = a.post_id
AND b.revision_id > a.revision_id
)
LEFT JOIN `wp_posts` AS c ON (
c.post_id = a.post_id
AND c.revision_id > a.revision_id
AND c.revision_id < b.revision_id
)
WHERE
c.revision_id IS NULL
EXPLAIN with an index on (post_id, revision_id) on the table:
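| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
| -- | ----------- | ----- | ---------- | ----- | -------------- | -------------- | ------- | -------------- | ---- | -------- | ------------------------ |
| 1 | SIMPLE | a | | index | IX_wp_post_idx | IX_wp_post_idx | 10 | | 6 | 100.00 | Using where; Using index |
| 1 | SIMPLE | b | | ref | IX_wp_post_idx | IX_wp_post_idx | 5 | test.a.post_id | 4 | 33.33 | Using where; Using index |
| 1 | SIMPLE | c | | ref | IX_wp_post_idx | IX_wp_post_idx | 5 | test.a.post_id | 4 | 16.67 | Using where; Using index |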
On some data sets, the query with subqueries (suggested by @Gordon Linoff) will be faster.
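The LEFT JOIN ... IS NULL form keeps a pair (a, b) only when no revision c of the same post lies strictly between them, which is another way of saying that b is the next revision after a. As a rough check that it matches the GROUP BY version on the sample data, the sketch below runs both. It assumes an in-memory SQLite database standing in for MySQL, so it says nothing about MySQL's execution plans or relative speed; the Python variable names are illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; this checks results only,
# not performance or query plans.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wp_posts (post_id INT, revision_id INT)")
conn.executemany(
    "INSERT INTO wp_posts (post_id, revision_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 11), (2, 12), (2, 13)],
)

# Anti-join: c looks for a revision between a and b; the pair survives
# only when no such revision exists (c.revision_id IS NULL).
anti_join_rows = conn.execute(
    """
    SELECT a.post_id, a.revision_id, b.revision_id
    FROM wp_posts AS a
    INNER JOIN wp_posts AS b
        ON b.post_id = a.post_id
       AND b.revision_id > a.revision_id
    LEFT JOIN wp_posts AS c
        ON c.post_id = a.post_id
       AND c.revision_id > a.revision_id
       AND c.revision_id < b.revision_id
    WHERE c.revision_id IS NULL
    ORDER BY a.post_id, a.revision_id
    """
).fetchall()

# GROUP BY version: for each following revision, take the largest
# earlier revision of the same post.
group_by_rows = conn.execute(
    """
    SELECT a.post_id, MAX(a.revision_id), b.revision_id
    FROM wp_posts a
    JOIN wp_posts b ON a.post_id = b.post_id
    WHERE a.revision_id < b.revision_id
    GROUP BY a.post_id, b.revision_id
    ORDER BY a.post_id, b.revision_id
    """
).fetchall()

print(anti_join_rows == group_by_rows)
```

Which form is faster on real data depends on the table size, the indexes, and the MySQL version, so it is worth running EXPLAIN on each candidate against the actual table.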