SQL/MySQL DELETE all rows EXCEPT 2 of them

I have a database table set up like this:

   id | code  | group_id | status
   ---|-------|----------|----------
    1 | abcd1 | group_1  | available
    2 | abcd2 | group_1  | available
    3 | adsd3 | group_1  | available
    4 | dfgd4 | group_1  | available
    5 | vfcd5 | group_1  | available

    6 | bgcd6 | group_2  | available
    7 | abcd7 | group_2  | available
    8 | ahgf8 | group_2  | available
    9 | dfgd9 | group_2  | available
   10 | qwer6 | group_2  | available

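In the example above, each group_id has 5 rows in total (this is just an example; the real row counts are dynamic and will vary). I need to delete every row whose status is available, except for 2 of them per group (it doesn't matter which 2, as long as 2 remain).

Basically, each unique group_id should end up with only 2 rows where status is available. I can write a simple SQL query that deletes all of them, but I'm struggling to come up with one that deletes all except 2... please help :)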
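For anyone who wants to try the answers against the sample data, here is a minimal setup sketch. The question does not name the table or give column types, so the table name (borrowed from the first answer, which calls it `setup_like_this`) and every type below are assumptions.

```sql
-- Hypothetical schema: table name, column types and sizes are assumptions.
CREATE TABLE setup_like_this (
    id       INT         NOT NULL PRIMARY KEY,
    code     VARCHAR(16) NOT NULL,
    group_id VARCHAR(16) NOT NULL,
    status   VARCHAR(16) NOT NULL
);

-- The ten sample rows from the question.
INSERT INTO setup_like_this (id, code, group_id, status) VALUES
    ( 1, 'abcd1', 'group_1', 'available'),
    ( 2, 'abcd2', 'group_1', 'available'),
    ( 3, 'adsd3', 'group_1', 'available'),
    ( 4, 'dfgd4', 'group_1', 'available'),
    ( 5, 'vfcd5', 'group_1', 'available'),
    ( 6, 'bgcd6', 'group_2', 'available'),
    ( 7, 'abcd7', 'group_2', 'available'),
    ( 8, 'ahgf8', 'group_2', 'available'),
    ( 9, 'dfgd9', 'group_2', 'available'),
    (10, 'qwer6', 'group_2', 'available');
```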

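If the "id" column is a PRIMARY KEY or UNIQUE KEY, we can use a correlated subquery to get the second-lowest "id" value for a particular group_id.

We can then use that value to identify the rows for that group_id that have a higher value in the "id" column.

A query like this: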

 SELECT t.`id`
      , t.`group_id` 
   FROM `setup_like_this` t
  WHERE t.`status`  = 'available'
    AND t.`id`
          > ( SELECT s.`id`
                FROM `setup_like_this` s
               WHERE s.`status`   = 'available'
                 AND s.`group_id` = t.`group_id`
               ORDER
                  BY s.`id`
               LIMIT 1,1
            )
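To see what the correlated subquery contributes, it can be run on its own for a single group. In MySQL, `LIMIT 1,1` means "skip 1 row, return 1 row", so combined with the ORDER BY it picks the second-lowest id. The sketch below hard-codes the group value purely for illustration:

```sql
SELECT s.`id`
  FROM `setup_like_this` s
 WHERE s.`status`   = 'available'
   AND s.`group_id` = 'group_1'   -- hard-coded here; the full query correlates on t.`group_id`
 ORDER
    BY s.`id`
 LIMIT 1,1                        -- offset 1, row count 1
-- With the sample rows for group_1 (ids 1 to 5) this returns id 2.
```

If a group has fewer than two available rows, the subquery returns no row, the comparison evaluates to NULL, and no row from that group is selected.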

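We test it first as a SELECT, to check which rows are returned. Once we are satisfied that the query returns exactly the set of rows we want to delete, we can convert it into a DELETE statement by replacing SELECT ... FROM with DELETE t.* FROM.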


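FOLLOWUP: converting it to a DELETE statement runs into error 1093 (MySQL's "You can't specify target table ... for update in FROM clause").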
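For reference, the direct conversion presumably looks like the statement below. It is shown only to illustrate the failure: MySQL rejects it because the subquery reads from the same table that the DELETE targets.

```sql
-- Direct conversion of the SELECT; MySQL refuses this with error 1093
-- because `setup_like_this` is both the delete target and the subquery source.
DELETE t.*
  FROM `setup_like_this` t
 WHERE t.`status`  = 'available'
   AND t.`id`
         > ( SELECT s.`id`
               FROM `setup_like_this` s
              WHERE s.`status`   = 'available'
                AND s.`group_id` = t.`group_id`
              ORDER
                 BY s.`id`
              LIMIT 1,1
           )
```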

One workaround is to turn the query above into an inline view, and then join it to the target table:

DELETE q.*
  FROM `setup_like_this` q
  JOIN ( -- inline view, query from above returns `id` of rows we want to delete
         SELECT t.`id`
              , t.`group_id` 
           FROM `setup_like_this` t
          WHERE t.`status`  = 'available'
            AND t.`id`
                  > ( SELECT s.`id`
                        FROM `setup_like_this` s
                       WHERE s.`status`   = 'available'
                         AND s.`group_id` = t.`group_id`
                       ORDER
                          BY s.`id`
                       LIMIT 1,1
                    )
       ) r
    ON r.id = q.id
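After running the DELETE, a quick sanity check along these lines can confirm that no group still has more than 2 available rows. This is only a sketch; the column alias is illustrative.

```sql
-- Lists any group that still has more than 2 'available' rows.
-- An empty result means the DELETE left at most 2 per group.
SELECT `group_id`
     , COUNT(*) AS available_rows   -- illustrative alias
  FROM `setup_like_this`
 WHERE `status` = 'available'
 GROUP
    BY `group_id`
HAVING COUNT(*) > 2
```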

If code is unique, you can use subqueries to keep the "min" and "max":

DELETE FROM t
WHERE t.status = 'available' 
   AND (t.group_id, t.code) NOT IN (
      SELECT group_id, MAX(code) 
      FROM t 
      WHERE status = 'available' 
      GROUP BY group_id
  )
  AND (t.group_id, t.code) NOT IN (
      SELECT group_id, MIN(code) 
      FROM t 
      WHERE status = 'available' 
      GROUP BY group_id
  )

Similarly, using an auto-increment id:

DELETE FROM t
WHERE t.status = 'available' 
   AND t.id NOT IN (
      SELECT MAX(id) FROM t WHERE status = 'available' GROUP BY group_id
      UNION
      SELECT MIN(id) FROM t WHERE status = 'available' GROUP BY group_id
)

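In this version I changed the subqueries into a UNION, but the "AND" format works just as well. Also, if "code" is unique across the whole table, the NOT IN can be simplified to leave out group_id (though group_id is still needed in the subquery's GROUP BY clause).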
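A sketch of that simplification, assuming "code" really is unique across the whole table and is never NULL. The subquery is wrapped in a derived table so that MySQL materializes it instead of complaining that `t` is both read and deleted from; the alias `keep_codes` is made up for this example.

```sql
DELETE FROM t
WHERE t.status = 'available'
   AND t.code NOT IN (
      SELECT * FROM (
         -- group_id is no longer compared, but is still needed for grouping
         SELECT MAX(code) FROM t WHERE status = 'available' GROUP BY group_id
         UNION
         SELECT MIN(code) FROM t WHERE status = 'available' GROUP BY group_id
      ) AS keep_codes   -- hypothetical alias
)
```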


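Edit: MySQL doesn't like subqueries in the WHERE of an UPDATE/DELETE query that reference the table being UPDATEd/DELETEd; in those cases you can usually double-wrap the subquery to give it an alias, which causes MySQL to treat it as a temporary table (behind the scenes).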

DELETE FROM t
WHERE t.status = 'available' 
   AND t.id NOT IN ( 
      SELECT * FROM (
         SELECT MAX(id) FROM t WHERE status = 'available' GROUP BY group_id
         UNION
         SELECT MIN(id) FROM t WHERE status = 'available' GROUP BY group_id
      ) AS a
)

Another option; I don't remember whether MySQL complains much about joins in DELETE/UPDATE...

DELETE t
FROM t
LEFT JOIN (
   SELECT MIN(id) AS minId, MAX(id) AS maxId, 1 AS keep_flag 
   FROM t 
   WHERE status = 'available' 
   GROUP BY group_id
) AS tKeep ON t.id IN (tKeep.minId, tKeep.maxId)
WHERE t.status = 'available' 
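    AND tKeep.keep_flag IS NULL

With window functions (MySQL 8.0 or later), ROW_NUMBER() can number the rows within each group_id. This query returns the two rows with the highest id in each group, i.e. the rows to keep:

select id, code, group_id, status
from (
  select id, code, group_id, status
  , ROW_NUMBER() OVER (
      PARTITION BY group_id 
      ORDER BY id DESC) rownum
  from a
) q
where rownum < 3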
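The ROW_NUMBER() query above only selects rows; it does not delete anything. One way it might be turned into the DELETE the question asks for is sketched below. It assumes MySQL 8.0 or later and a unique "id" column, keeps the table name `a` from that query, and adds a status filter that the original SELECT does not have; the alias `ranked` and the column `rn` are illustrative.

```sql
DELETE a
  FROM a
  JOIN (
        -- Derived table: number the 'available' rows within each group.
        SELECT id
             , ROW_NUMBER() OVER (
                   PARTITION BY group_id
                   ORDER BY id DESC) AS rn
          FROM a
         WHERE status = 'available'   -- assumption: only these rows are candidates
       ) ranked
    ON ranked.id = a.id
 WHERE ranked.rn > 2                  -- keep the two highest ids per group
```

Because the ranking happens inside a derived table, the outer DELETE joins against a materialized result rather than reading `a` directly in a subquery.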

To keep the minimum and maximum ids, I think a join is the simplest solution:

DELETE t
     FROM t LEFT JOIN
          (SELECT group_id, MIN(id) as min_id, MAX(id) as max_id
           FROM t
           WHERE t.status = 'available' 
           GROUP BY group_id
          ) tt
          ON t.id IN (tt.min_id, tt.max_id)
WHERE t.status = 'available' AND
      tt.group_id IS NULL;