Delete equal data from two tables in SQL Server

I have two tables with these columns:

Table A

Id | Name  | Salary  
1  | TEST1 | 100  
2  | TEST2 | 200  
3  | TEST3 | 300

Table B

Id | Name  | Salary  
1  | TEST1 | 100  
2  | TEST2 | 200  
4  | TEST4 | 400

I want to delete the matching rows from both tables (without using joins). When I query

SELECT * 
FROM A 

SELECT * 
FROM B

I should get this result:

Table A

Id | Name  | Salary   
3  | TEST3 | 300

Table B

Id | Name  | Salary   
4  | TEST4 | 400

Any help would be appreciated. Thanks in advance.

PS: I will be loading the tables with about 10 million rows.

Use NOT EXISTS

SELECT * 
FROM   a 
WHERE  NOT EXISTS (SELECT 1 
                   FROM   b 
                   WHERE  a.id = b.id) 

SELECT * 
FROM   b 
WHERE  NOT EXISTS (SELECT 1 
                   FROM   a 
                   WHERE  a.id = b.id) 

To compare on all fields, use EXCEPT

SELECT Id, Name, Salary FROM A
EXCEPT
SELECT Id, Name, Salary FROM B

SELECT Id, Name, Salary FROM B
EXCEPT
SELECT Id, Name, Salary FROM A

To delete the records from tablea, use the following query

WITH cte 
     AS (SELECT * 
         FROM   tablea a 
         WHERE  EXISTS (SELECT 1 
                        FROM   tableb b 
                        WHERE  a.id = b.id 
                               AND a.NAME = b.NAME 
                               AND a.salary = b.salary)) 
DELETE FROM cte 

SELECT * 
FROM   tablea 

Before deleting data from tableA, insert the data into a temp table so that it can be referenced when deleting data from tableB.
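A minimal sketch of that idea, assuming the tables are named tablea and tableb with columns id, name and salary as in the query above. The temp table name #matched is hypothetical. Only the rows present in both tables are saved, and each table is then cleaned against that saved copy, so the second delete does not depend on what is still left in tablea:

```sql
-- Save the rows that exist in both tables (all three columns equal).
SELECT a.id, a.name, a.salary
INTO   #matched            -- hypothetical temp table name
FROM   tablea a
WHERE  EXISTS (SELECT 1
               FROM   tableb b
               WHERE  a.id = b.id
                      AND a.name = b.name
                      AND a.salary = b.salary);

-- Delete the matched rows from tablea.
DELETE a
FROM   tablea a
WHERE  EXISTS (SELECT 1
               FROM   #matched m
               WHERE  a.id = m.id
                      AND a.name = m.name
                      AND a.salary = m.salary);

-- Delete the same rows from tableb, using the saved copy.
DELETE b
FROM   tableb b
WHERE  EXISTS (SELECT 1
               FROM   #matched m
               WHERE  b.id = m.id
                      AND b.name = m.name
                      AND b.salary = m.salary);

DROP TABLE #matched;
```

Because the comparison uses `=`, rows with NULL in any of the compared columns are not treated as matches in this sketch.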

Using DELETE FROM:

SELECT *
INTO #temp
FROM TableA;    -- to get the same data to compare with second DELETE

DELETE t
FROM TableA t
WHERE EXISTS(SELECT Id,Name,Salary
             FROM TableB
             INTERSECT
             SELECT t.ID, t.Name, t.Salary);

DELETE t
FROM TableB t
WHERE EXISTS(SELECT Id,Name,Salary
             FROM #temp
             INTERSECT
             SELECT t.ID, t.Name, t.Salary);  


SELECT * FROM TableA;
SELECT * FROM TableB;

LiveDemo

Output:

 TableA:
╔════╦═══════╦════════╗
║ Id ║ Name  ║ Salary ║
╠════╬═══════╬════════╣
║  3 ║ TEST3 ║    300 ║
╚════╩═══════╩════════╝


TableB:
╔════╦═══════╦════════╗
║ Id ║ Name  ║ Salary ║
╠════╬═══════╬════════╣
║  4 ║ TEST4 ║    400 ║
╚════╩═══════╩════════╝

EDIT:

To avoid copying the whole table, use the OUTPUT clause:

CREATE TABLE #temp(ID INT, NAME VARCHAR(100), Salary INT);

DELETE t
OUTPUT deleted.Id, deleted.Name, deleted.Salary
INTO #temp
FROM TableA t
WHERE EXISTS(SELECT Id,Name,Salary
             FROM TableB
             INTERSECT
             SELECT t.ID, t.Name, t.Salary);

DELETE t
FROM TableB t
WHERE EXISTS(SELECT Id,Name,Salary 
             FROM (SELECT Id,Name,Salary FROM TableA
                   UNION ALL
                   SELECT Id,Name,Salary FROM #temp) AS sub
             INTERSECT
             SELECT t.ID, t.Name, t.Salary); 

LiveDemo2

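To avoid problems with the delete (log growth, tempdb pressure, etc.), you can process the data in chunks of 100k rows. Add a WHILE loop and two variables, @range_start and @range_stop, and increment them by 100k or any other value that suits your system.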
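The chunked approach might look roughly like the sketch below. It assumes that Id is an integer key (ideally indexed) and that #temp already exists with the three columns used in the statement above. The variables @batch_size and @max_id are hypothetical additions, and the batch size is only a starting point:

```sql
DECLARE @batch_size  BIGINT = 100000;   -- assumed chunk size, tune for your system
DECLARE @range_start BIGINT,
        @range_stop  BIGINT,
        @max_id      BIGINT;

SELECT @range_start = MIN(Id), @max_id = MAX(Id) FROM TableA;

-- Walk the Id range in slices; an empty TableA leaves the variables NULL
-- and the loop body never runs.
WHILE @range_start <= @max_id
BEGIN
    SET @range_stop = @range_start + @batch_size;

    -- Delete only the matching rows whose Id falls in the current slice,
    -- keeping a copy in #temp for the later delete against TableB.
    DELETE t
    OUTPUT deleted.Id, deleted.Name, deleted.Salary
    INTO #temp
    FROM TableA t
    WHERE t.Id >= @range_start
      AND t.Id <  @range_stop
      AND EXISTS(SELECT Id, Name, Salary
                 FROM TableB
                 INTERSECT
                 SELECT t.Id, t.Name, t.Salary);

    SET @range_start = @range_stop;
END;
```

The delete against TableB can be sliced the same way. If Id values are sparse, some slices will touch few or no rows, so a TOP-based loop may be a better fit in that case.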

You would be better off replacing "delete lots of rows" with "create a new table with only those rows remaining".

If your version of SQL Server supports EXCEPT, this is simple:

SELECT * INTO newA FROM a
EXCEPT
SELECT * FROM b
;

SELECT * INTO newB FROM b
EXCEPT
SELECT * FROM a
;

fiddle

EXCEPT also simplifies NULL handling, because it treats two NULLs as equal when comparing rows.
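To illustrate the NULL behaviour, here is a small sketch using hypothetical table variables @a and @b that hold the same row with a NULL salary. It assumes the default ANSI_NULLS ON setting:

```sql
-- Hypothetical table variables, for illustration only.
DECLARE @a TABLE (Id INT, Name VARCHAR(100), Salary INT);
DECLARE @b TABLE (Id INT, Name VARCHAR(100), Salary INT);

INSERT INTO @a VALUES (5, 'TEST5', NULL);
INSERT INTO @b VALUES (5, 'TEST5', NULL);

-- EXCEPT treats the two NULL salaries as equal, so the row counts as
-- present in both tables and this query returns no rows.
SELECT Id, Name, Salary FROM @a
EXCEPT
SELECT Id, Name, Salary FROM @b;

-- With equality predicates, a.Salary = b.Salary evaluates to UNKNOWN for
-- NULLs, so the row is not recognized as a match and is returned here.
SELECT a.Id, a.Name, a.Salary
FROM   @a a
WHERE  NOT EXISTS (SELECT 1
                   FROM   @b b
                   WHERE  a.Id = b.Id
                          AND a.Name = b.Name
                          AND a.Salary = b.Salary);
```

With nullable columns, the equality-based version would need extra IS NULL checks to reach the same result as EXCEPT or INTERSECT.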