Find rows from table A without a record in joined table B

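I have two tables called Employee (columns: Id, Name) and DataSource (columns: Id, EmployeeId, DataSourceName).

Each employee can be exported to zero or more data sources. Imagine the following situation:

Employee table: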

+----+-------------+
| Id | Name        |
+----+-------------+
| 1  | Ivan        |
| 2  | Adam        |
+----+-------------+

DataSource table:

+----+------------+----------------+
| Id | EmployeeId | DataSourceName |
+----+------------+----------------+
| 1  | 1          | Source1        |
| 2  | 1          | Source2        |
| 3  | 2          | Source2        |
+----+------------+----------------+

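I need a query that determines which employees have not been exported to 'Source1' (in this case the result should be 'Adam', because he is exported only to 'Source2').

Both the Employee and DataSource tables can hold a large number of records (thousands).

There are several ways to determine this, and we need to find the one that performs best. A few that come to mind:

LEFT JOIN: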

SELECT Employee.Id 
FROM Employee 
LEFT JOIN DataSource ON DataSource.EmployeeId = Employee.Id AND DataSource.DataSourceName = 'Source1'
WHERE DataSource.Id IS NULL

Inner SELECT:

SELECT Employee.Id
FROM Employee
WHERE NOT EXISTS (SELECT NULL FROM DataSource WHERE DataSource.EmployeeId = Employee.Id AND DataSource.DataSourceName = 'Source1')

EXCEPT:

SELECT Employee.Id
FROM Employee

EXCEPT

SELECT Employee.Id 
FROM Employee 
INNER JOIN DataSource ON DataSource.EmployeeId = Employee.Id AND DataSource.DataSourceName = 'Source1'

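Before I start benchmarking them, I would like to ask whether there are other approaches I should consider (ones that might perform well). Could you share your thoughts on the best-performing query?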
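
Before timing anything, it may help to confirm that the three candidates return the same rows. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in engine (the question does not name one), and the column types and constraints are guesses. It loads the sample data from the tables above and runs each query.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real engine, which the
# question does not name. Column types and constraints are guesses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE DataSource (
        Id INTEGER PRIMARY KEY,
        EmployeeId INTEGER NOT NULL REFERENCES Employee(Id),
        DataSourceName TEXT NOT NULL
    );
    INSERT INTO Employee (Id, Name) VALUES (1, 'Ivan'), (2, 'Adam');
    INSERT INTO DataSource (Id, EmployeeId, DataSourceName) VALUES
        (1, 1, 'Source1'), (2, 1, 'Source2'), (3, 2, 'Source2');
""")

# The three candidate queries from the question.
QUERIES = {
    "left_join": """
        SELECT Employee.Id
        FROM Employee
        LEFT JOIN DataSource
            ON DataSource.EmployeeId = Employee.Id
           AND DataSource.DataSourceName = 'Source1'
        WHERE DataSource.Id IS NULL
    """,
    "not_exists": """
        SELECT Employee.Id
        FROM Employee
        WHERE NOT EXISTS (
            SELECT NULL FROM DataSource
            WHERE DataSource.EmployeeId = Employee.Id
              AND DataSource.DataSourceName = 'Source1'
        )
    """,
    "except": """
        SELECT Employee.Id
        FROM Employee
        EXCEPT
        SELECT Employee.Id
        FROM Employee
        INNER JOIN DataSource
            ON DataSource.EmployeeId = Employee.Id
           AND DataSource.DataSourceName = 'Source1'
    """,
}

# Run every query and collect the returned employee Ids in sorted order.
results = {
    name: sorted(row[0] for row in conn.execute(sql))
    for name, sql in QUERIES.items()
}
for name, ids in results.items():
    print(name, ids)
```

On the sample data each query should return only Id 2, which is Adam.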

If you want further reading on the topic, this article is good:

http://www.sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/

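It suggests that NOT EXISTS will perform better because it does not need to complete the full join: it runs as an anti semi join, whereas LEFT JOIN ... IS NULL performs the complete join and then filters: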

"That’s the major difference between these two. When using the LEFT OUTER JOIN … IS NULL technique, SQL can’t tell that you’re only doing a check for nonexistance. Optimiser’s not smart enough (yet). Hence it does the complete join and then filters. The NOT EXISTS filters as part of the join."
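
Which form wins depends on the engine and its optimizer version, so it is probably worth measuring rather than taking the quote as final. The following is a rough timing sketch, not a definitive benchmark. It again assumes Python's sqlite3 as a stand-in, and SQLite's planner is not the optimizer the quoted article discusses, so the numbers will not transfer. The row count, the random export pattern, the seed, and the index name IX_DataSource_Employee are all hypothetical choices made for illustration.

```python
import random
import sqlite3
import time

QUERIES = {
    "left_join": """
        SELECT Employee.Id FROM Employee
        LEFT JOIN DataSource
            ON DataSource.EmployeeId = Employee.Id
           AND DataSource.DataSourceName = 'Source1'
        WHERE DataSource.Id IS NULL
    """,
    "not_exists": """
        SELECT Employee.Id FROM Employee
        WHERE NOT EXISTS (
            SELECT NULL FROM DataSource
            WHERE DataSource.EmployeeId = Employee.Id
              AND DataSource.DataSourceName = 'Source1'
        )
    """,
    "except": """
        SELECT Employee.Id FROM Employee
        EXCEPT
        SELECT Employee.Id FROM Employee
        INNER JOIN DataSource
            ON DataSource.EmployeeId = Employee.Id
           AND DataSource.DataSourceName = 'Source1'
    """,
}


def build_db(n_employees=5000, seed=0):
    """Create synthetic data: each employee gets each source with probability 0.5."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
        CREATE TABLE DataSource (
            Id INTEGER PRIMARY KEY,
            EmployeeId INTEGER NOT NULL,
            DataSourceName TEXT NOT NULL
        );
        -- Hypothetical index; the question does not say which indexes exist.
        CREATE INDEX IX_DataSource_Employee
            ON DataSource (EmployeeId, DataSourceName);
    """)
    conn.executemany(
        "INSERT INTO Employee (Id, Name) VALUES (?, ?)",
        [(i, f"Employee{i}") for i in range(1, n_employees + 1)],
    )
    exports = [
        (emp_id, source)
        for emp_id in range(1, n_employees + 1)
        for source in ("Source1", "Source2", "Source3")
        if rng.random() < 0.5
    ]
    conn.executemany(
        "INSERT INTO DataSource (EmployeeId, DataSourceName) VALUES (?, ?)",
        exports,
    )
    conn.commit()
    return conn


def time_query(conn, sql, repeats=5):
    """Return (best wall-clock seconds over `repeats` runs, sorted result Ids)."""
    best = float("inf")
    rows = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best, sorted(row[0] for row in rows)


conn = build_db()
timings = {}
id_sets = {}
for name, sql in QUERIES.items():
    timings[name], id_sets[name] = time_query(conn, sql)
    print(f"{name:10s} {timings[name] * 1000:.2f} ms, {len(id_sets[name])} rows")
```

To get numbers that mean something, the same three queries would need to be run against the real engine with realistic data volumes and indexes, ideally while inspecting the execution plans it produces.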