Get common records between two tables based on condition

Suppose I have two tables, IndustryCustomers and ProductCustomers, with the same schema: a single column, as shown below.

IndustryCustomers:

CustomerId
1
2
3

ProductCustomers:

CustomerId
2
3
4

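What I want is:

1- If both IndustryCustomers and ProductCustomers have records, get the customers common to both (simply an inner join on CustomerId).

2- If IndustryCustomers has records but ProductCustomers has none, select all IndustryCustomers.

3- If IndustryCustomers has no records, select all ProductCustomers.

Currently I do this with IF statements and a separate SELECT for each condition, but I don't know whether the customers can be retrieved with a single query.

Here is my query: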

IF EXISTS (SELECT TOP 1 1 FROM #IndustryCustomers)
BEGIN
    IF EXISTS (SELECT TOP 1 1 FROM #ProductCustomers)
        SELECT *
        FROM #IndustryCustomers ic
            JOIN #ProductCustomers pc
                ON ic.CustomerId = pc.CustomerId;
    ELSE
        SELECT *
        FROM #IndustryCustomers;
END;
ELSE
    SELECT *
    FROM #ProductCustomers;

You can UNION ALL your three SELECTs and put the corresponding condition in each WHERE clause, for example:

SELECT ic.CustomerId 
  FROM #IndustryCustomers AS ic 
       INNER JOIN #ProductCustomers AS pc ON ic.CustomerId = pc.CustomerId
 WHERE EXISTS (SELECT 1 FROM #IndustryCustomers) 
   AND EXISTS (SELECT 1 FROM #ProductCustomers)

UNION ALL 

SELECT ic.CustomerId 
  FROM #IndustryCustomers AS ic
 WHERE EXISTS (SELECT 1 FROM #IndustryCustomers) 
   AND NOT EXISTS (SELECT 1 FROM #ProductCustomers)

UNION ALL 

SELECT pc.CustomerId 
  FROM #ProductCustomers AS pc
 WHERE NOT EXISTS (SELECT 1 FROM #IndustryCustomers)

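Obviously, this requires all three queries to return the same set of columns, so I reduced * to CustomerId.

That said, I do think this "solution", while formally meeting your requirements, is less readable than your current one...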
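If you want to check the UNION ALL query against all three scenarios without a SQL Server instance at hand, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here: the `#` temp-table prefix is dropped because SQLite does not support it, and the helper name `run_scenario` is my own, not part of the answer. The query text is otherwise the same.

```python
import sqlite3

# The UNION ALL query from the answer above, with the '#' prefix removed
# (assumption: plain tables in an in-memory SQLite database stand in for
# the SQL Server temp tables).
QUERY = """
SELECT ic.CustomerId
  FROM IndustryCustomers AS ic
       INNER JOIN ProductCustomers AS pc ON ic.CustomerId = pc.CustomerId
 WHERE EXISTS (SELECT 1 FROM IndustryCustomers)
   AND EXISTS (SELECT 1 FROM ProductCustomers)
UNION ALL
SELECT ic.CustomerId
  FROM IndustryCustomers AS ic
 WHERE EXISTS (SELECT 1 FROM IndustryCustomers)
   AND NOT EXISTS (SELECT 1 FROM ProductCustomers)
UNION ALL
SELECT pc.CustomerId
  FROM ProductCustomers AS pc
 WHERE NOT EXISTS (SELECT 1 FROM IndustryCustomers)
"""


def run_scenario(industry_ids, product_ids):
    """Load both tables with the given IDs and return the sorted query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE IndustryCustomers (CustomerId INTEGER)")
    conn.execute("CREATE TABLE ProductCustomers (CustomerId INTEGER)")
    conn.executemany("INSERT INTO IndustryCustomers VALUES (?)",
                     [(i,) for i in industry_ids])
    conn.executemany("INSERT INTO ProductCustomers VALUES (?)",
                     [(i,) for i in product_ids])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    # UNION ALL gives no ordering guarantee, so sort for a stable comparison.
    return sorted(row[0] for row in rows)


if __name__ == "__main__":
    print(run_scenario([1, 2, 3], [2, 3, 4]))  # both tables have rows
    print(run_scenario([1, 2, 3], []))         # ProductCustomers is empty
    print(run_scenario([], [2, 3, 4]))         # IndustryCustomers is empty
```

Only one of the three branches can produce rows for any given state of the tables, which is why UNION ALL (rather than UNION) is enough.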

TL;DR

SELECT
    *
FROM    (
    SELECT
        CustomerID  = CASE
                        WHEN NOT EXISTS (SELECT     * FROM ProductCustomers) THEN
                            ic.CustomerID
                        WHEN NOT EXISTS (SELECT     * FROM IndustryCustomers) THEN
                            pc.CustomerID
                        WHEN EXISTS (SELECT     * FROM IndustryCustomers)
                            AND EXISTS (SELECT      * FROM ProductCustomers)
                            AND ic.CustomerID = pc.CustomerID THEN
                            ic.CustomerID
                    END
    FROM    IndustryCustomers   AS ic
    FULL JOIN ProductCustomers AS pc
        ON pc.CustomerID = ic.CustomerID
) AS x
WHERE   x.CustomerID IS NOT NULL;

Query breakdown

Step 1: Get all the data

If you want a single query but don't want to use UNION, you will need a FULL JOIN between the two tables.

SELECT
    *
FROM    IndustryCustomers   AS ic
FULL JOIN ProductCustomers AS pc
    ON pc.CustomerID = ic.CustomerID;

Result:

ic.CustomerID  pc.CustomerID
2              2
3              3
NULL           4
1              NULL

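Step 2: Filter the data in the SELECT list according to your logic

You now have all the data needed to produce the desired result. Next, change the column in the result to return what you want according to your logic: if there are no ProductCustomers, always return the IndustryCustomers value; if there are no IndustryCustomers, always return the ProductCustomers value; if both have records, return only the matching ones.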

SELECT
    CustomerID  = CASE
                    WHEN NOT EXISTS (SELECT     * FROM ProductCustomers) THEN
                        ic.CustomerID
                    WHEN NOT EXISTS (SELECT     * FROM IndustryCustomers) THEN
                        pc.CustomerID
                    WHEN EXISTS (SELECT     * FROM IndustryCustomers)
                        AND EXISTS (SELECT      * FROM ProductCustomers)
                        AND ic.CustomerID = pc.CustomerID THEN
                        ic.CustomerID
                END
FROM    IndustryCustomers   AS ic
FULL JOIN ProductCustomers AS pc
    ON pc.CustomerID = ic.CustomerID;

Result:

CustomerID
2
3
NULL
NULL
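To make it easier to see why the non-matching rows come out as NULL, here is a rough row-by-row model of the FULL JOIN plus CASE in Python. It is only an illustration, not how SQL Server evaluates the query: the function names are mine, it assumes CustomerID values are unique within each table, and it sorts the rows by ID, whereas SQL Server guarantees no order without ORDER BY.

```python
def full_join(industry, product):
    """Emulate FULL JOIN ... ON pc.CustomerID = ic.CustomerID.

    Returns (ic, pc) pairs, with None standing in for NULL on the side
    that has no match. Assumes IDs are unique within each list.
    """
    keys = sorted(set(industry) | set(product))
    return [(k if k in industry else None,
             k if k in product else None) for k in keys]


def case_customer_id(ic, pc, has_industry, has_product):
    """Mirror the CASE expression: the first WHEN that matches wins."""
    if not has_product:        # WHEN NOT EXISTS (SELECT * FROM ProductCustomers)
        return ic
    if not has_industry:       # WHEN NOT EXISTS (SELECT * FROM IndustryCustomers)
        return pc
    if ic is not None and ic == pc:   # both non-empty AND ic.CustomerID = pc.CustomerID
        return ic
    return None                # no WHEN matched, so CASE yields NULL


def step2(industry, product):
    """Apply the CASE to every joined row, keeping the NULL (None) rows."""
    has_industry, has_product = bool(industry), bool(product)
    return [case_customer_id(ic, pc, has_industry, has_product)
            for ic, pc in full_join(industry, product)]


if __name__ == "__main__":
    print(step2([1, 2, 3], [2, 3, 4]))  # rows for IDs 1 and 4 become None
```

The two EXISTS checks do not depend on the current row, so they are computed once here and passed in; only the equality test is per-row.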

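Step 3: Clean up the results by removing NULLs

This gives you the desired result, but rows that do not meet the criteria now appear as NULL in the result set. You have two options for getting rid of them:

Option 1

Copy your CASE expression into the WHERE clause and use it to filter out the NULLs.

Pros: You have a single SELECT statement. Unless you simply like the look of it, there is no real benefit here.

Cons: The code is harder to read, and if you later modify the logic you must remember to update it in two places. IMHO this con is a big one: the chance of the two copies drifting apart is high, and I see it often when people update code in a hurry.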

SELECT
    CustomerID  = CASE
                    WHEN NOT EXISTS (SELECT     * FROM ProductCustomers) THEN
                        ic.CustomerID
                    WHEN NOT EXISTS (SELECT     * FROM IndustryCustomers) THEN
                        pc.CustomerID
                    WHEN EXISTS (SELECT     * FROM IndustryCustomers)
                        AND EXISTS (SELECT      * FROM ProductCustomers)
                        AND ic.CustomerID = pc.CustomerID THEN
                        ic.CustomerID
                END
FROM    IndustryCustomers   AS ic
FULL JOIN ProductCustomers AS pc
    ON pc.CustomerID = ic.CustomerID
WHERE   (CASE
            WHEN NOT EXISTS (SELECT * FROM ProductCustomers) THEN
                ic.CustomerID
            WHEN NOT EXISTS (SELECT * FROM IndustryCustomers) THEN
                pc.CustomerID
            WHEN EXISTS (SELECT * FROM IndustryCustomers)
                AND EXISTS (SELECT * FROM ProductCustomers )
                AND ic.CustomerID = pc.CustomerID THEN
                ic.CustomerID
        END
        )   IS NOT NULL;

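Option 2

Wrap your query in an outer query that eliminates the NULLs.

Pros: No duplicated logic to maintain, and the code is shorter and easier to read.

Cons: It is not a single SELECT statement, but there is no functional drawback.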

SELECT
    *
FROM    (
    SELECT
        CustomerID  = CASE
                        WHEN NOT EXISTS (SELECT     * FROM ProductCustomers) THEN
                            ic.CustomerID
                        WHEN NOT EXISTS (SELECT     * FROM IndustryCustomers) THEN
                            pc.CustomerID
                        WHEN EXISTS (SELECT     * FROM IndustryCustomers)
                            AND EXISTS (SELECT      * FROM ProductCustomers)
                            AND ic.CustomerID = pc.CustomerID THEN
                            ic.CustomerID
                    END
    FROM    IndustryCustomers   AS ic
    FULL JOIN ProductCustomers AS pc
        ON pc.CustomerID = ic.CustomerID
) AS x
WHERE   x.CustomerID IS NOT NULL;

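Sample code showing the result for each scenario

I am using a Common Table Expression (CTE) and a Table Value Constructor to build the sample data. The query that selects the data is identical in each case.

Both IndustryCustomers and ProductCustomers have data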

WITH
    IndustryCustomers AS (
        SELECT
            IndustryCustomers.CustomerID
        FROM ( VALUES (1), (2), (3)) AS IndustryCustomers (CustomerID)
    ),
    ProductCustomers AS (
        SELECT
            ProductCustomers.CustomerID
        FROM ( VALUES (2), (3), (4)) AS ProductCustomers (CustomerID)
    )
SELECT
    *
FROM    (
    SELECT
        CustomerID  = CASE
                        WHEN NOT EXISTS (SELECT     * FROM ProductCustomers) THEN
                            ic.CustomerID
                        WHEN NOT EXISTS (SELECT     * FROM IndustryCustomers) THEN
                            pc.CustomerID
                        WHEN EXISTS (SELECT     * FROM IndustryCustomers)
                            AND EXISTS (SELECT      * FROM ProductCustomers)
                            AND ic.CustomerID = pc.CustomerID THEN
                            ic.CustomerID
                    END
    FROM    IndustryCustomers   AS ic
    FULL JOIN ProductCustomers AS pc
        ON pc.CustomerID = ic.CustomerID
) AS x
WHERE   x.CustomerID IS NOT NULL;

Result:

CustomerID
2
3

ProductCustomers contains no data

WITH
    IndustryCustomers AS (
        SELECT
            IndustryCustomers.CustomerID
        FROM ( VALUES (1), (2), (3)) AS IndustryCustomers (CustomerID)
    ),
    ProductCustomers AS (
        SELECT CustomerID = NULL
        WHERE 1 = 2
    )
SELECT
    *
FROM    (
    SELECT
        CustomerID  = CASE
                        WHEN NOT EXISTS (SELECT     * FROM ProductCustomers) THEN
                            ic.CustomerID
                        WHEN NOT EXISTS (SELECT     * FROM IndustryCustomers) THEN
                            pc.CustomerID
                        WHEN EXISTS (SELECT     * FROM IndustryCustomers)
                            AND EXISTS (SELECT      * FROM ProductCustomers)
                            AND ic.CustomerID = pc.CustomerID THEN
                            ic.CustomerID
                    END
    FROM    IndustryCustomers   AS ic
    FULL JOIN ProductCustomers AS pc
        ON pc.CustomerID = ic.CustomerID
) AS x
WHERE   x.CustomerID IS NOT NULL;

Result:

CustomerID
1
2
3

IndustryCustomers contains no data

WITH
    IndustryCustomers AS (
        SELECT CustomerID = NULL
        WHERE 1 = 2
    ),
    ProductCustomers AS (
        SELECT
            ProductCustomers.CustomerID
        FROM ( VALUES (2), (3), (4)) AS ProductCustomers (CustomerID)
    )
SELECT
    *
FROM    (
    SELECT
        CustomerID  = CASE
                        WHEN NOT EXISTS (SELECT     * FROM ProductCustomers) THEN
                            ic.CustomerID
                        WHEN NOT EXISTS (SELECT     * FROM IndustryCustomers) THEN
                            pc.CustomerID
                        WHEN EXISTS (SELECT     * FROM IndustryCustomers)
                            AND EXISTS (SELECT      * FROM ProductCustomers)
                            AND ic.CustomerID = pc.CustomerID THEN
                            ic.CustomerID
                    END
    FROM    IndustryCustomers   AS ic
    FULL JOIN ProductCustomers AS pc
        ON pc.CustomerID = ic.CustomerID
) AS x
WHERE   x.CustomerID IS NOT NULL;

Result:

CustomerID
2
3
4

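Additional notes

Always use the SELECT * FROM ... form in an EXISTS clause. Not only is the intent of the code clearer, there is also no performance difference between using *, 1, TOP 1 1, or Column1, ..., Column327. SQL Server stops executing the subquery as soon as it finds a single row and never even considers the TOP. If you compare them, you will find the execution plans are all identical.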

EXISTS (SELECT 1...) vs EXISTS (SELECT TOP 1...) Does it matter?

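Testing with 10,000 records in each table, of which only half overlap. (As posted, the bounds below generate only 10 and 11 rows; widen them to test at that scale.)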

SET STATISTICS IO, TIME ON
DECLARE
    @IndustryStartID    int = 1,
    @IndustryEndID      int = 10,
    @ProductStartID     int = 5,
    @ProductEndID       int = 15;


WITH
    IndustryCustomers AS (
        SELECT CustomerID = @IndustryStartID
        UNION ALL
        SELECT
            ic.CustomerID + 1
        FROM    IndustryCustomers AS ic
        WHERE   ic.CustomerID + 1 <= @IndustryEndID
    ),
    ProductCustomers AS (
        SELECT CustomerID = @ProductStartID
        UNION ALL
        SELECT
            pc.CustomerID + 1
        FROM    ProductCustomers AS pc
        WHERE   pc.CustomerID + 1 <= @ProductEndID
    )
SELECT
    *
FROM    (
    SELECT
        CustomerID  = CASE
                        WHEN NOT EXISTS (SELECT     * FROM ProductCustomers) THEN
                            ic.CustomerID
                        WHEN NOT EXISTS (SELECT     * FROM IndustryCustomers) THEN
                            pc.CustomerID
                        WHEN EXISTS (SELECT     * FROM IndustryCustomers)
                            AND EXISTS (SELECT      * FROM ProductCustomers)
                            AND ic.CustomerID = pc.CustomerID THEN
                            ic.CustomerID
                    END
    FROM    IndustryCustomers   AS ic
    FULL JOIN ProductCustomers AS pc
        ON pc.CustomerID = ic.CustomerID
) AS x
WHERE   x.CustomerID IS NOT NULL
OPTION (MAXRECURSION 10000);

SET STATISTICS IO, TIME OFF

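Personally, I would probably go with @Heinzi's approach, but strangely, the NOT EXISTS version performs worse than my solution. According to the execution plan, NOT EXISTS seems to scan the entire table for no apparent reason; I'm not sure why. I will have to investigate further (I am using SQL Server 2017 Developer Edition).

So here is a very concise solution that seems to perform better than both Heinzi's and Nick's solutions (in my very limited testing).

Concise solution using APPLY and FULL JOIN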

SELECT FinalCustomerID = ISNULL(I.CustomerID,P.CustomerID)
FROM #IndustryCustomers AS I
FULL JOIN #ProductCustomers AS P
    ON I.CustomerID = P.CustomerID
CROSS APPLY (
    SELECT 
     HasI = CASE WHEN EXISTS (SELECT * FROM #IndustryCustomers) THEN 'Y' ELSE 'N' END
    ,HasP = CASE WHEN EXISTS (SELECT * FROM #ProductCustomers ) THEN 'Y' ELSE 'N' END
) AS C
WHERE ('N' NOT IN (HasI,HasP) AND I.CustomerID = P.CustomerID)
    OR (HasI = 'Y' AND HasP = 'N' AND I.CustomerID IS NOT NULL)
    OR (HasI = 'N' AND HasP = 'Y' AND P.CustomerID IS NOT NULL)
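For anyone who wants to trace the WHERE clause by hand, here is a small Python sketch of the same predicate. It is an illustration under my own assumptions (unique IDs per table, a hypothetical function name `apply_solution`, rows visited in ID order) and says nothing about how SQL Server actually executes the query or how it performs.

```python
def apply_solution(industry, product):
    """Model the FULL JOIN + CROSS APPLY query on two lists of unique IDs."""
    # The CROSS APPLY block: two flags that are the same for every row.
    has_i = "Y" if industry else "N"
    has_p = "Y" if product else "N"

    result = []
    for key in sorted(set(industry) | set(product)):
        # One FULL JOIN row: None plays the role of NULL on the missing side.
        i = key if key in industry else None
        p = key if key in product else None

        keep = (
            # Both tables have rows: keep only IDs present on both sides.
            ("N" not in (has_i, has_p) and i is not None and i == p)
            # Only IndustryCustomers has rows: keep all of its IDs.
            or (has_i == "Y" and has_p == "N" and i is not None)
            # Only ProductCustomers has rows: keep all of its IDs.
            or (has_i == "N" and has_p == "Y" and p is not None)
        )
        if keep:
            # ISNULL(I.CustomerID, P.CustomerID)
            result.append(i if i is not None else p)
    return result


if __name__ == "__main__":
    print(apply_solution([1, 2, 3], [2, 3, 4]))
    print(apply_solution([1, 2, 3], []))
    print(apply_solution([], [2, 3, 4]))
```

Because the filtering happens in WHERE rather than in the SELECT list, no NULL rows are produced in the first place, so no wrapping query is needed.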