LEFT JOIN WHERE IS NULL for same table in Teradata SQL

Question

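I have a table with 51 records. The table structure is as follows:

ack_extract_id  query_id  cnst_giftran_key  field1  value1

ack_extract_id can currently be 8 or 9. I want to find the giftran keys that exist for extract_id 9 but not for extract_id 8.

What I tried is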

            SELECT *
            FROM ddcoe_tbls.ack_flextable ack_flextable1
            INNER JOIN ddcoe_tbls.ack_main_config config
                ON ack_flextable1.ack_extract_id = config.ack_extract_id
            LEFT JOIN ddcoe_tbls.ack_flextable ack_flextable2
                ON ack_flextable1.cnst_giftran_key = ack_flextable2.cnst_giftran_key
            WHERE  ack_flextable2.cnst_giftran_key IS NULL
            AND  config.ack_extract_file_nm LIKE '%Dtl%'
                AND ack_flextable2.ack_extract_id = 8
                AND ack_flextable1.ack_extract_id = 9

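But it returns 0 records. Ideally, a left join where the right side is NULL should return the records whose cnst_giftran_key does not exist in the right-hand table, shouldn't it?

What am I missing here?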

Answer 1

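When you test a column from the left-joined table in the WHERE clause (ack_flextable2.ack_extract_id in your case), you force that join to behave as if it were an inner join. Instead, move that test into the join condition.

Then, to find the records that lack the value, test for a NULL key in the WHERE clause.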

        SELECT *
        FROM ddcoe_tbls.ack_flextable ack_flextable1
        INNER JOIN ddcoe_tbls.ack_main_config config
            ON ack_flextable1.ack_extract_id = config.ack_extract_id
        LEFT JOIN ddcoe_tbls.ack_flextable ack_flextable2
            ON ack_flextable1.cnst_giftran_key = ack_flextable2.cnst_giftran_key
                AND ack_flextable2.ack_extract_id = 8
        WHERE  ack_flextable2.cnst_giftran_key IS NULL
            AND config.ack_extract_file_nm LIKE '%Dtl%'
            AND ack_flextable1.ack_extract_id = 9
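
To see the difference on concrete data, here is a minimal sketch. The table demo_flextable and its four rows are made up for illustration; only the column names come from the question, and the join to the config table is left out to keep it short.

```sql
-- Hypothetical demo table; name and data are assumptions, not the real schema.
CREATE TABLE demo_flextable (
    ack_extract_id   INTEGER NOT NULL,
    cnst_giftran_key INTEGER NOT NULL
);

INSERT INTO demo_flextable VALUES (8, 100);
INSERT INTO demo_flextable VALUES (8, 200);
INSERT INTO demo_flextable VALUES (9, 200);
INSERT INTO demo_flextable VALUES (9, 300);

-- Right-hand filter in WHERE: every row matches at least itself on the key,
-- and any NULL-extended row would fail f2.ack_extract_id = 8 anyway,
-- so this returns no rows.
SELECT f1.cnst_giftran_key
FROM demo_flextable f1
LEFT JOIN demo_flextable f2
    ON f1.cnst_giftran_key = f2.cnst_giftran_key
WHERE f2.cnst_giftran_key IS NULL
    AND f2.ack_extract_id = 8
    AND f1.ack_extract_id = 9;

-- Right-hand filter in ON: only extract 8 rows can match, so the
-- extract 9 row with key 300 stays NULL-extended and is returned.
SELECT f1.cnst_giftran_key
FROM demo_flextable f1
LEFT JOIN demo_flextable f2
    ON f1.cnst_giftran_key = f2.cnst_giftran_key
    AND f2.ack_extract_id = 8
WHERE f2.cnst_giftran_key IS NULL
    AND f1.ack_extract_id = 9;
```

Key 200 exists for both extracts, so it is filtered out by the IS NULL test in the second query; key 300 exists only for extract 9.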

Answer 2

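This is not an answer, just an explanation.

From your comment on Joe Stefanelli's answer I gather that you have not fully understood the WHERE versus ON issue in outer joins. So let's look at an example.

We are looking for the last order of every supplier, i.e. the order records for which no newer order from the same supplier exists.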

    select *
    from order
    where not exists
    (
      select *
      from order newer
      where newer.supplier = order.supplier
        and newer.orderdate > order.orderdate
    );

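This is straightforward; the query reads just like what we said: find the orders for which no newer order from the same supplier exists.

The same query with the anti-join pattern: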

    select order.*
    from order
    left join order newer on  newer.supplier = order.supplier
                          and newer.orderdate > order.orderdate
    where newer.id is null;

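Here we join every order with all of its newer orders, thus possibly creating a huge intermediate result. With the left outer join we make sure that a dummy record gets attached when there is no newer order for the supplier. Finally, the WHERE clause scans the intermediate result and keeps only the records where the attached record has a NULL ID. ID is obviously the table's primary key and can never be NULL, so what we keep here are just the outer-joined results, where the newer data is merely a dummy record consisting of NULLs. Thus we get exactly the orders for which no newer order exists.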
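
The intermediate result can be made visible with a small sketch. The table and its three rows are hypothetical; it is named orders here because ORDER is a reserved word in most SQL dialects.

```sql
-- Hypothetical sample data, made up for illustration.
CREATE TABLE orders (
    id        INTEGER NOT NULL PRIMARY KEY,
    supplier  VARCHAR(10) NOT NULL,
    orderdate DATE NOT NULL
);

INSERT INTO orders VALUES (1, 'A', DATE '2020-01-01');
INSERT INTO orders VALUES (2, 'A', DATE '2020-02-01');
INSERT INTO orders VALUES (3, 'B', DATE '2020-01-15');

-- Intermediate result of the outer join, before any WHERE filter:
--   order 1 pairs with order 2 (newer, same supplier);
--   orders 2 and 3 have no newer order, so their "newer" side is all NULL.
SELECT o.id AS order_id, newer.id AS newer_id
FROM orders o
LEFT JOIN orders newer
    ON newer.supplier = o.supplier
    AND newer.orderdate > o.orderdate;

-- Keeping only the NULL-extended rows leaves the last order per supplier:
-- ids 2 and 3.
SELECT o.*
FROM orders o
LEFT JOIN orders newer
    ON newer.supplier = o.supplier
    AND newer.orderdate > o.orderdate
WHERE newer.id IS NULL;
```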

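Speaking of a huge intermediate result: how can this be faster than the first query? Well, it shouldn't be. The first query should actually run equally fast or faster. A good DBMS will see through this and produce the same execution plan for both queries. A rather young DBMS, however, may really execute the anti-join faster. That is because its developers put most of their effort into join techniques, which nearly every query needs, and cared less about IN and EXISTS. In such a case one may run into performance issues with NOT IN or NOT EXISTS and use the anti-join pattern instead.

Now as to the WHERE / ON problem: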

    select order.*
    from order
    left join order newer on newer.orderdate > order.orderdate
    where newer.supplier = order.supplier
    and newer.id is null;

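This looks almost the same as before, but one condition has moved from ON to WHERE, which means the outer join gets a different condition. Here is what happens: for each order, find all newer orders, no matter which supplier! So only the orders on the latest order date get an outer-join dummy record. Then, in the WHERE clause, we remove all pairs where the supplier doesn't match. Note that the outer-joined records contain NULL for newer.supplier, so newer.supplier = order.supplier is never true for them; they get removed. But if we remove all outer-joined records, we get exactly the same result as with a plain inner join. When we put outer-join criteria in the WHERE clause, we turn the outer join into an inner join. So the query can be rewritten as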

    select order.*
    from order
    inner join order newer on newer.orderdate > order.orderdate
    where newer.supplier = order.supplier
    and newer.id is null;

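And with tables in FROM and INNER JOIN, it doesn't matter whether a condition is in ON or WHERE; this is rather a matter of readability, because both criteria get applied equally.

Now we see that newer.id is null can never be true. The final result will be empty, which is exactly what happens with your query.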
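
The collapse to an empty result can be traced step by step in a small sketch. The table supplier_orders and its rows are hypothetical stand-ins for the order table in the explanation (ORDER itself is a reserved word in most dialects).

```sql
-- Hypothetical sample data, made up for illustration.
CREATE TABLE supplier_orders (
    id        INTEGER NOT NULL PRIMARY KEY,
    supplier  VARCHAR(10) NOT NULL,
    orderdate DATE NOT NULL
);

INSERT INTO supplier_orders VALUES (1, 'A', DATE '2020-01-01');
INSERT INTO supplier_orders VALUES (2, 'A', DATE '2020-02-01');
INSERT INTO supplier_orders VALUES (3, 'B', DATE '2020-01-15');

-- Step 1: outer join on the date only.
--   Pairs (1, 2), (1, 3) and (3, 2) are real matches; order 2 is the latest
--   order overall and is the only one that gets a NULL-extended row.
SELECT o.id AS order_id, newer.id AS newer_id
FROM supplier_orders o
LEFT JOIN supplier_orders newer
    ON newer.orderdate > o.orderdate;

-- Step 2: supplier test in WHERE.
--   (1, 3) and (3, 2) are dropped because the suppliers differ, and the
--   NULL-extended row is dropped because NULL = 'A' is not true.
--   Only the pair (1, 2) remains, exactly what an inner join would give.
SELECT o.id AS order_id, newer.id AS newer_id
FROM supplier_orders o
LEFT JOIN supplier_orders newer
    ON newer.orderdate > o.orderdate
WHERE newer.supplier = o.supplier;

-- Step 3: adding newer.id IS NULL removes the last remaining pair,
-- so the result is empty.
SELECT o.*
FROM supplier_orders o
LEFT JOIN supplier_orders newer
    ON newer.orderdate > o.orderdate
WHERE newer.supplier = o.supplier
    AND newer.id IS NULL;
```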

Answer 3

You can try this query:

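    -- cnst_giftran_key is a column of ack_flextable (see the question),
    -- so both the outer query and the subquery read that table.
    select * from ddcoe_tbls.ack_flextable
    where cnst_giftran_key not in
      (
       select cnst_giftran_key from ddcoe_tbls.ack_flextable
       where ack_extract_id = 8
      )
    and ack_extract_id = 9;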
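
One caveat with NOT IN: if the subquery can return a NULL key, the comparison is never true and the query returns no rows at all. A NOT EXISTS form does not have this problem. The sketch below uses a hypothetical table demo_keys with a deliberately nullable key column; whether the real cnst_giftran_key can be NULL is not stated in the question.

```sql
-- Hypothetical demo table; name and data are assumptions.
CREATE TABLE demo_keys (
    ack_extract_id   INTEGER NOT NULL,
    cnst_giftran_key INTEGER          -- nullable on purpose
);

INSERT INTO demo_keys VALUES (8, 100);
INSERT INTO demo_keys VALUES (8, NULL);
INSERT INTO demo_keys VALUES (9, 100);
INSERT INTO demo_keys VALUES (9, 300);

-- NOT IN: the subquery yields 100 and NULL. For key 300 the test
-- "300 NOT IN (100, NULL)" is unknown rather than true, so no rows come back.
SELECT cnst_giftran_key
FROM demo_keys
WHERE ack_extract_id = 9
    AND cnst_giftran_key NOT IN (
        SELECT cnst_giftran_key
        FROM demo_keys
        WHERE ack_extract_id = 8
    );

-- NOT EXISTS: the NULL row simply never matches, so key 300 is returned.
SELECT k9.cnst_giftran_key
FROM demo_keys k9
WHERE k9.ack_extract_id = 9
    AND NOT EXISTS (
        SELECT 1
        FROM demo_keys k8
        WHERE k8.ack_extract_id = 8
            AND k8.cnst_giftran_key = k9.cnst_giftran_key
    );
```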
