检索行之间特定列不同的行
Retrieve rows where specific columns are different between rows
我有以下 SQL 来获得最高(最大)和第二高的日期行。在某些情况下,数据中 'OLD ROW' 和 'NEW ROW' 的唯一区别是生效日期。我想做的只是显示每个 EMPLID 的旧行和新行,其中 BANK_CD and/or ACCOUNT_NUM 在 'old' 和 'new' 之间不同行。
WITH CTE AS (
SELECT A.EMPLID
, C.VENDOR_ID
, B.FIRST_NAME
, B.LAST_NAME
, A.BANK_CD
, A.ACCOUNT_NUM
, A.ACCOUNT_TYPE
, A.PRIORITY
, A.LAST_UPDATE_DATE
, A.EFFDT
, MAX(A.LAST_UPDATE_DATE) OVER(PARTITION BY A.EMPLID) AS MAX_UPDATE_DATE
, ROW_NUMBER() OVER(PARTITION BY A.EMPLID
ORDER BY A.EFFDT DESC
, D.EFFDT DESC) AS RN
FROM PS_DIRECT_DEPOSIT D INNER JOIN PS_DIR_DEP_DISTRIB A ON A.EMPLID = D.EMPLID
AND A.EFFDT = D.EFFDT INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID INNER JOIN PS_GHS_DIR_DEP_VND C ON C.EMPLID = A.EMPLID
INNER JOIN PS_DIR_DEP_DISTRIB E ON E.EMPLID = A.EMPLID AND E.EFFDT = A.EFFDT AND E.PRIORITY = A.PRIORITY
WHERE B.EMPL_STATUS NOT IN ('T','R','D')
AND ((A.DEPOSIT_TYPE = 'P'
AND A.AMOUNT_PCT = 100)
OR A.PRIORITY = 999
OR A.DEPOSIT_TYPE = 'B')
AND D.EFF_STATUS = 'A' )
SELECT CASE WHEN RN = 1 THEN 'NEW ROW' WHEN RN = 2 THEN 'OLD ROW' END AS
'ROW_TYPE' , *
FROM CTE
WHERE RN IN (1, 2)
AND MAX_UPDATE_DATE >= GETDATE() - 8
这是我现在得到的输出示例:
ROW_TYPE EMPLID VENDOR_ID FIRST_NAME LAST_NAME BANK_CD ACCOUNT_NUM ACCOUNT_TYPE PRIORITY LAST_UPDATE_DATE EFFDT MAX_UPDATE_DATE RN
NEW ROW 12345 XYZ123 John Smith 111111122 45678 C 999 03/12/2019 03/12/2019 03/12/2019 1
OLD ROW 12345 XYZ123 John Smith 111111122 45678 C 999 10/25/2017 10/25/2017 10/25/2017 2
NEW ROW 47831 A86464 Samm Bulle 754566654 98865 C 999 03/12/2019 06/08/2018 03/12/2019 1
OLD ROW 47831 A86464 Samm Bulle 754566654 45678 C 999 10/25/2017 06/08/2018 10/25/2017 2
NEW ROW 32456 KG4561 Kilo Renne 875123311 32146 C 300 09/02/2018 09/02/2018 09/02/2018 1
OLD ROW 32456 KG4561 Kilo Renne 971215477 78131 C 310 12/21/2017 12/21/2017 12/21/2017 2
在上面的示例中,我不想输出前两组 EMPLID 行(前 4 行),因为它们在新旧行之间具有相同的 BANK_CD 和 ACCOUNT_NUM .最后一组行我想保留在我的输出中,因为你可以看到 BANK_CD 和 ACCOUNT_NUM 是不同的(尽管可能被包括在内)。
我尝试为 PS_DIR_DEP_DISTRIB 添加自连接,然后在 WHERE 子句中添加 where AND (A.BANK_CD <> E.BANK_CD OR A.ACCOUNT_NUM <> E.ACCOUNT_NUM) 如下:
WITH CTE AS (
SELECT A.EMPLID
--, C.VENDOR_ID
, B.FIRST_NAME
, B.LAST_NAME
, A.BANK_CD
, A.ACCOUNT_NUM
, A.ACCOUNT_TYPE
, A.PRIORITY
, A.LAST_UPDATE_DATE
, A.EFFDT
, MAX(A.LAST_UPDATE_DATE) OVER(PARTITION BY A.EMPLID) AS MAX_UPDATE_DATE
, ROW_NUMBER() OVER(PARTITION BY A.EMPLID
ORDER BY A.EFFDT DESC
, D.EFFDT DESC) AS RN
FROM PS_DIRECT_DEPOSIT D INNER JOIN PS_DIR_DEP_DISTRIB A ON A.EMPLID = D.EMPLID
AND A.EFFDT = D.EFFDT INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID INNER JOIN PS_GHS_DIR_DEP_VND C ON C.EMPLID = A.EMPLID
---ADDING SELF JOIN BELOW---
INNER JOIN PS_DIR_DEP_DISTRIB E ON E.EMPLID = A.EMPLID AND E.EFFDT =
A.EFFDT AND E.PRIORITY = A.PRIORITY
WHERE B.EMPL_STATUS NOT IN ('T','R','D')
AND ((A.DEPOSIT_TYPE = 'P'
AND A.AMOUNT_PCT = 100)
OR A.PRIORITY = 999
OR A.DEPOSIT_TYPE = 'B')
AND D.EFF_STATUS = 'A'
---ADDING NEW WHERE CONDITION BELOW---
AND (E.ACCOUNT_NUM <> A.ACCOUNT_NUM OR E.BANK_CD <> A.BANK_CD ))
SELECT CASE WHEN RN = 1 THEN 'NEW ROW' WHEN RN = 2 THEN 'OLD ROW' END AS 'ROW_TYPE'
, *
FROM CTE
WHERE RN IN (1, 2)
AND MAX_UPDATE_DATE >= GETDATE() - 8
当我进行上述更改时,我根本没有得到任何返回的数据。我做错了什么?
3/20/19 编辑:
按照 Tarek 在下面的建议回答,我似乎遗漏了一些我希望在最终输出中出现的行。添加的新字段 BANK_CD_prev
和 ACCOUNT_NUM_prev
似乎在我希望显示的行中返回 NULL
值,因此最终查询未检索它们的原因。如何处理?
ROW_TYPE EMPLID FIRST_NAME LAST_NAME BANK_CD ACCOUNT_NUM ACCOUNT_TYPE PRIORITY LAST_UPDATE_DATE EFFDT MAX_UPDATE_DATE RN BANK_CD_prev ACCOUNT_NUM_prev
NEW ROW 56789 Test User 874556411 54765 C 999 2019-02-28 2019-02-28 2019-02-28 1 NULL NULL
OLD ROW 56789 Test User 874556411 98451 C 999 2017-10-09 2017-10-09 2017-10-09 1 874556411 54765
NEW ROW 56789 Sampl Test 756561623 46331 C 999 2018-03-12 2018-03-12 2018-03-12 1 NULL NULL
OLD ROW 56789 Test User 756561623 46331 C 999 2015-05-18 2015-05-18 2015-05-18 1 756561623 46331
请注意以上是 运行 查询 没有 ( AND BANK_CD <> BANK_CD_prev AND ACCOUNT_NUM <> ACCOUNT_NUM_prev
的最终条件只是为了让我明白为什么我遗漏了一些行。在这种情况下,您可以看到第一个 EMPLID
的两行 ACCOUNT_NUM 不同,因此我应该输出这两行。第二个 EMPLID
(最后两行)有点不同之处在于 BANK_CD
和 ACCOUNT_NUM
对于两行都是相同的,所以我不希望它被输出(它不是,所以这是正确的)但仍然显示 NULL
与第一组行相同。
3/20/19 编辑 2: 你可以看到 BANK_CD_COUNT
和 ACCOUNT_NUM_COUNT
中有一些我认为是造成的奇怪数字一个问题。
ROW_TYPE EMPLID FIRST_NAME LAST_NAME BANK_CD ACCOUNT_NUM ACCOUNT_TYPE PRIORITY LAST_UPDATE_DATE EFFDT MAX_UPDATE_DATE RN BANK_CD_COUNT ACCOUNT_NUM_COUNT
NEW ROW 812682 Test User 031308302 8675309 C 999 2019-03-09 2019-03-09 2019-03-09 1 1101 1
OLD ROW 812682 Test User 231379393 0001236408 C 999 2018-04-11 2018-03-03 2019-03-09 2 476 1
此 link 包含创建 Table 和插入脚本,供任何可以帮助复制此内容的人使用。
我只想将以下分析功能添加到您的 CTE 到 return 前身:
LAG(A.BANK_CD) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC) AS BANK_CD_prev,
LAG(A.ACCOUNT_NUM) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC) AS ACCOUNT_NUM_prev
然后您只需要在主查询中引用它就可以了
FROM CTE
WHERE RN IN (1, 2)
AND MAX_UPDATE_DATE >= GETDATE() - 8
AND BANK_CD <> BANK_CD_prev
AND ACCOUNT_NUM <> ACCOUNT_NUM_prev
如果解决了您的问题,请回复。由于您没有提供示例进行测试,因此很难判断。
编辑:我的意思是:在您的初始方法中使用该解决方案,根本不需要自连接。它不会帮助你解决你的问题。
一种方法是使用 LEAD
和 LAG
函数。我们在这里都需要。
这是您的原始查询,我按日期格式化并注释掉了过滤器。
WITH
CTE
AS
(
SELECT
A.EMPLID
,C.VENDOR_ID
,B.FIRST_NAME
,B.LAST_NAME
,A.BANK_CD
,A.ACCOUNT_NUM
,A.ACCOUNT_TYPE
,A.PRIORITY
,A.LAST_UPDATE_DATE
,A.EFFDT
,MAX(A.LAST_UPDATE_DATE) OVER (PARTITION BY A.EMPLID) AS MAX_UPDATE_DATE
,ROW_NUMBER() OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS RN
FROM
PS_DIRECT_DEPOSIT D
INNER JOIN PS_DIR_DEP_DISTRIB A
ON A.EMPLID = D.EMPLID
AND A.EFFDT = D.EFFDT
INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID
INNER JOIN PS_GHS_DIR_DEP_VND C ON C.EMPLID = A.EMPLID
INNER JOIN PS_DIR_DEP_DISTRIB E
ON E.EMPLID = A.EMPLID
AND E.EFFDT = A.EFFDT
AND E.PRIORITY = A.PRIORITY
WHERE
B.EMPL_STATUS NOT IN ('T','R','D')
AND
(
(
A.DEPOSIT_TYPE = 'P'
AND A.AMOUNT_PCT = 100
)
OR A.PRIORITY = 999
OR A.DEPOSIT_TYPE = 'B'
)
AND D.EFF_STATUS = 'A'
)
SELECT
CASE
WHEN RN = 1 THEN 'NEW ROW'
WHEN RN = 2 THEN 'OLD ROW'
END AS 'ROW_TYPE'
,*
FROM CTE
WHERE
RN IN (1,2)
--AND MAX_UPDATE_DATE >= GETDATE() - 8
;
此查询return根据您的示例数据得出以下结果:
+----------+---------+-----------+------------+-----------+----------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+
| ROW_TYPE | EMPLID | VENDOR_ID | FIRST_NAME | LAST_NAME | BANK_CD | ACCOUNT_NUM | ACCOUNT_TYPE | PRIORITY | LAST_UPDATE_DATE | EFFDT | MAX_UPDATE_DATE | RN |
+----------+---------+-----------+------------+-----------+----------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+
| NEW ROW | 097432 | 3471B | H | Adam | 09146115 | 13404891 | C | 999 | 2016-06-10 00:00:00.000 | 2016-06-10 00:00:00.000 | 2016-06-10 00:00:00.000 | 1 |
| OLD ROW | 097432 | 3471B | H | Adam | 09146115 | 13404891 | C | 999 | 2016-06-10 00:00:00.000 | 2014-11-05 00:00:00.000 | 2016-06-10 00:00:00.000 | 2 |
| NEW ROW | 100765 | 1272B | V | Milo | 2358414 | 040925 | S | 310 | 2014-03-05 00:00:00.000 | 2011-04-27 00:00:00.000 | 2014-03-05 00:00:00.000 | 1 |
| OLD ROW | 100765 | 1272B | V | Milo | 2358414 | 040925 | S | 300 | 2014-03-05 00:00:00.000 | 2001-08-23 00:00:00.000 | 2014-03-05 00:00:00.000 | 2 |
| NEW ROW | 1045632 | 3870A | V | Olo | 2345794 | 179410860 | C | 999 | 2018-08-16 00:00:00.000 | 2018-08-16 00:00:00.000 | 2018-08-16 00:00:00.000 | 1 |
| OLD ROW | 1045632 | 3870A | V | Olo | 2345794 | 179410860 | C | 310 | 2011-02-25 00:00:00.000 | 2011-02-25 00:00:00.000 | 2018-08-16 00:00:00.000 | 2 |
| NEW ROW | 21345 | 12345A | J | Smith | 0224547 | 59167824 | C | 999 | 2019-02-28 00:00:00.000 | 2019-02-28 00:00:00.000 | 2019-02-28 00:00:00.000 | 1 |
| OLD ROW | 21345 | 12345A | J | Smith | 0224547 | 591678 | C | 999 | 2017-11-08 00:00:00.000 | 2017-10-17 00:00:00.000 | 2019-02-28 00:00:00.000 | 2 |
| NEW ROW | 26110 | 1272B | S | Sams | 8208302 | 822328 | C | 999 | 2019-02-08 00:00:00.000 | 2019-02-08 00:00:00.000 | 2019-02-08 00:00:00.000 | 1 |
| OLD ROW | 26110 | 1272B | S | Sams | 8208302 | 822328 | C | 315 | 2014-03-05 00:00:00.000 | 2012-07-30 00:00:00.000 | 2019-02-08 00:00:00.000 | 2 |
+----------+---------+-----------+------------+-----------+----------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+
我把它包括在这里,所以你可以看到与下面的区别。
这是添加了 LEAD
和 LAG
函数的新查询(我们需要同时获取上一行和下一行),并在 WHERE
子句中添加了额外的过滤器。
LEAD
和 LAG
return 到达 window 末尾时为 NULL。例如,对于 window 的第一行,LAG
(上一个值)将 return NULL,因为还没有 "previous" 行。
因此,对于第一行 (rn=1),我们需要将其值与 "next".
进行比较
对于第二行 (rn=2),我们需要将其值与 "prev".
进行比较
最终查询
WITH
CTE
AS
(
SELECT
A.EMPLID
,C.VENDOR_ID
,B.FIRST_NAME
,B.LAST_NAME
,A.BANK_CD
,A.ACCOUNT_NUM
,A.ACCOUNT_TYPE
,A.PRIORITY
,A.LAST_UPDATE_DATE
,A.EFFDT
,MAX(A.LAST_UPDATE_DATE) OVER (PARTITION BY A.EMPLID) AS MAX_UPDATE_DATE
,ROW_NUMBER() OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS RN
,LAG(A.BANK_CD) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS BANK_CD_prev
,LEAD(A.BANK_CD) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS BANK_CD_next
,LAG(A.ACCOUNT_NUM) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS ACCOUNT_NUM_prev
,LEAD(A.ACCOUNT_NUM) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS ACCOUNT_NUM_next
FROM
PS_DIRECT_DEPOSIT D
INNER JOIN PS_DIR_DEP_DISTRIB A
ON A.EMPLID = D.EMPLID
AND A.EFFDT = D.EFFDT
INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID
INNER JOIN PS_GHS_DIR_DEP_VND C ON C.EMPLID = A.EMPLID
INNER JOIN PS_DIR_DEP_DISTRIB E
ON E.EMPLID = A.EMPLID
AND E.EFFDT = A.EFFDT
AND E.PRIORITY = A.PRIORITY
WHERE
B.EMPL_STATUS NOT IN ('T','R','D')
AND
(
(
A.DEPOSIT_TYPE = 'P'
AND A.AMOUNT_PCT = 100
)
OR A.PRIORITY = 999
OR A.DEPOSIT_TYPE = 'B'
)
AND D.EFF_STATUS = 'A'
)
SELECT
CASE
WHEN RN = 1 THEN 'NEW ROW'
WHEN RN = 2 THEN 'OLD ROW'
END AS 'ROW_TYPE'
,*
FROM CTE
WHERE
RN IN (1,2)
AND
(
(
rn = 1
AND BANK_CD <> BANK_CD_next
)
OR
(
rn = 1
AND ACCOUNT_NUM <> ACCOUNT_NUM_next
)
OR
(
rn = 2
AND BANK_CD <> BANK_CD_prev
)
OR
(
rn = 2
AND ACCOUNT_NUM <> ACCOUNT_NUM_prev
)
)
--AND MAX_UPDATE_DATE >= GETDATE() - 8
;
最终结果
+----------+--------+-----------+------------+-----------+---------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+--------------+--------------+------------------+------------------+
| ROW_TYPE | EMPLID | VENDOR_ID | FIRST_NAME | LAST_NAME | BANK_CD | ACCOUNT_NUM | ACCOUNT_TYPE | PRIORITY | LAST_UPDATE_DATE | EFFDT | MAX_UPDATE_DATE | RN | BANK_CD_prev | BANK_CD_next | ACCOUNT_NUM_prev | ACCOUNT_NUM_next |
+----------+--------+-----------+------------+-----------+---------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+--------------+--------------+------------------+------------------+
| NEW ROW | 21345 | 12345A | J | Smith | 0224547 | 59167824 | C | 999 | 2019-02-28 00:00:00.000 | 2019-02-28 00:00:00.000 | 2019-02-28 00:00:00.000 | 1 | NULL | 0224547 | NULL | 591678 |
| OLD ROW | 21345 | 12345A | J | Smith | 0224547 | 591678 | C | 999 | 2017-11-08 00:00:00.000 | 2017-10-17 00:00:00.000 | 2019-02-28 00:00:00.000 | 2 | 0224547 | NULL | 59167824 | NULL |
+----------+--------+-----------+------------+-----------+---------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+--------------+--------------+------------------+------------------+
我有以下 SQL 来获得最高(最大)和第二高的日期行。在某些情况下,数据中 'OLD ROW' 和 'NEW ROW' 的唯一区别是生效日期。我想做的只是显示每个 EMPLID 的旧行和新行,其中 BANK_CD and/or ACCOUNT_NUM 在 'old' 和 'new' 之间不同行。
WITH CTE AS (
SELECT A.EMPLID
, C.VENDOR_ID
, B.FIRST_NAME
, B.LAST_NAME
, A.BANK_CD
, A.ACCOUNT_NUM
, A.ACCOUNT_TYPE
, A.PRIORITY
, A.LAST_UPDATE_DATE
, A.EFFDT
, MAX(A.LAST_UPDATE_DATE) OVER(PARTITION BY A.EMPLID) AS MAX_UPDATE_DATE
, ROW_NUMBER() OVER(PARTITION BY A.EMPLID
ORDER BY A.EFFDT DESC
, D.EFFDT DESC) AS RN
FROM PS_DIRECT_DEPOSIT D INNER JOIN PS_DIR_DEP_DISTRIB A ON A.EMPLID = D.EMPLID
AND A.EFFDT = D.EFFDT INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID INNER JOIN PS_GHS_DIR_DEP_VND C ON C.EMPLID = A.EMPLID
INNER JOIN PS_DIR_DEP_DISTRIB E ON E.EMPLID = A.EMPLID AND E.EFFDT = A.EFFDT AND E.PRIORITY = A.PRIORITY
WHERE B.EMPL_STATUS NOT IN ('T','R','D')
AND ((A.DEPOSIT_TYPE = 'P'
AND A.AMOUNT_PCT = 100)
OR A.PRIORITY = 999
OR A.DEPOSIT_TYPE = 'B')
AND D.EFF_STATUS = 'A' )
SELECT CASE WHEN RN = 1 THEN 'NEW ROW' WHEN RN = 2 THEN 'OLD ROW' END AS
'ROW_TYPE' , *
FROM CTE
WHERE RN IN (1, 2)
AND MAX_UPDATE_DATE >= GETDATE() - 8
这是我现在得到的输出示例:
ROW_TYPE EMPLID VENDOR_ID FIRST_NAME LAST_NAME BANK_CD ACCOUNT_NUM ACCOUNT_TYPE PRIORITY LAST_UPDATE_DATE EFFDT MAX_UPDATE_DATE RN
NEW ROW 12345 XYZ123 John Smith 111111122 45678 C 999 03/12/2019 03/12/2019 03/12/2019 1
OLD ROW 12345 XYZ123 John Smith 111111122 45678 C 999 10/25/2017 10/25/2017 10/25/2017 2
NEW ROW 47831 A86464 Samm Bulle 754566654 98865 C 999 03/12/2019 06/08/2018 03/12/2019 1
OLD ROW 47831 A86464 Samm Bulle 754566654 45678 C 999 10/25/2017 06/08/2018 10/25/2017 2
NEW ROW 32456 KG4561 Kilo Renne 875123311 32146 C 300 09/02/2018 09/02/2018 09/02/2018 1
OLD ROW 32456 KG4561 Kilo Renne 971215477 78131 C 310 12/21/2017 12/21/2017 12/21/2017 2
在上面的示例中,我不想输出前两组 EMPLID 行(前 4 行),因为它们在新旧行之间具有相同的 BANK_CD 和 ACCOUNT_NUM .最后一组行我想保留在我的输出中,因为你可以看到 BANK_CD 和 ACCOUNT_NUM 是不同的(尽管可能被包括在内)。
我尝试为 PS_DIR_DEP_DISTRIB 添加自连接,然后在 WHERE 子句中添加 where AND (A.BANK_CD <> E.BANK_CD OR A.ACCOUNT_NUM <> E.ACCOUNT_NUM) 如下:
WITH CTE AS (
SELECT A.EMPLID
--, C.VENDOR_ID
, B.FIRST_NAME
, B.LAST_NAME
, A.BANK_CD
, A.ACCOUNT_NUM
, A.ACCOUNT_TYPE
, A.PRIORITY
, A.LAST_UPDATE_DATE
, A.EFFDT
, MAX(A.LAST_UPDATE_DATE) OVER(PARTITION BY A.EMPLID) AS MAX_UPDATE_DATE
, ROW_NUMBER() OVER(PARTITION BY A.EMPLID
ORDER BY A.EFFDT DESC
, D.EFFDT DESC) AS RN
FROM PS_DIRECT_DEPOSIT D INNER JOIN PS_DIR_DEP_DISTRIB A ON A.EMPLID = D.EMPLID
AND A.EFFDT = D.EFFDT INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID INNER JOIN PS_GHS_DIR_DEP_VND C ON C.EMPLID = A.EMPLID
---ADDING SELF JOIN BELOW---
INNER JOIN PS_DIR_DEP_DISTRIB E ON E.EMPLID = A.EMPLID AND E.EFFDT =
A.EFFDT AND E.PRIORITY = A.PRIORITY
WHERE B.EMPL_STATUS NOT IN ('T','R','D')
AND ((A.DEPOSIT_TYPE = 'P'
AND A.AMOUNT_PCT = 100)
OR A.PRIORITY = 999
OR A.DEPOSIT_TYPE = 'B')
AND D.EFF_STATUS = 'A'
---ADDING NEW WHERE CONDITION BELOW---
AND (E.ACCOUNT_NUM <> A.ACCOUNT_NUM OR E.BANK_CD <> A.BANK_CD ))
SELECT CASE WHEN RN = 1 THEN 'NEW ROW' WHEN RN = 2 THEN 'OLD ROW' END AS 'ROW_TYPE'
, *
FROM CTE
WHERE RN IN (1, 2)
AND MAX_UPDATE_DATE >= GETDATE() - 8
当我进行上述更改时,我根本没有得到任何返回的数据。我做错了什么?
3/20/19 编辑:
按照 Tarek 在下面的建议回答,我似乎遗漏了一些我希望在最终输出中出现的行。添加的新字段 BANK_CD_prev
和 ACCOUNT_NUM_prev
似乎在我希望显示的行中返回 NULL
值,因此最终查询未检索它们的原因。如何处理?
ROW_TYPE EMPLID FIRST_NAME LAST_NAME BANK_CD ACCOUNT_NUM ACCOUNT_TYPE PRIORITY LAST_UPDATE_DATE EFFDT MAX_UPDATE_DATE RN BANK_CD_prev ACCOUNT_NUM_prev
NEW ROW 56789 Test User 874556411 54765 C 999 2019-02-28 2019-02-28 2019-02-28 1 NULL NULL
OLD ROW 56789 Test User 874556411 98451 C 999 2017-10-09 2017-10-09 2017-10-09 1 874556411 54765
NEW ROW 56789 Sampl Test 756561623 46331 C 999 2018-03-12 2018-03-12 2018-03-12 1 NULL NULL
OLD ROW 56789 Test User 756561623 46331 C 999 2015-05-18 2015-05-18 2015-05-18 1 756561623 46331
请注意以上是 运行 查询 没有 ( AND BANK_CD <> BANK_CD_prev AND ACCOUNT_NUM <> ACCOUNT_NUM_prev
的最终条件只是为了让我明白为什么我遗漏了一些行。在这种情况下,您可以看到第一个 EMPLID
的两行 ACCOUNT_NUM 不同,因此我应该输出这两行。第二个 EMPLID
(最后两行)有点不同之处在于 BANK_CD
和 ACCOUNT_NUM
对于两行都是相同的,所以我不希望它被输出(它不是,所以这是正确的)但仍然显示 NULL
与第一组行相同。
3/20/19 编辑 2: 你可以看到 BANK_CD_COUNT
和 ACCOUNT_NUM_COUNT
中有一些我认为是造成的奇怪数字一个问题。
ROW_TYPE EMPLID FIRST_NAME LAST_NAME BANK_CD ACCOUNT_NUM ACCOUNT_TYPE PRIORITY LAST_UPDATE_DATE EFFDT MAX_UPDATE_DATE RN BANK_CD_COUNT ACCOUNT_NUM_COUNT
NEW ROW 812682 Test User 031308302 8675309 C 999 2019-03-09 2019-03-09 2019-03-09 1 1101 1
OLD ROW 812682 Test User 231379393 0001236408 C 999 2018-04-11 2018-03-03 2019-03-09 2 476 1
此 link 包含创建 Table 和插入脚本,供任何可以帮助复制此内容的人使用。
我只想将以下分析功能添加到您的 CTE 到 return 前身:
LAG(A.BANK_CD) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC) AS BANK_CD_prev,
LAG(A.ACCOUNT_NUM) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC) AS ACCOUNT_NUM_prev
然后您只需要在主查询中引用它就可以了
FROM CTE
WHERE RN IN (1, 2)
AND MAX_UPDATE_DATE >= GETDATE() - 8
AND BANK_CD <> BANK_CD_prev
AND ACCOUNT_NUM <> ACCOUNT_NUM_prev
如果解决了您的问题,请回复。由于您没有提供示例进行测试,因此很难判断。
编辑:我的意思是:在您的初始方法中使用该解决方案,根本不需要自连接。它不会帮助你解决你的问题。
一种方法是使用 LEAD
和 LAG
函数。我们在这里都需要。
这是您的原始查询,我按日期格式化并注释掉了过滤器。
WITH
CTE
AS
(
SELECT
A.EMPLID
,C.VENDOR_ID
,B.FIRST_NAME
,B.LAST_NAME
,A.BANK_CD
,A.ACCOUNT_NUM
,A.ACCOUNT_TYPE
,A.PRIORITY
,A.LAST_UPDATE_DATE
,A.EFFDT
,MAX(A.LAST_UPDATE_DATE) OVER (PARTITION BY A.EMPLID) AS MAX_UPDATE_DATE
,ROW_NUMBER() OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS RN
FROM
PS_DIRECT_DEPOSIT D
INNER JOIN PS_DIR_DEP_DISTRIB A
ON A.EMPLID = D.EMPLID
AND A.EFFDT = D.EFFDT
INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID
INNER JOIN PS_GHS_DIR_DEP_VND C ON C.EMPLID = A.EMPLID
INNER JOIN PS_DIR_DEP_DISTRIB E
ON E.EMPLID = A.EMPLID
AND E.EFFDT = A.EFFDT
AND E.PRIORITY = A.PRIORITY
WHERE
B.EMPL_STATUS NOT IN ('T','R','D')
AND
(
(
A.DEPOSIT_TYPE = 'P'
AND A.AMOUNT_PCT = 100
)
OR A.PRIORITY = 999
OR A.DEPOSIT_TYPE = 'B'
)
AND D.EFF_STATUS = 'A'
)
SELECT
CASE
WHEN RN = 1 THEN 'NEW ROW'
WHEN RN = 2 THEN 'OLD ROW'
END AS 'ROW_TYPE'
,*
FROM CTE
WHERE
RN IN (1,2)
--AND MAX_UPDATE_DATE >= GETDATE() - 8
;
此查询return根据您的示例数据得出以下结果:
+----------+---------+-----------+------------+-----------+----------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+
| ROW_TYPE | EMPLID | VENDOR_ID | FIRST_NAME | LAST_NAME | BANK_CD | ACCOUNT_NUM | ACCOUNT_TYPE | PRIORITY | LAST_UPDATE_DATE | EFFDT | MAX_UPDATE_DATE | RN |
+----------+---------+-----------+------------+-----------+----------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+
| NEW ROW | 097432 | 3471B | H | Adam | 09146115 | 13404891 | C | 999 | 2016-06-10 00:00:00.000 | 2016-06-10 00:00:00.000 | 2016-06-10 00:00:00.000 | 1 |
| OLD ROW | 097432 | 3471B | H | Adam | 09146115 | 13404891 | C | 999 | 2016-06-10 00:00:00.000 | 2014-11-05 00:00:00.000 | 2016-06-10 00:00:00.000 | 2 |
| NEW ROW | 100765 | 1272B | V | Milo | 2358414 | 040925 | S | 310 | 2014-03-05 00:00:00.000 | 2011-04-27 00:00:00.000 | 2014-03-05 00:00:00.000 | 1 |
| OLD ROW | 100765 | 1272B | V | Milo | 2358414 | 040925 | S | 300 | 2014-03-05 00:00:00.000 | 2001-08-23 00:00:00.000 | 2014-03-05 00:00:00.000 | 2 |
| NEW ROW | 1045632 | 3870A | V | Olo | 2345794 | 179410860 | C | 999 | 2018-08-16 00:00:00.000 | 2018-08-16 00:00:00.000 | 2018-08-16 00:00:00.000 | 1 |
| OLD ROW | 1045632 | 3870A | V | Olo | 2345794 | 179410860 | C | 310 | 2011-02-25 00:00:00.000 | 2011-02-25 00:00:00.000 | 2018-08-16 00:00:00.000 | 2 |
| NEW ROW | 21345 | 12345A | J | Smith | 0224547 | 59167824 | C | 999 | 2019-02-28 00:00:00.000 | 2019-02-28 00:00:00.000 | 2019-02-28 00:00:00.000 | 1 |
| OLD ROW | 21345 | 12345A | J | Smith | 0224547 | 591678 | C | 999 | 2017-11-08 00:00:00.000 | 2017-10-17 00:00:00.000 | 2019-02-28 00:00:00.000 | 2 |
| NEW ROW | 26110 | 1272B | S | Sams | 8208302 | 822328 | C | 999 | 2019-02-08 00:00:00.000 | 2019-02-08 00:00:00.000 | 2019-02-08 00:00:00.000 | 1 |
| OLD ROW | 26110 | 1272B | S | Sams | 8208302 | 822328 | C | 315 | 2014-03-05 00:00:00.000 | 2012-07-30 00:00:00.000 | 2019-02-08 00:00:00.000 | 2 |
+----------+---------+-----------+------------+-----------+----------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+
我把它包括在这里,所以你可以看到与下面的区别。
这是添加了 LEAD
和 LAG
函数的新查询(我们需要同时获取上一行和下一行),并在 WHERE
子句中添加了额外的过滤器。
LEAD
和 LAG
return 到达 window 末尾时为 NULL。例如,对于 window 的第一行,LAG
(上一个值)将 return NULL,因为还没有 "previous" 行。
因此,对于第一行 (rn=1),我们需要将其值与 "next".
进行比较对于第二行 (rn=2),我们需要将其值与 "prev".
进行比较最终查询
WITH
CTE
AS
(
SELECT
A.EMPLID
,C.VENDOR_ID
,B.FIRST_NAME
,B.LAST_NAME
,A.BANK_CD
,A.ACCOUNT_NUM
,A.ACCOUNT_TYPE
,A.PRIORITY
,A.LAST_UPDATE_DATE
,A.EFFDT
,MAX(A.LAST_UPDATE_DATE) OVER (PARTITION BY A.EMPLID) AS MAX_UPDATE_DATE
,ROW_NUMBER() OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS RN
,LAG(A.BANK_CD) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS BANK_CD_prev
,LEAD(A.BANK_CD) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS BANK_CD_next
,LAG(A.ACCOUNT_NUM) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS ACCOUNT_NUM_prev
,LEAD(A.ACCOUNT_NUM) OVER (PARTITION BY A.EMPLID ORDER BY A.EFFDT DESC, D.EFFDT DESC) AS ACCOUNT_NUM_next
FROM
PS_DIRECT_DEPOSIT D
INNER JOIN PS_DIR_DEP_DISTRIB A
ON A.EMPLID = D.EMPLID
AND A.EFFDT = D.EFFDT
INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID
INNER JOIN PS_GHS_DIR_DEP_VND C ON C.EMPLID = A.EMPLID
INNER JOIN PS_DIR_DEP_DISTRIB E
ON E.EMPLID = A.EMPLID
AND E.EFFDT = A.EFFDT
AND E.PRIORITY = A.PRIORITY
WHERE
B.EMPL_STATUS NOT IN ('T','R','D')
AND
(
(
A.DEPOSIT_TYPE = 'P'
AND A.AMOUNT_PCT = 100
)
OR A.PRIORITY = 999
OR A.DEPOSIT_TYPE = 'B'
)
AND D.EFF_STATUS = 'A'
)
SELECT
CASE
WHEN RN = 1 THEN 'NEW ROW'
WHEN RN = 2 THEN 'OLD ROW'
END AS 'ROW_TYPE'
,*
FROM CTE
WHERE
RN IN (1,2)
AND
(
(
rn = 1
AND BANK_CD <> BANK_CD_next
)
OR
(
rn = 1
AND ACCOUNT_NUM <> ACCOUNT_NUM_next
)
OR
(
rn = 2
AND BANK_CD <> BANK_CD_prev
)
OR
(
rn = 2
AND ACCOUNT_NUM <> ACCOUNT_NUM_prev
)
)
--AND MAX_UPDATE_DATE >= GETDATE() - 8
;
最终结果
+----------+--------+-----------+------------+-----------+---------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+--------------+--------------+------------------+------------------+
| ROW_TYPE | EMPLID | VENDOR_ID | FIRST_NAME | LAST_NAME | BANK_CD | ACCOUNT_NUM | ACCOUNT_TYPE | PRIORITY | LAST_UPDATE_DATE | EFFDT | MAX_UPDATE_DATE | RN | BANK_CD_prev | BANK_CD_next | ACCOUNT_NUM_prev | ACCOUNT_NUM_next |
+----------+--------+-----------+------------+-----------+---------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+--------------+--------------+------------------+------------------+
| NEW ROW | 21345 | 12345A | J | Smith | 0224547 | 59167824 | C | 999 | 2019-02-28 00:00:00.000 | 2019-02-28 00:00:00.000 | 2019-02-28 00:00:00.000 | 1 | NULL | 0224547 | NULL | 591678 |
| OLD ROW | 21345 | 12345A | J | Smith | 0224547 | 591678 | C | 999 | 2017-11-08 00:00:00.000 | 2017-10-17 00:00:00.000 | 2019-02-28 00:00:00.000 | 2 | 0224547 | NULL | 59167824 | NULL |
+----------+--------+-----------+------------+-----------+---------+-------------+--------------+----------+-------------------------+-------------------------+-------------------------+----+--------------+--------------+------------------+------------------+