Query to retrieve rows where a value has changed
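I'm trying to revive some very rusty SQL skills, but I'm out of my depth! I have a table (example below) where, each week, I upload rows that track the progress of sales opportunities. Each opportunity has an indexed ID, while certain dimensions (such as "Stage" and "Total Cost") may change from week to week. Each file load also has a dateLoaded column in date format.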
dateLoaded | OppNumber | Stage | Total Cost |
---|---|---|---|
2022-04-25 | 12345 | 04 | 60.00 |
2022-04-25 | 23456 | 01 | 500.00 |
2022-04-25 | 34567 | 02 | 225.00 |
2022-04-25 | 45678 | 04 | 1750.00 |
2022-04-25 | 56789 | 06 | 50.00 |
2022-05-01 | 12345 | 04 | 100.00 |
2022-05-01 | 23456 | 01 | 500.00 |
2022-05-01 | 34567 | 02 | 275.00 |
2022-05-01 | 45678 | 04 | 2000.00 |
2022-05-01 | 56789 | 06 | 50.00 |
2022-05-07 | 12345 | 04 | 125.00 |
2022-05-07 | 23456 | 02 | 500.00 |
2022-05-07 | 34567 | 04 | 275.00 |
2022-05-07 | 56789 | 04 | 55.00 |
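(Note that OppNumber 45678 has been dropped from the 2022-05-07 file.)
The query I'm trying to write would look at the 2 most recent file loads, find where "Total Cost" has changed, and return both the original and the new values. Using the example table above, it would return (ordered by OppNumber, then by dateLoaded ASC):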
dateLoaded | OppNumber | Stage | Total Cost |
---|---|---|---|
2022-05-01 | 12345 | 04 | 100.00 |
2022-05-07 | 12345 | 04 | 125.00 |
2022-05-01 | 56789 | 06 | 50.00 |
2022-05-07 | 56789 | 04 | 55.00 |
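Ideally, I'd also love to know where a record exists in the second most recent load but not in the most recent one (in this example, that would return the record for 45678 from the 2022-05-01 file).
I ended up having to dump this data into Excel and create formulas, but I'm sure there's a way to write a query for this.
Thanks in advance for your advice!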
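If you have MySQL 8 or above, you can use window functions.
Example 1
The first CTE selects the rows whose OppNumber appears in the most recent load, dense-ranks them by load date, and uses LAG to get the previous Total_Cost. After that it's pretty simple.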
WITH CTE AS
(
  -- Keep only opportunities present in the most recent load, rank the loads
  -- by date, and fetch each opportunity's previous Total_Cost.
  SELECT *,
         DENSE_RANK() OVER (ORDER BY DATELOADED) DR,
         LAG(TOTAL_COST) OVER (PARTITION BY OPPNUMBER ORDER BY DATELOADED) PREV
  FROM T
  WHERE EXISTS (SELECT OPPNUMBER
                FROM T T1
                WHERE T1.OPPNUMBER = T.OPPNUMBER
                  AND T1.DATELOADED = (SELECT MAX(DATELOADED) FROM T))
),
CTE1 AS
(
  -- Rank of the most recent load.
  SELECT MAX(DR) MAXDR FROM CTE
)
-- Rows from the latest load whose cost differs from the previous value...
SELECT CTE.*
FROM CTE
JOIN CTE1 ON (CTE.DR = CTE1.MAXDR)
WHERE TOTAL_COST <> PREV
UNION ALL
-- ...plus the matching rows from the load before it.
SELECT CTE.*
FROM CTE
JOIN CTE1 ON (CTE.DR = CTE1.MAXDR - 1)
JOIN (SELECT CTE.*
      FROM CTE
      JOIN CTE1 ON (CTE.DR = CTE1.MAXDR)
      WHERE TOTAL_COST <> PREV) S ON S.OPPNUMBER = CTE.OPPNUMBER
ORDER BY OPPNUMBER, DATELOADED ASC;
+------------+-----------+-------+------------+----+--------+
| dateLoaded | OppNumber | Stage | Total_Cost | DR | PREV |
+------------+-----------+-------+------------+----+--------+
| 2022-05-01 | 12345 | 4 | 100.00 | 2 | 60.00 |
| 2022-05-07 | 12345 | 4 | 125.00 | 3 | 100.00 |
| 2022-05-01 | 56789 | 6 | 50.00 | 2 | 50.00 |
| 2022-05-07 | 56789 | 4 | 55.00 | 3 | 50.00 |
+------------+-----------+-------+------------+----+--------+
4 rows in set (0.028 sec)
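To make the first CTE's role concrete, here is a small sketch that prints its DR and PREV columns for the sample data before any filtering on changes. It rests on a few assumptions: it runs the CTE's SELECT through Python's built-in sqlite3 module (SQLite 3.25 or later, needed for window functions) rather than MySQL 8, the column types are guessed, and the helper names load_sample and cte_rows are made up for illustration.

```python
import sqlite3

# Sample rows from the question: (dateLoaded, OppNumber, Stage, Total_Cost).
ROWS = [
    ("2022-04-25", 12345, "04", 60.00), ("2022-04-25", 23456, "01", 500.00),
    ("2022-04-25", 34567, "02", 225.00), ("2022-04-25", 45678, "04", 1750.00),
    ("2022-04-25", 56789, "06", 50.00),
    ("2022-05-01", 12345, "04", 100.00), ("2022-05-01", 23456, "01", 500.00),
    ("2022-05-01", 34567, "02", 275.00), ("2022-05-01", 45678, "04", 2000.00),
    ("2022-05-01", 56789, "06", 50.00),
    ("2022-05-07", 12345, "04", 125.00), ("2022-05-07", 23456, "02", 500.00),
    ("2022-05-07", 34567, "04", 275.00), ("2022-05-07", 56789, "04", 55.00),
]

# Same logic as the first CTE in Example 1, with explicit aliases.
CTE_SQL = """
SELECT cur.dateLoaded, cur.OppNumber, cur.Total_Cost,
       DENSE_RANK() OVER (ORDER BY cur.dateLoaded) AS DR,
       LAG(cur.Total_Cost) OVER (PARTITION BY cur.OppNumber
                                 ORDER BY cur.dateLoaded) AS PREV
FROM T AS cur
WHERE EXISTS (SELECT 1
              FROM T AS latest
              WHERE latest.OppNumber = cur.OppNumber
                AND latest.dateLoaded = (SELECT MAX(dateLoaded) FROM T))
ORDER BY cur.OppNumber, cur.dateLoaded
"""


def load_sample():
    """Create an in-memory table T holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    # Column types are an assumption; the question does not give the schema.
    conn.execute(
        "CREATE TABLE T (dateLoaded TEXT, OppNumber INTEGER, "
        "Stage TEXT, Total_Cost REAL)"
    )
    conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?)", ROWS)
    return conn


def cte_rows(conn):
    """Return (dateLoaded, OppNumber, Total_Cost, DR, PREV) tuples."""
    return conn.execute(CTE_SQL).fetchall()


if __name__ == "__main__":
    for row in cte_rows(load_sample()):
        print(row)
```

Opportunity 45678 does not appear at all because it is missing from the latest load, and the first row of each opportunity has PREV = NULL (None in Python), so the comparison Total_Cost <> PREV never matches it.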
Example 2
Similar to Example 1, except that it looks at each opportunity's most recent change rather than at the most recent load.
WITH CTE AS
(
  -- Number each opportunity's rows by load date and fetch the previous Total_Cost.
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY OPPNUMBER ORDER BY DATELOADED) rn,
         LAG(TOTAL_COST) OVER (PARTITION BY OPPNUMBER ORDER BY DATELOADED) PREV
  FROM T
),
CTE1 AS
(
  -- For each opportunity, the row number of its most recent cost change.
  SELECT OPPNUMBER opn, MAX(rn) maxrn
  FROM CTE
  WHERE TOTAL_COST <> PREV
  GROUP BY OPPNUMBER
)
-- The row where the cost last changed...
SELECT * FROM CTE JOIN CTE1 ON CTE1.opn = CTE.OPPNUMBER AND CTE1.maxrn = CTE.rn
UNION ALL
-- ...and the row immediately before it.
SELECT * FROM CTE JOIN CTE1 ON CTE1.opn = CTE.OPPNUMBER AND CTE1.maxrn - 1 = CTE.rn
ORDER BY OPPNUMBER, DATELOADED DESC;
+------------+-----------+-------+------------+----+---------+-------+-------+
| dateLoaded | OppNumber | Stage | Total_Cost | rn | PREV | opn | maxrn |
+------------+-----------+-------+------------+----+---------+-------+-------+
| 2022-05-07 | 12345 | 4 | 125.00 | 3 | 100.00 | 12345 | 3 |
| 2022-05-01 | 12345 | 4 | 100.00 | 2 | 60.00 | 12345 | 3 |
| 2022-05-01 | 34567 | 2 | 275.00 | 2 | 225.00 | 34567 | 2 |
| 2022-04-25 | 34567 | 2 | 225.00 | 1 | NULL | 34567 | 2 |
| 2022-05-01 | 45678 | 4 | 2000.00 | 2 | 1750.00 | 45678 | 2 |
| 2022-04-25 | 45678 | 4 | 1750.00 | 1 | NULL | 45678 | 2 |
| 2022-05-07 | 56789 | 4 | 55.00 | 3 | 50.00 | 56789 | 3 |
| 2022-05-01 | 56789 | 6 | 50.00 | 2 | 50.00 | 56789 | 3 |
+------------+-----------+-------+------------+----+---------+-------+-------+
8 rows in set (0.004 sec)
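The question's last request (records present in the second most recent load but missing from the most recent one) is not isolated by either example above. One possible approach, sketched here rather than taken from the answer, ranks the distinct load dates and uses NOT EXISTS. The assumptions: the SQL is run through Python's built-in sqlite3 module (SQLite 3.25 or later) instead of MySQL 8, the column types are guessed, and the helper names load_sample and dropped_rows are hypothetical.

```python
import sqlite3

# Sample rows from the question: (dateLoaded, OppNumber, Stage, Total_Cost).
ROWS = [
    ("2022-04-25", 12345, "04", 60.00), ("2022-04-25", 23456, "01", 500.00),
    ("2022-04-25", 34567, "02", 225.00), ("2022-04-25", 45678, "04", 1750.00),
    ("2022-04-25", 56789, "06", 50.00),
    ("2022-05-01", 12345, "04", 100.00), ("2022-05-01", 23456, "01", 500.00),
    ("2022-05-01", 34567, "02", 275.00), ("2022-05-01", 45678, "04", 2000.00),
    ("2022-05-01", 56789, "06", 50.00),
    ("2022-05-07", 12345, "04", 125.00), ("2022-05-07", 23456, "02", 500.00),
    ("2022-05-07", 34567, "04", 275.00), ("2022-05-07", 56789, "04", 55.00),
]

DROPPED_SQL = """
WITH loads AS (
    -- Rank the distinct load dates: recency 1 = latest load, 2 = the one before.
    SELECT dateLoaded,
           DENSE_RANK() OVER (ORDER BY dateLoaded DESC) AS recency
    FROM (SELECT DISTINCT dateLoaded FROM T) AS d
)
SELECT p.dateLoaded, p.OppNumber, p.Stage, p.Total_Cost
FROM T AS p
JOIN loads AS lp ON lp.dateLoaded = p.dateLoaded AND lp.recency = 2
WHERE NOT EXISTS (
    -- Same opportunity in the latest load.
    SELECT 1
    FROM T AS c
    JOIN loads AS lc ON lc.dateLoaded = c.dateLoaded AND lc.recency = 1
    WHERE c.OppNumber = p.OppNumber
)
ORDER BY p.OppNumber
"""


def load_sample():
    """Create an in-memory table T holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    # Column types are an assumption; the question does not give the schema.
    conn.execute(
        "CREATE TABLE T (dateLoaded TEXT, OppNumber INTEGER, "
        "Stage TEXT, Total_Cost REAL)"
    )
    conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?)", ROWS)
    return conn


def dropped_rows(conn):
    """Rows from the previous load whose OppNumber is absent from the latest."""
    return conn.execute(DROPPED_SQL).fetchall()


if __name__ == "__main__":
    print(dropped_rows(load_sample()))
```

With the sample data this returns only the 45678 row from the 2022-05-01 load. The SQL sticks to constructs that MySQL 8 also supports (CTEs, DENSE_RANK, NOT EXISTS), but it has only been reasoned through against SQLite here.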