MSSQL query: how to find the incorrect rows within each partition?
I need to find the incorrect rows according to the following logic.
The logic is:
If a child has this row (I will call it the first row):
| merit | fruit | vegetable |
| --------- | ----- | --------- |
| behaviour | apple | cucumber |
then any row for that child with merit = poem and fruit = apple must have vegetable = cucumber and nothing else (I will call this the second row):
| merit | fruit | vegetable |
| ----- | ----- | --------- |
| poem | apple | cucumber |
AND the second row must fall within 4 hours before or after the first row. A correct example:
| child_id | date | merit | fruit | vegetable |
| --------- | --------------- | --------- | ----- | --------- |
| 2 | 1/26/2022 16:00 | poem | apple | cucumber |
| 2 | 1/26/2022 18:00 | behaviour | apple | cucumber |
As you can see, the two rows fall within the 4-hour interval.
I have this table:
| child_id | date | merit | fruit | vegetable |
| --------- | --------------- | ----------- | ------- | --------- |
| 1 | 1/27/2022 14:00 | behaviour | apple | cucumber |
| 1 | 1/27/2022 15:00 | poem | apple | carrot |
| 1 | 1/27/2022 17:00 | sleep | apple | ginger |
| 1 | 1/27/2022 20:00 | competition | berry | tomatoe |
| 2 | 1/26/2022 13:00 | sleep | apricot | tomatoe |
| 2 | 1/30/2022 13:00 | poem | apple | cucumber |
| 2 | 1/29/2022 13:00 | poem | apple | cucumber |
| 2 | 1/26/2022 16:00 | poem | apple | cucumber |
| 2 | 1/26/2022 18:00 | behaviour | apple | cucumber |
| 2 | 1/26/2022 19:00 | present | apple | broccoli |
| 3 | 1/25/2022 11:00 | present | orange | cucumber |
| 3 | 1/25/2022 13:00 | poem | apple | ginger |
| 3 | 1/25/2022 15:00 | behaviour | apple | cucumber |
| 4 | 1/26/2022 14:00 | behaviour | apple | cucumber |
| 4 | 1/27/2022 21:00 | poem | apple | carrot |
| 4 | 1/27/2022 15:00 | poem | apple | carrot |
| 4 | 1/27/2022 20:00 | sleep | apple | ginger |
| 4 | 1/27/2022 21:00 | competition | berry | tomatoe |
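For anyone who wants to reproduce this, the sample data could be loaded with something like the script below. It is only a sketch: the column types and lengths are assumptions (the question does not state them), and the dates are written as ISO 8601 literals so they parse the same way regardless of language settings.

```sql
-- Hypothetical setup script; column types are assumed, not taken from the question.
CREATE TABLE example_1 (
    child_id  INT         NOT NULL,
    [date]    DATETIME    NOT NULL,
    merit     VARCHAR(20) NOT NULL,
    fruit     VARCHAR(20) NOT NULL,
    vegetable VARCHAR(20) NOT NULL
);

-- Same 18 rows as the table above, in the same order.
INSERT INTO example_1 (child_id, [date], merit, fruit, vegetable) VALUES
(1, '2022-01-27T14:00:00', 'behaviour',   'apple',   'cucumber'),
(1, '2022-01-27T15:00:00', 'poem',        'apple',   'carrot'),
(1, '2022-01-27T17:00:00', 'sleep',       'apple',   'ginger'),
(1, '2022-01-27T20:00:00', 'competition', 'berry',   'tomatoe'),
(2, '2022-01-26T13:00:00', 'sleep',       'apricot', 'tomatoe'),
(2, '2022-01-30T13:00:00', 'poem',        'apple',   'cucumber'),
(2, '2022-01-29T13:00:00', 'poem',        'apple',   'cucumber'),
(2, '2022-01-26T16:00:00', 'poem',        'apple',   'cucumber'),
(2, '2022-01-26T18:00:00', 'behaviour',   'apple',   'cucumber'),
(2, '2022-01-26T19:00:00', 'present',     'apple',   'broccoli'),
(3, '2022-01-25T11:00:00', 'present',     'orange',  'cucumber'),
(3, '2022-01-25T13:00:00', 'poem',        'apple',   'ginger'),
(3, '2022-01-25T15:00:00', 'behaviour',   'apple',   'cucumber'),
(4, '2022-01-26T14:00:00', 'behaviour',   'apple',   'cucumber'),
(4, '2022-01-27T21:00:00', 'poem',        'apple',   'carrot'),
(4, '2022-01-27T15:00:00', 'poem',        'apple',   'carrot'),
(4, '2022-01-27T20:00:00', 'sleep',       'apple',   'ginger'),
(4, '2022-01-27T21:00:00', 'competition', 'berry',   'tomatoe');
```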
The result I expect:
| child_id | date | merit | fruit | vegetable |
| --------- | --------------- | ----- | ----- | --------- |
| 1 | 1/27/2022 15:00 | poem | apple | carrot |
| 3 | 1/25/2022 13:00 | poem | apple | ginger |
I don't know how to find these rows per child. I wrote this SQL and got stuck:
select * from example_1 where merit in ('behaviour', 'poem')
Do I need partitioning here?
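One possible solution is to join the table to itself and keep only the 'poem' rows that pair up with a 'behaviour' row for the same child and fruit but carry a different vegetable:
SELECT e1.*
FROM example_1 e1
INNER JOIN example_1 e2
ON e1.child_id = e2.child_id
AND e1.fruit = e2.fruit
AND e1.vegetable <> e2.vegetable
-- the 'behaviour' row may be up to 4 hours before or after the 'poem' row
AND e2.date BETWEEN DATEADD(HOUR, -4, e1.date) AND DATEADD(HOUR, 4, e1.date)
AND e2.merit = 'behaviour'
WHERE e1.merit = 'poem'
The trick is mostly in the join condition: we look for a vegetable mismatch between the 'behaviour' and 'poem' rows of the same child, while also checking that the two rows are within 4 hours of each other. An inner join is used because a 'poem' row is only incorrect when such a mismatching 'behaviour' row actually exists.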
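In this approach we use a correlated subquery. The top-level query B defines the non-join restrictions on the desired result: vegetable <> 'cucumber' and merit = 'poem'.
EXISTS ensures that the first row's restriction is met and correlates it with the mismatching row: the fruit matches, merit is 'behaviour', child_id matches, and the dates differ by no more than 4 hours in either direction.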
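SELECT B.*
FROM example_1 B
-- B: candidate 'poem' rows whose vegetable is wrong
WHERE B.vegetable <> 'cucumber'
AND B.merit = 'poem'
-- A: a 'behaviour' row for the same child and fruit, close enough in time
AND EXISTS (SELECT 1
FROM example_1 A
WHERE A.fruit = B.fruit
AND A.child_id = B.child_id
AND A.merit = 'behaviour'
-- DATEDIFF(HOUR, ...) counts hour boundaries crossed; ABS covers both directions
AND ABS(DATEDIFF(HOUR, A.date, B.date)) <= 4)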
Which gives us:
+----------+-------------------------+-------+-------+-----------+
| child_id | date | merit | fruit | vegetable |
+----------+-------------------------+-------+-------+-----------+
| 1 | 2022-01-27 15:00:00.000 | poem | apple | carrot |
| 3 | 2022-01-25 13:00:00.000 | poem | apple | ginger |
+----------+-------------------------+-------+-------+-----------+
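One detail worth checking if the timestamps are not always on the hour: `DATEDIFF(HOUR, ...)` counts the hour boundaries crossed between two values, not the elapsed time, so two rows almost 5 hours apart can still pass a `<= 4` test. The sketch below illustrates this and shows a minute-based variant of the EXISTS query; it assumes the `example_1` table name from the question and that "within 4 hours" means at most 240 elapsed minutes.

```sql
-- 14:01 to 18:59 is 4 h 58 min apart, but only four hour boundaries are crossed.
SELECT
    DATEDIFF(HOUR,   '2022-01-27T14:01:00', '2022-01-27T18:59:00') AS hour_boundaries,  -- 4
    DATEDIFF(MINUTE, '2022-01-27T14:01:00', '2022-01-27T18:59:00') AS elapsed_minutes;  -- 298

-- Variant of the EXISTS query with the window measured in minutes
-- (assumption: "within 4 hours" means at most 240 minutes either way).
SELECT B.*
FROM example_1 B
WHERE B.vegetable <> 'cucumber'
  AND B.merit = 'poem'
  AND EXISTS (SELECT 1
              FROM example_1 A
              WHERE A.fruit = B.fruit
                AND A.child_id = B.child_id
                AND A.merit = 'behaviour'
                AND ABS(DATEDIFF(MINUTE, A.[date], B.[date])) <= 240);
```

On the sample data every timestamp is on the hour, so this variant selects the same two rows shown above.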