Ignoring Duplicate Records SQL

Need some help :)

I have a table of records with the following columns:

Key (PK, FK, int) DT (smalldatetime) Value (real)

DT is a datetime for every half hour of the day, with an associated value.

For example:

Key       DT                       VALUE
1000      2010-01-01 08:00:00      80
1000      2010-01-01 08:30:00      75
1000      2010-01-01 09:00:00      100

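I have a query that finds the max value and its associated time for each 24-hour period. However, on one day the max value occurs twice, so that date is duplicated, which causes processing problems. I've tried using ROW_NUMBER(), but I can't use a calculated column in my WHERE clause? Currently I have: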

SELECT       cast(T1.DT as date) as 'Date',Cast(T1.DT as time(0)) as 'HH', ROW_NUMBER() over (PARTITION BY  cast(DT as date) ORDER BY DT) AS 'RowNumber'
FROM        TABLE_1 AS T1
INNER JOIN  (
                SELECT CAST([DT] as date) as 'DATE'
                ,       MAX([VALUE]) as 'MAX_HH'
                FROM    TABLE_1
                WHERE   DT > '6-nov-2016' and [KEY] = '1000'
                GROUP BY CAST([DT] as date)
            ) AS MAX_DT
        ON  MAX_DT.[DATE] = CAST(T1.[DT] as date)
        AND T1.VALUE = MAX_DT.MAX_HH
WHERE       DT > '6-nov-2016' and [KEY] = '1000'
ORDER BY DT

This results in:

Key       DT               VALUE       HH
1000      2010-01-01       80          07:00:00
1000      2010-02-01       100         17:30:00
1000      2010-02-01       100         18:00:00

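I need to remove the duplicate date (I have no preference which HH is kept).

I think I've explained this quite badly, so let me know if it doesn't make sense and I'll try to rewrite it.

Any ideas?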

With SQL you can use SELECT DISTINCT.

The SELECT DISTINCT statement is used to return only distinct (different) values.

In a table, a column often contains many duplicate values; sometimes you only want to list the distinct values.
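As a rough illustration of how that would apply to the query in the question: DISTINCT compares whole rows, so the two rows for the duplicated day only collapse into one if the HH column is left out of the select list. The sketch below assumes the same TABLE_1 columns as the question and has not been run against real data.

```sql
-- Sketch only: the duplicated day differs in HH (17:30 vs 18:00),
-- so HH is omitted here to let DISTINCT merge the two rows.
SELECT DISTINCT
       CAST(T1.DT AS date) AS [Date],
       T1.[VALUE]
FROM   TABLE_1 AS T1
INNER JOIN (
           -- daily maximum, as in the question
           SELECT CAST(DT AS date) AS [DATE], MAX([VALUE]) AS MAX_HH
           FROM   TABLE_1
           WHERE  DT > '6-nov-2016' AND [KEY] = '1000'
           GROUP BY CAST(DT AS date)
       ) AS MAX_DT
       ON  MAX_DT.[DATE] = CAST(T1.DT AS date)
       AND T1.[VALUE] = MAX_DT.MAX_HH
WHERE  T1.DT > '6-nov-2016' AND T1.[KEY] = '1000'
ORDER BY [Date];
```

If the time of the maximum is still needed, DISTINCT alone is not enough, because the rows are not identical.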

Can you try this new code? The changes are marked with comments:

SELECT      cast(T1.DT as date) as 'Date'
,           MIN(Cast(T1.DT as time(0))) as 'HH'   -- changed: MIN() keeps one time per date
FROM        TABLE_1 AS T1
INNER JOIN  (
                SELECT CAST([DT] as date) as 'DATE'
                ,       MAX([VALUE]) as 'MAX_HH'
                FROM    TABLE_1
                WHERE   DT > '6-nov-2016' and [KEY] = '1000'
                GROUP BY CAST([DT] as date)
            ) AS MAX_DT
        ON  MAX_DT.[DATE] = CAST(T1.[DT] as date)
        AND T1.VALUE = MAX_DT.MAX_HH
WHERE       DT > '6-nov-2016' and [KEY] = '1000'
GROUP BY    cast(T1.DT as date)                   -- added: group by here
ORDER BY    cast(T1.DT as date)

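I would do something like this. I haven't tried it, but I think it's right.

SELECT  cast(T1.DT as date) as 'Date', Cast(T1.DT as time(0)) as 'HH', T1.[VALUE]
FROM    TABLE_1 AS T1
WHERE   T1.[KEY] = '1000'
  AND   T1.[DT] IN (
            -- select the latest DT on each day at which that day's max value occurs
            SELECT MAX(T2.[DT]) AS max_date
            FROM   TABLE_1 AS T2
            -- T-SQL has no (a, b) IN (...) row comparison, so the
            -- (date, value) pair is matched with a join instead
            INNER JOIN (
                SELECT CAST([DT] as date) as 'CAST_DATE'
                ,      MAX([VALUE]) as 'MAX_HH'
                FROM   TABLE_1
                WHERE  DT > '6-nov-2016' and [KEY] = '1000'
                GROUP BY CAST([DT] as date)
            ) AS M
                ON  M.CAST_DATE = CAST(T2.[DT] as date)
                AND M.MAX_HH = T2.[VALUE]
            WHERE  T2.DT > '6-nov-2016' and T2.[KEY] = '1000'
            GROUP BY CAST(T2.[DT] as date)
        )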

Change the JOIN to an APPLY.

The APPLY operation lets you limit the joined relation to just one result per source row.

SELECT v.[Key], cast(v.DT As Date) as "Date", v.[Value], cast(v.DT as Time(0)) as "HH"
FROM
(   -- First a projection to get just the exact dates you want
    SELECT DISTINCT [Key], CAST(DT as DATE) as DT 
    FROM Table_1 
    WHERE [Key] = '1000' AND DT > '20161106'
) dates
CROSS APPLY (
    -- Then use APPLY rather than JOIN to find just the exact one record you need for each date
    SELECT TOP 1 * 
    FROM Table_1 
    WHERE [Key] = dates.[Key] AND cast(DT as DATE) = dates.DT ORDER BY [Value] DESC
) v
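
For comparison, the ROW_NUMBER() attempt from the question can probably be made to work in a similar spirit: a window function cannot be referenced in the WHERE clause of the same SELECT, but it can be filtered once it sits inside a CTE or derived table. A minimal sketch, assuming the same TABLE_1 columns; the names Ranked and rn are made up for this illustration:

```sql
WITH Ranked AS (
    SELECT [KEY], DT, [VALUE],
           -- rn = 1 marks the highest value of each day;
           -- ties are broken by the earliest time
           ROW_NUMBER() OVER (
               PARTITION BY CAST(DT AS date)
               ORDER BY [VALUE] DESC, DT
           ) AS rn
    FROM   TABLE_1
    WHERE  DT > '6-nov-2016' AND [KEY] = '1000'
)
SELECT [KEY],
       CAST(DT AS date)    AS [Date],
       [VALUE],
       CAST(DT AS time(0)) AS HH
FROM   Ranked
WHERE  rn = 1          -- allowed here because rn is a column of the CTE
ORDER BY [Date];
```

Since there is no preference for which HH is kept, the second ORDER BY term only serves to make the result deterministic.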

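One final note: both the APPLY query above and the sample query in the question will include values for Nov 6, 2016. The queries say > 2016-11-06 with an exclusive inequality, but they are still comparing full DateTime values, which means there is an implied 0 as the time component. Therefore, 12:01 AM on Nov 6 is still greater than 12:00:00.000 AM on Nov 6. If you want to exclude all Nov 6 dates from the query, you need to either change the comparison to use a time value at the end of the date, or cast to date before making the > comparison.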
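
As a small sketch of that last point, assuming the same TABLE_1 columns: the first query casts to date before comparing, and the second is a variant of the "end of the date" idea that compares against the start of the following day instead. Either should leave out every Nov 6 row.

```sql
-- Option 1: compare on the date part only
SELECT [KEY], DT, [VALUE]
FROM   TABLE_1
WHERE  [KEY] = '1000'
  AND  CAST(DT AS date) > '20161106';

-- Option 2: compare against midnight at the start of Nov 7
SELECT [KEY], DT, [VALUE]
FROM   TABLE_1
WHERE  [KEY] = '1000'
  AND  DT >= '20161107';
```

The second form leaves the DT column unwrapped in the predicate, which can matter if an index on DT is expected to be used.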