Find most recent record in a subquery (SQL Server)

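I'm converting some code from Oracle to SQL Server (2012) and have run into a problem with this subquery, which uses PARTITION/ORDER BY to retrieve the most recent record. The subquery runs fine on its own, but because it is used as a subquery, I get this error: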

SQL Server Database Error: The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.

Here is the relevant part of the SQL:

FROM (
  SELECT distinct enr.MemberNum,
    (ISNULL(enr.MemberFirstName, '') + ' ' + ISNULL(enr.MemberLastName, '')) AS MEMBER_NAME,
    enr.MemberBirthDate as DOB,
    enr.MemberGender as Gender,
    LAST_VALUE(enr.MemberCurrentAge) OVER (PARTITION BY MemberNum ORDER BY StaticDate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS AGE,
    LAST_VALUE(enr.EligStateAidCategory)OVER (PARTITION BY MemberNum ORDER BY StaticDate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS EligStateAidCategory,
    LAST_VALUE(enr.EligStateAidCategory)OVER (PARTITION BY MemberNum ORDER BY StaticDate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS AID_CAT_ROLL_UP,
    LAST_VALUE(enr.EligFinanceAidCategoryRollup)OVER (PARTITION BY MemberNum ORDER BY StaticDate ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS EligFinanceAidCategoryRollup,
    SUM(enr.MemberMonth) OVER (PARTITION BY MemberNum) AS TOTAL_MEMBER_MONTHS
  FROM dv_Enrollment enr
  WHERE enr.StaticDate BETWEEN '01-JUN-2016' AND '30-JUN-2016'
)A

So I looked around and found that you can use the TOP (2147483647) hack, so I tried changing the first line to:

SELECT distinct TOP (2147483647) enr.MemberNum,

But I still get the same error. Every other approach I can think of also needs an ORDER BY (DENSE_RANK and so on).

In both databases, I would write this as:

FROM (SELECT enr.MemberNum,
             (ISNULL(enr.MemberFirstName, '') + ' ' + ISNULL(enr.MemberLastName, '')) AS MEMBER_NAME,
             enr.MemberBirthDate as DOB,
             enr.MemberGender as Gender,
             MAX(CASE WHEN seqnum = 1 THEN enr.MemberCurrentAge END) AS AGE,
             MAX(CASE WHEN seqnum = 1 THEN enr.EligStateAidCategory END) AS EligStateAidCategory,
             MAX(CASE WHEN seqnum = 1 THEN enr.EligStateAidCategory END) AS AID_CAT_ROLL_UP,
             MAX(CASE WHEN seqnum = 1 THEN enr.EligFinanceAidCategoryRollup END) AS EligFinanceAidCategoryRollup,
            SUM(enr.MemberMonth) as TOTAL_MEMBER_MONTHS
    FROM (SELECT enr.*,
                 ROW_NUMBER() OVER (PARTITION BY MemberNum ORDER BY StaticDate DESC) as seqnum
          FROM dv_Enrollment enr
          -- Filter before numbering so that seqnum = 1 is the latest row inside the date range
          WHERE enr.StaticDate >= DATE '2016-06-01' AND  -- omit the DATE keyword in SQL Server
                enr.StaticDate < DATE '2016-07-01'       -- omit the DATE keyword in SQL Server
         ) enr
    GROUP BY enr.MemberNum, enr.MemberFirstName, enr.MemberLastName,
             enr.MemberBirthDate, enr.MemberGender
   ) A

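Why the changes?

  • The change to the dates is just to be careful about the time component of the dates. Using BETWEEN with date/times is a bad habit, because it can silently exclude rows and produce errors that are hard to debug.
  • I just don't like using SELECT DISTINCT to stand in for GROUP BY. Using it with window functions is clever (and it is required with LAST_VALUE()), but I think the code ends up being misleading.
  • I find that the subquery with seqnum makes it clear that the four "last value" columns all pull their data from the last row.
  • Also, if the sort is not stable (i.e., the keys are not unique), seqnum guarantees that the values all come from the same row. LAST_VALUE() does not.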
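To make the first point concrete, here is a minimal sketch of how BETWEEN can drop rows when the column carries a time component. It assumes StaticDate is a datetime (the question does not say), and the table variable @Enrollment with its three rows is hypothetical sample data standing in for dv_Enrollment.

```sql
-- Hypothetical sample rows; @Enrollment stands in for dv_Enrollment.
-- Assumption: StaticDate is a datetime that can hold a time of day.
DECLARE @Enrollment TABLE (MemberNum int, StaticDate datetime);

INSERT INTO @Enrollment (MemberNum, StaticDate)
VALUES (1, '20160615 00:00:00'),
       (1, '20160630 14:30:00'),  -- last day of June, but after midnight
       (1, '20160701 00:00:00');  -- first instant of July

-- '20160630' is read as 2016-06-30 00:00:00, so the 14:30 row falls
-- outside the BETWEEN range: only the June 15 row is counted.
SELECT COUNT(*) AS between_count
FROM @Enrollment
WHERE StaticDate BETWEEN '20160601' AND '20160630';

-- The half-open range covers every moment of June and nothing from July:
-- the June 15 and June 30 rows are both counted.
SELECT COUNT(*) AS half_open_count
FROM @Enrollment
WHERE StaticDate >= '20160601'
  AND StaticDate <  '20160701';
```

If StaticDate is a plain date with no time part, both predicates select the same rows, and the half-open form is simply the safer habit.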

Switch it to an aggregating subquery plus cross apply() and see what happens.

select 
    e.MemberNum
  , e.MemberName
  , e.DOB
  , e.Gender
  , x.MemberCurrentAge
  , x.EligStateAidCategory
  , x.EligFinanceAidCategoryRollup
  , x.MemberMonth
  , e.Total_Member_Months
from (
    select
        enr.MemberNum
      , MemberName = isnull(enr.MemberFirstName+' ', '') + isnull(enr.MemberLastName, '')
      , DOB    = enr.MemberBirthDate
      , Gender = enr.MemberGender
      /* This sounds like a weird thing to sum */
      , Total_Member_Months = sum(enr.MemberMonth) 
    from dv_Enrollment enr
    group by 
        enr.MemberNum
      , isnull(enr.MemberFirstName+' ', '') + isnull(enr.MemberLastName, '')
      , enr.MemberBirthDate
      , enr.MemberGender
    ) as e
  /* cross apply() is like an inner join
   , use outer apply() for something like a left join */
  cross apply (
    select top 1
        i.MemberCurrentAge
      , i.EligStateAidCategory
      , i.EligFinanceAidCategoryRollup
      , i.MemberMonth
      from dv_Enrollment as i
      where i.MemberNum = e.MemberNum
        and i.StaticDate >= '20160601'
        and i.StaticDate <= '20160630'
      order by i.StaticDate desc -- descending for most recent
    ) as x
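The comment about cross apply() versus outer apply() can be illustrated with a small sketch. The table variables @Members and @Enrollment and their rows are hypothetical sample data, not the real dv_Enrollment schema; member 2 deliberately has no rows in June 2016.

```sql
-- Hypothetical sample data: member 2 has no enrollment rows in June 2016.
DECLARE @Members TABLE (MemberNum int);
DECLARE @Enrollment TABLE (MemberNum int, StaticDate date, MemberCurrentAge int);

INSERT INTO @Members (MemberNum)
VALUES (1), (2);

INSERT INTO @Enrollment (MemberNum, StaticDate, MemberCurrentAge)
VALUES (1, '20160605', 40),
       (1, '20160620', 41),
       (2, '20160510', 30);

-- cross apply behaves like an inner join: member 2 is dropped because
-- the correlated subquery returns no row for it.
SELECT m.MemberNum, x.StaticDate, x.MemberCurrentAge
FROM @Members AS m
CROSS APPLY (
    SELECT TOP (1) i.StaticDate, i.MemberCurrentAge
    FROM @Enrollment AS i
    WHERE i.MemberNum = m.MemberNum
      AND i.StaticDate >= '20160601'
      AND i.StaticDate <  '20160701'
    ORDER BY i.StaticDate DESC   -- allowed here because TOP is specified
) AS x;

-- outer apply behaves like a left join: member 2 is kept, with NULLs
-- in the columns that come from the subquery.
SELECT m.MemberNum, x.StaticDate, x.MemberCurrentAge
FROM @Members AS m
OUTER APPLY (
    SELECT TOP (1) i.StaticDate, i.MemberCurrentAge
    FROM @Enrollment AS i
    WHERE i.MemberNum = m.MemberNum
      AND i.StaticDate >= '20160601'
      AND i.StaticDate <  '20160701'
    ORDER BY i.StaticDate DESC
) AS x;
```

In both queries member 1 gets its June 20 row. Which form to use depends on whether members without a qualifying row should still appear in the result.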