Find the most recent date in a result set
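I am working on a query where I need to look at patient vitals (specifically blood pressure) entered when a patient visited the clinic. I am pulling results for the whole of 2015, and of course some patients visited several times; I only need the vitals entered at the most recent visit. Another wrinkle is that systolic and diastolic pressure are entered separately, so I end up with results like this: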
Patient ID   Name         DOB        Test             Results   Date
---------------------------------------------------------------------------------
1000         John Smith   1/1/1955   BP - Diastolic   120       2/10/2015
1000         John Smith   1/1/1955   BP - Systolic    70        2/10/2015
1000         John Smith   1/1/1955   BP - Diastolic   128       7/12/2015
1000         John Smith   1/1/1955   BP - Systolic    75        7/12/2015
1000         John Smith   1/1/1955   BP - Diastolic   130       10/22/2015
1000         John Smith   1/1/1955   BP - Systolic    76        10/22/2015
9999         Jane Doe     5/4/1970   BP - Diastolic   130       4/2/2015
9999         Jane Doe     5/4/1970   BP - Systolic    60        4/2/2015
9999         Jane Doe     5/4/1970   BP - Diastolic   127       11/20/2015
9999         Jane Doe     5/4/1970   BP - Systolic    65        11/20/2015
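There are more than 26,000 results, so obviously I don't want to go through every patient and check when their most recent result was. I would like my results to look like this: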
Patient ID   Name         DOB        Test             Results   Date
---------------------------------------------------------------------------------
1000         John Smith   1/1/1955   BP - Diastolic   130       10/22/2015
1000         John Smith   1/1/1955   BP - Systolic    76        10/22/2015
9999         Jane Doe     5/4/1970   BP - Diastolic   127       11/20/2015
9999         Jane Doe     5/4/1970   BP - Systolic    65        11/20/2015
I know the name, date of birth, and so on will repeat, but I am mainly interested in the Results column.
Here is my query:
SELECT DISTINCT
pd.PatientID as [Patient ID],
pd.PatientName as Name,
pd.DateOfBirth as DOB,
v.Test as Test,
v.Results as Results,
v.TestDate as Date
FROM PatientDemographic pd JOIN Vitals v ON pd.PatientID = v.PatientID
WHERE v.TestDate BETWEEN '01/01/2015' AND '12/31/2015'
AND v.Test LIKE 'BP%'
ORDER BY pd.PatientID, v.TestDate
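After looking through other answers, I tried using GROUP BY together with the MAX() aggregate function on the v.TestDate column in the SELECT statement (I specifically referred to this link, although it is for Oracle and I am using SQL Server, so I am not entirely sure the syntax is the same). My query looked like this: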
SELECT DISTINCT
pd.PatientID as [Patient ID],
pd.PatientName as Name,
pd.DateOfBirth as DOB,
v.Test as Test,
v.Results as Results,
MAX(v.TestDate) as Date
FROM PatientDemographic pd JOIN Vitals v ON pd.PatientID = v.PatientID
WHERE v.TestDate BETWEEN '01/01/2015' AND '12/31/2015'
AND v.Test LIKE 'BP%'
GROUP BY pd.PatientID
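Admittedly, I always struggle a bit with GROUP BY. In this particular case, I got an error message saying that I also needed to add the Patient Name column to the GROUP BY clause, so I did, and then it asked for DOB. Then the test name. Basically, it wanted me to add everything in the SELECT statement to the GROUP BY.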
What is the best way to get the most recent patient visit?
One simple way is to use ROW_NUMBER() to find the most recent record for each test:
SELECT pd.PatientID as [Patient ID], pd.PatientName as Name, pd.DateOfBirth as DOB,
v.Test as Test, v.Results as Results, v.TestDate as Date
FROM PatientDemographic pd JOIN
(SELECT v.*,
ROW_NUMBER() OVER (PARTITION BY PatientId, Test ORDER BY TestDate DESC) as seqnum
FROM Vitals v
WHERE v.TestDate BETWEEN '2015-01-01' AND '2015-12-31' AND
v.Test LIKE 'BP%'
) v
ON pd.PatientID = v.PatientID
WHERE seqnum = 1
ORDER BY pd.PatientID, v.TestDate;
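If "most recent visit" should mean each patient's single latest date, rather than the latest date for each test separately, a variation on the query above could rank by date within each patient only. The following is just a sketch: it reuses the table and column names from the question, the alias visit_rank is made up for illustration, and it assumes both readings of a visit carry exactly the same TestDate value. DENSE_RANK() is used so that every row on the latest date gets rank 1, and the half-open date range is there in case TestDate has a time component.

```sql
-- Sketch: keep every BP row from each patient's latest 2015 visit date.
-- Assumes both readings of a visit share the same TestDate value.
SELECT pd.PatientID AS [Patient ID], pd.PatientName AS Name, pd.DateOfBirth AS DOB,
       v.Test, v.Results, v.TestDate AS [Date]
FROM PatientDemographic pd
JOIN (SELECT v.PatientID, v.Test, v.Results, v.TestDate,
             -- Rank dates per patient only, so both tests on the latest date get 1.
             DENSE_RANK() OVER (PARTITION BY v.PatientID
                                ORDER BY v.TestDate DESC) AS visit_rank
      FROM Vitals v
      WHERE v.TestDate >= '20150101' AND v.TestDate < '20160101'
        AND v.Test LIKE 'BP%'
     ) v
  ON pd.PatientID = v.PatientID
WHERE v.visit_rank = 1
ORDER BY pd.PatientID, v.Test;
```

If the two readings can be stored with slightly different times, this would split them into different ranks, so ranking on the date part only would probably be needed in that case.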
I shy away from the window functions Gordon uses. A technique using subqueries also gets the job done:
SELECT
ID
,Name
,DOB
,Test
,Results
,[Date]
FROM
Vitals AS V
WHERE
V.[Date] = (SELECT MAX([Date]) FROM Vitals W WHERE W.Name = V.Name AND W.Test = 'A')
AND V.Test = 'A'
UNION
SELECT
ID
,Name
,DOB
,Test
,Results
,[Date]
FROM
Vitals AS V
WHERE
V.[Date] = (SELECT MAX([Date]) FROM Vitals W WHERE W.Name = V.Name AND W.Test = 'B')
AND V.Test = 'B'
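The query above uses a simplified table and placeholder test names 'A' and 'B'. As a rough sketch of the same subquery idea against the tables from the question, the two UNION branches could be folded into one query by correlating on the test name as well. The column names are taken from the question; nothing here has been run against the real schema.

```sql
-- Sketch: latest 2015 row per patient and test via one correlated subquery.
SELECT pd.PatientID AS [Patient ID], pd.PatientName AS Name, pd.DateOfBirth AS DOB,
       v.Test, v.Results, v.TestDate AS [Date]
FROM PatientDemographic pd
JOIN Vitals v ON pd.PatientID = v.PatientID
WHERE v.Test LIKE 'BP%'
  -- The subquery restricts to 2015, so the outer row must also fall in 2015.
  AND v.TestDate = (SELECT MAX(w.TestDate)
                    FROM Vitals w
                    WHERE w.PatientID = v.PatientID
                      AND w.Test = v.Test
                      AND w.TestDate >= '20150101'
                      AND w.TestDate < '20160101')
ORDER BY pd.PatientID, v.Test;
```

Correlating on PatientID instead of Name avoids mixing up two patients who happen to share a name.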
This works on MS SQL 2005+:
SELECT * FROM (
SELECT row_number() over(partition by pd.PatientID, v.Test order by v.TestDate desc) as rn,
pd.PatientID as [Patient ID],
pd.PatientName as Name,
pd.DateOfBirth as DOB,
v.Test as Test,
v.Results as Results,
v.TestDate as Date
FROM PatientDemographic pd
JOIN Vitals v ON pd.PatientID = v.PatientID
WHERE v.TestDate BETWEEN '01/01/2015' AND '12/31/2015'
AND v.Test LIKE 'BP%') t
WHERE rn = 1
Window functions can be less efficient than a NOT EXISTS clause. I'd like to propose a solution that avoids window functions and may run faster:
SELECT
pd.PatientID as [Patient ID],
pd.PatientName as Name,
pd.DateOfBirth as DOB,
v.Test as Test,
v.Results as Results,
v.TestDate as Date
FROM PatientDemographic pd JOIN Vitals v ON pd.PatientID = v.PatientID
WHERE
v.TestDate BETWEEN '01/01/2015' AND '12/31/2015'
AND v.Test LIKE 'BP%'
AND NOT EXISTS (
SELECT 1 FROM Vitals as v2 where v2.PatientID = v.PatientID
AND V2.TestDate BETWEEN '01/01/2015' AND '12/31/2015'
AND v2.Test LIKE 'BP%'
AND v2.TestDate > v.TestDate)
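Whether NOT EXISTS actually beats ROW_NUMBER() here probably depends on the indexes available on Vitals, so it seems worth measuring both rather than assuming. The sketch below shows one way that might be done: the index name is hypothetical, the key columns simply mirror the correlation on PatientID and the comparison on TestDate, and the table may already have a suitable index.

```sql
-- Hypothetical supporting index for the NOT EXISTS lookup (name is made up).
CREATE NONCLUSTERED INDEX IX_Vitals_PatientID_TestDate
    ON Vitals (PatientID, TestDate)
    INCLUDE (Test, Results);

-- Report logical reads and elapsed time for each query run in this session,
-- so the NOT EXISTS and ROW_NUMBER() versions can be compared side by side.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
```

Note that the NOT EXISTS query correlates on PatientID only, so it returns the BP rows from each patient's latest date overall, not the latest row per test.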
You can also use a common table expression (CTE) to achieve this.
IF OBJECT_ID('tempdb..#RecentPatientVitals') IS NOT NULL
DROP TABLE #RecentPatientVitals;
GO
CREATE TABLE #RecentPatientVitals
(
Patient_ID INT
, Name VARCHAR(100)
, DOB DATE
, Test VARCHAR(150)
, Results INT
, [Date] DATE
);
INSERT INTO #RecentPatientVitals
( Patient_ID, Name, DOB, Test, Results, [Date] )
VALUES ( 1000, 'John Smith', '1/1/1955', 'BP - Diastolic', 120, '2/10/2015' )
, ( 1000, 'John Smith', '1/1/1955', 'BP - Systolic', 70, '2/10/2015' )
, ( 1000, 'John Smith', '1/1/1955', 'BP - Diastolic', 128, '7/12/2015' )
, ( 1000, 'John Smith', '1/1/1955', 'BP - Systolic', 75, '7/12/2015' )
, ( 1000, 'John Smith', '1/1/1955', 'BP - Diastolic', 130, '10/22/2015' )
, ( 1000, 'John Smith', '1/1/1955', 'BP - Systolic', 76, '10/22/2015' )
, ( 9999, 'Jane Doe', '5/4/1970', 'BP - Diastolic', 130, '4/2/2015' )
, ( 9999, 'Jane Doe', '5/4/1970', 'BP - Systolic', 60, '4/2/2015' )
, ( 9999, 'Jane Doe', '5/4/1970', 'BP - Diastolic', 127, '11/20/2015' )
, ( 9999, 'Jane Doe', '5/4/1970', 'BP - Systolic', 65, '11/20/2015' );
SELECT *
FROM #RecentPatientVitals;
WITH PatVitals1
AS ( SELECT Patient_ID
, Name
, DOB
, Test
, MAX(Date) AS Date
FROM #RecentPatientVitals
GROUP BY Patient_ID
, Name
, DOB
, Test
) ,
PatVitals2
AS ( SELECT Patient_ID
, Test
, Results
, Date
FROM #RecentPatientVitals
)
SELECT P1.Patient_ID
, P1.Name
, P1.DOB
, P1.Test
, P2.Results
, P1.Date
FROM PatVitals1 P1
INNER JOIN PatVitals2 P2
ON P2.Patient_ID = P1.Patient_ID
AND P2.Date = P1.Date
AND P2.Test = P1.Test
GROUP BY P1.Patient_ID
, P1.Name
, P1.DOB
, P1.Test
, P2.Results
, P1.Date;
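The same aggregate-then-join idea can presumably be applied directly to the tables from the question, without the temp table. This sketch assumes the question's table and column names; the aliases latest and LastTestDate are made up for illustration. It also shows why the original GROUP BY attempt struggled: Results has to stay out of the grouping and be picked up by joining back on the latest date.

```sql
-- Sketch: aggregate to the latest 2015 date per patient and test,
-- then join back to Vitals to pick up the matching Results value.
SELECT pd.PatientID AS [Patient ID], pd.PatientName AS Name, pd.DateOfBirth AS DOB,
       v.Test, v.Results, v.TestDate AS [Date]
FROM PatientDemographic pd
JOIN Vitals v ON pd.PatientID = v.PatientID
JOIN (SELECT PatientID, Test, MAX(TestDate) AS LastTestDate
      FROM Vitals
      WHERE TestDate >= '20150101' AND TestDate < '20160101'
        AND Test LIKE 'BP%'
      GROUP BY PatientID, Test
     ) latest
  ON latest.PatientID = v.PatientID
 AND latest.Test = v.Test
 AND latest.LastTestDate = v.TestDate
ORDER BY pd.PatientID, v.Test;
```

If a patient has two rows for the same test with the same latest TestDate, both rows come back, which may or may not be what is wanted.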