How to find the value with the date closest to another date

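I work in healthcare and need to produce a report that shows patients' lab values at different time points. The time points are as follows:

Pre-transplant:

1 year = 365 days +/- 30 days

3 months = 90 days +/- 14 days

1 month = 30 days +/- 7 days

Post-transplant:

1 day = 24 hours +/- 12 hours

1 week = 7 days +/- 1 day

1 month = 30 days +/- 7 days

3 months = 90 days +/- 14 days

6 months = 180 days +/- 30 days

1 year = 365 days +/- 30 days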

My data model has many tables (the results of queries against a SQL Server), but the main labs table looks like this:

+----------+------------+-----------------+------------+-----------+
| Order ID | Episode ID | Transplant Date | Lab Date   | Lab Value |
+----------+------------+-----------------+------------+-----------+
| 111      | 222        | 5/2/2018        | 1/22/2018  | 23        |
| 112      | 222        | 5/2/2018        | 1/27/2018  | 15        |
| 113      | 222        | 5/2/2018        | 5/3/2018   | 14        |
| 114      | 222        | 5/2/2018        | 10/19/2018 | 12        |
| 115      | 223        | 1/23/2019       | 1/24/2019  | 20        |
| 116      | 223        | 1/23/2019       | 1/25/2019  | 25        |
| 117      | 223        | 1/23/2019       | 1/31/2019  | 29        |
| 118      | 223        | 1/23/2019       | 4/23/2019  | 30        |
| 119      | 223        | 1/23/2019       | 3/1/2019   | 35        |
| 120      | 224        | 7/19/2019       | 7/19/2018  | 5         |
| 121      | 224        | 7/19/2019       | 7/24/2018  | 13        |
+----------+------------+-----------------+------------+-----------+

Order ID is the unique identifier of a lab, Episode ID is the unique identifier of a patient, and we are looking for labs relative to the Transplant Date.

There is another table, the patient data, which looks like this:

+------------+----------------+-----------------+
| Episode ID | Patient Name   | Transplant Date |
+------------+----------------+-----------------+
| 222        | Alphers, Ralph | 5/2/2018        |
| 223        | Bethe, Hans    | 1/23/2019       |
| 224        | Gammow, George | 7/19/2019       |
+------------+----------------+-----------------+

The resulting data should look similar to this:

+------------+------------+--------------+-------------+------------+-------------+--------------+---------------+-------------+
| Episode ID | 1 year pre | 3 months pre | 1 month pre | 1 day post | 1 week post | 1 month post | 6 months post | 1 year post |
+------------+------------+--------------+-------------+------------+-------------+--------------+---------------+-------------+
| 222        |            | 15           |             | 14         |             |              | 12            |             |
| 223        |            |              |             | 20         | 29          | 35           |               |             |
| 224        | 5          |              |             |            |             |              |               |             |
+------------+------------+--------------+-------------+------------+-------------+--------------+---------------+-------------+

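Considering processing time (user experience) and development complexity, is there a best approach?

Right now, this is how I do it.

First, I use Power Query (M) to create the time points (e.g. Table.AddColumn(#"Changed Type", "Minutes to One Year Before Transplant", each Number.Abs(Duration.TotalMinutes(([Lab Date] - DateTime.From(Date.AddYears([Transplant Date], -1))))))). Then I use DAX to find the smallest distance, in minutes, between a lab and the target date: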

Labs shortest minutes to one year before transplant = 
VAR EpisodeID = Patients[Episode ID]
-- Target is one year before the transplant, matching the column name
VAR TargetDate = EDATE(Patients[Transplant Date], -12)
VAR WindowDays = 30
RETURN
CALCULATE(
    MIN(Labs[Minutes to One Year Before Transplant]),
    FILTER(Labs, Labs[Episode ID] = EpisodeID),
    -- Dates are numeric in DAX, so adding or subtracting days is plain arithmetic
    FILTER(Labs, Labs[Lab Date] >= TargetDate - WindowDays),
    FILTER(Labs, Labs[Lab Date] <= TargetDate + WindowDays)
)

Then I use that number of minutes as an identifier to get the Order ID:

Lab Order ID closest to one year before transplant = 
VAR EpisodeID = Patients[Episode ID]
VAR TargetDate = EDATE(Patients[Transplant Date], -12)
VAR WindowDays = 30
VAR MinutesFrom = Patients[Labs shortest minutes to one year before transplant]
RETURN
CALCULATE(
    MIN(Labs[Order ID]),
    FILTER(Labs, Labs[Episode ID] = EpisodeID),
    FILTER(Labs, Labs[Lab Date] >= TargetDate - WindowDays),
    FILTER(Labs, Labs[Lab Date] <= TargetDate + WindowDays),
    -- Keep only the lab whose distance equals the shortest distance found earlier
    FILTER(Labs, Labs[Minutes to One Year Before Transplant] = MinutesFrom)
)

Finally, I can use that Order ID to get anything I want from that lab, such as the value:

Lab Value closest to one year before transplant = 
VAR EpisodeID = Patients[Episode ID]
VAR OrderID = Patients[Lab Order ID closest to one year before transplant]
RETURN
CALCULATE(
    -- Order ID is unique, so MIN simply returns that one lab's value
    MIN(Labs[Lab Value]),
    FILTER(Labs, Labs[Episode ID] = EpisodeID),
    FILTER(Labs, Labs[Order ID] = OrderID)
)

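I also need to do this for 3 different labs, which means repeating this process about 30 times, and the resulting report takes a while to calculate. I could push much of the work back to the SQL Server, but maybe that isn't the best idea?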

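The simplest way I can think of is to create a calculated column for each time period and then use those columns directly in whatever measures you want. For example, for 1 year pre: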

1 Year Pre = IF('Table'[Lab Date]>='Table'[Transplant Date]-395 && 'Table'[Lab Date]<='Table'[Transplant Date]-335,'Table'[LabValue],BLANK())

For 3 months pre:

3 Months Pre = IF('Table'[Lab Date]>='Table'[Transplant Date]-104 && 'Table'[Lab Date]<='Table'[Transplant Date]-76,'Table'[LabValue],BLANK())

Similarly, you can create calculated columns for the other time periods and use them to build the visuals you want. Hope this helps.
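One thing the window test alone does not settle is which lab to report when a window contains more than one (Episode 222 has two labs in the 3-months-pre window). As a language-neutral illustration, here is a minimal Python sketch of the "closest lab within the window" rule from the question. The function name `closest_in_window` and the tuple layout are assumptions made for this example, not part of the Power BI model.

```python
from datetime import date

def closest_in_window(labs, transplant, target_days, tol_days):
    """Return the value of the lab closest to transplant + target_days,
    or None if no lab lies within +/- tol_days of that target.

    labs: iterable of (lab_date, lab_value) tuples (hypothetical layout).
    target_days: signed offset in days (negative means pre-transplant).
    """
    best = None  # (distance_in_days, value) of the best candidate so far
    for lab_date, value in labs:
        offset = (lab_date - transplant).days
        distance = abs(offset - target_days)
        if distance <= tol_days and (best is None or distance < best[0]):
            best = (distance, value)
    return None if best is None else best[1]

# Episode 222 from the question's sample data
transplant_222 = date(2018, 5, 2)
labs_222 = [
    (date(2018, 1, 22), 23),   # 100 days before the transplant
    (date(2018, 1, 27), 15),   # 95 days before
    (date(2018, 5, 3), 14),    # 1 day after
    (date(2018, 10, 19), 12),  # 170 days after
]

three_months_pre = closest_in_window(labs_222, transplant_222, -90, 14)
one_year_pre = closest_in_window(labs_222, transplant_222, -365, 30)
six_months_post = closest_in_window(labs_222, transplant_222, 180, 30)
```

With the sample rows, the 3-months-pre slot gets 15 (the lab 95 days before), the 1-year-pre slot stays empty, and the 6-months-post slot gets 12. When two labs are equally distant, this sketch keeps whichever comes first in the input, which is a choice you would want to make explicit.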

All of your code is M, so I'm not sure why you tagged this with SQL. But here is a [probably not the most elegant] SQL solution:

create table labs (
    OrderID int not null,
    EpisodeID int not null,
    TransplantDate date not null,
    LabDate date not null,
    LabValue int not null)

insert labs
values 
(111, 222, cast('5/2/2018'  as date), cast('1/22/2018'  as date), 23),
(112, 222, cast('5/2/2018'  as date), cast('1/27/2018'  as date), 15),
(113, 222, cast('5/2/2018'  as date), cast('5/3/2018'   as date), 14),
(114, 222, cast('5/2/2018'  as date), cast('10/19/2018' as date), 12),
(115, 223, cast('1/23/2019' as date), cast('1/24/2019'  as date), 20),
(116, 223, cast('1/23/2019' as date), cast('1/25/2019'  as date), 25),
(117, 223, cast('1/23/2019' as date), cast('1/31/2019'  as date), 29),
(118, 223, cast('1/23/2019' as date), cast('4/23/2019'  as date), 30),
(119, 223, cast('1/23/2019' as date), cast('3/1/2019'   as date), 35),
(120, 224, cast('7/19/2019' as date), cast('7/19/2018'  as date),  5),
(121, 224, cast('7/19/2019' as date), cast('7/24/2018'  as date), 13)

create table patient (
    EpisodeID int not null,
    PatientName varchar(128) not null,
    TransplantDate date not null
)

insert patient
values
(222, 'Alphers, Ralph', cast('5/2/2018'  as date)),
(223, 'Bethe, Hans',    cast('1/23/2019' as date)),
(224, 'Gammow, George', cast('7/19/2019' as date))


select q.EpisodeID
, min(q.[1YrPre]  ) as '1YrPre'
, min(q.[3MoPre]  ) as '3MoPre'
, min(q.[1MoPre]  ) as '1MoPre'
, min(q.[1DayPost]) as '1DayPost'
, min(q.[1WkPost] ) as '1WkPost'
, min(q.[1MoPost] ) as '1MoPost'
, min(q.[3MoPost] ) as '3MoPost'
, min(q.[6MoPost] ) as '6MoPost'
, min(q.[1YrPost] ) as '1YrPost'

from (
    select r.OrderID
    , r.EpisodeID
    , case when r.[1YrPreCheck]   = m.[1YrPreCheck]   and m.[1YrPreCheck]   <= 30 then r.LabValue end as '1YrPre'
    , case when r.[3MoPreCheck]   = m.[3MoPreCheck]   and m.[3MoPreCheck]   <= 14 then r.LabValue end as '3MoPre'
    , case when r.[1MoPreCheck]   = m.[1MoPreCheck]   and m.[1MoPreCheck]   <=  7 then r.LabValue end as '1MoPre'
    , case when r.[1DayPostCheck] = m.[1DayPostCheck] and m.[1DayPostCheck] <=  1 then r.LabValue end as '1DayPost'
    , case when r.[1WkPostCheck]  = m.[1WkPostCheck]  and m.[1WkPostCheck]  <=  1 then r.LabValue end as '1WkPost'
    , case when r.[1MoPostCheck]  = m.[1MoPostCheck]  and m.[1MoPostCheck]  <=  7 then r.LabValue end as '1MoPost'
    , case when r.[3MoPostCheck]  = m.[3MoPostCheck]  and m.[3MoPostCheck]  <= 14 then r.LabValue end as '3MoPost'
    , case when r.[6MoPostCheck]  = m.[6MoPostCheck]  and m.[6MoPostCheck]  <= 30 then r.LabValue end as '6MoPost'
    , case when r.[1YrPostCheck]  = m.[1YrPostCheck]  and m.[1YrPostCheck]  <= 30 then r.LabValue end as '1YrPost'

    from (
        select p.EpisodeID
        , min(abs(datediff(day, l.LabDate, dateadd(year,  -1, p.TransplantDate)))) as '1YrPreCheck'
        , min(abs(datediff(day, l.LabDate, dateadd(month, -3, p.TransplantDate)))) as '3MoPreCheck'
        , min(abs(datediff(day, l.LabDate, dateadd(month, -1, p.TransplantDate)))) as '1MoPreCheck'
        , min(abs(datediff(day, l.LabDate, dateadd(day,    1, p.TransplantDate)))) as '1DayPostCheck'
        , min(abs(datediff(day, l.LabDate, dateadd(day,    7, p.TransplantDate)))) as '1WkPostCheck'
        , min(abs(datediff(day, l.LabDate, dateadd(month,  1, p.TransplantDate)))) as '1MoPostCheck'
        , min(abs(datediff(day, l.LabDate, dateadd(month,  3, p.TransplantDate)))) as '3MoPostCheck'
        , min(abs(datediff(day, l.LabDate, dateadd(month,  6, p.TransplantDate)))) as '6MoPostCheck'
        , min(abs(datediff(day, l.LabDate, dateadd(year,   1, p.TransplantDate)))) as '1YrPostCheck'

        from labs l
          inner join patient p on p.EpisodeID = l.EpisodeID

        group by p.EpisodeID
    ) m
      inner join (
        select l.OrderID
        , p.EpisodeID
        , l.LabValue
        , abs(datediff(day, l.LabDate, dateadd(year,  -1, p.TransplantDate))) as '1YrPreCheck'
        , abs(datediff(day, l.LabDate, dateadd(month, -3, p.TransplantDate))) as '3MoPreCheck'
        , abs(datediff(day, l.LabDate, dateadd(month, -1, p.TransplantDate))) as '1MoPreCheck'
        , abs(datediff(day, l.LabDate, dateadd(day,    1, p.TransplantDate))) as '1DayPostCheck'
        , abs(datediff(day, l.LabDate, dateadd(day,    7, p.TransplantDate))) as '1WkPostCheck'
        , abs(datediff(day, l.LabDate, dateadd(month,  1, p.TransplantDate))) as '1MoPostCheck'
        , abs(datediff(day, l.LabDate, dateadd(month,  3, p.TransplantDate))) as '3MoPostCheck'
        , abs(datediff(day, l.LabDate, dateadd(month,  6, p.TransplantDate))) as '6MoPostCheck'
        , abs(datediff(day, l.LabDate, dateadd(year,   1, p.TransplantDate))) as '1YrPostCheck'

        from labs l
      inner join patient p on p.EpisodeID = l.EpisodeID
    ) r on r.EpisodeID = m.EpisodeID
)q 

group by q.EpisodeID

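I would have put this in a comment, but I need more reputation points to comment. Maybe a moderator can move this for me.

First,

1 - You need to decide what to do when a lab does not fall into any of the categories above. For example, what do you do if the lab date is at 6 months? Where do you want to report the 6-month lab? In the example above, you lose some of the data for EpisodeID 222. In my experience, you should report it somewhere, even if it is a catch-all bucket that needs to be investigated.

2 - You need to decide what to do when you have 2 reports in the same time period. With EpisodeID 222, you will see that there are 2 labs in the 90-day pre period. January 22 and January 27 both fall within that period.

3 - You have similar data in two tables. TransplantDate should only be in your patient table.

Your best option is a simple pivot (crosstab) query. If you can better define your data by answering 1 and 2 above, you will be much further along in getting this done.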
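Points 1 and 2 can be checked before any pivoting. The following Python sketch, run on the sample rows from the question, lists the labs that fall into no window and the windows that hold more than one lab for the same episode. The `WINDOWS` list, the function name `audit_labs`, and the 0.5-day tolerance used to express "24 hours +/- 12 hours" on date-only values are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date

# (label, target offset in days, tolerance in days), from the question's list
WINDOWS = [
    ("1 year pre", -365, 30), ("3 months pre", -90, 14), ("1 month pre", -30, 7),
    ("1 day post", 1, 0.5), ("1 week post", 7, 1), ("1 month post", 30, 7),
    ("3 months post", 90, 14), ("6 months post", 180, 30), ("1 year post", 365, 30),
]

def audit_labs(rows):
    """rows: (order_id, episode_id, transplant_date, lab_date) tuples.

    Returns a pair: the order IDs that match no window, and a dict
    {(episode_id, window): [order IDs]} for windows holding several labs.
    """
    unassigned, hits = [], defaultdict(list)
    for order_id, episode_id, transplant, lab_date in rows:
        offset = (lab_date - transplant).days
        labels = [name for name, target, tol in WINDOWS
                  if abs(offset - target) <= tol]
        if not labels:
            unassigned.append(order_id)
        for name in labels:
            hits[(episode_id, name)].append(order_id)
    duplicates = {key: ids for key, ids in hits.items() if len(ids) > 1}
    return unassigned, duplicates

rows = [
    (111, 222, date(2018, 5, 2), date(2018, 1, 22)),
    (112, 222, date(2018, 5, 2), date(2018, 1, 27)),
    (113, 222, date(2018, 5, 2), date(2018, 5, 3)),
    (114, 222, date(2018, 5, 2), date(2018, 10, 19)),
    (115, 223, date(2019, 1, 23), date(2019, 1, 24)),
    (116, 223, date(2019, 1, 23), date(2019, 1, 25)),
    (117, 223, date(2019, 1, 23), date(2019, 1, 31)),
    (118, 223, date(2019, 1, 23), date(2019, 4, 23)),
    (119, 223, date(2019, 1, 23), date(2019, 3, 1)),
    (120, 224, date(2019, 7, 19), date(2018, 7, 19)),
    (121, 224, date(2019, 7, 19), date(2018, 7, 24)),
]
unassigned, duplicates = audit_labs(rows)
```

On the sample data this reports order 116 (2 days after the transplant) as unassigned, and flags two labs in the 3-months-pre window for episode 222 and two in the 1-year-pre window for episode 224.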

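In response to the replies to my previous answer about the data, I am adding a different answer.

I would make a table with the buckets that the dates fall into. That way, if someone asks for a different bucket, it is simple to add.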

CREATE TABLE [dbo].[table_Buckets](
    [Bucket] [varchar](50) NULL,
    [NumDaysLow] [int] NULL,
    [NumDaysHigh] [int] NULL
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF
GO
INSERT [dbo].[table_Buckets] ([Bucket], [NumDaysLow], [NumDaysHigh]) VALUES (N'Pre-1Yr', -395, -335)
GO
INSERT [dbo].[table_Buckets] ([Bucket], [NumDaysLow], [NumDaysHigh]) VALUES (N'Pre-3Mth', -104, -76)
GO
INSERT [dbo].[table_Buckets] ([Bucket], [NumDaysLow], [NumDaysHigh]) VALUES (N'Pre-1Mth', -37, -23)
GO
INSERT [dbo].[table_Buckets] ([Bucket], [NumDaysLow], [NumDaysHigh]) VALUES (N'Post-1Day', 0, 2)
GO
INSERT [dbo].[table_Buckets] ([Bucket], [NumDaysLow], [NumDaysHigh]) VALUES (N'Post-1Wk', 6, 8)
GO
INSERT [dbo].[table_Buckets] ([Bucket], [NumDaysLow], [NumDaysHigh]) VALUES (N'Post-1Mth', 23, 37)
GO
INSERT [dbo].[table_Buckets] ([Bucket], [NumDaysLow], [NumDaysHigh]) VALUES (N'Post-3Mth', 76, 104)
GO
INSERT [dbo].[table_Buckets] ([Bucket], [NumDaysLow], [NumDaysHigh]) VALUES (N'Post-6Mth', 150, 210)
GO
INSERT [dbo].[table_Buckets] ([Bucket], [NumDaysLow], [NumDaysHigh]) VALUES (N'Post-1Yr', 335, 395)
GO

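Now you can run the following SQL query. It fetches the data, assigns each lab to a bucket based on its date relative to the transplant, takes the minimum value per bucket, and then pivots the table into the view you want. You will have to design your data around this structure.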

select
   EpisodeID
  ,[Pre-1Yr]
  ,[Pre-3Mth]
  ,[Pre-1Mth]
  ,[Post-1Day]
  ,[Post-1Wk]
  ,[Post-1Mth]
  ,[Post-3Mth]
  ,[Post-6Mth]
  ,[Post-1Yr]

from 
(
  --this select statement takes the lowest value if there are more than one value per bucket
  select main.EpisodeID, main.Bucket, min(main.LabValue) as LabValue from
    (--this select statement assigns the episode to a buckets 
     select
        ml.EpisodeID
        , (select Bucket from
                table_Buckets
            where 
                    NumDaysLow  <= datediff(d,pd.TransplantDate, ml.LabDate)
                and NumDaysHigh >= datediff(d,pd.TransplantDate, ml.LabDate)
            ) AS Bucket
        , ml.LabValue as LabValue

    from 
        table_MainLab ml
        inner join table_PatientData pd on ml.EpisodeID = pd.EpisodeID
    ) main
  group by main.EpisodeID, main.Bucket) s


pivot
(avg(LabValue)
for [Bucket] in
  ([Pre-1Yr]
  ,[Pre-3Mth]
  ,[Pre-1Mth]
  ,[Post-1Day]
  ,[Post-1Wk]
  ,[Post-1Mth]
  ,[Post-3Mth]
  ,[Post-6Mth]
  ,[Post-1Yr])
 ) as pivottable
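To sanity-check the bucket logic outside SQL Server, here is a small Python sketch of the same three steps: assign each lab to a bucket by its day offset from the transplant, keep the minimum value per episode and bucket, and pivot into one row per episode. The bounds in `BUCKETS` follow the windows listed in the question, and the names `BUCKETS` and `pivot_min_by_bucket` are hypothetical.

```python
from datetime import date

# (bucket, NumDaysLow, NumDaysHigh): inclusive day offsets from the transplant
BUCKETS = [
    ("Pre-1Yr", -395, -335), ("Pre-3Mth", -104, -76), ("Pre-1Mth", -37, -23),
    ("Post-1Day", 0, 2), ("Post-1Wk", 6, 8), ("Post-1Mth", 23, 37),
    ("Post-3Mth", 76, 104), ("Post-6Mth", 150, 210), ("Post-1Yr", 335, 395),
]

def pivot_min_by_bucket(labs, transplant_dates):
    """labs: (episode_id, lab_date, lab_value) tuples.
    transplant_dates: {episode_id: transplant date}.
    Returns {episode_id: {bucket: minimum lab value in that bucket}}.
    """
    result = {}
    for episode_id, lab_date, value in labs:
        offset = (lab_date - transplant_dates[episode_id]).days
        for bucket, low, high in BUCKETS:
            if low <= offset <= high:
                row = result.setdefault(episode_id, {})
                # Keep the lowest value when a bucket holds several labs
                row[bucket] = min(value, row.get(bucket, value))
    return result

transplant_dates = {
    222: date(2018, 5, 2), 223: date(2019, 1, 23), 224: date(2019, 7, 19),
}
labs = [
    (222, date(2018, 1, 22), 23), (222, date(2018, 1, 27), 15),
    (222, date(2018, 5, 3), 14), (222, date(2018, 10, 19), 12),
    (223, date(2019, 1, 24), 20), (223, date(2019, 1, 25), 25),
    (223, date(2019, 1, 31), 29), (223, date(2019, 4, 23), 30),
    (223, date(2019, 3, 1), 35),
    (224, date(2018, 7, 19), 5), (224, date(2018, 7, 24), 13),
]
table = pivot_min_by_bucket(labs, transplant_dates)
```

Taking the minimum value per bucket is not the same rule as taking the lab closest to the target date. The two happen to agree on the sample data, but they can differ in general, so it is worth deciding which one the report should use.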

I'm new to posting, and I haven't figured out how to put the output in this post yet :( ... I'll practice.