Equivalent date function in Oracle for pd.merge_asof

Is there a date function in Oracle equivalent to Python's pd.merge_asof? Consider the example below:

Table A1

ID      Date
1       12/02/2020
2       11/23/2019
3       09/09/2021
3       10/12/2021

Table A2

ID      Date
3       09/12/2021

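For ID = 3, the date in table A2 is 09/12/2021. When I try to match this ID and date against A1, only the ID matches. So I am trying to add logic that produces the output below (because 09/09/2021 is the nearest date).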

Output

ID       Date        ID2      Date2
3     09/09/2021      3     09/12/2021
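
For comparison, pandas' merge_asof takes a direction argument (backward by default, forward, or nearest), and the output above corresponds to nearest. The sketch below mimics the three directions in plain Python on the sample data. asof_match is a hypothetical helper, not a pandas or Oracle function, and its tie-breaking rule (the earlier date wins) is an assumption rather than a statement about pandas.

```python
from datetime import date

# Sample rows from tables A1 and A2 above, as (id, date) tuples.
A1 = [
    (1, date(2020, 12, 2)),
    (2, date(2019, 11, 23)),
    (3, date(2021, 9, 9)),
    (3, date(2021, 10, 12)),
]
A2 = [(3, date(2021, 9, 12))]


def asof_match(left, right, direction="nearest"):
    """Hypothetical helper: for each left row, pick one right row with the same id.

    backward: latest right date <= left date
    forward:  earliest right date >= left date
    nearest:  smallest absolute difference (the earlier date wins a tie here)
    """
    result = []
    for left_id, left_date in left:
        candidates = sorted(d for i, d in right if i == left_id)
        if direction == "backward":
            candidates = [d for d in candidates if d <= left_date][-1:]
        elif direction == "forward":
            candidates = [d for d in candidates if d >= left_date][:1]
        elif direction == "nearest":
            candidates = sorted(candidates, key=lambda d: abs(d - left_date))[:1]
        else:
            raise ValueError(f"unknown direction: {direction}")
        # None marks "no match" for a left row without any candidate.
        match = candidates[0] if candidates else None
        result.append((left_id, match, left_date))
    return result


for direction in ("backward", "forward", "nearest"):
    print(direction, asof_match(A2, A1, direction))
```

On the sample data, backward and nearest both pick 2021-09-09, while forward picks 2021-10-12.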

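There is no such function (at least, as far as I know), so one option is to rank the differences between the dates and take the top-ranked row. Something like this: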

with
  -- sample data
  t1 (id, datum) as
    (select 1, date '2020-12-02' from dual union all
     select 2, date '2019-11-23' from dual union all
     select 3, date '2021-09-09' from dual union all
     select 3, date '2021-10-12' from dual
    ),
  t2 (id, datum) as
    (select 3, date '2021-09-12' from dual),
  --
  temp as
    -- rank the date differences in ascending order, separately for each t2 row
    (select b.id b_id, b.datum b_datum, a.id a_id, a.datum a_datum,
       rank() over (partition by b.id, b.datum order by abs(b.datum - a.datum) asc) rnk
     from t1 a join t2 b on a.id = b.id
    )
-- the row you want is the one ranked first
select a_id, a_datum,
       b_id, b_datum
from temp
where rnk = 1;

      A_ID A_DATUM          B_ID B_DATUM
---------- ---------- ---------- ----------
         3 09/09/2021          3 09/12/2021
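
One caveat with rank(): rows that tie on the date difference share the same rank, so filtering on rnk = 1 can return more than one row per match, whereas row_number() with a tie-breaking sort key returns exactly one. The sketch below emulates both behaviours in plain Python. The equidistant dates are made up for illustration (they are not from the question), and the helper names are hypothetical.

```python
from datetime import date

# Made-up data: both candidate dates are exactly 3 days from the target.
candidates = [date(2021, 9, 9), date(2021, 9, 15)]
target = date(2021, 9, 12)


def rank_first(dates, target):
    """Emulate RANK() ordered by ABS(diff) with WHERE rnk = 1: all ties survive."""
    best = min(abs(d - target) for d in dates)
    return [d for d in dates if abs(d - target) == best]


def row_number_first(dates, target):
    """Emulate ROW_NUMBER() ordered by ABS(diff), then date, with WHERE rn = 1."""
    return min(dates, key=lambda d: (abs(d - target), d))


print(rank_first(candidates, target))        # both dates are kept
print(row_number_first(candidates, target))  # only the earlier date is kept
```

In Oracle, the single-row variant would order by abs(b.datum - a.datum), a.datum inside row_number() over (...), assuming the earlier date is the preferred tie-breaker.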

You can do it like this (join + aggregation):

with
  t1 (id, datum) as (
    select 1, date '2020-12-02' from dual union all
    select 2, date '2019-11-23' from dual union all
    select 3, date '2021-09-09' from dual union all
    select 3, date '2021-10-12' from dual
  )
, t2 (id, datum) as (
    select 3, date '2021-09-12' from dual
  )
select t2.id, 
       min(t1.datum) keep (dense_rank first
                 order by abs(t1.datum - t2.datum)) as date_1,
       t2.datum as date_2
from   t2 left outer join t1 on t1.id = t2.id
group  by t2.id, t2.datum
;

        ID DATE_1     DATE_2    
---------- ---------- ----------
         3 09-09-2021 12-09-2021

I included id only once in the output; since you are joining on id, showing it twice would be pointless.

From Oracle 12, you can use a LATERAL join with FETCH FIRST ROW ONLY:

SELECT a1.*,
       a2."DATE" AS date2
FROM   a2
       CROSS JOIN LATERAL (
         SELECT a1.*
         FROM   a1
         WHERE  a1.id = a2.id
         ORDER BY ABS(a1."DATE" - a2."DATE")
         FETCH FIRST ROW ONLY
       ) a1

Which, for the sample data:

CREATE TABLE A1 (ID, "DATE") AS
SELECT 1, DATE '2020-12-02' FROM DUAL UNION ALL
SELECT 2, DATE '2019-11-23' FROM DUAL UNION ALL
SELECT 3, DATE '2021-09-09' FROM DUAL UNION ALL
SELECT 3, DATE '2021-10-12' FROM DUAL;

CREATE TABLE A2 (ID, "DATE") AS
SELECT 3, DATE '2021-09-12' FROM DUAL;

Outputs:

ID DATE                DATE2
 3 2021-09-09 00:00:00 2021-09-12 00:00:00

If A2 has multiple rows for the same ID, each of them is matched separately to its closest row.

For example, after:

INSERT INTO a2 (id, "DATE") VALUES (3, DATE '2025-01-01');

Then the query above outputs:

ID DATE                DATE2
 3 2021-09-09 00:00:00 2021-09-12 00:00:00
 3 2021-10-12 00:00:00 2025-01-01 00:00:00
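
The per-row behaviour of the lateral join can be sketched in plain Python: for each A2 row, filter A1 by id, sort by absolute date difference, and keep the first row. lateral_nearest is a hypothetical name used only for illustration. Note that, like CROSS JOIN LATERAL, the sketch drops A2 rows that have no matching id in A1.

```python
from datetime import date

# Sample data from the CREATE TABLE statements, plus the extra A2 row inserted above.
a1 = [
    (1, date(2020, 12, 2)),
    (2, date(2019, 11, 23)),
    (3, date(2021, 9, 9)),
    (3, date(2021, 10, 12)),
]
a2 = [(3, date(2021, 9, 12)), (3, date(2025, 1, 1))]


def lateral_nearest(a2_rows, a1_rows):
    """Hypothetical helper mirroring the lateral subquery, one A2 row at a time."""
    out = []
    for a2_id, a2_date in a2_rows:
        # WHERE a1.id = a2.id ORDER BY ABS(a1."DATE" - a2."DATE")
        matches = sorted(
            (d for i, d in a1_rows if i == a2_id),
            key=lambda d: abs(d - a2_date),
        )
        # FETCH FIRST ROW ONLY; an A2 row with no candidates yields no output row.
        if matches:
            out.append((a2_id, matches[0], a2_date))
    return out


for row in lateral_nearest(a2, a1):
    print(row)
```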
