Oracle: Calculate count() over the past 6-month interval for each row
I have the following data (it runs from 2017 to the present):
SELECT * FROM TABLE1 WHERE DATE > TO_DATE('01/01/2019','MM/DD/YYYY')
Emp_ID Date Vehicle_ID Working_Hours
1005 01/01/2019 X500 7
1005 01/02/2019 X500 6
1005 01/03/2019 X700 7
1005 01/04/2019 X500 5
1005 01/05/2019 X700 7
1005 01/06/2019 X500 7
1006 01/01/2019 X500 7
1006 01/02/2019 X500 6
1006 01/03/2019 X700 7
1006 01/04/2019 X500 5
1006 01/05/2019 X700 7
1006 01/06/2019 X500 7
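I need to compute two columns.
LAST_6M_UNIQ_Vehicle_Count ==> count of the distinct vehicle IDs for that employee over the past 6 months
LAST_6M_Vehicle_Count ==> count of all vehicle IDs for that employee over the past 6 months
Note: the past 6 months are measured back from the value in the Date column of each row
Expected output: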
Emp_ID Date Vehicle_ID Working_Hours LAST_6M_UNIQ_Vehicle_Count LAST_6M_Vehicle_Count
1005 01/01/2019 X500 7 6 66
1005 01/02/2019 X500 6 7 62
1005 01/03/2019 X700 7 6 63
1005 01/04/2019 X500 5 7 67
1005 01/05/2019 X700 7 7 66
1005 01/06/2019 X500 7 7 67
. . . .
. . . .
. . . .
1005 03/20/2019 X600 6 12 75
1006 01/01/2019 X500 7 11 74
1006 01/02/2019 X500 6 10 66
1006 01/03/2019 X700 7 11 72
1006 01/04/2019 X500 5 13 67
1006 01/05/2019 X700 7 12 64
1006 01/06/2019 X500 7 12 63
For example, in the first row LAST_6M_UNIQ_Vehicle_Count is 6 because employee ID 1005 has 6 distinct vehicle IDs between ((01/01/2019) - 6 months) and 01/01/2019.
I tried OVER and PARTITION BY, but the 6-month interval is missing:
SELECT t.*, COUNT(DISTINCT t.VEHICLE_ID) OVER (PARTITION BY t.EMP_ID ORDER BY t.DATE)
AS LAST_6M_UNIQ_Vehicle_Count
FROM TABLE1 t
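I am unable to compute the value over a 6-month interval for each row.
Any help is much appreciated.
You can do this with window functions and a range frame specification.
Computing the distinct count is a little tricky: Oracle does not support it directly, but we can do it in two steps. First perform a window count within employee/vehicle partitions, then consider only the first occurrence of each vehicle within the employee partition.
So: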
select vehicle_id, emp_id, "DATE",
       sum(case when flag = 1 then 1 else 0 end) over (
         partition by emp_id
         order by "DATE"
         range between interval '6' month preceding and current row
       ) as last_6m_uniq_vehicle_count,
       count(*) over (
         partition by emp_id
         order by "DATE"
         range between interval '6' month preceding and current row
       ) as last_6m_vehicle_count
from (
  select t.*,
         count(*) over (
           partition by emp_id, vehicle_id
           order by "DATE"
           range between interval '6' month preceding and current row
         ) as flag
  from table_name t
) t
order by "DATE", vehicle_id
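Oracle does not like COUNT( DISTINCT ... ) OVER ( ... ): used in an analytic function with a range window, it raises ORA-30487: ORDER BY not allowed here (otherwise, that would be the solution). COUNT works with such a window without the DISTINCT keyword, but not with it.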
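As a minimal sketch of that restriction, assuming the table_name sample table defined further down in this answer (the alias all_time_uniq_vehicle_count is only illustrative): the first query gives the analytic clause nothing but a PARTITION BY and should be accepted, while the second is the form that is rejected.

```sql
-- DISTINCT with only a PARTITION BY clause: accepted, but it counts over the
-- employee's whole history rather than over a 6-month window.
SELECT t.*,
       COUNT(DISTINCT vehicle_id) OVER (PARTITION BY emp_id) AS all_time_uniq_vehicle_count
FROM   table_name t;

-- Adding ORDER BY (and a RANGE frame) alongside DISTINCT is rejected
-- with ORA-30487: ORDER BY not allowed here.
SELECT t.*,
       COUNT(DISTINCT vehicle_id) OVER (
         PARTITION BY emp_id
         ORDER BY "DATE"
         RANGE BETWEEN INTERVAL '6' MONTH PRECEDING AND CURRENT ROW
       ) AS last_6m_uniq_vehicle_count
FROM   table_name t;
```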
Instead, you can use a correlated sub-query:
SELECT t.*,
       ( SELECT COUNT( DISTINCT vehicle_id )
         FROM   table_name c
         WHERE  c.emp_id = t.emp_id
         AND    c."DATE" <= t."DATE"
         AND    ADD_MONTHS( t."DATE", -6 ) <= c."DATE"
       ) AS last_6m_uniq_vehicle_count,
       COUNT(t.vehicle_id) OVER (
         PARTITION BY t.emp_id
         ORDER BY t."DATE"
         RANGE BETWEEN INTERVAL '6' MONTH PRECEDING
                   AND CURRENT ROW
       ) AS last_6m_vehicle_count
FROM   table_name t
Which, for the sample data:
CREATE TABLE table_name ( vehicle_id, emp_id, "DATE" ) AS
SELECT 1, 1, DATE '2020-08-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-07-31' FROM DUAL UNION ALL
SELECT 1, 1, DATE '2020-06-30' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-05-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-04-30' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-03-31' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-02-29' FROM DUAL UNION ALL
SELECT 2, 1, DATE '2020-01-31' FROM DUAL UNION ALL
SELECT 3, 1, DATE '2020-01-31' FROM DUAL;
Outputs:
VEHICLE_ID | EMP_ID | DATE | LAST_6M_UNIQ_VEHICLE_COUNT | LAST_6M_VEHICLE_COUNT
---------: | -----: | :-------- | -------------------------: | --------------------:
2 | 1 | 31-JAN-20 | 2 | 2
3 | 1 | 31-JAN-20 | 2 | 2
2 | 1 | 29-FEB-20 | 2 | 3
2 | 1 | 31-MAR-20 | 2 | 4
2 | 1 | 30-APR-20 | 2 | 5
2 | 1 | 31-MAY-20 | 2 | 6
1 | 1 | 30-JUN-20 | 3 | 7
2 | 1 | 31-JUL-20 | 3 | 8
1 | 1 | 31-AUG-20 | 2 | 7
db<>fiddle here
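The counts for the last two rows depend on where the lower bound lands for a month-end date. A small check of the ADD_MONTHS bound used by the sub-query, offered as a sketch (the column aliases are only illustrative; the values in the comments follow Oracle's documented rule that a last-day-of-month input maps to the last day of the resulting month):

```sql
SELECT ADD_MONTHS(DATE '2020-07-31', -6) AS jul_window_start,  -- 2020-01-31
       ADD_MONTHS(DATE '2020-08-31', -6) AS aug_window_start   -- 2020-02-29
FROM   DUAL;
```

So the two 31-JAN-20 rows still fall inside the window of the 31-JUL-20 row but outside the window of the 31-AUG-20 row, which is consistent with the counts of 8 and 7 in the output above. This only checks the sub-query's bound; the output suggests the RANGE frame selects the same rows for this data.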
As MTO points out, count(distinct) cannot be used as a window function to solve this problem.
For that reason, I would go for a lateral join:
select t.*, l.*
from t cross join lateral
     (select count(*) as last_6m_vehicle_count,
             -- count the vehicles of the matched rows (t2), not the outer row (t)
             count(distinct t2.vehicle_id) as last_6m_uniq_vehicle_count
      from t t2
      where t2.emp_id = t.emp_id and
            t2.dte <= t.dte and
            t2.dte > add_months(t.dte, -6)
     ) l;
Here is a db<>fiddle.
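For readers who want to run the lateral-join approach against the table_name sample data from the earlier answer, here is a sketch adapted to that table's names ("DATE" instead of dte). Two assumptions of mine: LATERAL needs Oracle 12c or later, as far as I know, and the lower bound uses >= rather than > so that it matches the inclusive bound of the correlated sub-query in that answer.

```sql
SELECT t.*, l.*
FROM   table_name t
       CROSS JOIN LATERAL (
         -- one aggregate row per outer row, computed over that row's 6-month window
         SELECT COUNT(*)                      AS last_6m_vehicle_count,
                COUNT(DISTINCT t2.vehicle_id) AS last_6m_uniq_vehicle_count
         FROM   table_name t2
         WHERE  t2.emp_id = t.emp_id
         AND    t2."DATE" <= t."DATE"
         AND    t2."DATE" >= ADD_MONTHS(t."DATE", -6)
       ) l
ORDER BY t."DATE", t.vehicle_id;
```

With the strict > of the original query, a row dated exactly six months earlier is excluded, so on this sample data the counts for the 31-JUL-20 and 31-AUG-20 rows would come out lower. Which boundary is right depends on whether "the past 6 months" is meant to include that day.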