Sum over range interval within CASE SQL statement
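I am trying to get the average spend per date after each customer's start date (this is for a recency-frequency-monetary analysis). This is the monetary_value element below: I want the sum of all transactions after the customer's start date, divided by the number of days on which they made a purchase. I am using Oracle 12c.
I have the following, which works but includes the full date range.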
RFM AS (
SELECT SRC_USER_ID,
COUNT(distinct PICKUP_DATE) -1 as frequency,
(MAX(PICKUP_DATE) - MIN(PICKUP_DATE)) as recency,
(TO_DATE ('2018/05/12', 'yyyy/mm/dd') - MIN(PICKUP_DATE)) as T,
(CASE WHEN COUNT(distinct PICKUP_DATE)-1=0 THEN 0 ELSE
SUM(PRICE_TOTAL)/COUNT(distinct PICKUP_DATE) END) AS monetary_value
FROM TRANSACTIONS
group by SRC_USER_ID
I understand that I need to use a window aggregate function (https://ss64.com/ora/syntax-analytic-aggregate.html), but it does not work when I try the following.
RFM AS (
SELECT SRC_USER_ID,
COUNT(distinct PICKUP_DATE) -1 as frequency,
(MAX(PICKUP_DATE) - MIN(PICKUP_DATE)) as recency,
(TO_DATE ('2018/05/12', 'yyyy/mm/dd') - MIN(PICKUP_DATE)) as T,
(CASE WHEN COUNT(distinct PICKUP_DATE)-1=0 THEN 0 ELSE
SUM(PRICE_TOTAL) OVER (ORDER BY PICKUP_DATE) RANGE INTERVAL '1' DAY FOLLOWING UNBOUNDED/COUNT(distinct PICKUP_DATE) END) AS monetary_value
FROM TRANSACTIONS
group by SRC_USER_ID
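Any help would be appreciated.
When learning analytic functions, it may be a good idea to look at the examples in the documentation and on oracle-base. Here is a small test table with 3 columns whose names are similar to the ones in your query. (Note: the dates and prices are random values.)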
create table transactions
as
select
mod( level, 3 ) + 1 as srcuserid
, to_date( trunc( dbms_random.value( 2451925, 2458258 ) ), 'J' ) pickupdate
, round( dbms_random.value() * 10000, 2 ) pricetotal
from dual
connect by level <= 12 ;
select * from transactions order by srcuserid, pickupdate ;
SRCUSERID PICKUPDATE PRICETOTAL
1 27-JUL-03 9447.05
1 04-APR-05 9595.6
1 28-SEP-07 408.09
1 16-AUG-13 5643.33
2 20-JAN-01 6253.87
2 26-OCT-05 5981.7
2 16-DEC-08 8138.03
2 20-JUL-17 49.67
3 08-AUG-03 7411.74
3 29-OCT-06 2218.95
3 11-FEB-10 111.07
3 26-JUL-17 600.15
12 rows selected.
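To develop your query, try using analytic functions to compute the values for all columns (as needed). Avoid GROUP BY here, because in this case it throws a "not a GROUP BY expression" error. You will also find that the result set contains one row for every row in the original table. You can use DISTINCT here because we are only dealing with aggregates: every selected expression is either the partition key or a per-partition aggregate, so all rows within a partition are identical.
select distinct -- without "distinct", you'll get multiple identical rows "per window"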
srcuserid
, count( pickupdate ) over ( partition by srcuserid ) as frequency
, max( pickupdate ) over ( partition by srcuserid ) as max_date
, min( pickupdate ) over ( partition by srcuserid ) as min_date
, sum( pricetotal ) over ( partition by srcuserid ) as sum_pricetotal
from transactions
-- group by srcuserid -- ORA-00979: not a GROUP BY expression
;
SRCUSERID FREQUENCY MAX_DATE MIN_DATE SUM_PRICETOTAL
2 4 20-JUL-17 20-JAN-01 20423.27
3 4 26-JUL-17 08-AUG-03 10341.91
1 4 16-AUG-13 27-JUL-03 25094.07
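To make the "one row per input row" behaviour concrete, here is a minimal sketch that runs a trimmed version of the query above against the same sample rows. It is an illustration only and rests on a few assumptions: it uses Python's built-in sqlite3 module instead of Oracle (window functions need SQLite 3.25 or later), dates are stored as ISO strings, and the sum column is left out for brevity.

```python
import sqlite3

# Sample rows from the test table above: (srcuserid, pickupdate, pricetotal).
# Assumption: dates are stored as ISO strings because SQLite has no DATE type.
ROWS = [
    (1, "2003-07-27", 9447.05), (1, "2005-04-04", 9595.60),
    (1, "2007-09-28", 408.09), (1, "2013-08-16", 5643.33),
    (2, "2001-01-20", 6253.87), (2, "2005-10-26", 5981.70),
    (2, "2008-12-16", 8138.03), (2, "2017-07-20", 49.67),
    (3, "2003-08-08", 7411.74), (3, "2006-10-29", 2218.95),
    (3, "2010-02-11", 111.07), (3, "2017-07-26", 600.15),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table transactions "
    "(srcuserid integer, pickupdate text, pricetotal real)"
)
conn.executemany("insert into transactions values (?, ?, ?)", ROWS)

# The same analytic expressions as in the answer; only the DISTINCT
# keyword is switched on or off.
QUERY = """
select {distinct}
    srcuserid
  , count(pickupdate) over (partition by srcuserid) as frequency
  , max(pickupdate)   over (partition by srcuserid) as max_date
  , min(pickupdate)   over (partition by srcuserid) as min_date
from transactions
"""

all_rows = conn.execute(QUERY.format(distinct="")).fetchall()
distinct_rows = conn.execute(QUERY.format(distinct="distinct")).fetchall()

print(len(all_rows))       # one result row per row in the table
print(len(distinct_rows))  # one result row per srcuserid
print(sorted(distinct_rows))
```

Without DISTINCT the analytic query returns a row for each of the 12 transactions; with DISTINCT it collapses to one row per customer, which is what the answer relies on.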
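Once this (more or less) works, use the query as an inline view and add the finishing touches in the outer SELECT. Note that the final query also uses first_value(), which may be one way to find the first entry of a "window".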
select
srcuserid
, count_ - 1 as frequency
, max_date - min_date as recency
, trunc( sysdate - min_date ) as T
, case
when count_ - 1 = 0 then 0
else round( ( sum_pricetotal - firstpricetotal ) / ( count_ - 1 ), 2 )
end as monetary_value
from (
select distinct
srcuserid
, count( pickupdate ) over ( partition by srcuserid ) as count_
, max( pickupdate ) over ( partition by srcuserid ) as max_date
, min( pickupdate ) over ( partition by srcuserid ) as min_date
, sum( pricetotal ) over ( partition by srcuserid ) as sum_pricetotal
-- first_value(): find the first ie oldest "pricetotal" for each client
, first_value( pricetotal ) over (
partition by srcuserid order by pickupdate ) as firstpricetotal
from transactions
)
;
-- result
SRCUSERID FREQUENCY RECENCY T MONETARY_VALUE
2 3 6025 6328 4723.13
3 3 5101 5398 976.72
1 3 3673 5410 5215.67
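As a cross-check of the figures above, the same calculation can be sketched in plain Python on the sample rows. This is only an illustration: the function name `rfm` and the row layout are made up for this example, and the T column is left out because it depends on the date the query is run.

```python
from datetime import date

# Sample rows from the test table above: (srcuserid, pickupdate, pricetotal).
ROWS = [
    (1, date(2003, 7, 27), 9447.05), (1, date(2005, 4, 4), 9595.60),
    (1, date(2007, 9, 28), 408.09), (1, date(2013, 8, 16), 5643.33),
    (2, date(2001, 1, 20), 6253.87), (2, date(2005, 10, 26), 5981.70),
    (2, date(2008, 12, 16), 8138.03), (2, date(2017, 7, 20), 49.67),
    (3, date(2003, 8, 8), 7411.74), (3, date(2006, 10, 29), 2218.95),
    (3, date(2010, 2, 11), 111.07), (3, date(2017, 7, 26), 600.15),
]


def rfm(rows):
    """Return {srcuserid: (frequency, recency_days, monetary_value)}.

    Hypothetical helper mirroring the final query: the first (oldest)
    transaction is excluded from both the sum and the count.
    """
    by_user = {}
    for user, day, price in rows:
        by_user.setdefault(user, []).append((day, price))

    result = {}
    for user, txns in by_user.items():
        txns.sort()                       # oldest transaction first
        first_day, first_price = txns[0]
        last_day = txns[-1][0]
        repeat = len(txns) - 1            # count_ - 1 in the query
        total = sum(price for _, price in txns)
        if repeat == 0:
            monetary = 0
        else:
            monetary = round((total - first_price) / repeat, 2)
        result[user] = (repeat, (last_day - first_day).days, monetary)
    return result


summary = rfm(ROWS)
for user in sorted(summary):
    print(user, *summary[user])
```

For the sample data this reproduces the FREQUENCY, RECENCY and MONETARY_VALUE columns shown in the result. Like the query, it assumes each customer has a single transaction on their first date; if there can be several, first_value() removes only one of them from the sum.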
See also: dbfiddle here.