Sum over range interval within CASE SQL statement

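I am trying to get the average spend per date after each customer's start date (this is for a recency-frequency-monetary analysis). This is the monetary_value element below: I want the sum of all transactions after the customer's start date, divided by the number of days on which they made a purchase. I am using Oracle 12c.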

I have the following, which works, but it includes the full date range.

RFM AS (
SELECT SRC_USER_ID,
  COUNT(distinct PICKUP_DATE) -1 as frequency,
  (MAX(PICKUP_DATE) - MIN(PICKUP_DATE)) as recency,
  (TO_DATE ('2018/05/12', 'yyyy/mm/dd') - MIN(PICKUP_DATE)) as T,
  (CASE WHEN COUNT(distinct PICKUP_DATE)-1=0 THEN 0 ELSE
         SUM(PRICE_TOTAL)/COUNT(distinct PICKUP_DATE) END) AS monetary_value
FROM TRANSACTIONS
group by SRC_USER_ID

I understand that I need to use a window aggregate function (https://ss64.com/ora/syntax-analytic-aggregate.html). But when I try the following, it doesn't work.

RFM AS (
SELECT SRC_USER_ID,
  COUNT(distinct PICKUP_DATE) -1 as frequency,
  (MAX(PICKUP_DATE) - MIN(PICKUP_DATE)) as recency,
  (TO_DATE ('2018/05/12', 'yyyy/mm/dd') - MIN(PICKUP_DATE)) as T,
  (CASE WHEN COUNT(distinct PICKUP_DATE)-1=0 THEN 0 ELSE
    SUM(PRICE_TOTAL) OVER (ORDER BY PICKUP_DATE) RANGE INTERVAL '1' DAY FOLLOWING UNBOUNDED/COUNT(distinct PICKUP_DATE) END) AS monetary_value
FROM TRANSACTIONS
group by SRC_USER_ID

Any help would be appreciated.

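When learning analytic functions, it may be a good idea to look at the examples in the documentation and on oracle-base. Here is a small test table whose 3 columns have names similar to the columns in your query. (Note: the dates and prices are random values.)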

create table transactions
as
select
  mod( level, 3 ) + 1 as srcuserid
, to_date( trunc( dbms_random.value( 2451925, 2458258 ) ), 'J' ) pickupdate
, round( dbms_random.value() * 10000, 2 ) pricetotal
from dual
connect by level <= 12 ;

select * from transactions order by srcuserid, pickupdate ;

SRCUSERID  PICKUPDATE  PRICETOTAL  
1          27-JUL-03   9447.05     
1          04-APR-05   9595.6      
1          28-SEP-07   408.09      
1          16-AUG-13   5643.33     
2          20-JAN-01   6253.87     
2          26-OCT-05   5981.7      
2          16-DEC-08   8138.03     
2          20-JUL-17   49.67       
3          08-AUG-03   7411.74     
3          29-OCT-06   2218.95     
3          11-FEB-10   111.07      
3          26-JUL-17   600.15  

12 rows selected. 
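
A side note on the syntax attempted in the question: the windowing clause has to sit inside the parentheses of OVER, after the ORDER BY, and a two-sided window is written as RANGE BETWEEN ... AND .... The following is only a sketch against the test table above (the alias sum_after_this_date is made up for illustration). For each row it sums the prices of the same customer's rows dated at least one day later, and it should return NULL for a customer's last row because that window is empty.

```sql
-- Sketch: a valid RANGE window with an interval offset.
-- Assumes the test table "transactions" created above.
select
  srcuserid
, pickupdate
, pricetotal
, sum( pricetotal ) over (
    partition by srcuserid
    order by pickupdate
    -- from one day after the current row's date to the end of the partition
    range between interval '1' day following and unbounded following
  ) as sum_after_this_date
from transactions
order by srcuserid, pickupdate ;
```

Note that this still returns one row per transaction, so it cannot simply be dropped into a query that also has GROUP BY SRC_USER_ID.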

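To develop your query, try using analytic functions to compute the values of all columns (as needed). Avoid combining GROUP BY with them, because in this situation it raises a "not a GROUP BY expression" error (ORA-00979). Also, you will find that the result set contains one row for every row in the original table. You can use DISTINCT here, because we are only dealing with aggregates.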

select distinct -- without "distinct", you'll get multiple identical rows "per window"
  srcuserid
, count( pickupdate ) over ( partition by srcuserid ) as frequency
, max( pickupdate ) over ( partition by srcuserid )   as max_date
, min( pickupdate ) over ( partition by srcuserid )   as min_date
, sum( pricetotal ) over ( partition by srcuserid )   as sum_pricetotal
from transactions 
-- group by srcuserid  -- ORA-00979: not a GROUP BY expression
;

SRCUSERID  FREQUENCY  MAX_DATE   MIN_DATE   SUM_PRICETOTAL  
2          4          20-JUL-17  20-JAN-01  20423.27        
3          4          26-JUL-17  08-AUG-03  10341.91        
1          4          16-AUG-13  27-JUL-03  25094.07 

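Once this (sort of) works, use the query as an inline view and add some finishing touches in the outer SELECT. Note that the final query here also uses first_value() - this may be one way for you to find the first entry of a "window".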

select
  srcuserid
, count_ - 1          as frequency
, max_date - min_date as recency
, trunc( sysdate - min_date )  as T
, case
    when count_ - 1 = 0 then 0
    else round( ( sum_pricetotal - firstpricetotal ) / ( count_ - 1 ), 2 ) 
  end as monetary_value 
from (
  select distinct
    srcuserid
  , count( pickupdate ) over ( partition by srcuserid ) as count_
  , max( pickupdate ) over ( partition by srcuserid )   as max_date
  , min( pickupdate ) over ( partition by srcuserid )   as min_date
  , sum( pricetotal ) over ( partition by srcuserid )   as sum_pricetotal
-- first_value(): find the first ie oldest "pricetotal" for each client
  , first_value( pricetotal ) over ( 
      partition by srcuserid order by pickupdate )      as firstpricetotal
  from transactions
) 
;

-- result
SRCUSERID  FREQUENCY  RECENCY  T     MONETARY_VALUE  
2          3          6025     6328  4723.13         
3          3          5101     5398  976.72          
1          3          3673     5410  5215.67 
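
If a customer can have several transactions on the same date, a variant closer to the COUNT(distinct ...) logic in the question might look like the sketch below. It is an assumption about what is wanted, not a tested solution: the inline view tags every row with the customer's first date (the alias first_date is made up for illustration), and the outer query can then use an ordinary GROUP BY, summing only the rows after that date.

```sql
-- Sketch: sum only the transactions after each customer's first date,
-- divided by the number of distinct later dates.
-- Assumes the test table "transactions" created above.
select
  srcuserid
, count( distinct pickupdate ) - 1       as frequency
, max( pickupdate ) - min( pickupdate )  as recency
, case
    when count( distinct pickupdate ) - 1 = 0 then 0
    else round(
           sum( case when pickupdate > first_date then pricetotal end )
           / ( count( distinct pickupdate ) - 1 ), 2 )
  end as monetary_value
from (
  select
    t.*
    -- analytic min(): the first purchase date, repeated on every row of the customer
  , min( pickupdate ) over ( partition by srcuserid ) as first_date
  from transactions t
)
group by srcuserid ;
```

With the sample rows shown above, where no customer has two transactions on the same date, this should give the same monetary_value figures as the query using first_value().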

See also: dbfiddle here