Running Count Distinct using Over Partition By

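I have a dataset containing the IDs of users who made purchases over a period of time. I want to show the YTD count of distinct users who made a purchase, broken down by state and country. The output would have five columns: Country, State, Year, Month, and YTD Count of Distinct Users with purchase activity.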

Is there a way to do this? The following code works when I exclude the month from the view and do a distinct count:

Select Year, Country, State,
   COUNT(DISTINCT (CASE WHEN ActiveUserFlag > 0 THEN MBR_ID END)) AS YTD_Active_Member_Count
From MemberActivity
Where Month <= 5
Group By 1,2,3;

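The problem arises when a user makes purchases across multiple months: I can't aggregate by month and then sum, because that would count the same user more than once.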
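To make the double counting concrete, here is a minimal sketch using Python's built-in sqlite3 module. The three sample rows and the in-memory database are made up for illustration; only the table and column names come from the query above. Member 1 buys in both January and February, so summing the monthly distinct counts overstates the YTD figure.

```python
import sqlite3

# Hypothetical sample data in an in-memory database (assumption, not the real table).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MemberActivity (
        Year INT, Month INT, Country TEXT, State TEXT,
        MBR_ID INT, ActiveUserFlag INT
    );
    INSERT INTO MemberActivity VALUES
        (2019, 1, 'US', 'CA', 1, 1),
        (2019, 2, 'US', 'CA', 1, 1),  -- member 1 buys again in February
        (2019, 2, 'US', 'CA', 2, 1);
""")

# Distinct active members per month.
monthly = conn.execute("""
    SELECT Month,
           COUNT(DISTINCT CASE WHEN ActiveUserFlag > 0 THEN MBR_ID END)
    FROM MemberActivity
    GROUP BY Month
    ORDER BY Month
""").fetchall()

# Naive YTD: add up the monthly distinct counts.
summed = sum(cnt for _, cnt in monthly)

# Correct YTD: one distinct count over the whole period.
true_ytd = conn.execute("""
    SELECT COUNT(DISTINCT CASE WHEN ActiveUserFlag > 0 THEN MBR_ID END)
    FROM MemberActivity
    WHERE Month <= 2
""").fetchone()[0]

print(monthly, summed, true_ytd)  # [(1, 1), (2, 2)] 3 2
```

The naive sum is 3 because member 1 is counted once in each month, while the distinct count over January and February is 2.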

For trending purposes, I need to see the YTD count for each month of the year.

Count users in the first month they appear:

select Country, State, year, month,
       sum(case when ActiveUserFlag > 0 and seqnum = 1 then 1 else 0 end) as YTD_Active_Member_Count
from (select ma.*,
             -- partition by member and year so that seqnum = 1 marks each member's first month
             row_number() over (partition by MBR_ID, year order by month) as seqnum
      from MemberActivity ma
     ) ma
where Month <= 5
group by Country, State, year, month;

Return each member only once, for the first month in which they made a purchase, count by month, and then apply a cumulative sum:

select Year, Country, State, month,
   sum(cnt)
   over (partition by Year, Country, State
         order by month
         rows unbounded preceding) AS YTD_Active_Member_Count
from
  (
    Select Year, Country, State, month,
       COUNT(*) as cnt -- first purchases per month
    From 
     ( -- this assumes there's at least one new active member per year/month/country
       -- otherwise there would be missing rows 
       Select *
       from MemberActivity
       where ActiveUserFlag > 0 -- only active members
         and Month <= 5
         -- and year = 2019 -- seems to be for this year only
       qualify row_number() -- only first purchase per member/year
               over (partition by MBR_ID, year
                     order by month) = 1 -- ? probably there's a purchase_date
     ) as dt
    group by 1,2,3,4
 ) as dt
;
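For databases without QUALIFY, the same idea can be sketched with a row_number() subquery filtered on seqnum = 1. The version below is a rough, runnable illustration using Python's sqlite3 module. It assumes SQLite 3.25 or later (for window functions), uses invented sample rows, and the aliases firsts, monthly, and seqnum are chosen for this example only. It also assumes each member belongs to a single state, since a member is counted only under the state of their first purchase in the year.

```python
import sqlite3

# Hypothetical sample data in an in-memory database (assumption, not the real table).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MemberActivity (
        Year INT, Month INT, Country TEXT, State TEXT,
        MBR_ID INT, ActiveUserFlag INT
    );
    INSERT INTO MemberActivity VALUES
        (2019, 1, 'US', 'CA', 1, 1),
        (2019, 2, 'US', 'CA', 1, 1),  -- repeat purchase, must not be recounted
        (2019, 2, 'US', 'CA', 2, 1),
        (2019, 3, 'US', 'CA', 2, 1),
        (2019, 3, 'US', 'CA', 3, 1),
        (2019, 1, 'US', 'NY', 4, 1),
        (2019, 3, 'US', 'NY', 4, 0);  -- inactive row, filtered out
""")

rows = conn.execute("""
    SELECT Year, Country, State, Month,
           -- running total of first-time purchasers = YTD distinct count
           SUM(cnt) OVER (PARTITION BY Year, Country, State
                          ORDER BY Month
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
               AS YTD_Active_Member_Count
    FROM (
        -- number of members whose first active month of the year is this month
        SELECT Year, Country, State, Month, COUNT(*) AS cnt
        FROM (
            SELECT ma.*,
                   ROW_NUMBER() OVER (PARTITION BY MBR_ID, Year
                                      ORDER BY Month) AS seqnum
            FROM MemberActivity ma
            WHERE ActiveUserFlag > 0      -- only active members
        ) AS firsts
        WHERE seqnum = 1                  -- keep each member's first month only
        GROUP BY Year, Country, State, Month
    ) AS monthly
    ORDER BY Year, Country, State, Month
""").fetchall()

for row in rows:
    print(row)
# (2019, 'US', 'CA', 1, 1)
# (2019, 'US', 'CA', 2, 2)
# (2019, 'US', 'CA', 3, 3)
# (2019, 'US', 'NY', 1, 1)
```

Note that NY has no rows for months 2 and 3, because no new member appeared there in those months. This is the missing-rows caveat mentioned in the query comment; filling those gaps would need a join against a list of months, which is not shown here.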