Query to find top customers in countries with the greatest difference between 2 months

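Assume a table with the following columns: customer_id, revenue, country, date

What query would show the top customers with the highest revenue growth from November to December?

What query would show the top 100 customers per country with the highest revenue growth from November to December?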

You can calculate the growth in revenue with the window function lag():

select customer_id, revenue, date, country,
       revenue - lag(revenue,1,revenue) 
                     over (partition by customer_id order by date) as growth
from turnover
where extract(month from date) in (11,12)

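lag(revenue,1,revenue) returns the revenue of the previous row. If there is no previous row, it returns the revenue of the current row, so the first row of each customer gets a growth of 0.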
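To make the effect of the default argument concrete, here is a minimal sketch that emulates the lag() expression in plain Python rather than running the SQL itself. The sample rows and the helper name with_growth are hypothetical, and the sketch assumes at most one row per customer per month.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (customer_id, revenue, date, country)
rows = [
    (1, 100.0, "2023-11-30", "DE"),
    (1, 150.0, "2023-12-31", "DE"),
    (2, 200.0, "2023-11-30", "FR"),
    (2, 180.0, "2023-12-31", "FR"),
    (3, 50.0, "2023-12-31", "FR"),  # customer without a November row
]


def with_growth(rows):
    """Emulate: revenue - lag(revenue, 1, revenue)
    over (partition by customer_id order by date)."""
    result = []
    # Sort by partition key, then by the ordering column (ISO dates sort as text).
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    for _, group in groupby(ordered, key=itemgetter(0)):
        previous = None
        for customer_id, revenue, date, country in group:
            # The third lag() argument: fall back to the current revenue
            # when the partition has no previous row.
            lagged = revenue if previous is None else previous
            result.append((customer_id, revenue, date, country, revenue - lagged))
            previous = revenue
    return result


growth_rows = with_growth(rows)
for row in growth_rows:
    print(row)
```

In this sample, customer 1 gets a growth of 50.0 on the December row, customer 2 gets -20.0, and every first row (including the only row of customer 3) gets 0.0.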

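Now that you have the growth column, this becomes a greatest-n-per-group problem, which is also typically solved with window functions. However, because window functions cannot be nested within a single query level, you need two levels of nested derived tables: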

select customer_id, revenue, date, country, growth,
       dense_rank() over (order by growth desc nulls last) as rnk
from (
  select customer_id, revenue, date, country,
         revenue - lag(revenue,1,revenue) over (partition by customer_id order by date) as growth
  from turnover
  where extract(month from date) in (11,12)
) t1

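This assigns a rank to each row based on its growth. However, you cannot use the alias rnk directly in the where clause, which is why the additional level of derived table is needed.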

So the final statement to get the customer(s) with the highest growth is:

select *
from (
  select customer_id, revenue, date, country, growth, 
         dense_rank() over (order by growth desc) as rnk
  from (
    select customer_id, revenue, date, country, 
           revenue - lag(revenue,1,revenue) over (partition by customer_id order by date) as growth
    from turnover
    where extract(month from date) in (11,12)
  ) t1
) t2
where rnk = 1;

To get the 100 highest growths per country, you only need to change the calculation of rnk so that it is done per country:

select *
from (
  select customer_id, revenue, date, country, growth,
         dense_rank() over (partition by country order by growth desc) as rnk
  from (
    select customer_id, revenue, date, country, 
           revenue - lag(revenue,1,revenue) over (partition by customer_id order by date) as growth
    from turnover
    where extract(month from date) in (11,12)
  ) t1
) t2
where rnk <= 100;
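The per-country ranking can be illustrated with a small plain-Python emulation of dense_rank(); this is a sketch, not the SQL itself. The input rows, the helper name top_n_per_country, and the cut-off of 1 instead of 100 are hypothetical choices that keep the example short.

```python
from collections import defaultdict

# Hypothetical (customer_id, country, growth) rows, shaped like the
# output of the inner query that computes growth.
growth_rows = [
    (1, "DE", 50.0),
    (4, "DE", 50.0),
    (5, "DE", 10.0),
    (2, "FR", -20.0),
    (3, "FR", 0.0),
]


def top_n_per_country(rows, n):
    """Emulate dense_rank() over (partition by country order by growth desc)
    followed by the filter rnk <= n."""
    by_country = defaultdict(list)
    for row in rows:
        by_country[row[1]].append(row)

    kept = []
    for country in sorted(by_country):
        # Distinct growth values, highest first; the dense rank of a value
        # is its position in this list plus one, so ties share a rank.
        distinct = sorted({r[2] for r in by_country[country]}, reverse=True)
        rank_of = {growth: i + 1 for i, growth in enumerate(distinct)}
        for customer_id, _, growth in sorted(by_country[country], key=lambda r: -r[2]):
            if rank_of[growth] <= n:
                kept.append((customer_id, country, growth, rank_of[growth]))
    return kept


top = top_n_per_country(growth_rows, 1)
for row in top:
    print(row)
```

Because dense_rank() gives tied rows the same rank, a filter such as rnk <= 100 can return more than 100 rows per country. In the sample, both customers 1 and 4 are kept for "DE" with rank 1.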

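date is a horrible name for a column. Not only because it is a keyword, but more importantly because it does not document what the column contains. Is it a "start date"? An "end date"? A "purchase date"? A "due date"?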