Query to find top customers in countries with the greatest difference over 2 months
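Assume a table with the following columns: customer_id, revenue, country, date.
What query would show the top customer with the highest revenue growth from November to December?
What query would show the top 100 customers per country with the highest revenue growth from November to December?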
Revenue growth can be calculated using the window function lag():
select customer_id, revenue, date, country,
       revenue - lag(revenue,1,revenue)
                 over (partition by customer_id order by date) as growth
from turnover
where extract(month from date) in (11,12)
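lag(revenue,1,revenue) returns the revenue of the previous row. If there is no previous row, it returns the revenue of the current row, which results in a growth of 0 for the first row of each customer.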
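To see the effect of the third argument of lag(), here is a minimal sketch that runs the growth query against an in-memory SQLite database from Python. The sample rows are made up for illustration, and because SQLite has no extract(), the month filter is written with strftime() instead; everything else follows the query above.

```python
import sqlite3

# Hypothetical sample rows: (customer_id, revenue, date, country).
rows = [
    (1, 100.0, "2023-11-15", "DE"),
    (1, 180.0, "2023-12-15", "DE"),
    (2, 200.0, "2023-11-20", "FR"),
    (2, 150.0, "2023-12-20", "FR"),
]

con = sqlite3.connect(":memory:")
con.execute(
    "create table turnover (customer_id int, revenue real, date text, country text)"
)
con.executemany("insert into turnover values (?, ?, ?, ?)", rows)

# Assumption: SQLite dialect, so strftime('%m', ...) replaces extract(month from ...).
growth = con.execute("""
    select customer_id, date,
           revenue - lag(revenue, 1, revenue)
                     over (partition by customer_id order by date) as growth
    from turnover
    where strftime('%m', date) in ('11', '12')
    order by customer_id, date
""").fetchall()

for row in growth:
    print(row)
```

Each customer's November row gets a growth of 0.0 because lag() falls back to the current row's revenue, while the December rows carry the actual difference (80.0 for customer 1 and -50.0 for customer 2 in this sample).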
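Now that there is a growth column, this becomes a greatest-n-per-group problem, which is usually also solved with window functions. However, because window functions cannot be nested within a single query level, you need two levels of nested derived tables: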
select customer_id, revenue, date, country, growth,
       dense_rank() over (order by growth desc nulls last) as rnk
from (
  select customer_id, revenue, date, country,
         revenue - lag(revenue,1,revenue) over (partition by customer_id order by date) as growth
  from turnover
  where extract(month from date) in (11,12)
) t1
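This assigns a rank to each row based on its growth. However, you can't use the rnk alias directly in the where clause, which is why an additional level of derived table is needed.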
So the final statement to get the customer(s) with the highest growth is:
select *
from (
  select customer_id, revenue, date, country, growth,
         dense_rank() over (order by growth desc) as rnk
  from (
    select customer_id, revenue, date, country,
           revenue - lag(revenue,1,revenue) over (partition by customer_id order by date) as growth
    from turnover
    where extract(month from date) in (11,12)
  ) t1
) t2
where rnk = 1;
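The nested query can be tried out on a small data set. The following is only a sketch under assumptions: it uses SQLite through Python's sqlite3 module, made-up sample rows, and strftime() in place of extract(). Two customers are deliberately given the same growth to show how dense_rank() treats ties.

```python
import sqlite3

# Hypothetical sample rows: (customer_id, revenue, date, country).
rows = [
    (1, 100.0, "2023-11-15", "DE"),
    (1, 180.0, "2023-12-15", "DE"),
    (2, 200.0, "2023-11-20", "FR"),
    (2, 150.0, "2023-12-20", "FR"),
    (3, 50.0, "2023-11-05", "FR"),
    (3, 130.0, "2023-12-05", "FR"),
]

con = sqlite3.connect(":memory:")
con.execute(
    "create table turnover (customer_id int, revenue real, date text, country text)"
)
con.executemany("insert into turnover values (?, ?, ?, ?)", rows)

# Inner level computes growth, middle level ranks it, outer level filters on the rank.
top = con.execute("""
    select customer_id, country, growth, rnk
    from (
      select customer_id, country, growth,
             dense_rank() over (order by growth desc) as rnk
      from (
        select customer_id, country,
               revenue - lag(revenue, 1, revenue)
                         over (partition by customer_id order by date) as growth
        from turnover
        where strftime('%m', date) in ('11', '12')
      ) t1
    ) t2
    where rnk = 1
    order by customer_id
""").fetchall()

print(top)
```

Customers 1 and 3 both grew by 80.0, so both receive rank 1 and both are returned. If exactly one row is wanted, row_number() with an explicit tie-breaker in the order by would be one alternative.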
To get the 100 highest growths per country, you only need to change the calculation of rnk so that it is done per country:
select *
from (
  select customer_id, revenue, date, country, growth,
         dense_rank() over (partition by country order by growth desc) as rnk
  from (
    select customer_id, revenue, date, country,
           revenue - lag(revenue,1,revenue) over (partition by customer_id order by date) as growth
    from turnover
    where extract(month from date) in (11,12)
  ) t1
) t2
where rnk <= 100;
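As a rough illustration of the per-country variant, the sketch below runs the same shape of query with SQLite via Python's sqlite3 module. The sample rows and the top_n variable are hypothetical, strftime() stands in for extract(), and the limit is set to 1 instead of 100 so the effect is visible on a tiny table.

```python
import sqlite3

# Hypothetical sample rows: (customer_id, revenue, date, country).
rows = [
    (1, 100.0, "2023-11-15", "DE"),
    (1, 180.0, "2023-12-15", "DE"),
    (4, 300.0, "2023-11-10", "DE"),
    (4, 320.0, "2023-12-10", "DE"),
    (2, 200.0, "2023-11-20", "FR"),
    (2, 150.0, "2023-12-20", "FR"),
    (3, 50.0, "2023-11-05", "FR"),
    (3, 130.0, "2023-12-05", "FR"),
]

con = sqlite3.connect(":memory:")
con.execute(
    "create table turnover (customer_id int, revenue real, date text, country text)"
)
con.executemany("insert into turnover values (?, ?, ?, ?)", rows)

top_n = 1  # the answer uses 100; 1 keeps the sample output short

# partition by country restarts the ranking for every country.
top_per_country = con.execute("""
    select country, customer_id, growth, rnk
    from (
      select customer_id, country, growth,
             dense_rank() over (partition by country order by growth desc) as rnk
      from (
        select customer_id, country,
               revenue - lag(revenue, 1, revenue)
                         over (partition by customer_id order by date) as growth
        from turnover
        where strftime('%m', date) in ('11', '12')
      ) t1
    ) t2
    where rnk <= ?
    order by country, rnk, customer_id
""", (top_n,)).fetchall()

print(top_per_country)
```

Note that the ranking is applied to rows, not customers: each customer's November row has a growth of 0 and takes part in the ranking too, so with a larger limit those rows can show up in the result as well.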
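date is a terrible column name. Not only because it is a keyword, but more importantly because it doesn't document what the column contains. Is it a "start date"? An "end date"? A "purchase date"? A "due date"?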