Percentile calculation with a window function
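I know you can use window functions to get the average, total, minimum, and maximum over a subset of the data. But is it possible to get the median, or the 25th percentile, with a window function instead of the average?
Put another way, how do I rewrite the following to get the id and the 25th or 50th percentile of sales within each district, rather than the average?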
SELECT id, avg(sales)
OVER (PARTITION BY district) AS district_average
FROM t
You can write this as an aggregate function using percentile_cont() or percentile_disc():
select district, percentile_cont(0.25) within group (order by sales)
from t
group by district;
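The two functions differ in what they return: percentile_cont() interpolates between neighbouring values, while percentile_disc() returns a value that actually occurs in the input. As a small sketch against the same table t from the question, this computes the median per district both ways; the output column names are made up for illustration:

```sql
-- Median of sales per district, computed both ways.
-- Table t and its columns come from the question; the aliases are illustrative.
select district,
       percentile_cont(0.5) within group (order by sales) as median_cont,  -- interpolated
       percentile_disc(0.5) within group (order by sales) as median_disc   -- an actual sales value
from t
group by district;
```

For a district whose sales are 10, 20, 30, and 40, percentile_cont(0.5) gives 25 and percentile_disc(0.5) gives 20.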
Unfortunately, Postgres does not currently support these as window functions, so the following does not work:
select id, percentile_cont(0.25) within group (order by sales) over (partition by district)
from t;
So, you can compute the percentiles per district in a subquery and join the result back to the table:
select t.*, td.p_25, td.p_75
from t join
     (select district,
             percentile_cont(0.25) within group (order by sales) as p_25,
             percentile_cont(0.75) within group (order by sales) as p_75
      from t
      group by district
     ) td
     on t.district = td.district;
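If you would rather avoid the join, the discrete percentile can probably be emulated with window functions alone. The following is only a sketch, not something the approach above requires: it assumes sales contains no NULLs (cume_dist() counts NULL rows, whereas percentile_disc() ignores them), and the names cd and p_25_disc are made up for illustration. It relies on the fact that percentile_disc(0.25) is the first value, in sort order, whose cumulative distribution reaches 0.25.

```sql
-- Emulates percentile_disc(0.25) per district using only window functions.
-- Assumption: sales has no NULLs. The names cd and p_25_disc are illustrative.
select id, district, sales,
       -- smallest sales value whose cumulative distribution is at least 0.25
       min(sales) filter (where cd >= 0.25)
           over (partition by district) as p_25_disc
from (select t.*,
             cume_dist() over (partition by district order by sales) as cd
      from t
     ) s;
```

This does not reproduce the interpolation that percentile_cont() performs, so for a continuous percentile the join remains the simpler option.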