Partitioned Median

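I was wondering whether anyone could help me calculate a median per group, rather than one median over the whole data set.

I like the code below, but it returns the same overall median on every row. I think I need to use OVER (PARTITION BY ...), but even after frantic Googling and reading well-known articles such as this one, I cannot get my head around it: https://sqlperformance.com/2012/08/t-sql-queries/median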

 `SELECT
    YEAR(reportsubmitted) AS "Year Submitted",
    MONTH(reportsubmitted) AS "Month Submitted",
    COUNT(DISTINCT propertyid) AS "Number of Reports Submitted",
    SUM([report fee]) AS "Total Report Fee",
    (
        (SELECT MAX([days From Audit to Submission])
         FROM (SELECT TOP 50 PERCENT [days From Audit to Submission]
               FROM vwCMnAuditorsProcessLength
               WHERE ReportSubmitted > '2017-04-01'
               ORDER BY [days From Audit to Submission]) AS x)
        +
        (SELECT MIN([days From Audit to Submission])
         FROM (SELECT TOP 50 PERCENT [days From Audit to Submission]
               FROM vwCMnAuditorsProcessLength
               WHERE ReportSubmitted > '2017-04-01'
               ORDER BY [Report Fee] DESC) AS y)
    ) / 2.0 AS "Median Days"
FROM vwCMnAuditorsProcessLength
WHERE reportsubmitted >= '2017-04-01'
GROUP BY MONTH(reportsubmitted), YEAR(reportsubmitted)`

I did try a different approach, shown below, but it seems to discard a lot of the data:

SELECT
    [MMYYYY ReportSubmitted],
    [Total Report Fee],
    [Number of Reports Submitted],
    AVG([days from audit to submission]) AS "Median days to Submission"
FROM (
    SELECT [MMYYYY ReportSubmitted], [report fee], propertyid,
        CAST([days from audit to submission] AS decimal(5,2)) [days from audit to submission],
        ROW_NUMBER() OVER(
            PARTITION BY [MMYYYY ReportSubmitted]
            ORDER BY [days from audit to submission] ASC) AS "RowASC",
        ROW_NUMBER() OVER(
            PARTITION BY [MMYYYY ReportSubmitted]
            ORDER BY [days from audit to submission] DESC) AS "RowDESC",
        SUM([report fee]) OVER(PARTITION BY [MMYYYY ReportSubmitted] ORDER BY [days from audit to submission]) AS "Total Report Fee",
        COUNT(propertyid) OVER(PARTITION BY [MMYYYY ReportSubmitted] ORDER BY [days from audit to submission]) AS "Number of Reports Submitted"
    FROM vwCMnAuditorsProcessLength) x
WHERE RowASC IN (RowDESC, RowDESC-1, RowDESC+1)
GROUP BY [MMYYYY ReportSubmitted], [Total Report Fee], [Number of Reports Submitted]
ORDER BY [MMYYYY ReportSubmitted]

If anyone has any ideas, I would be very grateful.

If you don't care about performance, the simplest approach is the best:

SELECT SalesPerson, Median = MAX(Median)
FROM
(
   SELECT SalesPerson,Median = PERCENTILE_CONT(0.5) WITHIN GROUP 
     (ORDER BY Amount) OVER (PARTITION BY SalesPerson)
   FROM dbo.Sales
) 
AS x
GROUP BY SalesPerson;

Example taken from: https://sqlperformance.com/2014/02/t-sql-queries/grouped-median
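Applied to the view in the question, the same pattern might look roughly like the sketch below. It assumes SQL Server 2012 or later (where PERCENTILE_CONT is available) and that the column names are as posted in the question. I have not run it against the real view, so treat it as a starting point rather than a finished query.

```sql
-- Sketch only: column and view names are copied from the question.
-- The inner query attaches the per-month median to every row;
-- the outer query collapses each month to a single row.
SELECT
    [Year Submitted],
    [Month Submitted],
    COUNT(DISTINCT propertyid) AS [Number of Reports Submitted],
    SUM([report fee])          AS [Total Report Fee],
    MAX([Median Days])         AS [Median Days]  -- same value on every row of the group
FROM
(
    SELECT
        YEAR(reportsubmitted)  AS [Year Submitted],
        MONTH(reportsubmitted) AS [Month Submitted],
        propertyid,
        [report fee],
        PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY [days From Audit to Submission])
            OVER (PARTITION BY YEAR(reportsubmitted), MONTH(reportsubmitted)) AS [Median Days]
    FROM vwCMnAuditorsProcessLength
    WHERE reportsubmitted >= '2017-04-01'
) AS x
GROUP BY [Year Submitted], [Month Submitted]
ORDER BY [Year Submitted], [Month Submitted];
```

Because the count and the sum are computed by a plain GROUP BY in the outer query, they cover every row in the month, not only the rows near the middle.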

If you want an even simpler approach, I'd recommend a CLR function:

It lets you calculate the median like this:

SELECT dbo.Median(Field) FROM Table
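For the grouped case in the question, such an aggregate would presumably be combined with an ordinary GROUP BY. The sketch below assumes that dbo.Median has already been deployed as a CLR user-defined aggregate in the database and that it accepts the data type of the days column. It is not a built-in function, so none of this works without that assembly.

```sql
-- Sketch only: dbo.Median is assumed to be a deployed CLR user-defined aggregate.
-- View and column names are copied from the question.
SELECT
    YEAR(reportsubmitted)  AS [Year Submitted],
    MONTH(reportsubmitted) AS [Month Submitted],
    dbo.Median([days From Audit to Submission]) AS [Median Days]
FROM vwCMnAuditorsProcessLength
WHERE reportsubmitted >= '2017-04-01'
GROUP BY YEAR(reportsubmitted), MONTH(reportsubmitted);
```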