How to Return Records Equal to a Specific Percentage of an Aggregate in Transact-SQL?

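My requirement is to provide a random sample of claims that represents 2.5% of the total amount paid and also 2.5% of the total number of claims for a specific population. The goal is to deliver records in a report that satisfy both conditions. My staging table is defined as follows: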

[RecordId] UniqueIdentifier NOT NULL PRIMARY KEY DEFAULT NEWID()
,ClaimNO varchar(50)
,Company_ID varchar(10)
,HPCode varchar(10)
,FinancialResponsibility varchar(30)
,ProviderType varchar(50)
,DateOfService date
,DatePaid date
,ClaimType varchar(50)
,TotalBilled numeric(11,2)
,TotalPaid numeric(11,2)
,ProcessorType varchar(100)

I have already built the logic to return 2.5% of the total number of claims, but I need guidance on how best to ensure that both criteria are met.

Here is what I have tried so far:

with cteTotals as (
Select Count(*) as TotalClaims, sum(TotalPaid) as TotalPaid, sum(TotalPaid) * .025 as PaidSampleAmount
from [Z_Monthly_Quality_Review]
),

ctePopulation as (
    Select *
    from [Z_Monthly_Quality_Review]
),

cteSampleRows as (  
    select TOP 2.5 PERCENT NEWID() RandomID, RecordID, ClaimNo, HPCode, FinancialResponsibility, ProviderType, ProcessorType, 
    Format(DateOfService, 'MM/dd/yyyy') as DateOfService, Format(DatePaid, 'MM/dd/yyyy') as DatePaid, ClaimType, TotalBilled, TotalPaid  
    from [Z_Monthly_Quality_Review]  
    order by NEWID()
    ),

cteSamplePaid as (
    Select Top 2.5 PERCENT NEWID() RandomID, RecordID, ClaimNo, HPCode, FinancialResponsibility, ProviderType, ProcessorType,
    Format(DateOfService, 'MM/dd/yyyy') as DateOfService, Format(DatePaid, 'MM/dd/yyyy') as DatePaid, ClaimType, TotalBilled, TotalPaid  
    from [Z_Monthly_Quality_Review] mqr
    inner join ctePopulation cte on mqr.ClaimNo = cte.ClaimNO
    order by NEWID()
)

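Since both criteria must be met, how should I structure the two CTEs to ensure this? In my cteSamplePaid, how do I ensure that the sum of TotalPaid equals 2.5% of the total for the whole population? Can this be done with a HAVING clause? The final results will be displayed to my business users through SQL Server Reporting Services. Ideally, I would like to give them one sample that satisfies both criteria. If that is not possible, how can I randomly sample claims against each of the two criteria?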

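I don't think there is a guaranteed way to make the sample add up to exactly 2.5% of the total. The result can't be guaranteed, and performance would be very poor because you would essentially have to brute-force every possible combination of rows. One way to get very close to your goal is to return random rows whose sum falls within an acceptable margin of error of the target.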

Since no sample data was provided, I used AdventureWorks2017 (the original post linked to the download).

USE AdventureWorks2017 
GO

DROP TABLE IF EXISTS #SalesData
SELECT SalesOrderID AS ID,TotalDue
INTO #SalesData
FROM Sales.SalesOrderHeader

Declare @DesiredPercentage Numeric(10,3) = .025 /*Desired percentage of the total sum of TotalDue*/
        ,@AcceptableMargin Numeric(10,3) = .01 /*Random row total can be plus or minus this percentage of the desired sum*/
DECLARE @DesiredSum Numeric(16,2) =  @DesiredPercentage *(SELECT SUM(TotalDue) FROM #SalesData)

/*For loop*/
DECLARE @RowNum INT
    ,@LoopCounter INT = 1
    
WHILE (1=1)
BEGIN
    DROP TABLE IF EXISTS #RandomData
    SELECT RowNum = ROW_NUMBER() OVER (ORDER BY B.RandID),A.*,RunningTotal = SUM(TotalDue) OVER (ORDER BY B.RandID)
    INTO #RandomData
    FROM #SalesData AS A
    CROSS APPLY (SELECT RandID = NEWID()) AS B
    WHERE TotalDue < @DesiredSum /*If single row bigger than desired sum, then filter it out*/
    ORDER BY B.RandID

    SELECT Top(1) @RowNum = RowNum
    FROM #RandomData AS A
    CROSS APPLY (SELECT DeltaFromDesiredSum = ABS(RunningTotal-@DesiredSum)) AS B
    WHERE RunningTotal BETWEEN @DesiredSum *(1-@AcceptableMargin) AND @DesiredSum *(1+@AcceptableMargin)
    ORDER BY DeltaFromDesiredSum

    IF (@RowNum IS NOT NULL)
        BREAK;

    IF (@LoopCounter >=100) /*Prevents infinite loops*/
        THROW 59194,'Result unable to be generated in 100 tries. Recommend expanding acceptable margin',1;

    SET @LoopCounter +=1;
END

SELECT *
FROM #RandomData
WHERE RowNum <= @RowNum

SELECT RandomRowTotal = SUM(TotalDue)
    ,DesiredSum = @DesiredSum
    ,PercentageFromDesiredSum = Concat(Cast(Round(100*(1-SUM(TotalDue)/@DesiredSum),2) as Float),'%')
FROM #RandomData
WHERE RowNum <= @RowNum
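
To tie this back to the question, here is a minimal single-pass sketch of the same shuffle-and-running-total idea against the asker's [Z_Monthly_Quality_Review] table. It is an illustration under assumptions, not a tested solution: the variable names and the #Shuffled and #Running temp tables are made up for this example, there is no retry loop or margin check, and it does not force the claim-count share to 2.5%. It only picks the cut-off whose running TotalPaid is closest to the target and then reports both shares so you can see how far the sample is from each criterion.

```sql
/* Sketch only: variable and temp table names are hypothetical */
DECLARE @DesiredPercentage numeric(10,3) = 0.025; /* 2.5% target for the paid amount */
DECLARE @PopulationPaid numeric(18,2) = (SELECT SUM(TotalPaid) FROM [Z_Monthly_Quality_Review]);
DECLARE @PopulationClaims int = (SELECT COUNT(*) FROM [Z_Monthly_Quality_Review]);
DECLARE @DesiredSum numeric(18,2) = @DesiredPercentage * @PopulationPaid;

/* Materialize one random ordering so every later step sees the same shuffle */
DROP TABLE IF EXISTS #Shuffled;
SELECT RecordId, ClaimNO, TotalPaid, RandID = NEWID()
INTO #Shuffled
FROM [Z_Monthly_Quality_Review]
WHERE TotalPaid < @DesiredSum; /* a single claim above the target can never fit */

/* Number the shuffled rows and accumulate TotalPaid in that same order */
DROP TABLE IF EXISTS #Running;
SELECT RecordId, ClaimNO, TotalPaid
    ,RowNum = ROW_NUMBER() OVER (ORDER BY RandID)
    ,RunningPaid = SUM(TotalPaid) OVER (ORDER BY RandID ROWS UNBOUNDED PRECEDING)
INTO #Running
FROM #Shuffled;

/* Cut the shuffled list where the running total is closest to the target */
DECLARE @CutOff int;
SELECT TOP (1) @CutOff = RowNum
FROM #Running
ORDER BY ABS(RunningPaid - @DesiredSum);

/* Report how close the sample lands on each of the two criteria */
SELECT SampleClaims = COUNT(*)
    ,ClaimSharePct = CAST(100.0 * COUNT(*) / NULLIF(@PopulationClaims, 0) AS numeric(9,2))
    ,SamplePaid = SUM(TotalPaid)
    ,PaidSharePct = CAST(100.0 * SUM(TotalPaid) / NULLIF(@PopulationPaid, 0) AS numeric(9,2))
FROM #Running
WHERE RowNum <= @CutOff;
```

If ClaimSharePct lands too far from 2.5, one option is to wrap this in a retry loop like the one above and accept a shuffle only when both shares fall inside your margins. Whether such a shuffle turns up in a reasonable number of tries depends on how the paid amounts are distributed.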