How to use an OVER (PARTITION BY ...) query in SQL to get the current, average, and maximum value?

Question

I have this table, which shows the points a device has completed in a given area and at a specific location.

working_date    device   points   area   location
19-06-2020        a        1       x       xa   
19-06-2020        a        2       x       xa 
19-06-2020        a        3       x       xa 
19-06-2020        a        4       x       xa
20-06-2020        a        5       x       xa
20-06-2020        a        6       x       xa
20-06-2020        a        7       x       xa
20-06-2020        a        8       x       xa
20-06-2020        a        9       x       xa

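I want to get the current, average, and maximum number of points, grouped by area and location. If I pick any day, the current quantity should show the quantity for the most recent working date. The average quantity should show the overall average across the dates on which the device worked. Finally, the maximum quantity should show the overall maximum number of points the device completed.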

Based on my table above, if I choose 21-06-2020, the expected result is:

working_date  area  location   device   current_qty  avg_qty   max_qty
21-06-2020     x       xa        a         5           4,5        5

The average quantity comes from total_qty / total_of_date, and the maximum quantity is the maximum quantity across all dates.

The query I have built so far is:

select t1.working_date, t1.device, t1.area, t1.location, t1.points_qty, t1.total_date,
sum(t1.pile_qty) over(partition by t1.working_date) / sum(t1.total_date) over(partition by t1.working_date) as avg_qty,
max(t1.pile_qty) over(partition by t1.working_date) as max_qty
from (
select working_date, device, points, area, location, count(points) as points_qty, count(distinct working_date) as total_date 
from table1 group by device, area, location
group by working_date, device, points, area, location) t1
group by working_date, device, points, area, location, pile_qty, total_date

With the query above, I get:

working_date  area  location   device   current_qty  avg_qty   max_qty
21-06-2020     x       xa        a         5           5          5

How should I write the query to get the desired result?

Thanks in advance.

Answer 1

demo:db<>fiddle

SELECT
    *,
    AVG(current_qty) OVER () as avg_qty,             -- 2
    MAX(current_qty) OVER () as max_qty
FROM (
    SELECT 
        working_date,
        area,
        location,
        device,
        COUNT(*) as current_qty                      -- 1
    FROM mytable
    GROUP BY working_date, device, area, location    -- 1
) s
WHERE working_date <= '2020-06-21'                   -- 3
ORDER BY working_date DESC
LIMIT 1

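1. Group normally by working_date to compute the qty value for each date.
2. Using these qty values over the whole grouped data set, add the avg and max quantities to each record with an unbounded window function (OVER () with no PARTITION BY).
3. Find the most recent data set for the given date: filter all records whose date is the same or earlier, order them so that the most recent date comes first, and return only the topmost record using LIMIT.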
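
To try the three steps without creating a table, here is a minimal sketch that feeds the sample rows from the question through a VALUES list. The CTE name per_date and the inline VALUES data are assumptions added for illustration; the rest follows the query above. Because WHERE is applied before the window functions at the same query level, avg_qty and max_qty only cover dates up to the selected one, which for the sample data is every date.

```sql
-- Sample rows copied from the question; building them with VALUES is an
-- assumption made so the query runs without an existing table.
WITH mytable (working_date, device, points, area, location) AS (
    VALUES
        (DATE '2020-06-19', 'a', 1, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 2, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 3, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 4, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 5, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 6, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 7, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 8, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 9, 'x', 'xa')
),
per_date AS (                                    -- step 1: one row per date
    SELECT working_date, area, location, device,
           COUNT(*) AS current_qty
    FROM mytable
    GROUP BY working_date, device, area, location
)
SELECT
    *,
    AVG(current_qty) OVER () AS avg_qty,         -- step 2: over all remaining rows
    MAX(current_qty) OVER () AS max_qty
FROM per_date
WHERE working_date <= DATE '2020-06-21'          -- step 3: latest date up to the input
ORDER BY working_date DESC
LIMIT 1;
```

For the sample data this should return a single row for 2020-06-20 with current_qty 5, avg_qty 4.5, and max_qty 5. Note that the returned working_date is the latest date that has data, not the date that was passed in.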

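The grouping only works as intended if area, location, and device are the same on every record, as in the example. If they differ, you can use COUNT() as a window function instead of a group aggregate to add the value to each record: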

demo:db<>fiddle

SELECT
    *,
    AVG(current_qty) OVER () as avg_qty,
    MAX(current_qty) OVER () as max_qty
FROM (
    SELECT 
        working_date,
        area,
        location,
        device,
        COUNT(*) OVER (PARTITION BY working_date) as current_qty
    FROM mytable
) s
WHERE working_date <= '2020-06-21'
ORDER BY working_date DESC
LIMIT 1

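In that case, however, it is not clear which of the five records in the 2020-06-20 group should be returned. You have to apply your own sort criteria to bring the expected record to the top.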
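
One thing to watch in the window-function variant: AVG(current_qty) OVER () now runs over every source row rather than one row per date, so dates with more rows weigh more heavily (for the sample data, 41/9 ≈ 4.56 instead of 4.5). The sketch below is one possible way around both issues, not the only one. It computes the per-date statistics separately and joins them back, then breaks the tie with an explicit ORDER BY. The CTE names per_date and stats, the inline VALUES data, and the choice of points DESC as the tie-breaker are all assumptions made for illustration.

```sql
-- Sample rows copied from the question (assumption: inlined via VALUES).
WITH mytable (working_date, device, points, area, location) AS (
    VALUES
        (DATE '2020-06-19', 'a', 1, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 2, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 3, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 4, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 5, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 6, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 7, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 8, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 9, 'x', 'xa')
),
per_date AS (
    -- One row per date, so every date counts once in the average.
    SELECT working_date, COUNT(*) AS current_qty
    FROM mytable
    GROUP BY working_date
),
stats AS (
    -- Overall average and maximum of the per-date counts.
    SELECT working_date, current_qty,
           AVG(current_qty) OVER () AS avg_qty,
           MAX(current_qty) OVER () AS max_qty
    FROM per_date
)
SELECT m.working_date, m.area, m.location, m.device,
       s.current_qty, s.avg_qty, s.max_qty
FROM mytable m
JOIN stats s ON s.working_date = m.working_date
WHERE m.working_date <= DATE '2020-06-21'
ORDER BY m.working_date DESC,
         m.points DESC          -- hypothetical tie-breaker: highest point first
LIMIT 1;
```

Here the statistics are computed before the date filter, so avg_qty and max_qty cover all dates in the table. If they should only cover dates up to the selected one, the same WHERE condition would have to be applied inside per_date as well.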

Answer 2

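I think I have a solution that fits your case. However, I am not sure whether it gives correct results in other situations. My code is below; please check the link: DB-FIDDLE LINK.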

WITH CTE AS
    (
      SELECT working_date,area,location,device, 
             COUNT(working_date) GrpCount
      FROM MYTable 
      GROUP BY working_date,area,location,device
    
    ),y AS
    (SELECT area,location,device,GrpCount,
           (SELECT GrpCount FROM CTE WHERE working_date<TO_DATE('21-06-2020','DD-MM-YYYY') ORDER BY working_date DESC LIMIT 1)  current_qty  
    FROM CTE
    )
    SELECT TO_DATE('21-06-2020','DD-MM-YYYY'),area,location,device, 
           MAX(current_qty) current_qty,
           string_agg(GrpCount::text, ',') avg_qty,
           Max(GrpCount) max_qty
    FROM Y
    GROUP BY area,location,device

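Note: as you can see, for current_qty I used your input date 21-06-2020 in the subquery (SELECT GrpCount FROM CTE WHERE working_date<TO_DATE('21-06-2020','DD-MM-YYYY') ORDER BY working_date DESC LIMIT 1) current_qty to find the current quantity. It gives your expected result. Please test the code with different date ranges and data ranges.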
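
A caveat on the query above: string_agg(GrpCount::text, ',') concatenates the per-date counts into the text '4,5', which only looks like the expected average 4,5 and is not a numeric mean. The subquery for current_qty is also not correlated with area, location, and device, so it returns the same value for every group. The following is a hedged variant, assuming a numeric average is what is wanted. The lowercase names cte and grp_count, the inline VALUES data, and the switch from < to <= (so that data on the selected date itself is included) are choices made for this sketch rather than part of the original answer.

```sql
-- Sample rows copied from the question (assumption: inlined via VALUES).
WITH mytable (working_date, device, points, area, location) AS (
    VALUES
        (DATE '2020-06-19', 'a', 1, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 2, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 3, 'x', 'xa'),
        (DATE '2020-06-19', 'a', 4, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 5, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 6, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 7, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 8, 'x', 'xa'),
        (DATE '2020-06-20', 'a', 9, 'x', 'xa')
),
cte AS (
    -- Count of rows per date and group, as in the answer above.
    SELECT working_date, area, location, device,
           COUNT(*) AS grp_count
    FROM mytable
    GROUP BY working_date, area, location, device
)
SELECT DATE '2020-06-21' AS working_date,
       c.area, c.location, c.device,
       -- Latest per-date count for this group up to the selected date.
       (SELECT c2.grp_count
        FROM cte c2
        WHERE c2.area = c.area
          AND c2.location = c.location
          AND c2.device = c.device
          AND c2.working_date <= DATE '2020-06-21'
        ORDER BY c2.working_date DESC
        LIMIT 1) AS current_qty,
       AVG(c.grp_count) AS avg_qty,   -- numeric mean instead of string_agg
       MAX(c.grp_count) AS max_qty
FROM cte c
GROUP BY c.area, c.location, c.device;
```

With the sample data this should produce one row: 2020-06-21, x, xa, a, with current_qty 5, avg_qty 4.5, and max_qty 5.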

Tags: sql, postgresql, window-functions