6 个最高最小平均元素 postgresql

Question

我需要计算平均价格并将它们按 2 列分组。然后 select 前 2 个值 (PostgreSQL 10.1)。例如，我有以下结构：

------------------------------------------------------------------------------------------
        category        | shop_name |     price |      date     |
MSI GeForce RTX 2080    |amazon     |   62649   |   1/6/2019    |   
MSI GeForce RTX 2080    |amazon     |   58668   |   1/17/2019   |   
MSI GeForce RTX 2080    |amazon     |   62649   |   1/7/2019    |   
MSI GeForce RTX 2080    |amazon     |   60542   |   1/16/2019   |   
MSI GeForce RTX 2080    |amazon     |   62649   |   1/5/2019    |   
MSI GeForce RTX 2080    |brandstar  |   66456   |   1/16/2019   |   
MSI GeForce RTX 2080    |brandstar  |   66347   |   1/17/2019   |   
MSI GeForce RTX 2080    |brandstar  |   66456   |   1/16/2019   |   
MSI GeForce RTX 2080    |brigo      |   63300   |   1/17/2019   |   
MSI GeForce RTX 2080    |brigo      |   65330   |   1/16/2019   |   
MSI GeForce RTX 2080    |brigo      |   65330   |   1/16/2019   |
MSI GeForce RTX 2070    | fake_shop |   65330   |   1/16/2019   |
MSI GeForce RTX 2070    | fake_shop |   65330   |   1/17/2019   |
MSI GeForce RTX 2070    | fake_shop |   65330   |   1/18/2019   |

假设我想 select 类别的前 2 个平均结果和 shop_name。所以我希望得到以下结果：

        category        | shop_name |     price |      date     |     avg   |
MSI GeForce RTX 2080    |amazon     |   62649   |   1/6/2019    |   61431.4 |1
MSI GeForce RTX 2080    |amazon     |   58668   |   1/17/2019   |   61431.4 |1  
MSI GeForce RTX 2080    |amazon     |   62649   |   1/7/2019    |   61431.4 |1  
MSI GeForce RTX 2080    |amazon     |   60542   |   1/16/2019   |   61431.4 |1  
MSI GeForce RTX 2080    |amazon     |   62649   |   1/5/2019    |   61431.4 |1  
MSI GeForce RTX 2080    |brandstar  |   66456   |   1/16/2019   |   66419.66667 |  3
MSI GeForce RTX 2080    |brandstar  |   66347   |   1/17/2019   |   66419.66667 |  3
MSI GeForce RTX 2080    |brandstar  |   66456   |   1/16/2019   |   66419.66667 |  3
MSI GeForce RTX 2080    |brigo      |   63300   |   1/17/2019   |   64653.33333 |  2
MSI GeForce RTX 2080    |brigo      |   65330   |   1/16/2019   |   64653.33333 |  2
MSI GeForce RTX 2080    |brigo      |   65330   |   1/16/2019   |   64653.33333 |  2
MSI GeForce RTX 2070    | fake_shop |   65330   |   1/16/2019   |   65330   | 1
MSI GeForce RTX 2070    | fake_shop |   65330   |   1/17/2019   |   65330   | 1
MSI GeForce RTX 2070    | fake_shop |   65330   |   1/18/2019   |   65330   | 1

然后我想选择排名小于 3 的行。

但我得到以下结果：

    ---------------------------------------------------------------------------------------------
    MSI GeForce RTX 2080    |amazon     |   62649   |   1/6/2019    |   61431.4 |   1   |
    MSI GeForce RTX 2080    |amazon     |   58668   |   1/17/2019   |   61431.4 |   1   |
    MSI GeForce RTX 2080    |amazon     |   62649   |   1/7/2019    |   61431.4 |   1   |
    MSI GeForce RTX 2080    |amazon     |   60542   |   1/16/2019   |   61431.4 |   1   |
    MSI GeForce RTX 2080    |amazon     |   62649   |   1/5/2019    |   61431.4 |   1   |
    MSI GeForce RTX 2080    |brandstar  |   66456   |   1/16/2019   |   66419.66667 |   1   |
    MSI GeForce RTX 2080    |brandstar  |   66347   |   1/17/2019   |   66419.66667 |   1   |
    MSI GeForce RTX 2080    |brandstar  |   66456   |   1/16/2019   |   66419.66667 |   1   |
    MSI GeForce RTX 2080    |brigo      |   63300   |   1/17/2019   |   64653.33333 |   1   |
    MSI GeForce RTX 2080    |brigo      |   65330   |   1/16/2019   |   64653.33333 |   1   |
    MSI GeForce RTX 2080    |brigo      |   65330   |   1/16/2019   |   64653.33333 |   1   |
    MSI GeForce RTX 2070    | fake_shop |   65330   |   1/16/2019   |   65330   | 1
    MSI GeForce RTX 2070    | fake_shop |   65330   |   1/17/2019   |   65330   | 1
    MSI GeForce RTX 2070    | fake_shop |   65330   |   1/18/2019   |   65330   | 1

这是我的 SQL 查询：

SELECT tt.category,
       tt.shop_name,
       tt.price,
       tt.updated,
       tt.avg_price,
       rank() OVER (PARTITION BY tt.category,
                                 tt.shop_name,
                                 tt.avg_price
                    ORDER BY tt.avg_price DESC)
FROM
  ( SELECT category,
           LOWER(shop_name) AS shop_name,
           CAST (price AS INTEGER) AS price,
                DATE(updated) AS updated,
                avg(price) OVER (PARTITION BY category,
                                              LOWER(shop_name)) AS avg_price
   FROM prices ) AS tt

Answer 1

只需使用 AVG() OVER () 后跟 DENSE_RANK():

WITH cte1 AS (
    SELECT *, AVG(price) OVER (PARTITION BY category, shop_name) AS avg_price
    FROM prices
), cte2 AS (
    SELECT *, DENSE_RANK() OVER (PARTITION BY category ORDER BY avg_price) AS rnk
    FROM cte1
)
SELECT *
FROM cte2
WHERE rnk <= 2
ORDER BY category, shop_name

Answer 2

我想你想要：

select tt.category,  tt.shop_name, tt.price,  tt.updated, tt.avg_price,
       dense_rank() over (partition by tt.category order by tt.avg_price desc)
from (select category, lower(shop_name) as shop_name,
             (price::int) as price, updated::date as updated,
             avg(price) over (partition by category,  lower(shop_name)) as avg_price 
      from prices
     ) tt

我简化了一些逻辑，但主要变化是 partition by 用于 rank()。您似乎想要每家商店的排名。 dense_rank()也比较合适。

如果要区分超额价格相同的品类：

       dense_rank() over (partition by tt.shop_name order by tt.avg_price desc, category)

6 个最高最小平均元素 postgresql

6 top min average elements postgresql

sql

postgresql

greatest-n-per-group

window-functions