SQL - Calculate percentage by group, for multiple groups

I have a table in GBQ (Google BigQuery) with the following format:

UserId  Orders  Month  
 XDT     23      1
 XDT     0       4     
 FKR     3       6
 GHR     23      4
 ...     ...    ...

It shows the number of orders per user per month.

I want to calculate the percentage of users who have orders, which I did as follows:

SELECT
  HasOrders,
  ROUND(COUNT(*) * 100 / CAST( SUM(COUNT(*)) OVER () AS float64), 2) Parts
FROM (
    SELECT
        *,
        CASE WHEN Orders = 0 THEN 0 ELSE 1 END AS HasOrders
    FROM `Table` ) 
GROUP BY
  HasOrders
ORDER BY
  Parts

It gives me the following result:

HasOrders   Parts
   0         35
   1         65

I need to calculate the percentage of users who have orders by month, with each month adding up to 100%.

Currently, to do this I run the query once per month, which is not practical:

SELECT
  HasOrders,
  ROUND(COUNT(*) * 100 / CAST( SUM(COUNT(*)) OVER () AS float64), 2) Parts
FROM (
    SELECT
        *,
        CASE WHEN Orders = 0 THEN 0 ELSE 1 END AS HasOrders
    FROM `Table` ) 
WHERE Month = 1
GROUP BY
  HasOrders
ORDER BY
  Parts

Is there a way to run a single query and get this result?

HasOrders   Parts   Month
   0         25      1
   1         75      1
   0         45      2
   1         55      2
  ...       ...     ...

SELECT
    SIGN(Orders) AS HasOrders,
    -- the window total is per month, so each month adds up to 100%
    ROUND(COUNT(*) * 100.000 / SUM(COUNT(*)) OVER (PARTITION BY Month), 2) AS Parts,
    Month
FROM T
GROUP BY Month, SIGN(Orders)
ORDER BY Month, SIGN(Orders)

Demo on Postgres: https://dbfiddle.uk/?rdbms=postgres_10&fiddle=4cd2d1455673469c2dfc060eccea8020
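
To sanity-check what the windowed query computes, here is a minimal Python sketch of the same arithmetic: count rows per (Month, HasOrders), then divide by that month's own row count. The function name `percent_by_month` and the sample rows are made up for illustration and are not taken from the question's table. Like the question's CASE expression, it treats any non-zero Orders value as "has orders".

```python
from collections import Counter

def percent_by_month(rows):
    """rows: iterable of (user_id, orders, month) tuples.

    Returns {(month, has_orders): percent}, where each percentage is
    relative to the number of rows in that month only.
    """
    group_counts = Counter()  # rows per (month, has_orders)
    month_counts = Counter()  # rows per month, the PARTITION BY total
    for _user_id, orders, month in rows:
        has_orders = 0 if orders == 0 else 1
        group_counts[(month, has_orders)] += 1
        month_counts[month] += 1
    return {
        (month, has_orders): round(n * 100 / month_counts[month], 2)
        for (month, has_orders), n in sorted(group_counts.items())
    }

# Hypothetical sample data (assumption, not from the question).
sample = [
    ("XDT", 23, 1), ("FKR", 0, 1), ("GHR", 5, 1), ("ABC", 2, 1),
    ("XDT", 0, 4), ("GHR", 23, 4),
]
result = percent_by_month(sample)
```

With this sample, month 1 has four rows of which one has zero orders, and month 4 has two rows split evenly, so each month's two shares add up to 100 independently of the other month.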

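You stated that it is important for the total to be 100%. That becomes a problem when a percentage falls exactly on an odd multiple of 0.5%, because ordinary rounding then rounds both groups up. In those cases you could round the no-orders group down and the has-orders group up, or rounding toward even or rounding the smaller share down may be better options: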

WITH DATA AS (
    SELECT SIGN(Orders) AS HasOrders, Month,
        COUNT(*) * 100.000 / SUM(COUNT(*)) OVER (PARTITION BY Month) AS PartsPercent
    FROM T
    GROUP BY Month, SIGN(Orders)
)
SELECT HasOrders, Month, PartsPercent,
    PartsPercent - TRUNC(PartsPercent) AS Fraction,
    -- no-orders group always rounds down, has-orders group always rounds up
    CASE WHEN HasOrders = 0
         THEN FLOOR(PartsPercent) ELSE CEILING(PartsPercent)
    END AS PartsRound0Down,
    -- exact halves go to the even neighbor
    CASE WHEN PartsPercent - TRUNC(PartsPercent) = 0.5
              AND MOD(TRUNC(PartsPercent), 2) = 0
         THEN FLOOR(PartsPercent) ELSE ROUND(PartsPercent) -- halfway up
    END AS PartsRoundTowardEven,
    -- exact halves below 50% go down, everything else rounds normally
    CASE WHEN PartsPercent - TRUNC(PartsPercent) = 0.5 AND PartsPercent < 50
         THEN FLOOR(PartsPercent) ELSE ROUND(PartsPercent) -- halfway up
    END AS PartsSmallestTowardZero
FROM DATA
ORDER BY Month, HasOrders

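Testing floating-point values for equality is generally not recommended, and I don't know how BigQuery's float64 will compare against 0.5. One half is, however, exactly representable in binary. See these in action for a case where the breakdown would otherwise be 101 vs. 99. I don't have ready access to BigQuery, so be aware that Postgres's rounding behavior may differ: https://dbfiddle.uk/?rdbms=postgres_10&fiddle=c8237e272427a0d1114c3d8056a01a09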
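
The rounding problem can be reproduced with plain arithmetic. The sketch below is a Python illustration, not BigQuery or Postgres behavior: it assumes a hypothetical month with 1 user without orders and 7 with orders (12.5% and 87.5%), uses `Fraction` so the comparison against one half is exact, and mirrors the three CASE expressions with helper names invented for this example.

```python
from fractions import Fraction
import math

HALF = Fraction(1, 2)

def round_half_up(x):
    # Nearest integer, exact halves go up (valid for non-negative x).
    return math.floor(x + HALF)

def round_toward_even(x):
    # Exact halves go to the even neighbor, otherwise round half up.
    whole = math.floor(x)
    if x - whole == HALF and whole % 2 == 0:
        return whole
    return round_half_up(x)

def round_smallest_down(x):
    # Exact halves below 50 go down, otherwise round half up.
    if x - math.floor(x) == HALF and x < 50:
        return math.floor(x)
    return round_half_up(x)

# Hypothetical counts for one month: {HasOrders: number of users}.
counts = {0: 1, 1: 7}
total = sum(counts.values())
percent = {k: Fraction(n * 100, total) for k, n in counts.items()}

naive = {k: round_half_up(p) for k, p in percent.items()}
round0_down = {k: math.floor(p) if k == 0 else math.ceil(p)
               for k, p in percent.items()}
toward_even = {k: round_toward_even(p) for k, p in percent.items()}
smallest_down = {k: round_smallest_down(p) for k, p in percent.items()}

for name, parts in [("naive", naive), ("round0_down", round0_down),
                    ("toward_even", toward_even),
                    ("smallest_down", smallest_down)]:
    print(name, parts, "total:", sum(parts.values()))
```

Plain half-up rounding turns 12.5 and 87.5 into 13 and 88, which is the 101 case; each of the three alternatives keeps the pair at 12 and 88.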

Consider the approach below:

select hasOrders, round(100 * parts, 2) as parts, month from (
  select month, 
    countif(orders = 0) / count(*) `0`,
    countif(orders > 0) / count(*) `1`,
  from your_table
  group by month
)
unpivot (parts for hasOrders in (`0`, `1`))          

The output looks like this: