Count Distinct Window Function with Groupby

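I have a table containing a user name, a market, and a purchase_id. I am trying to use window functions in SnowSql, without a subquery, to count the distinct purchases made by each user as well as the total number of unique purchases in the market.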

Initial table

User       | Market | Purchase_ID
:--------- | :----- | ----------:
John Smith | NYC    |           1
John Smith | NYC    |           2
Bob Miller | NYC    |           2
Bob Miller | NYC    |           4
Tim Wilson | NYC    |           3

The desired result looks like this:

User       | Purchases | Unique Market Purchases
:--------- | --------: | ----------------------:
John Smith |         2 |                       4
Bob Miller |         2 |                       4
Tim Wilson |         1 |                       4

The query I tried without a subquery is shown below, but it fails with a GROUP BY error.

SELECT 
  user,
  COUNT(DISTINCT purchase_id),
  COUNT(DISTINCT purchase_id) OVER (partition by market)
FROM table
GROUP BY 1

Any help with this is appreciated. Thanks!

I don't think you can do this simply as an aggregation. But you can get the answer like this:

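SELECT user,
       -- distinct purchases per user: count the first row of each (user, purchase_id) pair
       SUM( (seqnum_user = 1)::INT ) as purchases,
       -- distinct purchases per market: count the first row of each (market, purchase_id) pair
       SUM(SUM( (seqnum_market = 1)::INT )) OVER (PARTITION BY market) as market_purchases
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY user, purchase_id ORDER BY purchase_id) as seqnum_user,
             ROW_NUMBER() OVER (PARTITION BY market, purchase_id ORDER BY purchase_id) as seqnum_market
      FROM table t
     ) t
GROUP BY user, market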
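
The ROW_NUMBER idea can be tried against the sample data with a self-contained sketch. It numbers the rows twice, because a purchase shared by two users (purchase 2 here) has to count once for each user but only once for the market. The CTE names `purchase_rows` and `numbered` and the column name `user_name` are illustrative assumptions, not names from the question, and the query still relies on a derived table, as the answer above does.

```sql
-- Sample rows from the question (names are illustrative).
WITH purchase_rows AS (
    SELECT 'John Smith' AS user_name, 'NYC' AS market, 1 AS purchase_id
    UNION ALL SELECT 'John Smith', 'NYC', 2
    UNION ALL SELECT 'Bob Miller', 'NYC', 2
    UNION ALL SELECT 'Bob Miller', 'NYC', 4
    UNION ALL SELECT 'Tim Wilson', 'NYC', 3
),
numbered AS (
    SELECT p.*,
           -- 1 for the first row of each (user, purchase) pair
           ROW_NUMBER() OVER (PARTITION BY user_name, purchase_id ORDER BY purchase_id) AS seq_user,
           -- 1 for the first row of each (market, purchase) pair
           ROW_NUMBER() OVER (PARTITION BY market, purchase_id ORDER BY user_name) AS seq_market
    FROM purchase_rows p
)
SELECT user_name,
       -- distinct purchases per user
       SUM(CASE WHEN seq_user = 1 THEN 1 ELSE 0 END) AS purchases,
       -- per-user flags summed across the whole market
       SUM(SUM(CASE WHEN seq_market = 1 THEN 1 ELSE 0 END)) OVER (PARTITION BY market) AS market_purchases
FROM numbered
GROUP BY user_name, market
ORDER BY user_name;
-- Bob Miller | 2 | 4
-- John Smith | 2 | 4
-- Tim Wilson | 1 | 4
```

Grouping by both the user and the market is what makes `PARTITION BY market` legal in the outer query; a user who buys in several markets would get one row per market.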

DISTINCT is not allowed in window functions, so you need to use a subquery:

CREATE TABLE table1
    (`User` varchar(10), `Market` varchar(3), `Purchase_ID` int)
;
    
INSERT INTO table1
    (`User`, `Market`, `Purchase_ID`)
VALUES
    ('John Smith', 'NYC', 1),
    ('John Smith', 'NYC', 2),
    ('Bob Miller', 'NYC', 2),
    ('Bob Miller', 'NYC', 4),
    ('Tim Wilson', 'NYC', 3)
;

SELECT 
  user,
  COUNT(DISTINCT purchase_id)
  ,MAX((SELECT COUNT(DISTINCT purchase_id) FROM table1 WHERE `Market` = t1.`Market` )) bymarket
FROM table1 t1
GROUP BY 1
;

user       | COUNT(DISTINCT purchase_id) | bymarket
:--------- | --------------------------: | -------:
Bob Miller |                           2 |        4
John Smith |                           2 |        4
Tim Wilson |                           1 |        4

SELECT 
  user,
  COUNT(DISTINCT purchase_id)
  ,MAX(countr) bymarket
FROM table1 t1
INNER JOIN (SELECT `Market`,COUNT(DISTINCT purchase_id) countr FROM table1 GROUP BY  `Market` ) ta ON t1.`Market` = ta.`Market`
GROUP BY 1
;

user       | COUNT(DISTINCT purchase_id) | bymarket
:--------- | --------------------------: | -------:
Bob Miller |                           2 |        4
John Smith |                           2 |        4
Tim Wilson |                           1 |        4


This might work. You may have to wrangle the output into the format you want, but it produces the answer without a subquery.

It uses the awesome GROUPING SETS, which allows multiple GROUP BY clauses in a single statement, and that is exactly where your error came from :-).

Good question!

  SELECT 
      COUNT(DISTINCT PURCHASE_ID)  
    , USER_NAME
    , MARKET
 FROM 
    CTE
  GROUP BY 
    GROUPING SETS (USER_NAME, MARKET);

Copy | Paste | Run

WITH CTE AS (SELECT 'JOHN SMITH' USER_NAME, 'NYC' MARKET,   1 PURCHASE_ID
UNION SELECT 'JOHN SMITH' USER_NAME,    'NYC' MARKET,   2 PURCHASE_ID
UNION SELECT 'BOB MILLER' USER_NAME,    'NYC' MARKET,   2 PURCHASE_ID
UNION SELECT 'BOB MILLER' USER_NAME,    'NYC' MARKET,   4 PURCHASE_ID
UNION SELECT 'TIM WILSON' USER_NAME,    'NYC' MARKET,   3 PURCHASE_ID) 

SELECT 
      COUNT(DISTINCT PURCHASE_ID)  
    , USER_NAME
    , MARKET
FROM 
    CTE
GROUP BY 
    GROUPING SETS (USER_NAME, MARKET);
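
With the sample data, the statement above returns one row per user plus one row per market, and the column that was not part of the grouping set comes back as NULL. As a starting point for the wrangling, the sketch below adds a flag that tells the two kinds of rows apart. It assumes Snowflake's GROUPING function, which returns 0 when the row is grouped on the given column and 1 when it is not; the aliases `DISTINCT_PURCHASES` and `IS_MARKET_ROW` are illustrative names.

```sql
WITH CTE AS (
    SELECT 'JOHN SMITH' USER_NAME, 'NYC' MARKET, 1 PURCHASE_ID
    UNION SELECT 'JOHN SMITH', 'NYC', 2
    UNION SELECT 'BOB MILLER', 'NYC', 2
    UNION SELECT 'BOB MILLER', 'NYC', 4
    UNION SELECT 'TIM WILSON', 'NYC', 3
)
SELECT
      COUNT(DISTINCT PURCHASE_ID) AS DISTINCT_PURCHASES
    , USER_NAME
    , MARKET
    , GROUPING(USER_NAME) AS IS_MARKET_ROW  -- 1 when the row is a per-market total
FROM
    CTE
GROUP BY
    GROUPING SETS (USER_NAME, MARKET)
ORDER BY
    IS_MARKET_ROW, USER_NAME;
-- DISTINCT_PURCHASES | USER_NAME  | MARKET | IS_MARKET_ROW
--                  2 | BOB MILLER | NULL   |             0
--                  2 | JOHN SMITH | NULL   |             0
--                  1 | TIM WILSON | NULL   |             0
--                  4 | NULL       | NYC    |             1
```

The per-user counts and the market total end up in separate rows rather than side by side, so getting the exact layout from the question still takes one more step, such as a join or pivot on the client side.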