Self join without double counting sales
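I am trying to build a market-basket query. Table_data holds every transaction for every item that shares a basket with the Host_Category (category 123 in this example). However, my current SQL duplicates the attached items' sales, because I am joining the table to itself with a full join and a transaction sometimes contains more than one item from the same host category (as in the example below):
How can I change the query to get the desired output?
My query is as follows: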
SELECT Host_Category,Attached_Item
,Count(DISTINCT Trx_NBR || Location_NBR || Trx_Date) AS Shared_Trx_Count
,Sum(Host_Sales) AS Host_Sales
,Sum(Attached_Sales) AS Attached_Sales
FROM (SELECT a.Trx_NBR, a.Trx_Date, a.Location_NBR
,CASE WHEN a.Category IN (123) THEN '123' END AS Host_Category
,a.sales AS Host_Sales
,b.Item AS Attached_Item, b.sales AS Attached_Sales
FROM table_data a FULL JOIN table_data b
ON a.Trx_NBR = b.Trx_NBR AND a.Location_NBR = b.Location_NBR AND a.Trx_Date = b.Trx_Date
WHERE a.Category IN (123) ) AS c GROUP BY 1,2
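You could simply use MIN/MAX or some other aggregate function instead of SUM for the attached sales, which removes the duplication within a single transaction (across several transactions MIN returns only the smallest single sale, not the total):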
SELECT Host_Category,Attached_Item
,Count(DISTINCT Trx_NBR || Location_NBR || Trx_Date) AS Shared_Trx_Count
,Sum(Host_Sales) AS Host_Sales
,min(Attached_Sales) AS Attached_Sales
FROM (SELECT a.Trx_NBR, a.Trx_Date, a.Location_NBR
,CASE WHEN a.Category IN (123) THEN '123' END AS Host_Category
,a.sales AS Host_Sales
,b.Item AS Attached_Item, b.sales AS Attached_Sales
FROM table_data a FULL JOIN table_data b
ON a.Trx_NBR = b.Trx_NBR AND a.Location_NBR = b.Location_NBR AND a.Trx_Date = b.Trx_Date
WHERE a.Category IN (123) ) AS c GROUP BY 1,2
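To see how SUM and MIN behave on the joined rows, here is a minimal sketch that runs the same self-join shape through Python's built-in sqlite3 module. The sample rows, the item names H1/H2/A1 and the sales figures are made up for illustration (the original example table is not available), and SQLite is only a stand-in for whatever database the question uses.

```python
import sqlite3

# Hypothetical sample data; the original post's example table is not available.
rows = [
    # Trx_NBR, Trx_Date, Location_NBR, Category, Item, sales
    (1, "2020-01-01", 10, 123, "H1", 5),
    (1, "2020-01-01", 10, 123, "H2", 7),  # second host item in the same basket
    (1, "2020-01-01", 10, 456, "A1", 3),
    (2, "2020-01-02", 10, 123, "H1", 5),
    (2, "2020-01-02", 10, 456, "A1", 4),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_data (Trx_NBR INT, Trx_Date TEXT, Location_NBR INT,"
    " Category INT, Item TEXT, sales INT)"
)
conn.executemany("INSERT INTO table_data VALUES (?, ?, ?, ?, ?, ?)", rows)

# Same self-join shape as the query above, reduced to the attached-sales column.
# LEFT JOIN stands in for FULL JOIN; the WHERE clause on a makes them equivalent.
sum_sales, min_sales = conn.execute(
    """
    SELECT SUM(b.sales), MIN(b.sales)
    FROM table_data a
    LEFT JOIN table_data b
      ON a.Trx_NBR = b.Trx_NBR
     AND a.Location_NBR = b.Location_NBR
     AND a.Trx_Date = b.Trx_Date
    WHERE a.Category IN (123) AND b.Item = 'A1'
    """
).fetchone()

# Reference value: what item A1 actually sold across both baskets.
(true_sales,) = conn.execute(
    "SELECT SUM(sales) FROM table_data WHERE Item = 'A1'"
).fetchone()

print(sum_sales, min_sales, true_sales)  # 10 3 7
```

On this data SUM counts A1's sale in transaction 1 twice (once per host item), while MIN returns only the smallest single sale, so neither matches the true total once the group spans more than one transaction.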
Aggregate before joining so that each host row is unique. This might be what you want:
SELECT Host_Category,Attached_Item
   -- do you still need COUNT(DISTINCT)? Or is COUNT(*) ok?
   ,Count(DISTINCT Trx_NBR || Location_NBR || Trx_Date) AS Shared_Trx_Count
   ,Sum(Host_Sales) AS Host_Sales
   ,Sum(Attached_Sales) AS Attached_Sales
FROM
 (
   SELECT a.*
      ,b.Item AS Attached_Item
      ,b.sales AS Attached_Sales
   FROM
    (
      SELECT a.Trx_NBR, a.Trx_Date, a.Location_NBR
         ,CASE WHEN a.Category IN (123) THEN '123' END AS Host_Category
         ,Sum(a.sales) AS Host_Sales
      FROM table_data AS a
      WHERE a.Category IN (123)
      GROUP BY 1,2,3,4
    ) AS a
   -- your previous Full Join was a Left Join anyway due to the WHERE-condition
   LEFT JOIN table_data b
     ON a.Trx_NBR = b.Trx_NBR
    AND a.Location_NBR = b.Location_NBR
    AND a.Trx_Date = b.Trx_Date
 ) AS c
GROUP BY 1,2
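As a rough check of the aggregate-before-join approach, the sketch below runs an equivalent query through Python's built-in sqlite3 module. The sample rows and the item names H1/H2/A1 are hypothetical, SQLite is only a stand-in for the actual database, and the positional GROUP BY is spelled out with column names for portability.

```python
import sqlite3

# Hypothetical sample data: transaction 1 holds two host items (category 123).
rows = [
    # Trx_NBR, Trx_Date, Location_NBR, Category, Item, sales
    (1, "2020-01-01", 10, 123, "H1", 5),
    (1, "2020-01-01", 10, 123, "H2", 7),
    (1, "2020-01-01", 10, 456, "A1", 3),
    (2, "2020-01-02", 10, 123, "H1", 5),
    (2, "2020-01-02", 10, 456, "A1", 4),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_data (Trx_NBR INT, Trx_Date TEXT, Location_NBR INT,"
    " Category INT, Item TEXT, sales INT)"
)
conn.executemany("INSERT INTO table_data VALUES (?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT Host_Category, Attached_Item,
       COUNT(DISTINCT Trx_NBR || Location_NBR || Trx_Date) AS Shared_Trx_Count,
       SUM(Host_Sales) AS Host_Sales,
       SUM(Attached_Sales) AS Attached_Sales
FROM (
    SELECT a.Trx_NBR, a.Trx_Date, a.Location_NBR, a.Host_Category, a.Host_Sales,
           b.Item AS Attached_Item, b.sales AS Attached_Sales
    FROM (
        -- One row per transaction: host sales are summed before the join.
        SELECT Trx_NBR, Trx_Date, Location_NBR,
               CASE WHEN Category IN (123) THEN '123' END AS Host_Category,
               SUM(sales) AS Host_Sales
        FROM table_data
        WHERE Category IN (123)
        GROUP BY Trx_NBR, Trx_Date, Location_NBR, Host_Category
    ) AS a
    LEFT JOIN table_data b
      ON a.Trx_NBR = b.Trx_NBR
     AND a.Location_NBR = b.Location_NBR
     AND a.Trx_Date = b.Trx_Date
) AS c
GROUP BY Host_Category, Attached_Item
"""

# Map each attached item to (Shared_Trx_Count, Host_Sales, Attached_Sales).
result = {row[1]: row[2:] for row in conn.execute(query)}
print(result["A1"])  # (2, 17, 7)
```

Because the host side is collapsed to one row per transaction before the join, A1's sales are counted once per basket (3 + 4 = 7) and the host sales add up to 12 + 5 = 17. The host items themselves still appear as attached items in the output, since the join on table_data b is not filtered by category.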