Self Join without double counting sales

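I am trying to build a market basket query. Table_data pulls all transactions for every item that shares a basket with the Host_Category (category 123 in this example). However, the SQL I am currently using duplicates the sales of the attached items, because I am full-joining the table to itself and a transaction sometimes contains more than one item in the host category (as in the example below):

How can I change the query to get the desired output?

My query is as follows: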

SELECT Host_Category,Attached_Item
       ,Count(DISTINCT Trx_NBR || Location_NBR || Trx_Date) AS Shared_Trx_Count
       ,Sum(Host_Sales) AS Host_Sales
       ,Sum(Attached_Sales) AS Attached_Sales
FROM (SELECT a.Trx_NBR, a.Trx_Date, a.Location_NBR
             ,CASE WHEN a.Category IN (123) THEN '123' END AS Host_Category
             ,a.sales AS Host_Sales
             ,b.Item AS Attached_Item, b.sales AS Attached_Sales
      FROM table_data a FULL JOIN table_data b
      ON a.Trx_NBR = b.Trx_NBR AND a.Location_NBR = b.Location_NBR AND a.Trx_Date = b.Trx_Date
      WHERE a.Category IN (123) ) AS c GROUP BY 1,2
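
To make the duplication concrete, here is a minimal sketch that runs the same self-join against a tiny in-memory SQLite table through Python's built-in sqlite3 module. The five rows are made up for illustration, and a plain JOIN stands in for the FULL JOIN (with the WHERE filter on a, every surviving row has a host row that matches at least itself, so this should not change the result). Transaction 1 contains two items from category 123, so every item in that basket is joined twice.

```python
import sqlite3

# Hypothetical toy data: columns mirror the names used in the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_data ("
    "Trx_NBR INTEGER, Trx_Date TEXT, Location_NBR INTEGER, "
    "Category INTEGER, Item TEXT, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO table_data VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "2024-01-01", 10, 123, "H1", 5),  # host item
        (1, "2024-01-01", 10, 123, "H2", 7),  # second host item, same basket
        (1, "2024-01-01", 10, 456, "A1", 3),  # attached item
        (2, "2024-01-01", 10, 123, "H1", 5),  # host item
        (2, "2024-01-01", 10, 456, "A1", 4),  # attached item
    ],
)

# Self-join on the transaction key, filtered to host category 123.
fan_out = conn.execute("""
    SELECT b.Item       AS Attached_Item,
           COUNT(*)     AS Joined_Rows,
           SUM(b.sales) AS Attached_Sales
    FROM table_data a
    JOIN table_data b
      ON a.Trx_NBR = b.Trx_NBR
     AND a.Location_NBR = b.Location_NBR
     AND a.Trx_Date = b.Trx_Date
    WHERE a.Category IN (123)
    GROUP BY b.Item
    ORDER BY b.Item
""").fetchall()

# What A1 actually sold across both transactions.
true_a1 = conn.execute(
    "SELECT SUM(sales) FROM table_data WHERE Item = 'A1'"
).fetchone()[0]

for row in fan_out:
    print(row)
print("true A1 sales:", true_a1)
```

In this toy data A1 sold 3 + 4 = 7, but the join reports 10, because the A1 row of transaction 1 is paired with both H1 and H2.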

You can simply use MIN/MAX or some other aggregate function instead of SUM for Attached_Sales, and that will solve the problem:

SELECT Host_Category,Attached_Item
       ,Count(DISTINCT Trx_NBR || Location_NBR || Trx_Date) AS Shared_Trx_Count
       ,Sum(Host_Sales) AS Host_Sales
       ,min(Attached_Sales) AS Attached_Sales
FROM (SELECT a.Trx_NBR, a.Trx_Date, a.Location_NBR
             ,CASE WHEN a.Category IN (123) THEN '123' END AS Host_Category
             ,a.sales AS Host_Sales
             ,b.Item AS Attached_Item, b.sales AS Attached_Sales
      FROM table_data a FULL JOIN table_data b
      ON a.Trx_NBR = b.Trx_NBR AND a.Location_NBR = b.Location_NBR AND a.Trx_Date = b.Trx_Date
      WHERE a.Category IN (123) ) AS c GROUP BY 1,2
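
One thing worth checking before relying on MIN: the outer query groups by Host_Category and Attached_Item only, so MIN is taken across every transaction that contains the item, not per transaction. The sketch below, again using Python's sqlite3 module with made-up rows and a plain JOIN in place of the FULL JOIN, compares SUM, MIN, and the real total for one attached item that appears in two baskets.

```python
import sqlite3

# Hypothetical toy data; A1 is attached in two different transactions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_data ("
    "Trx_NBR INTEGER, Trx_Date TEXT, Location_NBR INTEGER, "
    "Category INTEGER, Item TEXT, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO table_data VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "2024-01-01", 10, 123, "H1", 5),
        (1, "2024-01-01", 10, 123, "H2", 7),
        (1, "2024-01-01", 10, 456, "A1", 3),
        (2, "2024-01-01", 10, 123, "H1", 5),
        (2, "2024-01-01", 10, 456, "A1", 4),
    ],
)

# Same join as the question, restricted to attached item A1.
sum_attached, min_attached = conn.execute("""
    SELECT SUM(b.sales), MIN(b.sales)
    FROM table_data a
    JOIN table_data b
      ON a.Trx_NBR = b.Trx_NBR
     AND a.Location_NBR = b.Location_NBR
     AND a.Trx_Date = b.Trx_Date
    WHERE a.Category IN (123)
      AND b.Item = 'A1'
""").fetchone()

# Real A1 sales, straight from the base table.
true_total = conn.execute(
    "SELECT SUM(sales) FROM table_data WHERE Item = 'A1'"
).fetchone()[0]

print("SUM:", sum_attached, "MIN:", min_attached, "true:", true_total)
```

On this data SUM gives 10, MIN gives 3, and the real total is 7, so MIN appears to match the real total only when an attached item occurs in a single transaction (or always sells for the same amount and only one sale should be counted).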

Aggregate before joining so that the host side is unique per transaction. This is probably what you want:

SELECT Host_Category,Attached_Item
       -- do you still need COUNT(DISTINCT)? Or is COUNT(*) ok?
       ,Count(DISTINCT Trx_NBR || Location_NBR || Trx_Date) AS Shared_Trx_Count
       ,Sum(Host_Sales) as Host_Sales
       ,Sum(Attached_Sales) AS Attached_Sales
FROM
 (
   SELECT a.*
         ,b.Item AS Attached_Item
         ,b.sales AS Attached_Sales
   FROM
    (
      SELECT a.Trx_NBR, a.Trx_Date, a.Location_NBR
          ,CASE WHEN a.Category IN (123) THEN '123' END AS Host_Category
          ,Sum(a.sales) AS Host_Sales
      FROM table_data AS a
      WHERE a.Category IN (123)
      GROUP BY 1,2,3,4
    ) AS a
   -- your previous Full Join was a Left Join anyway due to the WHERE-condition
   LEFT JOIN table_data b
   ON a.Trx_NBR = b.Trx_NBR 
   AND a.Location_NBR = b.Location_NBR
   AND a.Trx_Date = b.Trx_Date
 ) AS c 
GROUP BY 1,2
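
As a quick sanity check of the aggregate-then-join approach, the sketch below runs a query of the same shape against a small made-up table using Python's built-in sqlite3 module. The rows are hypothetical, and the GROUP BY clauses use column names rather than positions purely to keep the sketch portable.

```python
import sqlite3

# Hypothetical toy data: transaction 1 has two host items (H1, H2).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_data ("
    "Trx_NBR INTEGER, Trx_Date TEXT, Location_NBR INTEGER, "
    "Category INTEGER, Item TEXT, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO table_data VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "2024-01-01", 10, 123, "H1", 5),
        (1, "2024-01-01", 10, 123, "H2", 7),
        (1, "2024-01-01", 10, 456, "A1", 3),
        (2, "2024-01-01", 10, 123, "H1", 5),
        (2, "2024-01-01", 10, 456, "A1", 4),
    ],
)

fixed = conn.execute("""
    SELECT h.Host_Category,
           b.Item AS Attached_Item,
           COUNT(DISTINCT h.Trx_NBR || h.Location_NBR || h.Trx_Date)
               AS Shared_Trx_Count,
           SUM(h.Host_Sales) AS Host_Sales,
           SUM(b.sales)      AS Attached_Sales
    FROM (
        -- one row per transaction: host sales are summed before the join
        SELECT Trx_NBR, Trx_Date, Location_NBR,
               CASE WHEN Category IN (123) THEN '123' END AS Host_Category,
               SUM(sales) AS Host_Sales
        FROM table_data
        WHERE Category IN (123)
        GROUP BY Trx_NBR, Trx_Date, Location_NBR, Host_Category
    ) AS h
    LEFT JOIN table_data b
      ON h.Trx_NBR = b.Trx_NBR
     AND h.Location_NBR = b.Location_NBR
     AND h.Trx_Date = b.Trx_Date
    GROUP BY h.Host_Category, b.Item
    ORDER BY b.Item
""").fetchall()

for row in fixed:
    print(row)
```

With these rows, A1 comes out with Attached_Sales of 7 (3 + 4) instead of the inflated 10, and Host_Sales of 17 (12 from transaction 1 plus 5 from transaction 2). Note that the host items H1 and H2 also show up as attached items, because b is not filtered; whether that is wanted depends on the report.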