Duplicate records upon joining table

I'm still quite new to SQL and Tableau, but I'm trying to work my way through a personal project.

Table A: contains the defect quantity for each product category and the date each issue was raised

+--------+-------------+--------------+-----------------+
| Issue# | Date_Raised | Category_ID# | Defect_Quantity | 
+--------+-------------+--------------+-----------------+
| PCR12  | 11-Jan-2019 | Product#1    |              14 |
| PCR13  | 12-Jan-2019 | Product#1    |              54 |
| PCR14  | 5-Feb-2019  | Product#1    |               5 |
| PCR15  | 5-Feb-2019  | Product#2    |               7 | 
| PCR16  | 20-Mar-2019 | Product#1    |              76 | 
| PCR17  | 22-Mar-2019 | Product#2    |               5 | 
| PCR18  | 25-Mar-2019 | Product#1    |              89 | 
+--------+-------------+--------------+-----------------+

Table B: shows the consumed quantity of each product, by month

+-------------+--------------+-------------------+
| Date_Raised | Category_ID# | Consumed_Quantity |
+-------------+--------------+-------------------+
| 5-Jan-2019  | Product#1    | 100               |
| 17-Jan-2019 | Product#1    | 200               |
| 5-Feb-2019  | Product#1    | 100               |
| 8-Feb-2019  | Product#2    | 50                |
| 10-Mar-2019 | Product#1    | 100               |
| 12-Mar-2019 | Product#2    | 50                |
+-------------+--------------+-------------------+

End result

I want to create a table/bar chart in Tableau that shows Defect_Quantity/Consumed_Quantity per month for each Category_ID#, like this:

+----------+-----------+-----------+
|  Month   | Product#1 | Product#2 |
+----------+-----------+-----------+
| Jan-2019 | 23%       |           |
| Feb-2019 | 5%        | 14%       |
| Mar-2019 | 89%       | 10%       |
+----------+-----------+-----------+

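What I've tried so far: unfortunately I haven't really gotten anywhere. I'm struggling to understand how to get rid of the duplicates after joining the tables on Category_ID#.

Thanks for any help I get here.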

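You can remove duplicate rows by using ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) AS RN and keeping only the rows where RN = 1. For your end result, you should extract the month from the date and use a pivot.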
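To make the source of the duplicates concrete, here is a minimal sketch using Python's built-in sqlite3 module with the sample data from the question. The table names table_a and table_b, the column name category_id, and the ISO date strings are assumptions made for the sketch, not the asker's actual schema. Joining on the category alone pairs every defect row with every consumption row of the same product, whatever the month, so the 7 defect rows fan out into 24 joined rows and any SUM over them is inflated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (issue TEXT, date_raised TEXT, category_id TEXT, defect_quantity INTEGER);
    CREATE TABLE table_b (date_raised TEXT, category_id TEXT, consumed_quantity INTEGER);
    INSERT INTO table_a VALUES
        ('PCR12', '2019-01-11', 'Product#1', 14),
        ('PCR13', '2019-01-12', 'Product#1', 54),
        ('PCR14', '2019-02-05', 'Product#1', 5),
        ('PCR15', '2019-02-05', 'Product#2', 7),
        ('PCR16', '2019-03-20', 'Product#1', 76),
        ('PCR17', '2019-03-22', 'Product#2', 5),
        ('PCR18', '2019-03-25', 'Product#1', 89);
    INSERT INTO table_b VALUES
        ('2019-01-05', 'Product#1', 100),
        ('2019-01-17', 'Product#1', 200),
        ('2019-02-05', 'Product#1', 100),
        ('2019-02-08', 'Product#2', 50),
        ('2019-03-10', 'Product#1', 100),
        ('2019-03-12', 'Product#2', 50);
""")

# Join on category only: each defect row matches every consumption row
# of the same product, regardless of month.
joined_rows = conn.execute("""
    SELECT COUNT(*) FROM table_a a
    JOIN table_b b ON b.category_id = a.category_id
""").fetchone()[0]

# Total defects counted from table_a alone vs. counted through the join.
true_defects = conn.execute(
    "SELECT SUM(defect_quantity) FROM table_a"
).fetchone()[0]
inflated_defects = conn.execute("""
    SELECT SUM(a.defect_quantity) FROM table_a a
    JOIN table_b b ON b.category_id = a.category_id
""").fetchone()[0]

print(joined_rows, true_defects, inflated_defects)  # 24 250 976
```

Deduplicating after such a join cannot recover the right totals on its own, which is why the month has to be part of the join key or the tables have to be aggregated before joining.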

One option I can think of is to left join a per-month subquery for each of Product#1 and Product#2:

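-- Assumes Oracle-style to_date/to_char and Date_Raised stored as text like '11-Jan-2019'.
-- tableB is summed to one row per month and product before joining, so the joins cannot duplicate tableA rows.
select to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy') as month_raised
    , sum(case when t1.category_id='Product#1' then t1.Defect_Quantity else 0 end)/p1.product1 * 100 as product1_pct
    , sum(case when t1.category_id='Product#2' then t1.Defect_Quantity else 0 end)/p2.product2 * 100 as product2_pct
from tableA t1
left join 
    (select to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy') month_raised
        , sum(Consumed_Quantity) as product1
        from tableB  
        where category_id = 'Product#1'
        group by to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy')) p1
    on p1.month_raised = to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy')
left join 
    (select to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy') month_raised
        , sum(Consumed_Quantity) as product2
        from tableB  
        where category_id = 'Product#2'
        group by to_char(to_date(Date_Raised,'dd-mon-yyyy'),'mon-yyyy')) p2
    on p2.month_raised = to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy')
-- the monthly consumption totals are constant within a month, so they can be grouped on
group by to_char(to_date(t1.Date_Raised,'dd-mon-yyyy'),'mon-yyyy'), p1.product1, p2.product2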
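The same aggregate-first idea can be tried end to end on the question's sample data. The following is only a sketch in Python's built-in sqlite3 module, not the answer's Oracle query: the table names, the category_id column, the ISO date strings, and the use of strftime for the month are all assumptions of the sketch. Each table is reduced to one row per month and product, the two are joined on both keys, and conditional aggregation pivots the products into columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (issue TEXT, date_raised TEXT, category_id TEXT, defect_quantity INTEGER);
    CREATE TABLE table_b (date_raised TEXT, category_id TEXT, consumed_quantity INTEGER);
    INSERT INTO table_a VALUES
        ('PCR12', '2019-01-11', 'Product#1', 14), ('PCR13', '2019-01-12', 'Product#1', 54),
        ('PCR14', '2019-02-05', 'Product#1', 5),  ('PCR15', '2019-02-05', 'Product#2', 7),
        ('PCR16', '2019-03-20', 'Product#1', 76), ('PCR17', '2019-03-22', 'Product#2', 5),
        ('PCR18', '2019-03-25', 'Product#1', 89);
    INSERT INTO table_b VALUES
        ('2019-01-05', 'Product#1', 100), ('2019-01-17', 'Product#1', 200),
        ('2019-02-05', 'Product#1', 100), ('2019-02-08', 'Product#2', 50),
        ('2019-03-10', 'Product#1', 100), ('2019-03-12', 'Product#2', 50);
""")

rows = conn.execute("""
    WITH defects AS (   -- one row per month and product
        SELECT strftime('%Y-%m', date_raised) AS month, category_id,
               SUM(defect_quantity) AS defect_quantity
        FROM table_a GROUP BY month, category_id
    ),
    consumed AS (       -- one row per month and product
        SELECT strftime('%Y-%m', date_raised) AS month, category_id,
               SUM(consumed_quantity) AS consumed_quantity
        FROM table_b GROUP BY month, category_id
    )
    SELECT d.month,
           ROUND(MAX(CASE WHEN d.category_id = 'Product#1'
                 THEN 100.0 * d.defect_quantity / c.consumed_quantity END), 1) AS product1_pct,
           ROUND(MAX(CASE WHEN d.category_id = 'Product#2'
                 THEN 100.0 * d.defect_quantity / c.consumed_quantity END), 1) AS product2_pct
    FROM defects d
    LEFT JOIN consumed c
           ON c.month = d.month AND c.category_id = d.category_id
    GROUP BY d.month
    ORDER BY d.month
""").fetchall()

for row in rows:
    print(row)
```

On this data, March for Product#1 comes out at 165% rather than the 89% in the question's expected table, because both March defect rows (76 and 89) are summed against the 100 units consumed. If only one defect row per month is meant to count, that rule would have to be added explicitly.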

I would do it like this:

-- category_id stands in for the Category_ID# column; a and b are the two tables.
select to_char(date_raised, 'YYYY-MM') as month_raised,
       -- nullif turns a zero denominator into NULL instead of a division error
       (sum(case when category_id = 'Product#1' then defect_quantity end) /
        nullif(sum(case when category_id = 'Product#1' then consumed_quantity end), 0)
       ) as product1,
       (sum(case when category_id = 'Product#2' then defect_quantity end) /
        nullif(sum(case when category_id = 'Product#2' then consumed_quantity end), 0)
       ) as product2
-- stack both tables so that a month present in either one survives
from ((select date_raised, category_id, defect_quantity, 0 as consumed_quantity
       from a
      ) union all
      (select date_raised, category_id, 0 as defect_quantity, consumed_quantity
       from b
      )
     ) ab
group by to_char(date_raised, 'YYYY-MM')
order by min(date_raised);

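(I changed the date format because I prefer YYYY-MM, but that has nothing to do with the logic.)

Why do I prefer this approach? It includes all months for which either table has a row. You don't have to worry about accidentally filtering out a month because it happens to have no production or no defects.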
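A small sketch of that edge case, again in Python's built-in sqlite3 module rather than the database the answer targets. The February rows come from the question; the April defect row is hypothetical and exists only to show a month that appears in one table and not the other. The category_id column name, the ISO date strings, and the NULLIF guard are assumptions of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (date_raised TEXT, category_id TEXT, defect_quantity INTEGER);
    CREATE TABLE b (date_raised TEXT, category_id TEXT, consumed_quantity INTEGER);
    -- February rows from the question, plus one HYPOTHETICAL April defect
    -- with no matching consumption, added only to show the edge case.
    INSERT INTO a VALUES
        ('2019-02-05', 'Product#1', 5),
        ('2019-02-05', 'Product#2', 7),
        ('2019-04-02', 'Product#1', 3);
    INSERT INTO b VALUES
        ('2019-02-05', 'Product#1', 100),
        ('2019-02-08', 'Product#2', 50);
""")

rows = conn.execute("""
    SELECT strftime('%Y-%m', date_raised) AS month,
           -- * 1.0 forces floating-point division in SQLite
           SUM(CASE WHEN category_id = 'Product#1' THEN defect_quantity END) * 1.0 /
           NULLIF(SUM(CASE WHEN category_id = 'Product#1' THEN consumed_quantity END), 0)
               AS product1,
           SUM(CASE WHEN category_id = 'Product#2' THEN defect_quantity END) * 1.0 /
           NULLIF(SUM(CASE WHEN category_id = 'Product#2' THEN consumed_quantity END), 0)
               AS product2
    FROM (SELECT date_raised, category_id, defect_quantity, 0 AS consumed_quantity FROM a
          UNION ALL
          SELECT date_raised, category_id, 0 AS defect_quantity, consumed_quantity FROM b) ab
    GROUP BY month
    ORDER BY month
""").fetchall()

# An inner join on month and product would drop April entirely.
inner_months = [r[0] for r in conn.execute("""
    SELECT DISTINCT strftime('%Y-%m', a.date_raised)
    FROM a JOIN b
      ON b.category_id = a.category_id
     AND strftime('%Y-%m', b.date_raised) = strftime('%Y-%m', a.date_raised)
    ORDER BY 1
""")]

print(rows)
print(inner_months)
```

April stays in the UNION ALL result with NULL ratios, since nothing was consumed that month, while the inner join returns February only.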