MySQL duplicate rows with different ids

I have a MySQL table with a lot of duplicate rows. How can I find their ids and delete them? I need to keep the first lead_id and delete any other duplicates.

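So in this example I need to find the duplicate email value and delete all rows of the duplicate lead, i.e. delete every row with lead_id 40944 and keep every row with lead_id 40943.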

id      |   lead_id     | form  |field_number   |   value
--------+---------------+-------+---------------+----------------------
537618  |   40943       |1      | 3.3           |   Mike
537622  |   40943       |1      | 4.3           |   Mesa
537623  |   40943       |1      | 4.4           |   AZ
537624  |   40943       |1      | 4.5           |   85210
537625  |   40943       |1      | 4.6           |   United States
537626  |   40943       |1      | 5             |   mike@email.com
537627  |   40943       |1      | 6             |   (555) 555-5555
537628  |   40943       |1      | 19            |   JM-SL-I4CLR,JM-FM-I5CLR
537629  |   40943       |1      | 12            |   2015-10-01
547618  |   40944       |1      | 3.3           |   Mike
547622  |   40944       |1      | 4.3           |   Mesa
547623  |   40944       |1      | 4.4           |   AZ
547624  |   40944       |1      | 4.5           |   85210
547625  |   40944       |1      | 4.6           |   United States
547626  |   40944       |1      | 5             |   mike@email.com
547627  |   40944       |1      | 6             |   (555) 555-5555
547628  |   40944       |1      | 19            |   JM-SL-I4CLR,JM-FM-I5CLR
547629  |   40944       |1      | 12            |   2015-10-01

I tried:

SELECT `value`, count(*) 
 FROM `lead_detail` 
 WHERE `field_number` = 5 
 GROUP BY `value` 
 HAVING count(*) > 1

Result:

value          |    count(*)
---------------+------------------
mike@email.com |    2

I'm just not sure how to delete the rows.

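You are only looking for duplicate emails, so you may not need this. But what happens when a later lead has more detail than the original one? The query below deletes a lead only when all of its fields duplicate an earlier lead; it is offered for reference only.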

DELETE FROM lead_detail
WHERE lead_id in
  (SELECT * FROM (SELECT lead_id FROM 
                    (SELECT lead_id,
                            GROUP_CONCAT(form ORDER BY form,field_number)as forms,
                            GROUP_CONCAT(field_number ORDER BY form,field_number) as field_numbers,
                            GROUP_CONCAT(value ORDER BY form,field_number) as `values`
                     FROM lead_detail
                     GROUP BY lead_id)l1
   WHERE EXISTS (SELECT 1 FROM 
                 (SELECT lead_id,
                            GROUP_CONCAT(form ORDER BY form,field_number)as forms,
                            GROUP_CONCAT(field_number ORDER BY form,field_number) as field_numbers,
                            GROUP_CONCAT(value ORDER BY form,field_number) as `values`
                     FROM lead_detail
                     GROUP BY lead_id)l2
                 WHERE l2.lead_id < l1.lead_id
                 AND l2.forms = l1.forms
                 AND l2.field_numbers = l1.field_numbers
                 AND l2.`values` = l1.`values`)
   )T
   )
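
To make the "all fields duplicated" rule concrete, here is a rough Python sketch that uses the standard-library sqlite3 module as a stand-in for MySQL. The sample rows, lead 40945, and the helper full_duplicate_leads are all made up for illustration, and the per-lead fingerprint is built in Python instead of with GROUP_CONCAT, so treat it as an approximation of the idea rather than a translation of the query above.

```python
import sqlite3

# Hypothetical miniature of lead_detail; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lead_detail (id INTEGER PRIMARY KEY, lead_id INTEGER, "
    "form INTEGER, field_number REAL, value TEXT)"
)
conn.executemany(
    "INSERT INTO lead_detail VALUES (?, ?, ?, ?, ?)",
    [
        (1, 40943, 1, 3.3, "Mike"),
        (2, 40943, 1, 5, "mike@email.com"),
        (3, 40944, 1, 3.3, "Mike"),            # exact copy of lead 40943
        (4, 40944, 1, 5, "mike@email.com"),
        (5, 40945, 1, 3.3, "Mike"),            # same email, one extra field
        (6, 40945, 1, 5, "mike@email.com"),
        (7, 40945, 1, 6, "(555) 555-5555"),
    ],
)


def full_duplicate_leads(conn):
    """Return lead_ids whose complete field set matches an earlier lead."""
    fingerprints = {}
    cur = conn.execute(
        "SELECT lead_id, form, field_number, value FROM lead_detail "
        "ORDER BY lead_id, form, field_number"
    )
    for lead_id, form, field_number, value in cur:
        fingerprints.setdefault(lead_id, []).append((form, field_number, value))

    seen = set()
    duplicates = []
    for lead_id in sorted(fingerprints):       # lowest lead_id is kept
        key = tuple(fingerprints[lead_id])
        if key in seen:
            duplicates.append(lead_id)
        else:
            seen.add(key)
    return duplicates


to_delete = full_duplicate_leads(conn)
conn.executemany(
    "DELETE FROM lead_detail WHERE lead_id = ?", [(i,) for i in to_delete]
)
remaining = [
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT lead_id FROM lead_detail ORDER BY lead_id"
    )
]
print(to_delete, remaining)
```

In this made-up data only lead 40944 is removed. Lead 40945 shares the email but carries an extra phone field, so it survives, which is exactly the case the answer warns about.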

You can do this simply with something like the following. I have used it myself and it got the job done.

DELETE t1 FROM lead_detail t1, lead_detail t2 
WHERE t1.id > t2.id AND t1.field_number = t2.field_number

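You are free to expand or change the WHERE clause as needed. As written it matches rows on field_number alone, so you will most likely want to add further conditions (for example on value) before running it.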
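
As a rough illustration of why the WHERE clause matters, the sketch below compares what two sets of join conditions would match, using SELECT instead of DELETE so nothing is lost. It runs against an in-memory SQLite table through Python's sqlite3 module as a stand-in for MySQL; the sample rows (including lead 40950) and the aliases dup and t2 in the tighter query are my own assumptions, not part of the answer.

```python
import sqlite3

# Hypothetical sample: one duplicated lead (40944) and one unrelated lead (40950).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lead_detail (id INTEGER PRIMARY KEY, lead_id INTEGER, "
    "form INTEGER, field_number REAL, value TEXT)"
)
conn.executemany(
    "INSERT INTO lead_detail VALUES (?, ?, ?, ?, ?)",
    [
        (1, 40943, 1, 3.3, "Mike"),
        (2, 40943, 1, 5, "mike@email.com"),
        (3, 40944, 1, 3.3, "Mike"),
        (4, 40944, 1, 5, "mike@email.com"),
        (5, 40950, 1, 3.3, "Ann"),
        (6, 40950, 1, 5, "ann@email.com"),
    ],
)

# Conditions from the answer as written: same field_number, higher id.
loose_ids = [
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT t1.id FROM lead_detail t1, lead_detail t2 "
        "WHERE t1.id > t2.id AND t1.field_number = t2.field_number "
        "ORDER BY t1.id"
    )
]

# Tighter variant (an assumption): match every row of a lead whose email
# (field_number 5) also appears on a lead with a lower lead_id.
tight_ids = [
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT t1.id FROM lead_detail t1 "
        "JOIN lead_detail dup "
        "  ON dup.lead_id = t1.lead_id AND dup.field_number = 5 "
        "JOIN lead_detail t2 "
        "  ON t2.field_number = 5 AND t2.value = dup.value "
        " AND t2.lead_id < dup.lead_id "
        "ORDER BY t1.id"
    )
]

print("field_number only:", loose_ids)
print("email on earlier lead:", tight_ids)
```

With these rows the field_number-only condition also matches Ann's lead, which is not a duplicate of anything, while the tighter condition matches only the rows of lead 40944. Previewing with SELECT like this before switching to DELETE is a cheap way to check the conditions.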

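This should return the lead_id values you want to delete. I suggest running it first, storing the result in a temporary table, and running a few queries against it to make sure you won't lose anything. It works even when the records are interleaved (that is, when the duplicates are not consecutive entries in the lead table).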

select distinct(l1.lead_id)
from lead_detail l1
inner join lead_detail l2
  on l1.value = l2.value AND l1.field_number = 5 AND l2.field_number = 5 AND l1.id != l2.id
order by l1.lead_id  -- so that OFFSET 1 always skips the lowest lead_id
LIMIT 18446744073709551610 OFFSET 1

The 18446744073709551610 is there because MySQL can't do an OFFSET without a LIMIT.

The OFFSET skips the first lead_id (to make sure you keep one record).

After double-checking the result, run:

Delete from lead_detail where lead_id in (above query)
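
The stage-then-check-then-delete workflow might look roughly like the sketch below, again using Python's sqlite3 module with an in-memory table as a stand-in for MySQL. The table name leads_to_delete and the sample rows are hypothetical. Note one deliberate substitution: instead of OFFSET 1 it compares l1.lead_id > l2.lead_id, so that each duplicated email keeps its own lowest lead_id when more than one email is duplicated.

```python
import sqlite3

# Hypothetical sample with two duplicated emails.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lead_detail (id INTEGER PRIMARY KEY, lead_id INTEGER, "
    "form INTEGER, field_number REAL, value TEXT)"
)
conn.executemany(
    "INSERT INTO lead_detail VALUES (?, ?, ?, ?, ?)",
    [
        (1, 40943, 1, 3.3, "Mike"), (2, 40943, 1, 5, "mike@email.com"),
        (3, 40950, 1, 3.3, "Ann"),  (4, 40950, 1, 5, "ann@email.com"),
        (5, 40944, 1, 3.3, "Mike"), (6, 40944, 1, 5, "mike@email.com"),
        (7, 40951, 1, 3.3, "Ann"),  (8, 40951, 1, 5, "ann@email.com"),
    ],
)

# Step 1: stage the candidate lead_ids instead of deleting right away.
conn.execute(
    """
    CREATE TEMP TABLE leads_to_delete AS
    SELECT DISTINCT l1.lead_id
    FROM lead_detail l1
    JOIN lead_detail l2
      ON l1.value = l2.value
     AND l1.field_number = 5
     AND l2.field_number = 5
     AND l1.lead_id > l2.lead_id
    """
)
staged = [
    r[0]
    for r in conn.execute("SELECT lead_id FROM leads_to_delete ORDER BY lead_id")
]

# Step 2: sanity check. Count staged emails that would have no surviving lead.
orphaned = conn.execute(
    """
    SELECT COUNT(*) FROM lead_detail d
    WHERE d.field_number = 5
      AND d.lead_id IN (SELECT lead_id FROM leads_to_delete)
      AND NOT EXISTS (
          SELECT 1 FROM lead_detail k
          WHERE k.field_number = 5
            AND k.value = d.value
            AND k.lead_id NOT IN (SELECT lead_id FROM leads_to_delete)
      )
    """
).fetchone()[0]

# Step 3: delete only if the check came back clean.
if orphaned == 0:
    conn.execute(
        "DELETE FROM lead_detail "
        "WHERE lead_id IN (SELECT lead_id FROM leads_to_delete)"
    )

remaining = [
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT lead_id FROM lead_detail ORDER BY lead_id"
    )
]
print(staged, orphaned, remaining)
```

Deleting from the staged table rather than from the original query also keeps the final DELETE simple, since its subquery no longer reads from lead_detail or needs a LIMIT.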