SQL to remove duplicated rows
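I wrote a SQL statement that keeps only one instance (the row with the smallest id) wherever there are duplicate product_codes. The problem is that the statement is very inefficient and takes ages to run, so I'm hoping there is a more efficient way to write it.
The dataset is structured as follows: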
id  product_code  cat_desc  product_desc
1   2352345       423       COCA COLA
2   8967896       457       FANTA
3   6456466       435       SPARKLING WATER
4   3562314       457       STILL WATER
The statement is:
    DELETE
    FROM raw_products_inter
    WHERE id IN (SELECT id
                 FROM raw_products_inter outer_table
                 WHERE product_code IN (SELECT product_code
                                        FROM raw_products_inter
                                        GROUP BY 1
                                        HAVING COUNT(id) > 1)
                   AND id NOT IN (SELECT MIN(id)
                                  FROM raw_products_inter inner_table
                                  WHERE inner_table.product_code = outer_table.product_code))
You should be able to improve performance by using an EXISTS condition:
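    -- Delete a row whenever another row with the same product_code
    -- has a smaller id, so only the smallest id per code survives.
    DELETE
    FROM raw_products_inter P
    WHERE EXISTS (
        SELECT *
        FROM raw_products_inter OP
        WHERE OP.product_code = P.product_code
          AND OP.id < P.id
    )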
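If it helps, here is a rough sketch of how the EXISTS approach could be supported and checked. It assumes a PostgreSQL-style engine that supports secondary indexes (some warehouses do not), and the index name idx_raw_products_inter_code_id is made up for illustration. The index is only a guess at what would help the correlated lookup on (product_code, id); whether the planner actually uses it depends on your database and data.

```sql
-- Assumption: PostgreSQL-style syntax; the index name is hypothetical.
-- A composite index may let the EXISTS probe find "same product_code,
-- smaller id" without scanning the whole table for every row.
CREATE INDEX idx_raw_products_inter_code_id
    ON raw_products_inter (product_code, id);

-- Preview: the rows the EXISTS-based DELETE would remove
-- (same predicate as the DELETE, but read-only).
SELECT P.id, P.product_code
FROM raw_products_inter P
WHERE EXISTS (
    SELECT 1
    FROM raw_products_inter OP
    WHERE OP.product_code = P.product_code
      AND OP.id < P.id
)
ORDER BY P.product_code, P.id;

-- Sanity check to run after the DELETE: list any product_code
-- that still appears more than once.
SELECT product_code, COUNT(*) AS n
FROM raw_products_inter
GROUP BY product_code
HAVING COUNT(*) > 1;
```

Running the preview first and the sanity check afterwards is a cheap way to confirm that only the non-minimal ids are removed before committing the change.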