How to delete duplicates and leave one row in a table
I have a table that looks like this. I want to delete the duplicates and keep one row per user. How can I do that?
Table
id user
Thango 1
Thango 1
Samg 2
Samg 2
Result
id user
Thango 1
Samg 2
For this dataset, it may be simpler to empty and refill the table:
-- deduplicate into a temporary table
create table mytmp as select distinct id, user from mytable;
-- empty the original table (back up your data first!)
truncate table mytable;
-- refill the table from the temporary table
insert into mytable(id, user) select id, user from mytmp;
-- drop the temporary table
drop table mytmp;
Once this is done, you could consider creating a unique constraint on the table to avoid further duplicates:
alter table mytable
add constraint myconstraint
unique (id, user);
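Before adding the constraint, it may be worth confirming that no duplicates remain, since the alter statement will generally fail if duplicate (id, user) pairs still exist. The query below is a minimal sketch that reuses the mytable name and columns from the answer above; the alias copies is only illustrative, and in some databases user is a reserved word that would need quoting.

```sql
-- list every (id, user) pair that still occurs more than once;
-- an empty result means the unique constraint can be added safely
select id, user, count(*) as copies
from mytable
group by id, user
having count(*) > 1;
```

If this returns any rows, the deduplication step needs to be repeated before the constraint is created.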
You can use a common table expression (CTE). For a fuller explanation, I suggest you look at this url
Possible solution:
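-- number the rows within each user, then delete everything after the first
WITH CTE([user], DuplicateCount) AS (
    SELECT [user],
           ROW_NUMBER() OVER (PARTITION BY [user] ORDER BY [id]) AS DuplicateCount
    FROM dbo.[YourDataBase]  -- placeholder: replace with your table name
)
DELETE FROM CTE
WHERE DuplicateCount > 1;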
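Since the delete cannot easily be undone, one cautious approach is to run the same CTE with a SELECT first and inspect the rows that would be removed. This is only a sketch in the same SQL Server syntax; dbo.[YourDataBase] is the placeholder table name from the answer above, and the [id] and [user] columns are assumed from the question.

```sql
-- preview only: shows the rows the DELETE would remove, without changing data
WITH CTE AS (
    SELECT [id],
           [user],
           ROW_NUMBER() OVER (PARTITION BY [user] ORDER BY [id]) AS DuplicateCount
    FROM dbo.[YourDataBase]  -- placeholder: replace with your table name
)
SELECT [id], [user], DuplicateCount
FROM CTE
WHERE DuplicateCount > 1;
```

For the sample data in the question, this should list one extra Thango row and one extra Samg row, which are the rows the delete would drop.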