SQL: Remove duplicates based on a criterion
I'm mostly new to SQL, so I know very little about the advanced options it offers. I'm currently using MS SQL Server 2016 (Developer edition).
I get the following result:
| Type | Role | GUID |
|--------|--------|--------------------------------------|
| B | 0 | ABC |
| B | 0 | KLM |
| A | 0 | CDE |
| A | 0 | EFG |
| A | 1 | CDE |
| B | 1 | ABC |
| B | 1 | GHI |
| B | 1 | IJK |
| B | 1 | KLM |
from the following SELECT:
SELECT DISTINCT
Type,
Role,
GUID
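I am counting GUIDs subject to these constraints:
-> If several rows share the same GUID, count only the row with "Role" set to 1; otherwise, count the row with "Role" set to 0.
-> If there is only one row, count it as "Role 0" or "Role 1" according to its own Role value.
My objective is to get the following result: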
| Type | Role | COUNT(GUID) |
|--------|--------|--------------------------------------|
| A | 0 | 1 | => counted EFG as there was no other row with a "Role" set to 1
| A | 1 | 1 | => counted CDE with "Role" set to 1, but the row with "Role" set to 0 is ignored
| B | 1 | 4 |
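The two counting rules can be checked procedurally against the sample rows before turning to SQL. The following is a minimal sketch in Python, not part of the original question; the function name count_guids is hypothetical, and it assumes that rows count as "the same GUID" only when they also share the same Type, which the question does not state explicitly.

```python
from collections import Counter, defaultdict

# Sample rows from the question: (Type, Role, GUID).
ROWS = [
    ("B", 0, "ABC"), ("B", 0, "KLM"), ("A", 0, "CDE"), ("A", 0, "EFG"),
    ("A", 1, "CDE"), ("B", 1, "ABC"), ("B", 1, "GHI"), ("B", 1, "IJK"),
    ("B", 1, "KLM"),
]


def count_guids(rows):
    """Count GUIDs per (Type, Role), preferring the Role 1 row of each GUID."""
    # Assumption: duplicates are detected per (Type, GUID) pair.
    roles_by_key = defaultdict(set)
    for type_, role, guid in rows:
        roles_by_key[(type_, guid)].add(role)

    counts = Counter()
    for (type_, _guid), roles in roles_by_key.items():
        # max() picks Role 1 when both roles exist, else the only role present.
        counts[(type_, max(roles))] += 1
    return dict(counts)


result = count_guids(ROWS)
for (type_, role), n in sorted(result.items()):
    print(type_, role, n)
# prints:
# A 0 1
# A 1 1
# B 1 4
```

This reproduces the expected table, so the same rows can serve as a test case for any SQL query that claims to implement the rules.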
Your query doesn't implement the logic you describe. Here is an approach that uses a subquery and a window function:
select type, role, count(*)
from (select t.*,
             count(*) over (partition by GUID) as guid_cnt
      from t
     ) t
where (guid_cnt > 1 and role = 1) or  -- GUID shared by several rows: keep only the role 1 row
      guid_cnt = 1                    -- GUID appears once: keep it whatever its role
group by type, role;
The subquery gets the number of rows that match each GUID. The outer where then uses that count to filter according to your conditions.
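To see what the subquery contributes, the window-function approach can be run end to end on the question's sample rows. This is a minimal sketch rather than the answerer's own code: it assumes Python's built-in sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later), and the table name t and the lowercase column names are illustrative. The filter here keeps every GUID that occurs once, whatever its Role, which is what the second rule in the question asks for.

```python
import sqlite3

# Sample rows from the question: (type, role, guid).
ROWS = [
    ("B", 0, "ABC"), ("B", 0, "KLM"), ("A", 0, "CDE"), ("A", 0, "EFG"),
    ("A", 1, "CDE"), ("B", 1, "ABC"), ("B", 1, "GHI"), ("B", 1, "IJK"),
    ("B", 1, "KLM"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (type TEXT, role INTEGER, guid TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", ROWS)

# Step 1: the subquery alone, showing how many rows share each GUID.
guid_cnt = dict(conn.execute(
    "SELECT DISTINCT guid, COUNT(*) OVER (PARTITION BY guid) FROM t"
).fetchall())
print(guid_cnt)  # ABC, CDE and KLM occur twice; EFG, GHI and IJK once

# Step 2: filter on that count, then aggregate per (type, role).
result = conn.execute("""
    SELECT type, role, COUNT(*) AS n
    FROM (SELECT t.*,
                 COUNT(*) OVER (PARTITION BY guid) AS guid_cnt
          FROM t) AS s
    WHERE (guid_cnt > 1 AND role = 1)  -- shared GUID: keep the role 1 row
       OR guid_cnt = 1                 -- lone GUID: keep it as it is
    GROUP BY type, role
    ORDER BY type, role
""").fetchall()
print(result)  # [('A', 0, 1), ('A', 1, 1), ('B', 1, 4)]
```

Partitioning by GUID alone treats a GUID as shared even when its rows have different types; if that can happen in the real data, partitioning by both type and GUID may be the safer choice.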
Note: role is not a good choice for a column name. It is a keyword and may be reserved in the future.
A NOT EXISTS can be used for this.
For example:
declare @T table ([Type] char(1), [Role] int, [GUID] varchar(3));
insert into @T ([Type], [Role], [GUID]) values
('A',0,'CDE'),
('A',0,'EFG'),
('A',1,'CDE'),
('B',0,'ABC'),
('B',0,'KLM'),
('B',1,'ABC'),
('B',1,'GHI'),
('B',1,'IJK'),
('B',1,'KLM');
select [Type], [Role], COUNT(DISTINCT [GUID]) as TotalUniqueGuid
from @T t
where not exists (
select 1
from @T t1
where t.[Type] = t1.[Type]
and t.[Role] = 0 and t1.[Role] > 0
and t.[GUID] = t1.[GUID]
)
group by [Type], [Role];
Returns:
Type Role TotalUniqueGuid
A 0 1
A 1 1
B 1 4
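It can help to look at exactly which rows the NOT EXISTS condition throws away before the counting happens. The sketch below is an illustration under assumptions, not the answerer's code: it uses Python's built-in sqlite3 module instead of SQL Server, and a hypothetical table t with lowercase column names. The first query flips NOT EXISTS to EXISTS to list the discarded rows; the second repeats the counting query.

```python
import sqlite3

# Sample rows from the question: (type, role, guid).
ROWS = [
    ("A", 0, "CDE"), ("A", 0, "EFG"), ("A", 1, "CDE"),
    ("B", 0, "ABC"), ("B", 0, "KLM"), ("B", 1, "ABC"),
    ("B", 1, "GHI"), ("B", 1, "IJK"), ("B", 1, "KLM"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (type TEXT, role INTEGER, guid TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", ROWS)

# Correlated condition shared by both queries: a role 0 row is shadowed
# when a row with the same type and guid has a role above 0.
SHADOWED = """
    SELECT 1
    FROM t AS t1
    WHERE t1.type = t.type
      AND t1.guid = t.guid
      AND t.role = 0 AND t1.role > 0
"""

# Rows removed by the filter (EXISTS instead of NOT EXISTS).
dropped = conn.execute(
    f"SELECT type, role, guid FROM t WHERE EXISTS ({SHADOWED}) "
    "ORDER BY type, guid"
).fetchall()
print(dropped)  # [('A', 0, 'CDE'), ('B', 0, 'ABC'), ('B', 0, 'KLM')]

# Surviving rows, counted per (type, role).
counts = conn.execute(
    f"SELECT type, role, COUNT(DISTINCT guid) FROM t "
    f"WHERE NOT EXISTS ({SHADOWED}) "
    "GROUP BY type, role ORDER BY type, role"
).fetchall()
print(counts)  # [('A', 0, 1), ('A', 1, 1), ('B', 1, 4)]
```

Only the three role 0 rows that have a role 1 counterpart are removed, so GUIDs that appear once are counted under their own role.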