Find rows with duplicate values in two columns where at least one value in one column is a specific value
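I'm not sure how to do this, and I couldn't find a good answer by googling (I was probably not using the right keywords). So here it goes:
Let's say I have a table like this; let's call it persons.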
ID | Name | First Name | Country |
---|---|---|---|
1 | Doe | John | USA |
2 | Doe | John | UK |
3 | Doe | John | Brazil |
4 | Meyer | Julia | Germany |
5 | Meyer | Julia | Austria |
6 | Picard | Jean-Luc | France |
7 | Picard | Jean-Luc | UK |
8 | Nakamura | Hikaro | Japan |
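Now I want to select all rows that share the same name and first name, where at least one row in that group has the UK as its country. So my result set should look like this: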
ID | Name | First_Name | Country |
---|---|---|---|
1 | Doe | John | USA |
2 | Doe | John | UK |
3 | Doe | John | Brazil |
6 | Picard | Jean-Luc | France |
7 | Picard | Jean-Luc | UK |
I do know how to find duplicates in general, like this:
SELECT *
FROM persons p1
JOIN (SELECT NAME, FIRST_NAME, count(*) FROM PERSONS
GROUP BY FIRST_NAME, NAME having count(*) >1) p2
ON p1.NAME = p2.NAME
AND p1.FIRST_NAME = p2.FIRST_NAME;
But this also returns Julia Meyer, and I don't want her in the result.
Any suggestions?
Conditionally count the country of interest:
SELECT *
FROM persons p1
JOIN (
SELECT NAME, FIRST_NAME
FROM PERSONS
GROUP BY FIRST_NAME, NAME
having count(*) > 1 and count(case when country = 'UK' then 1 end) >= 1
) p2 ON p1.NAME = p2.NAME
AND p1.FIRST_NAME = p2.FIRST_NAME;
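The conditional count can be checked against the sample data from the question. What follows is only a minimal sketch: it assumes Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in for the asker's actual database, and the column types in the CREATE TABLE statement are guesses. COUNT ignores NULLs, and the CASE expression yields NULL for every non-UK row, so the second condition is true only for groups that contain at least one UK row.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "Doe", "John", "USA"),
    (2, "Doe", "John", "UK"),
    (3, "Doe", "John", "Brazil"),
    (4, "Meyer", "Julia", "Germany"),
    (5, "Meyer", "Julia", "Austria"),
    (6, "Picard", "Jean-Luc", "France"),
    (7, "Picard", "Jean-Luc", "UK"),
    (8, "Nakamura", "Hikaro", "Japan"),
]

# In-memory SQLite database (an assumption; the question does not name its DBMS).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons ("
    "id INTEGER PRIMARY KEY, name TEXT, first_name TEXT, country TEXT)"
)
conn.executemany("INSERT INTO persons VALUES (?, ?, ?, ?)", ROWS)

QUERY = """
SELECT p1.id, p1.name, p1.first_name, p1.country
FROM persons p1
JOIN (
    SELECT name, first_name
    FROM persons
    GROUP BY name, first_name
    HAVING COUNT(*) > 1                                       -- group is duplicated
       AND COUNT(CASE WHEN country = 'UK' THEN 1 END) >= 1    -- group has a UK row
) p2 ON p1.name = p2.name
    AND p1.first_name = p2.first_name
ORDER BY p1.id
"""

result_ids = [row[0] for row in conn.execute(QUERY)]
print(result_ids)  # [1, 2, 3, 6, 7]
```

Julia Meyer (IDs 4 and 5) is duplicated but has no UK row, so her group is dropped by the second HAVING condition.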
Use EXISTS:
SELECT p1.*
FROM persons p1
WHERE EXISTS (
SELECT *
FROM persons p2
WHERE p2.ID <> p1.ID
AND p2.Name = p1.Name AND p2.First_Name = p1.First_Name
AND 'UK' IN (p1.Country, p2.Country)
);
See the demo.
There are two conditions: 1) the group contains 'UK', and 2) count(1) > 1. So the query below will work.
SELECT p1.*
FROM persons p1
WHERE (p1.NAME, p1.FIRST_NAME) IN (
SELECT p2.NAME, p2.FIRST_NAME
FROM persons p2
WHERE p2.Country = 'UK')
AND (p1.NAME, p1.FIRST_NAME) IN (
SELECT p3.NAME, p3.FIRST_NAME
FROM persons p3
GROUP BY p3.NAME, p3.FIRST_NAME
HAVING COUNT(1) > 1)
I want to select all the rows that have the same name and first name and where at least one country is the UK.
You can use the COUNT analytic function with conditional aggregation (and solve the problem in a single table scan, without any self-joins):
SELECT id, name, first_name, country
FROM (
SELECT t.*,
COUNT(CASE country WHEN 'UK' THEN 1 END)
OVER (PARTITION BY name, first_name) AS cnt
FROM table_name t
)
WHERE cnt > 0;
Which, for the sample data:
CREATE TABLE table_name (ID, Name, First_Name, Country) AS
SELECT 1, 'Doe', 'John', 'USA' FROM DUAL UNION ALL
SELECT 2, 'Doe', 'John', 'UK' FROM DUAL UNION ALL
SELECT 3, 'Doe', 'John', 'Brazil' FROM DUAL UNION ALL
SELECT 4, 'Meyer', 'Julia', 'Germany' FROM DUAL UNION ALL
SELECT 5, 'Meyer', 'Julia', 'Austria' FROM DUAL UNION ALL
SELECT 6, 'Picard', 'Jean-Luc', 'France' FROM DUAL UNION ALL
SELECT 7, 'Picard', 'Jean-Luc', 'UK' FROM DUAL UNION ALL
SELECT 8, 'Nakamura', 'Hikaro', 'Japan' FROM DUAL;
Outputs:
ID | NAME | FIRST_NAME | COUNTRY |
---|---|---|---|
1 | Doe | John | USA |
2 | Doe | John | UK |
3 | Doe | John | Brazil |
6 | Picard | Jean-Luc | France |
7 | Picard | Jean-Luc | UK |
If you want to find only the duplicated rows where at least one row in the group is UK, then also count all the rows in the partition:
SELECT id, name, first_name, country
FROM (
SELECT t.*,
COUNT(CASE country WHEN 'UK' THEN 1 END)
OVER (PARTITION BY name, first_name) AS cnt_uk,
COUNT(*)
OVER (PARTITION BY name, first_name) AS cnt_all
FROM table_name t
)
WHERE cnt_uk > 0
AND cnt_all >= 2;
This gives the same output for the sample data.
db<>fiddle here
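The two analytic queries agree on the sample data only because every UK row there happens to belong to a duplicated name. A minimal sketch of where they differ, assuming Python's built-in sqlite3 module with SQLite 3.25 or newer (needed for window functions) as a stand-in for Oracle, a table named persons, and one extra row (ID 9, Anna Smith, UK) that is hypothetical and not part of the question's data:

```python
import sqlite3

# Sample rows from the question, plus one hypothetical non-duplicated UK row.
ROWS = [
    (1, "Doe", "John", "USA"),
    (2, "Doe", "John", "UK"),
    (3, "Doe", "John", "Brazil"),
    (4, "Meyer", "Julia", "Germany"),
    (5, "Meyer", "Julia", "Austria"),
    (6, "Picard", "Jean-Luc", "France"),
    (7, "Picard", "Jean-Luc", "UK"),
    (8, "Nakamura", "Hikaro", "Japan"),
    (9, "Smith", "Anna", "UK"),  # hypothetical: UK, but no duplicate
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons ("
    "id INTEGER PRIMARY KEY, name TEXT, first_name TEXT, country TEXT)"
)
conn.executemany("INSERT INTO persons VALUES (?, ?, ?, ?)", ROWS)

# Both counts are computed per (name, first_name) partition in one pass.
INNER = """
    SELECT t.*,
           COUNT(CASE country WHEN 'UK' THEN 1 END)
               OVER (PARTITION BY name, first_name) AS cnt_uk,
           COUNT(*)
               OVER (PARTITION BY name, first_name) AS cnt_all
    FROM persons t
"""


def matching_ids(where_clause):
    """Return the sorted IDs of rows that satisfy the given filter."""
    sql = f"SELECT id FROM ({INNER}) WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


uk_only_ids = matching_ids("cnt_uk > 0")
uk_dup_ids = matching_ids("cnt_uk > 0 AND cnt_all >= 2")

print(uk_only_ids)  # [1, 2, 3, 6, 7, 9]
print(uk_dup_ids)   # [1, 2, 3, 6, 7]
```

With only the cnt_uk filter, the lone UK row (ID 9) is returned even though nobody shares that name; adding cnt_all >= 2 restricts the result to names that occur more than once.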