How to filter data with distinct several column values

Question

I have a table with the following values:

create table test_tbl (id serial,create_date timestamptz,display_name text,address text);

insert into test_tbl(create_date,display_name,address)
values (NOW(),'name1', 'addr1'),(NOW(),'name1', 'addr1'),(NOW(),'name1', 'addr1'),
(NOW(),'name2', 'addr2'),(NOW(),'name2', 'addr2'),(NOW(),'name3', 'addr2'),
(NOW(),'name3', 'addr3'),(NOW(),'name4', 'addr3'),(NOW(),'name1', 'addr3'),
(NOW(),'name5', 'addr5');

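The goal is to filter out customers who are the only customer at an address (even if that customer visited several times). However, if a customer visited a particular address several times and other customers visited it too, I need those rows as well.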

The query result must look like this:

id  create_date  display_name  address
4  2020-12-01  name2  addr2
5  2020-12-01  name2  addr2
6  2020-12-01  name3  addr2
7  2020-12-01  name3  addr3
8  2020-12-01  name4  addr3
9  2020-12-01  name1  addr3

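We must filter out the rows with id 1, 2 and 3, because that address has only one unique customer, and the row with id = 10, because that customer is alone at the address.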

Answer 1

You can use exists to keep only the rows for which another row exists with the same address and a different display_name.

select *
from test_tbl t
where exists (
    select 1
    from test_tbl t1
    where t1.address = t.address and t1.display_name <> t.display_name
)

Demo on DB Fiddle:

id | create_date                   | display_name | address
-: | :---------------------------- | :----------- | :------
 4 | 2020-12-01 18:42:26.651971+00 | name2        | addr2  
 5 | 2020-12-01 18:42:26.651971+00 | name2        | addr2  
 6 | 2020-12-01 18:42:26.651971+00 | name3        | addr2  
 7 | 2020-12-01 18:42:26.651971+00 | name3        | addr3  
 8 | 2020-12-01 18:42:26.651971+00 | name4        | addr3  
 9 | 2020-12-01 18:42:26.651971+00 | name1        | addr3
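As an alternative sketch (not part of the answer above), the same condition can be expressed with window functions instead of a correlated subquery: an address has more than one distinct display_name exactly when the smallest and largest display_name at that address differ. This assumes the test_tbl table from the question and that display_name is never null; the aliases s, min_name and max_name are made up for illustration.

```sql
-- Assumes test_tbl as defined in the question.
-- min_name and max_name are illustrative aliases.
select id, create_date, display_name, address
from (
    select t.*,
           -- smallest and largest display_name seen at this address
           min(display_name) over (partition by address) as min_name,
           max(display_name) over (partition by address) as max_name
    from test_tbl t
) s
-- the two differ only when the address has at least two distinct names
where min_name <> max_name
order by id;
```

On the sample data this should keep the rows with id 4 through 9, the same rows as the exists query.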

Answer 2

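With count(distinct display_name) you can find how many different display_names there are at each address. Then having count(distinct display_name) > 1 keeps only the addresses with more than one distinct display_name, which removes the single-customer addresses. With having count(distinct display_name) = 1 you would instead select only the addresses that have a single display_name.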

select * 
from test_tbl
where address in (select address
                  from test_tbl
                  group by address
                  having count(distinct display_name)  > 1)

Here is a demo:

DEMO
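For comparison, a minimal sketch of the complementary query mentioned above, using having count(distinct display_name) = 1. It assumes the same test_tbl from the question and returns the rows the question wants to exclude, which can be handy for checking the filter.

```sql
-- Assumes test_tbl as defined in the question.
-- Returns only rows whose address has exactly one distinct display_name.
select *
from test_tbl
where address in (select address
                  from test_tbl
                  group by address
                  having count(distinct display_name) = 1)
order by id;
```

On the sample data this should return the rows with id 1, 2, 3 (addr1, only name1) and id 10 (addr5, only name5).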

