DISTINCT not giving expected result

Question

ID  FirstName   LastName    Gender  Salary
1   Ben         Hoskins     Male    70000
2   Mark        Hastings    Male    60000
4   Ben         Hoskins     Male    70000
8   John        Stanmore    Male    80000

When I run this query:

select *
from Employees
where  Salary > (SELECT AVG(distinct SALARY) FROM employees)

70000 is showing 2 records; it should be showing 1. Why are 2 records displayed when I have used distinct?

Answer 1

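It looks like the table contains duplicate rows (only the id differs), so I assume you want distinct to apply to the name, gender and salary together, not just to the salary.

You seem to want distinct in both the outer and the inner query: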

select distinct firstname, lastname, gender, salary
from employees
where salary > (
    select avg(salary)
    from (
        select distinct firstname, lastname, gender, salary
        from employees
    ) e
)
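
To try this out, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for whatever database you are running (an assumption; the question does not name one). With only the four rows from the question, both AVG(Salary) and AVG(DISTINCT Salary) come out to 70000, so only John qualifies. The sketch therefore adds one hypothetical row (ID 9, not part of the question) that pulls the average below Ben's salary and makes the duplicate visible.

```python
import sqlite3

# In-memory SQLite database standing in for the asker's (unspecified) DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "ID INTEGER, FirstName TEXT, LastName TEXT, Gender TEXT, Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ben", "Hoskins", "Male", 70000),
        (2, "Mark", "Hastings", "Male", 60000),
        (4, "Ben", "Hoskins", "Male", 70000),
        (8, "John", "Stanmore", "Male", 80000),
        # Hypothetical row, NOT in the question: it lowers the average so
        # that Ben's 70000 is above it and the duplicate shows up.
        (9, "Sara", "Lee", "Female", 40000),
    ],
)

# The asker's query: DISTINCT only affects the values fed to AVG,
# so the outer query still returns both Ben rows.
original_rows = conn.execute(
    """
    select *
    from Employees
    where Salary > (select avg(distinct Salary) from Employees)
    """
).fetchall()

# DISTINCT in both the outer and the inner query.
deduped_rows = conn.execute(
    """
    select distinct firstname, lastname, gender, salary
    from employees
    where salary > (
        select avg(salary)
        from (
            select distinct firstname, lastname, gender, salary
            from employees
        ) e
    )
    """
).fetchall()

print(len(original_rows))    # Ben twice (IDs 1 and 4) plus John
print(sorted(deduped_rows))  # Ben once plus John
```

Note that the second query drops the ID column on purpose: the IDs differ between the duplicate rows, so including ID in the select list would make every row distinct again.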

If your database supports window functions, this can be shortened:

select *
from (
    select e.*, avg(salary) over() as avg_salary
    from (
        select distinct firstname, lastname, gender, salary
        from employees
    ) e
) e
where salary > avg_salary

Answer 2

You are using distinct in the subquery, not in the outer query, so the outer query still sees the data with duplicates.

You could solve this with:

select distinct e.FirstName, e.LastName, e.Gender, e.Salary
from Employees e
where e.Salary > (SELECT AVG(distinct e2.SALARY) FROM employees e2);
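
A rough sketch of what distinct does in each position, again assuming SQLite through Python's sqlite3 module rather than your actual database. The rows with IDs 9 and 10 are hypothetical additions (two different people who happen to earn the same amount), included only to show that AVG(DISTINCT Salary) removes duplicate salary values, which is not the same thing as removing duplicate people.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "ID INTEGER, FirstName TEXT, LastName TEXT, Gender TEXT, Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ben", "Hoskins", "Male", 70000),
        (2, "Mark", "Hastings", "Male", 60000),
        (4, "Ben", "Hoskins", "Male", 70000),
        (8, "John", "Stanmore", "Male", 80000),
        # Hypothetical rows, NOT in the question: two different people
        # with the same salary.
        (9, "Sara", "Lee", "Female", 40000),
        (10, "Tom", "Reed", "Male", 40000),
    ],
)

# Each salary VALUE counts once: 70000, 60000, 80000, 40000.
avg_distinct_salary = conn.execute(
    "select avg(distinct Salary) from Employees"
).fetchone()[0]

# Each PERSON counts once, so both 40000 salaries stay in the average.
avg_per_person = conn.execute(
    """
    select avg(salary)
    from (
        select distinct firstname, lastname, gender, salary
        from employees
    ) e
    """
).fetchone()[0]

# DISTINCT in the outer query collapses the two identical Ben rows.
rows = conn.execute(
    """
    select distinct e.FirstName, e.LastName, e.Gender, e.Salary
    from Employees e
    where e.Salary > (select avg(distinct e2.Salary) from Employees e2)
    """
).fetchall()

print(avg_distinct_salary, avg_per_person)
print(sorted(rows))
```

Which of the two averages is the right threshold depends on what you mean by "average salary", so it is worth checking against your real data.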

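However, if you have rows with duplicates like this, it seems there is something seriously wrong with the data model. The data should be fixed.

In the meantime, you can work around the problem. You can express your query using a CTE that eliminates the duplicates: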

with e as (
      select e.*
      from (select e.*,
                   row_number() over (partition by firstname, lastname, gender order by id desc) as seqnum
            from employees e
           ) e
      where seqnum = 1
     )
select e.*
from e
where e.salary > (select avg(salary) from e)
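
A small sketch of the row_number() approach, assuming SQLite 3.25 or later (the first version with window functions) through Python's sqlite3 module. The names dedup, ranked and emp are my own, chosen so the CTE and the aliases do not all share the name e, and the row with ID 9 is a hypothetical addition that is not in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "ID INTEGER, FirstName TEXT, LastName TEXT, Gender TEXT, Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ben", "Hoskins", "Male", 70000),
        (2, "Mark", "Hastings", "Male", 60000),
        (4, "Ben", "Hoskins", "Male", 70000),
        (8, "John", "Stanmore", "Male", 80000),
        # Hypothetical row, NOT in the question: lowers the average so
        # that Ben's salary is above it.
        (9, "Sara", "Lee", "Female", 40000),
    ],
)

kept_rows = conn.execute(
    """
    with dedup as (
        select ranked.*
        from (
            select emp.*,
                   -- number the rows of each person, highest id first
                   row_number() over (
                       partition by firstname, lastname, gender
                       order by id desc
                   ) as seqnum
            from employees emp
        ) ranked
        where seqnum = 1   -- keep one row per person
    )
    select id, firstname, salary
    from dedup
    where salary > (select avg(salary) from dedup)
    order by id
    """
).fetchall()

print(kept_rows)
```

Unlike select distinct, this keeps the id column: for Ben, the row with the higher id (4) survives because of order by id desc. Both the filter and the average read from the same de-duplicated CTE.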
