Why can't I exclude dependent columns from `GROUP BY` when I aggregate by a key?
If I have the following tables (using PostgreSQL as an example, but it could be any other relational database), where `car` has two keys (`id` and `vin`):
create table car (
  id int primary key not null,
  color varchar(10),
  brand varchar(10),
  vin char(17) unique not null
);

create table appraisal (
  id int primary key not null,
  recorded date not null,
  car_id int references car (id),
  car_vin char(17) references car (vin),
  price int
);
I can successfully include `c.color` and `c.brand` in the select list without aggregating them, since they depend on `c.id`:
select
  c.id, c.color, c.brand,
  min(price) as min_appraisal,
  max(price) as max_appraisal
from car c
left join appraisal a on a.car_id = c.id
group by c.id; -- c.color, c.brand are not needed here
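However, the following query fails. It does not let me include `c.color` and `c.brand` in the select list, even though they do depend on `c.vin`, which is also a key of the table.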
select
  c.vin, c.color, c.brand,
  min(price) as min_appraisal,
  max(price) as max_appraisal
from car c
left join appraisal a on a.car_vin = c.vin
group by c.vin; -- Why are c.color, c.brand needed here?
Error: ERROR: column "c.color" must appear in the GROUP BY clause or be used in an aggregate function
Position: 18
See the example in DB Fiddle.
Because only the PK covers all columns of an underlying table in the `GROUP BY` clause. Hence your first query works. A `UNIQUE` constraint does not.
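The combination of a non-deferrable `UNIQUE` constraint and a `NOT NULL` constraint would also qualify. But that is not implemented, nor are some other functional dependencies known to the SQL standard. Peter Eisentraut, the principal author of the feature, had more in mind, but at the time it was determined that demand was low and the associated costs might be high. See the discussion about the feature on pgsql-hackers.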
The PostgreSQL manual states:

When `GROUP BY` is present, or any aggregate functions are present, it is not valid for the `SELECT` list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.

And more explicitly:

PostgreSQL recognizes functional dependency (allowing columns to be omitted from `GROUP BY`) only when a table's primary key is included in the `GROUP BY` list. The SQL standard specifies additional conditions that should be recognized.
Since `c.vin` is `UNIQUE NOT NULL`, you can fix your second query by using the PK column instead:
...
group by c.id;
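Spelled out in full, the fix might look like the sketch below. It assumes the elided part (`...`) is the select list and join of the second query, left unchanged. The second variant, which lists the dependent columns explicitly, is an alternative that is not part of the answer above.

```sql
-- Variant 1: group by the primary key. c.vin, c.color and c.brand are all
-- functionally dependent on c.id, so PostgreSQL accepts them ungrouped.
SELECT c.vin, c.color, c.brand,
       min(a.price) AS min_appraisal,
       max(a.price) AS max_appraisal
FROM car c
LEFT JOIN appraisal a ON a.car_vin = c.vin
GROUP BY c.id;

-- Variant 2: keep grouping by c.vin, but name every
-- non-aggregated column of the select list in GROUP BY.
SELECT c.vin, c.color, c.brand,
       min(a.price) AS min_appraisal,
       max(a.price) AS max_appraisal
FROM car c
LEFT JOIN appraisal a ON a.car_vin = c.vin
GROUP BY c.vin, c.color, c.brand;
```

Because `vin` is unique, both variants form one group per car, so they should return the same rows.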
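As an aside: while referential integrity is enforced and the whole table is queried, both of the given queries can be substantially cheaper if you aggregate the rows in `appraisal` *before* the join. This removes the need for `GROUP BY` in the outer `SELECT` to begin with. Like: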
SELECT c.vin, c.color, c.brand
     , a.min_appraisal
     , a.max_appraisal
FROM   car c
LEFT   JOIN (
   SELECT car_vin
        , min(price) AS min_appraisal
        , max(price) AS max_appraisal
   FROM   appraisal
   GROUP  BY car_vin
   ) a ON a.car_vin = c.vin;
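To try the rewritten query, a few sample rows are enough. The values below are made up for illustration (hypothetical VINs, colors, brands and prices) and assume the `car` and `appraisal` tables from the question exist and are empty.

```sql
-- Two cars; only the first one has appraisals.
INSERT INTO car (id, color, brand, vin) VALUES
  (1, 'red',  'Fiat',  'VIN00000000000001'),
  (2, 'blue', 'Volvo', 'VIN00000000000002');

INSERT INTO appraisal (id, recorded, car_id, car_vin, price) VALUES
  (1, '2020-01-01', 1, 'VIN00000000000001', 1000),
  (2, '2020-06-01', 1, 'VIN00000000000001', 1500);

-- Aggregate appraisal first, then join; no GROUP BY in the outer query.
SELECT c.vin, c.color, c.brand, a.min_appraisal, a.max_appraisal
FROM car c
LEFT JOIN (
  SELECT car_vin, min(price) AS min_appraisal, max(price) AS max_appraisal
  FROM appraisal
  GROUP BY car_vin
) a ON a.car_vin = c.vin
ORDER BY c.vin;
-- Car 1 gets min 1000 / max 1500; car 2 has no appraisals,
-- so the LEFT JOIN leaves both aggregates NULL.
```

The car without appraisals still appears in the result, which is the same behavior as the original `LEFT JOIN` with an outer `GROUP BY`.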
See:
- Multiple array_agg() calls in a single query
Related:
- SQL statement working in MySQL not working in Postgresql - Sum & group_by rails 3
- PostgreSQL - GROUP BY clause