SELECT fields from one table with aggregates from related table

Here is a simplified description of the 2 tables:

CREATE TABLE jobs(id PRIMARY KEY, description);
CREATE TABLE dates(id PRIMARY KEY, job REFERENCES jobs(id), date);

Each job may have one or more dates.

I want to create a query that produces the following (in pidgin):

jobs.id, jobs.description, min(dates.date) as start, max(dates.date) as finish

I tried something like this:

SELECT id, description,
      (SELECT min(date) as start  FROM dates d WHERE d.job=j.id),
      (SELECT max(date) as finish FROM dates d WHERE d.job=j.id)
FROM jobs j;

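This works, but it looks inefficient.

I tried an INNER JOIN, but I can't see how to join jobs with a suitable aggregate query on dates.

Can anyone suggest a clean and efficient way to do this?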

When retrieving all rows: aggregate first, join later:

SELECT id, j.description, d.start, d.finish
FROM   jobs j
LEFT   JOIN (
   SELECT job AS id, min(date) AS start, max(date) AS finish 
   FROM   dates 
   GROUP  BY job
   ) d USING (id);
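To see what the query returns, here is a minimal sketch that runs it against SQLite through Python's sqlite3 module. The sample rows are made up for illustration, and the dates are assumed to be stored as ISO-8601 text so that min() and max() sort them correctly. The third job has no dates on purpose, to show what the LEFT JOIN keeps.

```python
import sqlite3

# Table definitions are the ones from the question; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs(id PRIMARY KEY, description);
    CREATE TABLE dates(id PRIMARY KEY, job REFERENCES jobs(id), date);
    INSERT INTO jobs VALUES (1, 'paint'), (2, 'plumbing'), (3, 'no dates yet');
    INSERT INTO dates VALUES
        (10, 1, '2020-01-05'), (11, 1, '2020-01-02'), (12, 1, '2020-01-09'),
        (13, 2, '2020-03-01');
""")

# Aggregate dates once per job in the subquery, then join the result to jobs.
# ORDER BY is added only to make the printed output deterministic.
rows = conn.execute("""
    SELECT id, j.description, d.start, d.finish
    FROM   jobs j
    LEFT   JOIN (
       SELECT job AS id, min(date) AS start, max(date) AS finish
       FROM   dates
       GROUP  BY job
       ) d USING (id)
    ORDER  BY id
""").fetchall()

for row in rows:
    print(row)
```

The job without dates still shows up, with NULL (None in Python) for start and finish. With a plain JOIN instead of LEFT JOIN it would be dropped.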

About JOIN .. USING

It's not a "different type of join". USING (col) is a standard SQL (!) syntax shortcut for ON a.col = b.col. More precisely, quoting the manual:

The USING clause is a shorthand that allows you to take advantage of the specific situation where both sides of the join use the same name for the joining column(s). It takes a comma-separated list of the shared column names and forms a join condition that includes an equality comparison for each one. For example, joining T1 and T2 with USING (a, b) produces the join condition ON *T1*.a = *T2*.a AND *T1*.b = *T2*.b.

Furthermore, the output of JOIN USING suppresses redundant columns: there is no need to print both of the matched columns, since they must have equal values. While JOIN ON produces all columns from T1 followed by all columns from T2, JOIN USING produces one output column for each of the listed column pairs (in the listed order), followed by any remaining columns from T1, followed by any remaining columns from T2.

It is particularly convenient that you can write SELECT * FROM ... and the joining columns are only listed once.
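The column suppression is easy to check for yourself. The sketch below uses SQLite through Python's sqlite3 module rather than the database the manual quote comes from, and the tables t1 and t2 with their single row are hypothetical, modeled on the T1/T2 example in the quote.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1(a, b, c);
    CREATE TABLE t2(a, b, d);
    INSERT INTO t1 VALUES (1, 2, 'left');
    INSERT INTO t2 VALUES (1, 2, 'right');
""")

# JOIN ... USING: each shared column appears once in SELECT *.
using_cur = conn.execute("SELECT * FROM t1 JOIN t2 USING (a, b)")
using_cols = [col[0] for col in using_cur.description]
using_rows = using_cur.fetchall()

# JOIN ... ON: all columns of t1, followed by all columns of t2.
on_cur = conn.execute(
    "SELECT * FROM t1 JOIN t2 ON t1.a = t2.a AND t1.b = t2.b"
)
on_cols = [col[0] for col in on_cur.description]
on_rows = on_cur.fetchall()

print(len(using_cols), using_rows)
print(len(on_cols), on_rows)
```

The USING form yields four columns and the ON form six, because a and b are repeated. The exact order of columns under SELECT * may vary between database engines, so name the columns explicitly when the order matters.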

Alternatively, you can also use a window clause:

-- DISTINCT collapses the one-row-per-date output to one row per job
SELECT DISTINCT j.id, j.description,
       first_value(d.date) OVER w AS start,
       last_value(d.date) OVER w AS finish
FROM jobs j
JOIN dates d ON d.job = j.id
WINDOW w AS (PARTITION BY j.id ORDER BY d.date
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);

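Window functions effectively group by one or more columns (the PARTITION BY clause) and/or ORDER BY some other columns. You can then apply a window function, or even a regular aggregate function, without affecting the grouping or ordering of any other columns (description in your case). It requires a slightly different way of constructing queries, but once you get the idea it is pretty brilliant.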
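As a small illustration of a regular aggregate used as a window function, the sketch below runs min() and max() with an OVER clause in SQLite through Python's sqlite3 module. It assumes an SQLite build with window function support (3.25 or later); the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs(id PRIMARY KEY, description);
    CREATE TABLE dates(id PRIMARY KEY, job REFERENCES jobs(id), date);
    INSERT INTO jobs VALUES (1, 'paint'), (2, 'plumbing');
    INSERT INTO dates VALUES
        (10, 1, '2020-01-05'), (11, 1, '2020-01-02'), (13, 2, '2020-03-01');
""")

# min()/max() are computed per partition, but no rows are collapsed and
# no GROUP BY is needed for description or date.
rows = conn.execute("""
    SELECT j.id, j.description, d.date,
           min(d.date) OVER (PARTITION BY j.id) AS start,
           max(d.date) OVER (PARTITION BY j.id) AS finish
    FROM   jobs j
    JOIN   dates d ON d.job = j.id
    ORDER  BY j.id, d.date
""").fetchall()

for row in rows:
    print(row)
```

Every date row is kept and simply carries the start and finish of its job alongside it, which is the main difference from GROUP BY.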

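In your case you need to get the first value of a partition, which is easy because it is accessible by default. You also need to look beyond the window frame (which ends by default at the current row) to the last value in the partition, and for that you need the ROWS clause. Since you produce two columns using the same window definition, the WINDOW clause is used here; if it applied to a single column, you could simply write the window function in the select list followed by the OVER clause and the window definition, without its name (WINDOW w AS (...)).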
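The effect of the frame is easiest to see side by side. This is a rough sketch using SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later); the three sample dates are hypothetical and the dates table is cut down to what the demo needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dates(id PRIMARY KEY, job, date);
    INSERT INTO dates VALUES
        (10, 1, '2020-01-05'), (11, 1, '2020-01-02'), (12, 1, '2020-01-09');
""")

# Default frame: ends at the current row, so last_value() just returns
# the current row's own date.
default_frame = conn.execute("""
    SELECT date, last_value(date) OVER (PARTITION BY job ORDER BY date) AS finish
    FROM   dates
    ORDER  BY date
""").fetchall()

# Explicit frame over the whole partition: last_value() sees the final date.
full_frame = conn.execute("""
    SELECT date,
           last_value(date) OVER (PARTITION BY job ORDER BY date
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS finish
    FROM   dates
    ORDER  BY date
""").fetchall()

# Window functions keep one row per input row; DISTINCT reduces the
# result to one row per job.
per_job = conn.execute("""
    SELECT DISTINCT job,
           first_value(date) OVER w AS start,
           last_value(date)  OVER w AS finish
    FROM   dates
    WINDOW w AS (PARTITION BY job ORDER BY date
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
""").fetchall()

print(default_frame)
print(full_frame)
print(per_job)
```

With the default frame, finish equals each row's own date, while the explicit ROWS clause gives the true last date on every row.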