Filter for distinct counts

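I want to count how many times each type of feather appears in the dataset, and then keep only the types that appear more than 5 times among rows where the beak column has the category Long.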

However, I get the following error:

near "(": syntax error

SELECT 
    land_birds.feather, land_birds.weight, COUNT(DISTINCT land_birds.feather) AS numFeathers,
    land_birds.size, sea_birds.beak
     
FROM
    land_birds
INNER JOIN
    sea_birds
ON 
    land_birds.colour = sea_birds.colour
WHERE sea_birds.colour IN (SELECT colour from land_birds) AND beak LIKE 'Long'
GROUP BY feather
ORDER BY feather ASC
FILTER(WHERE numFeathers > 5)

To filter information produced by grouping, use a HAVING clause, which goes immediately after the GROUP BY clause, like this:

SELECT
      land_birds.feather
    , land_birds.weight
    , COUNT(DISTINCT land_birds.feather) AS numFeathers
    , land_birds.size
    , sea_birds.beak
FROM land_birds
INNER JOIN sea_birds ON land_birds.colour = sea_birds.colour
WHERE beak LIKE 'Long'
GROUP BY land_birds.feather
HAVING COUNT(DISTINCT land_birds.feather) > 5
ORDER BY land_birds.feather ASC 

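Although it may seem logical to use the alias "numFeathers" that you gave the calculation in the HAVING clause, don't: not every database accepts a column alias there. Refer to the calculation itself instead. It may help to remember that the HAVING clause can reference a grouped calculation that does not appear in the SELECT clause at all. For example, this still works: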

SELECT
      land_birds.feather
    , land_birds.weight

    , land_birds.size
    , sea_birds.beak
FROM land_birds
INNER JOIN sea_birds ON land_birds.colour = sea_birds.colour
WHERE beak LIKE 'Long'
GROUP BY land_birds.feather
HAVING COUNT(DISTINCT land_birds.feather) > 5
ORDER BY land_birds.feather ASC 

Here there is no column alias for the calculation at all.
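
To try the HAVING filter on something runnable, here is a minimal sketch. It assumes SQLite driven from Python's built-in sqlite3 module (the error message format suggests SQLite, but that is a guess) and uses made-up rows in a cut-down land_birds table, not your real data. It also illustrates a caveat worth checking against your data: within a group defined by feather, COUNT(DISTINCT feather) is 1 for every non-NULL group, so if the goal is the number of rows per feather, COUNT(*) is probably the calculation you want in HAVING.

```python
import sqlite3

# In-memory database with a simplified, hypothetical land_birds table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE land_birds (feather TEXT, colour TEXT)")

# Made-up sample data: 'downy' appears 6 times, 'contour' only twice.
rows = [("downy", "brown")] * 6 + [("contour", "grey")] * 2
conn.executemany("INSERT INTO land_birds VALUES (?, ?)", rows)

query = """
SELECT feather,
       COUNT(*)                AS num_rows,      -- rows in each feather group
       COUNT(DISTINCT feather) AS num_distinct   -- always 1 inside a feather group
FROM land_birds
GROUP BY feather
HAVING COUNT(*) > 5          -- repeat the calculation, not the alias
ORDER BY feather ASC
"""
result = conn.execute(query).fetchall()
print(result)  # [('downy', 6, 1)]
```

Only the 'downy' group survives, because it is the only one with more than 5 rows; its distinct count is still 1.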


Some other observations about your query:

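  1. Always use the table name (or a table alias) when referencing columns, throughout the query.
  2. The inner join condition on colour means the result can only contain rows that match that condition exactly. So you do not need to repeat the same condition in the WHERE clause.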
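
The second observation can be checked with a small sketch, again assuming SQLite through Python's sqlite3 module and invented sample rows (the tables are cut down to the columns that matter here). The same join is run with and without the extra IN (SELECT colour FROM land_birds) condition and the results are compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified versions of the two tables with made-up rows.
conn.executescript("""
CREATE TABLE land_birds (feather TEXT, colour TEXT);
CREATE TABLE sea_birds  (colour TEXT, beak TEXT);
INSERT INTO land_birds VALUES ('downy', 'brown'), ('contour', 'grey');
INSERT INTO sea_birds  VALUES ('brown', 'Long'), ('white', 'Long'), ('grey', 'Short');
""")

base = """
SELECT land_birds.feather, sea_birds.colour, sea_birds.beak
FROM land_birds
INNER JOIN sea_birds ON land_birds.colour = sea_birds.colour
WHERE sea_birds.beak LIKE 'Long'
"""
# Same query plus the condition that the join already guarantees.
with_in = base + " AND sea_birds.colour IN (SELECT colour FROM land_birds)"
order = " ORDER BY land_birds.feather"

rows_plain = conn.execute(base + order).fetchall()
rows_with_in = conn.execute(with_in + order).fetchall()

print(rows_plain)                  # [('downy', 'brown', 'Long')]
print(rows_plain == rows_with_in)  # True
```

The 'white' sea bird never reaches the WHERE clause because the inner join has already dropped it, so the IN condition has nothing left to exclude.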

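One last point: please don't think of the HAVING clause as a substitute for the WHERE clause. The WHERE clause is applied before grouping, so it reduces the amount of data to be grouped. The HAVING clause filters the generated information, which can only exist after grouping. In short, they are very different clauses with specific functions and uses.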
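
The difference can be seen in a short sketch, under the same assumptions as before (SQLite via Python's sqlite3 module; the birds table and its rows are hypothetical and exist only for illustration). Both queries use the same HAVING threshold, but only the first one removes rows with WHERE before the groups are counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE birds (feather TEXT, beak TEXT)")
# Made-up data: 4 rows per feather, but a different Long/Short split.
conn.executemany(
    "INSERT INTO birds VALUES (?, ?)",
    [("downy", "Long")] * 3 + [("downy", "Short")]
    + [("contour", "Long")] + [("contour", "Short")] * 3,
)

# WHERE discards the 'Short' rows first, so the counts only cover 'Long' rows.
where_then_having = conn.execute("""
    SELECT feather, COUNT(*) FROM birds
    WHERE beak = 'Long'
    GROUP BY feather
    HAVING COUNT(*) >= 3
    ORDER BY feather
""").fetchall()

# Without WHERE, every row is grouped and both groups pass the threshold.
having_only = conn.execute("""
    SELECT feather, COUNT(*) FROM birds
    GROUP BY feather
    HAVING COUNT(*) >= 3
    ORDER BY feather
""").fetchall()

print(where_then_having)  # [('downy', 3)]
print(having_only)        # [('contour', 4), ('downy', 4)]
```

WHERE changes what gets counted; HAVING only decides which of the finished groups are kept.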