How to group on part of a column in PostgreSQL?

I have the following table, tableA, in PostgreSQL:

+-------------+-------------------------+
| OperationId |         Error           |
+-------------+-------------------------+
|           1 | MajorCategoryX:DetailsP |
|           2 | MajorCategoryX:DetailsQ |
|           3 | MajorCategoryY:DetailsR |
+-------------+-------------------------+

How can I group by the major error category so that I get the following?

+----------------+------------+
|    Category    | ErrorCount |
+----------------+------------+
| MajorCategoryX |          2 |
| MajorCategoryY |          1 |
+----------------+------------+

Category is the first part of Error after splitting on ':'.

Consider the substring() function (this form works only when the category prefix is always 14 characters long):

SELECT substring(TableName.Error,1,14) AS Category, 
       Count(*) As ErrorCount
FROM TableName
GROUP BY substring(TableName.Error,1,14) 

Assuming the length of the part before the : can vary, you can combine substring with strpos to get your result:

SELECT 
    SUBSTRING(error, 0, STRPOS(error, ':')) AS Category,     
    COUNT(*) AS ErrorCount
FROM t
GROUP BY SUBSTRING(error, 0, STRPOS(error, ':'))
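
To see what the two functions return for individual rows, here is a small sketch that runs the same expression against inline sample values. The alias `sample` and the column name `colon_pos` are made up for illustration; the strings are taken from the question's table.

```sql
-- Inspect the intermediate values used by the query above.
-- strpos() returns the 1-based position of ':' (0 if it is absent).
-- substring(str, 0, n) starts one position before the first character,
-- so it yields the n - 1 characters that precede the colon.
SELECT error,
       STRPOS(error, ':')                      AS colon_pos,
       SUBSTRING(error, 0, STRPOS(error, ':')) AS category
FROM (VALUES ('MajorCategoryX:DetailsP'),
             ('MajorCategoryY:DetailsR')) AS sample(error);
```

For both sample rows colon_pos is 15 and category is the 14-character prefix. A value without any ':' would give a position of 0 and therefore an empty category, which may or may not be what you want.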

If you don't want to repeat the function call, you can of course wrap that part in a subquery or a common table expression.
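
A common table expression version might look like the sketch below. The temporary table and its lowercase column names are an assumed stand-in for the question's tableA, and the CTE name `categorized` is arbitrary.

```sql
-- Assumed setup mirroring the table from the question.
CREATE TEMP TABLE tableA (operationid int, error text);
INSERT INTO tableA VALUES
    (1, 'MajorCategoryX:DetailsP'),
    (2, 'MajorCategoryX:DetailsQ'),
    (3, 'MajorCategoryY:DetailsR');

-- Compute the category once in the CTE, then group on its alias.
WITH categorized AS (
    SELECT SUBSTRING(error, 0, STRPOS(error, ':')) AS category
    FROM   tableA
)
SELECT category, COUNT(*) AS errorcount
FROM   categorized
GROUP  BY category
ORDER  BY category;
```

With the three sample rows this returns MajorCategoryX with a count of 2 and MajorCategoryY with a count of 1.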

Here is what I came up with, using a subquery and the split_part function:

SELECT *, COUNT(ErrorSplit) 
FROM (
  SELECT split_part(Error, ':', 1) AS ErrorSplit
  FROM tableA
) AS tableSplit
GROUP BY ErrorSplit;

Output:

   errorsplit  | count
----------------------
MajorCategoryX |  2
MajorCategoryY |  1

split_part() seems simplest.

But you don't need a subquery:

SELECT split_part(error, ':', 1) AS category, count(*) AS errorcount 
FROM   tbl
GROUP  BY 1;

And count(*) is slightly faster than count(<expression>).
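
Apart from speed, the two forms are not always interchangeable: count(*) counts every row, while count(<expression>) counts only the rows where the expression is not NULL. A minimal sketch with made-up inline values:

```sql
-- count(*) counts all rows; count(category) skips the NULL row.
SELECT count(*)        AS all_rows,       -- 3
       count(category) AS non_null_rows   -- 2
FROM (VALUES ('MajorCategoryX'),
             ('MajorCategoryX'),
             (NULL)) AS sample(category);
```

For the table in the question the difference does not matter, because split_part() returns NULL only when the Error value itself is NULL.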

GROUP BY 1 is a positional reference to the first SELECT item, a convenient shorthand for longer expressions. Example:

  • Select first row in each GROUP BY group?
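
For comparison, here is a sketch of two other ways to write the same grouping. The inline `tbl` CTE stands in for the real table and carries only the sample values from the question. PostgreSQL also accepts the output column name in GROUP BY, provided it does not clash with an input column name.

```sql
-- Variant 1: group by the output column name instead of its position.
WITH tbl(error) AS (
    VALUES ('MajorCategoryX:DetailsP'),
           ('MajorCategoryX:DetailsQ'),
           ('MajorCategoryY:DetailsR')
)
SELECT split_part(error, ':', 1) AS category, count(*) AS errorcount
FROM   tbl
GROUP  BY category;

-- Variant 2: repeat the full expression (the most portable form).
WITH tbl(error) AS (
    VALUES ('MajorCategoryX:DetailsP'),
           ('MajorCategoryX:DetailsQ'),
           ('MajorCategoryY:DetailsR')
)
SELECT split_part(error, ':', 1) AS category, count(*) AS errorcount
FROM   tbl
GROUP  BY split_part(error, ':', 1);
```

Both variants return the same two rows as the GROUP BY 1 query.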