Collect data from a table with two levels of GROUP BY
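I am trying to collect data from an Oracle database table using GROUP BY. I think I need two levels of GROUP BY, but I don't know how to finish my query.
I have a STATUS table holding millions of status rows, related to REQUEST like this: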
REQUEST      STATUS
-------      -----------
ID      ->   REQUEST_ID
...          ID
             STATUS_CODE
             ....
Example request flow (STATUS table):
SELECT ... FROM STATUS WHERE REQUEST_ID = 1 ORDER BY ID;
ID REQUEST_ID STATUS_CODE STATUS_ALIAS CREATED
1 1 201 REQUEST_SAVED
2 1 204 REQUEST_SIGNATURE_VALID
3 1 210 REQUEST_XML_VALID
4 1 280 REQUEST_ACCEPTED
5 1 310 SENT_TO_SYSTEM_1_FOR_VERIFICATION
6 1 320 SENT_TO_SYSTEM_2_FOR_VERIFICATION
7 1 521 SYSTEM_1_VERIFICATION_ERROR
8 1 511 SYSTEM_2_VERIFICATION_ERROR
24880 1 310 SENT_TO_SYSTEM_1_FOR_VERIFICATION
24881 1 320 SENT_TO_SYSTEM_2_FOR_VERIFICATION
24885 1 620 SYSTEM_1_VERIFICATION_TIMEOUT
24886 1 610 SYSTEM_2_VERIFICATION_TIMEOUT
24887 1 310 SENT_TO_SYSTEM_1_FOR_VERIFICATION
24888 1 320 SENT_TO_SYSTEM_2_FOR_VERIFICATION
.....
I want to collect the REQUEST_IDs that are in a VERIFICATION state and have not yet timed out, like this:
24887 1 310 SENT_TO_SYSTEM_1_FOR_VERIFICATION
.....
This is how I select the data:
SELECT REQUEST_ID, STATUS_CODE, MAX(ID) FROM STATUS
GROUP BY REQUEST_ID, STATUS_CODE HAVING STATUS_CODE = 310;
REQUEST_ID STATUS_CODE MAX(ID)
1 310 24887
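This correctly shows the ID from which I need to filter the status records grouped by REQUEST_ID, but when I combine this query with an outer SELECT to show the REQUEST_IDs, it does not work.
This is my best attempt so far: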
SELECT T1.REQUEST_ID FROM STATUS T1
GROUP BY T1.REQUEST_ID, T1.ID HAVING T1.ID >= (
SELECT MAX(ID) FROM STATUS T2
GROUP BY T2.REQUEST_ID, T2.STATUS_CODE
HAVING T2.STATUS_CODE IN (310, 320) AND NOT IN (610, 620)
);
ORA-01427: single-row subquery returns more than one row
01427. 00000 - "single-row subquery returns more than one row"
Update
The problem with the suggested solutions is as follows.
Suppose the flow continues like this:
24887 1 310 SENT_TO_SYSTEM_1_FOR_VERIFICATION
24888 1 320 SENT_TO_SYSTEM_2_FOR_VERIFICATION
24889 1 460 SYSTEM_2_VERIFICATION_OK
24890 1 510 SYSTEM_1_VERIFICATION_ERROR
Then, if there is no other response from system 1 within, say, 10 minutes, I need to add a timeout for system 1 only:
24891 1 620 SYSTEM_1_VERIFICATION_TIMEOUT
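But only once. That is why the query must filter out 620. Otherwise request ID 1 shows up in the result set again on the next run, even though the timeout flag was already set by the previous check.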
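Update 2
I could write the appropriate "WHERE" condition at the Java level and use lambda filters to find the requests in a 'stuck' state that need a timeout status added. But then I would always have to loop over the entire STATUS table from Java and run my Java logic on every GROUP BY REQUEST_ID group. That is ugly and slow; it would run for so long that the solution would not be workable. Maybe I need a stored procedure? That is why I want one "super" SQL query that returns the IDs of the stuck requests, so that I can set the timeout flag on the requests with those IDs.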
I may be confused, but I think all you need is:
SELECT REQUEST_ID, STATUS_CODE, MAX(ID)
FROM STATUS
WHERE STATUS_CODE IN (310, 320)
GROUP BY REQUEST_ID, STATUS_CODE;
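The condition T2.STATUS_CODE IN (310, 320) AND NOT IN (610, 620) does not make sense: once you restrict the status code to 310/320, it can never be 610/620.
In HAVING T2.STATUS_CODE IN (310, 320) AND NOT IN (610, 620), the second clause adds nothing: if the code is in (310, 320), it cannot be in (610, 620).
See the db<>fiddle link below for the schema, tests, and other queries.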
SELECT
REQUEST_ID,
STATUS_CODE,
MAX(ID) AS MAX_ID
FROM STATUS
WHERE STATUS_CODE IN (310, 320)
GROUP BY
REQUEST_ID,
STATUS_CODE;
REQUEST_ID | STATUS_CODE | MAX_ID
---------: | ----------: | -----:
1 | 310 | 24887
1 | 320 | 24888
db<>fiddle here
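As for the ORA-01427 in the question: the inner query has a GROUP BY, so it returns one row per (REQUEST_ID, STATUS_CODE) group, and a scalar comparison such as >= accepts at most one row on its right-hand side. A minimal sketch of one way to wrap the grouped query in an outer SELECT is to compare with IN, which accepts a multi-row subquery. It assumes the asker's STATUS table and column names, and it does not yet exclude requests that have already timed out.

```sql
-- Sketch only: outer SELECT over the grouped subquery.
-- IN accepts many rows, so the per-group MAX(ID) values no longer raise ORA-01427.
SELECT T1.REQUEST_ID,
       T1.STATUS_CODE,
       T1.ID
FROM STATUS T1
WHERE T1.ID IN (
        SELECT MAX(T2.ID)
        FROM STATUS T2
        WHERE T2.STATUS_CODE IN (310, 320)   -- filter rows before grouping
        GROUP BY T2.REQUEST_ID, T2.STATUS_CODE
      );
```

Filtering on STATUS_CODE in WHERE rather than HAVING lets the database discard the other status rows before it groups them.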
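In Oracle, you can use the LAST aggregate function (KEEP (DENSE_RANK LAST ...)) in the HAVING clause to filter by a request's final status.
In most DBMSs (any that support window functions), you can use row_number() to mark the last row and then filter on it.
Assuming the ID column is always increasing (or replacing it with a column that is), you get: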
create table t (ID, REQUEST_ID, STATUS_CODE, STATUS_ALIAS)
as
select 1, 1, 201, 'REQUEST_SAVED' from dual union all
select 2, 1, 204, 'REQUEST_SIGNATURE_VALID' from dual union all
select 3, 1, 210, 'REQUEST_XML_VALID' from dual union all
select 4, 1, 280, 'REQUEST_ACCEPTED' from dual union all
select 5, 1, 310, 'SENT_TO_SYSTEM_1_FOR_VERIFICATION' from dual union all
select 6, 1, 320, 'SENT_TO_SYSTEM_2_FOR_VERIFICATION' from dual union all
select 7, 1, 521, 'SYSTEM_1_VERIFICATION_ERROR' from dual union all
select 8, 1, 511, 'SYSTEM_2_VERIFICATION_ERROR' from dual union all
select 24880, 1, 310, 'SENT_TO_SYSTEM_1_FOR_VERIFICATION' from dual union all
select 24881, 1, 320, 'SENT_TO_SYSTEM_2_FOR_VERIFICATION' from dual union all
select 24885, 1, 620, 'SYSTEM_1_VERIFICATION_TIMEOUT' from dual union all
select 24886, 1, 610, 'SYSTEM_2_VERIFICATION_TIMEOUT' from dual union all
select 24887, 1, 310, 'SENT_TO_SYSTEM_1_FOR_VERIFICATION' from dual union all
select 24888, 1, 320, 'SENT_TO_SYSTEM_2_FOR_VERIFICATION' from dual union all
select 30000, 2, 201, 'REQUEST_SAVED' from dual union all
select 30001, 2, 204, 'REQUEST_SIGNATURE_VALID' from dual union all
select 30002, 2, 210, 'REQUEST_XML_VALID' from dual union all
select 30003, 2, 280, 'REQUEST_ACCEPTED' from dual union all
select 30004, 2, 310, 'SENT_TO_SYSTEM_1_FOR_VERIFICATION' from dual union all
select 30005, 2, 320, 'SENT_TO_SYSTEM_2_FOR_VERIFICATION' from dual union all
select 30006, 2, 521, 'SYSTEM_1_VERIFICATION_ERROR' from dual;
select
request_id
, max(status_alias) keep(dense_rank last order by id asc) as final_status
from t
/*To restrict input as much as possible*/
where status_code >= 310
group by request_id
having max(status_code) keep(dense_rank last order by id asc) in (310, 320);
REQUEST_ID | FINAL_STATUS
---------: | :--------------------------------
1 | SENT_TO_SYSTEM_2_FOR_VERIFICATION
with a as (
select
t.*
, row_number() over(
partition by
request_id
order by
id desc
) as rn
from t
where status_code >= 310
)
select *
from a
where rn = 1
and status_code in (310, 320);
ID | REQUEST_ID | STATUS_CODE | STATUS_ALIAS | RN
----: | ---------: | ----------: | :-------------------------------- | -:
24888 | 1 | 320 | SENT_TO_SYSTEM_2_FOR_VERIFICATION | 1
db<>fiddle here
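Both queries above look at the last row of the whole request. The update in the question asks about one system at a time, so here is a hedged sketch of a per-system check for system 1 using NOT EXISTS. It runs against the sample table t created above. The rule that any later row whose alias starts with SYSTEM_1_ counts as a response is an assumption made for illustration, as is the column alias sent_id; the real workflow may need a different list of codes.

```sql
-- Sketch only. Assumes the sample table t from above, and assumes that any
-- row whose alias starts with SYSTEM_1_ (OK, ERROR, TIMEOUT) answers an
-- earlier 310 row. Adjust that rule to the real workflow.
select s.request_id
     , s.id as sent_id
from t s
where s.status_code = 310
  and not exists (
        select 1
        from t r
        where r.request_id = s.request_id
          and r.id > s.id
          and (   r.status_code = 310                               -- a newer send supersedes this one
               or r.status_alias like 'SYSTEM\_1\_%' escape '\' )   -- system 1 already answered or timed out
      );
```

On the sample data this should return request 1 with sent_id 24887. Request 2 is excluded because row 30006 is a system 1 error that follows its 310 row. Once a 620 timeout row is written for a request, that request drops out of the result, which is the "only once" behaviour the update asks for. The 10-minute age check would be an extra predicate on the CREATED column, which the sample table t does not include.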