Select records all within 10 minutes from each other

I have a data feed coming into my Oracle database.

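If a given OFFICE_ID has gone down and, for the same day, it has records for all three clients (A, B, C), then we have to check whether all the clients have gone down. If so, we need to check whether the fail times of all three clients fall within a 10-minute range.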

If this happens three times in one day for a particular office, we declare that office closed.

Here is some sample data:

+-----------+-----------+--------------+--------+
| OFFICE_ID | FAIL_TIME | ACTIVITY_DAY | CLIENT |
+-----------+-----------+--------------+--------+
| 1002      | 5:39:00   | 23/01/2015   | A      |
| 1002      | 17:49:00  | 23/12/2014   | A      |
| 1002      | 18:41:57  | 1/5/2014     | B      |
| 1002      | 10:32:00  | 1/7/2014     | A      |
| 1002      | 10:34:23  | 1/7/2014     | B      |
| 1002      | 10:35:03  | 1/7/2014     | C      |
| 1002      | 12:08:52  | 1/7/2014     | B      |
| 1002      | 12:09:00  | 1/7/2014     | A      |
| 1002      | 12:26:10  | 1/7/2014     | B      |
| 1002      | 13:31:32  | 1/7/2014     | B      |
| 1002      | 15:24:06  | 1/7/2014     | B      |
| 1002      | 15:55:06  | 1/7/2014     | C      |
+-----------+-----------+--------------+--------+

The result should look like this:

1002    10:32:00      A
1002    10:34:23      B
1002    10:35:03      C

Any help would be appreciated. I am looking for a SQL query or a PL/SQL procedure.

To identify the records, you can self-join the table so that it appears three times, once per client:

SELECT
  a.*, b.*, c.*
FROM FailLog a INNER JOIN
     FailLog b ON b.OFFICE_ID = a.OFFICE_ID AND
          a.CLIENT = 'A' AND
          b.CLIENT = 'B' AND
          b.ACTIVITY_DAY = a.ACTIVITY_DAY INNER JOIN
     FailLog c ON c.OFFICE_ID = a.OFFICE_ID AND
          c.CLIENT = 'C' AND
          c.ACTIVITY_DAY = a.ACTIVITY_DAY AND
          -- assumes FAIL_TIME is a DATE: subtracting two DATEs yields days,
          -- so the difference is scaled to minutes before comparing with 10
          (GREATEST (a.FAIL_TIME, b.FAIL_TIME, c.FAIL_TIME) -
           LEAST (a.FAIL_TIME, b.FAIL_TIME, c.FAIL_TIME)) * 24 * 60 <= 10

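The output gives you one row rather than the three requested in the question, but that is the right level for the failure data, since all three clients have to be part of each failure.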

The first thing we need is a way to compare FAIL_TIME values. Since you haven't posted the table structure, let's assume we are dealing with strings.

Oracle has some neat built-in functions for converting between dates and strings. If we concatenate ACTIVITY_DAY and FAIL_TIME, we can convert them to a DATE datatype:

to_date(ACTIVITY_DAY||' '||FAIL_TIME, 'dd/mm/yyyy hh24:mi:ss')

We can convert that to a string representing the number of seconds after midnight:

to_char(to_date(ACTIVITY_DAY||' '||FAIL_TIME, 'dd/mm/yyyy hh24:mi:ss'), 'sssss')

Then we can cast that to a number, which we can use in some arithmetic to compare with other rows; ten minutes = 600 seconds.
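As a quick check of the conversion, here is a minimal sketch that runs against DUAL and uses two FAIL_TIME values from the sample data. The column aliases (a_sssss, c_sssss, gap_seconds, within_ten_minutes) are made up for illustration.

```sql
-- Sketch only: converts two sample timestamps to seconds after midnight
-- and tests whether they are at most 600 seconds (ten minutes) apart.
with s as (
  select to_number(to_char(to_date('1/7/2014' || ' ' || '10:32:00', 'dd/mm/yyyy hh24:mi:ss'), 'sssss')) as a_sssss
       , to_number(to_char(to_date('1/7/2014' || ' ' || '10:35:03', 'dd/mm/yyyy hh24:mi:ss'), 'sssss')) as c_sssss
  from dual
)
select a_sssss                                    -- 10*3600 + 32*60      = 37920
     , c_sssss                                    -- 10*3600 + 35*60 + 3  = 38103
     , c_sssss - a_sssss as gap_seconds           -- 183
     , case when abs(c_sssss - a_sssss) <= 600
            then 'Y' else 'N' end as within_ten_minutes
from s;
```

Because the value is seconds after midnight, this comparison only makes sense for rows on the same ACTIVITY_DAY; two failures on either side of midnight would not be seen as close together.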

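Next we can use subquery factoring (the WITH clause). A neat feature of this syntax is that we can pass the output of one subquery into another, so we only need to write that gnarly nested cast expression once.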

with t as
    ( select OFFICE_ID
               , ACTIVITY_DAY
               , FAIL_TIME
               , to_number(to_char(to_date(ACTIVITY_DAY||' '||FAIL_TIME, 'dd/mm/yyyy hh24:mi:ss'), 'sssss')) FAIL_TIME_SSSSS
               , CLIENT
      from faillog
    )

We can use this subquery to build further subqueries that split the table's rows into one set per CLIENT, for use in our main query.

Finally, we can use the analytic COUNT() function to track how many matching sets of FAIL_TIME values there are for each combination of OFFICE_ID and ACTIVITY_DAY.

count(*) over (partition by a.OFFICE_ID, a.ACTIVITY_DAY) 
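To see what this analytic count returns, here is a small sketch over hypothetical rows. The matches subquery and its values are invented for illustration and stand in for the joined A/B/C rows.

```sql
-- Sketch only: "matches" is a made-up stand-in for the joined result.
-- Every row keeps its own columns and also carries the number of rows
-- that share its OFFICE_ID and ACTIVITY_DAY.
with matches as (
            select 1002 as office_id, '1/7/2014' as activity_day, '10:32:00' as a_fail_time from dual
  union all select 1002, '1/7/2014', '12:09:00' from dual
  union all select 1002, '2/7/2014', '09:00:00' from dual
)
select office_id
     , activity_day
     , a_fail_time
     , count(*) over (partition by office_id, activity_day) as fail_count
from matches
order by activity_day, a_fail_time;
```

Unlike GROUP BY, the analytic form does not collapse rows: the two rows for 1/7/2014 each get fail_count = 2 and the single row for 2/7/2014 gets fail_count = 1, which is why the count can be filtered in an outer query.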

Putting it all in an inline view lets us test whether we can "declare the office as closed".

select * from (
    with t as ( select OFFICE_ID
                       , ACTIVITY_DAY
                       , FAIL_TIME
                       , to_number(to_char(to_date(ACTIVITY_DAY||' '||FAIL_TIME, 'dd/mm/yyyy hh24:mi:ss'), 'sssss')) FAIL_TIME_SSSSS
                       , CLIENT
                from faillog
                )
         , a as (select *
                  from t
                  where CLIENT = 'A' )
         , b as (select *
                  from t
                  where CLIENT = 'B' )
         , c as (select *
                  from t
                  where CLIENT = 'C' )
    select a.OFFICE_ID
           , a.ACTIVITY_DAY 
           , a.FAIL_TIME as a_fail_time
           , b.FAIL_TIME as b_fail_time
           , c.FAIL_TIME as c_fail_time
           , count(*) over (partition by a.OFFICE_ID, a.ACTIVITY_DAY) as fail_count
    from a 
         join b on a.OFFICE_ID = b.OFFICE_ID and a.ACTIVITY_DAY = b.ACTIVITY_DAY
         join c on a.OFFICE_ID = c.OFFICE_ID and a.ACTIVITY_DAY = c.ACTIVITY_DAY
    where a.FAIL_TIME_SSSSS between b.FAIL_TIME_SSSSS - 600 and b.FAIL_TIME_SSSSS + 600
    and   a.FAIL_TIME_SSSSS between c.FAIL_TIME_SSSSS - 600 and c.FAIL_TIME_SSSSS + 600
    and   b.FAIL_TIME_SSSSS between a.FAIL_TIME_SSSSS - 600 and a.FAIL_TIME_SSSSS + 600
    and   b.FAIL_TIME_SSSSS between c.FAIL_TIME_SSSSS - 600 and c.FAIL_TIME_SSSSS + 600
    and   c.FAIL_TIME_SSSSS between a.FAIL_TIME_SSSSS - 600 and a.FAIL_TIME_SSSSS + 600
    and   c.FAIL_TIME_SSSSS between b.FAIL_TIME_SSSSS - 600 and b.FAIL_TIME_SSSSS + 600
)
where fail_count >= 3
/
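The six BETWEEN predicates in the query are pairwise symmetric: "a between b - 600 and b + 600" is the same test as "b between a - 600 and a + 600". As a sketch of one possible simplification (not the query as originally posted), the same filter can be written with three ABS comparisons. It assumes the same faillog table and string columns as above, and leaves out the fail_count wrapper for brevity.

```sql
-- Sketch only: same matching rule as above, one ABS test per client pair.
with t as ( select OFFICE_ID
                 , ACTIVITY_DAY
                 , FAIL_TIME
                 , to_number(to_char(to_date(ACTIVITY_DAY||' '||FAIL_TIME, 'dd/mm/yyyy hh24:mi:ss'), 'sssss')) FAIL_TIME_SSSSS
                 , CLIENT
            from faillog )
select a.OFFICE_ID
     , a.ACTIVITY_DAY
     , a.FAIL_TIME as a_fail_time
     , b.FAIL_TIME as b_fail_time
     , c.FAIL_TIME as c_fail_time
from t a
     join t b on a.OFFICE_ID = b.OFFICE_ID and a.ACTIVITY_DAY = b.ACTIVITY_DAY
     join t c on a.OFFICE_ID = c.OFFICE_ID and a.ACTIVITY_DAY = c.ACTIVITY_DAY
where a.CLIENT = 'A'
and   b.CLIENT = 'B'
and   c.CLIENT = 'C'
and   abs(a.FAIL_TIME_SSSSS - b.FAIL_TIME_SSSSS) <= 600   -- A and B within ten minutes
and   abs(a.FAIL_TIME_SSSSS - c.FAIL_TIME_SSSSS) <= 600   -- A and C within ten minutes
and   abs(b.FAIL_TIME_SSSSS - c.FAIL_TIME_SSSSS) <= 600;  -- B and C within ten minutes
```

Since all three pairs are tested, the earliest and latest of the three fail times can be at most 600 seconds apart.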

Notes

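  1. Obviously I have hard-coded the CLIENT identifiers in the subqueries. It is possible to avoid the hard-coding, but the example query is already complicated enough.
  2. This query does not search for triplets. Provided A, B and C each have a failure within the ten-minute window, it doesn't matter how many instances of each CLIENT occur in that window. There is nothing in your business rules to say this is wrong.
  3. Likewise, the same instance of one CLIENT can match instances of the other CLIENTs in overlapping windows. This may be undesirable: double or triple counting could inflate FAIL_COUNT. But again, handling it would make the final query more complicated.
  4. The query as presented has one row for each distinct combination of A, B and C FAIL_TIME values. If you really need one row for each CLIENT/FAIL_TIME, you can unpivot the result set.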
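Turning the per-client columns back into one row per CLIENT/FAIL_TIME, as the last note suggests, might look like the sketch below. It uses the UNPIVOT clause available from Oracle 11g; the matches subquery is a hypothetical stand-in for one row of the query's output, built from DUAL so the example is self-contained.

```sql
-- Sketch only: "matches" stands in for one row returned by the main query.
with matches as (
  select 1002       as office_id
       , '1/7/2014' as activity_day
       , '10:32:00' as a_fail_time
       , '10:34:23' as b_fail_time
       , '10:35:03' as c_fail_time
  from dual
)
select office_id
     , fail_time
     , client
from matches
unpivot ( fail_time for client in ( a_fail_time as 'A'
                                  , b_fail_time as 'B'
                                  , c_fail_time as 'C' ) )
order by fail_time;
```

For this single input row the result is three rows, one each for clients A, B and C, which matches the shape asked for in the question. UNPIVOT requires the unpivoted columns to share a datatype, which holds here because all three are strings of the same length.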

A solution that avoids the self-joins by using the COUNT analytic function with RANGE BETWEEN INTERVAL '10' MINUTE PRECEDING AND INTERVAL '10' MINUTE FOLLOWING:

Oracle 11g R2 Schema Setup:

CREATE TABLE Test ( OFFICE_ID, FAIL_TIME, ACTIVITY_DAY, CLIENT ) AS
          SELECT 1002,  '5:39:00', '23/01/2015', 'A' FROM DUAL
UNION ALL SELECT 1002, '17:49:00', '23/12/2014', 'A' FROM DUAL
UNION ALL SELECT 1002, '18:41:57', '1/5/2014', 'B' FROM DUAL
UNION ALL SELECT 1002, '10:32:00', '1/7/2014', 'A' FROM DUAL
UNION ALL SELECT 1002, '10:34:23', '1/7/2014', 'B' FROM DUAL
UNION ALL SELECT 1002, '10:35:03', '1/7/2014', 'C' FROM DUAL
UNION ALL SELECT 1002, '12:08:52', '1/7/2014', 'B' FROM DUAL
UNION ALL SELECT 1002, '12:09:00', '1/7/2014', 'A' FROM DUAL
UNION ALL SELECT 1002, '12:26:10', '1/7/2014', 'B' FROM DUAL
UNION ALL SELECT 1002, '13:31:32', '1/7/2014', 'B' FROM DUAL
UNION ALL SELECT 1002, '15:24:06', '1/7/2014', 'B' FROM DUAL
UNION ALL SELECT 1002, '15:55:06', '1/7/2014', 'C' FROM DUAL

Query 1:

WITH Times AS (
  SELECT OFFICE_ID,
         TO_DATE( ACTIVITY_DAY || ' ' || FAIL_TIME, 'DD/MM/YYYY HH24:MI:SS' ) AS FAIL_DATETIME,
         CLIENT
  FROM   Test
),
Next_Times As (
  SELECT OFFICE_ID,
         FAIL_DATETIME,
         COUNT( CASE CLIENT WHEN 'A' THEN 1 END ) OVER ( PARTITION BY OFFICE_ID ORDER BY FAIL_DATETIME RANGE BETWEEN INTERVAL '10' MINUTE PRECEDING AND INTERVAL '10' MINUTE FOLLOWING ) AS COUNT_A,
         COUNT( CASE CLIENT WHEN 'B' THEN 1 END ) OVER ( PARTITION BY OFFICE_ID ORDER BY FAIL_DATETIME RANGE BETWEEN INTERVAL '10' MINUTE PRECEDING AND INTERVAL '10' MINUTE FOLLOWING ) AS COUNT_B,
         COUNT( CASE CLIENT WHEN 'C' THEN 1 END ) OVER ( PARTITION BY OFFICE_ID ORDER BY FAIL_DATETIME RANGE BETWEEN INTERVAL '10' MINUTE PRECEDING AND INTERVAL '10' MINUTE FOLLOWING ) AS COUNT_C
  FROM   Times
)
SELECT OFFICE_ID,
       TO_CHAR( FAIL_DATETIME, 'HH24:MI:SS' ) AS FAIL_TIME,
       TO_CHAR( FAIL_DATETIME, 'DD/MM/YYYY' ) AS ACTIVITY_DAY       
FROM   Next_Times
WHERE  COUNT_A > 0
AND    COUNT_B > 0
AND    COUNT_C > 0
ORDER BY FAIL_DATETIME

Results:

| OFFICE_ID | FAIL_TIME | ACTIVITY_DAY |
|-----------|-----------|--------------|
|      1002 |  10:32:00 |   01/07/2014 |
|      1002 |  10:34:23 |   01/07/2014 |
|      1002 |  10:35:03 |   01/07/2014 |