Query for last record with identical values in a continuous block

In SQLite, I have a table datatable with the following format:

+-----------------------+-----+-----+
|       timestamp       |  x  |  y  |
+-----------------------+-----+-----+
| "2015-01-30 23:00:00" |  1  |  1  |
| "2015-01-30 22:00:00" |  2  |  2  |
| "2015-01-30 21:00:00" |  2  |  2  |
| "2015-01-30 20:00:00" |  2  |  2  |
| "2015-01-30 19:00:00" |  3  |  3  |
| "2015-01-30 18:00:00" |  4  |  4  |
| "2015-01-30 17:00:00" |  2  |  2  |
+-----------------------+-----+-----+

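I want to extract the oldest record (by timestamp) in the continuous block whose x,y values match the x,y values of the second most recent entry. I have a working query (see the end of the post), but it relies on several nested subqueries and is very inefficient. I know there must be a better way.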

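Using the example table above:

  1. The search coordinates x,y must match 2,2, the values of the second most recent entry (timestamp = '2015-01-30 22:00:00')
  2. The record must come from the continuous block with the same x,y (22:00-20:00), not from any earlier record that also has coordinates 2,2 (i.e. 17:00)
  3. The expected value is the oldest record in this 2,2 block, i.e. 20:00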

Here is my current query. It works, but it could be slow for large tables, especially because of the string concatenation.

-- find oldest time in continuous block that matches coordinates of interest
select min(timestamp) from datatable
where timestamp > (
    -- find most recent time that does not match coordinates of interest
    select max(timestamp) from datatable
    where timestamp < '2015-01-30 23:00:00'
    and x || ' | ' || y != (
        -- find coordinates of interest (2nd most recent record)
        select x || ' | ' || y
        from datatable
        where timestamp < '2015-01-30 23:00:00'
        order by timestamp desc
        limit 1
        -- returns 2 | 2
    )
    -- returns '2015-01-30 19:00:00'
)
-- returns '2015-01-30 20:00:00' (which is the expected result)

The string concatenation can be removed:

select min(timestamp), x, y
from datatable
where timestamp > (select max(timestamp)
                   from datatable
                   join (select x, y
                         from datatable
                         order by timestamp desc
                         limit 1 offset 1) as second
                   on datatable.x <> second.x
                   or datatable.y <> second.y
                   where timestamp < (select timestamp
                                      from datatable
                                      order by timestamp desc
                                      limit 1 offset 1))

With an index on timestamp, neither query should be too bad.
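As a rough way to try this out, the sketch below loads the sample rows into an in-memory database, creates an index on timestamp, and runs the join-based query. The column types (TEXT, INTEGER) and the index name idx_datatable_timestamp are assumptions; the post does not show the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema: the original post only shows the column names.
conn.execute("CREATE TABLE datatable (timestamp TEXT, x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO datatable VALUES (?, ?, ?)",
    [
        ("2015-01-30 23:00:00", 1, 1),
        ("2015-01-30 22:00:00", 2, 2),
        ("2015-01-30 21:00:00", 2, 2),
        ("2015-01-30 20:00:00", 2, 2),
        ("2015-01-30 19:00:00", 3, 3),
        ("2015-01-30 18:00:00", 4, 4),
        ("2015-01-30 17:00:00", 2, 2),
    ],
)

# Hypothetical index name; lets the ORDER BY / MAX lookups use the index.
conn.execute("CREATE INDEX idx_datatable_timestamp ON datatable (timestamp)")

BLOCK_START_SQL = """
select min(timestamp), x, y
from datatable
where timestamp > (select max(timestamp)
                   from datatable
                   join (select x, y
                         from datatable
                         order by timestamp desc
                         limit 1 offset 1) as second
                   on datatable.x <> second.x
                   or datatable.y <> second.y
                   where timestamp < (select timestamp
                                      from datatable
                                      order by timestamp desc
                                      limit 1 offset 1))
"""

block_start = conn.execute(BLOCK_START_SQL).fetchone()
print(block_start)  # ('2015-01-30 20:00:00', 2, 2)
```

Whether the index is actually used can be checked by prefixing the query with EXPLAIN QUERY PLAN; the exact plan text depends on the SQLite version.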

The fastest approach is probably to search for the end of the block in the application, i.e. to read the results of this query:

select timestamp, x, y
from datatable
order by timestamp desc
limit -1 offset 1

and stop when the x,y values change.
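A minimal sketch of that application-side scan, using Python's sqlite3 module. The helper name find_block_start and the column types are assumptions made for illustration; the sample rows are the ones from the question.

```python
import sqlite3


def find_block_start(conn):
    """Return the oldest (timestamp, x, y) row of the continuous block that
    starts at the second most recent record, or None if there is no such record.

    find_block_start is a hypothetical helper, not part of any library.
    """
    cursor = conn.execute(
        "select timestamp, x, y from datatable "
        "order by timestamp desc limit -1 offset 1"
    )
    first = cursor.fetchone()
    if first is None:
        return None  # fewer than two rows in the table
    target = first[1:]  # x,y of the second most recent record
    oldest = first
    for row in cursor:
        if row[1:] != target:
            break  # x,y changed: the block has ended
        oldest = row
    return oldest


conn = sqlite3.connect(":memory:")
# Assumed schema: the original post only shows the column names.
conn.execute("CREATE TABLE datatable (timestamp TEXT, x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO datatable VALUES (?, ?, ?)",
    [
        ("2015-01-30 23:00:00", 1, 1),
        ("2015-01-30 22:00:00", 2, 2),
        ("2015-01-30 21:00:00", 2, 2),
        ("2015-01-30 20:00:00", 2, 2),
        ("2015-01-30 19:00:00", 3, 3),
        ("2015-01-30 18:00:00", 4, 4),
        ("2015-01-30 17:00:00", 2, 2),
    ],
)

print(find_block_start(conn))  # ('2015-01-30 20:00:00', 2, 2)
```

Because the cursor is read row by row, the loop only fetches as many rows as the block is long plus one, rather than the whole table.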