Query for last record with identical values in a continuous block
In SQLite, I have a table datatable in the following format:
+-----------------------+---+---+
| timestamp             | x | y |
+-----------------------+---+---+
| "2015-01-30 23:00:00" | 1 | 1 |
| "2015-01-30 22:00:00" | 2 | 2 |
| "2015-01-30 21:00:00" | 2 | 2 |
| "2015-01-30 20:00:00" | 2 | 2 |
| "2015-01-30 19:00:00" | 3 | 3 |
| "2015-01-30 18:00:00" | 4 | 4 |
| "2015-01-30 17:00:00" | 2 | 2 |
+-----------------------+---+---+
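I want to extract the oldest record (by timestamp) in the continuous block whose x,y values match those of the second most recent entry. I have a working query (see the end of this post), but it is very inefficient because of the multiple subqueries. I know there must be a better way.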
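Using the sample table above:
- The search coordinates x,y must match the 2,2 of the second most recent entry (timestamp = '2015-01-30 22:00:00')
- The record must come from the same continuous block of x,y values (22:00 to 20:00), not from any earlier record that also has the coordinates 2,2 (i.e. 17:00)
- The expected value is the earliest record in this 2,2 block, i.e. 20:00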
Here is my current query. It works, but it may be slow on large tables, especially because of the string concatenation.
-- find oldest time in continuous block that matches coordinates of interest
select min(timestamp) from datatable
where timestamp > (
    -- find most recent time that does not match coordinates of interest
    select max(timestamp) from datatable
    where timestamp < '2015-01-30 23:00:00'
      and x || ' | ' || y != (
        -- find coordinates of interest (2nd most recent record)
        select x || ' | ' || y
        from datatable
        where timestamp < '2015-01-30 23:00:00'
        order by timestamp desc  -- newest first, so limit 1 picks the 2nd most recent row
        limit 1
        -- returns 2 | 2
      )
    -- returns '2015-01-30 19:00:00'
)
-- returns '2015-01-30 20:00:00' (which is the expected result)
The string concatenation can be removed:
select min(timestamp), x, y
from datatable
where timestamp > (select max(timestamp)
                   from datatable
                   join (select x, y
                         from datatable
                         order by timestamp desc
                         limit 1 offset 1) as second
                     on datatable.x <> second.x
                     or datatable.y <> second.y
                   where timestamp < (select timestamp
                                      from datatable
                                      order by timestamp desc
                                      limit 1 offset 1))
With an index on timestamp, neither query should be too bad.
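As a rough check of the join-based query above, the following sketch loads the sample rows into an in-memory SQLite database through Python's sqlite3 module, creates the index on timestamp, and runs the query. The column types and the index name idx_datatable_timestamp are assumptions made for illustration; they are not given in the post. Note that the query relies on SQLite's documented behaviour of taking the bare x and y columns from the row that holds min(timestamp), which other databases may not do.

```python
import sqlite3

# Sample rows from the question (timestamp, x, y).
ROWS = [
    ("2015-01-30 23:00:00", 1, 1),
    ("2015-01-30 22:00:00", 2, 2),
    ("2015-01-30 21:00:00", 2, 2),
    ("2015-01-30 20:00:00", 2, 2),
    ("2015-01-30 19:00:00", 3, 3),
    ("2015-01-30 18:00:00", 4, 4),
    ("2015-01-30 17:00:00", 2, 2),
]

# The query without string concatenation, unchanged in meaning.
QUERY = """
select min(timestamp), x, y
from datatable
where timestamp > (select max(timestamp)
                   from datatable
                   join (select x, y
                         from datatable
                         order by timestamp desc
                         limit 1 offset 1) as second
                     on datatable.x <> second.x
                     or datatable.y <> second.y
                   where timestamp < (select timestamp
                                      from datatable
                                      order by timestamp desc
                                      limit 1 offset 1))
"""

conn = sqlite3.connect(":memory:")
# Assumed schema: the post does not state the column types.
conn.execute("CREATE TABLE datatable (timestamp TEXT, x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO datatable VALUES (?, ?, ?)", ROWS)
# Assumed index name; any index on timestamp serves the same purpose.
conn.execute("CREATE INDEX idx_datatable_timestamp ON datatable (timestamp)")

result = conn.execute(QUERY).fetchone()
print(result)  # ('2015-01-30 20:00:00', 2, 2)

# Print the planner's description of each step; the wording varies by SQLite version.
for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY):
    print(row[-1])
```

The query plan output is printed only for inspection, since its text differs between SQLite versions.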
The fastest method is probably to search for the end of the block in the application, i.e. to read the results of this query:
select timestamp, x, y
from datatable
order by timestamp desc
limit -1 offset 1
and stop when the x,y values change.
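A minimal sketch of that application-side scan, assuming Python's sqlite3 module and the sample data from the question. The function name oldest_in_latest_block and the column types are hypothetical, chosen only for this example. The first row returned (the second most recent record) supplies the coordinates of interest, and the loop stops at the first row whose x,y differ.

```python
import sqlite3


def oldest_in_latest_block(conn):
    """Return the oldest timestamp in the block that starts at the 2nd most recent row."""
    cursor = conn.execute(
        "select timestamp, x, y from datatable "
        "order by timestamp desc limit -1 offset 1"  # negative limit = no limit in SQLite
    )
    first = cursor.fetchone()
    if first is None:
        return None  # fewer than two rows, so there is no second most recent record
    oldest, target = first[0], first[1:]  # target is the (x, y) pair to match
    for ts, x, y in cursor:
        if (x, y) != target:
            break  # the continuous block has ended
        oldest = ts
    cursor.close()
    return oldest


conn = sqlite3.connect(":memory:")
# Assumed schema: the post does not state the column types.
conn.execute("CREATE TABLE datatable (timestamp TEXT, x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO datatable VALUES (?, ?, ?)",
    [
        ("2015-01-30 23:00:00", 1, 1),
        ("2015-01-30 22:00:00", 2, 2),
        ("2015-01-30 21:00:00", 2, 2),
        ("2015-01-30 20:00:00", 2, 2),
        ("2015-01-30 19:00:00", 3, 3),
        ("2015-01-30 18:00:00", 4, 4),
        ("2015-01-30 17:00:00", 2, 2),
    ],
)
print(oldest_in_latest_block(conn))  # 2015-01-30 20:00:00
```

Because the cursor is consumed lazily, only the rows up to the end of the block are fetched, and with an index on timestamp the scan can follow the index order instead of sorting the whole table.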