Optimizing SQLite self joins

Please help me optimize my SQLite query.

Table:

CREATE TABLE links (
    c   INTEGER NOT NULL,
    position    INTEGER NOT NULL,
    key_id INTEGER  REFERENCES keys(id),
    PRIMARY KEY(c, position, key_id)
) WITHOUT ROWID;

Query:

select c_1.* from links c_1 
join links c_2 on c_1.key_id = c_2.key_id and c_2.position > c_1.position
join links c_3 on c_1.key_id = c_3.key_id and c_3.position > c_2.position
join links c_4 on c_1.key_id = c_4.key_id and c_4.position > c_3.position
where c_1.c = unicode('A') 
and c_2.c = unicode('p')
and c_3.c = unicode('i')
and c_4.c = unicode('x')

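The idea is to implement substring search by indexing every letter ('c') of a word ('key_id'). I am trying to answer the following request: give me all words that contain an A, then a p somewhere after the A, then an i after the p, then an x after the i. The query above should match any word in which those four letters appear in that order.

In other words, I am trying to optimize the following query: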

select * from links where key like '%A%p%i%x%'
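
To make the indexing scheme concrete, here is a minimal sketch of how such a table could be filled and queried. The `keys(id, key)` layout, the sample words, and the 1-based positions are assumptions for illustration; only the `links` definition and the self join come from the question.

```python
import sqlite3

# Hypothetical sample words; the keys(id, key) layout is an assumption,
# the question only shows that links.key_id references keys(id).
WORDS = ["Appendix", "Apex", "Matrix", "Aphid mix", "xipA"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE keys (id INTEGER PRIMARY KEY, key TEXT NOT NULL);
CREATE TABLE links (
    c        INTEGER NOT NULL,
    position INTEGER NOT NULL,
    key_id   INTEGER REFERENCES keys(id),
    PRIMARY KEY (c, position, key_id)
) WITHOUT ROWID;
""")

for key_id, word in enumerate(WORDS, start=1):
    conn.execute("INSERT INTO keys (id, key) VALUES (?, ?)", (key_id, word))
    # One links row per character: code point, 1-based position, owning key.
    conn.executemany(
        "INSERT INTO links (c, position, key_id) VALUES (?, ?, ?)",
        [(ord(ch), pos, key_id) for pos, ch in enumerate(word, start=1)],
    )

# The self join from the question, mapped back to the words it matched.
QUERY = """
SELECT DISTINCT k.key
FROM links c_1
JOIN links c_2 ON c_1.key_id = c_2.key_id AND c_2.position > c_1.position
JOIN links c_3 ON c_1.key_id = c_3.key_id AND c_3.position > c_2.position
JOIN links c_4 ON c_1.key_id = c_4.key_id AND c_4.position > c_3.position
JOIN keys k ON k.id = c_1.key_id
WHERE c_1.c = unicode('A')
  AND c_2.c = unicode('p')
  AND c_3.c = unicode('i')
  AND c_4.c = unicode('x')
ORDER BY k.key
"""

matches = [row[0] for row in conn.execute(QUERY)]
print(matches)
```

Without the `DISTINCT`, a word with repeated letters yields one row per valid combination of positions, which may be why the same key_id can appear several times in the raw output of the self join.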

The query plan looks like this:

Sample results:

c|position|key_id
-----------------
65|1|121
65|1|2292
65|1|3919
65|1|3923
65|1|3925
65|1|3933
65|1|3946
65|1|4375
65|1|4375
65|1|4375
65|1|4375

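In this example it found three keys. Later I will map them to words and be able to show which prefix it found.

The links table has 240,076 rows, and the query takes 2 seconds to execute. How can I make it run faster?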

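Your primary key index is on c, position, key_id, but the WHERE and ON tests in your query compare c for equality, position for inequality, and key_id for equality. Because position sits between the other two columns and is tested with an inequality, the key_id part of the index cannot be used.

From the documentation: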

Then the index might be used if the initial columns of the index (columns a, b, and so forth) appear in WHERE clause terms. The initial columns of the index must be used with the = or IN or IS operators. The right-most column that is used can employ inequalities. For the right-most column of an index that is used, there can be up to two inequalities that must sandwich the allowed values of the column between two extremes.

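You can see this by switching the > in the position checks to =. That speeds things up a lot, because with three equality checks the whole index can be used to find matching rows.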
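
A quick way to check this is to compare the plans for the two operators. The sketch below is only an illustration: it uses a two-table version of the join on an empty in-memory table with the original primary key. I would expect the `>` plan to stop at `position>?`, while the `=` plan can list all three columns; the exact wording depends on the SQLite version, so the plans are printed rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE links (
    c        INTEGER NOT NULL,
    position INTEGER NOT NULL,
    key_id   INTEGER REFERENCES keys(id),
    PRIMARY KEY (c, position, key_id)
) WITHOUT ROWID
""")

def plan(op):
    """Return the EXPLAIN QUERY PLAN detail strings for a two-table version
    of the query that joins on `c_2.position <op> c_1.position`."""
    sql = f"""
    EXPLAIN QUERY PLAN
    SELECT c_1.* FROM links c_1
    JOIN links c_2 ON c_1.key_id = c_2.key_id
                  AND c_2.position {op} c_1.position
    WHERE c_1.c = unicode('A') AND c_2.c = unicode('p')
    """
    # The human-readable description is the fourth column of each plan row.
    return [row[3] for row in conn.execute(sql)]

for op in (">", "="):
    print(f"position {op} position")
    for line in plan(op):
        print("   ", line)
```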

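If you recreate the table with a different column order in the PK, namely c, key_id, position, or add a new index with the three columns in that order, you should see an improvement. The whole index can then be used to find the rows to join, rather than just part of it, because it satisfies the constraint that every column except the right-most one is tested for equality.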
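
Both options can be tried side by side. This is a rough sketch under a few assumptions: the sample words are made up, `links_c_key_pos` is a hypothetical index name, and the `REFERENCES keys(id)` clause is left out so that no keys table is needed. It checks that all three layouts return the same rows and prints each plan.

```python
import sqlite3

# Hypothetical sample words, stored as one (c, position, key_id) row per character.
WORDS = ["Appendix", "Apex", "Aphid mix"]
ROWS = [(ord(ch), pos, key_id)
        for key_id, word in enumerate(WORDS, start=1)
        for pos, ch in enumerate(word, start=1)]

QUERY = """
SELECT c_1.* FROM links c_1
JOIN links c_2 ON c_1.key_id = c_2.key_id AND c_2.position > c_1.position
JOIN links c_3 ON c_1.key_id = c_3.key_id AND c_3.position > c_2.position
JOIN links c_4 ON c_1.key_id = c_4.key_id AND c_4.position > c_3.position
WHERE c_1.c = unicode('A') AND c_2.c = unicode('p')
  AND c_3.c = unicode('i') AND c_4.c = unicode('x')
"""

def build(pk_columns, extra_sql=""):
    """Create an in-memory links table with the given PRIMARY KEY column order."""
    conn = sqlite3.connect(":memory:")
    # REFERENCES keys(id) is omitted here so the sketch needs no keys table.
    conn.executescript(f"""
        CREATE TABLE links (
            c        INTEGER NOT NULL,
            position INTEGER NOT NULL,
            key_id   INTEGER,
            PRIMARY KEY ({pk_columns})
        ) WITHOUT ROWID;
        {extra_sql}
    """)
    conn.executemany(
        "INSERT INTO links (c, position, key_id) VALUES (?, ?, ?)", ROWS)
    conn.commit()
    return conn

def run(conn):
    """Return (sorted result rows, EXPLAIN QUERY PLAN detail strings)."""
    rows = sorted(conn.execute(QUERY))
    plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]
    return rows, plan

original = build("c, position, key_id")
reordered = build("c, key_id, position")
# Alternative: keep the original PK and add an index (made-up name).
indexed = build("c, position, key_id",
                "CREATE INDEX links_c_key_pos ON links (c, key_id, position);")

for name, conn in [("original", original),
                   ("reordered", reordered),
                   ("indexed", indexed)]:
    rows, plan = run(conn)
    print(name, len(rows), "rows")
    for line in plan:
        print("   ", line)
```

On a table this small the timing difference is not measurable; the point is only that the result set stays the same while the inner lookups gain a `key_id=?` term.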

The query plan I see after the change:

QUERY PLAN
|--SEARCH TABLE links AS c_1 USING PRIMARY KEY (c=?)
|--SEARCH TABLE links AS c_2 USING PRIMARY KEY (c=? AND key_id=? AND position>?)
|--SEARCH TABLE links AS c_3 USING PRIMARY KEY (c=? AND key_id=? AND position>?)
`--SEARCH TABLE links AS c_4 USING PRIMARY KEY (c=? AND key_id=? AND position>?)