MySQL Fulltext search: Using distance with wildcards

My data looks like this:

[column "content"]
The quick red horse jumps over the quick dog
The quick brown horse
The quick brown horse jumps over the lazy dog
The quick brown horses jumps over the dog
quick as a mouse was the spider. The horse is brown.

I use MATCH and AGAINST to get all rows containing "horse" or "horses". As far as I know, the * wildcard works in boolean mode.

SELECT * FROM news
WHERE   (MATCH (content) AGAINST ('+quick +horse*' IN BOOLEAN MODE));

With the next query, I get all rows where "horses" (plural) and "quick" occur within a maximum distance of 3.

SELECT * FROM news
WHERE   (MATCH (content) AGAINST  ('"quick horses" @3' IN BOOLEAN MODE));

What I need is a combination of the two: all rows where "horse" or "horses" AND "quick" occur within a maximum distance of 3.

SELECT * FROM news
WHERE   (MATCH (content) AGAINST  ('"quick horse*" @3' IN BOOLEAN MODE));

The result set contains only the rows with "horse". The row with "horses" is not included!

Full example: http://sqlfiddle.com/#!9/033e02/6

Does anyone have an idea?

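After finding out that this is a documented bug in MySQL, I looked for a workaround. Bug report: https://bugs.mysql.com/bug.php?id=80723

Idea: regular expressions. That turned out to be another rocky road, because MySQL currently supports only a subset of the usual regular expression features.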

Reference to groups in a MySQL regex?

https://dev.mysql.com/doc/refman/5.5/en/regexp.html

After many experiments, this works for me: http://sqlfiddle.com/#!9/7007ac7/1

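-- "quick", optional separators, then 1 to 3 units of "word (possibly empty) + one separator", then "horse".
-- There is no anchor after "horse", so "horses" matches too.
SELECT * FROM news
WHERE content REGEXP 'quick([[:space:][:punct:]])*(((([[:alnum:]])*)*[[:space:][:punct:]]){1,3})horse';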
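
To see which of the sample rows this workaround accepts, here is a minimal sketch in Python that mirrors the pattern outside MySQL. It is only an approximation under a few assumptions: the POSIX classes are mapped by hand ([[:space:]] to \s, [[:punct:]] to the ASCII punctuation in string.punctuation, [[:alnum:]] to [A-Za-z0-9]), re.IGNORECASE stands in for MySQL's default case-insensitive REGEXP on non-binary strings, and the names SEP, PATTERN, ROWS and matches_workaround are made up for this illustration.

```python
import re
import string

# Hand-mapped stand-in for [[:space:][:punct:]]
# (assumption: ASCII punctuation only).
SEP = "[" + r"\s" + re.escape(string.punctuation) + "]"

# "quick", optional separators, then 1 to 3 units of
# "word (possibly empty) + one separator", then "horse".
# Nothing is anchored after "horse", so "horses" is accepted as well.
PATTERN = re.compile(
    "quick" + SEP + "*(?:[A-Za-z0-9]*" + SEP + "){1,3}horse",
    re.IGNORECASE,
)

# The five sample rows from the question.
ROWS = [
    "The quick red horse jumps over the quick dog",
    "The quick brown horse",
    "The quick brown horse jumps over the lazy dog",
    "The quick brown horses jumps over the dog",
    "quick as a mouse was the spider. The horse is brown.",
]


def matches_workaround(text: str) -> bool:
    """Return True if the text would pass the (approximated) REGEXP filter."""
    return PATTERN.search(text) is not None


if __name__ == "__main__":
    for row in ROWS:
        print(matches_workaround(row), row)
```

On the sample data this accepts the first four rows, including the one with "horses", and rejects the last row, where "quick" and "horse" are too far apart. Because a unit may consist of a separator alone, runs of consecutive separators count against the limit of 3.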