Problems with full-text search in Postgres
I have the following table and data:
/* script for people table, with field tsvector and gin */
CREATE TABLE public.people (
id INTEGER,
name VARCHAR(30),
lastname VARCHAR(30),
complete TSVECTOR
)
WITH (oids = false);
CREATE INDEX idx_complete ON public.people
USING gin (complete);
/* data for people table */
INSERT INTO public.people ("id", "name", "lastname", "complete")
VALUES
(1, 'MICHAEL', 'BRYANT BRYANT', '''bryant'':2,3 ''michael'':1'),
(2, 'HENRY STEVEN', 'BUSH TIESSEN', '''bush'':3 ''henri'':1 ''steven'':2 ''tiessen'':4'),
(3, 'WILLINGTON STEVEN', 'STEPHENS FLINN', '''flinn'':4 ''stephen'':3 ''steven'':2 ''willington'':1'),
(4, 'BRET', 'MARTINEZ AROCH', '''aroch'':3 ''bret'':1 ''martinez'':2'),
(5, 'TERENCE BERT', 'CAVALIERE ENRON', '''bert'':2 ''cavalier'':3 ''terenc'':1');
I need to search by first name and last name using the tsvector field. I run this query:
SELECT * FROM people WHERE complete @@ to_tsquery('WILLINGTON & FLINN');
and the result is correct (the third record). But if I try
SELECT * FROM people WHERE complete @@ to_tsquery('STEVEN & FLINN');
/* the same record! */
I get no results. Why? What can I do?
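You should search the table using the same language that was used to build the values stored in the 'complete' field.
Run this query to compare the English and German results: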
select * ,
to_tsvector('english', concat_ws(' ', name, lastname )) as english,
to_tsvector('german', concat_ws(' ', name, lastname )) as german
from public.people;
So this should work for you:
SELECT * FROM people WHERE complete @@ to_tsquery('english','STEVEN & FLINN');
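One way to see the mismatch directly is to print the tsquery that each configuration builds from the same input. This is a minimal sketch, assuming the built-in english and german configurations are available; the column aliases are only illustrative.

```sql
-- Build the same search expression under two configurations.
-- Each configuration applies its own stemmer, so the resulting
-- lexemes can differ even though the input text is identical.
SELECT to_tsquery('english', 'STEVEN & FLINN') AS english_query,
       to_tsquery('german',  'STEVEN & FLINN') AS german_query;
```

If one of the two results contains a lexeme that is not stored in complete for the row you expect, that configuration cannot match the row.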
You are probably using a text search configuration in which STEVEN or FLINN is modified by stemming.
I can reproduce that here:
test=> SHOW default_text_search_config;
default_text_search_config
----------------------------
pg_catalog.german
(1 row)
test=> SELECT complete FROM public.people WHERE id = 3;
complete
-------------------------------------------------
'flinn':4 'stephen':3 'steven':2 'willington':1
(1 row)
test=> SELECT * FROM ts_debug('STEVEN & FLINN');
alias | description | token | dictionaries | dictionary | lexemes
-----------+-----------------+--------+---------------+-------------+---------
asciiword | Word, all ASCII | STEVEN | {german_stem} | german_stem | {stev}
blank | Space symbols | | {} | |
blank | Space symbols | & | {} | |
asciiword | Word, all ASCII | FLINN | {german_stem} | german_stem | {flinn}
(4 rows)
test=> SELECT * FROM public.people
WHERE complete @@ to_tsquery('STEVEN & FLINN');
id | name | lastname | complete
----+------+----------+----------
(0 rows)
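As you can see, the German Snowball dictionary stems STEVEN to stev.
Since complete contains the unstemmed version steven, no match is found.
Use the same text search configuration when populating complete and when querying.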
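A minimal sketch of that advice, assuming the stored lexemes are meant to follow the english configuration (the values in the question look like English stemming, but that is an inference, not something the question states):

```sql
-- Rebuild the tsvector column with an explicit configuration,
-- so its contents no longer depend on default_text_search_config.
UPDATE public.people
SET complete = to_tsvector('english', concat_ws(' ', name, lastname));

-- Query with the same explicit configuration.
SELECT id, name, lastname
FROM public.people
WHERE complete @@ to_tsquery('english', 'STEVEN & FLINN');
```

Alternatively, SET default_text_search_config = 'pg_catalog.english'; changes the default for the current session, so the single-argument to_tsquery calls from the question would use the english configuration too. Passing the configuration explicitly is less fragile because it does not depend on session or server settings.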