Why does my Solr query not work with whitespace?
I am a beginner with Solr, and I have the following documents in my Solr server index:
{
"id": "book5",
"title": [
"Five point someone"
],
"author": "Chetan Bagat",
"genere": "fantasy",
"description": [
"An iit guide"
],
"comments": [
"good",
"excellent"
],
"publications": [
"swapnapublications",
"pb publications"
]
}
and
{
"id": "book1",
"title": [
"nightatcallcenter"
],
"author": "ChetanBagat",
"genere": "fiction",
"description": [
"Aniitguide"
],
"comments": [
"good",
"excellent"
],
"publications": [
"bangalorepublications",
"aswinpublications"
]
}
My query q=Five+point+someone fails,
but my query
q=nightatcallcenter works fine. Why does this happen, and how can I make the first query work?
My schema:
<fields>
<field name="id" type="text_general" indexed="true" stored="true" required="true" multiValued="false" />
<field name="title" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="genere" type="text_general" indexed="true" stored="true"/>
<field name="description" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="comments" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="author" type="text_general" indexed="true" stored="true" />
<field name="publications" type="text_general" indexed="true" stored="true" multiValued="true" />
<copyField source='*' dest='fulltext'/>
<field name='fulltext' type='text_general' multiValued='true'/>
</fields>
Thanks @alexf, the tokenizer works perfectly:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" >
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
</analyzer>
</fieldType>
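The problem you are running into is that, with your text_general, the field value ends up as a single token. When you search for Five+point+someone, you are looking for three tokens:
- Five
- point
- someone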
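To make the mismatch concrete, here is a minimal Python sketch that does not use Solr at all. It assumes, purely for illustration, that one analysis keeps the whole field value as a single token while the other splits on whitespace; the function names `single_token`, `whitespace_tokens` and `matches` are hypothetical and are not part of any Solr API.

```python
def single_token(text):
    """Keep the whole value as one token (stand-in for an untokenized field)."""
    return [text]


def whitespace_tokens(text):
    """Split on runs of whitespace, roughly what a whitespace tokenizer does."""
    return text.split()


def matches(indexed_tokens, query_tokens):
    """Return True if every query token is present among the indexed tokens."""
    return all(tok in indexed_tokens for tok in query_tokens)


title = "Five point someone"
# In a URL query string '+' decodes to a space,
# so q=Five+point+someone arrives as three separate terms.
query = "Five point someone".split()

print(matches(single_token(title), query))       # False
print(matches(whitespace_tokens(title), query))  # True
```

The same logic explains why q=nightatcallcenter works either way: the value contains no whitespace, so it is one token under both analyses.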
A clean solution is to define a custom text_general, as follows:
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
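As a rough mental model of what this analyzer chain does to a value, the following Python sketch imitates the whitespace tokenizer, the stop filter and the lowercase filter in that order. It is not Solr code: the `STOPWORDS` set is a made-up example (the real list comes from stopwords.txt), the `analyze` function is hypothetical, and the synonym step is left out because the contents of synonyms.txt are unknown.

```python
# Hypothetical stop words; in Solr these would be read from stopwords.txt.
STOPWORDS = {"a", "an", "the"}


def analyze(text, stopwords=STOPWORDS):
    """Imitate: whitespace tokenizer -> stop filter (ignoreCase) -> lowercase."""
    tokens = text.split()                                         # tokenizer
    tokens = [t for t in tokens if t.lower() not in stopwords]    # stop filter
    return [t.lower() for t in tokens]                            # lowercase


indexed = analyze("Five point someone")
query = analyze("five POINT Someone")

print(indexed)                            # ['five', 'point', 'someone']
print(all(t in indexed for t in query))   # True
print(analyze("An iit guide"))            # ['iit', 'guide']
```

Because both the index-time and query-time chains lowercase their tokens, a query no longer has to match the case of the stored title.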