Hibernate Search/Lucene: String field cannot be used for sorting "indexed with multiple values per document, use SORTED_SET instead"
I have the following model.
public class FeatureMeta {
    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private Long id;
    @Column(unique=true)
    private String uri;
    @Column
    @Field
    private String name;
    @Field
    @Column
    private String businessDesc;
    @Field
    @Column
    private String logicalDesc;
    .
    .
}
I am trying to sort the documents by "name", like this:
org.hibernate.search.jpa.FullTextQuery jpaQuery =
    fullTextEntityManager.createFullTextQuery(aggrBuilder.build(), FeatureMeta.class);
.
.
SortFieldContext sortCtx = queryBuilder.sort().byField("name", SortField.Type.STRING);
jpaQuery.setSort(sortCtx.createSort());
.
But Lucene throws the following exception:
java.lang.IllegalStateException: Type mismatch: name was indexed with multiple values per document, use SORTED_SET instead
    at org.apache.lucene.uninverting.FieldCacheImpl$SortedDocValuesCache.createValue(FieldCacheImpl.java:678)
    at org.apache.lucene.uninverting.FieldCacheImpl$Cache.get(FieldCacheImpl.java:189)
    at org.apache.lucene.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:646)
    at org.apache.lucene.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:626)
    at org.apache.lucene.uninverting.UninvertingReader.getSortedDocValues(UninvertingReader.java:256)
    at org.apache.lucene.index.DocValues.getSorted(DocValues.java:262)
    at org.apache.lucene.search.FieldComparator$TermOrdValComparator.getSortedDocValues(FieldComparator.java:762)
    at org.apache.lucene.search.FieldComparator$TermOrdValComparator.getLeafComparator(FieldComparator.java:767)
    at org.apache.lucene.search.FieldValueHitQueue.getComparators(FieldValueHitQueue.java:183)
    at org.apache.lucene.search.TopFieldCollector$SimpleFieldCollector.getLeafCollector(TopFieldCollector.java:164)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:812)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:535)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:523)
    at org.hibernate.search.query.engine.impl.LazyQueryState.search(LazyQueryState.java:103)
Any suggestions?
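Edit: actually, before doing anything else, you should check which analyzer is used on the name field. The analyzer probably has a tokenizer that splits the value into several terms, so the field ends up multi-valued and cannot be sorted on.
Try adding a separate field for sorting, and use an analyzer with a KeywordTokenizer on that field: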
@AnalyzerDef(name = "sort_analyzer",
    tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
        @TokenFilterDef(factory = LowerCaseFilterFactory.class)
    }
)
public class FeatureMeta {
    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private Long id;
    @Column(unique=true)
    private String uri;
    @Column
    @Field
    @Field(name = "name_sort", analyzer = @Analyzer(definition = "sort_analyzer"))
    private String name;
    @Field
    @Column
    private String businessDesc;
    @Field
    @Column
    private String logicalDesc;
    .
    .
}
Then sort on this new field instead of the default one:
SortFieldContext sortCtx = queryBuilder.sort().byField("name_sort",SortField.Type.STRING);
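To see why the tokenizer matters, here is a minimal plain-Java sketch. It does not use Lucene at all: the class SortKeyDemo and its methods are made up for illustration, whitespace splitting only approximates what a tokenizing analyzer does, and stripping combining marks is a much cruder operation than ASCIIFoldingFilter. The point is only that a tokenized value yields several terms per document, while a keyword-style analyzer yields exactly one normalized term, which is what a string sort needs.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class SortKeyDemo {

    // Rough stand-in for a tokenizing analyzer (assumption: whitespace split + lowercase).
    // One field value can produce several terms, i.e. a multi-valued field.
    public static List<String> tokenized(String value) {
        List<String> terms = new ArrayList<>();
        for (String part : value.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                terms.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Rough stand-in for KeywordTokenizer + accent folding + lowercasing:
    // the whole value is kept as a single term.
    public static String sortKey(String value) {
        String decomposed = Normalizer.normalize(value, Normalizer.Form.NFD);
        String folded = decomposed.replaceAll("\\p{M}+", ""); // drop combining accents
        return folded.toLowerCase(Locale.ROOT);
    }

    // Sorts the original names by their single normalized key.
    public static List<String> sortByKey(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparing(SortKeyDemo::sortKey));
        return sorted;
    }

    public static void main(String[] args) {
        String name = "Customer Churn Score";
        System.out.println("tokenized terms: " + tokenized(name));
        System.out.println("keyword term:    " + sortKey(name));
        System.out.println("sorted names:    "
                + sortByKey(Arrays.asList("zeta", "\u00C9lan", "alpha Beta")));
    }
}
```

With a value like "Customer Churn Score", the tokenized variant produces three terms for one document, which is the situation the exception complains about; the keyword variant produces a single term that can be compared directly.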
Original answer (the points I raised are still valid):
I'm not sure what causes the exception in your case, but try fixing these things in your code:
- Add a @SortableField annotation on the name property
- Instead of queryBuilder.sort().byField("name",SortField.Type.STRING), just use queryBuilder.sort().byField("name")
If that doesn't work, try wiping the index and reindexing, since mapping changes only apply to documents indexed afterwards.