Hibernate Search/Lucene : String field cannot be used for sorting "indexed with multiple values per document, use SORTED_SET instead"

I have the following model.

public class FeatureMeta {

    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private Long id;

    @Column(unique=true)
    private String uri;

    @Column
    @Field
    private String name;

    @Field
    @Column
    private String businessDesc;

    @Field
    @Column
    private String logicalDesc;

    .
    .

}

I am trying to sort the documents by "name", as follows:

org.hibernate.search.jpa.FullTextQuery jpaQuery =
                    fullTextEntityManager.createFullTextQuery(aggrBuilder.build(), FeatureMeta.class);
.
.    

SortFieldContext sortCtx = queryBuilder.sort().byField("name",SortField.Type.STRING);
jpaQuery.setSort(sortCtx.createSort());
.

But Lucene throws the following exception:

java.lang.IllegalStateException: Type mismatch: name was indexed with multiple values per document, use SORTED_SET instead
    at org.apache.lucene.uninverting.FieldCacheImpl$SortedDocValuesCache.createValue(FieldCacheImpl.java:678)
    at org.apache.lucene.uninverting.FieldCacheImpl$Cache.get(FieldCacheImpl.java:189)
    at org.apache.lucene.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:646)
    at org.apache.lucene.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:626)
    at org.apache.lucene.uninverting.UninvertingReader.getSortedDocValues(UninvertingReader.java:256)
    at org.apache.lucene.index.DocValues.getSorted(DocValues.java:262)
    at org.apache.lucene.search.FieldComparator$TermOrdValComparator.getSortedDocValues(FieldComparator.java:762)
    at org.apache.lucene.search.FieldComparator$TermOrdValComparator.getLeafComparator(FieldComparator.java:767)
    at org.apache.lucene.search.FieldValueHitQueue.getComparators(FieldValueHitQueue.java:183)
    at org.apache.lucene.search.TopFieldCollector$SimpleFieldCollector.getLeafCollector(TopFieldCollector.java:164)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:812)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:535)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:523)
    at org.hibernate.search.query.engine.impl.LazyQueryState.search(LazyQueryState.java:103)

Any suggestions?

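Edit: Actually, before doing anything else, you should check which analyzer is used on the name field. The analyzer probably has a tokenizer that splits the value into several terms, which makes the field multi-valued, and a multi-valued field cannot be sorted on. Try adding a separate field for sorting, and use an analyzer with a KeywordTokenizer on that field: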

@AnalyzerDef(name = "sort_analyzer",
   tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
   filters = {
       @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
       @TokenFilterDef(factory = LowerCaseFilterFactory.class)
   }
)
public class FeatureMeta {

    @Id
    @GeneratedValue(strategy=GenerationType.AUTO)
    private Long id;

    @Column(unique=true)
    private String uri;

    @Column
    @Field
    @Field(name = "name_sort", analyzer = @Analyzer(definition = "sort_analyzer"))
    private String name;

    @Field
    @Column
    private String businessDesc;

    @Field
    @Column
    private String logicalDesc;

    .
    .

}

Then sort on this new field instead of the default one:

SortFieldContext sortCtx = queryBuilder.sort().byField("name_sort",SortField.Type.STRING);
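
To make the tokenizer point above more concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: the two helper methods are hypothetical stand-ins that only imitate a whitespace-splitting analyzer and a KeywordTokenizer followed by a lowercase filter (ASCII folding is left out), and the sample name is made up. It only shows how one value can become several terms with one analyzer and exactly one term with the other.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class SortTermsSketch {

    // Hypothetical stand-in for a tokenizing analyzer: lowercases the value
    // and splits it on whitespace, so one value can yield several terms.
    public static List<String> tokenizedTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Hypothetical stand-in for a keyword tokenizer plus lowercase filter:
    // the whole value is kept as exactly one term.
    public static List<String> keywordTerms(String value) {
        return Collections.singletonList(value.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String name = "Customer Churn Score"; // made-up sample value
        System.out.println(tokenizedTerms(name)); // [customer, churn, score]
        System.out.println(keywordTerms(name));   // [customer churn score]
    }
}
```

With three terms for a single document there is no single value to compare when sorting, which matches the "multiple values per document" message in the exception. With one term per document the sort is well defined.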

Original answer (the points I made are still valid):

I'm not sure what causes the exception in your case, but try fixing these issues in your code:

  1. Add a @SortableField annotation on the name property.
  2. Do not use queryBuilder.sort().byField("name",SortField.Type.STRING); just use queryBuilder.sort().byField("name").

If that doesn't work, maybe you should try wiping the index and reindexing.