Code breaks after update to Lucene.net 4.8.0-beta00001

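I have just started using Lucene.net for a project. My code is based on the code provided here: https://github.com/synhershko/LuceneNetDemo by Itamar Syn-Hershko. After I updated to the latest NuGet package, the code breaks in a couple of places. What do I need to change?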

First issue:

searcherManager.ExecuteSearch(searcher =>
{
    var topDocs = searcher.Search(query, 10);
    _totalHits = topDocs.TotalHits;
    foreach (var result in topDocs.ScoreDocs)
    {
        var doc = searcher.Doc(result.Doc);
        l.Add(new SearchResult
        {
            Name = doc.GetField("name")?.StringValue,
            Description = doc.GetField("description")?.StringValue,
            Url = doc.GetField("url")?.StringValue,

            // Results are automatically sorted by relevance
            Score = result.Score,
        });
    }
}, exception => { Console.WriteLine(exception.ToString()); });

Error message:

'SearcherManager' does not contain a definition for 'ExecuteSearch' and no extension method 'ExecuteSearch' accepting a first argument of type 'SearcherManager' could be found (are you missing a using directive or an assembly reference?)

Second issue:

public class HtmlStripAnalyzerWrapper : Analyzer
{
    private readonly Analyzer _wrappedAnalyzer;

    public HtmlStripAnalyzerWrapper(Analyzer wrappedAnalyzer)
    {
        _wrappedAnalyzer = wrappedAnalyzer;
    }

    public override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    {
        return _wrappedAnalyzer.CreateComponents(fieldName, new HTMLStripCharFilter(reader));
    }
}

Error messages:

'HtmlStripAnalyzerWrapper.CreateComponents(string, TextReader)': cannot change access modifiers when overriding 'protected internal' inherited member 'Analyzer.CreateComponents(string, TextReader)'

Cannot access protected member 'Analyzer.CreateComponents(string, TextReader)' via a qualifier of type 'Analyzer'; the qualifier must be of type 'HtmlStripAnalyzerWrapper' (or derived from it)

There is an updated demo: https://github.com/NightOwl888/LuceneNetDemo

First issue:

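The ExecuteSearch() API was inadvertently removed because it was not marked properly and does not exist in Lucene 4.8.0. However, it was only a supplemental API on top of SearcherManager.Acquire() and SearcherManager.Release(). You can see the Acquire()/Release() pattern in the SearcherManager documentation for Lucene 4.8.0.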

var searcher = searcherManager.Acquire();
try
{
    var topDocs = searcher.Search(query, 10);
    _totalHits = topDocs.TotalHits;
    foreach (var result in topDocs.ScoreDocs)
    {
        var doc = searcher.Doc(result.Doc);
        l.Add(new SearchResult
        {
            Name = doc.GetField("name")?.GetStringValue(),
            Description = doc.GetField("description")?.GetStringValue(),
            Url = doc.GetField("url")?.GetStringValue(),

            // Results are automatically sorted by relevance
            Score = result.Score,
        });
    }
}
catch (Exception e)
{
    Console.WriteLine(e.ToString());
}
finally
{
    searcherManager.Release(searcher);
    searcher = null; // Never use searcher after this point!
}

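We are considering whether to bring back the original ExecuteSearch() API or to create a new one that can be used with a using block for a more .NET-friendly experience. See pull request 207 for an example of the second option. Feedback is welcome.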

That said, an API that swallows exceptions by default is not ideal.
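To make the using-block idea concrete, here is a minimal sketch of the general pattern. It is not the API from the pull request and not part of Lucene.NET: `Lease<T>` is a hypothetical helper introduced only for illustration. It pairs an acquire call with a release call so that Dispose() performs the release. Exceptions thrown inside the using block are not swallowed; they propagate to the caller after the release has run.

```csharp
using System;

// Hypothetical helper (an assumption, not a Lucene.NET type): acquires a
// reference on construction and releases it exactly once on Dispose().
public sealed class Lease<T> : IDisposable
{
    private readonly Action<T> _release;
    private readonly T _value;
    private bool _released;

    public Lease(Func<T> acquire, Action<T> release)
    {
        if (acquire == null) throw new ArgumentNullException(nameof(acquire));
        if (release == null) throw new ArgumentNullException(nameof(release));
        _release = release;
        _value = acquire();
    }

    // The acquired reference; throws once the lease has been released.
    public T Value
    {
        get
        {
            if (_released) throw new ObjectDisposedException("Lease");
            return _value;
        }
    }

    public void Dispose()
    {
        if (_released) return; // calling Dispose() twice releases only once
        _released = true;
        _release(_value);
    }
}

// Possible usage with the searcherManager from the question:
//
// using (var lease = new Lease<IndexSearcher>(searcherManager.Acquire, searcherManager.Release))
// {
//     var topDocs = lease.Value.Search(query, 10);
// }
```

The design choice here is that the release lives in Dispose(), so it runs in the same situations as the finally block in the Acquire()/Release() example above, while error handling stays with the caller.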

Second issue:

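The accessibility of API members was also corrected to match Lucene. As the error messages show, Analyzer.CreateComponents() is now protected internal, so it can no longer be overridden as public or called on another Analyzer instance. For performance reasons, CharFilters were not intended to be used in conjunction with pre-built Analyzers. Instead, you must build the analyzer yourself from pre-built tokenizers and filters.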

using Lucene.Net.Analysis;
using Lucene.Net.Analysis.CharFilters;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Util;
using System.IO;

namespace LuceneNetDemo.Analyzers
{
    class HtmlStripAnalyzer : Analyzer
    {
        private readonly LuceneVersion matchVersion;

        public HtmlStripAnalyzer(LuceneVersion matchVersion)
        {
            this.matchVersion = matchVersion;
        }

        protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
        {
            // Tokenize the text, then lowercase the tokens and remove English stop words.
            StandardTokenizer standardTokenizer = new StandardTokenizer(matchVersion, reader);
            TokenStream stream = new StandardFilter(matchVersion, standardTokenizer);
            stream = new LowerCaseFilter(matchVersion, stream);
            stream = new StopFilter(matchVersion, stream, StopAnalyzer.ENGLISH_STOP_WORDS_SET);
            return new TokenStreamComponents(standardTokenizer, stream);
        }

        protected override TextReader InitReader(string fieldName, TextReader reader)
        {
            // Wrap the reader so HTML markup is stripped before the text reaches the tokenizer.
            return base.InitReader(fieldName, new HTMLStripCharFilter(reader));
        }
    }
}

Usage:

analyzer = new PerFieldAnalyzerWrapper(
    new HtmlStripAnalyzer(LuceneVersion.LUCENE_48),
    new Dictionary<string, Analyzer>
    {
        {"owner", new LowercaseKeywordAnalyzer()},
        {"name", new RepositoryNamesAnalyzer()},
    });