Highlight every match with Solr 6, Python 3 and pysolr

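I have a Solr index containing a large number of fairly long text files, indexed with the text_sv field type. I would like to print out every snippet for each indexed document. However, I only get a few back, even though I have tried adjusting the various settings as described in the documentation.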

Here is the relevant part of the code:

results = solr.search(search_string, rows = result_limit, sort = order,
            **{
                'hl':'true',
                'hl.fragsize': 100,
                'hl.fl': 'fulltext',
                'hl.maxAnalyzedChars': -1,
                'hl.snippets': 100,
                })
# results.highlighting maps each document's unique key to its snippets.
highlights = results.highlighting
print("Saw {0} result(s).".format(len(results)))
for resultcounter, result in enumerate(results, start=1):
    # Build the link in one expression so no stray whitespace ends up in the URL.
    fulltexturl = ('<a href="http://localhost/source/' + result['filename'] + '">'
                   + result['filename'][:-4] + '</a>')
    year = str(result['year'])
    number = str(result['number'])
    print('<p>' + str(resultcounter) + '. <b>År:</b> ' + year
          + ', <b>Nummer: </b>' + number + ' ,<b>Fulltext:</b> ' + fulltexturl
          + '. <b></b> träffar.<br></p>')
    inSOUresults = 1
    # Only print the snippets that belong to this document.
    # Assumes the index's uniqueKey field is named 'id'.
    for key, value in highlights.get(result['id'], {}).items():
        for v in value:
            print('<p>' + str(inSOUresults) + ". " + v + "</p>")
            inSOUresults += 1

What am I doing wrong?

You probably want a very large value (or 0) for the hl.fragsize parameter (from the Highlighting wiki page):

With the original Highlighter, if you have a use case where you need to highlight the complete text of a field and need to highlight every instance of the search term(s) you can set hl.fragsize to a very high value (whatever it takes to include all the text for the largest value for that field), for example &hl.fragsize=50000.

However, if you want to change fragsize to a value greater than 51200 to return long document texts with highlighting, you will need to pass the same value to hl.maxAnalyzedChars parameter too. These two parameters go hand in hand and changing just the hl.fragsize would not be sufficient for highlighting in very large fields.
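
As a rough sketch of what that could look like with pysolr, the helper below sets hl.fragsize and hl.maxAnalyzedChars to the same large value and pairs each document with its own highlights. The function names, the 1,000,000-character bound, and the assumption that the uniqueKey field is called "id" are illustrative and not taken from the question; adjust them to your schema.

```python
# Sketch only: request one fragment big enough to hold the whole field,
# so every match in the field comes back highlighted.
# Assumptions: uniqueKey field is "id", highlighted field is "fulltext".

FRAGSIZE = 1_000_000  # hypothetical upper bound on the field length, in characters


def build_highlight_params(field="fulltext", fragsize=FRAGSIZE):
    """Return Solr highlighting parameters that cover a whole long field."""
    return {
        "hl": "true",
        "hl.fl": field,
        # One fragment large enough to contain the entire field value.
        "hl.fragsize": fragsize,
        # Keep this in step with hl.fragsize; Solr's default is 51200.
        "hl.maxAnalyzedChars": fragsize,
    }


def snippets_for(results, doc, field="fulltext", id_field="id"):
    """Return the highlighted snippets that belong to one result document."""
    per_doc = results.highlighting.get(doc[id_field], {})
    return per_doc.get(field, [])


def search_with_highlights(solr, query, rows=10):
    """Run the query and pair each document with its own snippets."""
    results = solr.search(query, rows=rows, **build_highlight_params())
    return [(doc, snippets_for(results, doc)) for doc in results]
```

Here solr would be a pysolr.Solr instance pointed at your core, and search_with_highlights(solr, search_string) returns a list of (document, snippets) pairs that you can format as HTML. With a fragment this large, each document normally yields a single snippet containing the whole field with every match marked, so hl.snippets no longer matters.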