Solr (4.6.1) re-ordering by custom field value before response
Suppose I have a few documents like the following:
{
"id": 1,
"priority": "Low",
"summary": ".."
},
{
"id": 2,
"priority": "Medium",
"summary": ".."
},
{
"id": 3,
"priority": "High",
"summary": ".."
},
{
"id": 4,
"priority": "High",
"summary": ".."
},
{
"id": 5,
"priority": "Low",
"summary": ".."
},
... other documents ...
If I issue a query, Solr returns the documents in this order:
1 (score 282)
4 (score 212)
5 (score 182)
2 (score 25)
3 (score 13)
That is simply sorted by score, descending.
Now I still need to sort by score first, but with an additional requirement:
within each score segment, re-order the documents by their priority.
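I know "score segments" is vague and a bit confusing, but in principle I want to apply the normalisation described at
https://stats.stackexchange.com/questions/70801/how-to-normalize-data-to-0-1-range
to the result scores and split them into these segments: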
x >= 0.7
x < 0.7 & x > 0.3
x <= 0.3
I will take the minimum score to be 0, so the normalised scores will be:
1 (normalised score 1) (segment 1)
4 (normalised score 0.75) (segment 1)
5 (normalised score 0.64) (segment 2)
2 (normalised score 0.08) (segment 3)
3 (normalised score 0.04) (segment 3)
The result I want is each segment re-ordered by priority, so that the order becomes
4 -> 1 -> 5 -> 3 -> 2
instead of
1 -> 4 -> 5 -> 2 -> 3
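To make the target ordering concrete, here is a minimal plain-Java sketch of the intended behaviour, run in memory on the five example hits rather than inside Solr. The Hit class and the segmentOf and reorder methods are made-up names for illustration, and mapping High/Medium/Low to 3/2/1 is an assumption. Ties within the same segment and priority are broken by raw score, which is also an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SegmentReorder {
    /** Hypothetical holder for one hit: id, raw score, numeric priority (assumed High=3, Medium=2, Low=1). */
    static final class Hit {
        final int id;
        final float score;
        final int priority;

        Hit(int id, float score, int priority) {
            this.id = id;
            this.score = score;
            this.priority = priority;
        }
    }

    /** Maps a normalised score in [0, 1] to segment 1, 2 or 3 using the thresholds 0.7 and 0.3. */
    static int segmentOf(float normalised) {
        if (normalised >= 0.7f) return 1;
        if (normalised > 0.3f) return 2;
        return 3;
    }

    /** Returns ids ordered by segment, then priority (highest first), then raw score (highest first). */
    static List<Integer> reorder(List<Hit> hits) {
        float max = 0f;
        for (Hit h : hits) max = Math.max(max, h.score);
        final float maxScore = max; // minimum score is taken to be 0, as in the question

        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator
                .comparingInt((Hit h) -> segmentOf(maxScore > 0f ? h.score / maxScore : 0f))
                .thenComparingInt(h -> -h.priority)
                .thenComparingDouble(h -> -h.score));

        List<Integer> ids = new ArrayList<>();
        for (Hit h : sorted) ids.add(h.id);
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit(1, 282f, 1), new Hit(4, 212f, 3), new Hit(5, 182f, 1),
                new Hit(2, 25f, 2), new Hit(3, 13f, 3));
        System.out.println(reorder(hits)); // prints [4, 1, 5, 3, 2]
    }
}
```

Note that the normalisation needs the maximum score of the whole result set, which is only known after all hits have been scored.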
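I have been looking into function queries and custom plugins. It seems a plugin can access the scores of the result documents, but I don't know how to re-order the documents.
I would appreciate any advice on this, thanks.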
Use CustomScoreQuery and CustomScoreProvider. Add an integer priority field to your documents with the values high=3, medium=2, low=1, so that it can be cached and used in the score calculation.
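import java.io.IOException;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.queries.CustomScoreProvider;
import org.apache.lucene.search.FieldCache;

public class MyScoreProvider extends CustomScoreProvider {
    // Cached values of the numeric priority field (high=3, medium=2, low=1) for this index segment.
    private final FieldCache.Ints priorities;

    public MyScoreProvider(AtomicReaderContext context) throws IOException {
        super(context);
        priorities = FieldCache.DEFAULT.getInts(context.reader(), "priority_numeric", false);
    }

    @Override
    public float customScore(int doc, float subQueryScore, float valSrcScore) {
        // getSegmentNumber is not defined in this answer; it must map the score to segment 1, 2 or 3.
        int segment = 100 / getSegmentNumber(subQueryScore);
        // Segment bases are 100, 50 and 33, so adding a priority of 1..3 never crosses segments.
        return segment + priorities.get(doc);
    }
}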
Also consider writing a QParserPlugin to use the CustomScoreQuery. For details, see this link: http://spykem.blogspot.com/2013/06/plug-in-external-score-to-solr.html
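The snippet above calls getSegmentNumber without defining it. One possible shape for that helper is sketched below in plain Java, with no Lucene dependency, so the arithmetic can be checked on its own. It assumes the maximum score is already known when a document is scored (for example from a previous query), which the answer does not say. The class name CompositeScoreSketch and the extra maxScore parameter are hypothetical.

```java
final class CompositeScoreSketch {
    /** Hypothetical helper: segment 1..3 for a raw score, given an assumed-known maximum score. */
    static int getSegmentNumber(float subQueryScore, float maxScore) {
        float normalised = maxScore > 0f ? subQueryScore / maxScore : 0f;
        if (normalised >= 0.7f) return 1;
        if (normalised > 0.3f) return 2;
        return 3;
    }

    /** Mirrors the arithmetic in customScore: segment base (100, 50 or 33) plus priority (1..3). */
    static float compositeScore(float subQueryScore, float maxScore, int priority) {
        int segmentBase = 100 / getSegmentNumber(subQueryScore, maxScore); // integer division
        return segmentBase + priority;
    }

    public static void main(String[] args) {
        float max = 282f; // highest raw score in the example result set
        System.out.println(compositeScore(212f, max, 3)); // doc 4: 103.0
        System.out.println(compositeScore(282f, max, 1)); // doc 1: 101.0
        System.out.println(compositeScore(182f, max, 1)); // doc 5: 51.0
        System.out.println(compositeScore(13f, max, 3));  // doc 3: 36.0
        System.out.println(compositeScore(25f, max, 2));  // doc 2: 35.0
    }
}
```

Sorting by this composite score in descending order gives 4 -> 1 -> 5 -> 3 -> 2 for the example data. Documents that share both segment and priority get identical composite scores, so their relative order would then depend on Solr's tie-breaking.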