How to use Rank over partition by in Solr Streaming
How can I use rank over partition by in Solr Streaming?
Given Table A (city, streetname), I need a query with rank 2 (two street names per city), like the following:
city1 - streetname1, streetname2
city2 - streetname1, streetname2
city3 - streetname1, streetname2
Is there any function that supports this?
Thanks in advance.
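You can use the reduce and top functions to achieve what you want. These functions operate on the stream returned from search and let you build the same functionality that group provides in regular queries.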
reduce
The reduce function wraps an internal stream and groups tuples by common fields.
Each tuple group is operated on as a single block by a pluggable reduce operation. The group operation provided with Solr implements distributed grouping functionality. The group operation also serves as an example reduce operation that can be referred to when building custom reduce operations.
The reduce function relies on the sort order of the underlying stream. Accordingly the sort order of the underlying stream must be aligned with the group by field.
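For the question above, combining reduce with the group operation might look like the following sketch. The collection name tableA is a placeholder, and the field names city and streetname are taken from the question; using /export additionally assumes both fields have docValues enabled. The search is sorted by city first so that it lines up with the by field, and group keeps at most 2 tuples per city.

```text
reduce(
  search(tableA,
         q="*:*",
         qt="/export",
         fl="city,streetname",
         sort="city asc, streetname asc"),
  by="city",
  group(sort="streetname asc", n="2")
)
```

Each city should come back as a single tuple, with the retained street tuples collected under a group field.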
top
The top function wraps a streaming expression and re-orders the tuples. The top function emits only the top N tuples in the new sort order. The top function re-orders the underlying stream so the sort criteria does not have to match up with the underlying stream.
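Note that top limits the stream as a whole rather than each group. A minimal sketch, assuming a placeholder collection named tableA with the city and streetname fields from the question:

```text
top(n=2,
    search(tableA,
           q="*:*",
           qt="/export",
           fl="city,streetname",
           sort="city asc, streetname asc"),
    sort="streetname desc")
```

This emits only 2 tuples in total, so on its own it does not produce a per-city ranking; it is more useful for trimming or re-sorting a stream.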
The search stream source does not accept all of the parameters that a regular query accepts; it has its own subset of allowed parameters:
search
Parameters
collection: (Mandatory) The collection being searched.
q: (Mandatory) The query to perform on the Solr index.
fl: (Mandatory) The list of fields to return.
sort: (Mandatory) The sort criteria.
zkHost: Only needs to be defined if the collection being searched is found in a different zkHost than the local stream handler.
qt: Specifies the query type, or request handler, to use. Set this to /export to work with large result sets. The default is /select.
rows: (Mandatory with the /select handler) The rows parameter specifies how many rows to return. This parameter is only needed with the /select handler (which is the default) since the /export handler always returns all rows.
partitionKeys: Comma-delimited list of keys to partition the search results by. To be used with the parallel function for parallelizing operations across worker nodes. See the parallel function for details.
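To illustrate how these parameters fit together, here is a sketch of a search that uses the default /select handler and therefore has to supply rows. The collection name tableA and the value 100 are placeholders chosen for illustration; the field names come from the question.

```text
search(tableA,
       q="*:*",
       fl="city,streetname",
       sort="city asc, streetname asc",
       qt="/select",
       rows="100")
```

Switching qt to /export would remove the need for rows, since that handler returns the full result set.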