ElasticSearch - filter, group by and count results for each group

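I'm new to ElasticSearch and need help with the following problem:

I have a set of documents, each containing multiple products. I want to filter the products on the property product_brand with the value "Apple" and get the number of products that match the filter. However, the result should be grouped by the document ID, which is also part of the document itself (test_id).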

Sample document:

{
    "test" : {
        "test_id" : 19988,
        "test_name" : "Test"
    },
    "products" : [
        {
            "product_id" : 1,
            "product_brand" : "Apple"
        },
        {
            "product_id" : 2,
            "product_brand" : "Apple"
        },
        {
            "product_id" : 3,
            "product_brand" : "Samsung"
        }
    ]
}

The result should be:

{
   "key" : 19988,
   "count" : 2
},

In SQL it would look roughly like this:

SELECT test_id, COUNT(product_id) 
FROM `test` 
WHERE product_brand = 'Apple'
GROUP BY test_id;

How can I achieve this?

I think this should get you pretty close:

GET /test/_search
{
  "_source": {
    "includes": [
      "test.test_id",
      "_score"
    ]
  },
  "query": {
    "function_score": {
      "query": {
        "match": {
          "products.product_brand.keyword": "Apple"
        }
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "source": "def matches=0; def products = params['_source']['products']; for(p in products){if(p.product_brand == params['brand']){matches++;}} return matches;",
              "params": {
                "brand": "Apple"
              }
            }
          }
        }
      ]
    }
  }
}

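This approach uses function_score, but you could also apply the same script in a script field if you want to score differently. As written, the query only matches documents that have a product sub-object whose brand text is exactly "Apple".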
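
To make the scoring logic concrete, here is a minimal Python sketch of what the Painless script above computes for a single document. The function name `count_brand_matches` and the plain-dict document are assumptions made for illustration; Elasticsearch runs the Painless version, not this code.

```python
def count_brand_matches(source, brand):
    """Count products in one document's _source whose brand equals `brand` exactly."""
    matches = 0
    for product in source.get("products", []):
        # Exact, case-sensitive comparison, like p.product_brand == params['brand']
        if product.get("product_brand") == brand:
            matches += 1
    return matches


# Sample document from the question
doc = {
    "test": {"test_id": 19988, "test_name": "Test"},
    "products": [
        {"product_id": 1, "product_brand": "Apple"},
        {"product_id": 2, "product_brand": "Apple"},
        {"product_id": 3, "product_brand": "Samsung"},
    ],
}

result = {"key": doc["test"]["test_id"], "count": count_brand_matches(doc, "Apple")}
print(result)  # {'key': 19988, 'count': 2}
```

Because the comparison is exact, a brand passed as "apple" would count zero products in this document.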

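You just need to keep the two occurrences of "Apple" (the one in the match query and the one in the script's brand parameter) in sync. Alternatively, you could match everything in the function_score query and pay attention only to the score. Your output could look like this: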

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 2,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "AV99vrBpgkgblFY6zscA",
        "_score": 2,
        "_source": {
          "test": {
            "test_id": 19988
          }
        }
      }
    ]
  }
}
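
If you want the key/count shape from the question, the hits can be reshaped on the client side. The following is a rough sketch, not part of any Elasticsearch client: `hits_to_counts` is a hypothetical helper, and it assumes `_score` equals the script's return value, as in the sample output above. Whether that holds depends on how function_score combines the query score with the script score; setting "boost_mode": "replace" is one way to make the score equal the script result.

```python
def hits_to_counts(response):
    """Turn a search response into [{"key": test_id, "count": n}, ...]."""
    results = []
    for hit in response["hits"]["hits"]:
        results.append({
            "key": hit["_source"]["test"]["test_id"],
            # Assumption: the score carries the match count returned by the script
            "count": int(hit["_score"]),
        })
    return results


# Trimmed copy of the sample response above
response = {
    "hits": {
        "total": 1,
        "max_score": 2,
        "hits": [
            {
                "_id": "AV99vrBpgkgblFY6zscA",
                "_score": 2,
                "_source": {"test": {"test_id": 19988}},
            }
        ],
    }
}

print(hits_to_counts(response))  # [{'key': 19988, 'count': 2}]
```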

The mapping in the index I used looks like this:

{
  "test": {
    "mappings": {
      "doc": {
        "properties": {
          "products": {
            "properties": {
              "product_brand": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "product_id": {
                "type": "long"
              }
            }
          },
          "test": {
            "properties": {
              "test_id": {
                "type": "long"
              },
              "test_name": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
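
Since this mapping gives product_brand a keyword subfield, the script-field variant mentioned earlier could filter with an exact term query and return the count as a field instead of as the score. Below is a minimal sketch that only builds the request body as a Python dict. The helper name `build_script_field_query` and the field name `brand_count` are made up for this example, and the body is untested against a live cluster.

```python
import json

# Same counting logic as the function_score script, reading products from _source
COUNT_SCRIPT = (
    "def matches = 0; "
    "for (p in params['_source']['products']) { "
    "if (p.product_brand == params['brand']) { matches++; } } "
    "return matches;"
)


def build_script_field_query(brand):
    """Build a search body that returns the per-document match count as a script field."""
    return {
        "_source": ["test.test_id"],
        "query": {
            # Exact match against the keyword subfield defined in the mapping
            "term": {"products.product_brand.keyword": brand}
        },
        "script_fields": {
            # "brand_count" is a hypothetical field name chosen for this sketch
            "brand_count": {
                "script": {"source": COUNT_SCRIPT, "params": {"brand": brand}}
            }
        },
    }


body = build_script_field_query("Apple")
print(json.dumps(body, indent=2))
```

Passing the brand once to the helper keeps the query value and the script parameter in sync.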