Group multiple documents that share the same ID in a field using Elasticsearch

We have a set of data in Elasticsearch, shown below. It is a list of products indexed from an e-commerce backend.

"hits": [
  {
    "_index": "test",
    "_type": "commerce_products_index",
    "_id": "466174",
    "_score": 1,
    "_source": {
      "id": 261776,
      "changed": "1516367458",
      "commerce_price:amount": "2700",
      "field_product_node:nid": [
        "66741"
      ],
      "field_uom_type": "g",
      "field_weight": "337",
      "product_id": "261776",
      "title": "Brown Lobia"
    }
  },
  {
    "_index": "test",
    "_type": "commerce_products_index",
    "_id": "466175",
    "_score": 1,
    "_source": {
      "id": 261781,
      "changed": "1526448108",
      "commerce_price:amount": "5900",
      "field_product_node:nid": [
        "66741"
      ],
      "field_uom_type": "g",
      "field_weight": "339",
      "product_id": "261781",
      "title": "Brown Lobia"
    }
  },
  {
    "_index": "test",
    "_type": "commerce_products_index",
    "_id": "466176",
    "_score": 1,
    "_source": {
      "id": 466176,
      "changed": "1515568794",
      "commerce_price:amount": "5400",
      "commerce_store": "651",
      "field_product_node:nid": [
        "84651"
      ],
      "field_uom_type": "g",
      "field_weight": "337",
      "product_id": "466176",
      "title": "Maggi Rich Tomato Ketchup"
    }
  }
]

As you can see, the first two documents have the same value in field_product_node:nid (66741). They are two different sizes (variants) of the same product.

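In search results, we want to show these variants as a single product. To do that, we need the results grouped by field_product_node:nid, which holds the same value for every variant of a product. For example, White Rice 1 kg and White Rice 500 g have the same value in field_product_node:nid, so when searching, the details of both should be grouped under one nid.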

Currently, we get a separate document for each variant. Instead, we want to get both variants back as one grouped result.
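To make the goal concrete, here is a minimal client-side sketch in Python of the grouping we are after. It only illustrates the desired output shape and is not the Elasticsearch-side solution we are looking for. The `hits` list is a trimmed copy of the three sample documents, and `group_by_nid` is a hypothetical helper name.

```python
from collections import OrderedDict

# Trimmed copies of the three sample hits (only the fields used here).
hits = [
    {"_id": "466174", "_source": {
        "product_id": "261776", "title": "Brown Lobia",
        "field_weight": "337", "field_product_node:nid": ["66741"]}},
    {"_id": "466175", "_source": {
        "product_id": "261781", "title": "Brown Lobia",
        "field_weight": "339", "field_product_node:nid": ["66741"]}},
    {"_id": "466176", "_source": {
        "product_id": "466176", "title": "Maggi Rich Tomato Ketchup",
        "field_weight": "337", "field_product_node:nid": ["84651"]}},
]


def group_by_nid(hits):
    """Group hit sources by their first nid value, preserving hit order."""
    groups = OrderedDict()
    for hit in hits:
        source = hit["_source"]
        # Assumption: each document carries exactly one nid in the array.
        nid = source["field_product_node:nid"][0]
        groups.setdefault(nid, []).append(source)
    return groups


grouped = group_by_nid(hits)
for nid, variants in grouped.items():
    print(nid, [v["product_id"] for v in variants])
```

With the sample data, nid 66741 ends up with two variants and nid 84651 with one, which is the shape we would like Elasticsearch to return directly.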

We tried the following queries:

GET /commerce_products_index/_search
{
  "size": 20,
  "query": {
    "bool": {
      "must": [
        { "match": { "commerce_store": "651" } }
      ]
    }
  },
  "aggs": {
    "group_by_node": {
      "terms": {
        "field": "field_product_node:nid"
      }
    }
  }
}

GET /commerce_products_index/_search
{
  "aggregations": {
    "grp_report": {
      "terms": {
        "field": "field_product_node:nid"
      },
      "aggregations": {
        "nested_node": {
          "nested": {
            "path": "node"
          },
          "aggregations": {
            "filters_customer": {
              "filters": {
                "filters": [
                  {
                    "match": {
                      "node.commerce_store": "651"
                    }
                  }
                ]
              }
            }
          }
        }
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        { "match": { "commerce_store": "651" } }
      ]
    }
  },
  "from": 0,
  "size": 100
}

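We could not figure out the right way to do this. If it is not possible, we will have to rework the indexing side and index all products that share an nid into a single document. That would be a fairly large rewrite.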

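I tried the following query, and it solved our problem. It buckets the documents by field_product_node:nid with a terms aggregation, returns the documents of each bucket through a top_hits sub-aggregation, and orders the buckets by their highest score.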

GET /commerce_products_index/_search
{
  "size": 20,
  "aggs": {
    "by_node": {
      "terms": {
        "field": "field_product_node:nid",
        "size": 11,
        "order": {
          "max_score": "desc"
        }
      },
      "aggs": {
        "by_top_hit": {
          "top_hits": {
            "size": 15
          }
        },
        "max_score": {
          "max": {
            "field": "field_product_node:nid",
            "script": "_score"
          }
        }
      }
    }
  }
}
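For anyone consuming the result in application code, here is a minimal Python sketch of reading the grouped products back out of the aggregation part of the response. The helper name `groups_from_response` is my own, and `sample_response` is a hand-written fragment based on the sample documents for illustration only, not real server output. It assumes the aggregation names used in the query above (`by_node`, `by_top_hit`, `max_score`).

```python
def groups_from_response(response, agg_name="by_node", hits_name="by_top_hit"):
    """Return one (nid, max_score, [source, ...]) tuple per terms bucket."""
    groups = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        # Each bucket holds its own top_hits result under the sub-aggregation name.
        sources = [hit["_source"] for hit in bucket[hits_name]["hits"]["hits"]]
        groups.append((bucket["key"], bucket["max_score"]["value"], sources))
    return groups


# Hand-written illustration of the response shape (assumed values, trimmed fields).
sample_response = {
    "aggregations": {
        "by_node": {
            "buckets": [
                {
                    "key": "66741",
                    "doc_count": 2,
                    "max_score": {"value": 1.0},
                    "by_top_hit": {"hits": {"hits": [
                        {"_id": "466174", "_source": {
                            "product_id": "261776", "title": "Brown Lobia"}},
                        {"_id": "466175", "_source": {
                            "product_id": "261781", "title": "Brown Lobia"}},
                    ]}},
                },
                {
                    "key": "84651",
                    "doc_count": 1,
                    "max_score": {"value": 1.0},
                    "by_top_hit": {"hits": {"hits": [
                        {"_id": "466176", "_source": {
                            "product_id": "466176",
                            "title": "Maggi Rich Tomato Ketchup"}},
                    ]}},
                },
            ]
        }
    }
}

groups = groups_from_response(sample_response)
for nid, score, variants in groups:
    print(nid, score, [v["product_id"] for v in variants])
```

Note that the top-level hits of the response still list the individual documents. If only the grouped buckets are needed, setting the top-level "size" to 0 should avoid returning them.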

It may help someone facing the same problem.