Filtering out objects from multiple arrays in a JSONB column

I have a JSON structure with two arrays, stored in a JSONB column. Somewhat simplified, it looks like this:

{
  "prop1": "abc",
  "prop2": "xyz",
  "items": [
    {
      "itemId": "123",
      "price": "10.00"
    },
    {
      "itemId": "124",
      "price": "9.00"
    },
    {
      "itemId": "125",
      "price": "8.00"
    }
  ],
  "groups": [
    {
      "groupId": "A",
      "discount": "20",
      "discountId": "1"
    },
    {
      "groupId": "B",
      "discount": "30",
      "discountId": "2"
    },
    {
      "groupId": "B",
      "discount": "20",
      "discountId": "3"
    },
    {
      "groupId": "C",
      "discount": "40",
      "discountId": "4"
    }
  ]
}

Schema:

CREATE TABLE campaign
  (
     id       TEXT PRIMARY KEY,
     data     JSONB
  );

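Since each row (the data column) can be quite large, I am trying to extract only the matching item objects and group objects from the items and groups arrays.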

My current query is this:

SELECT * FROM campaign
WHERE 
(data -> 'items' @> '[{"itemId": "123"}]') OR
(data -> 'groups' @> '[{"groupId": "B"}]')

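This returns the rows that contain either a matching group or a matching item. However, depending on the row, the data column can be a fairly large JSON object (there may be hundreds of objects in items and dozens in groups, and I have omitted several keys/properties in this example for brevity), which hurts query performance. I have added GIN indexes on the items and groups arrays, so a missing index is not the reason it is slow.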

How can I filter the items and groups arrays so that they contain only the matching elements?

Given this matching row:

{
  "prop1": "abc",
  "prop2": "xyz",
  "items": [
    {
      "itemId": "123",
      "price": "10.00"
    },
    {
      "itemId": "124",
      "price": "9.00"
    },
    {
      "itemId": "125",
      "price": "8.00"
    }
  ],
  "groups": [
    {
      "groupId": "A",
      "discount": "20",
      "discountId": "1"
    },
    {
      "groupId": "B",
      "discount": "30",
      "discountId": "2"
    },
    {
      "groupId": "B",
      "discount": "20",
      "discountId": "3"
    },
    {
      "groupId": "C",
      "discount": "40",
      "discountId": "4"
    }
  ]
}

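I would like the result to look something like this. The matching item/group may be in separate columns from the rest of the data column; it does not have to be returned as a single JSON object with two arrays like this, but I would prefer that if it does not hurt performance or lead to a very hairy query: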

{
  "prop1": "abc",
  "prop2": "xyz",
  "items": [
    {
      "itemId": "123",
      "price": "10.00"
    }
  ],
  "groups": [
    {
      "groupId": "B",
      "discount": "20",
      "discountId": "3"
    }
  ]
}

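What I have done so far is unnest and match the objects in the items array with the query below. It removes the 'items' array from the data column and returns the matching item object in a separate column, but I am struggling to combine this with matching on the groups array.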

SELECT data - 'items', o.obj
FROM campaign c
CROSS JOIN LATERAL jsonb_array_elements(c.data #> '{items}') o(obj)
WHERE o.obj ->> 'itemId' = '124'

How can I filter both arrays in one query?

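Bonus question: for the groups array, I would also like to return only the object with the lowest discount value, if possible. Otherwise the result would need to be an array of matching group objects rather than a single matching group.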


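If your Postgres version is 12 or above, you can use the jsonpath language and functions. The query below returns the expected result, with the subsets of items and groups that match the given criteria. You can then wrap this query in an SQL function so that the search criteria are passed in as input parameters.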

SELECT jsonb_set(jsonb_set( data
                          , '{items}'
                          , jsonb_path_query_array(data, '$.items[*] ? (@.itemId == "123" && @.price == "10.00")'))
                , '{groups}'
                , jsonb_path_query_array(data, '$.groups[*] ? (@.groupId == "B" && @.discount == "20" && @.discountId == "3")'))
  FROM (SELECT
'{
  "prop1": "abc",
  "prop2": "xyz",
  "items": [
    {
      "itemId": "123",
      "price": "10.00"
    },
    {
      "itemId": "124",
      "price": "9.00"
    },
    {
      "itemId": "125",
      "price": "8.00"
    }
  ],
  "groups": [
    {
      "groupId": "A",
      "discount": "20",
      "discountId": "1"
    },
    {
      "groupId": "B",
      "discount": "30",
      "discountId": "2"
    },
    {
      "groupId": "B",
      "discount": "20",
      "discountId": "3"
    },
    {
      "groupId": "C",
      "discount": "40",
      "discountId": "4"
    }
  ]
}' :: jsonb) AS d(data)
WHERE jsonb_path_exists(data, '$.items[*] ? (@.itemId == "123" && @.price == "10.00")')
  AND jsonb_path_exists(data, '$.groups[*] ? (@.groupId == "B" && @.discount == "20" && @.discountId == "3")')
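
As a rough sketch of the "wrap it in a function" idea, applied to the campaign table from the question, something like the following could work. The function name filter_campaign, its parameters, and the output column names are made up for illustration. It assumes that items and groups are always arrays and that discount strings can be cast to numeric. The search values are passed to jsonpath through the vars argument, rows are selected with OR as in the question's query, and the extra column returns the matching group with the lowest discount (the bonus question).

```sql
-- Schema from the question, repeated so the sketch is self-contained.
CREATE TABLE IF NOT EXISTS campaign (
    id   TEXT PRIMARY KEY,
    data JSONB
);

-- Hypothetical helper: filter_campaign, p_item_id and p_group_id are
-- illustrative names, not something from the question or the query above.
CREATE OR REPLACE FUNCTION filter_campaign(p_item_id text, p_group_id text)
RETURNS TABLE (campaign_id text, filtered_data jsonb, lowest_discount_group jsonb)
LANGUAGE sql
STABLE
AS $func$
    SELECT c.id,
           -- Replace both arrays with only their matching elements.
           jsonb_set(
               jsonb_set(
                   c.data,
                   '{items}',
                   jsonb_path_query_array(
                       c.data, '$.items[*] ? (@.itemId == $item_id)', v.vars)),
               '{groups}',
               jsonb_path_query_array(
                   c.data, '$.groups[*] ? (@.groupId == $group_id)', v.vars)),
           g.obj
    FROM campaign AS c
    -- Build the jsonpath variables once per row from the function arguments.
    CROSS JOIN LATERAL (
        SELECT jsonb_build_object('item_id', p_item_id,
                                  'group_id', p_group_id) AS vars
    ) AS v
    -- Matching group with the lowest discount; NULL when no group matches.
    -- Assumes every discount value is a numeric string such as "20".
    LEFT JOIN LATERAL (
        SELECT e.obj
        FROM jsonb_array_elements(c.data -> 'groups') AS e(obj)
        WHERE e.obj ->> 'groupId' = p_group_id
        ORDER BY (e.obj ->> 'discount')::numeric
        LIMIT 1
    ) AS g ON true
    -- Keep rows that have a matching item or a matching group.
    WHERE jsonb_path_exists(c.data, '$.items[*] ? (@.itemId == $item_id)', v.vars)
       OR jsonb_path_exists(c.data, '$.groups[*] ? (@.groupId == $group_id)', v.vars);
$func$;

-- Example call with the values used in the question.
SELECT * FROM filter_campaign('123', 'B');
```

If you need the stricter criteria from the query above (price, discount, discountId), you would add more variables to the jsonb_build_object call and extend the jsonpath filters accordingly. Change OR to AND in the WHERE clause if a row must match on both arrays.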