Convert jsonb comma-separated values into a JSON object using a psql script
I have a table in PostgreSQL with two columns:
Table "schemaname.tablename"
Column | Type | Collation | Nullable | Default
--------+-------------------+-----------+----------+---------
_key | character varying | | not null |
value | jsonb | | |
Indexes:
"tablename_pkey" PRIMARY KEY, btree (_key)
I want to convert a nested property value of the jsonb, which looks like this:
{
    "somekey": "[k1=v1, k2=v2, k3=v2]"
}
into this:
{
    "somekey": [
        "java.util.LinkedHashMap",
        {
            "k1": "v1",
            "k2": "v2",
            "k3": "v2"
        }
    ]
}
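I have managed to parse the comma-separated string into an array of strings, but apart from still having to apply another split on "=", I don't really know how to do the actual update over all rows of the table and generate the proper jsonb value for the "somekey" key.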
select regexp_split_to_array(RTRIM(LTRIM(value->>'somekey','['),']'),',') from schemaname.tablename;
Any ideas?
Try this (test data included):
WITH tablename (_key, value) AS (
    VALUES
        ('test', '{"somekey":"[k1=v1, k2=v2, k3=v2]"}'::jsonb),
        ('second', '{"somekey":"[no one=wants to, see=me, with garbage]"}'::jsonb),
        ('third', '{"somekey":"[some,key=with a = in it''s value, some=more here]"}'::jsonb)
)
SELECT
    tab._key,
    jsonb_insert(
        '{"somekey":["java.util.LinkedHashMap"]}', -- basic JSON structure
        '{somekey,0}', -- path to insert after
        jsonb_object( -- create a JSONB object on-the-fly from the key-value array
            array_agg(key_values) -- aggregate all key-value rows into one array
        ),
        true -- we want to insert after the matching element, not before it
    ) AS json_transformed
FROM
    tablename AS tab,
    -- the following is an implicit LATERAL join (the function is evaluated for each row of the preceding table)
    regexp_matches( -- produces multiple rows
        btrim(tab.value->>'somekey', '[]'), -- as you started with
        '(\w[^=]*)=([^,]*)', -- define regular expression groups for keys and values
        'g' -- we want all key-value sets
    ) AS key_values
GROUP BY 1
;
...resulting in:
_key | json_transformed
--------+-------------------------------------------------------------------------------------------------------
second | {"somekey": ["java.util.LinkedHashMap", {"see": "me", "no one": "wants to"}]}
third | {"somekey": ["java.util.LinkedHashMap", {"some": "more here", "some,key": "with a = in it's value"}]}
test | {"somekey": ["java.util.LinkedHashMap", {"k1": "v1", "k2": "v2", "k3": "v2"}]}
(3 rows)
I hope the inline comments explain how it works in enough detail.
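The query above only selects the transformed value, while the question asks for an actual update of all rows. A minimal sketch of that step follows, under a few assumptions that are not in the original: it works on a hypothetical scratch table `tablename_demo` instead of `schemaname.tablename`, it assumes `somekey` sits at the top level of the document (adjust the `'{somekey}'` path if it is nested deeper), and it falls back to an empty object when no key-value pair matches. It uses `jsonb_set` rather than `jsonb_insert` so that any other keys in the document are kept.

```sql
BEGIN;

-- Hypothetical scratch table so the sketch can be run without touching real data.
CREATE TEMP TABLE tablename_demo (_key varchar PRIMARY KEY, value jsonb) ON COMMIT DROP;

INSERT INTO tablename_demo (_key, value) VALUES
    ('test',   '{"somekey":"[k1=v1, k2=v2, k3=v2]", "other": 1}'),
    ('second', '{"somekey":"[no one=wants to, see=me, with garbage]"}');

UPDATE tablename_demo AS tab
SET value = jsonb_set(
        tab.value,
        '{somekey}',                               -- assumed path of the property
        jsonb_build_array(
            'java.util.LinkedHashMap'::text,
            COALESCE(
                (
                    -- same regular expression as in the SELECT above,
                    -- evaluated per row as a correlated subquery
                    SELECT jsonb_object(array_agg(kv))
                    FROM regexp_matches(
                             btrim(tab.value->>'somekey', '[]'),
                             '(\w[^=]*)=([^,]*)',
                             'g'
                         ) AS kv
                ),
                '{}'::jsonb                        -- assumption: no pairs found => empty object
            )
        )
    )
-- only touch rows where the property is still the raw string,
-- so running the statement twice does not mangle converted rows
WHERE jsonb_typeof(tab.value->'somekey') = 'string';

-- inspect the result before deciding to keep it
SELECT _key, value FROM tablename_demo ORDER BY _key;

ROLLBACK;  -- use COMMIT once the output looks right on the real table
```

Running it inside a transaction and checking the SELECT output first seems the safer way to try this on real data, since the regular expression silently drops fragments it cannot parse.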
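Without aggregate/GROUP BY:
The following needs no grouping because it does not use the aggregate function array_agg, but it is less strict about the key-value format and easily breaks on some data (the previous variant merely drops the key-value pairs it cannot parse):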
WITH tablename (_key, value) AS (
    VALUES
        ('test', '{"somekey":"[k1=v1, k2=v2, k3=v2]"}'::jsonb),
        ('second', '{"somekey":"[no one=wants to, see=me]"}'::jsonb) -- no trailing garbage here, it would break this variant
)
SELECT
    _key,
    jsonb_insert(
        '{"somekey":["java.util.LinkedHashMap"]}', -- basic JSON structure
        '{somekey,0}', -- path to insert after
        jsonb_object( -- create a JSONB object on-the-fly from the key-value array
            key_values -- take the keys + values as split using the function
        ),
        true -- we want to insert after the matching element, not before it
    ) AS json_transformed
FROM
    tablename AS tab,
    -- the following is an implicit LATERAL join (the function is evaluated for each row of the preceding table)
    regexp_split_to_array( -- produces an array of keys and values: [k, v, k, v, ...]
        btrim(tab.value->>'somekey', '[]'), -- as you started with
        '(=|,\s*)' -- regex to match both separators
    ) AS key_values
;
...which results in:
_key | json_transformed
--------+--------------------------------------------------------------------------------
test | {"somekey": ["java.util.LinkedHashMap", {"k1": "v1", "k2": "v2", "k3": "v2"}]}
second | {"somekey": ["java.util.LinkedHashMap", {"see": "me", "no one": "wants to"}]}
(2 rows)
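Feeding it garbage (like the earlier "second" row) or an = character inside a value breaks this variant. If the stray separators happen to leave an even number of elements (as with the earlier "third" row, where the extra comma in the key and the extra = in the value cancel out), keys and values are silently mispaired; otherwise the query fails with: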
ERROR: array must have even number of elements
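Before relying on the split-based variant, it may help to list the rows that would raise this error. The following is only a sketch under that assumption: it reuses the same separator regex and flags rows whose split yields an odd number of elements. It cannot detect rows where stray separators happen to produce an even count, so those still need a manual check or the stricter regexp_matches variant.

```sql
WITH tablename (_key, value) AS (
    VALUES
        ('test',   '{"somekey":"[k1=v1, k2=v2, k3=v2]"}'::jsonb),
        ('second', '{"somekey":"[no one=wants to, see=me, with garbage]"}'::jsonb)
)
SELECT
    tab._key,
    tab.value->>'somekey' AS raw_value,
    cardinality(parts)    AS part_count   -- number of elements after splitting
FROM
    tablename AS tab,
    -- implicit LATERAL join, same split as in the query above
    regexp_split_to_array(
        btrim(tab.value->>'somekey', '[]'),
        '(=|,\s*)'
    ) AS parts
WHERE cardinality(parts) % 2 <> 0          -- odd count: jsonb_object(parts) would fail
;
```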