Using jq to produce a specific CSV format
I have a JSON document that looks like this:
[
{
"auth": 1,
"status": "Active",
"userCustomAttributes": [
{
"customAttributeName": "Attribute 1",
"customAttributeValue": "Value 1"
},
{
"customAttributeName": "Attribute 2",
"customAttributeValue": "Value 2"
},
{
"customAttributeName": "Attribute 3",
"customAttributeValue": "Value 3"
}
]
},
{
"auth": 1,
"status": "Active",
"userCustomAttributes": [
{
"customAttributeName": "Attribute 1",
"customAttributeValue": "Value 1"
},
{
"customAttributeName": "Attribute 2",
"customAttributeValue": "Value 2"
},
{
"customAttributeName": "Attribute 3",
"customAttributeValue": "Value 3"
},
{
"customAttributeName": "Attribute 4",
"customAttributeValue": "Value 4"
}
]
}
]
I want to parse it and get CSV output that looks like this:
authType, status, attribute 1, attribute 2, attribute 3, attribute 4
"1", "active", "value1", "value2", "value3",""
"1", "active", "value1", "value2", "value3","value 4"
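The JSON array contains more than 180k records, so the solution needs to iterate over all of them. Not every record has all the attributes: some have all 4, others only 1. For records that lack an attribute, I would like the CSV to show an empty value.
Based on your sample input, the following program does not depend on the order of the "attribute" keys: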
jq -r '
["Attribute 1", "Attribute 2", "Attribute 3", "Attribute 4"] as $attributes
# Header row
| ["authType", "status"]
+ ($attributes | map( (.[:1] | ascii_upcase) + .[1:])),
# Data rows:
(.[]
| (INDEX(.userCustomAttributes[]; .customAttributeName)
| map_values(.customAttributeValue)) as $dict
| [.auth, .status] + [ $dict[ $attributes[] ] ]
)
| @csv
'
It produces the following CSV:
"authType","status","Attribute 1","Attribute 2","Attribute 3","Attribute 4"
1,"Active","Value 1","Value 2","Value 3",
1,"Active","Value 1","Value 2","Value 3","Value 4"
You can easily modify it to emit a literal string of your choice in place of a JSON null.
Explanation
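One way to make that substitution, as a minimal sketch: collect the row first, then map over it and replace each null. The placeholder `N/A` and the trimmed two-attribute sample are assumptions chosen for illustration. Applying `//` directly to the stream `$dict[ $attributes[] ]` would drop the nulls rather than replace them, so the substitution is done per element.

```shell
# Sample input trimmed to two attributes (illustrative only).
json='[
  {"auth": 1, "status": "Active", "userCustomAttributes": [
    {"customAttributeName": "Attribute 1", "customAttributeValue": "Value 1"}
  ]},
  {"auth": 1, "status": "Active", "userCustomAttributes": [
    {"customAttributeName": "Attribute 1", "customAttributeValue": "Value 1"},
    {"customAttributeName": "Attribute 2", "customAttributeValue": "Value 2"}
  ]}
]'

out=$(printf '%s' "$json" | jq -r '
  ["Attribute 1", "Attribute 2"] as $attributes
  | .[]
  | (INDEX(.userCustomAttributes[]; .customAttributeName)
     | map_values(.customAttributeValue)) as $dict
  # Replace each missing (null) attribute with the placeholder "N/A".
  | [.auth, .status] + ([ $dict[ $attributes[] ] ] | map(. // "N/A"))
  | @csv
')
printf '%s\n' "$out"
```

With this sample, the first record has no "Attribute 2", so its last column is rendered as `"N/A"` instead of being left empty. Because `//` also treats `false` as missing, this form is only appropriate when the attribute values are strings or numbers.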
$dict[ $a[] ]
produces the stream of values (here `$a` is shorthand for `$attributes`):
$dict[ $a[0] ]
$dict[ $a[1] ]
...
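This ensures that the columns are produced in the correct order, regardless of how the keys are ordered or whether they are present at all.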
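The behaviour can be checked in isolation with a small sketch. The dictionary and key list below are made up for the demonstration: the keys of `$dict` are deliberately out of order, and `"c"` is missing.

```shell
out=$(jq -nc '
  {"b": 2, "a": 1} as $dict        # keys deliberately out of order
  | ["a", "b", "c"] as $a          # "c" is absent from $dict
  | [ $dict[ $a[] ] ]              # one lookup per element of $a, in order
')
printf '%s\n' "$out"
```

This prints `[1,2,null]`: the values follow the order of `$a`, not the order of the keys in `$dict`, and the missing key yields `null`, which `@csv` renders as an empty field.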