Using jq to produce a specific CSV format

I have a JSON document that looks like this:

[
  {
    "auth": 1,
    "status": "Active",
    "userCustomAttributes": [
      {
        "customAttributeName": "Attribute 1",
        "customAttributeValue": "Value 1"
      },
      {
        "customAttributeName": "Attribute 2",
        "customAttributeValue": "Value 2"
      },
      {
        "customAttributeName": "Attribute 3",
        "customAttributeValue": "Value 3"
      }
    ]
  },
  {
    "auth": 1,
    "status": "Active",
    "userCustomAttributes": [
      {
        "customAttributeName": "Attribute 1",
        "customAttributeValue": "Value 1"
      },
      {
        "customAttributeName": "Attribute 2",
        "customAttributeValue": "Value 2"
      },
      {
        "customAttributeName": "Attribute 3",
        "customAttributeValue": "Value 3"
      },
      {
        "customAttributeName": "Attribute 4",
        "customAttributeValue": "Value 4"
      }
    ]
  }
]

I want to parse it and get CSV output that looks like this:

authType, status, attribute 1, attribute 2, attribute 3, attribute 4
"1", "active", "value1", "value2", "value3",""
"1", "active", "value1", "value2", "value3","value 4"

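The JSON array contains more than 180k records, so the solution needs to iterate over all of them. Not every record has every attribute: some have all 4, others only 1. In the CSV, I want an empty value wherever a record lacks an attribute.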

Based on your sample input, the following program does not depend on the order of the "attribute" keys:

jq -r '
["Attribute 1", "Attribute 2", "Attribute 3", "Attribute 4"] as $attributes
# Header row
| ["authType", "status"] 
  + ($attributes | map( (.[:1] | ascii_upcase) + .[1:])),
# Data rows:
  (.[]
   | (INDEX(.userCustomAttributes[]; .customAttributeName)
      | map_values(.customAttributeValue)) as $dict
   | [.auth, .status] + [ $dict[ $attributes[] ] ]
   )
| @csv
'

It produces the following CSV:

"authType","status","Attribute 1","Attribute 2","Attribute 3","Attribute 4"
1,"Active","Value 1","Value 2","Value 3",
1,"Active","Value 1","Value 2","Value 3","Value 4"

You can easily modify it to emit a literal string of your choice in place of JSON null values.
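One way to make that substitution is to map over the looked-up values before formatting them. The following is a minimal sketch, not a definitive version: the two-record sample and the shell variable names `json` and `csv` are made up for illustration, and the header row is omitted for brevity. Note that `//` replaces `false` as well as `null`, which is harmless here only because the attribute values are assumed to be strings.

```shell
# Hypothetical two-record sample, trimmed down from the input in the question.
json='[
  {"auth": 1, "status": "Active", "userCustomAttributes": [
    {"customAttributeName": "Attribute 1", "customAttributeValue": "Value 1"}]},
  {"auth": 1, "status": "Active", "userCustomAttributes": [
    {"customAttributeName": "Attribute 2", "customAttributeValue": "Value 2"},
    {"customAttributeName": "Attribute 1", "customAttributeValue": "Value 1"}]}
]'

csv=$(printf '%s' "$json" | jq -r '
  ["Attribute 1", "Attribute 2"] as $attributes
  | .[]
  | (INDEX(.userCustomAttributes[]; .customAttributeName)
     | map_values(.customAttributeValue)) as $dict
  # Collect the values first, then replace each null with an empty string.
  | [.auth, .status] + ([ $dict[ $attributes[] ] ] | map(. // ""))
  | @csv
')
printf '%s\n' "$csv"
```

With this sample the first row ends in `""` rather than an empty field. The `map` is applied to the collected array on purpose: writing `$dict[ $attributes[] ] // ""` directly on the stream would drop the null values instead of replacing them.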

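Explanation

$dict[ $attributes[] ] produces a stream of values:

$dict[ $attributes[0] ]
$dict[ $attributes[1] ]
...

This ensures that the columns are generated in the correct order, regardless of the order of the keys in the input, or even of whether they are present. Looking up a missing key yields null, which @csv renders as an empty field.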
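The stream behaviour can be checked in isolation. This is a small sketch with a made-up dictionary and key list (the keys and values are illustrative, not taken from the question), and it assumes `jq` is available on the PATH.

```shell
# -n: run without reading any input; -c: print compact JSON on one line.
result=$(jq -nc '
  ["a", "b", "c"] as $attributes   # desired column order
  | {"c": 3, "a": 1} as $dict      # keys out of order, "b" missing
  | [ $dict[ $attributes[] ] ]     # one lookup per name, in $attributes order
')
printf '%s\n' "$result"
```

This prints `[1,null,3]`: the values follow the order of `$attributes`, not the order of the keys in `$dict`, and the missing key becomes `null`.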