Skip non-array attributes at a specific nesting level in JSON

I'm trying to retrieve Firefox's blacklisted hosts from the source list it uses, so that I can use them in another browser (Qutebrowser).

I've been fairly successful at parsing the JSON with jq.

#!/bin/sh
for term in Advertising Content Social Analytics Fingerprinting Cryptomining Disconnect; do
    jq ".categories.$term[][][][]" services.json
done

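However, some of the deepest objects in some categories (always at the same nesting level) contain extra information that breaks jq, such as "performance": "true" below: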

{
  "categories": {
    ...
    "Cryptomining": [
      {
        "a.js": {
          "http://zymerget.bid": [
            "alflying.date",
            "alflying.win",
            ...
            "zymerget.faith"
          ],
          "performance": "true"
        }
      },
      {
        "CashBeet": {
          "http://cashbeet.com": [
            "cashbeet.com",
            "serv1swork.com"
          ]
        }
      },
      ...

So, for example, when the loop reaches jq ".categories.Cryptomining[][][][]" services.json, it raises an error and stops processing that category:

"alflying.date"
"alflying.win"
...
"zymerget.faith"
jq: error (at servicesN.json:11167): Cannot iterate over string ("true")

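Is there any way to disregard those non-array attributes with jq? As an extra, please let me know if I could ditch the for loop and do the whole process in a single jq (because currently, as can be seen above, I list all the categories in the for loop).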

Is there any way to disregard those non-array attributes with jq?

Yes, with the arrays built-in.

As an extra, please let me know if I could ditch the for loop and do the whole process in a single jq (because currently, as can be seen above, I list all the categories in the for loop).

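The Array/Object Value Iterator does that for you: applied to .categories itself, it visits every category, so the loop is not needed.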

jq '.categories[][][][] | arrays[]' services.json

However, for this particular task you don't seem to need arrays at all; the following command produces the same output:

jq '.categories[][][][][]?' services.json

See .[]? in the jq manual.

Given:
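To see why the trailing ? matters, here is a minimal sketch that runs both filters against a trimmed-down sample. The inline JSON is an assumption modeled on the excerpt in the question, not the real services.json, and -r is added only to print the hosts without JSON quotes.

```shell
#!/bin/sh
# Trimmed-down sample modeled on the question's excerpt (not the real file).
sample='{"categories":{"Cryptomining":[{"a.js":{"http://zymerget.bid":["alflying.date","zymerget.faith"],"performance":"true"}},{"CashBeet":{"http://cashbeet.com":["cashbeet.com"]}}]}}'

# Explicit filter: keep only the arrays at the fourth level, then iterate them.
with_arrays=$(printf '%s' "$sample" | jq -r '.categories[][][][] | arrays[]')

# Error-suppressing iterator: .[]? silently skips values that cannot be iterated,
# such as the string "true".
with_try=$(printf '%s' "$sample" | jq -r '.categories[][][][][]?')

printf '%s\n' "$with_arrays"
```

Both variables end up holding the same three hosts. Note that the two filters agree only while every fourth-level value is either an array or a scalar: if an object ever appeared at that level, .[]? would iterate its values too, whereas arrays would skip it.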

{
  "categories": {
    "Cryptomining": [
      {
        "a.js": {
          "http://zymerget.bid": [
            "alflying.date",
            "alflying.win",
            "zymerget.faith"
          ],
          "performance": "true"
        }
      },
      {
        "CashBeet": {
          "http://cashbeet.com": [
            "cashbeet.com",
            "serv1swork.com"
          ]
        }
      }
    ]
  }
}

As an alternative to the nested path, you can use recursive descent:

.. | strings

Produces:

"alflying.date"
"alflying.win"
"zymerget.faith"
"true"
"cashbeet.com"
"serv1swork.com"

To exclude "true", either make it a boolean or exclude strings that do not contain a . character:

.. | strings | select(contains("."))

Returns:

"alflying.date"
"alflying.win"
"zymerget.faith"
"cashbeet.com"
"serv1swork.com"
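The other option mentioned above, making the flag a boolean, could look like the sketch below. It assumes you control the input: the sample is a hypothetical trimmed-down variant of the one shown earlier, with "performance" changed from the string "true" to the JSON boolean true, so strings no longer matches it.

```shell
#!/bin/sh
# Hypothetical sample: "performance" is the JSON boolean true, not the string "true".
sample='{"categories":{"Cryptomining":[{"a.js":{"http://zymerget.bid":["alflying.date","zymerget.faith"],"performance":true}},{"CashBeet":{"http://cashbeet.com":["cashbeet.com"]}}]}}'

# .. visits every value recursively; strings passes only string values through,
# so the boolean is dropped. -r prints the hosts without JSON quotes.
hosts=$(printf '%s' "$sample" | jq -r '.. | strings')

printf '%s\n' "$hosts"
```

With the flag stored as a boolean, the select(contains(".")) step is not needed for this sample. Object keys such as "http://zymerget.bid" are never emitted by .. since it descends into values only.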