How do I remove a block of text from multiple repetitive JSON files where there is a small change between the files?

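I have JSON files that contain repetitive sections, and I'm trying to write a script that removes a specific block of text from multiple files. A Python script would be preferred; otherwise, from what I've found, sed could also work, although I know nothing about it. Here is an example of the format of my JSON file: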

    {
      "Animal": {
        "Type_species": "Reptile"
      },
      "FindMe": "https://www.merriam-webster.com/dictionary/amphibian",
      "Description": "Most are cold blooded."
    },
    {
      "Animal": {
        "Type_species": "Mammal"
      },
      "FindMe": "https://kids.nationalgeographic.com/animals/mammals/",
      "Description": "There Are Approximately 5,000 Mammal Species."
    },
    {
      "Animal": {
        "Type_species": "Amphibian"
      },
      "FindMe": "https://en.wikipedia.org/wiki/Amphibian",
      "Description": "Most amphibians have thin, moist skin that helps them to breathe"
    },

Question 1. How do I remove the following block from the JSON file?

    {
      "Animal": {
        "Type_species": "Mammal"
      },
      "FindMe": "https://kids.nationalgeographic.com/animals/mammals/",
      "Description": "There Are Approximately 5,000 Mammal Species."
    },

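My other question is: 2. How can I adapt the script to account for a different "FindMe" URL across multiple files? For example, a second file would contain the following, and so on for the other files: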

    {
      "Animal": {
        "Type_species": "Mammal"
      },
      "FindMe": "https://kids.nationalgeographic.com/animals/mammals/facts/arctic-fox",
      "Description": "There Are Approximately 5,000 Mammal Species."
    },

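I think regular expressions would help, but I can't get my head around them or work out how to use them in a script.

Any help is appreciated, thanks.

Update: I would like the end result to look like this: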

    {
      "Animal": {
        "Type_species": "Reptile"
      },
      "FindMe": "https://www.merriam-webster.com/dictionary/amphibian",
      "Description": "Most are cold blooded."
    },
    {
      "Animal": {
        "Type_species": "Amphibian"
      },
      "FindMe": "https://en.wikipedia.org/wiki/Amphibian",
      "Description": "Most amphibians have thin, moist skin that helps them to breathe"
    },

Assuming your full JSON contains a list of dictionaries (as your example suggests), then:

    # Sample data: the "data" key holds the list of animal records.
    JSON = {"data": [
        {
            "Animal": {
                "Type_species": "Reptile"
            },
            "FindMe": "https://www.merriam-webster.com/dictionary/amphibian",
            "Description": "Most are cold blooded."
        },
        {
            "Animal": {
                "Type_species": "Mammal"
            },
            "FindMe": "https://kids.nationalgeographic.com/animals/mammals/",
            "Description": "There Are Approximately 5,000 Mammal Species."
        },
        {
            "Animal": {
                "Type_species": "Amphibian"
            },
            "FindMe": "https://en.wikipedia.org/wiki/Amphibian",
            "Description": "Most amphibians have thin, moist skin that helps them to breathe"
        }
    ]}

    # Keep every record whose Type_species is not "Mammal".
    # The "FindMe" URL is never inspected, so it may differ from file to file.
    JSON['data'] = [d for d in JSON['data'] if d['Animal']['Type_species'] != 'Mammal']

    print(JSON)
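
For the second part of the question (many files, each with a different "FindMe" URL), the same filter can be applied file by file, since it only looks at `Type_species`. Below is a minimal sketch, not a definitive implementation. It assumes every file is valid JSON with the records stored in a list under a top-level `"data"` key, as in the snippet above; the function names `drop_species` and `drop_species_in_dir`, the `"data"` key and the `*.json` pattern are illustrative assumptions, not something taken from your files.

```python
import json
from pathlib import Path


def drop_species(path, species="Mammal", key="data"):
    """Rewrite one JSON file without the records of the given species.

    Assumes the file holds {key: [record, ...]} (hypothetical layout).
    Returns the number of records removed.
    """
    path = Path(path)
    doc = json.loads(path.read_text(encoding="utf-8"))
    before = len(doc[key])
    # Filter on Type_species only; the FindMe URL is never compared.
    doc[key] = [
        d for d in doc[key]
        if d.get("Animal", {}).get("Type_species") != species
    ]
    # Overwrites the file in place.
    path.write_text(json.dumps(doc, indent=2), encoding="utf-8")
    return before - len(doc[key])


def drop_species_in_dir(directory, pattern="*.json", species="Mammal"):
    """Apply drop_species to every matching file; map file name -> removed count."""
    return {
        p.name: drop_species(p, species)
        for p in sorted(Path(directory).glob(pattern))
    }
```

Calling `drop_species_in_dir("path/to/json_files")` (a placeholder path) would process every `.json` file in that directory. Because the files are overwritten in place, it is probably worth trying this on copies first.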

This might work for you (GNU sed):

    sed '/^\s*{/{:a;N;/^\(\s*\){.*\n\1},/!ba;/"Type_species": "Mammal"/d}' file

Gather up each animal's details and, if the block contains "Type_species": "Mammal", delete it.
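
Since a Python script was the preferred option, here is a rough Python analogue of the same gather-then-delete idea that works on the text directly, which may help if a file is only a fragment and cannot be parsed as JSON. It is a sketch under assumptions: each object starts on a line containing only `{` and ends on a line with the same indentation containing `}` or `},`. The function name `drop_blocks` and its `marker` parameter are hypothetical.

```python
import re


def drop_blocks(text, marker='"Type_species": "Mammal"'):
    """Return text without the {...} blocks that contain marker."""
    kept = []
    block = None    # lines of the object currently being gathered
    closer = None   # the closing line expected for that object
    for line in text.splitlines(keepends=True):
        if block is None:
            # An object opens on a line holding only "{" (plus indentation).
            m = re.match(r"(\s*)\{\s*$", line)
            if m:
                block, closer = [line], m.group(1) + "}"
            else:
                kept.append(line)
            continue
        block.append(line)
        # The object closes at "}" or "}," with the opening line's indentation.
        if line.rstrip() in (closer, closer + ","):
            if marker not in "".join(block):
                kept.extend(block)
            block = None
    if block:
        # Unterminated block at end of input: keep it untouched.
        kept.extend(block)
    return "".join(kept)
```

Like the sed command, this never looks at the "FindMe" line, so a different URL in each file makes no difference. Note that it does not repair commas: if the removed block was the last element of a list, the preceding `},` keeps its trailing comma.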