Remove JSON lines based on a specific string
I have a JSON file with the following content:
[{"headline":"Ntugamo court issues criminal summons against Rukutana",
"url_src":"\/news\/headlines\/67240-ntugamo-court-issues-criminal-summons-against-rukutana"},
{"headline":"Corruption: Equal Opportunities Commission boss granted bail",
"url_src":"\/news\/headlines\/67239-corruption-equal-opportunities-commission-boss-granted-bail"},
{"headline":"Bobi Wine to launch corruption manifesto in Mbarara rejects EC security team",
"url_src":"https:\/\/www.monitor.co.ug"}]
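I am trying to find and remove every part enclosed in { and } that contains the word "corruption", including the braces themselves.
For example, in this case the .py script would remove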
{"headline":"Corruption: Equal Opportunities Commission boss granted bail",
"url_src":"/news/headlines/67239-corruption-equal-opportunities-commission-boss-granted-bail"}
and also remove
{"headline":"Bobi Wine to launch corruption manifesto in Mbarara rejects EC security team","url_src":"https:\/\/www.monitor.co.ug"}
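Is this possible with Python 2.7?
You can use a list comprehension to iterate over each dict in the list.
On each iteration, convert the dict to a string and use if "corruption" not in str(d).lower() to check whether the string "corruption" appears in the lowercased string. If it does not, keep the dict: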
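import json

# Load the JSON array into a list of dicts.
with open("j.json", "rb") as f:
    lst = json.load(f)

# Keep only the dicts whose string form does not contain "corruption".
lst = [d for d in lst if "corruption" not in str(d).lower()]
print(lst)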
Output (as printed by Python 3; Python 2.7 prefixes each string with u):
[{'headline': 'Ntugamo court issues criminal summons against Rukutana',
'url_src': '/news/headlines/67240-ntugamo-court-issues-criminal-summons-against-rukutana'}]
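Because str(d) also includes the dict's keys, the check above would drop an entry if a key name happened to contain the word. As a minimal sketch of an alternative, the helper below (drop_matching is a hypothetical name, not part of the original answer) looks only at string values. It inlines the sample data from the question instead of reading j.json, and it should run on both Python 2.7 and Python 3.

```python
import json


def drop_matching(records, word):
    """Return the records in which no string value contains `word` (case-insensitive)."""
    word = word.lower()
    return [
        rec for rec in records
        # hasattr(..., "lower") skips non-string values such as numbers or None.
        if not any(word in value.lower()
                   for value in rec.values()
                   if hasattr(value, "lower"))
    ]


# Sample data copied from the question (raw string keeps the \/ escapes intact).
raw = r'''[{"headline":"Ntugamo court issues criminal summons against Rukutana",
"url_src":"\/news\/headlines\/67240-ntugamo-court-issues-criminal-summons-against-rukutana"},
{"headline":"Corruption: Equal Opportunities Commission boss granted bail",
"url_src":"\/news\/headlines\/67239-corruption-equal-opportunities-commission-boss-granted-bail"},
{"headline":"Bobi Wine to launch corruption manifesto in Mbarara rejects EC security team",
"url_src":"https:\/\/www.monitor.co.ug"}]'''

records = json.loads(raw)
kept = drop_matching(records, "corruption")
print(len(kept))
```

For this sample the result is the same as with the str(d) approach: only the Ntugamo entry is kept.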
To write the list back to the JSON file, use json.dump:
# The built-in open() in Python 2.7 has no encoding argument; json.dump writes ASCII-only output by default.
with open("j.json", "w") as f:
    json.dump(lst, f)