Converting JSON (UTF-8) to JSON (unicode-escaped) & raw strings (can't have an odd number of backslashes)

Question

Hi, I'm having some trouble converting UTF-8 JSON to unicode-escaped JSON in Python.

I know how to convert a UTF-8 .txt file to a unicode-escaped .txt file:

with open("input.txt", "r", encoding='utf8') as f:
    text = f.read()

with open('output.txt', 'w', encoding='unicode-escape') as f:
    f.write(text)

However, I ran into a problem when I tried the same approach with the json module, as shown below:

with codecs.open(self.input,'r', encoding='utf-8') as json_file:
    json_data = json.load(json_file)

with codecs.open(self.output,'w', encoding='unicode-escape') as json_file:
    prepare_json = json.dumps(json_data, ensure_ascii=False)
    json_file.write(prepare_json)

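It saves without errors, but every double quote (") inside the JSON ends up preceded by a double backslash (\\), so the unicode-escaped .json file cannot be loaded when a Python script reads it.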

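For example:

1. Input file (UTF-8): {"context" : "-\" 너"}

I convert it with the second code block above and get:

2. Output file (UNICODE-ESCAPED) : {"context" : "-\\" \ub108"}

3. What I want (UNICODE-ESCAPED) : {"context" : "-\" \ub108"}

Because of the double backslash in front of the double quote, Python raises an error when loading the unicode-escaped JSON file.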

More details

Input file: ./simple_test.json

{"context" : "-\" 너"}

with codecs.open('./simple_test.json', 'r', encoding='utf-8') as json_file:
    json_data = json.load(json_file)

prepare_json = json.dumps(json_data, ensure_ascii=False)
prepare_json
>>> '{"context": "-\" 너"}'
repr(prepare_json)
>>> '\'{"context": "-\\" 너"}\''
print(prepare_json)
>>> {"context": "-\" 너"}

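So I expected it to write {"context": "-\" \ub108"}, which is just the unicode-escaped form of {"context": "-\" 너"}.

Output.json (what I expected)
{"context": "-\" \ub108"}

However, with the code below I got something different: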

with codecs.open('./simple_test_out.json','w', encoding='unicode-escape') as json_file:
    json_file.write(prepare_json)

Output.json (what I got)
{"context": "-\\" \ub108"}

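After many attempts I figured out that this only happens when the file is written with encoding='unicode-escape'. Trying to patch the text with a replace that uses a raw string containing an odd number of backslashes does not work either.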
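The behaviour described above can be reproduced in memory, without any files. This is only a minimal sketch: the variable names are illustrative, and it assumes the same sample data as simple_test.json.

```python
import json

# Same content as the sample input file (illustrative variable names)
data = {"context": '-" 너'}

# ensure_ascii=False keeps the Korean character and escapes the quote as \"
text = json.dumps(data, ensure_ascii=False)

# The unicode-escape codec turns non-ASCII characters into \uXXXX,
# but it also escapes every backslash, so \" becomes \\"
escaped = text.encode("unicode-escape").decode("ascii")
print(escaped)

# The doubled backslash ends the JSON string early, so parsing fails
try:
    json.loads(escaped)
except json.JSONDecodeError as exc:
    print("not valid JSON:", exc)
```

The codec treats the text as a Python string literal rather than as JSON, which appears to be why the backslash that json.dumps already added gets escaped a second time.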

Any suggestions or ideas would be greatly appreciated!

More information

import codecs
import json

with codecs.open('./simple_test.json','r', encoding='utf-8') as json_file:
    json_data = json.load(json_file)

with codecs.open('./simple_test_out.json','w', encoding='utf-8') as json_file:
    prepare_json = json.dumps(json_data, ensure_ascii=False)
    json_file.write(prepare_json)

This works fine.

import codecs
import json

with codecs.open('./simple_test.json','r', encoding='utf-8') as json_file:
    json_data = json.load(json_file)

with codecs.open('./simple_test_out.json','w', encoding='unicode-escape') as json_file:
    prepare_json = json.dumps(json_data, ensure_ascii=False)
    json_file.write(prepare_json)

But this does not work, and it is the format I want.

Answer 1

It looks like you just want ensure_ascii=True (the default):

C:\>type input.json
{"context" : "-\" 너"}

C:\>py
Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> with open('input.json',encoding='utf8') as f:
...    data = json.load(f)
...
>>> data
{'context': '-" 너'}
>>> with open('output.json','w',encoding='utf8') as f:
...    json.dump(data,f)
...
>>> ^Z

C:\>type output.json
{"context": "-\" \ub108"}
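The same steps can be sketched as a standalone script that also checks the round trip. This is a rough illustration, not a definitive implementation: the temporary directory and the output.json file name are assumptions made so the example is self-contained.

```python
import json
import os
import tempfile

# Same content as the sample input file
data = {"context": '-" 너'}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.json")  # hypothetical file name

    # ensure_ascii=True is the default, so json.dump itself writes
    # non-ASCII characters as \uXXXX and escapes the quote once as \"
    with open(path, "w", encoding="utf8") as f:
        json.dump(data, f)

    # Read the raw text back to inspect what was written
    with open(path, encoding="utf8") as f:
        raw = f.read()

    # Load it again to confirm the file is still valid JSON
    with open(path, encoding="utf8") as f:
        loaded = json.load(f)

print(raw)
print(loaded == data)
```

Since the escaping is done by the json module, the file can be written with a plain utf8 encoding and the unicode-escape codec is not needed.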
