ast.literal_eval converts unicode code points from \uxxxx to \\uxxxx, how can I avoid this?

Question

For example, here is the code that processes this JSON data:

json.loads(u"\"{\\\"title\\\": \\\"\\\\u5927\\\"}\"")

json.loads converts it to a unicode string, shown below:

{"title": "\u5927"}

Here is the code that processes that unicode string:

ast.literal_eval(json.loads(u"\"{\\\"title\\\": \\\"\\\\u5927\\\"}\""))

ast.literal_eval converts it to a dictionary, shown below:

{'title': '\\u5927'}

But what I want is a dictionary with the following content:

{'title': '\u5927'}

Answer 1

json.loads("{\"title\": \"\u5927\"}") already returns a dictionary, so you don't need ast.literal_eval at all.

d = json.loads("{\"title\": \"\u5927\"}")

print d
{u'title': u'\u5927'}

type(d)
Out[2]: dict

For the full list of JSON-to-Python conversions that json.loads() performs, see the conversion table in the json module documentation.

If you are trying to parse a file, use json.load() (without the s), like this:

with open('your-file.json') as f:
    # you can change the encoding to the one you need
    print json.load(f, encoding='utf-8')

Test:

import json
from io import StringIO

s = StringIO(u"{\"title\": \"\u5927\"}")

print json.load(s)
{u'title': u'\u5927'}
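
As a rough sketch of the file case that does not depend on the encoding keyword of json.load (as far as I know, newer Python 3 releases no longer accept it), the decoding can be done by io.open instead. The temporary file here is only a hypothetical stand-in for 'your-file.json':

```python
import io
import json
import os
import tempfile

# Hypothetical temporary file standing in for 'your-file.json'.
fd, path = tempfile.mkstemp(suffix=".json")
os.close(fd)
try:
    # Write JSON text that contains the escape sequence \u5927 literally.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(u'{"title": "\\u5927"}')
    # io.open decodes the bytes, so json.load needs no encoding argument.
    with io.open(path, encoding="utf-8") as f:
        data = json.load(f)
finally:
    os.remove(path)

print(data == {u"title": u"\u5927"})
```

io.open behaves the same way on Python 2 and Python 3, which is why it is used here instead of the built-in open.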

The OP has completely changed what the JSON to be parsed looks like, so here is another solution: parse the JSON a second time:

json.loads(json.loads(u"\"{\\\"title\\\": \\\"\\\\u5927\\\"}\""))
Out[6]: {u'title': u'\u5927'}

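This works because the input is a JSON string whose content is itself JSON text. The first json.loads only unwraps that outer string and returns the inner JSON text; parsing the result again with json.loads finally deserializes it into a dictionary.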
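
The two passes can be made visible with a small sketch. It assumes the payload is the doubly encoded string from the question; the variable names (payload, inner, result) are only illustrative, and a raw string is used so the backslashes stay literal:

```python
import json

# Doubly encoded payload: a JSON string whose content is JSON text.
payload = r'"{\"title\": \"\\u5927\"}"'

# First pass: unwrap the outer JSON string. The result is still text,
# and it contains the six characters \u5927 literally.
inner = json.loads(payload)

# Second pass: parse the inner JSON text. The JSON parser now turns
# the \u5927 escape into the actual character.
result = json.loads(inner)

print(inner == u'{"title": "\\u5927"}')
print(result == {u"title": u"\u5927"})
```

Both comparisons should print True. The json parser handles the \u escape on the second pass, whereas ast.literal_eval on Python 2 treats the inner text as a byte-string literal and leaves the backslash in place.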
