Impossible to store JSON in Python with a single unescaped backslash
I am building a JSON body for a REST payload, like this:
>>> import json
>>> j = json.loads('["foo", {"bar": ["to_be_replaced", 1.1, 1.0, 2]}]')
>>> text = "aaaa" + "\\" + "bbbbb" + "\\" + "cccc"
>>> j[1]["bar"][0] = text
>>> j
['foo', {'bar': ['aaaa\\bbbbb\\cccc', 1.1, 1.0, 2]}]
Annoyingly, the format the other side expects is this:
"aaaa\bbbb\cccc".
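I know this is a terrible idea.
I have tried everything, and I am starting to believe that storing text in this format in a JSON object is simply impossible. Is there a way? Or do I need to ask the developers of the web service to choose a more sensible delimiter?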
I know the string really contains single backslashes, because that is what I see when I print it:
>>> print(text)
aaaa\bbbbb\cccc
But that does not help me get it into a JSON object.
Yes, it is impossible, and that is by design.
A JSON encoder should, by its nature, emit only valid JSON. From RFC 8259, emphasis mine:
7. Strings
The representation of strings is similar to conventions used in the C
family of programming languages. A string begins and ends with
quotation marks. All Unicode characters may be placed within the
quotation marks, except for the characters that MUST be escaped:
quotation mark, reverse solidus, and the control characters (U+0000
through U+001F).
Any character may be escaped. If the character is in the Basic
Multilingual Plane (U+0000 through U+FFFF), then it may be
represented as a six-character sequence: a reverse solidus, followed
by the lowercase letter u, followed by four hexadecimal digits that
encode the character's code point. The hexadecimal letters A through
F can be uppercase or lowercase. So, for example, a string
containing only a single reverse solidus character may be represented
as "\u005C".
Alternatively, there are two-character sequence escape
representations of some popular characters. So, for example, a
string containing only a single reverse solidus character may be
represented more compactly as "\\".
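Note the phrase "MUST be escaped": "MUST" is a formally defined term of art, and anything that does not comply with the MUST requirements of the JSON specification is not allowed to call itself JSON.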
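In summary: a string in your data that contains only a literal backslash may be encoded in JSON as "\u005c" or as "\\". It may not be encoded as "\" (that is, with the character included as an unescaped literal).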
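The behaviour described above can be checked with Python's standard json module. The following is a minimal sketch, not part of the original answer; the variable names (encoded, decoded, invalid, rejected) are illustrative. It shows that the Python string holds single backslashes, that json.dumps doubles them on the wire, that json.loads restores them, and that a document with raw backslashes is rejected.

```python
import json

# A Python string with exactly one backslash between each segment.
text = "aaaa" + "\\" + "bbbbb" + "\\" + "cccc"
assert text.count("\\") == 2

# Encoding: json.dumps escapes each backslash, as RFC 8259 requires.
encoded = json.dumps(text)
print(encoded)  # "aaaa\\bbbbb\\cccc"

# Decoding restores the original single backslashes.
decoded = json.loads(encoded)
assert decoded == text

# The six-character escape is an equally valid spelling of a backslash.
assert json.loads('"\\u005c"') == "\\"

# A document containing raw, unescaped backslashes is not valid JSON.
invalid = '"aaaa\\bbbbb\\cccc"'  # the JSON text is: "aaaa\bbbbb\cccc"
try:
    json.loads(invalid)
    rejected = False
except json.JSONDecodeError:
    rejected = True
print(rejected)  # True
```

Note that in the rejected document the sequence \b is itself a legal JSON escape (backspace), so the parser fails on \c. A more lenient parser on the receiving side would still read \b as a backspace character rather than a backslash followed by "b", which is another reason to ask the service to accept properly escaped JSON or to pick a different delimiter.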