How to upgrade from R16B to R17? list_to_binary breaks when the list contains Chinese characters
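We are using R16B03-1 and trying to upgrade to R17. Both iolist_to_binary and list_to_binary fail when the list contains Chinese characters.
I searched on Google and found the following passages, which explain the problem.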
The default encoding of Erlang files has been changed from ISO-8859-1 to UTF-8. The encoding of XML files has also been changed to UTF-8
Only if a string contains code points < 256, can it be directly converted to a binary by using i.e. erlang:iolist_to_binary/1 or can be sent directly to a port. If the string contains Unicode characters > 255, an encoding has to be decided upon and the string should be converted to a binary in the preferred encoding using unicode:characters_to_binary/{1,2,3}. Strings are not generally lists of bytes, as they were before Erlang/OTP R13. They are lists of characters. Characters are not generally bytes, they are Unicode code points.
My question: do we have to change every list_to_binary call to unicode:characters_to_binary?
Thanks
From the following link:
http://www.erlang.org/doc/man/unicode.html
Other Unicode encodings than integers representing codepoints or UTF-8 in binaries are referred to as "external encodings". The ISO-latin-1 encoding is in binaries and lists referred to as latin1-encoding.
It is recommended to only use external encodings for communication with external entities where this is required. When working inside the Erlang/OTP environment, it is recommended to keep binaries in UTF-8 when representing Unicode characters.
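You do not need to change list_to_binary to unicode:characters_to_binary everywhere. It is only needed where your code interfaces with the outside world and you are not sure the string will be represented in UTF-8 (or you know its encoding is not UTF-8). After the conversion you can use the standard BIFs.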
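One way to apply this advice is to funnel boundary conversions through a single helper. The following is only a minimal sketch: the module name text_boundary and the function to_utf8/1 are hypothetical names chosen for illustration, and the error tuples it returns are an assumed convention rather than anything prescribed by OTP.

```erlang
-module(text_boundary).
-export([to_utf8/1]).

%% Hypothetical helper (not part of OTP): encode character data
%% (a list of code points, possibly mixed with UTF-8 binaries)
%% as a UTF-8 binary at the edge of the system.
%%
%% unicode:characters_to_binary/1 returns the binary on success, or
%% {error, Encoded, Rest} / {incomplete, Encoded, Rest} when the
%% input cannot be fully encoded.
to_utf8(Chars) ->
    case unicode:characters_to_binary(Chars) of
        Bin when is_binary(Bin) ->
            {ok, Bin};
        {error, _Encoded, Rest} ->
            {error, {invalid_chardata, Rest}};
        {incomplete, _Encoded, Rest} ->
            {error, {incomplete_chardata, Rest}}
    end.
```

Keep in mind that this is not a drop-in replacement for list_to_binary on lists that really are bytes: for values in the range 128..255 the two functions produce different binaries. For example, list_to_binary([233]) gives <<233>>, while text_boundary:to_utf8([233]) gives {ok, <<195,169>>}.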
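Example: suppose you have a list containing a single character, [52974].
list_to_binary([52974]).
This raises a badarg exception.
But once you do
A = unicode:characters_to_binary([52974], utf8).
<<"컮">>
A is a UTF-8 encoded binary (the three bytes <<236,187,174>>), and from here on you can use the faster built-in functions in your business logic.
B = binary_to_list(A).
"컮"
list_to_binary(B).
<<"컮">>
Note that B is the list of UTF-8 bytes [236,187,174], not the original code point list [52974]. The output reads "컮" only when the terminal interprets those bytes as UTF-8.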
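To make the byte-level picture explicit, here is a small sketch of the same round trip with the intermediate values spelled out as integers. The module name utf8_roundtrip and the function demo/0 are hypothetical. It uses only unicode:characters_to_binary/3, unicode:characters_to_list/2, binary_to_list/1 and list_to_binary/1.

```erlang
-module(utf8_roundtrip).
-export([demo/0]).

%% Illustrative only: shows that the byte-oriented BIFs are safe once
%% the data is a UTF-8 binary, and that getting code points back
%% requires the unicode module.
demo() ->
    Chars = [52974],                                  % one code point, U+CEEE
    Utf8  = unicode:characters_to_binary(Chars, unicode, utf8),
    Bytes = binary_to_list(Utf8),                     % list of bytes, each < 256
    Utf8  = list_to_binary(Bytes),                    % bytes -> same binary (match asserts it)
    Chars = unicode:characters_to_list(Utf8, utf8),   % decode back to code points
    {Utf8, Bytes}.
```

Calling utf8_roundtrip:demo() returns {<<236,187,174>>, [236,187,174]}. The list handed back by binary_to_list/1 holds bytes rather than characters, so anything that needs to count or slice characters should decode with unicode:characters_to_list/2 first.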