How to express Strings in Swift using Unicode hexadecimal values (UTF-16)
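I want to write a Unicode string in Swift using hexadecimal values. I have read the documentation for Strings and Characters, so I know that I can use special Unicode characters directly in strings like the following:
var variableString = "Cat‼🐱" // "Cat" + Double Exclamation + cat emoji
However, I would like to do it using the Unicode code points. The documentation shows it for characters, but it is not very clear about how to do it for strings.
(Note: Although the answer seems obvious to me now, it wasn't obvious at all just a short time ago. I am answering my own question below as a means of learning how to do this and also to help myself understand Unicode terminology and how Swift Characters and Strings work.)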
Updated for Swift 3
Character
The Swift syntax for forming a hexadecimal code point is
\u{n}
where n is a hexadecimal number up to 8 digits long. The valid range for a Unicode scalar is U+0 to U+D7FF and U+E000 to U+10FFFF inclusive. (The U+D800 to U+DFFF range is for surrogate pairs, which are not scalars themselves, but are used in UTF-16 for encoding the higher value scalars.)
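The valid range described above can be checked at runtime. The following is a minimal sketch, assuming a Swift toolchain where the failable `UnicodeScalar(UInt32)` initializer is available; the names `candidates` and `validFlags` are only illustrative.

```swift
// Values on both sides of each boundary of the valid scalar range.
let candidates: [UInt32] = [0x43, 0xD7FF, 0xD800, 0xDFFF, 0xE000, 0x10FFFF, 0x110000]

// The failable initializer returns nil for surrogates and out-of-range values.
let validFlags = candidates.map { UnicodeScalar($0) != nil }

for (value, isValid) in zip(candidates, validFlags) {
    let hex = String(value, radix: 16, uppercase: true)
    print("U+\(hex): \(isValid ? "valid scalar" : "not a scalar")")
}
```

The surrogate values U+D800 and U+DFFF and the out-of-range value 0x110000 are rejected, while the boundary values U+D7FF, U+E000 and U+10FFFF are accepted.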
Examples:
// The following forms are equivalent. They all produce "C".
let char1: Character = "\u{43}"
let char2: Character = "\u{0043}"
let char3: Character = "\u{00000043}"
// Higher value Unicode scalars are done similarly
let char4: Character = "\u{203C}" // ‼ (DOUBLE EXCLAMATION MARK character)
let char5: Character = "\u{1F431}" // 🐱 (cat emoji)
// Characters can be made up of multiple scalars
let char7: Character = "\u{65}\u{301}" // é = "e" + accent mark
let char8: Character = "\u{65}\u{301}\u{20DD}" // é⃝ = "e" + accent mark + circle
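Notes:
- Leading zeros can be added or omitted.
- Characters are known as extended grapheme clusters. Even when they are composed of multiple scalars, they are still considered a single character. What matters is that they appear to the user as a single character (grapheme).
- TODO: How to convert surrogate pair to Unicode scalar in Swift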
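To make the notes above concrete, here is a small sketch that compares the different views of one multi-scalar character and then combines a UTF-16 surrogate pair into a single scalar. It assumes Swift 4 or later (for `String.count` and `String(decoding:as:)`); all variable names are made up for the illustration.

```swift
// One user-visible character built from two scalars: "e" + combining acute accent.
let accented: Character = "\u{65}\u{301}"
let text = String(accented)

let characterCount = text.count               // extended grapheme clusters
let scalarCount = text.unicodeScalars.count   // Unicode scalars
let utf16Count = text.utf16.count             // UTF-16 code units

// A surrogate pair is two UTF-16 code units that together encode one scalar.
let surrogatePair: [UInt16] = [0xD83D, 0xDC31]
let decoded = String(decoding: surrogatePair, as: UTF16.self)
let decodedValue = decoded.unicodeScalars.first!.value

// The same conversion done by hand with the UTF-16 formula.
let high: UInt32 = 0xD83D
let low: UInt32 = 0xDC31
let combined = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

print(characterCount, scalarCount, utf16Count)
print(String(decodedValue, radix: 16), String(combined, radix: 16))
```

Both approaches should arrive at U+1F431, the cat emoji used throughout this answer. The manual formula is shown only to explain what the decoder does; letting the standard library decode is less error-prone.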
String
Strings are composed of characters. See the following examples for some ways to form them using hexadecimal code points.
Examples:
var string1 = "\u{0043}\u{0061}\u{0074}\u{203C}\u{1F431}" // Cat‼🐱
// pass an array of characters to a String initializer
let catCharacters: [Character] = ["\u{0043}", "\u{0061}", "\u{0074}", "\u{203C}", "\u{1F431}"] // ["C", "a", "t", "‼", "🐱"]
let string2 = String(catCharacters) // Cat‼🐱
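Going the other way can be a useful check: the sketch below reads the code points back out of a string built from `\u{n}` escapes and rebuilds the string from its scalars. It assumes Swift 4 or later; `cat`, `codePoints` and `rebuilt` are illustrative names.

```swift
let cat = "\u{0043}\u{0061}\u{0074}\u{203C}\u{1F431}"

// List each scalar as an uppercase hexadecimal code point.
let codePoints = cat.unicodeScalars.map {
    "U+" + String($0.value, radix: 16, uppercase: true)
}
print(codePoints)

// Rebuild an equal string from the scalars alone.
var view = String.UnicodeScalarView()
view.append(contentsOf: cat.unicodeScalars)
let rebuilt = String(view)
print(rebuilt == cat)
```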
Converting hex values at runtime
At runtime you can convert a hexadecimal or Int value into a Character or String by first converting it to a UnicodeScalar.
Examples:
// hex values
let value0: UInt8 = 0x43 // 67
let value1: UInt16 = 0x203C // 8252
let value2: UInt32 = 0x1F431 // 128049
// convert hex to UnicodeScalar
let scalar0 = UnicodeScalar(value0)
// make sure that UInt16 and UInt32 form valid Unicode values
// (the early return assumes this code runs inside a function)
guard
    let scalar1 = UnicodeScalar(value1),
    let scalar2 = UnicodeScalar(value2) else {
    return
}
// convert to Character
let character0 = Character(scalar0) // C
let character1 = Character(scalar1) // ‼
let character2 = Character(scalar2) // 🐱
// convert to String
let string0 = String(scalar0) // C
let string1 = String(scalar1) // ‼
let string2 = String(scalar2) // 🐱
// convert hex array to String
let myHexArray = [0x43, 0x61, 0x74, 0x203C, 0x1F431] // an Int array
var myString = ""
for hexValue in myHexArray {
    if let scalar = UnicodeScalar(hexValue) {
        myString.append(Character(scalar))
    }
}
print(myString) // Cat‼🐱
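The conversions above can be wrapped into small reusable functions. This is only a sketch: `string(fromCodePoints:)` and `character(fromHex:)` are hypothetical helpers invented here, not standard library API, and skipping invalid values silently is an assumption about the desired behavior.

```swift
/// Hypothetical helper: builds a String from Int code points,
/// skipping any value that is not a valid Unicode scalar.
func string(fromCodePoints codePoints: [Int]) -> String {
    var result = ""
    for codePoint in codePoints {
        guard let value = UInt32(exactly: codePoint),
              let scalar = UnicodeScalar(value) else { continue }
        result.unicodeScalars.append(scalar)
    }
    return result
}

/// Hypothetical helper: parses hex text such as "1F431" or "0x1F431".
func character(fromHex hex: String) -> Character? {
    let digits = hex.hasPrefix("0x") ? String(hex.dropFirst(2)) : hex
    guard let value = UInt32(digits, radix: 16),
          let scalar = UnicodeScalar(value) else { return nil }
    return Character(scalar)
}

print(string(fromCodePoints: [0x43, 0x61, 0x74, 0x203C, 0x1F431]))
print(character(fromHex: "0x1F431") as Any)
```

Returning an optional from `character(fromHex:)` keeps the failure visible to the caller, which is usually preferable to force-unwrapping when the hex text comes from user input or a file.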
Further reading
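To get from your hex "0x1F52D" to the actual emoji:
let c = 0x1F602
The next step would possibly be getting a UInt32 from your hex:
let intEmoji = UnicodeScalar(c)!.value
From here you can do something like:
titleLabel.text = String(UnicodeScalar(intEmoji)!)
This gives you "😂".
It also works with a range of hex values: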
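let emojiRanges = [
    0x1F600...0x1F636,
    0x1F645...0x1F64F,
    0x1F910...0x1F91F,
    0x1F30D...0x1F52D
]
// `data` is not declared in the original snippet; a UInt32 array is assumed here
var data = [UInt32]()
for range in emojiRanges {
    for i in range {
        let c = UnicodeScalar(i)!.value
        data.append(c)
    }
}
This is one way, for example, to get multiple UInt32 values from your hex ranges.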
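A variation on the range loop that avoids force-unwrapping and produces displayable strings directly might look like the sketch below. It assumes Swift 5, where a `ClosedRange` of integers can be iterated; the name `emojis` and the shortened list of ranges are chosen only for the illustration.

```swift
let emojiRanges: [ClosedRange<UInt32>] = [
    0x1F600...0x1F636,
    0x1F645...0x1F64F
]

var emojis: [String] = []
for range in emojiRanges {
    for value in range {
        // Skip any value that is not a valid Unicode scalar instead of crashing.
        if let scalar = UnicodeScalar(value) {
            emojis.append(String(scalar))
        }
    }
}

print(emojis.count)
print(emojis.joined())
```

Note that a valid scalar is not necessarily an assigned emoji, so a range that spans unassigned code points may still render placeholder glyphs.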