How to express Strings in Swift using Unicode hexadecimal values (UTF-16)

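I want to write a Unicode string in Swift using hexadecimal values. I have read the Strings and Characters documentation, so I know that I can use special Unicode characters directly in strings like the following: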

var variableString = "Cat‼🐱" // "Cat" + Double Exclamation + cat emoji

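However, I would like to do that using the Unicode code points. The documentation shows how to do it for characters, but it is not very clear about how to do it for strings.

(Note: Although the answer seems obvious to me now, it wasn't obvious at all just a short time ago. I am answering my own question below as a means of learning how to do this, and also to help myself understand Unicode terminology and how Swift Characters and Strings work.)

Updated for Swift 3

Character

The Swift syntax for forming a hexadecimal code point is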

\u{n}

where n is a hexadecimal number up to 8 digits long. The valid range for a Unicode scalar is U+0 to U+D7FF and U+E000 to U+10FFFF inclusive. (The U+D800 to U+DFFF range is for surrogate pairs, which are not scalars themselves, but are used in UTF-16 for encoding the higher value scalars.)

Examples:

// The following forms are equivalent. They all produce "C". 
let char1: Character = "\u{43}"
let char2: Character = "\u{0043}"
let char3: Character = "\u{00000043}"

// Higher value Unicode scalars are done similarly
let char4: Character = "\u{203C}" // ‼ (DOUBLE EXCLAMATION MARK character)
let char5: Character = "\u{1F431}" // 🐱 (cat emoji)

// Characters can be made up of multiple scalars
let char7: Character = "\u{65}\u{301}" // é = "e" + accent mark
let char8: Character = "\u{65}\u{301}\u{20DD}" // é⃝ = "e" + accent mark + circle

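Notes:

  • Leading zeros can be added or omitted
  • Characters are known as extended grapheme clusters. Even when they are made up of multiple scalars, they are still considered a single character. What matters is that they appear to the user as a single character (grapheme).
  • TODO: How to convert surrogate pair to Unicode scalar in Swift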
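
For the surrogate pair conversion mentioned in the last note, here is a minimal sketch. The function name unicodeScalar(high:low:) is my own (hypothetical, not part of the standard library); it only applies the standard UTF-16 arithmetic and then relies on the failable Unicode.Scalar initializer. The String(decoding:as:) line assumes Swift 4 or later.

```swift
// Hypothetical helper: combine a UTF-16 surrogate pair into one Unicode scalar.
func unicodeScalar(high: UInt16, low: UInt16) -> Unicode.Scalar? {
    // Reject anything that is not a lead surrogate followed by a trail surrogate.
    guard UTF16.isLeadSurrogate(high), UTF16.isTrailSurrogate(low) else {
        return nil
    }
    // Each surrogate carries 10 bits of the scalar value minus 0x10000.
    let value = 0x10000 + ((UInt32(high) - 0xD800) << 10) + (UInt32(low) - 0xDC00)
    return Unicode.Scalar(value)
}

// U+1F431 (cat emoji) is encoded in UTF-16 as the pair D83D DC31.
let combined = unicodeScalar(high: 0xD83D, low: 0xDC31)

// The standard library can also decode the code units directly.
let decoded = String(decoding: [0xD83D, 0xDC31] as [UInt16], as: UTF16.self)
```

Doing the arithmetic by hand is mostly useful for understanding the encoding. If you already have an array of UTF-16 code units, decoding the whole array is probably simpler.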

String

Strings are made up of characters. See the following examples for some ways to form them using hexadecimal code points.

Examples:

var string1 = "\u{0043}\u{0061}\u{0074}\u{203C}\u{1F431}" // Cat‼🐱

// pass an array of characters to a String initializer
let catCharacters: [Character] = ["\u{0043}", "\u{0061}", "\u{0074}", "\u{203C}", "\u{1F431}"] // ["C", "a", "t", "‼", "🐱"]
let string2 = String(catCharacters) // Cat‼🐱
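
Going the other way can help when checking a string built from code points. The sketch below (assuming Swift 4 or later for count on String; the variable names are my own) compares the Character, Unicode scalar and UTF-16 views of the same string, and prints the hexadecimal values of each.

```swift
let cat = "\u{0043}\u{0061}\u{0074}\u{203C}\u{1F431}"

// The same string seen through three different views.
let characterCount = cat.count                  // extended grapheme clusters
let scalarCount = cat.unicodeScalars.count      // Unicode scalars
let utf16Count = cat.utf16.count                // UTF-16 code units

// Hexadecimal values of the scalars and of the UTF-16 code units.
let scalarHex = cat.unicodeScalars.map { String($0.value, radix: 16, uppercase: true) }
let utf16Hex = cat.utf16.map { String($0, radix: 16, uppercase: true) }

print(characterCount, scalarCount, utf16Count)  // 5 5 6
print(scalarHex)  // ["43", "61", "74", "203C", "1F431"]
print(utf16Hex)   // ["43", "61", "74", "203C", "D83D", "DC31"]

// A character made of several scalars still counts as one Character.
let accented = "\u{65}\u{301}"
print(accented.count, accented.unicodeScalars.count)  // 1 2
```

The cat emoji takes two UTF-16 code units (a surrogate pair), which is why the UTF-16 count is one higher than the scalar count.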

Converting Hex Values at Runtime

At runtime you can convert hexadecimal or Int values into a Character or String by first converting them to a UnicodeScalar.

Examples:

// hex values
let value0: UInt8  = 0x43     // 67
let value1: UInt16 = 0x203C   // 8252
let value2: UInt32 = 0x1F431  // 128049

// convert hex to UnicodeScalar
let scalar0 = UnicodeScalar(value0)
// make sure that UInt16 and UInt32 form valid Unicode values
// (the early return below assumes this code runs inside a function)
guard
    let scalar1 = UnicodeScalar(value1),
    let scalar2 = UnicodeScalar(value2) else {
    return
}

// convert to Character
let character0 = Character(scalar0) // C
let character1 = Character(scalar1) // ‼
let character2 = Character(scalar2) // 🐱

// convert to String
let string0 = String(scalar0) // C
let string1 = String(scalar1) // ‼
let string2 = String(scalar2) // 🐱

// convert hex array to String
let myHexArray = [0x43, 0x61, 0x74, 0x203C, 0x1F431] // an Int array
var myString = ""
for hexValue in myHexArray {
    if let scalar = UnicodeScalar(hexValue) {
        myString.append(Character(scalar))
    }
}
print(myString) // Cat‼🐱
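
At runtime the hex value often arrives as text rather than as a number. The following is only a sketch of how that case could be handled; the function name character(fromHex:) is my own (hypothetical). It parses the text with the standard radix initializer and then goes through UnicodeScalar as above, returning nil for anything that is not a valid scalar.

```swift
// Hypothetical helper: turn hex text such as "1F431" into a Character.
func character(fromHex hex: String) -> Character? {
    guard
        let value = UInt32(hex, radix: 16),     // nil if the text is not valid hex
        let scalar = Unicode.Scalar(value)      // nil for surrogates or values above 0x10FFFF
    else {
        return nil
    }
    return Character(scalar)
}

let catEmoji = character(fromHex: "1F431")      // cat emoji
let surrogate = character(fromHex: "D83D")      // nil, a lone surrogate is not a scalar
let outOfRange = character(fromHex: "110000")   // nil, above the valid range
let notHex = character(fromHex: "zz")           // nil, not a hexadecimal number
```

Returning an optional keeps the invalid cases visible to the caller instead of crashing on a force unwrap.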

Further reading

To go from your hex "0x1F52D" to the actual emoji

let c = 0x1F602

The next step would possibly be getting a UInt32 from your hex

let intEmoji = UnicodeScalar(c)!.value

From here you can do something like
titleLabel.text = String(UnicodeScalar(intEmoji)!)

Here you have a "😂"

It also works with ranges of hex values

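// `data` is not declared in the original snippet; a [UInt32] array is assumed here
var data = [UInt32]()

let emojiRanges = [
    0x1F600...0x1F636,
    0x1F645...0x1F64F,
    0x1F910...0x1F91F,
    0x1F30D...0x1F52D
]

for range in emojiRanges {
    for i in range {
        // force unwrapping is safe only because every value in these ranges is a valid scalar
        let c = UnicodeScalar(i)!.value
        data.append(c)
    }
}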

This gets multiple UInt32 values from your hex ranges, for example.
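
If the goal is to display the emoji rather than to store the raw UInt32 values, the same kind of loop can build strings directly. This is a sketch with shortened ranges and names of my own choosing (ranges, emoji, line); it uses optional binding instead of a force unwrap so that a range that happens to include an invalid value does not crash.

```swift
// Shortened example ranges, typed as UInt32 so the failable initializer is used.
let ranges: [ClosedRange<UInt32>] = [
    0x1F600...0x1F603,
    0x1F645...0x1F647
]

var emoji: [String] = []
for range in ranges {
    for value in range {
        // Skip any value that is not a valid Unicode scalar.
        if let scalar = Unicode.Scalar(value) {
            emoji.append(String(scalar))
        }
    }
}

// Join the individual emoji into one displayable string.
let line = emoji.joined()
print(emoji.count)  // 7
```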