Swift and WordPress API: the WordPress API escapes some characters to Unicode

I'm copy-pasting two post titles returned by the WordPress API:

Haydarpaşa&#8217;da ortaya çıktı! Tam 1700 yıllık&#8230;

Pakistan&#8217;da terör saldırısı

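I created structs for categories, posts, and other things and made them Decodable, but they don't handle these escaped characters. Here is an example: the struct I created for categories. (The struct for posts is too large, so I'm sharing the category struct instead. They are all built on the same idea.)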

struct WPCategory: Decodable {

  let id: Int
  let count: Int
  let description: String
  let link: URL
  let name: String
  let slug: String
  let taxonomy: WPCategoryTaxonomy
  let parent: Int

  enum WPCategoryTaxonomy: String, Codable {
    case category, postTag = "post_tag", navMenu = "nav_menu", linkCategory = "link_category", postFormat = "post_format"
  }

  enum CodingKeys: String, CodingKey {
    case id, count, description, link, name, slug, taxonomy, parent, meta
  }

  init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)

    id = try container.decode(Int.self, forKey: .id)
    count = try container.decode(Int.self, forKey: .count)
    description = try container.decode(String.self, forKey: .description)
    let linkString  = try container.decode(String.self, forKey: .link)
    guard let link = URL.init(string: linkString) else {
      throw WPAPIError.urlToStringFailed
    }
    self.link = link
    name = try container.decode(String.self, forKey: .name)
    slug = try container.decode(String.self, forKey: .slug)
    taxonomy = try container.decode(WPCategoryTaxonomy.self, forKey: .taxonomy)
    parent = try container.decode(Int.self, forKey: .parent)
  }
}

I'm fetching the data with Alamofire:

  func getCategories(page: Int = 1, onCompletion completionHandler: @escaping (_ categories: [WPCategory]?, _ totalPages: Int?, _ error: Error?) -> Void) {
    let request = alamofire.request(categoriesURL, method: .get, parameters: ["page": page, "per_page": 100, "exclude":"117"], encoding: URLEncoding.httpBody).validate()
    request.responseData  { (response) in
      switch response.result {
      case .success(let result):
        guard let total = response.response?.allHeaderFields["x-wp-totalpages"] as? String else {
          completionHandler(nil, nil, WPAPIError.couldNotFetchTotalHeader)
          return
        }

        do {
          let categories = try JSONDecoder.init().decode([WPCategory].self, from: result)
          completionHandler(categories, Int(total), nil)
        } catch(let err) {
          completionHandler(nil, nil, err)
        }

      case .failure(let error):
        completionHandler(nil, nil, error)
      }
    }
  }

So, how can I handle these Unicode characters? Any ideas? Thanks.

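As @OOper suggested, you should update the title and text of the question where they say "Unicode". Swift strings are Unicode-based (see the documentation linked below), so the Swift standard library and Apple's frameworks handle Unicode correctly. The title as written is therefore misleading: sequences such as &#8217; are HTML numeric character references, not Unicode characters.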

https://docs.swift.org/swift-book/LanguageGuide/StringsAndCharacters.html
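To make the distinction concrete: the digits inside a reference such as &#8217; are a decimal Unicode code point (8217 is U+2019, the right single quotation mark). A minimal sketch of turning such a number into a Swift string follows; the helper name `string(fromCodePoint:)` is made up for this illustration and is not part of any library.

```swift
// Hypothetical helper: converts a code point taken from a numeric character
// reference into a one-character string, or nil if the value is not a valid
// Unicode scalar (for example a surrogate such as 0xD800).
func string(fromCodePoint codePoint: UInt32) -> String? {
    guard let scalar = Unicode.Scalar(codePoint) else { return nil }
    return String(scalar)
}

print(string(fromCodePoint: 8217) ?? "invalid")   // ’
print(string(fromCodePoint: 8230) ?? "invalid")   // …
```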

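If you need to handle escaped strings like these in Swift, you can convert them to ordinary Unicode strings.

For example, enter the following code in a Swift playground: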

import Foundation


func convert(escapedString: String) -> String {
    // Matches decimal numeric character references such as &#8217;
    guard let regex = try? NSRegularExpression(pattern: "&#([0-9]+);",
                                               options: []) else { return escapedString }

    let escapedNSString = escapedString as NSString
    // NSRange counts UTF-16 code units, so use the NSString length, not String.count.
    let matches = regex.matches(in: escapedString,
                                options: [],
                                range: NSRange(location: 0, length: escapedNSString.length))
    var convertedString = escapedNSString

    // Replace from the end so the ranges of earlier matches stay valid.
    for match in matches.reversed() {
        let matchString = escapedNSString.substring(with: match.range(at: 1))
        let replacement: String
        if let code = UInt32(matchString), let unicode = Unicode.Scalar(code) {
            replacement = String(unicode)
        } else {
            // Not a valid Unicode scalar value, or too large to parse.
            replacement = "?"
        }
        convertedString = convertedString.replacingCharacters(in: match.range, with: replacement) as NSString
    }
    return convertedString as String
}


let str1 = "Haydarpaşa&#8217;da ortaya çıktı! Tam 1700 yıllık&#8230;"
print(convert(escapedString: str1))
let str2 = "Pakistan&#8217;da terör saldırısı"
print(convert(escapedString: str2))

You will get this result:

Haydarpaşa’da ortaya çıktı! Tam 1700 yıllık…
Pakistan’da terör saldırısı
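HTML also allows hexadecimal references such as &#x2019;, which a decimal-only pattern does not match. The following is a sketch of one way to cover both forms, not a complete HTML unescaper: the method name `decodingNumericCharacterReferences()` is invented for this example, named entities such as &amp; are deliberately left untouched, and invalid references are kept verbatim rather than replaced.

```swift
import Foundation

extension String {
    /// Replaces decimal (&#8217;) and hexadecimal (&#x2019;) numeric character
    /// references with the characters they stand for.
    func decodingNumericCharacterReferences() -> String {
        guard let regex = try? NSRegularExpression(pattern: "&#(x[0-9a-f]+|[0-9]+);",
                                                   options: [.caseInsensitive]) else { return self }
        let matches = regex.matches(in: self, options: [],
                                    range: NSRange(startIndex..., in: self))
        var result = ""
        var cursor = startIndex
        for match in matches {
            guard let whole = Range(match.range, in: self),
                  let tokenRange = Range(match.range(at: 1), in: self) else { continue }
            // Copy the untouched text between the previous match and this one.
            result.append(contentsOf: self[cursor..<whole.lowerBound])
            let token = self[tokenRange]
            let isHex = token.hasPrefix("x") || token.hasPrefix("X")
            let digits = isHex ? token.dropFirst() : token
            if let code = UInt32(digits, radix: isHex ? 16 : 10),
               let scalar = Unicode.Scalar(code) {
                result.unicodeScalars.append(scalar)
            } else {
                // Out of range or not a valid scalar: keep the reference as written.
                result.append(contentsOf: self[whole])
            }
            cursor = whole.upperBound
        }
        result.append(contentsOf: self[cursor...])
        return result
    }
}

print("Tam 1700 yıllık&#8230;".decodingNumericCharacterReferences())        // Tam 1700 yıllık…
print("Pakistan&#x2019;da &amp; more".decodingNumericCharacterReferences()) // Pakistan’da &amp; more
```

Building the result front to back avoids the NSString round trip and keeps all index arithmetic on the original string.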

Use this extension I wrote for it:

extension String {
    func htmlDocument() throws -> String {
        let data = self.data(using: .unicode)
        let options: [NSAttributedString.DocumentReadingOptionKey: NSAttributedString.DocumentType] = [.documentType : .html]
        return try NSAttributedString(data: data!, options: options, documentAttributes: nil).string
    }
}

You can then use it in your decoder, for example:

...
        name = try container.decode(String.self, forKey: .name).htmlDocument()
...
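If many string fields need the same treatment, calling a conversion by hand in every init(from:) gets repetitive. One possible alternative, sketched here under assumptions, is a property wrapper that unescapes while decoding. `EntityDecoded`, `CategorySummary`, and `unescapeDecimalReferences` are all hypothetical names for this illustration. The conversion is a small decimal-only stand-in that could be swapped for any other unescaping routine; it is used here partly because NSAttributedString's HTML importer is comparatively heavy and, as far as I know, documented as unsuitable for background threads.

```swift
import Foundation

/// Stand-in conversion for this sketch: decodes decimal references such as &#8217;.
func unescapeDecimalReferences(_ text: String) -> String {
    var output = ""
    var rest = Substring(text)
    while let open = rest.range(of: "&#") {
        output.append(contentsOf: rest[..<open.lowerBound])
        let afterOpen = rest[open.upperBound...]
        if let close = afterOpen.firstIndex(of: ";"),
           let code = UInt32(afterOpen[..<close]),
           let scalar = Unicode.Scalar(code) {
            output.unicodeScalars.append(scalar)
            rest = afterOpen[afterOpen.index(after: close)...]
        } else {
            output.append("&#")   // not a valid reference, keep it verbatim
            rest = afterOpen
        }
    }
    output.append(contentsOf: rest)
    return output
}

/// Hypothetical wrapper: decodes a JSON string and unescapes it in one step.
@propertyWrapper
struct EntityDecoded: Decodable {
    var wrappedValue: String

    init(from decoder: Decoder) throws {
        let raw = try decoder.singleValueContainer().decode(String.self)
        wrappedValue = unescapeDecimalReferences(raw)
    }
}

/// Hypothetical, trimmed-down model; Decodable conformance is synthesized.
struct CategorySummary: Decodable {
    let id: Int
    @EntityDecoded var name: String
}

let sampleJSON = Data(#"[{"id": 1, "name": "Pakistan&#8217;da"}]"#.utf8)
let decoded = (try? JSONDecoder().decode([CategorySummary].self, from: sampleJSON)) ?? []
print(decoded.first?.name ?? "decoding failed")   // Pakistan’da
```

The trade-off is that the escaped form is no longer available after decoding, which is usually fine for display-only fields such as titles and names.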