Compare non-English characters in a string

I need to compare non-English strings like the following:

Majsstärkelse unicode - Majsstärkelse

Majsstärkelse unicode - Majsstärkelse

If I compare them like this:

if('Majsstärkelse' === 'Majsstärkelse')

the comparison does not work for some characters. So I tried:

const collator = new Intl.Collator('de')
const order = collator.compare('Ü', 'ß')
console.log(order)

But I still don't get the right result. How can I achieve this?

You can use String.prototype.normalize to normalize canonically equivalent strings to the same form before comparing them.

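// Same visible text, different code points
const a = 'Majsst\u{00E4}rkelse'   // precomposed ä (U+00E4)
const b = 'Majssta\u{0308}rkelse'  // a followed by combining diaeresis (U+0308)
console.log(a, b)
console.log(a === b)  // false
console.log(a.normalize('NFC') === b.normalize('NFC'))  // true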
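To see why === fails here, it can help to print the code points of both strings. Below is a minimal sketch; codePoints and equalsNormalized are hypothetical helper names, not built-ins, and the 'sv' locale is an assumption based on the word being Swedish. It also shows that Intl.Collator, which the question tried, can be used for an equality check, since canonically equivalent strings are supposed to compare as 0.

```javascript
// Hypothetical helper: list the code points of a string as U+XXXX labels.
function codePoints(s) {
  return Array.from(s, ch =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

// Hypothetical helper: compare two strings after NFC normalization.
function equalsNormalized(x, y) {
  return x.normalize('NFC') === y.normalize('NFC');
}

const precomposed = 'Majsst\u{00E4}rkelse'; // ä as a single code point
const decomposed = 'Majssta\u{0308}rkelse'; // a + combining diaeresis

console.log(codePoints(precomposed).join(' ')); // contains U+00E4
console.log(codePoints(decomposed).join(' '));  // contains U+0061 U+0308
console.log(precomposed.length, decomposed.length); // 13 14
console.log(precomposed === decomposed);            // false
console.log(equalsNormalized(precomposed, decomposed)); // true

// Assumption: 'sv' is used only because the sample word is Swedish.
const collator = new Intl.Collator('sv');
console.log(collator.compare(precomposed, decomposed)); // 0 means "equal"
```

Note that collator.compare returns an ordering (negative, zero, or positive), so comparing 'Ü' with 'ß' as in the question only tells you which one sorts first, not whether two spellings of the same word are equal.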

Note: the strings you have are escaped. The code above compares unescaped strings.
Here is code that first decodes the numeric Unicode HTML entities:

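// Decode numeric HTML entities, decimal or hexadecimal (hex digits in either case)
const decodeUEntities = s => s.replace(/&#(x[\dA-F]+|\d+);/gi,
  (_, code) => String.fromCodePoint(
    code[0].toLowerCase() === 'x' ? parseInt(code.slice(1), 16) : +code))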

const str1 = decodeUEntities("Majsst&#228;rkelse")   // ä escaped as U+00E4
const str2 = decodeUEntities("Majssta&#776;rkelse")  // a followed by escaped U+0308

// Only numeric HTML entities are decoded above. If you want named HTML entities too,
// find a list of them and add them to the replacement code; for simplicity that is left out.
console.log(str1, str2, str1===str2)

console.log(str1.normalize('NFC'),str2.normalize('NFC'),
            str1.normalize('NFC')===str2.normalize('NFC'))
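
The two steps above (decode, then normalize) can be wrapped into one comparison function. This is only a sketch: decodeNumericEntities and sameText are hypothetical names, the entity-encoded inputs are assumed examples, and named entities such as &auml; are not handled.

```javascript
// Hypothetical helper: decode numeric HTML entities (decimal or hex).
function decodeNumericEntities(text) {
  return text.replace(/&#(?:x([0-9a-f]+)|(\d+));/gi, (_, hex, dec) =>
    String.fromCodePoint(hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10)));
}

// Hypothetical helper: decode, normalize to NFC, then compare.
function sameText(x, y) {
  const canon = s => decodeNumericEntities(s).normalize('NFC');
  return canon(x) === canon(y);
}

// Assumed inputs: ä as one code point vs. "a" plus a combining diaeresis.
console.log(sameText('Majsst&#228;rkelse', 'Majssta&#776;rkelse'));  // true
console.log(sameText('Majsst&#xE4;rkelse', 'Majssta&#x308;rkelse')); // true
console.log(sameText('Majsst&#228;rkelse', 'Majsstarkelse'));        // false
```

String.fromCodePoint throws a RangeError for values above U+10FFFF, so input from untrusted sources may need a range check before decoding.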