Why are there holes in the Unicode table?
Given the Unicode table for this area, for example:
...
U+1D44E Dec:119886 MATHEMATICAL ITALIC SMALL A 𝑎
U+1D44F Dec:119887 MATHEMATICAL ITALIC SMALL B 𝑏
U+1D450 Dec:119888 MATHEMATICAL ITALIC SMALL C 𝑐
U+1D451 Dec:119889 MATHEMATICAL ITALIC SMALL D 𝑑
U+1D452 Dec:119890 MATHEMATICAL ITALIC SMALL E 𝑒
U+1D453 Dec:119891 MATHEMATICAL ITALIC SMALL F 𝑓
U+1D454 Dec:119892 MATHEMATICAL ITALIC SMALL G 𝑔
U+1D456 Dec:119894 MATHEMATICAL ITALIC SMALL I 𝑖 # what?!
U+1D457 Dec:119895 MATHEMATICAL ITALIC SMALL J 𝑗
U+1D458 Dec:119896 MATHEMATICAL ITALIC SMALL K 𝑘
U+1D459 Dec:119897 MATHEMATICAL ITALIC SMALL L 𝑙
U+1D45A Dec:119898 MATHEMATICAL ITALIC SMALL M 𝑚
U+1D45B Dec:119899 MATHEMATICAL ITALIC SMALL N 𝑛
U+1D45C Dec:119900 MATHEMATICAL ITALIC SMALL O 𝑜
...
I would naturally expect U+1D455 to be MATHEMATICAL ITALIC SMALL H, but it does not seem to be defined in any table I have looked at.
Why are there holes in the Unicode table? (Also at U+1D49D, U+1D53A, etc.)
Is there any way to fill them?
[Edit]: These links state:
The "holes" in the alphabetic ranges are filled by previously defined characters in the Letter like Symbols block shown below.
and
The Unicode Consortium adds new codepoints to the standard all the time. Visit their website to find out about pending codepoints and whether this one is in the pipe. The following table shows typical representations of how the codepoint would look, if it existed. This may help you when debugging, but is not of real use otherwise.
But I just... don't understand what they mean :\
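From the comments (thanks, guys), I understand that these holes exist because some characters had already been assigned in Unicode before the complete alphabets were added.
For example: before the U+1D4* MATHEMATICAL ITALIC SMALL * identifiers were defined, ℎ was already known in the table as
ℎ U+210E Dec:008462 PLANCK CONSTANT ℎ # here it is
So, to keep the numbering consistent without duplicating the ℎ id, a hole was left at position U+1D455.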
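To see the hole for yourself, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `describe` is just something I made up for illustration, and the exact output depends on the Unicode version bundled with your Python. The hole should show up with the general category `Cn` (unassigned) and no name, while its neighbours and U+210E have proper names.

```python
import unicodedata


def describe(cp: int) -> str:
    """Return code point, general category and name (or a placeholder)."""
    ch = chr(cp)
    # unicodedata.name() raises ValueError for unnamed code points
    # unless a default is supplied as the second argument.
    name = unicodedata.name(ch, "<unassigned>")
    return f"U+{cp:04X} {unicodedata.category(ch)} {name}"


# The neighbours of the hole, the hole itself, and the earlier character.
for cp in (0x1D454, 0x1D455, 0x1D456, 0x210E):
    print(describe(cp))
```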
Similarly, in the MATHEMATICAL SCRIPT CAPITAL letter family, ℬ is known as U+212C SCRIPT CAPITAL B rather than U+1D49D, which is reserved.
Likewise, ℂ from the MATHEMATICAL DOUBLE-STRUCK CAPITAL letter family is not U+1D53A, because it was already known as U+2102 DOUBLE-STRUCK CAPITAL C.
This was a tough choice that had to balance backward compatibility, consistency, and reliability :)
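As for "filling" the holes in my own code, one possible workaround is a small lookup table that redirects each hole to the character that was encoded earlier. This is only a sketch: `HOLES` and `styled_letter` are hypothetical names of mine, the table lists just the three holes mentioned in this post (there are more, and they would have to be added by hand), and the range starts in the usage lines are inferred from the hole positions above.

```python
import unicodedata

# Holes mentioned in this post, mapped to the earlier-encoded character.
# Assumption: this table is incomplete; other holes exist.
HOLES = {
    0x1D455: 0x210E,  # italic small h         -> PLANCK CONSTANT
    0x1D49D: 0x212C,  # script capital B       -> SCRIPT CAPITAL B
    0x1D53A: 0x2102,  # double-struck capital C -> DOUBLE-STRUCK CAPITAL C
}


def styled_letter(range_start: int, first_letter: str, letter: str) -> str:
    """Map an ASCII letter into a styled range, stepping around known holes.

    range_start is the code point that corresponds to first_letter
    ("a" or "A") in the styled alphabet.
    """
    cp = range_start + ord(letter) - ord(first_letter)
    return chr(HOLES.get(cp, cp))


italic = "".join(styled_letter(0x1D44E, "a", c) for c in "ghi")
script_b = styled_letter(0x1D49C, "A", "B")
double_c = styled_letter(0x1D538, "A", "C")
print(italic, script_b, double_c)

# Compatibility normalization folds the styled letters back to plain ASCII.
print(unicodedata.normalize("NFKC", italic))
```

An explicit table seemed safer to me than guessing the replacement automatically, since several letterlike symbols share the same base letter.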