Why are there holes in the Unicode table?

Question

Given the Unicode table for this area, for example:

  ...
    U+1D44E Dec:119886       MATHEMATICAL ITALIC SMALL A &#x1D44E;
    U+1D44F Dec:119887       MATHEMATICAL ITALIC SMALL B &#x1D44F;
    U+1D450 Dec:119888       MATHEMATICAL ITALIC SMALL C &#x1D450;
    U+1D451 Dec:119889       MATHEMATICAL ITALIC SMALL D &#x1D451;
    U+1D452 Dec:119890       MATHEMATICAL ITALIC SMALL E &#x1D452;
    U+1D453 Dec:119891       MATHEMATICAL ITALIC SMALL F &#x1D453;
    U+1D454 Dec:119892       MATHEMATICAL ITALIC SMALL G &#x1D454;
    U+1D456 Dec:119894       MATHEMATICAL ITALIC SMALL I &#x1D456; # what?!
    U+1D457 Dec:119895       MATHEMATICAL ITALIC SMALL J &#x1D457;
    U+1D458 Dec:119896       MATHEMATICAL ITALIC SMALL K &#x1D458;
    U+1D459 Dec:119897       MATHEMATICAL ITALIC SMALL L &#x1D459;
    U+1D45A Dec:119898       MATHEMATICAL ITALIC SMALL M &#x1D45A;
    U+1D45B Dec:119899       MATHEMATICAL ITALIC SMALL N &#x1D45B;
    U+1D45C Dec:119900       MATHEMATICAL ITALIC SMALL O &#x1D45C;
  ...

I would naturally expect U+1D455 to be MATHEMATICAL ITALIC SMALL H, but it does not seem to be defined in any table I have looked at.

Why are there holes in the Unicode table? (The same happens at U+1D49D, U+1D53A, etc.)
Is there any way to fill them?

[Edit]: These links state:

The "holes" in the alphabetic ranges are filled by previously defined characters in the Letterlike Symbols block shown below.

and

The Unicode Consortium adds new codepoints to the standard all the time. Visit their website to find out about pending codepoints and whether this one is in the pipe. The following table shows typical representations of how the codepoint would look, if it existed. This may help you when debugging, but is not of real use otherwise.

But I just... don't understand what they mean :\

Answer 1

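From the comments (thanks, guys), I learned that these holes exist because some of the characters had already been assigned elsewhere in Unicode by the time the complete alphabets were added.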

For example: before the U+1D4xx MATHEMATICAL ITALIC SMALL * code points were defined, ℎ was already present in the table as

ℎ    U+210E Dec:008462        PLANCK CONSTANT &planckh; # here it is

So, to keep the numbering of the alphabet consistent without encoding ℎ a second time, a hole was left at position U+1D455.
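The ℎ case can be checked from Python's standard `unicodedata` module. This is only a minimal sketch: the helper name `describe` and the `<unassigned>` placeholder are mine, and the result depends on the Unicode version bundled with your Python build.

```python
import unicodedata

def describe(cp: int) -> str:
    """Return 'U+XXXX NAME', or a placeholder when the code point has no name."""
    # unicodedata.name() returns the default for unassigned code points.
    name = unicodedata.name(chr(cp), "<unassigned>")
    return f"U+{cp:04X} {name}"

# G, the hole where H would be, I, and the pre-existing Planck constant.
for cp in (0x1D454, 0x1D455, 0x1D456, 0x210E):
    print(describe(cp))
```

The hole at U+1D455 should be reported as having no name, while U+210E carries the name PLANCK CONSTANT.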

Similarly, ℬ in the MATHEMATICAL SCRIPT CAPITAL letter family is encoded as U+212C SCRIPT CAPITAL B rather than at U+1D49D, which is left reserved.

Likewise, ℂ from the MATHEMATICAL DOUBLE-STRUCK CAPITAL letter family is not at U+1D53A, because it already exists as U+2102 DOUBLE-STRUCK CAPITAL C.
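As for "filling" the holes in your own code, one possible approach is to compute the expected position in the alphabet and then redirect any hole to the character that was encoded earlier. The sketch below is an illustration, not an official mapping: `HOLE_FILLERS` and `styled_letter` are hypothetical names, and the table lists only the three pairs mentioned above (the real blocks contain more holes).

```python
import unicodedata

# Hole -> previously encoded Letterlike Symbols character.
# Only the three pairs discussed in this answer; not a complete list.
HOLE_FILLERS = {
    0x1D455: 0x210E,  # italic small h        -> PLANCK CONSTANT
    0x1D49D: 0x212C,  # script capital B      -> SCRIPT CAPITAL B
    0x1D53A: 0x2102,  # double-struck capital C -> DOUBLE-STRUCK CAPITAL C
}

def styled_letter(alphabet_start: int, letter: str) -> str:
    """Map an ASCII letter to a styled alphabet that starts at alphabet_start."""
    first = "A" if letter.isupper() else "a"
    cp = alphabet_start + (ord(letter) - ord(first))
    # If the computed position is a hole, use the older character instead.
    return chr(HOLE_FILLERS.get(cp, cp))

ITALIC_SMALL_A = 0x1D44E  # MATHEMATICAL ITALIC SMALL A
word = "".join(styled_letter(ITALIC_SMALL_A, c) for c in "ghi")
print(word)
# Compatibility normalization maps the styled letters back to plain ASCII.
print(unicodedata.normalize("NFKC", word))
```

Both the math-alphabet letters and their Letterlike Symbols stand-ins have compatibility decompositions to the plain letter, which is why NFKC normalization recovers the original text.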

It was a tough choice that had to balance backward compatibility, consistency, and reliability :)

unicode

standards

utf-8

character-encoding