What is the purpose of __ctype_b_loc?

Question

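I am trying to understand a piece of code that uses __ctype_b_loc(), but I don't know what this function is for.

So far I have found that it is declared in ctype.h. I have also found its prototype and its implementation, but I still don't know what the function does.

Can someone enlighten me?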

Answer 1

After a fair amount of research, I think I can answer this question myself.

`unsigned short int** __ctype_b_loc (void)`

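is a function that returns a pointer to a "traits" table, which holds a set of flags describing the characteristics of each character.

Here is the enum with the flags:

From `ctype.h`: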

enum
{
  _ISupper = _ISbit (0),        /* UPPERCASE.  */
  _ISlower = _ISbit (1),        /* lowercase.  */
  _ISalpha = _ISbit (2),        /* Alphabetic.  */
  _ISdigit = _ISbit (3),        /* Numeric.  */
  _ISxdigit = _ISbit (4),       /* Hexadecimal numeric.  */
  _ISspace = _ISbit (5),        /* Whitespace.  */
  _ISprint = _ISbit (6),        /* Printing.  */
  _ISgraph = _ISbit (7),        /* Graphical.  */
  _ISblank = _ISbit (8),        /* Blank (usually SPC and TAB).  */
  _IScntrl = _ISbit (9),        /* Control character.  */
  _ISpunct = _ISbit (10),       /* Punctuation.  */
  _ISalnum = _ISbit (11)        /* Alphanumeric.  */
};

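For example, if you look up the character with ASCII code 0x30 ('0') in the table that __ctype_b_loc() returns, you get 0x08d8.

0x08d8 = 0000 1000 1101 1000 (Alphanumeric, Graphical, Printing, Hexadecimal, Numeric)

Note that this reads the two bytes in the order they sit in memory (08 d8). On a little-endian machine the same entry loads as the `unsigned short` value 0xd808, because glibc's `_ISbit` macro swaps the two bytes there.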

The table depends on the locale in effect on the machine, so this example may not match the results on your system.
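To make the lookup concrete, here is a minimal sketch that reads one entry and names the flags that are set. It assumes glibc (`__ctype_b_loc()` and the `_IS*` constants are glibc internals, not standard C); `traits_of` and `describe_traits` are hypothetical helper names, not part of any library.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return the raw trait word for c from the current locale's table.
   glibc-specific: __ctype_b_loc() is not part of standard C. */
unsigned short traits_of(int c)
{
    assert(c >= -128 && c <= 255);           /* only valid index range */
    const unsigned short *table = *__ctype_b_loc();
    return table[c];
}

/* Hypothetical helper: write the names of the set flags into buf,
   separated by single spaces, in the order of the enum in ctype.h. */
void describe_traits(unsigned short t, char *buf, size_t size)
{
    static const struct { unsigned short mask; const char *name; } flags[] = {
        { _ISupper,  "upper"  }, { _ISlower, "lower" }, { _ISalpha, "alpha" },
        { _ISdigit,  "digit"  }, { _ISxdigit, "xdigit" }, { _ISspace, "space" },
        { _ISprint,  "print"  }, { _ISgraph, "graph" }, { _ISblank, "blank" },
        { _IScntrl,  "cntrl"  }, { _ISpunct, "punct" }, { _ISalnum, "alnum" },
    };

    if (size == 0)
        return;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if (t & flags[i].mask) {
            if (buf[0] != '\0')
                strncat(buf, " ", size - strlen(buf) - 1);
            strncat(buf, flags[i].name, size - strlen(buf) - 1);
        }
    }
}
```

Testing against the named `_IS*` masks rather than hard-coded bit positions keeps the sketch independent of byte order. In the default "C" locale, describing `traits_of('0')` should list the digit, xdigit, print, graph and alnum flags.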

Answer 2

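Alessandro's own answer is very helpful, but I would like to add some information.

As Alessandro said, the __ctype_b_loc(void) function gives access to an array in which each element holds the traits of one ASCII character. For example, by looking it up in the table we can learn that the character 'A' is uppercase, hexadecimal, graphical, printing and alphanumeric.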

To be precise, __ctype_b_loc() returns a const unsigned short int**: a pointer to a pointer into an array of 384 unsigned short int entries. There are 384 elements because the table can be indexed by:

any unsigned char value [0,255] (so 256 elements)
EOF (-1)
any signed char value [-128,-1) (so 127 elements)
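The three ranges add up to 256 + 1 + 127 = 384, which means the pointer you get back does not point at the start of the array but 128 entries into it, so negative indices are valid. A minimal sketch of this, assuming glibc; `table_entry_count`, `table_entry` and `count_with_flag` are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

enum { TABLE_MIN_INDEX = -128, TABLE_MAX_INDEX = 255 };

/* Number of entries reachable through the pointer: 127 + 1 + 256. */
int table_entry_count(void)
{
    return TABLE_MAX_INDEX - TABLE_MIN_INDEX + 1;
}

/* Read one entry; any index in [-128, 255] is valid, including EOF. */
unsigned short table_entry(int index)
{
    assert(index >= TABLE_MIN_INDEX && index <= TABLE_MAX_INDEX);
    return (*__ctype_b_loc())[index];
}

/* Count how many of the 384 indices have the given flag set. */
int count_with_flag(unsigned short mask)
{
    int n = 0;
    for (int i = TABLE_MIN_INDEX; i <= TABLE_MAX_INDEX; i++) {
        if (table_entry(i) & mask)
            n++;
    }
    return n;
}
```

Walking the full range this way is safe only because of the 128-entry offset. In the default "C" locale, `count_with_flag(_ISdigit)` should come out as 10, and the entry at index EOF carries none of the classification flags.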

This table is used by the following functions:

isupper
islower
isalpha
isdigit
...

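However, these functions are also defined as macros, so you will typically not see them being called in the assembly code. What you will see instead is a call to __ctype_b_loc() to get the table, some code that retrieves the right entry, and a bitmask that tests whether the property being checked is set. For example, to check whether a character is uppercase, we have to test whether bit 0 (_ISupper) is set.

Here is the assembly code generated for the call isupper('A');: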

call sym.imp.__ctype_b_loc  ; isupper('A');
mov rax, qword [rax]        ; get the pointer to the array of 'unsigned short int'
movsx rdx, byte 0x41        ; prepare to look up character 'A'
add rdx, rdx                ; each entry is 2 bytes, so we double the value of 'A'
add rax, rdx                ; address of the entry for 'A' in the table
movzx eax, word [rax]       ; get the 'unsigned short int' containing the properties
movzx eax, ax               ; zero-extend the 16-bit entry
and eax, 0x100              ; _ISupper = _ISbit(0): 0x0100 on little-endian, 0x0001 on big-endian
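For comparison, the assembly above corresponds roughly to the following C. This is a sketch of what the macro boils down to on glibc, not the literal header source; `my_isupper` is a hypothetical name.

```c
#include <ctype.h>

/* Roughly what isupper(c) does on glibc: fetch the table, index it
   with the character, and mask out the _ISupper flag. */
int my_isupper(int c)
{
    const unsigned short **table_loc = __ctype_b_loc(); /* call __ctype_b_loc   */
    const unsigned short *table = *table_loc;           /* mov rax, qword [rax] */
    unsigned short traits = table[c];                   /* index, load the word */
    return traits & _ISupper;                           /* and eax, 0x100       */
}
```

Like the real isupper(), this returns some nonzero value rather than exactly 1 when the flag is set, so callers should compare the result against 0.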
