Siblings of `struct task_struct current` always include a process with `pid = 0`

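I am hacking on the Linux kernel and playing with the siblings and children of `struct task_struct current`.

When I print the pid and command name of the siblings, a malformed process with `pid = 0` shows up, and its command name is garbage.

The same thing happens for the process's parents.

Why does a process with `pid = 0` appear among the siblings? Isn't that pid reserved for swapper?

Code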

// Loop over process and parents using something like:

/*
printk("--syscall ## Begin process results ##"); 

printk("--syscall // View children //");
my_list_head = &(current->children);

printk("--syscall // View siblings //");
my_list_head = &(current->sibling);

printk("--syscall results:  ...");
*/


if (my_list_head == NULL) {
  return 0;
}

list_for_each(tempNode, my_list_head) {
  tempTask = list_entry(tempNode, struct task_struct,
          sibling);
  printk("--syscall The %ld-th process's pid is %d and command %s",
         count, tempTask->pid, tempTask->comm);
 }

Output

Formatted with spacing

[ 2938.994084] --syscall ## Begin process results ##
[ 2938.994089] --syscall // View children //
[ 2938.994105] --syscall // View siblings //
[ 2938.994116] --syscall The 1-th process's pid is 0 and command \x80ݶE\x96\xff\xff
[ 2938.994133] --syscall results: pid=1400 name=process_ancesto state=0 uid=1000 nvcsw=1 nivcsw=0 num_children=0 num_siblings=1

[ 2938.994139] --syscall ## Begin process results ##
[ 2938.994144] --syscall // View children //
[ 2938.994149] --syscall The 1-th process's pid is 1400 and command process_ancesto
[ 2938.994158] --syscall // View siblings //
[ 2938.994163] --syscall The 1-th process's pid is 0 and command
[ 2938.994176] --syscall results: pid=1282 name=bash state=1 uid=1000 nvcsw=88 nivcsw=18 num_children=1 num_siblings=1

[ 2938.994180] --syscall ## Begin process results ##
[ 2938.994185] --syscall // View children //
[ 2938.994190] --syscall The 1-th process's pid is 1282 and command bash
[ 2938.994198] --syscall // View siblings //
[ 2938.994203] --syscall The 1-th process's pid is 1275 and command systemd
[ 2938.994210] --syscall The 2-th process's pid is 0 and command
[ 2938.994216] --syscall The 3-th process's pid is 117 and command systemd-journal
[ 2938.994222] --syscall The 4-th process's pid is 145 and command systemd-udevd
[ 2938.994227] --syscall The 5-th process's pid is 148 and command systemd-network
[ 2938.994233] --syscall The 6-th process's pid is 369 and command systemd-resolve
[ 2938.994239] --syscall The 7-th process's pid is 370 and command systemd-timesyn
[ 2938.994245] --syscall The 8-th process's pid is 412 and command accounts-daemon
[ 2938.994321] --syscall The 9-th process's pid is 413 and command dbus-daemon
[ 2938.994336] --syscall The 10-th process's pid is 417 and command irqbalance
[ 2938.994346] --syscall The 11-th process's pid is 418 and command rsyslogd
[ 2938.994352] --syscall The 12-th process's pid is 419 and command snapd
[ 2938.994359] --syscall The 13-th process's pid is 420 and command systemd-logind
[ 2938.994365] --syscall The 14-th process's pid is 439 and command cron
[ 2938.994372] --syscall The 15-th process's pid is 451 and command atd
[ 2938.994378] --syscall The 16-th process's pid is 456 and command agetty
[ 2938.994385] --syscall The 17-th process's pid is 461 and command sshd
[ 2938.994390] --syscall The 18-th process's pid is 491 and command unattended-upgr
[ 2938.994397] --syscall The 19-th process's pid is 501 and command polkitd
[ 2938.994413] --syscall results: pid=1200 name=login state=1 uid=0 nvcsw=31 nivcsw=33 num_children=1 num_siblings=19

Here is an illustration of how two sibling child processes are linked into their parent's list of children:

     PARENT              CHILD 1             CHILD 2
     ======              =======             =======

                         task_struct         task_struct
                        +-------------+     +-------------+
                        |             |     |             |
     task_struct        ~             ~     ~             ~
    +-------------+     |             |     |             |
    |             |     |-------------|     |-------------|
    ~             ~     | children    |     | children    |
    |             |     |             |     |             |
. . |-------------| . . |-------------| . . |-------------| . .
    | children    |     | sibling     |     | sibling     |
X==>| prev | next |<===>| prev | next |<===>| prev | next |<==X
. . |-------------| . . |-------------| . . |-------------| . .
    | sibling     |     |             |     |             |
    |             |     ~             ~     ~             ~
    |-------------|     |             |     |             |
    |             |     +-------------+     +-------------+
    ~             ~
    |             |     'X's are joined together, making
    +-------------+     a doubly linked, circular list.

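Although `children` and `sibling` are both of type `struct list_head`, `children` is used as an actual list head (linking to the task's list of child processes), whereas `sibling` is used as a list entry.

The parent's `children.next` link points to child 1's `sibling` member, child 1's `sibling.next` link points to child 2's `sibling` member, and child 2's `sibling.next` link points back to the parent's `children` member (the list head). Similarly, the parent's `children.prev` link points to child 2's `sibling` member, child 2's `sibling.prev` link points to child 1's `sibling` member, and child 1's `sibling.prev` link points to the parent's `children` member.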

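The `list_for_each(pos, head)` macro visits each node `pos` in the list, starting at `head->next`, while `pos != head`.

Normally, the `head` parameter of `list_for_each(pos, head)` should be an actual list head, but the macro has no way to tell a list head from a list entry. They are the same type, and all the nodes are linked together in a circle. (The whole list consists of one list head and zero or more list entries linked in a ring. For an empty list, the list head just links back to itself.) The `list_for_each` macro simply walks the doubly linked list until it gets back to where it started.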

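If `list_for_each(pos, head)` is called with `head` pointing to the parent's `children` member, then `pos` points to child 1's `sibling` member on the first iteration and to child 2's `sibling` member on the second iteration, and the loop terminates with `pos` pointing back to the parent's `children` member. Inside the loop, `list_entry(pos, struct task_struct, sibling)` correctly points to the beginning of the child process's `struct task_struct`.

Suppose child 1 is the `current` process. OP's code calls `list_for_each(pos, head)` with `head` pointing to child 1's `sibling` member. Therefore `pos` points to child 2's `sibling` member on the first iteration and to the parent's `children` member on the second iteration, and the loop terminates with `pos` pointing back to child 1's `sibling` member. Inside the loop, `list_entry(pos, struct task_struct, sibling)` correctly points to the beginning of child 2's `struct task_struct` on the first iteration, but on the second iteration it points to somewhere before the beginning of the parent's `struct task_struct`. That is the problem with OP's code: the bogus "task" read from that address is what shows up as `pid = 0` with a garbage command name.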