ld: How does this ld script work?

Question

In his article on understanding the Linux Kernel Initcall Mechanism, Trevor created a userspace program that simulates the mechanism used to call a Linux driver's init_module().

#include <stdio.h>

typedef int (*initcall_t)(void);
extern initcall_t __initcall_start, __initcall_end;

#define __initcall(fn) \
        static initcall_t __initcall_##fn __init_call = fn
#define __init_call     __attribute__ ((unused,__section__ ("function_ptrs")))
#define module_init(x)  __initcall(x);

#define __init __attribute__ ((__section__ ("code_segment")))

static int __init
my_init1 (void)
{
        printf ("my_init () #1\n");
        return 0;
}

static int __init
my_init2 (void)
{
        printf ("my_init () #2\n");
        return 0;
}

module_init (my_init1);
module_init (my_init2);

void
do_initcalls (void)
{
        initcall_t *call_p;

        call_p = &__initcall_start;
        do {
                fprintf (stderr, "call_p: %p\n", call_p);
                (*call_p)();
                ++call_p;
        } while (call_p < &__initcall_end);
}

int
main (void)
{
        fprintf (stderr, "in main()\n");
        do_initcalls ();
        return 0;
}

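As you can see, __initcall_start and __initcall_end are not defined anywhere, so the linker complains and does not produce an executable. The solution was to customize the default linker script (generated by ld --verbose) by adding the following lines before the text section: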

__initcall_start = .;
function_ptrs : { *(function_ptrs) }
__initcall_end   = .;
code_segment    : { *(code_segment) }

Here is a snippet of the objdump -t output:

0000000000000618 g function_ptrs        0000000000000000         __initcall_end
0000000000000608 g .plt.got             0000000000000000         __initcall_start
0000000000000608 l O function_ptrs      0000000000000008      __initcall_my_init1
0000000000000610   O function_ptrs      0000000000000008      __initcall_my_init2
0000000000000618 l F code_segment       0000000000000017          my_init1

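I understand the mechanism. I just don't see how the linker understood that __initcall_start should point to the function_ptrs section, or how __initcall_end will point to the code_segment section either.

The way I see it, __initcall_start is assigned the value of the current output location, and then a function_ptrs section is defined that will point to the function_ptrs sections of the input files, but I can't see the link between __initcall_start and the function_ptrs section.

My question is: how does the linker understand that __initcall_start should point to function_ptrs?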

Answer 1

__initcall_start = .;
function_ptrs : { *(function_ptrs) }
__initcall_end   = .;
code_segment    : { *(code_segment) }

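This piece of linker script instructs the linker how to compose certain parts of an output file. It means:

Emit a symbol __initcall_start addressing the location counter (i.e. .).
Then emit a section called function_ptrs, composed of the concatenation of all input sections called function_ptrs (i.e. the function_ptrs segments from all input files).
Then emit a symbol __initcall_end, again addressing the location counter.
Then emit a section called code_segment, composed of the concatenation of all input sections called code_segment.

The function_ptrs section is the first storage laid out at the location addressed by __initcall_start. So __initcall_start is the address at which the linker starts the function_ptrs section. __initcall_end addresses the location immediately after the function_ptrs section. By the same token, it is the address at which the linker starts the code_segment section.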
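The ordering argument above can be sketched as a toy model in C. This is only an illustration of how the location counter advances through the four script lines, not how GNU ld is implemented. The names stmt, run_script and lookup are hypothetical, alignment is ignored, and the size given for code_segment is an assumption (the question shows only my_init1's 0x17 bytes).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of a linker script: a list of statements processed in order. */
enum kind { SYMBOL, SECTION };

struct stmt {
    enum kind   kind;
    const char *name;
    uint64_t    size;  /* bytes gathered from input sections; 0 for SYMBOL */
    uint64_t    addr;  /* filled in by run_script() */
};

/* Walk the statements once. `dot` plays the role of the location counter. */
static void run_script(struct stmt *s, size_t n, uint64_t dot)
{
    assert(s != NULL);
    for (size_t i = 0; i < n; ++i) {
        s[i].addr = dot;           /* symbol value, or section start address */
        if (s[i].kind == SECTION)
            dot += s[i].size;      /* only emitted bytes move the counter */
    }
}

/* Return the address recorded for `name`, or 0 if it is not in the script. */
static uint64_t lookup(const struct stmt *s, size_t n, const char *name)
{
    for (size_t i = 0; i < n; ++i)
        if (strcmp(s[i].name, name) == 0)
            return s[i].addr;
    return 0;
}

/* The four script lines from the question. function_ptrs holds two 8-byte
   pointers (per the objdump output); the code_segment size is assumed. */
static struct stmt script[] = {
    { SYMBOL,  "__initcall_start", 0,    0 },
    { SECTION, "function_ptrs",    16,   0 },
    { SYMBOL,  "__initcall_end",   0,    0 },
    { SECTION, "code_segment",     0x17, 0 },
};
```

Calling run_script(script, 4, 0x608) and then lookup() gives 0x608 for both __initcall_start and function_ptrs, and 0x618 for both __initcall_end and code_segment, matching the addresses in the objdump snippet. Nothing ties a symbol to a section except the order of the statements.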

The way I see it, __initcall_start is assigned the value of the current output location,...

You are thinking that:

    __initcall_start = .;

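causes the linker to create a symbol that is in some sense a pointer, and assigns the current location as the value of that pointer. A bit like this C code: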

void * ptr = &ptr;

The same thinking shows up here (emphasis mine):

I just don't see how the linker understood that __initcall_start should point to function_ptrs section or how the __initcall_end will point to the code_segment section either.

The linker has no concept of a pointer. It deals in symbols that symbolize addresses.

In the linker manual, under Assignment: Defining Symbols, you see:

You may create global symbols, and assign values (addresses) to global symbols, using any of the C assignment operators:

symbol = expression ;

...

This means that symbol is defined as a symbol for the address computed by expression. Likewise:

__initcall_start = .;

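means that __initcall_start is defined as a symbol for the address at the current location counter. It implies nothing about the type of that symbol, not even whether it is a data symbol or a function symbol. The type of a symbol S is a programming-language concept: it expresses how a program in that language may use the byte sequence whose address is symbolized by S.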

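A C program is free to declare any type it likes for an external symbol S that it uses, as long as the linkage provides that symbol. Whatever the type is, the program obtains the address symbolized by S with the expression &S.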
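As a small illustration of this point, here is a sketch that uses etext, edata and end, three symbols that the linker provides on Linux toolchains (documented in man 3 end). It assumes a Linux system with GCC or Clang and a standard linker script. The declared types are this program's own choice, and the helper names print_linker_symbols and check_layout are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* The linker defines these names as bare addresses. The types are chosen
   here, by the C program, and differ on purpose to show they are arbitrary. */
extern char etext;   /* first address past the program text */
extern int  edata;   /* first address past the initialized data */
extern char end[];   /* first address past the uninitialized data (bss) */

/* Print the address each symbol stands for: &S, or S itself for an array. */
static void print_linker_symbols(void)
{
    printf("etext: %p\n", (void *)&etext);
    printf("edata: %p\n", (void *)&edata);
    printf("end:   %p\n", (void *)end);
}

/* Sanity check, assuming the usual layout where bss follows data. */
static void check_layout(void)
{
    assert((uintptr_t)&edata <= (uintptr_t)end);
}
```

The program never reads through these symbols. It only takes their addresses, which is all the linker ever recorded for them.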

Your C program chooses to declare both __initcall_start and __initcall_end as being of type:

int (*initcall_t)(void);

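This makes good sense in a context where the program tells the linker to lay out the function_ptrs section between the addresses symbolized by __initcall_start and __initcall_end. That section comprises an array of pointers to functions of type int (void). So the type initcall_t, i.e. int (*)(void), is exactly right for traversing that array, as in: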

call_p = &__initcall_start;
do {
        fprintf (stderr, "call_p: %p\n", call_p);
        (*call_p)();
        ++call_p;
} while (call_p < &__initcall_end);
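For comparison, here is a sketch of the same traversal without a custom linker script. It is a different technique from the one in the question: GNU ld automatically provides __start_SECNAME and __stop_SECNAME symbols for any output section whose name is a valid C identifier, and that is the only linker feature assumed here. The example assumes Linux with GCC or Clang. The counter calls_made and the return value of do_initcalls are additions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

typedef int (*initcall_t)(void);

/* Place a pointer to fn in the "function_ptrs" section. `used` keeps the
   variable even though nothing refers to it by name. */
#define module_init(fn) \
        static initcall_t __initcall_##fn \
        __attribute__ ((used, section ("function_ptrs"))) = fn

/* Provided by the linker: the bounds of the function_ptrs output section.
   Declared as arrays so that the names themselves yield the addresses. */
extern initcall_t __start_function_ptrs[];
extern initcall_t __stop_function_ptrs[];

static int calls_made;  /* illustration only: counts initcalls that ran */

static int my_init1 (void) { puts ("my_init () #1"); ++calls_made; return 0; }
static int my_init2 (void) { puts ("my_init () #2"); ++calls_made; return 0; }

module_init (my_init1);
module_init (my_init2);

/* Call every registered function; return how many pointers were visited. */
static int do_initcalls (void)
{
        int n = 0;
        for (initcall_t *call_p = __start_function_ptrs;
             call_p < __stop_function_ptrs; ++call_p) {
                assert (*call_p != NULL);
                (*call_p) ();
                ++n;
        }
        return n;
}
```

The principle is the same as in the hand-written script: the two symbols are nothing more than the addresses at the start and just past the end of the section, and the C declarations decide how those addresses are used.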
