How would I translate the following assembly code from the compiler to C when working with structs?

Question

Suppose I define a new struct:

struct s {
   int *x;
   struct {
      short sh[2];
      int i;
   } w;
   struct s *next;
};

I also wrote a function to initialize it:

void init_s(struct s *ss) {
   ss->w.sh[1] = /* Line 1 */;
   ss->x = /* Line 2 */;
   ss->next = /* Line 3 */;
}

The compiler generates the following assembly code for init_s:

init_s:
   movw 8(%rdi), %ax
   movw %ax, 10(%rdi)
   leaq 12(%rdi), %rax
   movq %rax, (%rdi)
   movq %rdi, 16(%rdi)
   retq

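What I want to do is fill in the missing lines of init_s based on the assembly. I have figured out (or at least I think I have) Line 1 and Line 2. Line 1 should be ss->w.sh[0], and Line 2 should be &(ss->w.sh[2]). However, I am having trouble with Line 3. I think it would be &(ss->x) based on the assembly, but I feel this is incorrect and I am not sure why. Any feedback or suggestions that help me learn more about assembly and structs would be greatly appreciated.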

Answer 1

Line 1 should be ss->w.sh[0]

Agreed.

line 2 should be &(ss->w.sh[2])

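That is the right address, except that ss->w.sh has only 2 elements, so w.sh[2] is out of bounds. Offset 12 is really the address of the next member of the struct, so the line is ss->x = &(ss->w.i). That also makes more sense given that the ss->x member is an int * rather than a short *.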
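The offsets used in the assembly can be checked against the struct layout. The following is a minimal sketch, assuming a typical LP64 target such as x86-64 System V (8-byte pointers, 4-byte int, 2-byte short); member_offset and check_layout are hypothetical helper names, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

struct s {
   int *x;
   struct {
      short sh[2];
      int i;
   } w;
   struct s *next;
};

/* Byte distance from the start of *base to one of its members
   (hypothetical helper, introduced only for this illustration). */
static size_t member_offset(const struct s *base, const void *member) {
   return (size_t)((const char *)member - (const char *)base);
}

/* Aborts if this platform does not lay the struct out the way the
   assembly implies. Assumption: LP64 ABI such as x86-64 System V. */
void check_layout(void) {
   struct s n;
   assert(member_offset(&n, &n.x) == 0);        /* (%rdi)   */
   assert(member_offset(&n, &n.w.sh[0]) == 8);  /* 8(%rdi)  */
   assert(member_offset(&n, &n.w.sh[1]) == 10); /* 10(%rdi) */
   assert(member_offset(&n, &n.w.i) == 12);     /* 12(%rdi) */
   assert(member_offset(&n, &n.next) == 16);    /* 16(%rdi) */
}
```

Calling check_layout() once returns silently when the layout matches, which is what makes 12(%rdi) readable as &ss->w.i.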

However, I am having trouble with line 3. I think it would be &(ss->x) based on the assembly

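Similar issue: %rdi could indeed be a pointer to ss->x, but type-wise it makes no sense to assign &ss->x (type int **) to ss->next (type struct s *). You can also read %rdi as a pointer to the struct *ss itself, which is more sensible: ss->next = ss;. This creates a circular linked list with a single node whose next is the node itself.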
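Putting the three lines together, one plausible reconstruction of the function looks like the sketch below. The comments map each statement to the instructions it would compile to; is_self_loop is a hypothetical helper added only to make the single-node cycle easy to check.

```c
#include <assert.h>
#include <stddef.h>

struct s {
   int *x;
   struct {
      short sh[2];
      int i;
   } w;
   struct s *next;
};

/* One plausible source for the assembly in the question. */
void init_s(struct s *ss) {
   ss->w.sh[1] = ss->w.sh[0];  /* movw 8(%rdi), %ax ; movw %ax, 10(%rdi)  */
   ss->x = &ss->w.i;           /* leaq 12(%rdi), %rax ; movq %rax, (%rdi) */
   ss->next = ss;              /* movq %rdi, 16(%rdi)                      */
}

/* Returns 1 if the node's next pointer refers back to the node itself
   (hypothetical helper, not part of the original question). */
int is_self_loop(const struct s *ss) {
   assert(ss != NULL);
   return ss->next == ss;
}
```

After init_s(&node), node.x points at node.w.i, node.w.sh[1] holds a copy of node.w.sh[0], and is_self_loop(&node) returns 1.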

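The moral here is that C can offer several different ways to refer to the same address, all of which generate the same assembly, and you have to use common sense to guess which one the author more likely intended. In theory the author of the C code could have written ss->next = (struct s *)&(ss->x); as the third line (we cannot prove they did not), but ss->next = ss; is more sensible and therefore more likely.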
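The aliasing behind both guesses in the question can be demonstrated directly. This is a small sketch with hypothetical helper names; the second check assumes there is no padding between sh and i, which holds on common ABIs but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>

struct s {
   int *x;
   struct {
      short sh[2];
      int i;
   } w;
   struct s *next;
};

/* Returns 1 if &ss->x and ss are the same address. C guarantees this,
   because x is the first member of the struct. */
int x_aliases_struct(const struct s *ss) {
   assert(ss != NULL);
   return (const void *)&ss->x == (const void *)ss;
}

/* Returns 1 if the end of the sh array coincides with the start of i.
   Assumption: the compiler inserts no padding between sh and i. */
int sh_end_aliases_i(const struct s *ss) {
   assert(ss != NULL);
   const char *end_of_sh = (const char *)ss->w.sh + sizeof ss->w.sh;
   return end_of_sh == (const char *)&ss->w.i;
}
```

When both helpers return 1, the addresses alone cannot distinguish &(ss->w.sh[2]) from &(ss->w.i), or &(ss->x) from ss; only the types point to the intended source.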

This is why reverse engineering is as much an art as a science.

Tags: c, assembly, struct, pointers, x86-64