Trying to convert C to Assembly

Question

I have a struct in C:

struct struct1 {
    uint16_t a;
    uint32_t b;
    char * c;
    struct struct2* d;
};

How do I define the same structure in NASM? I tried this:

struc struct1
  .a resw
  .b resdw
  .c ???? ; what should be here?
  .d ???? ; what should be here?   
endstruc

How should I do this?

Answer 1

Is this some kind of exam question, or a real-world C struct?

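In the real world it will probably be padded to align its members, so .b may be at offset 4 or 8 (or more, depending on the compile-time padding settings) rather than 2.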
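
To see what padding a given compiler actually inserts, a small C sketch along these lines can help. The helper names padding_after_a and print_layout are made up for illustration, and struct2 is left opaque because only a pointer to it is stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct struct2;  /* left opaque: only a pointer to it is stored */

struct struct1 {
    uint16_t a;
    uint32_t b;
    char *c;
    struct struct2 *d;
};

/* C guarantees that the first member sits at offset 0. */
static_assert(offsetof(struct struct1, a) == 0, "first member starts the struct");

/* Hypothetical helper: bytes of padding the compiler placed between "a" and "b". */
size_t padding_after_a(void)
{
    return offsetof(struct struct1, b) - sizeof(uint16_t);
}

/* Hypothetical helper: print every member offset and the total size
   for the current target; returns the number of characters printed. */
int print_layout(void)
{
    return printf("a=%zu b=%zu c=%zu d=%zu size=%zu\n",
                  offsetof(struct struct1, a),
                  offsetof(struct struct1, b),
                  offsetof(struct struct1, c),
                  offsetof(struct struct1, d),
                  sizeof(struct struct1));
}
```

With default alignment on a typical x86 target, padding_after_a() would be expected to report 2, which corresponds to .b landing at offset 4.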

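When doing real C<->asm interop, the first step is to use "padding/packing" pragmas or compile-time switches so that the struct always compiles to the same layout in the C binary.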
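
One way to pin the layout from the C side is the "#pragma pack" directive, which GCC, Clang and MSVC all accept. The sketch below is only an illustration: the type name struct1_packed and the helper packed_size are assumptions, and the static assertions spell out the layout the assembly side would then have to match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct struct2;  /* left opaque: only a pointer to it is stored */

#pragma pack(push, 1)          /* no padding between members from here on */
struct struct1_packed {        /* hypothetical name for the packed variant */
    uint16_t a;
    uint32_t b;
    char *c;
    struct struct2 *d;
};
#pragma pack(pop)              /* restore the previous packing setting */

/* Fail the build if the compiler does not produce the packed layout. */
static_assert(offsetof(struct struct1_packed, b) == 2,
              "b must directly follow a");
static_assert(offsetof(struct struct1_packed, c) == 6,
              "c must directly follow b");
static_assert(sizeof(struct struct1_packed)
                  == 6 + sizeof(char *) + sizeof(struct struct2 *),
              "no padding expected anywhere");

/* Hypothetical helper: size of the packed struct on this target. */
size_t packed_size(void)
{
    return sizeof(struct struct1_packed);
}
```

Keeping the assertions next to the struct means a changed compiler switch breaks the build instead of silently shifting the offsets the assembly code relies on.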

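Then maybe pad/align by hand. For example, I would put "a" last and "c" and "d" first, so the order in memory would be "c, d, b, a". I would consider that aligned well enough even for a 64b target in "packed" mode; the resulting offsets would be [0, 8, 16, 20] and the size would be 22 bytes. (Edit: I would add another word at the end just to pad it to 24B if I knew I was going to use many of them in an array.)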
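
The reordering described above could look roughly like this in C. The names struct1_reordered and tail_pad and both helpers are made up for this sketch; the idea is that with the largest members first and an explicit trailing word, the compiler has no reason to insert hidden padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct struct2;  /* left opaque: only a pointer to it is stored */

struct struct1_reordered {     /* hypothetical name */
    char *c;                   /* pointers first: widest alignment */
    struct struct2 *d;
    uint32_t b;
    uint16_t a;
    uint16_t tail_pad;         /* explicit padding so array elements stay aligned */
};

/* Hypothetical helper: sum of the member sizes, explicit padding included. */
size_t reordered_member_bytes(void)
{
    return sizeof(char *) + sizeof(struct struct2 *)
         + sizeof(uint32_t) + 2 * sizeof(uint16_t);
}

/* Hypothetical helper: nonzero when the compiler added no hidden padding,
   i.e. the struct is exactly as large as its members. */
int reordered_has_no_hidden_padding(void)
{
    return sizeof(struct struct1_reordered) == reordered_member_bytes();
}
```

On a target with 8-byte pointers this gives the offsets 0, 8, 16 and 20 from the text, with tail_pad at 22 and a total size of 24 bytes.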

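Finally, what are c and d in memory -> pointers. From the use of the word "nasm" I infer an x86 target platform, and from "uint32_t" I infer that it is not 16b real mode, so they are either 32-bit or 64-bit (depending on your target platform). 32 bits is 4 bytes, 64 bits is 8 bytes.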
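
If it is unclear which pointer size a given build uses, the C compiler can answer that directly. The helper below is a hypothetical sketch that maps sizeof(void *) to the NASM reservation directive of the same width; it only covers the 4-byte and 8-byte cases discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* This sketch assumes both pointer members have the same size as void *. */
static_assert(sizeof(char *) == sizeof(void *),
              "sketch assumes all data pointers share one size");

/* Hypothetical helper: NASM directive that reserves one pointer on this target. */
const char *nasm_pointer_directive(void)
{
    switch (sizeof(void *)) {
    case 4:  return "resd";   /* 32-bit target: 4-byte pointers */
    case 8:  return "resq";   /* 64-bit target: 8-byte pointers */
    default: return NULL;     /* other pointer sizes are not covered here */
    }
}
```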

BTW, you can always write some short C source that exercises access to the struct and check the compiler's output.

For example, I put this into http://godbolt.org/:

#include <cstddef>
#include <cstdint>

struct struct1 {
    uint16_t a;
    uint32_t b;
    char * c;
    void * d;
};

std::size_t testFunction(struct1 *in) {
    std::size_t r = in->a;
    r += in->b;
    r += uintptr_t(in->c);
    r += uintptr_t(in->d);
    return r;
}

And got this out (clang 3.9.0 -O3 -m32 -std=c++11):

testFunction(struct1*):              # @testFunction(struct1*)
        mov     ecx, dword ptr [esp + 4]   ; ecx = "in" pointer
        movzx   eax, word ptr [ecx]        ; +0 for "a"
        add     eax, dword ptr [ecx + 4]   ; +4 for "b"
        add     eax, dword ptr [ecx + 8]   ; +8 for "c"
        add     eax, dword ptr [ecx + 12]  ; +12 for "d"
        ret     ; size of struct is 16B

And for a 64b target:

testFunction(struct1*):              # @testFunction(struct1*)
        mov     rax, qword ptr [rdi]
        movzx   ecx, ax
        shr     rax, 32
        add     rax, rcx
        add     rax, qword ptr [rdi + 8]
        add     rax, qword ptr [rdi + 16]
        ret

The offsets are now 0, 4, 8 and 16, and the size is 24B.

And the 64b target with "-fpack-struct=1" added:

testFunction(struct1*):              # @testFunction(struct1*)
        movzx   ecx, word ptr [rdi]
        mov     eax, dword ptr [rdi + 2]
        add     rax, rcx
        add     rax, qword ptr [rdi + 6]
        add     rax, qword ptr [rdi + 14]
        ret

The offsets are 0, 2, 6 and 14, and the size is 22B (the unaligned access to members b, c and d can hurt performance).

So, for example, for the 0, 4, 8, 16 case (64b aligned), your NASM struc should be:

struc struct1
  .a resd 1
  .b resd 1
  .c resq 1
  .d resq 1
endstruc

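Following your further comments... I think you may be missing the point of "struc" in assembly a bit. It's a gimmick; it's just another way to specify address offsets. The example above can also be written as: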

struc struct1
  .a resw 1
     resw 1   ; padding to make "b" start at offset 4
  .b resd 1
  .c resq 1
  .d resq 1
endstruc

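Now you have your "resw" for "a" as well. It makes no difference to the assembler: as far as the code is concerned, only the values of the symbols .a and .b matter, and those are 0 and 4 in both examples. How you reserve the space inside the struc definition does not affect the result, as long as you specify the correct number of bytes for each "variable" plus its padding.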
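
Because only the offsets matter, the struc text can even be derived from the C compiler's own layout. The following is a rough sketch, not a finished tool: nasm_struc_text is a hypothetical helper that reserves, for each member, its own bytes plus whatever padding follows it, using plain resb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct struct2;  /* left opaque: only a pointer to it is stored */

struct struct1 {
    uint16_t a;
    uint32_t b;
    char *c;
    struct struct2 *d;
};

/* Hypothetical helper: build a NASM "struc" whose labels get the same
   offsets as the C members. Each member reserves the distance to the next
   member (or to the end of the struct), so padding is folded in. */
const char *nasm_struc_text(void)
{
    static char text[256];
    const size_t off_b = offsetof(struct struct1, b);
    const size_t off_c = offsetof(struct struct1, c);
    const size_t off_d = offsetof(struct struct1, d);

    snprintf(text, sizeof text,
             "struc struct1\n"
             "  .a resb %zu\n"
             "  .b resb %zu\n"
             "  .c resb %zu\n"
             "  .d resb %zu\n"
             "endstruc\n",
             off_b,                            /* "a" starts at offset 0 */
             off_c - off_b,
             off_d - off_c,
             sizeof(struct struct1) - off_d);
    return text;
}
```

For a layout with offsets 0, 4, 8, 16 and size 24, this reserves 4, 4, 8 and 8 bytes, which yields the same label values as both struc definitions above.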
