GCC Inline Assembly side effects
Can someone explain to me (in other words) the following parts of the GCC doc:
Here is a fictitious sum of squares instruction, that takes two pointers to floating point values in memory and produces a floating point register output. Notice that x, and y both appear twice in the asm parameters, once to specify memory accessed, and once to specify a base register used by the asm. You won’t normally be wasting a register by doing this as GCC can use the same register for both purposes. However, it would be foolish to use both %1 and %3 for x in this asm and expect them to be the same. In fact, %3 may well not be a register. It might be a symbolic memory reference to the object pointed to by x.
asm ("sumsq %0, %1, %2"
     : "+f" (result)
     : "r" (x), "r" (y), "m" (*x), "m" (*y));
Here is a fictitious *z++ = *x++ * *y++ instruction. Notice that the x, y and z pointer registers must be specified as input/output because the asm modifies them.
asm ("vecmul %0, %1, %2"
     : "+r" (z), "+r" (x), "+r" (y), "=m" (*z)
     : "m" (*x), "m" (*y));
In the first example, what is the point of listing *x and *y in the input operands? The same document states:
In particular, there is no way to specify that input operands get modified without also specifying them as output operands.
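In the second example, why is there an input operands section at all? None of its operands are used in the assembly statement anyway.
As a bonus, how would one change the following example from this SO post so that the volatile keyword is no longer needed?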
void swap_2 (int *a, int *b)
{
    int tmp0, tmp1;
    __asm__ volatile (
        "movl (%0), %k2\n\t" /* %2 (tmp0) = (*a) */
        "movl (%1), %k3\n\t" /* %3 (tmp1) = (*b) */
        "cmpl %k3, %k2\n\t"
        "jle %=f\n\t" /* if (%2 <= %3) (at&t!) */
        "movl %k3, (%0)\n\t"
        "movl %k2, (%1)\n\t"
        "%=:\n\t"
        : "+r" (a), "+r" (b), "=r" (tmp0), "=r" (tmp1) :
        : "memory" /* "cc" */ );
}
Thanks in advance. I have been struggling with this for two days now.
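In the first example, *x and *y must be listed as input operands so that GCC knows the result of the instruction depends on them. Otherwise, GCC could move the stores to *x and *y past the inline assembly fragment (or drop them as dead stores), and the asm would then access uninitialized memory. This can be seen by compiling this example: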
double
f (void)
{
    double result;
    double a = 5;
    double b = 7;
    double *x = &a;
    double *y = &b;
    asm ("sumsq %0, %1, %2"
         : "+X" (result)
         : "r" (x), "r" (y) /*, "m" (*x), "m" (*y)*/);
    return result;
}
This results in:
f:
	leaq -16(%rsp), %rax
	leaq -8(%rsp), %rdx
	pxor %xmm0, %xmm0
#APP
# 8 "t.c" 1
	sumsq %xmm0, %rax, %rdx
# 0 "" 2
#NO_APP
	ret
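The two leaq instructions merely set up the registers to point into the uninitialized red zone on the stack. The assignments of 5 and 7 are gone.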
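Because sumsq is fictitious, the snippet above cannot actually be assembled. As a rough sketch of the same idea with real instructions, the following assumes x86-64, AT&T syntax, and GCC or Clang. The function name sum_squares and the plain-C fallback for other targets are illustrative additions, not part of the GCC documentation or the original answer. The "m" operands are never referenced in the template; they are there only to tell the compiler that the asm reads *x and *y.

```c
/* Returns (*x * *x) + (*y * *y).
   Hypothetical helper: x86-64 inline asm, plain C on other targets. */
double sum_squares(const double *x, const double *y)
{
#if defined(__x86_64__) && defined(__GNUC__)
    double result, tmp;
    __asm__ ("movsd (%2), %0\n\t"   /* result = *x       */
             "mulsd %0, %0\n\t"     /* result *= result  */
             "movsd (%3), %1\n\t"   /* tmp = *y          */
             "mulsd %1, %1\n\t"     /* tmp *= tmp        */
             "addsd %1, %0"         /* result += tmp     */
             : "=x" (result), "=x" (tmp)
             : "r" (x), "r" (y),    /* base registers used in the template */
               "m" (*x), "m" (*y)); /* dependency only: asm reads *x, *y   */
    return result;
#else
    return *x * *x + *y * *y;       /* fallback for non-x86-64 targets */
#endif
}
```

With a = 5 and b = 7, sum_squares(&a, &b) should return 74, because the compiler now has to keep the stores to a and b ahead of the asm.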
The same is true for the second example.
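For the second example, here is a hedged sketch of what *z++ = *x++ * *y++ might look like with real x86-64 instructions (AT&T syntax, GCC or Clang assumed). The name vec_mul, the scratch operand tmp, and the plain-C fallback are assumptions made for illustration. As in the documentation's version, the "=m" (*z), "m" (*x) and "m" (*y) operands do not appear in the template; they only describe which memory the asm writes and reads, while the "+r" operands describe the pointers it advances.

```c
/* Computes z[i] = x[i] * y[i] for i in [0, n).
   Hypothetical helper: x86-64 inline asm, plain C on other targets. */
void vec_mul(int *z, const int *x, const int *y, int n)
{
    for (int i = 0; i < n; i++) {
#if defined(__x86_64__) && defined(__GNUC__)
        int tmp;
        __asm__ ("movl (%1), %3\n\t"   /* tmp = *x   */
                 "imull (%2), %3\n\t"  /* tmp *= *y  */
                 "movl %3, (%0)\n\t"   /* *z = tmp   */
                 "addq $4, %0\n\t"     /* z++        */
                 "addq $4, %1\n\t"     /* x++        */
                 "addq $4, %2"         /* y++        */
                 : "+r" (z), "+r" (x), "+r" (y), /* pointers are modified  */
                   "=&r" (tmp),                  /* scratch, written early */
                   "=m" (*z)                     /* asm writes *z          */
                 : "m" (*x), "m" (*y)            /* asm reads *x and *y    */
                 : "cc");                        /* imull/addq set flags   */
#else
        *z++ = *x++ * *y++;            /* fallback for other targets */
#endif
    }
}
```

Without the memory operands, the compiler would only know that three pointer values change, not that the contents of the arrays are read or written.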
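I think you can use the same trick to eliminate the volatile. But I think it is not actually necessary here, because there is already a "memory" clobber, which tells GCC that memory is read or written by the inline assembly.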
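One possible way to apply that trick to swap_2 is sketched below, again assuming x86-64, AT&T syntax, and GCC or Clang; the plain-C fallback is an illustrative addition. GCC's documentation notes that the optimizers may discard a non-volatile asm when its output variables are not needed, so this version makes *a and *b themselves "+m" outputs. The pointers become ordinary parts of the memory operands, the temporaries are marked early-clobber because they are written before *b is read, and the "memory" clobber is replaced by the two precise memory operands.

```c
/* Afterwards *a <= *b (the two values are swapped when *a > *b).
   Sketch without volatile: x86-64 inline asm, plain C elsewhere. */
void swap_2(int *a, int *b)
{
#if defined(__x86_64__) && defined(__GNUC__)
    int tmp0, tmp1;
    __asm__ ("movl %0, %2\n\t"   /* tmp0 = *a                     */
             "movl %1, %3\n\t"   /* tmp1 = *b                     */
             "cmpl %3, %2\n\t"
             "jle %=f\n\t"       /* skip the swap if tmp0 <= tmp1 */
             "movl %3, %0\n\t"   /* *a = tmp1                     */
             "movl %2, %1\n\t"   /* *b = tmp0                     */
             "%=:"
             : "+m" (*a), "+m" (*b),       /* memory read and written */
               "=&r" (tmp0), "=&r" (tmp1)  /* early-clobber scratch   */
             :
             : "cc");                      /* cmpl modifies the flags */
#else
    if (*a > *b) {               /* fallback for other targets */
        int t = *a;
        *a = *b;
        *b = t;
    }
#endif
}
```

Since *a and *b are outputs that the caller can observe, the asm cannot be treated as dead even though tmp0 and tmp1 are never used afterwards.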