Volatile not working as expected
Consider this code:
struct A{
volatile int x;
A() : x(12){
}
};
A foo(){
A ret;
//Do stuff
return ret;
}
int main()
{
A a;
a.x = 13;
a = foo();
}
Compiling with g++ -std=c++14 -pedantic -O3, I get this assembly:
foo():
movl $12, %eax
ret
main:
xorl %eax, %eax
ret
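By my estimation, the variable x should be written to at least three times (possibly four), yet it is not written even once (the function foo isn't even called!)
Even worse, when you add the inline keyword to foo, this is the result: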
main:
xorl %eax, %eax
ret
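I thought that volatile meant that every single read or write must happen, even if the compiler cannot see the point of the read or write.
What is going on here?
Update:
Putting the declaration of A a; outside main like this: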
A a;
int main()
{
a.x = 13;
a = foo();
}
generates this code:
foo():
movl $12, %eax
ret
main:
movl $13, a(%rip)
xorl %eax, %eax
movl $12, a(%rip)
ret
movl $12, a(%rip)
ret
a:
.zero 4
This is closer to what you would expect... I am more confused than ever.
Visual C++ 2015 does not optimize away the assignments:
A a;
mov dword ptr [rsp+8],0Ch <-- write 1
a.x = 13;
mov dword ptr [a],0Dh <-- write2
a = foo();
mov dword ptr [a],0Ch <-- write3
mov eax,dword ptr [rsp+8]
mov dword ptr [rsp+8],eax
mov eax,dword ptr [rsp+8]
mov dword ptr [rsp+8],eax
}
xor eax,eax
ret
The same happens with both /O2 (Maximize speed) and /Ox (Full optimization).
gcc 3.4.4 also keeps the volatile writes with both -O2 and -O3:
_main:
pushl %ebp
movl , %eax
movl %esp, %ebp
subl , %esp
andl $-16, %esp
call __alloca
call ___main
movl $12, -4(%ebp) <-- write1
xorl %eax, %eax
movl $13, -4(%ebp) <-- write2
movl $12, -8(%ebp) <-- write3
leave
ret
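With both of these compilers, if I remove the volatile keyword, main() becomes essentially empty.
I'd say you have a case where the compiler is over-aggressively (and incorrectly, IMHO) deciding that since 'a' is not used, operations on it are unnecessary, and is overlooking the volatile member. Making 'a' itself volatile could get you what you want, but as I don't have a compiler that reproduces this, I can't say for sure.
Also (although this is Microsoft specific), https://msdn.microsoft.com/en-us/library/12a04hfd.aspx says: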
If a struct member is marked as volatile, then volatile is propagated to the whole structure.
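This also suggests that the behavior you are seeing is a compiler problem.
Lastly, if you make 'a' a global variable, it is somewhat understandable that the compiler is less eager to deem it unused and drop it. Global variables are extern by default, so you cannot tell that a global 'a' is unused just by looking at the main function. Some other compilation unit (.cpp file) might be using it.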
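To make the "make 'a' itself volatile" suggestion concrete, here is a minimal sketch. It has not been checked against the compiler from the question, and the function name write_through_volatile_object is made up for illustration. Note that a = foo(); no longer compiles once a is volatile, because the implicitly declared copy assignment operator of A is not volatile-qualified, so the sketch assigns the member directly. Whether the stores survive in the generated code is something to verify in the assembly for your own compiler.

```cpp
// Same struct and foo() as in the question.
struct A {
    volatile int x;
    A() : x(12) {}
};

A foo() {
    A ret;
    return ret;
}

// Hypothetical helper (not from the question): performs the same sequence of
// writes as main(), but on a volatile-qualified object, and returns the final
// value of x so the result can be checked.
int write_through_volatile_object() {
    volatile A a;   // the whole object is volatile-qualified
    a.x = 13;       // volatile write
    // `a = foo();` would be ill-formed here: A's implicit copy assignment
    // operator cannot be called on a volatile object. Assign the member instead.
    a.x = foo().x;  // volatile read of the temporary's x, volatile write to a.x
    return a.x;     // volatile read
}
```

Calling write_through_volatile_object() from main() and compiling with -O3 -S lets you compare the output with the assembly shown in the question.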
GCC's page on Volatile access gives some insight into how it works:
The standard encourages compilers to refrain from optimizations concerning accesses to volatile objects, but leaves it implementation defined as to what constitutes a volatile access. The minimum requirement is that at a sequence point all previous accesses to volatile objects have stabilized and no subsequent accesses have occurred. Thus an implementation is free to reorder and combine volatile accesses that occur between sequence points, but cannot do so for accesses across a sequence point. The use of volatile does not allow you to violate the restriction on updating objects multiple times between two sequence points.
In C standardese:
§5.1.2.3
2 Accessing a volatile object, modifying an object, modifying a file,
or calling a function that does any of those operations are all side
effects, which are changes in the state of the
execution environment. Evaluation of an expression may produce side
effects. At certain specified points in the execution sequence called
sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have
taken place. (A summary of the sequence points is given in annex C.)
3 In the abstract machine, all expressions are evaluated as specified
by the semantics. An actual implementation need not evaluate part of
an expression if it can deduce that its value is not used and that no
needed side effects are produced (including any caused by calling a
function or accessing a volatile object).
[...]
5 The least requirements on a conforming implementation are:
- At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet
occurred. [...]
I chose the C standard because its wording is simpler, but the rules are essentially the same in C++. See the "as-if" rule.
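As a rough illustration of the as-if rule, the sketch below reuses the question's struct and reads the member back at the end. The function name final_value_local is hypothetical. Whichever stores the optimizer keeps or drops, the value returned has to be the one the abstract machine would produce. The sketch says nothing about which stores a given compiler actually emits, which is the implementation-defined part discussed above.

```cpp
// Same struct and foo() as in the question.
struct A {
    volatile int x;
    A() : x(12) {}
};

A foo() {
    A ret;
    return ret;
}

// Hypothetical helper (not from the question): the body of main(), followed
// by a read of a.x whose result is actually used.
int final_value_local() {
    A a;         // abstract machine: a.x is written with 12
    a.x = 13;    // abstract machine: a.x is written with 13
    a = foo();   // abstract machine: a.x is written with 12 again
    return a.x;  // the value read here is used by the caller
}
```

Compiling this with -O3 -S and looking at final_value_local shows how much of the sequence your compiler keeps once the final value is used.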
Now, on my machine, -O1 doesn't optimize away the call to foo(), so let's use -fdump-tree-optimized to see the difference:
-O1
*[definition to foo() omitted]*
;; Function int main() (main, funcdef_no=4, decl_uid=2131, cgraph_uid=4, symbol_order=4) (executed once)
int main() ()
{
struct A a;
<bb 2>:
a.x ={v} 12;
a.x ={v} 13;
a = foo ();
a ={v} {CLOBBER};
return 0;
}
And -O3:
*[definition to foo() omitted]*
;; Function int main() (main, funcdef_no=4, decl_uid=2131, cgraph_uid=4, symbol_order=4) (executed once)
int main() ()
{
struct A ret;
struct A a;
<bb 2>:
a.x ={v} 12;
a.x ={v} 13;
ret.x ={v} 12;
ret ={v} {CLOBBER};
a ={v} {CLOBBER};
return 0;
}
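gdb reveals in both cases that a is ultimately optimized away, but it is foo() we are concerned with. The dumps show that GCC reordered the accesses so that foo() is not even necessary, and subsequently all of the code in main() is optimized away. Is this really true? Let's look at the assembly output for -O1: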
foo():
mov eax, 12
ret
main:
call foo()
mov eax, 0
ret
This essentially confirms what I said above. Everything is optimized away: the only difference is whether or not the call to foo() is as well.