编译器涉嫌优化 while 循环
Compiler's alleged optimisation of while loop
在嵌入式编程讲座的 this 部分,Samek 博士解释了编译器如何能够实现比 "original code" 更高的效率(大概他的意思是指由逻辑顺序决定的实现语法):
I hope you have noticed
that the disassembled code implements a different flow of control from
what I've described for the while loop. The original code was supposed
to test the condition first and then jump over the loop body if the
condition wasn't true. The compiled code starts with an unconditional
branch and reverses the order of the loop body and the testing of the
condition. When you think about it, though, those two flows of control
are equivalent, except the generated one is faster, because it has only
one conditional branch at the bottom of the loop.
在整个解释过程中,他参考了所示的两个流程图。描述编译代码的图表说明了 "it has only one conditional branch at the bottom of the loop." 我似乎看不出它在实践中是如何工作的:
- Here 是我截取的屏幕截图,展示了原始代码的实现方式。 (初始)机器指令序列:
0x2815 CMP
-> 0x1c40 ADDS
->repeat until condition is false
- Here 是另一个屏幕截图,显示了编译器使用的序列。 (初始)机器指令序列:
0x2815 CMP
-> 0xdbfc BLT.N
-> 0x1c40 ADDS
-> repeat until condition is false
我可以理解这两种控制流是如何等效的(至少仅从图表上看是这样),但肯定 不是 生成的控制流(右侧)如何更快。首先,两个图最开始的箭头指向相同的分支。其次,虽然分支似乎稍后出现在右侧的示意图中,正如您从屏幕截图中看到的那样,模拟器似乎显示 both 方法以与比较相关的机器指令 (0x2815 CMP
).
如何使流程图与我在实践中看到的相一致?
唯一重要的指令是在每个循环中重复的指令。而逻辑重排的重点是每循环只有一次跳跃。整个循环前后可能会有几次跳转,但我们不关心这些。唯一昂贵的是一遍又一遍地做的事情。
我同意 Kerren SB 的回答。我将更详细地讨论它:
原逻辑有如下跳转指令在每次迭代时执行:
begin_loop:
cmp counter, 21
jge end_loop ; jump when greater or equal
...
jmp begin_loop
end_loop:
....
修改后的逻辑每次迭代少执行一条指令:
jmp test_loop ; executed once; not part of loop
begin_loop:
...
test_loop:
cmp counter, 21
jlt begin_loop ; jump when less than
...
所以在循环中无条件jmp begin_loop
被保存了。
在嵌入式编程讲座的 this 部分,Samek 博士解释了编译器如何能够实现比 "original code" 更高的效率(大概他的意思是指由逻辑顺序决定的实现语法):
I hope you have noticed that the disassembled code implements a different flow of control from what I've described for the while loop. The original code was supposed to test the condition first and then jump over the loop body if the condition wasn't true. The compiled code starts with an unconditional branch and reverses the order of the loop body and the testing of the condition. When you think about it, though, those two flows of control are equivalent, except the generated one is faster, because it has only one conditional branch at the bottom of the loop.
在整个解释过程中,他参考了所示的两个流程图。描述编译代码的图表说明了 "it has only one conditional branch at the bottom of the loop." 我似乎看不出它在实践中是如何工作的:
- Here 是我截取的屏幕截图,展示了原始代码的实现方式。 (初始)机器指令序列:
0x2815 CMP
-> 0x1c40 ADDS
->repeat until condition is false
- Here 是另一个屏幕截图,显示了编译器使用的序列。 (初始)机器指令序列:
0x2815 CMP
-> 0xdbfc BLT.N
-> 0x1c40 ADDS
-> repeat until condition is false
我可以理解这两种控制流是如何等效的(至少仅从图表上看是这样),但肯定 不是 生成的控制流(右侧)如何更快。首先,两个图最开始的箭头指向相同的分支。其次,虽然分支似乎稍后出现在右侧的示意图中,正如您从屏幕截图中看到的那样,模拟器似乎显示 both 方法以与比较相关的机器指令 (0x2815 CMP
).
如何使流程图与我在实践中看到的相一致?
唯一重要的指令是在每个循环中重复的指令。而逻辑重排的重点是每循环只有一次跳跃。整个循环前后可能会有几次跳转,但我们不关心这些。唯一昂贵的是一遍又一遍地做的事情。
我同意 Kerren SB 的回答。我将更详细地讨论它:
原逻辑有如下跳转指令在每次迭代时执行:
begin_loop:
cmp counter, 21
jge end_loop ; jump when greater or equal
...
jmp begin_loop
end_loop:
....
修改后的逻辑每次迭代少执行一条指令:
jmp test_loop ; executed once; not part of loop
begin_loop:
...
test_loop:
cmp counter, 21
jlt begin_loop ; jump when less than
...
所以在循环中无条件jmp begin_loop
被保存了。