Writing a uint32 to a uint64 is not atomic. Why?
On page 410 of Is Parallel Programming Hard, And, If So, What Can You Do About It? it says:
Quick Quiz 5.17:
Why doesn't inc_count() in Listing 5.4 need to use atomic instructions?
Answer:
(..) atomic instructions would be needed in cases where the
per-thread counter variables were smaller than the global global_count
(..)
Simplified, that sentence applies to the following example:
#include <stdint.h>

uint64_t global_count = 0;

void f(void)
{
    uint32_t sum = sum_of_smaller_thread_locals(); /* sum is a local variable */
    WRITE_ONCE(global_count, sum);
}
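I don't understand why we would need atomic instructions in that case.
As Peter Cordes pointed out, it is the per-thread increments that would need atomic instructions, not the write to global_count. The text gives the reason, but a superfluous 'however' obscures it slightly: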
That said, atomic instructions would be needed in cases where the
per-thread counter variables were smaller than the global global_count.
However, note that on a 32-bit system, the per-thread counter
variables might need to be limited to 32 bits in order to sum them
accurately, but with a 64-bit global_count variable to avoid overflow.
In this case, it is necessary to zero the per-thread counter variables
periodically in order to avoid overflow. It is extremely important to
note that this zeroing cannot be delayed too long or overflow of the
smaller per-thread variables will result. This approach therefore
imposes real-time requirements on the underlying system, and in turn
must be used with extreme care.
In contrast, if all variables are the
same size, overflow of any variable is harmless because the eventual
sum will be modulo the word size.
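The last sentence of the quote can be checked with a little arithmetic. The sketch below is only an illustration, not code from the book: the function names and the event counts are made up. It assumes one thread has really seen 2^32 + 2 events, so its 32-bit counter has wrapped and holds 2.

```c
#include <assert.h>
#include <stdint.h>

/* Sum n 32-bit counters into a 32-bit total: the total wraps modulo 2^32. */
static uint32_t sum_same_size(const uint32_t *c, int n)
{
    uint32_t total = 0;
    for (int i = 0; i < n; i++)
        total += c[i];
    return total;
}

/* Sum the same counters into a 64-bit total: the total never wraps,
 * so whatever a counter lost by wrapping is missing for good. */
static uint64_t sum_wider(const uint32_t *c, int n)
{
    uint64_t total = 0;
    for (int i = 0; i < n; i++)
        total += c[i];
    return total;
}

static void check_wraparound(void)
{
    /* Hypothetical true event counts; thread 0 has wrapped once. */
    const uint64_t true_count[2] = { (UINT64_C(1) << 32) + 2, 3 };
    /* What the 32-bit per-thread counters actually hold: 2 and 3. */
    const uint32_t counter[2] = { (uint32_t)true_count[0],
                                  (uint32_t)true_count[1] };
    const uint64_t true_total = true_count[0] + true_count[1];

    /* Same size everywhere: the sum is correct modulo 2^32. */
    assert(sum_same_size(counter, 2) == (uint32_t)true_total);
    /* Wider total: the sum is short by exactly 2^32. */
    assert((true_total - sum_wider(counter, 2)) == (UINT64_C(1) << 32));
}
```

With a 64-bit total, the only way to avoid that shortfall is to fold each 32-bit counter into the total and zero it before it can wrap, which is where the zeroing requirement comes from.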
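If the main thread clears the per-thread counters, it needs to do so with an atomic exchange to avoid possible data loss. If the per-thread increments do the clearing, they will need some other (probably more complex) interlocking to avoid data loss.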
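Here is a minimal C11 sketch of the first option, where the main thread does the clearing. It is not the book's Listing 5.4: NR_THREADS, counter and drain_counts() are hypothetical names, and inc_count() and global_count only echo the names used in the quiz. The point is that a plain increment is a separate load, add and store. If the main thread's exchange lands between the load and the store, the store puts the stale value back, and counts that were already folded into the total get counted again later.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_THREADS 4  /* hypothetical thread count for this sketch */

/* 32-bit per-thread counters, narrower than the 64-bit total. */
static _Atomic uint32_t counter[NR_THREADS];
static uint64_t global_count;  /* assumed to be touched only by the main thread */

/* Called by thread `tid`. The increment is one atomic read-modify-write,
 * so a concurrent drain cannot land between its load and its store. */
static void inc_count(unsigned tid)
{
    assert(tid < NR_THREADS);
    atomic_fetch_add_explicit(&counter[tid], 1, memory_order_relaxed);
}

/* Called by the main thread, often enough that no counter can wrap.
 * atomic_exchange reads the old value and writes zero as a single step,
 * so no increment is lost or counted twice. */
static uint64_t drain_counts(void)
{
    for (unsigned t = 0; t < NR_THREADS; t++)
        global_count += atomic_exchange_explicit(&counter[t], 0,
                                                 memory_order_relaxed);
    return global_count;
}
```

Relaxed ordering is used here on the assumption that the counters are pure statistics and order nothing else; code that also publishes other data would need stronger ordering.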