Ordering of read/write operations in a C++ queue

Question

Suppose we have a SyncQueue class with the following implementation:

class SyncQueue {
    std::mutex mtx;
    std::queue<std::shared_ptr<ComplexType> > m_q;
public:
    void push(const std::shared_ptr<ComplexType> & ptr) {
        std::lock_guard<std::mutex> lck(mtx);
        m_q.push(ptr);
    }
    std::shared_ptr<ComplexType> pop() {
        std::lock_guard<std::mutex> lck(mtx);
        std::shared_ptr<ComplexType> rv(m_q.front());
        m_q.pop();
        return rv;
    }
};

Then we have code that uses it:

SyncQueue q;

// Thread 1, Producer:
std::shared_ptr<ComplexType> ct(new ComplexType);
ct->foo = 3;
q.push(ct);

// Thread 2, Consumer:
std::shared_ptr<ComplexType> ct(q.pop());
std::cout << ct->foo << std::endl;

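When printing ct->foo, am I guaranteed to get 3? mtx provides happens-before semantics for the pointer itself, but I'm not sure what that implies for the memory of the ComplexType. If it is guaranteed, does that mean every mutex lock (std::lock_guard<std::mutex> lck(mtx);) forces a full cache invalidation for any modified memory locations, up to the point where the memory hierarchies of independent cores merge?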

Answer 1

std::mutex meets the Mutex requirements (http://en.cppreference.com/w/cpp/concept/Mutex):

Prior m.unlock() operations on the same mutex synchronize-with this lock operation (equivalent to release-acquire std::memory_order)

Release-acquire is explained here (http://en.cppreference.com/w/cpp/atomic/memory_order):

Release-Acquire ordering

If an atomic store in thread A is tagged memory_order_release and an atomic load in thread B from the same variable is tagged memory_order_acquire, all memory writes (non-atomic and relaxed atomic) that happened-before the atomic store from the point of view of thread A, become visible side-effects in thread B, that is, once the atomic load is completed, thread B is guaranteed to see everything thread A wrote to memory.

The synchronization is established only between the threads releasing and acquiring the same atomic variable. Other threads can see different order of memory accesses than either or both of the synchronized threads.
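As a minimal sketch of the pattern the quote describes (not the cppreference example itself), the following uses a plain int and a std::atomic<bool>. The names payload, ready and run_release_acquire_demo are made up for illustration and do not come from the question.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration only.
int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // the atomic both threads synchronize on

void producer() {
    payload = 3;                                   // ordinary write
    ready.store(true, std::memory_order_release);  // publishes the write above
}

int consumer() {
    // Spin until the acquire load observes the release store.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;  // the write to payload happens-before this read
}

// Runs one producer and one consumer; returns what the consumer read.
int run_release_acquire_demo() {
    payload = 0;
    ready.store(false);
    int seen = -1;
    std::thread t2([&seen] { seen = consumer(); });
    std::thread t1(producer);
    t1.join();
    t2.join();
    assert(seen == 3);  // guaranteed by the release-acquire pairing
    return seen;
}
```

Without the release/acquire tags on ready (for example with memory_order_relaxed on both sides), the read of payload would be a data race.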

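The code example in that section is very similar to yours. So it is guaranteed that all writes in thread 1 happen before the mutex unlock in push(), and because that unlock synchronizes-with the later lock in pop(), those writes are visible to thread 2 once it has popped the pointer.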

Provided, of course, that "ct->foo = 3" doesn't have some special tricky meaning such that the actual assignment happens in another thread :)
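To make the guarantee concrete, here is a rough, self-contained sketch of the question's scenario. Two things are my assumptions rather than the asker's code: ComplexType is defined as a struct with a single int foo, and pop() is replaced by a hypothetical try_pop() that returns an empty pointer when the queue is empty, so the consumer can poll instead of calling front() on an empty queue.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>

struct ComplexType { int foo = 0; };  // assumed definition, not shown in the question

class SyncQueue {
    std::mutex mtx;
    std::queue<std::shared_ptr<ComplexType>> m_q;
public:
    void push(const std::shared_ptr<ComplexType>& ptr) {
        std::lock_guard<std::mutex> lck(mtx);
        m_q.push(ptr);
    }  // lck is destroyed here: unlock, the "release" side

    // Hypothetical variant of pop(): returns nullptr if nothing is queued.
    std::shared_ptr<ComplexType> try_pop() {
        std::lock_guard<std::mutex> lck(mtx);  // lock, the "acquire" side
        if (m_q.empty()) return nullptr;
        std::shared_ptr<ComplexType> rv(m_q.front());
        m_q.pop();
        return rv;
    }
};

// Runs one producer and one consumer; returns the foo the consumer read.
int run_queue_demo() {
    SyncQueue q;
    int seen = -1;
    std::thread consumer([&] {
        std::shared_ptr<ComplexType> ct;
        while (!(ct = q.try_pop())) std::this_thread::yield();
        seen = ct->foo;  // ordered after the producer's write via the mutex
    });
    std::thread producer([&] {
        auto ct = std::make_shared<ComplexType>();
        ct->foo = 3;     // sequenced before push(), hence before the unlock
        q.push(ct);
    });
    producer.join();
    consumer.join();
    assert(seen == 3);
    return seen;
}
```

The write to foo is sequenced before the unlock in push(), and that unlock synchronizes-with the lock in the try_pop() call that retrieves the pointer, so the consumer reads 3.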

With regard to cache invalidation, from cppreference:

On strongly-ordered systems (x86, SPARC TSO, IBM mainframe), release-acquire ordering is automatic for the majority of operations. No additional CPU instructions are issued for this synchronization mode, only certain compiler optimizations are affected (e.g. the compiler is prohibited from moving non-atomic stores past the atomic store-release or perform non-atomic loads earlier than the atomic load-acquire). On weakly-ordered systems (ARM, Itanium, PowerPC), special CPU load or memory fence instructions have to be used.

So it really depends on the architecture.
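One way to see where the architecture-specific part lives is to write the ordering as explicit fences around a relaxed flag, as in the sketch below. The names shared_value, published and run_fence_demo are hypothetical. The source is portable; the compiler decides what each fence costs on the target. On x86 these two fences typically need no instruction of their own, whereas on weakly-ordered targets such as ARM a barrier instruction is typically emitted.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration only.
int shared_value = 0;
std::atomic<bool> published{false};

void writer() {
    shared_value = 3;
    // Release fence: earlier writes may not be reordered past the store below.
    std::atomic_thread_fence(std::memory_order_release);
    published.store(true, std::memory_order_relaxed);
}

int reader() {
    while (!published.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // Acquire fence: later reads may not be reordered before the load above.
    std::atomic_thread_fence(std::memory_order_acquire);
    return shared_value;
}

// Runs one writer and one reader; returns what the reader saw.
int run_fence_demo() {
    shared_value = 0;
    published.store(false);
    int seen = -1;
    std::thread t2([&seen] { seen = reader(); });
    std::thread t1(writer);
    t1.join();
    t2.join();
    assert(seen == 3);
    return seen;
}
```

In neither case does the standard require a full cache flush; it only constrains which writes must be visible after the synchronization point.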
