Why does Rust RwLock behave unexpectedly with fork?

Question

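I'm seeing some behavior I can't explain when using an RwLock together with fork. Basically, the child process reports the RwLock as still held while the parent does not, even though both run the same code path. My understanding is that the child process should receive an independent copy of the parent's memory space, locks included, so it makes no sense that they report different results.

The expected behavior is that both the child and the parent report "mutex held: false". Interestingly, this works as expected when a Mutex is used instead of an RwLock.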

Rust Playground link

use libc::fork;
use std::error::Error;
use std::sync::RwLock;

fn main() -> Result<(), Box<dyn Error>> {
    let lock = RwLock::new(());

    // Take the write lock, fork while holding it, then release it in both processes.
    let guard = lock.write();
    let res = unsafe { fork() };
    drop(guard);

    match res {
        0 => {
            let held = lock.try_write().is_err();
            println!("CHILD mutex held: {}", held);
        }
        _child_pid => {
            let held = lock.try_write().is_err();
            println!("PARENT mutex held: {}", held);
        }
    }
    Ok(())
}

Output:

PARENT mutex held: false
CHILD mutex held: true

Answer 1

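I'm assuming you're running on a Linux system. Rust does this because glibc does this, and on Linux systems that use glibc, Rust's RwLock is built on glibc's pthreads implementation.

You can confirm this behavior with an equivalent C program: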

#include <pthread.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;

    /* Take the write lock, fork while holding it, then unlock and retry in both processes. */
    pthread_rwlock_wrlock(&lock);
    pid_t pid = fork();
    int res = pthread_rwlock_unlock(&lock);
    int res2 = pthread_rwlock_trywrlock(&lock);

    printf("%s unlock_errno=%d trywrlock_errno=%d\n", (pid == 0) ? "child" : "parent", res, res2);
    return 0;
}

This prints the following:

parent unlock_errno=0 trywrlock_errno=0
child unlock_errno=0 trywrlock_errno=16

16 is EBUSY on my system.

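The reason this happens with glibc is that POSIX specifies a single unlock function for rwlocks, so glibc stores the ID of the thread holding the write lock in order to tell whether the calling thread holds a read lock or a write lock. If the current thread ID equals the stored value, the thread holds the write lock; otherwise it is assumed to hold a read lock. The child's thread has a different ID from the parent thread that took the lock, so you haven't actually unlocked anything in the child, but you have probably corrupted the reader counter in the lock.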
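To make that logic concrete, here is a toy model of it in Rust. This is only a sketch: `ToyRwLock`, its fields, the `simulate` function and the thread ID values are all made up for illustration and do not reflect glibc's real data layout or code. A `clone()` stands in for fork copying the parent's memory, and the child is simply given a different thread ID.

```rust
// Toy model only: NOT glibc's actual implementation or data layout.
#[derive(Clone, Debug, Default)]
struct ToyRwLock {
    writer_tid: Option<u32>, // ID of the thread holding the write lock, if any
    readers: i64,            // number of read-lock holders (signed to expose underflow)
}

impl ToyRwLock {
    /// Try to take the write lock on behalf of thread `tid`.
    fn try_write_lock(&mut self, tid: u32) -> bool {
        if self.writer_tid.is_some() || self.readers != 0 {
            return false; // somebody already holds the lock
        }
        self.writer_tid = Some(tid);
        true
    }

    /// Single unlock entry point for both read and write locks.
    fn unlock(&mut self, tid: u32) {
        if self.writer_tid == Some(tid) {
            // The caller is the recorded writer: release the write lock.
            self.writer_tid = None;
        } else {
            // Otherwise the caller is assumed to hold a read lock.
            self.readers -= 1;
        }
    }
}

/// Returns (parent_held, child_held, child_reader_count).
fn simulate() -> (bool, bool, i64) {
    const PARENT_TID: u32 = 100; // hypothetical thread IDs
    const CHILD_TID: u32 = 200;

    let mut parent = ToyRwLock::default();
    assert!(parent.try_write_lock(PARENT_TID));

    // "fork": the child gets its own copy of the lock state.
    let mut child = parent.clone();

    parent.unlock(PARENT_TID); // matches the stored writer, lock released
    child.unlock(CHILD_TID);   // does not match, treated as a read unlock

    let parent_held = !parent.try_write_lock(PARENT_TID);
    let child_held = !child.try_write_lock(CHILD_TID);
    (parent_held, child_held, child.readers)
}

fn main() {
    let (parent_held, child_held, child_readers) = simulate();
    println!("PARENT held: {}", parent_held);
    println!("CHILD held: {}, reader count: {}", child_held, child_readers);
}
```

In this model the parent ends up with the lock free, while the child's copy still records the parent's thread ID as the writer and its reader count has dropped below zero, which matches the shape of the output seen above.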

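As mentioned in the comments, this is undefined behavior in the child according to POSIX, because the thread doing the unlock is not the thread that holds the lock. To make this work, Rust would have to implement its own synchronization primitives like Go does, which is generally a major portability nightmare.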
