Why is getpid implemented as a system call?

Question

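I'm taking my first OS class, so hopefully I don't have any big misconceptions here.

I was wondering why getpid() is implemented as a system call in Linux. As I understand it, certain functions are made into system calls because they access or change information that the OS might want to protect, so they are implemented as system calls in order to transfer control to the kernel.

But as I understand it, getpid() is just returning the process id of the calling process. Is there any case where permission to this information would not be granted? Wouldn't it be safe to simply let getpid() be a normal user function?

Thanks for the help.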

Answer 1

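getpid() may just be reading from some location, but someone has to write to that location. To prevent any old process from writing garbage to a location that the operating system uses, that location needs to be protected from user-mode access. For an application to access it, the access has to happen in kernel mode. Therefore, it has to be done as a system call.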
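
To make "done as a system call" concrete, here is a minimal sketch, assuming Linux with glibc. The helper name raw_getpid is made up for illustration; syscall(2) and SYS_getpid are the standard interfaces. It asks the kernel for the PID directly, without going through the glibc getpid() wrapper.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: enters the kernel via syscall(2) and returns the
 * caller's PID, bypassing the glibc getpid() wrapper. */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Called from main(), raw_getpid() should return the same value as getpid(), since both end up asking the kernel for the same piece of information.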

Answer 2

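I don't see any security issue in exposing the pid to the process. Process address space isolation is enforced by the operating system. If I remember correctly, the first call to getpid() is a system call, but subsequent calls to getpid() are cached (probably by libc) and handled locally.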

Answer 3

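The only way to implement getpid() without a system call is to perform one system call first and cache its result. Every later call to getpid() then returns that cached value without a system call.

However, the Linux man-pages project explains why getpid() is not cached: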

   From glibc version 2.3.4 up to and including version 2.24, the glibc
   wrapper function for getpid() cached PIDs, with the goal of avoiding
   additional system calls when a process calls getpid() repeatedly.
   Normally this caching was invisible, but its correct operation relied
   on support in the wrapper functions for fork(2), vfork(2), and
   clone(2): if an application bypassed the glibc wrappers for these
   system calls by using syscall(2), then a call to getpid() in the
   child would return the wrong value (to be precise: it would return
   the PID of the parent process).  In addition, there were cases where
   getpid() could return the wrong value even when invoking clone(2) via
   the glibc wrapper function.  (For a discussion of one such case, see
   BUGS in clone(2).)  Furthermore, the complexity of the caching code
   had been the source of a few bugs within glibc over the years.

   Because of the aforementioned problems, since glibc version 2.25, the
   PID cache is removed: calls to getpid() always invoke the actual
   system call, rather than returning a cached value.

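In summary, if getpid() were cached, it could return wrong values (even if the cache were implemented perfectly, with no program allowed to write to it, and so on), and the caching has been a source of bugs in the past.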
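
The failure mode can be sketched with a deliberately naive user-space cache. This is not glibc's actual code; naive_cached_getpid and child_sees_stale_pid are hypothetical names, and the sketch assumes Linux with a glibc whose getpid() is not itself cached (2.25 or later). Because the cache lives in ordinary process memory, fork() copies it into the child, and nothing resets it there, which is similar in spirit to the bypassed-wrapper case the man page describes.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical user-space cache, for illustration only. */
static pid_t cached_pid = 0;

pid_t naive_cached_getpid(void)
{
    if (cached_pid == 0)
        cached_pid = getpid();   /* real lookup on the first call only */
    return cached_pid;
}

/* Returns 1 if the child observed the parent's PID through the cache,
 * 0 if it did not, and -1 on error. */
int child_sees_stale_pid(void)
{
    pid_t parent = naive_cached_getpid();   /* prime the cache in the parent */

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: cached_pid was copied from the parent and never updated. */
        int stale = (naive_cached_getpid() == parent) && (getpid() != parent);
        _exit(stale ? 1 : 0);
    }

    int status;
    if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

On a typical Linux system child_sees_stale_pid() returns 1: the child's cached value is still the parent's PID, while a fresh getpid() in the child reports the child's own PID.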

Usually you only need to call getpid() once in any given process; if you use the result several times, save it in a variable (application-level caching!).
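
A minimal sketch of that application-level caching, assuming a made-up task of building names that embed the PID (make_worker_names and NAME_LEN are hypothetical): getpid() is called once and the saved value is reused inside the loop.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define NAME_LEN 64

/* Fills `names` with `count` strings of the form "worker.<pid>.<index>".
 * getpid() is called once; the loop reuses the saved value. */
void make_worker_names(char names[][NAME_LEN], int count)
{
    const pid_t pid = getpid();   /* the only PID lookup in this function */

    for (int i = 0; i < count; i++)
        snprintf(names[i], NAME_LEN, "worker.%ld.%d", (long)pid, i);
}
```

The saved value is only valid inside the process that obtained it; a child created by fork() has a different PID and needs to call getpid() again.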

Cheers!

Answer 4

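As the other answers explain, a process's PID is internal kernel data, and a user-space process has to access it through a system call; otherwise it would be at risk of being maliciously overwritten.

However, there is one wrong assumption that has to be corrected: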

getpid() is just returning the process id of the calling process.

In fact, PIDs are much more complicated than we might think, mainly in two respects:

Namespaces. These are what containers such as Docker rely on.
Thread groups, process groups, and session groups.
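
The second point can be observed from user space. The following is a sketch assuming Linux with glibc and pthreads; struct ids, record_ids, and ids_in_second_thread are hypothetical names. Every thread of a process reports the same getpid() value (the thread group ID), while the raw gettid system call returns a per-thread ID. The process group and session IDs are recorded alongside for comparison.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* The identifiers visible to one thread. */
struct ids {
    pid_t pid;    /* getpid(): shared by all threads of the process */
    pid_t tid;    /* gettid system call: unique per thread */
    pid_t pgid;   /* process group ID */
    pid_t sid;    /* session ID */
};

/* Thread entry point: records the calling thread's identifiers in *arg. */
void *record_ids(void *arg)
{
    struct ids *out = arg;
    out->pid  = getpid();
    out->tid  = (pid_t)syscall(SYS_gettid);
    out->pgid = getpgrp();
    out->sid  = getsid(0);
    return NULL;
}

/* Runs record_ids in a new thread. Returns 0 on success, or the error
 * number reported by pthread_create/pthread_join. */
int ids_in_second_thread(struct ids *out)
{
    pthread_t thread;
    int err = pthread_create(&thread, NULL, record_ids, out);
    if (err != 0)
        return err;
    return pthread_join(thread, NULL);
}
```

Comparing the result with what record_ids reports when called directly in the main thread should show identical pid, pgid, and sid values but a different tid. Build with -pthread where the toolchain requires it.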
