Who creates and owns the call stack, and how does the call stack work with multiple threads?

Question

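I know that each thread usually has its own call stack, which is just a block of memory controlled through esp and ebp.

1. How are these call stacks created, and who is responsible for creating them? My guess is the runtime, e.g. the Swift runtime for an iOS application. Does a thread talk to its own call stack directly through esp and ebp, or through the runtime?

2. Every call stack has to use the esp and ebp CPU registers. If I have a CPU with 2 cores and 4 threads, let's say it effectively has 4 cores (instruction sets). Does that mean each call stack will use these registers only on one particular core?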

Answer 1

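(I'm assuming Swift threads are just like threads in other languages. There aren't really many good options: normal OS-level threads, user-space "green threads", or a mix of both. The only difference is where context switches happen; the main concepts are the same.)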

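Each thread has its own stack, allocated with mmap in the process's address space, either by the parent thread or perhaps by the same system call that creates the thread. I don't know the iOS system calls. On Linux, you have to pass a void *child_stack to the Linux-specific clone(2) system call that actually creates a new thread. It's rare to use the low-level OS-specific system calls directly; a language runtime would probably do threading on top of pthreads functions such as pthread_create, and the pthreads library would handle the OS-specific details.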

Yes, every software thread has its own architectural state, including RSP on x86-64, or sp on AArch64. (Or ESP if you are building obsolete 32-bit x86 code.) I assume frame pointers are optional for Swift.

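And yes, every logical core has its own architectural state (including the registers used as a stack pointer); software threads run on logical cores, and a context switch between software threads saves and restores the registers. Related, and possibly a duplicate: What resources are shared between threads?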

Software threads share the same page tables (virtual address space), but not registers.
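To make "shared address space, separate stacks" concrete, here is a small C sketch under the same POSIX assumptions. The names shared_global, struct thread_view, record_addresses and compare_threads are hypothetical and exist only for this illustration. Each thread records where it sees one global variable and where one of its own locals lives.

```c
#include <pthread.h>
#include <stdint.h>

static int shared_global = 5;   /* one copy, in the process's data segment */

/* Hypothetical helper type: what one thread observed. */
struct thread_view {
    uintptr_t global_addr;      /* where this thread sees shared_global */
    uintptr_t local_addr;       /* where this thread's own local lives */
};

static void *record_addresses(void *arg) {
    struct thread_view *view = arg;
    int local = 0;                                  /* on the current stack */
    view->global_addr = (uintptr_t)&shared_global;
    view->local_addr = (uintptr_t)&local;
    return NULL;
}

/* Fills `caller` from the calling thread and `child` from a newly created
   thread. Returns 0 on success, -1 if the thread could not be created. */
int compare_threads(struct thread_view *caller, struct thread_view *child) {
    pthread_t tid;
    record_addresses(caller);
    if (pthread_create(&tid, NULL, record_addresses, child) != 0) return -1;
    pthread_join(tid, NULL);
    return 0;
}
```

After a successful call, both views should report the same global_addr, because the threads share one virtual address space, while the two local_addr values differ, because each thread's stack pointer refers to a different region of that space.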

Answer 2

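The XNU kernel does it. Swift threads are POSIX pthreads, a.k.a. Mach threads. During program startup the XNU kernel parses the Mach-O executable format and handles the modern LC_MAIN or the legacy LC_UNIXTHREAD load command (among others). This is handled in the kernel functions: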

static
load_return_t
load_main(
        struct entry_point_command  *epc,
        thread_t        thread,
        int64_t             slide,
        load_result_t       *result
    )

and

static
load_return_t
load_unixthread(
    struct thread_command   *tcp,
    thread_t        thread,
    int64_t             slide,
    load_result_t       *result
)

which happen to be open source.

LC_MAIN initializes the stack through thread_userstackdefault, and LC_UNIXTHREAD through load_threadstack.

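As @PeterCordes mentioned in the comments, the kernel creates only the main thread; the started process itself can then spawn child threads from its own main thread, either through an API such as GCD or directly through a system call (bsdthread_create; I'm not sure whether there are others). That system call happens to take user_addr_t stack as its third argument (i.e. rdx in the x86-64 System V kernel ABI used by macOS). Reference for MacOS syscalls
I haven't thoroughly investigated the details of this particular stack argument, but I'd imagine it works along the same lines as the thread_userstackdefault / load_threadstack approach.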

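I believe your suspicion about the Swift runtime's responsibility may come from the frequent mentions of data structures (like the Swift struct, no pun intended) being stored on the stack. That is, by the way, an implementation detail and not a feature the runtime guarantees.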

Update:
Here's an example main.swift command line program illustrating the idea.

import Foundation

struct testStruct {
    var a: Int
}

class testClass {
}

func testLocalVariables() {
    print("main thread function with local varablies")
    var struct1 = testStruct(a: 5)
    withUnsafeBytes(of: &struct1) { print($0) }
    var classInstance = testClass()
    print(NSString(format: "%p", unsafeBitCast(classInstance, to: Int.self)))
}
testLocalVariables()

print("Main thread", Thread.isMainThread)
var struct1 = testStruct(a: 5)
var struct1Copy = struct1

withUnsafeBytes(of: &struct1) { print($0) }
withUnsafeBytes(of: &struct1Copy) { print($0) }

var string = "testString"
var stringCopy = string

withUnsafeBytes(of: &string) { print($0) }
withUnsafeBytes(of: &stringCopy) { print($0) }

var classInstance = testClass()
var classInstanceAssignment = classInstance
var classInstance2 = testClass()

print(NSString(format: "%p", unsafeBitCast(classInstance, to: Int.self)))
print(NSString(format: "%p", unsafeBitCast(classInstanceAssignment, to: Int.self)))
print(NSString(format: "%p", unsafeBitCast(classInstance2, to: Int.self)))

DispatchQueue.global(qos: .background).async {
    print("Child thread", Thread.isMainThread)
    withUnsafeBytes(of: &struct1) { print($0) }
    withUnsafeBytes(of: &struct1Copy) { print($0) }
    withUnsafeBytes(of: &string) { print($0) }
    withUnsafeBytes(of: &stringCopy) { print($0) }
    print(NSString(format: "%p", unsafeBitCast(classInstance, to: Int.self)))
    print(NSString(format: "%p", unsafeBitCast(classInstanceAssignment, to: Int.self)))
    print(NSString(format: "%p", unsafeBitCast(classInstance2, to: Int.self)))
}

//Keep main thread alive indefinitely so that process doesn't exit
CFRunLoopRun()

My output looks like this:

main thread function with local varablies
UnsafeRawBufferPointer(start: 0x00007ffeefbfeff8, count: 8)
0x7fcd0940cd30
Main thread true
UnsafeRawBufferPointer(start: 0x000000010058a6f0, count: 8)
UnsafeRawBufferPointer(start: 0x000000010058a6f8, count: 8)
UnsafeRawBufferPointer(start: 0x000000010058a700, count: 16)
UnsafeRawBufferPointer(start: 0x000000010058a710, count: 16)
0x7fcd0940cd40
0x7fcd0940cd40
0x7fcd0940c900
Child thread false
UnsafeRawBufferPointer(start: 0x000000010058a6f0, count: 8)
UnsafeRawBufferPointer(start: 0x000000010058a6f8, count: 8)
UnsafeRawBufferPointer(start: 0x000000010058a700, count: 16)
UnsafeRawBufferPointer(start: 0x000000010058a710, count: 16)
0x7fcd0940cd40
0x7fcd0940cd40
0x7fcd0940c900

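Now we can observe some interesting things:

1. Class instances clearly occupy a different part of memory than structs.
2. Assigning a struct to a new variable makes a copy at a new memory address.
3. Assigning a class instance just copies the pointer.
4. Both the main thread and the child thread point to exactly the same memory when referring to the global structs.
5. Strings do have a struct container.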

Update 2 - proof of 4^: we can actually inspect the memory underneath:

x 0x10058a6f0 -c 8
0x10058a6f0: 05 00 00 00 00 00 00 00                          ........
x 0x10058a6f8 -c 8
0x10058a6f8: 05 00 00 00 00 00 00 00                          ........

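So this is definitely the actual raw struct data, i.e. the struct itself.

Update 3

I added a testLocalVariables() function to distinguish between a Swift struct defined as a global variable and one defined as a local variable. In this case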

x 0x00007ffeefbfeff8 -c 8
0x7ffeefbfeff8: 05 00 00 00 00 00 00 00                          ........

it clearly lives on the thread stack.

Last but not least, when I do this in lldb:

re read rsp
rsp = 0x00007ffeefbfefc0  from main thread
re read rsp
rsp = 0x000070000291ea40  from child thread

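it yields a different value for each thread, so the thread stacks are clearly distinct.

Digging further
There's a handy lldb command that sheds some light on what's going on.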

memory region 0x000000010058a6f0
[0x000000010053d000-0x000000010058b000) rw- __DATA

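So the global structs sit in a preallocated, writable __DATA memory page of the executable (the same one where global variables live). The same command for the class address 0x7fcd0940cd40 isn't as spectacular (I reckon because that is a dynamically allocated heap region). The same goes for the thread stack address 0x7ffeefbfefc0, which is clearly not one of the regions mapped from the executable image.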

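Fortunately there is one last tool for going further down the rabbit hole.
vmmap -v -purge pid confirms that classes live in the MALLOC_ed heap, and likewise the thread stack (at least for the main thread) can be cross-referenced to Stack.

A somewhat related question is also here.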

HTH

assembly

multithreading

runtime

ios

swift