Why does subprocess getoutput add a non-constant overhead?

On Python 3.6.7 on macOS, I ran this test to time subprocess.getoutput:

In [11]: %timeit for _ in range(100000): x = 1                                                                                                         
1.97 ms ± 48.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [12]: %timeit subprocess.getoutput('python -c "for _ in range(100000): x=1"')                                                                       
42.1 ms ± 1.01 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [13]: %timeit for _ in range(1000000): x = 1                                                                                                        
19.3 ms ± 128 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [14]: %timeit subprocess.getoutput('python -c "for _ in range(1000000): x=1"')                                                                      
92.5 ms ± 3.19 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [15]: %timeit for _ in range(10000000): x = 1                                                                                                       
189 ms ± 4.27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [16]: %timeit subprocess.getoutput('python -c "for _ in range(10000000): x=1"')                                                                     
551 ms ± 11.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [17]: %timeit for _ in range(100000000): x = 1                                                                                                      
1.94 s ± 51.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [18]: %timeit subprocess.getoutput('python -c "for _ in range(100000000): x=1"')                                                                    
5.25 s ± 26.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

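I was surprised to find that the overhead of the subprocess call appears to be proportional to the time the inner command takes. I expected the overhead to be roughly constant. Why is this? Thanks!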

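As @user2357112 supports Monica suggested in the comments, if I use time on the command line instead of %timeit, the difference seems to disappear. This suggests the gap comes from how function-local variables are handled: %timeit runs the statement inside a function, where x is a fast local variable, while python -c runs it at module level, where x is a global.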

$ time python -c "for _ in range(10000000): x=1"
real    0m0.592s
user    0m0.551s
sys 0m0.034s
$ time python -c "import subprocess; subprocess.getoutput('python -c \"for _ in range(10000000): x=1\"')"
real    0m0.644s
user    0m0.590s
sys 0m0.046s
$ time python -c "for _ in range(100000000): x=1"
real    0m5.104s
user    0m5.053s
sys 0m0.039s
$ time python -c "import subprocess; subprocess.getoutput('python -c \"for _ in range(100000000): x=1\"')"
real    0m5.161s
user    0m5.098s
sys 0m0.051s
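
To check the local-versus-global explanation without involving subprocess at all, here is a minimal sketch that runs the same loop both ways in one process. The helper names (opnames, time_module_level, time_function_local) are made up for illustration, and wrapping the loop in a function called inner only approximates what timeit and %timeit do internally.

```python
import dis
import timeit

LOOP = "for _ in range(100000): x = 1"


def opnames(code):
    """Return the set of bytecode instruction names used by a code object."""
    return {ins.opname for ins in dis.get_instructions(code)}


# Module-level code, which is what `python -c "..."` executes.
module_code = compile(LOOP, "<module-level>", "exec")

# The same loop wrapped in a function, roughly how timeit runs a statement.
namespace = {}
exec("def inner():\n    " + LOOP, namespace)
inner = namespace["inner"]

module_ops = opnames(module_code)
function_ops = opnames(inner.__code__)


def time_module_level(number=20):
    """Average seconds per run of the loop executed as module-level code."""
    return timeit.timeit(lambda: exec(module_code, {}), number=number) / number


def time_function_local(number=20):
    """Average seconds per run of the loop executed inside a function."""
    return timeit.timeit(inner, number=number) / number


if __name__ == "__main__":
    # Module-level assignment goes through a dictionary (STORE_NAME);
    # inside a function it writes to a fixed local slot (STORE_FAST).
    print("module-level stores:", sorted(op for op in module_ops if op.startswith("STORE")))
    print("function stores:    ", sorted(op for op in function_ops if op.startswith("STORE")))
    print("module-level loop: %.4f s" % time_module_level())
    print("function loop:     %.4f s" % time_function_local())
```

If the explanation holds, the module-level loop should be consistently slower than the function version, and that gap scales with the iteration count, which would account for an "overhead" that grows with the loop size. The exact ratio depends on the Python version and machine.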