Reading stdout from a subprocess in real time

Question

Given this code snippet:

from subprocess import Popen, PIPE, CalledProcessError


def execute(cmd):
    with Popen(cmd, shell=True, stdout=PIPE, bufsize=1, universal_newlines=True) as p:
        for line in p.stdout:
            print(line, end='')

    if p.returncode != 0:
        raise CalledProcessError(p.returncode, p.args)

base_cmd = [
    "cmd", "/c", r"d:\virtual_envs\py362_32\Scripts\activate",
    "&&"
]
cmd1 = " ".join(base_cmd + ['python -c "import sys; print(sys.version)"'])
cmd2 = " ".join(base_cmd + ["python -m http.server"])

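If I run execute(cmd1), the output is printed without any problems.

But if I run execute(cmd2) instead, nothing is printed. Why is that, and how can I fix it so that I can see http.server's output in real time?

Also, how is for line in p.stdout evaluated internally? Is it some sort of endless loop until it reaches stdout EOF or something?

This topic has been addressed many times on SO, but I haven't found a Windows solution. The snippet above is code from this answer, and I'm running http.server from a virtualenv (Python 3.6.2, 32-bit, on Win7).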

Answer 1

With this code, you can't see the output in real time because of buffering:

for line in p.stdout:
    print(line, end='')

But if you use p.stdout.readline() it should work:

while True:
  line = p.stdout.readline()
  if not line: break
  print(line, end='')

For details, see the corresponding Python bug discussion.

Update: here you can find an almost identical problem with various solutions on Stack Overflow.
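
A minimal sketch of the readline() loop above wrapped into a complete function, assuming Python 3. The name stream_lines and the stand-in child command are my own illustration, not part of the original snippet; the return code check mirrors the one in the question.

```python
import sys
from subprocess import Popen, PIPE, CalledProcessError


def stream_lines(args):
    """Yield each line of the child's stdout as soon as readline() returns it."""
    with Popen(args, stdout=PIPE, bufsize=1, universal_newlines=True) as p:
        while True:
            line = p.stdout.readline()
            if not line:  # an empty string means EOF: the child closed stdout
                break
            yield line
    # leaving the with-block waits for the child, so returncode is set here
    if p.returncode != 0:
        raise CalledProcessError(p.returncode, p.args)


# Stand-in child (an assumption for this sketch): prints two lines, unbuffered.
child = [sys.executable, "-u", "-c", "print('one'); print('two')"]
for line in stream_lines(child):
    print(line, end="")
```

Because it is a generator, the caller sees each line while the child is still running rather than after it exits.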

Answer 2

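If you want to read continuously from a running subprocess, you have to make that process's output unbuffered. Since your subprocess is a Python program, this can be done by passing -u to the interpreter: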

python -u -m http.server

This is how it looks on a Windows box.
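
To make the effect of -u visible, here is a rough sketch that records when each line reaches the parent. The child one-liner and the helper name arrival_times are assumptions made for illustration; the real target would be http.server.

```python
import sys
import time
from subprocess import Popen, PIPE

# Stand-in child (an assumption): prints, pauses half a second, prints again.
CHILD_CODE = "import time; print('first'); time.sleep(0.5); print('second')"


def arrival_times(extra_flags):
    """Return (seconds since launch, line) for every line the child prints."""
    args = [sys.executable, *extra_flags, "-c", CHILD_CODE]
    start = time.monotonic()
    arrivals = []
    with Popen(args, stdout=PIPE, universal_newlines=True) as p:
        for line in p.stdout:
            arrivals.append((time.monotonic() - start, line.strip()))
    return arrivals


# With -u the two lines should arrive roughly half a second apart.
for elapsed, line in arrival_times(["-u"]):
    print("{:.2f}s {}".format(elapsed, line))
```

Calling arrival_times([]) instead will typically show both lines arriving together once the child exits, since a piped stdout is block-buffered by default (unless PYTHONUNBUFFERED is set in the environment).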

Answer 3

How for line in p.stdout is been evaluated internally? is it some sort of endless loop till reaches stdout eof or something?

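p.stdout is a (blocking) buffer. When you read from an empty buffer, you are blocked until something is written to it. Once something is in there, you get the data and execute the loop body.

Think of how tail -f works on Linux: it waits until something is written to the file, and when that happens it echoes the new data to the screen. What happens when there is no data? It waits. So when your program reaches this line, it waits for data and then processes it.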
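
The blocking behaviour can be sketched without a subprocess at all, using a plain OS pipe and a writer thread. This is only an illustration under my own assumptions (the function name and the 0.2 second delay are made up); p.stdout is likewise a file object wrapped around the read end of a pipe.

```python
import os
import threading
import time


def demo_blocking_read():
    """Iterate over the read end of a pipe while a thread slowly writes to it."""
    read_fd, write_fd = os.pipe()

    def writer():
        with os.fdopen(write_fd, "w") as w:
            for i in range(3):
                time.sleep(0.2)  # the reader is blocked during this pause
                w.write("line {}\n".format(i))
                w.flush()
        # leaving the with-block closes the write end, which the reader sees as EOF

    thread = threading.Thread(target=writer)
    thread.start()
    received = []
    with os.fdopen(read_fd, "r") as r:
        for line in r:  # blocks while the pipe is empty, ends at EOF
            received.append(line.rstrip("\n"))
    thread.join()
    return received


print(demo_blocking_read())
```

So the for loop is not a busy loop: each iteration sleeps inside a read call, and the loop ends only when the writing side closes the pipe.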

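Since your code works in general, but not when Python runs a module (-m http.server), it has to be related to this somehow. The http.server module probably buffers its output. Try adding the -u parameter to Python to run the process unbuffered: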

-u     : unbuffered binary stdout and stderr; also PYTHONUNBUFFERED=x
         see man page for details on internal buffering relating to '-u'

Also, you might want to try changing your loop to for line in iter(lambda: p.stdout.read(1), ''):, as this reads only 1 character at a time before processing.

Update: the full loop code is

import sys

# read one character at a time; '' (EOF) ends the loop
for line in iter(lambda: p.stdout.read(1), ''):
    sys.stdout.write(line)
    sys.stdout.flush()

Also, you are passing your command as a string. Try passing it as a list, with each element in its own slot:

cmd = ['python', '-m', 'http.server', ..]
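
As a small, hedged illustration of the list form: each list element reaches the child as exactly one argument, so no manual quoting is needed. The echoing child below is a stand-in I made up for the demonstration.

```python
import sys
from subprocess import Popen, PIPE

# Stand-in child (an assumption): echoes its first argument back.
cmd = [sys.executable, "-u", "-c", "import sys; print(sys.argv[1])", "hello world"]

with Popen(cmd, stdout=PIPE, universal_newlines=True) as p:
    output = p.stdout.read()

# "hello world" arrives as a single argument even though it contains a space
print(output, end="")
```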

Answer 4

I think the main problem is that http.server somehow logs its output to stderr. Here I have an example with asyncio, reading the data from either stdout or stderr.
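
To check the stderr claim, here is a rough in-process sketch: it serves a single request and captures whatever the handler writes to sys.stderr. The helper name capture_request_log is hypothetical, and the sketch assumes a local socket on 127.0.0.1 can be opened.

```python
import contextlib
import http.client
import io
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler


def capture_request_log():
    """Serve one request in-process and return what was written to stderr."""
    server = HTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)  # port 0: any free port
    port = server.server_address[1]
    captured = io.StringIO()
    with contextlib.redirect_stderr(captured):
        worker = threading.Thread(target=server.serve_forever)
        worker.start()
        try:
            conn = http.client.HTTPConnection("127.0.0.1", port)
            conn.request("GET", "/")
            conn.getresponse().read()
            conn.close()
        finally:
            server.shutdown()
            worker.join()
            server.server_close()
    return captured.getvalue()


# The access log line for the request shows up in the captured stderr text.
print(repr(capture_request_log()))
```

This is why reading only the child's stdout pipe misses the per-request log lines.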

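My first attempt was to use asyncio, a nice API that has existed since Python 3.4. Later I found a simpler solution, so you can choose; both of them should work.

asyncio as a solution

Under the hood, asyncio uses IOCP, a Windows API for asynchronous I/O.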

# inspired by https://pymotw.com/3/asyncio/subprocesses.html

import asyncio
import sys
import time

if sys.platform == 'win32':
    loop = asyncio.ProactorEventLoop()
    asyncio.set_event_loop(loop)

async def run_webserver():
    # start the webserver unbuffered (-u), with stdout and stderr piped
    print('launching process')
    proc = await asyncio.create_subprocess_exec(
        sys.executable, '-u', '-mhttp.server',
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE
    )

    print('process started {}'.format(proc.pid))
    while 1:
        # wait either for stderr or stdout and loop over the results
        for line in asyncio.as_completed([proc.stderr.readline(), proc.stdout.readline()]):
            print('read {!r}'.format(await line))

event_loop = asyncio.get_event_loop()
try:
    event_loop.run_until_complete(run_webserver())
finally:
    event_loop.close()
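
One caveat with the loop above: each pass waits for a line from both pipes, so a stream that stays quiet can hold back the other. A possible variant, sketched here assuming Python 3.8 or later (asyncio.run, and the proactor loop as the Windows default), gives each pipe its own reader. The names pump and run_child and the short-lived child are my own stand-ins for http.server.

```python
import asyncio
import sys

# Stand-in child (an assumption): one line on stdout, one on stderr.
CHILD_CODE = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"


async def pump(stream, label, on_line):
    """Forward lines from one pipe to on_line until EOF."""
    while True:
        line = await stream.readline()
        if not line:  # b'' signals EOF
            break
        on_line(label, line.decode().rstrip())


async def run_child(on_line):
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-u", "-c", CHILD_CODE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # one reader per pipe, so a quiet stream never stalls the other
    await asyncio.gather(
        pump(proc.stdout, "stdout", on_line),
        pump(proc.stderr, "stderr", on_line),
    )
    return await proc.wait()


# print receives (label, text) for each line as soon as it is read
asyncio.run(run_child(print))
```

The relative order of stdout and stderr lines is not guaranteed, since the two pipes are independent.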

Redirecting stderr to stdout

Based on your example, this is a really simple solution. It just redirects stderr to stdout, and only stdout is read.

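from subprocess import Popen, PIPE, STDOUT

def execute(cmd):
    # merge stderr into stdout so a single pipe carries everything
    with Popen(cmd, stdout=PIPE, stderr=STDOUT, bufsize=1, universal_newlines=True) as p:
        while 1:
            print('waiting for a line')
            line = p.stdout.readline()
            if not line:  # empty string means EOF: the child closed its output
                break
            print(line, end='')

cmd2 = ["python", "-u", "-m", "http.server"]

execute(cmd2)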

Answer 5

You can implement unbuffered behavior at the OS level.

On Linux, you can wrap your existing command line with stdbuf:

stdbuf -i0 -o0 -e0 YOURCOMMAND

Or on Windows, you can wrap your existing command line with winpty:

winpty.exe -Xallow-non-tty -Xplain YOURCOMMAND

I don't know of an OS-neutral tool.
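
Lacking such a tool, one option is to pick the wrapper in Python before launching the process. This is only a sketch: wrap_unbuffered is a hypothetical helper, it uses exactly the flags shown above, and it assumes the wrapper executable is on PATH.

```python
import shutil
import sys


def wrap_unbuffered(cmd, platform=sys.platform, which=shutil.which):
    """Prefix cmd with an OS-level unbuffering wrapper when one is installed."""
    if platform.startswith("linux") and which("stdbuf"):
        return ["stdbuf", "-i0", "-o0", "-e0"] + list(cmd)
    if platform == "win32" and which("winpty.exe"):
        return ["winpty.exe", "-Xallow-non-tty", "-Xplain"] + list(cmd)
    return list(cmd)  # no known wrapper found: run the command unchanged


print(wrap_unbuffered(["YOURCOMMAND"]))
```

The platform and which parameters exist only so the choice can be exercised without the tools installed; the resulting list can be passed straight to Popen.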
