Running external command in worker process and capturing output to a single file
Here is my naive approach to calling an external command in worker processes and appending all of the command's output to a single file. Sample code:
from concurrent.futures import ProcessPoolExecutor
from functools import partial
import multiprocessing
import subprocess

def worker_process_write_output(fh, lock, mylist):
    output = subprocess.run("dir /b", shell=True, stdout=subprocess.PIPE, universal_newlines=True).stdout
    with lock:  # Need lock to prevent multiple processes writing to the file simultaneously
        fh.write(mylist)
        fh.writelines(output)

if __name__ == '__main__':
    with open("outfile.txt", "a") as fh:  # Opening the file in the main process to avoid the overhead of opening & closing it in each worker process
        mylist = [1, 2, 3, 4]
        with ProcessPoolExecutor() as executor:
            lock = multiprocessing.Manager().Lock()
            executor.map(partial(worker_process_write_output, fh, lock), mylist)
This code hangs when run. What are the mistakes, and how should they be corrected? My guesses:
1. The file handle can't be passed to the worker processes; the file needs to be opened and closed inside each worker. I'm not sure why.
2. subprocess.run can't be used in a worker process; I need os.popen("dir /b").read() or something similar.
3. I'm not sure whether the lock is needed at all, and if it is, whether this is the right kind of lock.
File handles can be passed between processes, so I'm not sure why your code deadlocks on the file handle. That said, I assume you are doing a significant amount of work in the run() call, so the overhead of opening and closing the file once per task shouldn't matter much. If there isn't much work being done, multiprocessing is probably not the best choice in the first place, since it carries serious overhead of its own.

Also, fh.write(mylist) raises a TypeError: write() argument must be str, not int, so the value needs to be converted with fh.write(str(mylist)).
Here is a working version:
import multiprocessing
import subprocess
from concurrent.futures import ProcessPoolExecutor
from functools import partial

def worker_process_write_output(lock, mylist):
    output = subprocess.run("dir /b", shell=True, stdout=subprocess.PIPE,
                            universal_newlines=True).stdout
    with lock:
        with open("outfile.txt", "a") as fh:
            fh.write(str(mylist))
            fh.writelines(output)

if __name__ == '__main__':
    mylist = [1, 2, 3, 4]
    with ProcessPoolExecutor() as executor:
        lock = multiprocessing.Manager().Lock()
        executor.map(partial(worker_process_write_output, lock), mylist)
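An alternative worth considering: if the workers simply return their output instead of writing it, only the main process ever touches the file, and no lock (or Manager) is needed at all. The sketch below assumes this pattern; it substitutes a portable "echo" command for the Windows-only "dir /b" so it runs anywhere.

```python
from concurrent.futures import ProcessPoolExecutor
import subprocess

def run_command(tag):
    # Each worker only runs the external command and returns its output;
    # no shared file handle or lock is needed in the workers.
    result = subprocess.run(
        "echo item-{}".format(tag),  # portable stand-in for "dir /b"
        shell=True, stdout=subprocess.PIPE, universal_newlines=True,
    )
    return tag, result.stdout

if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        results = executor.map(run_command, [1, 2, 3, 4])
        # Only the main process writes, so output can never interleave.
        with open("outfile.txt", "a") as fh:
            for tag, output in results:
                fh.write(str(tag))
                fh.write(output)
```

Since executor.map yields results in submission order, this also makes the output file deterministic, which the lock-based version does not guarantee.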