Why are results different between using threads and processes?

Question

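While practicing threads and processes in Python, I noticed that the same function prints different results depending on whether I run it in threads or in processes. I don't understand why, so I'm posting the code together with its output. Also, how can I make the process version produce the same result as the thread version?

With threads, everything works as expected: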

import os, time, random, threading, multiprocessing

list = ['python', 'django', 'tornado', 'flask', 'bs5', 'requests', 'uvloop']

new_lists = []

def work():
    if len(list) == 0:
        return
    data = random.choice(list)
    list.remove(data)
    new_data = '%s_new' % data
    new_lists.append(new_data)
    time.sleep(1)

if __name__ == '__main__':
    start = time.time()
    print('old list lenc is %s' % len(list))

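    threads = []
    for i in range(len(list)):
        t = threading.Thread(target=work)
        t.start()
        threads.append(t)
    # Wait for every thread, not just the last one started.
    for t in threads:
        t.join()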

    print('old list:', list)
    print('new list', new_lists, len(new_lists))
    print('time is %s' % (time.time() - start))

This prints (the result is what I expect):

old list lenc is 7
old list: []
new list ['uvloop_new', 'python_new', 'bs5_new', 'tornado_new', 'django_new', 'requests_new', 'flask_new'] 7
time is 1.0153822898864746

However, when I replace the threads with processes, the result is wrong:

import os, time, random, threading, multiprocessing

list = ['python', 'django', 'tornado', 'flask', 'bs5', 'requests', 'uvloop']

new_lists = []


def work():
    if len(list) == 0:
        return
    data = random.choice(list)
    list.remove(data)
    new_data = '%s_new' % data
    new_lists.append(new_data)
    time.sleep(1)


if __name__ == '__main__':
    start = time.time()
    print('old list lenc is %s' % len(list))

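    procs = []
    for i in range(len(list)):
        t = multiprocessing.Process(target=work)
        t.start()
        procs.append(t)
    # Wait for every process, not just the last one started.
    for t in procs:
        t.join()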

    print('old list:', list)
    print('new list', new_lists, len(new_lists))
    print('time is %s' % (time.time() - start))

This prints (not the result I expected):

old list lenc is 7
old list: ['python', 'django', 'tornado', 'flask', 'bs5', 'requests', 'uvloop']
new list [] 0
time is 1.4266910552978516

Answer 1

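Threads share memory; processes do not. Each process started with multiprocessing works on its own copy of the module's global variables, so when a child process modifies `list` or `new_lists`, the change is lost when that process exits and the parent's lists are never touched. Mutating global variables therefore cannot work across processes, and you should pick the option that fits your project. If every parallel function needs to operate on the same shared data, use threads. With multiprocessing, you can instead use a Queue to pass data between processes. https://docs.python.org/3/library/queue.html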
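
The Queue approach can be sketched roughly as follows. This is only a minimal illustration, assuming each worker can be handed its item as an argument instead of picking one from a shared global. The names `run`, `results`, and `names` are made up for this example and do not come from the question.

import multiprocessing
import time


def work(item, results):
    # Runs in a child process. It cannot modify the parent's lists,
    # so it sends its result back through the queue instead.
    time.sleep(1)  # simulate slow work, as in the question
    results.put('%s_new' % item)


def run(items):
    # Hypothetical helper: start one process per item and collect the results.
    results = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=work, args=(item, results))
             for item in items]
    for p in procs:
        p.start()
    # Read one result per process before joining. The timeout keeps the
    # parent from blocking forever if a child dies without putting anything.
    new_items = [results.get(timeout=30) for _ in procs]
    for p in procs:
        p.join()
    return new_items


if __name__ == '__main__':
    names = ['python', 'django', 'tornado', 'flask', 'bs5', 'requests', 'uvloop']
    start = time.time()
    new_names = run(names)
    print('new list', new_names, len(new_names))
    print('time is %s' % (time.time() - start))

The results arrive in whatever order the processes finish, so the order of the new list can differ between runs, just as it does in the thread version. The parent's original list is left unchanged here; if it must end up empty, the parent can clear it itself after collecting the results.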

python

parallel-processing

concurrency

multithreading

process