python peewee multiprocessing pool error
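Stack: Python 3.4, PostgreSQL 9.4.7, peewee 2.8.0, psycopg2 2.6.1 (dt dec pq3 ext lo64)
I need to be able to talk to the PostgreSQL database (select, insert, update) from each worker. I am using Python's multiprocessing Pool to create 10 workers; each worker makes a curl call and then talks to the database depending on what it finds.
After reading a few threads on the internet, I thought connection pooling was the way to go, so I put the code below at the top of my models.py file. I have my doubts about connection pooling, though, because my understanding is that reusing database connections across threads is a no-go.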
db = PooledPostgresqlExtDatabase(
    'uc',
    max_connections=32,
    stale_timeout=300,  # 5 minutes.
    **{'password': cfg['psql']['pass'],
       'port': cfg['psql']['port'],
       'register_hstore': False,
       'host': cfg['psql']['host'],
       'user': cfg['psql']['user']})
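Now to the point. I am getting random SQL errors when talking to the database from some of the workers. Before I introduced peewee into the mix, I was using the bare "psycopg2" library without a wrapper, and I created a new database connection for each worker. That setup gave no errors.
A sample of the errors I get: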
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
self.commit()
File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3468, in commit
self.get_conn().commit()
psycopg2.DatabaseError: error with no message from the libpq
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.4/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.4/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/dan/dev/link-checker/crawler/manager.py", line 17, in startWorker
wrk.perform()
File "/home/dan/dev/link-checker/crawler/worker.py", line 49, in perform
self.pullUrls()
File "/home/dan/dev/link-checker/crawler/worker.py", line 63, in pullUrls
newUrlDict = UrlManager.createUrlWithInProgress(self._url['crawl'], source_url, self._url['base'])
File "/home/dan/dev/link-checker/crawler/models.py", line 152, in createUrlWithInProgress
newUrl = Url.create(**newUrlDict)
File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 4494, in create
inst.save(force_insert=True)
File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 4680, in save
pk_from_cursor = self.insert(**field_dict).execute()
File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3213, in execute
cursor = self._execute()
File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 2628, in _execute
return self.database.execute_sql(sql, params, self.require_commit)
File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
self.commit()
File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3285, in __exit__
reraise(new_type, new_type(*exc_args), traceback)
File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 127, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.4/dist-packages/playhouse/postgres_ext.py", line 377, in execute_sql
self.commit()
File "/usr/local/lib/python3.4/dist-packages/peewee.py", line 3468, in commit
self.get_conn().commit()
peewee.DatabaseError: error with no message from the libpq
I also tailed the PostgreSQL log file, and this is what I saw:
2016-04-19 20:34:23 EDT [26824-3] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-4] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-5] uc_user@uc WARNING: there is no transaction in progress
2016-04-19 20:34:23 EDT [26824-6] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-7] uc_user@uc WARNING: there is no transaction in progress
2016-04-19 20:34:23 EDT [26824-8] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:34:23 EDT [26824-9] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-1] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-2] uc_user@uc WARNING: there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-3] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-4] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-5] uc_user@uc WARNING: there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-6] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-7] uc_user@uc WARNING: there is no transaction in progress
2016-04-19 20:35:14 EDT [26976-8] uc_user@uc WARNING: there is already a transaction in progress
2016-04-19 20:35:14 EDT [26976-9] uc_user@uc WARNING: there is no transaction in progress
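My hunch is that connection pooling and multiprocessing do not play well together. Has anyone done this successfully without errors? If so, could you show me an example or give me some useful advice?
Do I need to explicitly create a new connection with peewee in each worker, or is there an easier way to use peewee with the multiprocessing Pool library?
Thanks for your answers and for reading.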
I got it working. I took all the code in the models.py file that is used by the workers and wrapped it in "with db.execution_context() as ctx:", as described on this page:
http://docs.peewee-orm.com/en/latest/peewee/database.html#advanced-connection-management
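For anyone who wants to see the shape of that fix without a PostgreSQL server at hand, here is a minimal sketch of the same idea: each unit of work opens its own connection inside the worker process, commits or rolls back, and closes it again, so no connection is shared between forked processes. A likely cause of the errors in the question is several workers using one inherited connection, which would also be consistent with the "already a transaction in progress" warnings repeating under a single backend PID. The sketch uses the standard-library sqlite3 module purely as a stand-in for peewee and psycopg2. The names execution_context (a hand-written context manager here, not peewee's method), create_schema, record_url, crawl, count_urls and the url table are all hypothetical.

```python
import contextlib
import multiprocessing
import os
import sqlite3


@contextlib.contextmanager
def execution_context(db_path):
    """Open a fresh connection for one unit of work, then commit or roll back and close."""
    # Assumption: sqlite3 stands in for the real PostgreSQL driver.
    # The connection is created in whichever process calls this function,
    # so a worker never reuses a connection opened by its parent.
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()


def create_schema(db_path):
    """Create the hypothetical url table if it does not exist yet."""
    with execution_context(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS url (address TEXT, worker_pid INTEGER)"
        )


def record_url(task):
    """Worker task: store one address using a connection private to this call."""
    db_path, address = task
    with execution_context(db_path) as conn:
        conn.execute(
            "INSERT INTO url (address, worker_pid) VALUES (?, ?)",
            (address, os.getpid()),
        )
    return address


def count_urls(db_path):
    """Return the number of stored rows."""
    with execution_context(db_path) as conn:
        return conn.execute("SELECT COUNT(*) FROM url").fetchone()[0]


def crawl(db_path, addresses, processes=4):
    """Fan the addresses out to a process pool and return them once stored."""
    with multiprocessing.Pool(processes) as pool:
        return pool.map(record_url, [(db_path, a) for a in addresses])
```

To try it, call create_schema(path) once and then crawl(path, list_of_addresses) from under an `if __name__ == "__main__":` guard. With peewee 2.x the equivalent is to wrap each worker's queries in `with db.execution_context():` as described above. As far as I know, peewee 3.x dropped that method and the closest replacement is `db.connection_context()`, so check the documentation for the version you have installed.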