Django Postgres memory leak
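I have a custom Django (v2.0.0) management command that starts background job executors in multiple threads, and it seems to be leaking memory.
The command can be started like this: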
./manage.py start_job_executer --thread=1
Each thread runs a while True loop that fetches jobs from a PostgreSQL table.
To pick up a job and change its status atomically, I use a transaction:
from django.db import transaction

# atomic transaction to temporarily lock the db access and to
# get the oldest job from db with column status = pending
with transaction.atomic():
    job = Job.objects.select_for_update() \
        .filter(status=Job.STATUS['pending']) \
        .order_by('created_at').first()
    if job:
        job.status = Job.STATUS['executing']
        job.save()
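The memory allocated by this custom Django command appears to grow continuously.
I tried to find the cause of the leak with tracemalloc, by starting a background thread that inspects memory allocations: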
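import tracemalloc
from time import sleep

def check_memory(self):
    while True:
        s1 = tracemalloc.take_snapshot()
        sleep(10)
        s2 = tracemalloc.take_snapshot()
        # log the ten source lines whose allocations changed the most
        for alog in s2.compare_to(s1, 'lineno')[:10]:
            log.info(alog)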
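For anyone reproducing this, here is a minimal, standalone sketch of the same snapshot-diff technique. The names top_growth, leaky and retained are made up for illustration and are not part of the original command. Note that tracemalloc.take_snapshot() raises a RuntimeError unless tracing has been started first, which the snippet above presumably does elsewhere.

```python
import tracemalloc


def top_growth(action, limit=5):
    """Run action() between two snapshots and return the largest diffs."""
    if not tracemalloc.is_tracing():
        tracemalloc.start()  # snapshots require tracing to be active
    before = tracemalloc.take_snapshot()
    action()
    after = tracemalloc.take_snapshot()
    # Entries are grouped by source line and sorted by absolute size change.
    return after.compare_to(before, 'lineno')[:limit]


retained = []  # hypothetical stand-in for any structure that keeps growing


def leaky():
    for i in range(10000):
        retained.append('SELECT * FROM job WHERE id = %d' % i)


stats = top_growth(leaky)
for stat in stats:
    print(stat)
```

The line that builds the retained strings should appear at the top of the output, in the same way operations.py:222 does in the logs below.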
It produced the following log:
01.04.20 13:50:06 operations.py:222: size=23.7 KiB (+23.7 KiB), count=66 (+66), average=367 B
01.04.20 13:50:36 operations.py:222: size=127 KiB (+43.7 KiB), count=353 (+122), average=367 B
01.04.20 13:51:04 operations.py:222: size=251 KiB (+66.7 KiB), count=699 (+186), average=367 B
01.04.20 13:51:31 operations.py:222: size=379 KiB (+68.9 KiB), count=1056 (+192), average=367 B
01.04.20 13:51:57 operations.py:222: size=495 KiB (+60.3 KiB), count=1380 (+168), average=367 B
It looks like /usr/local/lib/python3.5/dist-packages/django/db/backends/postgresql/operations.py:222 is not releasing memory.
With 1 thread the leak is slow, but with 8 threads it is much worse:
01.04.20 13:07:51 operations.py:222: size=68.3 KiB (+68.3 KiB), count=191 (+191), average=366 B
01.04.20 13:08:56 operations.py:222: size=770 KiB (+140 KiB), count=2151 (+390), average=367 B
01.04.20 13:10:07 operations.py:222: size=1476 KiB (+138 KiB), count=4122 (+386), average=367 B
01.04.20 13:36:22 operations.py:222: size=17.3 MiB (+138 KiB), count=49506 (+385), average=367 B
01.04.20 13:48:16 operations.py:222: size=24.5 MiB (+136 KiB), count=69993 (+379), average=367 B
This is the code around line 222 of /usr/local/lib/python3.5/dist-packages/django/db/backends/postgresql/operations.py:
def last_executed_query(self, cursor, sql, params):
    # http://initd.org/psycopg/docs/cursor.html#cursor.query
    # The query attribute is a Psycopg extension to the DB API 2.0.
    if cursor.query is not None:
        return cursor.query.decode()  # this is line 222!
    return None
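I have no idea how to fix this. Any ideas?
Also posted here: https://code.djangoproject.com/ticket/31419#ticket
I am considering spawning a new process for each job that needs to run, so that the memory is released when the process exits. That would probably work, but it seems like overkill.
Thanks in advance.
Update
I was using Django 2.0, so I upgraded to Django 3.0.5 (the latest stable release), but unfortunately the problem persists.
The new logs are below: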
01.04.20 20:15:06 operations.py:235: size=977 KiB (+53.9 KiB), count=2750 (+152), average=364 B
01.04.20 20:15:28 operations.py:235: size=1070 KiB (+50.1 KiB), count=3012 (+141), average=364 B
01.04.20 20:15:53 operations.py:235: size=1156 KiB (+43.7 KiB), count=3255 (+123), average=364 B
01.04.20 20:16:19 operations.py:235: size=1245 KiB (+44.7 KiB), count=3507 (+126), average=364 B
01.04.20 20:20:23 operations.py:235: size=2154 KiB (+44.3 KiB), count=6065 (+125), average=364 B
When settings.DEBUG = True, Django keeps a reference to every executed query in a ring buffer. The Django documentation notes:
It is also important to remember that when running with DEBUG turned on, Django will remember every SQL query it executes. This is useful when you’re debugging, but it’ll rapidly consume memory on a production server.
Setting DEBUG = False should solve your problem.
If this is a problem during development, you can clear the ring buffer manually:
from django.conf import settings
from django.db import reset_queries

if settings.DEBUG:
    reset_queries()
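To illustrate why memory grows with every query and what clearing the buffer does, here is a simplified simulation that does not use Django. QueryLog is a hypothetical stand-in, and both the bounded deque and the limit of 9000 entries are assumptions about how the debug query log behaves; the real internals may differ between Django versions.

```python
from collections import deque


class QueryLog:
    """Simplified, hypothetical stand-in for a per-connection debug query log."""

    def __init__(self, limit):
        # A deque with maxlen drops its oldest entry once the limit is reached.
        self.entries = deque(maxlen=limit)

    def record(self, sql):
        self.entries.append({'sql': sql})

    def reset(self):
        # Plays the role of reset_queries(): forget everything recorded so far.
        self.entries.clear()


log = QueryLog(limit=9000)  # assumed limit, for illustration only
for job_id in range(20000):
    log.record('UPDATE job SET status = 2 WHERE id = %d' % job_id)

size_before_reset = len(log.entries)  # capped at the limit, not 20000
log.reset()
size_after_reset = len(log.entries)
print(size_before_reset, size_after_reset)
```

In a long-running worker loop, calling reset_queries() once per iteration (or running with DEBUG = False) keeps this log from accumulating. If each thread holds its own connection, each would also hold its own log, which would be consistent with the faster growth seen with 8 threads.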