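I have a long-running background process that parses a CSV file of several hundred thousand lines. I've noticed that the process leaks memory, which occasionally causes the task to hit its soft memory limit and be terminated. I've narrowed the problem down to the following block of code: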
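
    class BaseModel(db.Model):

        _keyNamespace = 'MyApp.Models'

        @classmethod
        def get_by_item_id(cls, id):
            # Try the cache first, keyed by namespace and item id.
            key = "%s_%d" % (cls._keyNamespace, id)
            item = CacheStrategy.get(key)

            if not item:
                # Cache miss: fall back to a datastore query.
                query = cls.gql("WHERE Id = :1", id)
                item = query.get()
                del query

            return item

I've stripped it down to the bare bones, but it still causes query objects to stay in memory. A sample GC reference dump is included at the end of this post; it shows the Query and Query_Filter counts going up by 200 after each 200-order batch step. This goes away, of course, if I remove the query call.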
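The per-batch growth described above can be made easier to read by diffing two snapshots instead of comparing full dumps by eye. The following is a minimal sketch, not the `memleaker.py` used in this post: `count_live_objects`, `growth`, `FakeQuery`, and the `leaked` list are hypothetical names, and it counts garbage-collector-tracked objects by type rather than class refcounts. It assumes new-style classes; under Python 2, instances of old-style classes would all be reported as `instance`.

```python
import gc
from collections import Counter


def count_live_objects():
    """Count objects currently tracked by the garbage collector, by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, minimum=1):
    """Return (type name, increase) pairs, largest increase first."""
    deltas = [(name, after[name] - before[name]) for name in after]
    grown = [(name, delta) for name, delta in deltas if delta >= minimum]
    return sorted(grown, key=lambda pair: pair[1], reverse=True)


class FakeQuery(object):
    """Hypothetical stand-in for a datastore query object."""


leaked = []  # simulates whatever is keeping references alive

before = count_live_objects()
for _ in range(200):  # one "batch" of 200 lookups
    leaked.append(FakeQuery())
after = count_live_objects()

# Only types that grew by at least 100 objects during the batch.
report = dict(growth(before, after, minimum=100))
```

Taking one snapshot before and one after each batch keeps the output limited to the types that actually grew, which is usually a shorter list than a full top-N dump.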
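My question is: why is this query reference leaking, and how can I get it to behave and release the reference?
I've tried making it an instance method instead (no difference). The reference count trace follows: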
INFO 2011-10-17 16:29:39,158 orderparser.py:151] Putting a 200 unit batch of orders, 0.335000 seconds from start
DEBUG 2011-10-17 16:29:40,315 memleaker.py:20] Top Mem Leaks
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 356306 Property
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 356305 PropertyValue
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 74410 Path
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 74408 Path_Element
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 45127 PropertyValue_ReferenceValue
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 45127 PropertyValue_ReferenceValuePathElement
DEBUG 2011-10-17 16:29:40,334 memleaker.py:22] 43822 Reference
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 30595 EntityProto
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 320 ProtocolMessage
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 217 Query
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 209 Query_Filter
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 55 NOT_PROVIDED
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 34 Index_Property
DEBUG 2011-10-17 16:29:40,335 memleaker.py:22] 28 ExtendableProtocolMessage
DEBUG 2011-10-17 16:29:40,336 memleaker.py:22] 18 CompositeIndex
INFO 2011-10-17 16:29:40,644 orderparser.py:151] Putting a 200 unit batch of orders, 1.821000 seconds from start
DEBUG 2011-10-17 16:29:41,930 memleaker.py:20] Top Mem Leaks
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 356506 Property
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 356505 PropertyValue
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 74410 Path
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 74408 Path_Element
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 45127 PropertyValue_ReferenceValue
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 45127 PropertyValue_ReferenceValuePathElement
DEBUG 2011-10-17 16:29:41,948 memleaker.py:22] 43822 Reference
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 30595 EntityProto
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 417 Query
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 409 Query_Filter
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 320 ProtocolMessage
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 55 NOT_PROVIDED
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 34 Index_Property
DEBUG 2011-10-17 16:29:41,951 memleaker.py:22] 28 ExtendableProtocolMessage
DEBUG 2011-10-17 16:29:41,953 memleaker.py:22] 18 CompositeIndex
INFO 2011-10-17 16:29:42,276 orderparser.py:151] Putting a 200 unit batch of orders, 3.450000 seconds from start
DEBUG 2011-10-17 16:29:43,565 memleaker.py:20] Top Mem Leaks
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 356706 Property
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 356705 PropertyValue
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 74410 Path
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 74408 Path_Element
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 45127 PropertyValue_ReferenceValue
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 45127 PropertyValue_ReferenceValuePathElement
DEBUG 2011-10-17 16:29:43,585 memleaker.py:22] 43822 Reference
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 30595 EntityProto
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 617 Query
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 609 Query_Filter
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 320 ProtocolMessage
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 55 NOT_PROVIDED
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 34 Index_Property
DEBUG 2011-10-17 16:29:43,586 memleaker.py:22] 28 ExtendableProtocolMessage
DEBUG 2011-10-17 16:29:43,588 memleaker.py:22] 18 CompositeIndex

Posted 2011-10-19 00:47:40
I am unable to reproduce this with your refcount code and the small snippet below (on shell.appspot.com or in a new app):
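
    # Python 2 code (print statement, types.ClassType).
    from google.appengine.ext import db
    import logging
    import sys
    import types

    def get_refcounts():
        d = {}
        # collect all classes (types.ClassType matches old-style classes only)
        for m in sys.modules.values():
            for sym in dir(m):
                o = getattr(m, sym)
                if type(o) is types.ClassType:
                    d[o] = sys.getrefcount(o)
        # sort by refcount, highest first
        pairs = map(lambda x: (x[1], x[0]), d.items())
        pairs.sort()
        pairs.reverse()
        return pairs

    def print_top(num=15):
        print 'Top Mem Leaks'
        for n, c in get_refcounts()[:num]:
            print '%10d %s' % (n, c.__name__)

    class TestModel(db.Model):
        id = db.IntegerProperty()

    # Compare the counts before and after running a single query.
    print_top()
    q = TestModel.gql("WHERE id = :1", 1)
    item = q.get()
    del q
    print_top()

It looks like something in your environment is holding on to references to queries that have already been executed. Are you using appstats or any other development or debugging tools? Can you create a minimal reproduction case that demonstrates the behavior you are seeing?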
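One way to check whether something is holding on to executed queries is to ask the garbage collector which objects still refer to one of them. The following is a minimal sketch using the standard `gc` module. `FakeQuery`, `Recorder`, `find_holders`, and `run_query` are hypothetical names made up for illustration; `Recorder` only mimics a tool that remembers each query it sees, and nothing here establishes that appstats or any other tool in your environment actually behaves this way.

```python
import gc
import types


class FakeQuery(object):
    """Hypothetical stand-in for a datastore query object."""


class Recorder(object):
    """Hypothetical debugging hook that remembers every query it sees."""

    def __init__(self):
        self.seen = []

    def record(self, query):
        self.seen.append(query)


def find_holders(obj, ignore=()):
    """Return the objects that still refer to obj.

    Frames are skipped because they only hold local variables, and
    anything listed in `ignore` is skipped by identity.
    """
    ignored_ids = set(id(item) for item in ignore)
    return [
        referrer
        for referrer in gc.get_referrers(obj)
        if id(referrer) not in ignored_ids
        and not isinstance(referrer, types.FrameType)
    ]


recorder = Recorder()


def run_query():
    query = FakeQuery()
    recorder.record(query)  # the hidden extra reference
    del query               # removes only the local name


run_query()

# The query object is still alive even though the local name was deleted.
leaked = [o for o in gc.get_objects() if isinstance(o, FakeQuery)]
# Exclude the list we just built, since it also refers to the object.
holders = find_holders(leaked[0], ignore=[leaked])
```

In this sketch `holders` should include `recorder.seen`, the list that kept the query alive. Calling `gc.get_referrers` again on each holder walks one step further up the chain toward the module or tool that owns it.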
https://stackoverflow.com/questions/7796816