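I am trying to cache the function expand on its first argument only. For caching purposes, I don't care about the values of the other arguments.
Because the other arguments are dicts, they are not hashable, so I defined a class to hold them whose hash always returns 0, expecting the cache to ignore it.
I've added some cut-down code below. I'm using Python 3.5.2.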
import functools

class Node:
    def __init__(self, value):
        self.value = value

    def expand(self, a1, a2):
        return '{},{},{}'.format(self.value, a1, a2)

class ExpandArgs:
    def __init__(self, a1, a2):
        self.a1 = a1
        self.a2 = a2

    def __hash__(self):
        # We don't care about the hash, but it's required for caching
        return 0

# hash of args is always 0, so it should be ignored, and the hash of node should be used as the cache key
@functools.lru_cache(maxsize=None)
def expand(node, args):
    a1 = args.a1
    a2 = args.a2
    return node.expand(a1, a2)

e1 = ExpandArgs({}, {})
e2 = ExpandArgs({}, {})
print(hash(e1)) # 0
print(hash(e2)) # 0
node = Node(123)
print(expand.cache_info()) # CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
expand(node, e1)
print(expand.cache_info()) # CacheInfo(hits=0, misses=1, maxsize=None, currsize=1)
expand(node, e2)
print(expand.cache_info()) # CacheInfo(hits=0, misses=2, maxsize=None, currsize=2)
expand(node, e1)
print(expand.cache_info()) # CacheInfo(hits=1, misses=2, maxsize=None, currsize=2)
expand(node, e2)
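print(expand.cache_info()) # CacheInfo(hits=2, misses=2, maxsize=None, currsize=2)
Since hash(e1) == hash(e2), I expected the second call to expand() to hit the cached value for e1, but it misses.
Why don't I get 1 cache miss and 3 cache hits for the code above?
Answered 2017-05-12 11:24:31
It turns out that, for caching purposes, function arguments are checked for equality with __eq__ rather than by hash alone, so it works once I change the class as follows.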
class ExpandArgs:
    def __init__(self, context, forecast_transaction_node_map, date_range_func):
        self.context = context
        self.forecast_transaction_node_map = forecast_transaction_node_map
        self.date_range_func = date_range_func

    def __hash__(self):
        # We don't care about the hash, but it's required for caching
        return 0

    def __eq__(self, other):
        return isinstance(other, self.__class__)

Answered 2021-09-01 15:12:15
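I just wanted to add some comments to the above, because I spent some time reading the source code trying to work out why __eq__ is used here.
It turns out this is a basic feature of the Python dictionary that is used as the cache (tested on Python 3.9): __hash__ is used first, but when the hashes match, __eq__ is used to check whether the objects are actually equal: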
In [5]: class CompTuple(tuple):
   ...:     """Tuple that prints whenever equality operator is
   ...:     used"""
   ...:
   ...:     __hash__ = tuple.__hash__
   ...:
   ...:     def __eq__(self, other):
   ...:         print("equality comparison")
   ...:         return super().__eq__(other)
   ...:
In [6]: t1 = CompTuple( (1,2,3,) )
In [7]: t2 = CompTuple( (1,2,4,) ) # has different hash than t1
In [8]: t3 = CompTuple( (1,2,3,) ) # has same hash as t1
In [9]: d={}
In [10]: d[t1]=1
In [11]: d[t2]
--------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-11-c1f54cc7c51f> in <module>
----> 1 d[t2]
KeyError: (1, 2, 4)
In [12]: d[t3]
equality comparison # equality comparison because of same hash
Out[12]: 1
In [13]: d[t2]=2
In [14]: d[t3] # still only one equality comparison
equality comparison
Out[14]: 1
The Python documentation explicitly requires that objects which compare equal have the same hash value.
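Applied to the question's lru_cache setup, the same hash-then-equality lookup accounts for both observed results. The sketch below is illustrative only: HashOnlyArgs, AlwaysEqualArgs and cache_stats are hypothetical names that do not appear in the original posts, and the cached function is simplified to format its arguments directly.

```python
import functools


class Node:
    def __init__(self, value):
        self.value = value


class HashOnlyArgs:
    """Mirrors the question: constant hash, default identity-based __eq__."""

    def __init__(self, a1, a2):
        self.a1 = a1
        self.a2 = a2

    def __hash__(self):
        return 0


class AlwaysEqualArgs(HashOnlyArgs):
    """Mirrors the first answer: any two instances compare equal."""

    # Defining __eq__ resets __hash__ to None, so it must be restated here.
    def __hash__(self):
        return 0

    def __eq__(self, other):
        return isinstance(other, self.__class__)


def cache_stats(args_cls):
    """Run the question's four calls and return (hits, misses)."""

    @functools.lru_cache(maxsize=None)
    def expand(node, args):
        return '{},{},{}'.format(node.value, args.a1, args.a2)

    node = Node(123)
    e1, e2 = args_cls({}, {}), args_cls({}, {})
    for args in (e1, e2, e1, e2):
        expand(node, args)
    info = expand.cache_info()
    return info.hits, info.misses


print(cache_stats(HashOnlyArgs))     # (2, 2)
print(cache_stats(AlwaysEqualArgs))  # (3, 1)
```

With only __hash__ defined, e1 and e2 land in the same hash bucket but still compare unequal, so each gets its own cache entry. Once __eq__ treats all instances as equal, the second call already hits.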
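However, a call to __eq__ can be much more expensive than a call to __hash__. This can have the counter-intuitive effect of making lru_cache lookups very expensive when the cache already contains objects with the same hash.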
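A rough way to see this cost is to count __eq__ calls while filling a cache with keys that either all share one hash or have distinct hashes. This is a sketch under assumptions: CountingKey and count_eq_calls are made-up names for illustration, and the exact counts depend on CPython's dictionary implementation, so only the relative difference is meaningful.

```python
import functools


class CountingKey:
    """Hypothetical key type that counts how often __eq__ runs."""

    eq_calls = 0

    def __init__(self, value, constant_hash):
        self.value = value
        self.constant_hash = constant_hash

    def __hash__(self):
        # Either collide on purpose (always 0) or hash by value.
        return 0 if self.constant_hash else hash(self.value)

    def __eq__(self, other):
        CountingKey.eq_calls += 1
        return isinstance(other, CountingKey) and self.value == other.value


def count_eq_calls(n, constant_hash):
    """Cache n distinct keys and return how many __eq__ calls that took."""

    @functools.lru_cache(maxsize=None)
    def lookup(key):
        return key.value

    CountingKey.eq_calls = 0
    for i in range(n):
        lookup(CountingKey(i, constant_hash))
    return CountingKey.eq_calls


colliding = count_eq_calls(200, constant_hash=True)
distinct = count_eq_calls(200, constant_hash=False)
print(colliding, distinct)
```

With colliding hashes, every new key has to be compared against the entries already stored under that hash, so the number of __eq__ calls grows much faster than the number of cached keys.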
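To make things more confusing, dictionary lookup has a shortcut that skips the call to __eq__ when the two objects being compared are identical (that is, the same object with the same id):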
In [15]: d[t1] # no equality comparison, t1 *is* in the cache
Out[15]: 1
https://stackoverflow.com/questions/43935563