Q: Python lru_cache false negatives

Stack Overflow user

Asked 2017-05-12 10:31:26

2 answers, 1.1K views, 0 followers, 1 vote

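I am trying to cache the function expand using only its first argument. For caching purposes, I don't care about the values of the other arguments.

Because the other arguments are dicts, they are not hashable and so cannot be used in a cache key. I therefore defined a class to hold them whose hash always returns 0, so that the caching function should ignore it.

I have added some cut-down code below. I am using Python 3.5.2.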

import functools


class Node:
    def __init__(self, value):
        self.value = value

    def expand(self, a1, a2):
        return '{},{},{}'.format(self.value, a1, a2)


class ExpandArgs:
    def __init__(self, a1, a2):
        self.a1 = a1
        self.a2 = a2

    def __hash__(self):
        # We don't care about the hash, but it's required for caching
        return 0


@functools.lru_cache(maxsize=None)  # hash of args is always 0, so it should be ignored, and the hash of node should be used as the cache key
def expand(node, args):
    a1 = args.a1
    a2 = args.a2
    return node.expand(a1, a2)


e1 = ExpandArgs({}, {})
e2 = ExpandArgs({}, {})
print(hash(e1))  # 0
print(hash(e2))  # 0
node = Node(123)
print(expand.cache_info())  # CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
expand(node, e1)
print(expand.cache_info())  # CacheInfo(hits=0, misses=1, maxsize=None, currsize=1)
expand(node, e2)
print(expand.cache_info())  # CacheInfo(hits=0, misses=2, maxsize=None, currsize=2)
expand(node, e1)
print(expand.cache_info())  # CacheInfo(hits=1, misses=2, maxsize=None, currsize=2)
expand(node, e2)
print(expand.cache_info())  # CacheInfo(hits=2, misses=2, maxsize=None, currsize=2)

Since hash(e1) == hash(e2), I expected the second call to expand() to hit the cached value for e1, but it misses.

Why don't I get 1 cache miss and 3 cache hits with the code above?

python

caching

functools

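2 Answers

Stack Overflow user

Answered 2017-05-12 11:24:31

It turns out that, for caching purposes, __eq__ rather than the hash alone is used to check whether function arguments are equal, so once I changed the class as follows, it works.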

class ExpandArgs:
    def __init__(self, context, forecast_transaction_node_map, date_range_func):
        self.context = context
        self.forecast_transaction_node_map = forecast_transaction_node_map
        self.date_range_func = date_range_func

    def __hash__(self):
        # We don't care about the hash, but it's required for caching
        return 0

    def __eq__(self, other):
        return isinstance(other, self.__class__)
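As a rough check, the fix can be applied to the cut-down code from the question. This is only a sketch: it reuses the two-argument ExpandArgs from the question rather than the three-argument version above, and it assumes that returning the first cached result for every later ExpandArgs is really what is wanted.

```python
import functools


class Node:
    def __init__(self, value):
        self.value = value

    def expand(self, a1, a2):
        return '{},{},{}'.format(self.value, a1, a2)


class ExpandArgs:
    def __init__(self, a1, a2):
        self.a1 = a1
        self.a2 = a2

    def __hash__(self):
        # Constant hash: every instance lands in the same hash bucket.
        return 0

    def __eq__(self, other):
        # Any two ExpandArgs compare equal, so the cache treats them as one key.
        return isinstance(other, self.__class__)


@functools.lru_cache(maxsize=None)
def expand(node, args):
    return node.expand(args.a1, args.a2)


node = Node(123)
e1 = ExpandArgs({}, {})
e2 = ExpandArgs({}, {})
for args in (e1, e2, e1, e2):
    expand(node, args)

info = expand.cache_info()
print(info)  # CacheInfo(hits=3, misses=1, maxsize=None, currsize=1)
```

Note the trade-off: because all ExpandArgs instances are now interchangeable as keys, a later call with different a1 or a2 for the same node returns the result computed for the first call.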

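Votes: 1

Stack Overflow user

Answered 2021-09-01 15:12:15

I just wanted to add some comments to the above, because I spent some time reading the source code trying to figure out why __eq__ is used here.

It turns out this is a basic property of the Python dictionary that serves as the cache (tested on Python 3.9): __hash__ is used first, but when the hashes match, __eq__ is used to check whether the objects are actually equal: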

In [5]: class CompTuple(tuple):
   ...:     """Tuple that prints whenever equality operator is used"""
   ...:
   ...:     __hash__ = tuple.__hash__
   ...:
   ...:     def __eq__(self, other):
   ...:         print("equality comparison")
   ...:         return super().__eq__(other)
   ...:

In [6]: t1 = CompTuple( (1,2,3,) )

In [7]: t2 = CompTuple( (1,2,4,) )  # has different hash than t1

In [8]: t3 = CompTuple( (1,2,3,) )  # has same hash as t1

In [9]: d={}

In [10]: d[t1]=1

In [11]: d[t2]
--------------------------------------------------------------
KeyError                     Traceback (most recent call last)
<ipython-input-11-c1f54cc7c51f> in <module>
----> 1 d[t2]

KeyError: (1, 2, 4)

In [12]: d[t3]
equality comparison  # equality comparison because of same hash
Out[12]: 1

In [13]: d[t2]=2

In [14]: d[t3]       # still only one equality comparison
equality comparison
Out[14]: 1

Python explicitly requires that objects which compare equal have the same hash value.
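A small sketch of what goes wrong when that requirement is violated. BrokenKey is a hypothetical class made up for this illustration: its instances all compare equal, yet each reports a different hash, so the dictionary never gets as far as calling __eq__.

```python
class BrokenKey:
    """Hypothetical key that breaks the rule: equal objects, different hashes."""

    def __init__(self, tag):
        self.tag = tag

    def __hash__(self):
        # Each instance reports its own tag as its hash.
        return self.tag

    def __eq__(self, other):
        # Every BrokenKey claims to be equal to every other BrokenKey.
        return isinstance(other, BrokenKey)


a = BrokenKey(1)
b = BrokenKey(2)
d = {a: "value"}

objects_equal = (a == b)   # True: __eq__ says they are equal
found_via_b = (b in d)     # False: the hashes differ, so __eq__ is never tried

print(objects_equal, found_via_b)
```

The same reasoning applies to lru_cache, since its cache is looked up by hash first and by equality second.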

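However, a call to __eq__ can be much more expensive than a call to __hash__. This can have the counterintuitive effect of making lru_cache lookups very expensive when the cache already contains objects with the same hash.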
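To get a feel for the cost, the following sketch counts __eq__ calls when every key collides. CollidingKey and square are hypothetical names invented for this example, and the exact counts depend on the interpreter's dictionary and lru_cache internals, so only a lower bound is checked.

```python
import functools


class CollidingKey:
    """Hypothetical key: all instances hash to 0, equality compares values."""

    eq_calls = 0  # class-wide counter of __eq__ invocations

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 0

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return isinstance(other, CollidingKey) and self.value == other.value


@functools.lru_cache(maxsize=None)
def square(key):
    return key.value ** 2


# Fill the cache with 200 distinct keys that all share one hash.
keys = [CollidingKey(i) for i in range(200)]
for key in keys:
    square(key)
fill_calls = CollidingKey.eq_calls

# A single lookup of a new key must be compared against every stored key.
CollidingKey.eq_calls = 0
square(CollidingKey(10_000))
miss_calls = CollidingKey.eq_calls

print("comparisons while filling:", fill_calls)
print("comparisons for one miss:", miss_calls)
```

With well-distributed hashes the same lookups would need few or no comparisons, which is why a constant __hash__ is best kept to cases where the cache holds only a handful of distinct keys.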

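To make matters more confusing, dictionary lookup has a shortcut that skips the call to __eq__ when comparing identical objects (that is, objects with the same id):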

In [15]: d[t1]  # no equality comparison, t1 *is* in the cache
Out[15]: 1
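The identity shortcut can also be seen with a key whose __eq__ always returns False. NeverEqual is a hypothetical class written just for this sketch.

```python
class NeverEqual:
    """Hypothetical key whose __eq__ always returns False and counts its calls."""

    eq_calls = 0

    def __hash__(self):
        return 0

    def __eq__(self, other):
        NeverEqual.eq_calls += 1
        return False


key = NeverEqual()
d = {key: "found"}

# Same object: the identity check succeeds, so __eq__ is not called at all.
same_object_result = d[key]
calls_after_identity_lookup = NeverEqual.eq_calls

# Different object, same hash: __eq__ is called, returns False, lookup fails.
other_result = d.get(NeverEqual(), "missing")
calls_after_other_lookup = NeverEqual.eq_calls

print(same_object_result, calls_after_identity_lookup)  # found 0
print(other_result, calls_after_other_lookup)           # missing 1
```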

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/43935563
