Q: Optimizing strategy usage (data generation)

Stack Overflow user

Asked 2019-07-24 07:54:50

1 answer · 239 views · 0 followers · 2 votes

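I would like to speed up data generation for my unit tests. Strategies such as from_regex and dictionaries seem to take a long time to generate examples.

Here is a sample I wrote to try to benchmark example generation: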

from hypothesis import given
from hypothesis.strategies import (
    booleans,
    composite,
    dictionaries,
    from_regex,
    integers,
    lists,
    one_of,
    text,
)

param_names = from_regex(r"[a-z][a-zA-Z0-9]*(_[a-zA-Z0-9]+)*", fullmatch=True)
param_values = one_of(booleans(), integers(), text(), lists(text()))


@composite
def composite_params_dicts(draw, min_size=0):
    """Provides a dictionary of parameters."""
    params = draw(
        dictionaries(keys=param_names, values=param_values, min_size=min_size)
    )

    return params


params_dicts = dictionaries(keys=param_names, values=param_values)


@given(params=params_dicts)
def test_standard(params):
    assert params is not None


@given(params=composite_params_dicts(min_size=1))
def test_composite(params):
    assert len(params) > 0


@given(integer=integers(min_value=1))
def test_integer(integer):
    assert integer > 0

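The test_integer() test serves as a reference point, since it uses a simple strategy.

Because some long-running tests in one of my projects use regexes to generate parameter names and dictionaries to generate the parameters themselves, I added two tests that use these strategies.

test_composite() uses a composite strategy with an optional argument. test_standard() uses a similar strategy, except that it is not composite.

Here are the test results: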

> pytest hypothesis-sandbox/test_dicts.py --hypothesis-show-statistics
============================ test session starts =============================
platform linux -- Python 3.7.3, pytest-5.0.1, py-1.8.0, pluggy-0.12.0
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/damien/Sandbox/hypothesis/.hypothesis/examples')
rootdir: /home/damien/Sandbox/hypothesis
plugins: hypothesis-4.28.2
collected 3 items                                                                                                                                                       

hypothesis-sandbox/test_dicts.py ...                                    [100%]
=========================== Hypothesis Statistics ============================

hypothesis-sandbox/test_dicts.py::test_standard:

  - 100 passing examples, 0 failing examples, 1 invalid examples
  - Typical runtimes: 0-35 ms
  - Fraction of time spent in data generation: ~ 98%
  - Stopped because settings.max_examples=100
  - Events:
    * 2.97%, Retried draw from TupleStrategy((<hypothesis._strategies.CompositeStrategy object at 0x7f72108b9630>,
    one_of(booleans(), integers(), text(), lists(elements=text()))))
    .filter(lambda val: all(key(val) not in seen 
    for (key, seen) in zip(self.keys, seen_sets))) to satisfy filter

hypothesis-sandbox/test_dicts.py::test_composite:

  - 100 passing examples, 0 failing examples, 1 invalid examples
  - Typical runtimes: 0-47 ms
  - Fraction of time spent in data generation: ~ 98%
  - Stopped because settings.max_examples=100

hypothesis-sandbox/test_dicts.py::test_integer:

  - 100 passing examples, 0 failing examples, 0 invalid examples
  - Typical runtimes: < 1ms
  - Fraction of time spent in data generation: ~ 57%
  - Stopped because settings.max_examples=100

========================== 3 passed in 3.17 seconds ==========================

Are composite strategies slower?

How can custom strategies be optimized?

python

python-hypothesis

1 Answer

Stack Overflow user

Accepted answer

Answered 2019-07-28 06:27:58

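Composite strategies are just as fast as any other way of generating the same data, but people tend to use them for large and complex inputs (which are slower to generate than small and simple ones).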
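To illustrate that point, here is a minimal sketch in which a plain strategy and a composite wrapper draw exactly the same data. The helper names (small_dicts, composite_small_dicts) and the size bounds are made up for this example and do not come from the question. Running both tests with --hypothesis-show-statistics should give a rough idea of whether the wrapper itself adds any noticeable cost.

```python
from hypothesis import given
from hypothesis.strategies import composite, dictionaries, integers, text


def small_dicts(min_size=0):
    # Plain strategy; the size bounds are illustrative assumptions.
    return dictionaries(
        keys=text(max_size=5), values=integers(), min_size=min_size, max_size=5
    )


@composite
def composite_small_dicts(draw, min_size=0):
    # Same data as small_dicts(), drawn through a composite wrapper.
    return draw(small_dicts(min_size=min_size))


@given(d=small_dicts(min_size=1))
def test_plain(d):
    assert 1 <= len(d) <= 5


@given(d=composite_small_dicts(min_size=1))
def test_wrapped(d):
    assert 1 <= len(d) <= 5
```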

Advice on optimizing strategies boils down to "don't do slow things", because there is no way to simply make them faster.

Minimize the use of .filter(...), because retrying is slower than not retrying.
Cap sizes, especially of nested things.
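As a rough sketch of the first point, the two strategies below both produce even integers. The names are hypothetical and the example is not taken from the question. The filtered version throws away draws that fail the predicate and retries, while the mapped version produces a valid value on every draw.

```python
from hypothesis import given
from hypothesis.strategies import integers

# Rejection-based: odd draws are discarded and retried.
even_by_filter = integers().filter(lambda n: n % 2 == 0)

# Construction-based: every draw is valid, so nothing is retried.
even_by_construction = integers().map(lambda n: n * 2)


@given(n=even_by_filter)
def test_even_filtered(n):
    assert n % 2 == 0


@given(n=even_by_construction)
def test_even_constructed(n):
    assert n % 2 == 0
```

The same idea applies whenever a value can be built to satisfy a condition instead of being checked against it after the fact.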

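So for your example, it would probably be faster if you limited the size of the lists, but otherwise it is just going to be slow(ish!) because you are generating a lot of data and not doing much with it.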
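A possible way to apply this to the strategies from the question is sketched below. The specific bounds (10 characters, 3 list items, 5 dictionary entries) are assumptions picked only for illustration; what is reasonable depends on what the code under test actually needs.

```python
from hypothesis import given
from hypothesis.strategies import (
    booleans,
    dictionaries,
    from_regex,
    integers,
    lists,
    one_of,
    text,
)

param_names = from_regex(r"[a-z][a-zA-Z0-9]*(_[a-zA-Z0-9]+)*", fullmatch=True)

# Assumed caps: short strings and short lists of short strings.
short_text = text(max_size=10)
param_values = one_of(
    booleans(), integers(), short_text, lists(short_text, max_size=3)
)

# Assumed cap on the number of parameters per dictionary.
small_params_dicts = dictionaries(
    keys=param_names, values=param_values, max_size=5
)


@given(params=small_params_dicts)
def test_small_params(params):
    assert len(params) <= 5
```

Comparing the "Typical runtimes" line of --hypothesis-show-statistics before and after adding the caps should show whether they help in a given project.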

2 votes

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/57177904
