Q: What is the TensorFlow equivalent of PyTorch's model.eval() + no_grad()?

Stack Overflow user

Asked 2022-06-22 13:33:03

1 answer, 562 views, 0 followers, 1 vote

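I'm trying to extract BERT embeddings and reproduce this code using TensorFlow instead of PyTorch. I know tf.stop_gradient() is the equivalent of torch.no_grad(), but what about model.eval(), or the combination of the two?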

import torch

# Put the model in "evaluation" mode, meaning feed-forward operation.
model.eval()

# Run the text through BERT, and collect all of the hidden states produced
# from all 12 layers.
with torch.no_grad():
    outputs = model(tokens_tensor, segments_tensors)

    # Evaluating the model will return a different number of objects based on
    # how it's configured in the `from_pretrained` call earlier. In this case,
    # because we set `output_hidden_states = True`, the third item will be the
    # hidden states from all layers. See the documentation for more details:
    # https://huggingface.co/transformers/model_doc/bert.html#bertmodel
    hidden_states = outputs[2]

python

pytorch

1 Answer

Stack Overflow user

Answered 2022-06-22 13:44:37

TL;DR: eval and no_grad are two entirely different things, but they are often used together, mainly to run fast inference in evaluation/testing loops.

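The nn.Module.eval function is applied to a PyTorch module and lets it change its behavior depending on the stage (training or evaluation). Only a handful of layers are actually affected by it: layers such as dropout and normalization layers behave differently depending on whether they are in training mode or evaluation mode. You can read more about it in this thread.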
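To make the mode switch concrete, here is a minimal sketch (not part of the original answer) that uses a lone nn.Dropout layer. The layer, the tensor size, and the seed are arbitrary choices for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the random dropout mask reproducible

# A lone dropout layer is enough to show the mode switch.
dropout = nn.Dropout(p=0.5)
x = torch.ones(1000)

dropout.train()          # training mode: roughly half the entries are zeroed
train_out = dropout(x)   # and the survivors are scaled by 1 / (1 - p) = 2

dropout.eval()           # evaluation mode: dropout acts as the identity
eval_out = dropout(x)

print("zeros in train mode:", int((train_out == 0).sum()))
print("zeros in eval mode: ", int((eval_out == 0).sum()))
print("module.training flag:", dropout.training)
```

On the TensorFlow side of the question, the closest counterpart I know of is the `training` argument that Keras layers and models accept when called: calling with `training=False` selects the inference-time behavior of dropout and batch normalization.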

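The torch.no_grad utility, on the other hand, is a context manager that changes how the code inside its scope runs. When applied, no_grad prevents gradient computation. In practice, this means that no layer activations are cached in memory. It is most commonly used in evaluation and testing loops, where no backpropagation is expected after inference. However, it can also be used during training, for example when running inference on a frozen component through which gradients do not need to flow.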
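The effect of no_grad can be checked directly on a tensor's autograd metadata. The following is a small sketch under assumed names (a toy nn.Linear layer and random input, neither taken from the question), showing that the same forward pass produces a tracked output outside the context and an untracked one inside it.

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 2)
x = torch.randn(3, 4)

# Normal forward pass: the output is attached to the autograd graph.
y_tracked = layer(x)

# Inside no_grad, no graph is recorded, so nothing is kept for backward.
with torch.no_grad():
    y_untracked = layer(x)

print(y_tracked.requires_grad, y_tracked.grad_fn is not None)
print(y_untracked.requires_grad, y_untracked.grad_fn is None)

# no_grad does not change the layer's mode; that is eval()'s job.
print(layer.training)
```

The values of the two outputs are the same; only the bookkeeping differs. In TensorFlow 2 eager execution, as far as I know, operations are only recorded for differentiation inside a `tf.GradientTape` context, so a plain forward pass outside a tape already behaves much like a no_grad block.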

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/72716490
