Print the input and output sizes of all layers of a pretrained model
Stack Overflow user
Asked on 2022-07-26 10:41:15
1 answer · 133 views · 0 followers · 0 votes

I want to print the input and output sizes of all layers of a pretrained model. I hold this pretrained model as self.feature in my class.

The printout of this pretrained model looks like this:

TimeSformer(
  (model): VisionTransformer(
    (dropout): Dropout(p=0.0, inplace=False)
    (patch_embed): PatchEmbed(
      (proj): Conv2d(3, 768, kernel_size=(16, 16), stride=(16, 16))
    )
    (pos_drop): Dropout(p=0.0, inplace=False)
    (time_drop): Dropout(p=0.0, inplace=False)
    (blocks): ModuleList(
      (0): Block(
        (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (attn): Attention(
          (qkv): Linear(in_features=768, out_features=2304, bias=True)
          (proj): Linear(in_features=768, out_features=768, bias=True)
          (proj_drop): Dropout(p=0.0, inplace=False)
          (attn_drop): Dropout(p=0.0, inplace=False)
        )
        (temporal_norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (temporal_attn): Attention(
          (qkv): Linear(in_features=768, out_features=2304, bias=True)
          (proj): Linear(in_features=768, out_features=768, bias=True)
          (proj_drop): Dropout(p=0.0, inplace=False)
          (attn_drop): Dropout(p=0.0, inplace=False)
        )
        (temporal_fc): Linear(in_features=768, out_features=768, bias=True)
        (drop_path): Identity()
        (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (mlp): Mlp(
          (fc1): Linear(in_features=768, out_features=3072, bias=True)
          (act): GELU()
          (fc2): Linear(in_features=3072, out_features=768, bias=True)
          (drop): Dropout(p=0.0, inplace=False)
        )
      )
      (1): Block(
        (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (attn): Attention(
          (qkv): Linear(in_features=768, out_features=2304, bias=True)
          (proj): Linear(in_features=768, out_features=768, bias=True)
          (proj_drop): Dropout(p=0.0, inplace=False)
          (attn_drop): Dropout(p=0.0, inplace=False)
        )
        (temporal_norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (temporal_attn): Attention(  # *********
          (qkv): Linear(in_features=768, out_features=2304, bias=True)
          (proj): Linear(in_features=768, out_features=768, bias=True)  # @@@@@@@
          (proj_drop): Dropout(p=0.0, inplace=False)
          (attn_drop): Dropout(p=0.0, inplace=False)
        )
        (temporal_fc): Linear(in_features=768, out_features=768, bias=True)
        (drop_path): DropPath()
        (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (mlp): Mlp(
          (fc1): Linear(in_features=768, out_features=3072, bias=True)
          (act): GELU()
          (fc2): Linear(in_features=3072, out_features=768, bias=True)
          (drop): Dropout(p=0.0, inplace=False)
        )
      )
      ...
      (11): Block(
        (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (attn): Attention(
          (qkv): Linear(in_features=768, out_features=2304, bias=True)
          (proj): Linear(in_features=768, out_features=768, bias=True)
          (proj_drop): Dropout(p=0.0, inplace=False)
          (attn_drop): Dropout(p=0.0, inplace=False)
        )
        (temporal_norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (temporal_attn): Attention(
          (qkv): Linear(in_features=768, out_features=2304, bias=True)
          (proj): Linear(in_features=768, out_features=768, bias=True)
          (proj_drop): Dropout(p=0.0, inplace=False)
          (attn_drop): Dropout(p=0.0, inplace=False)
        )
        (temporal_fc): Linear(in_features=768, out_features=768, bias=True)
        (drop_path): DropPath()
        (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
        (mlp): Mlp(
          (fc1): Linear(in_features=768, out_features=3072, bias=True)
          (act): GELU()
          (fc2): Linear(in_features=3072, out_features=768, bias=True)
          (drop): Dropout(p=0.0, inplace=False)
        )
      )
    )
    (norm): LayerNorm((768,), eps=1e-06, elementwise_affine=True)
    (head): Linear(in_features=768, out_features=400, bias=True)
  )
)

Here is my class and the forward method where I try to print the layer sizes:

class Class(nn.Module):
    def __init__(self, pretrained=False):
        super(Class, self).__init__()

        self.feature = TimeSformer(img_size=224, num_classes=400, num_frames=8,
                                   attention_type='divided_space_time',
                                   pretrained_model='path/to/the/weight.pyth')

    def forward(self, x):
        for layer in self.feature:
            x = layer(x)
            print(x.size())
        return x

When I run it, I face the following error:

TypeError: 'TimeSformer' object is not iterable
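For context: a generic nn.Module does not implement iteration, so the for loop above fails; only container modules such as nn.Sequential or nn.ModuleList are iterable. A minimal sketch with toy layers (not the TimeSformer itself) showing both behaviours:

```python
import torch
import torch.nn as nn

# nn.Sequential supports iteration over its layers...
seq = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
x = torch.randn(1, 4)
for layer in seq:
    x = layer(x)
print(x.shape)  # torch.Size([1, 2])

# ...but a plain nn.Module (like TimeSformer) does not, hence the TypeError.
try:
    iter(nn.Linear(4, 8))
except TypeError as e:
    print(e)  # 'Linear' object is not iterable
```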

How can I print the sizes of all layers?

Update:

With the following code I receive the error mentioned in the comments:

def forward(self, x, out_consp=False):
    layers = list(self.featureExtractor.children())
    for layer in layers:
        x = layer(x)
        print(x.size())
    return x
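One likely reason this update still fails: .children() only yields direct submodules and ignores the control flow inside forward (skip connections, reshapes, the temporal/spatial attention split), so chaining the children does not reproduce the model's computation. A toy illustration with a hypothetical Residual module (an assumption for illustration, not part of TimeSformer):

```python
import torch
import torch.nn as nn

class Residual(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 8)
        self.fc2 = nn.Linear(8, 8)

    def forward(self, x):
        # The skip connection lives only in forward, invisible to .children().
        return x + self.fc2(self.fc1(x))

m = Residual()
x = torch.randn(1, 8)
chained = x
for layer in m.children():  # naively chaining children drops the skip
    chained = layer(chained)
print(torch.allclose(chained, m(x)))  # False
```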
1 Answer

Stack Overflow user

Accepted answer

Answered on 2022-07-26 23:19:30

You can print the shapes of the input and the output of every layer by using forward hooks. The code below should do what you want.
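The core idea in miniature, before the full code: register a forward hook on every leaf module and record the input/output shapes as a dummy tensor flows through. A self-contained sketch on a stand-in nn.Sequential (not the TimeSformer):

```python
import torch
import torch.nn as nn

shapes = []

def record(module, inputs, output):
    # Called after each forward pass of the hooked module.
    shapes.append((type(module).__name__, tuple(inputs[0].shape), tuple(output.shape)))

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
# Hook only leaf modules (those without children).
handles = [m.register_forward_hook(record)
           for m in model.modules() if len(list(m.children())) == 0]
model(torch.randn(3, 4))
for name, in_shape, out_shape in shapes:
    print(f'{name}: {in_shape} -> {out_shape}')
# Linear: (3, 4) -> (3, 8)
# ReLU: (3, 8) -> (3, 8)
# Linear: (3, 8) -> (3, 2)
for h in handles:
    h.remove()  # clean up so the hooks don't fire on later forward passes
```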

import torch

def hook_function(module, input, output):
    # Print the module's (generated) name, its repr, and the input/output shapes.
    print(f'{module.name} :')
    print(module)
    if isinstance(input[0], tuple):
        print('input shapes:')
        for elem in input[0]:
            print(elem.shape)
    else:
        print(f'input shape: {input[0].shape}')
    if isinstance(output, tuple):
        print('output shapes:')
        for elem in output:
            print(elem.shape)
    else:
        print(f'output shape: {output.shape}')
    print('')

def set_names(net):
    # Attach a qualified name (e.g. 'model_blocks_0_attn_qkv') to every submodule.
    def recurs(net, parent_name=None):
        for name, mod in net.named_children():
            if parent_name is not None:
                name = '_'.join([parent_name, name])
            recurs(mod, name)
            setattr(mod, 'name', name)

    recurs(net)

def print_shapes(network, dummy_input_shape, device='cuda', eval=True):
    network = network.to(device)
    if eval:
        network.eval()
    else:
        network.train()
        assert dummy_input_shape[0] > 1  # train mode may need batch size > 1
    dummy = torch.randn(dummy_input_shape, device=device)
    set_names(network)
    handles = []

    def attach_hooks(net):
        # Register the hook only on leaf modules (modules with no children).
        leaf_layers = 0
        for mod in net.children():
            leaf_layers += 1
            attach_hooks(mod)
        if leaf_layers == 0:
            handles.append(net.register_forward_hook(hook_function))

    attach_hooks(network)
    network(dummy)
    # Remove the hooks afterwards if needed
    for handle in handles:
        handle.remove()

Example:

network = TimeSformer(img_size=224, num_classes=400, num_frames=8,
                      attention_type='divided_space_time',
                      pretrained_model='path/to/the/weight.pyth')
# The behaviour of a forward function could be different during training
print_shapes(network, (1, 3, 224, 224), 'cpu', eval=True)
print_shapes(network, (2, 3, 224, 224), 'cpu', eval=False)

A snippet of the output, showing a layer (norm1) that is defined before the 'temporal_norm1' layer in the 'Block' module but is called/executed later:

model_blocks_11_temporal_fc :
Linear(in_features=768, out_features=768, bias=True)
input shape: torch.Size([2, 1568, 768])
output shape: torch.Size([2, 1568, 768])

model_blocks_11_norm1 :
LayerNorm((768,), eps=1e-06, elementwise_affine=True)
input shape: torch.Size([16, 197, 768])
output shape: torch.Size([16, 197, 768])
Score: 1
Original content from Stack Overflow.
Original link: https://stackoverflow.com/questions/73121935
