Why do these two approaches give me completely different results? How can I get the same result as approach 1 using approach 2?
import torch
from torch import nn
kernel_size = 7
stride = 1
# approach 1
data = torch.rand(4, 64, 174, 120)
data1 = data.unfold(3, kernel_size * 2 + 1, stride)
print(data1.shape)
# approach 2
data = torch.rand(4, 64, 174, 120)
unfold = nn.Unfold(3, kernel_size * 2 + 1, stride)
data2 = unfold(data)
print(data2.shape)

Output:
torch.Size([4, 64, 174, 106, 15])
torch.Size([4, 576, 13432])

Edit

I tried your approach. The shapes are now the same, but the contents are not. Any idea why?
import torch
from torch import nn
kernel_size = 7
stride = 1
# approach 1
data = torch.rand(4, 64, 174, 120)
data1 = data.unfold(3, kernel_size * 2 + 1, stride)
print(data1.shape)
# approach 2
data = torch.rand(4, 64, 174, 120)
b, c, h, w = data.shape
unfold = nn.Unfold(kernel_size=(1, 2*kernel_size + 1), dilation=1, stride=1, padding=0)
data2 = unfold(data.reshape(-1, 1, 1, w)).permute(0, 2, 1).reshape(b, c, h, -1, 2*kernel_size + 1)
print(data2.shape)
print(torch.equal(data1, data2))

Output:
torch.Size([4, 64, 174, 106, 15])
torch.Size([4, 64, 174, 106, 15])
False

Posted on 2022-08-08 11:04:09
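torch.unfold unfolds along a single dimension. In your example, it takes the 4x64x174 samples of length 120 (dim 3) and extracts every overlapping window of size 15 from each of them, which gives data1 with shape 4x64x174x106x15.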
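As a small illustration of the sliding-window behavior described above, here is a minimal sketch on a 1-D tensor. The tensor `x` and the window size 3 are made-up values chosen for readability; they are not from the question.

```python
import torch

# Hypothetical toy input: six consecutive values.
x = torch.arange(6.0)

# Tensor.unfold(dimension, size, step): slide a window of 3 along dim 0, step 1.
windows = x.unfold(0, 3, 1)
print(windows)

# The same call on the question's shape appends the window as a new last dim.
data = torch.rand(4, 64, 174, 120)
data1 = data.unfold(3, 15, 1)
print(data1.shape)
```

The number of windows along the unfolded dimension is (length - size) // step + 1, so 120 becomes 106 for a window of 15 with step 1.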
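In contrast, nn.Unfold works on b x c x ... tensors and extracts spatial patches. Because the arguments in your example are passed positionally, nn.Unfold received kernel_size=3, dilation=kernel_size*2+1 (that is, 15) and padding=1. It therefore extracted 13432 patches of size 3x3 across the 64 channels (3*3*64=576), which gives data2 with shape 4x576x13432.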
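The numbers above can be checked with a short sketch. The helper `out_size` is a hypothetical name introduced here; it applies the output-size formula from the nn.Unfold documentation to each spatial dimension.

```python
import torch
from torch import nn

data = torch.rand(4, 64, 174, 120)

# How the question's positional call nn.Unfold(3, 15, 1) is interpreted:
# the positional order is kernel_size, dilation, padding, stride.
kernel, dilation, padding, stride = 3, 15, 1, 1

def out_size(size):
    # Number of patch positions along one spatial dimension (hypothetical helper).
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

h_out, w_out = out_size(174), out_size(120)
print(h_out, w_out, h_out * w_out)

data2 = nn.Unfold(3, 15, 1)(data)
print(data2.shape)
```

With a dilation of 15, each 3x3 patch spans 31 pixels per side, which is why the patch count differs so much from the 106 windows produced by approach 1.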
To get the same output as torch.unfold from nn.Unfold, you need to reshape and permute:
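b, c, h, w = data.shape
# A 1 x (2*kernel_size + 1) kernel slides along the width only.
unfold = nn.Unfold(kernel_size=(1, 2*kernel_size + 1), dilation=1, stride=1, padding=0)
# Treat every row as its own single-channel 1 x w image, then move the window dim last.
data2 = unfold(data.reshape(-1, 1, 1, w)).permute(0, 2, 1).reshape(b, c, h, -1, 2*kernel_size + 1)

Please read the nn.Unfold documentation carefully, because it works completely differently from torch.unfold. For more information on nn.Unfold and nn.Fold, see this thread.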
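Regarding the follow-up in the question: the code there calls torch.rand a second time before approach 2, so the two approaches run on different random tensors, which by itself would make torch.equal return False. A minimal sketch that applies both approaches to the same tensor is shown below; the smaller input shape is an assumption made here to keep the check fast.

```python
import torch
from torch import nn

kernel_size = 7
window = 2 * kernel_size + 1

# One shared input for both approaches (smaller, hypothetical shape).
data = torch.rand(2, 3, 5, 40)
b, c, h, w = data.shape

# Approach 1: sliding windows along the last dimension.
data1 = data.unfold(3, window, 1)

# Approach 2: nn.Unfold on rows reshaped to single-channel 1 x w images.
unfold = nn.Unfold(kernel_size=(1, window), dilation=1, stride=1, padding=0)
data2 = unfold(data.reshape(-1, 1, 1, w)).permute(0, 2, 1).reshape(b, c, h, -1, window)

same = torch.equal(data1, data2)
print(data1.shape, data2.shape, same)
```

Both operations only copy values without doing arithmetic, so an exact comparison with torch.equal is appropriate here.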
https://stackoverflow.com/questions/73276139