
How can I improve the accuracy of my PyTorch image classifier?

Stack Overflow user
Asked on 2022-09-13 12:09:30
3 answers · 119 views · 0 followers · score 0

I'm trying to build a robust image classifier, but I have a problem. I'm using the CIFAR-100 dataset and trained a model on it. Note that the share of correct classifications is only about 15%. I tried to continue the training process, but after 2-3 attempts the model didn't change.

Code used for training:

```python
import torch
import sys,os
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

batch_size = 4

trainset = torchvision.datasets.CIFAR100(root='./dataone', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR100(root='./dataone', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False, num_workers=2)
# note: these are CIFAR-100's 20 coarse superclasses, but the dataset's targets are 100 fine labels
classes = ('aquatic mammals','fish','flowers','food containers','fruit and vegetables','household electrical devices','household furniture','insects','large carnivores','large man-made outdoor things','large natural outdoor scenes','large omnivores and herbivores','medium-sized mammals','non-insect invertebrates','people','reptiles','small mammals','trees','vehicles 1','vehicles 2')
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 100)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1) # flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
import torch.optim as optim

PATH = "./model.pt"
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

print(os.path.exists(PATH))
if os.path.exists(PATH):
    # resume training state if a checkpoint exists, loading into the
    # same net and optimizer that are used in the training loop below
    checkpoint = torch.load(PATH)
    net.load_state_dict(checkpoint['model_state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    epoch = checkpoint['epoch']
    loss = checkpoint['loss']
    print("using checkpoint")
    net.train()

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0

print('Finished Training')

#PATH = './cifar_net.pth'
#torch.save(net.state_dict(), PATH)

EPOCH = 5

LOSS = 0.4

torch.save({
            'epoch': EPOCH,
            'model_state_dict': net.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'loss': LOSS,
            }, PATH)
```
It's based on the PyTorch tutorial about image classifiers, which can be found [here](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html).
I took the code for resuming training from [here](https://pytorch.org/tutorials/recipes/recipes/saving_and_loading_a_general_checkpoint.html).

Code that I used for testing the model:

```python
import torch
import torchvision
import torchvision.transforms as transforms
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

batch_size = 4

trainset = torchvision.datasets.CIFAR100(root='./dataone', train=False,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR100(root='./dataone', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False, num_workers=2)
classes = ('aquatic mammals','fish','flowers','food containers','fruit and vegetables','household electrical devices','household furniture','insects','large carnivores','large man-made outdoor things','large natural outdoor scenes','large omnivores and herbivores','medium-sized mammals','non-insect invertebrates','people','reptiles','small mammals','trees','vehicles 1','vehicles 2')
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 100)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1) # flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()
PATH = "./model.pt"
# the training script saved a checkpoint dict, so load the weights from it
checkpoint = torch.load(PATH)
net.load_state_dict(checkpoint['model_state_dict'])
net.eval()
correct = 0
total = 0
# since we're not training, we don't need to calculate the gradients for our outputs
with torch.no_grad():
    for data in testloader:
        images, labels = data
        # calculate outputs by running images through the network
        outputs = net(images)
        # the class with the highest energy is what we choose as prediction
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print(correct)
print(total)
print(f'Accuracy of the network on the 10000 test images: {100 * correct // total} %')
```

It comes from the same PyTorch image-classifier tutorial. I added printing of the total and the number of correctly detected images for testing.

How can I improve the accuracy so it's at least around 50-70%? Or is this normal, and the 15% figure is wrong? Please help.


3 Answers

Stack Overflow user

Answered on 2022-09-13 20:42:52

Have you tried increasing the number of epochs? Training usually takes hundreds to thousands of iterations to get good results.

You can also improve the architecture by continuing the convolutional layers until you are left with a 1×1×N image, where N is the number of filters in the final convolution. Then flatten and add linear layer(s). Batch normalization and LeakyReLU activations before the pooling layers may also help. Finally, you should use a Softmax activation on the output, since you're dealing with a classifier.
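As a rough sketch of the layer-stacking idea above (the channel widths here are my own assumptions, not the answerer's; note also that PyTorch's `nn.CrossEntropyLoss` applies log-softmax internally, so the head below returns raw logits rather than explicit Softmax output):

```python
import torch
import torch.nn as nn

class ConvNet(nn.Module):
    """Keep convolving, with batch norm and LeakyReLU before each pooling
    step, until the spatial size reaches 1x1, then flatten into the head."""
    def __init__(self, num_classes=100):
        super().__init__()
        layers = []
        channels = [3, 32, 64, 128, 256, 512]  # assumed widths
        for c_in, c_out in zip(channels, channels[1:]):
            layers += [
                nn.Conv2d(c_in, c_out, 3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.LeakyReLU(0.1),
                nn.MaxPool2d(2, 2),  # 32 -> 16 -> 8 -> 4 -> 2 -> 1
            ]
        self.features = nn.Sequential(*layers)
        self.classifier = nn.Linear(channels[-1], num_classes)

    def forward(self, x):
        x = self.features(x)      # (N, 512, 1, 1)
        x = torch.flatten(x, 1)   # (N, 512)
        return self.classifier(x) # raw logits for nn.CrossEntropyLoss
```

This is only a sketch of the described shape progression, not a tuned model.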

I strongly suggest looking at popular classifiers such as VGG and ResNet. ResNet in particular has a feature called "residual/skip connections", which passes a copy of a layer's output forward down the line to compensate for the loss of features.
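A minimal sketch of what such a residual/skip connection looks like (a simplified illustration, not the exact ResNet block):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Two convolutions whose input is added back to their output, so
    later layers can recover features the convolutions would otherwise lose."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # skip connection: add the input back
```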

Score: 0

Stack Overflow user

Answered on 2022-09-14 08:52:56

Could you provide plots of the accuracy and the loss, so we can better understand what is happening during training (or a list of the accuracy and loss values per epoch)?

Also, it's good practice to compute the validation accuracy and loss after each epoch, to monitor how the network behaves on unseen data.
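One way to do that per-epoch monitoring could look like this (a sketch; `net`, `criterion` and `testloader` are assumed to be the objects defined in the question):

```python
import torch

def evaluate(net, loader, criterion):
    """Compute average loss and accuracy over a data loader without
    updating the model; restores training mode afterwards."""
    net.eval()
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():
        for images, labels in loader:
            outputs = net(images)
            total_loss += criterion(outputs, labels).item() * labels.size(0)
            correct += (outputs.argmax(1) == labels).sum().item()
            total += labels.size(0)
    net.train()
    return total_loss / total, correct / total

# called once after each training epoch, e.g.:
# val_loss, val_acc = evaluate(net, testloader, criterion)
# print(f"epoch {epoch}: val loss {val_loss:.3f}, val acc {val_acc:.1%}")
```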

Although, as Xynias said, you could make some improvements to your architecture, I believe the first step would be to investigate this from the accuracy and loss point of view.

Score: 0

Stack Overflow user

Answered on 2022-09-15 13:11:07

Given that CIFAR-100 has 100 classes, this is to be expected. You'll need a much more complex network to do this task well. Definitely use more feature maps, starting from 64 channels or more.

This quick-and-dirty architecture gets past 50% total accuracy after about 10 epochs (with a learning rate of 0.1 and a batch size of 256; I also added a RandomHorizontalFlip() transform):

```python
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(3, 128, 3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(128, 128, 3, stride=1, padding=1),
            nn.ReLU(),
            nn.AvgPool2d(2, 2),
            nn.Conv2d(128, 256, 3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(256, 256, 3, stride=1, padding=1),
            nn.ReLU(),
            nn.AvgPool2d(2, 2),
            nn.Flatten(),
            nn.Dropout(0.5),
            nn.Linear(16384, 100),
        )

    def forward(self, x):
        return self.layers(x)
```

For better results you could try implementing something ResNet-like, or use a pre-made (possibly pre-trained) model, e.g. using timm:

```python
import timm
net = timm.create_model('resnet18d', pretrained=True, num_classes=100)
```

With the same parameters as above, it reaches the target metric fairly quickly.

Score: 0
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/73702756
