Getting Started with PyTorch

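Building a Neural Network with PyTorch

About torch.nn

  • The main tools for building neural networks in PyTorch live in the torch.nn package.
  • nn relies on autograd to define models and to differentiate them automatically.

Typical workflow for building a neural network

  • Define a neural network with learnable parameters.
  • Iterate over the training dataset.
  • Pass the input data through the network.
  • Compute the loss.
  • Backpropagate to obtain the gradients of the network parameters.
  • Update the network weights according to some rule.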
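The six steps above can be sketched end to end on a toy problem. This is a minimal illustration, not the network used in this tutorial: the two-layer model, the random data, and the hyperparameters are all made up for the example.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Step 1: a network with learnable parameters (hypothetical two-layer model).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.05)

# Made-up data: 32 samples whose target is the sum of their 4 features.
inputs = torch.randn(32, 4)
targets = inputs.sum(dim=1, keepdim=True)

losses = []
for step in range(100):                  # Step 2: iterate over the data
    optimizer.zero_grad()                # clear gradients from the previous step
    outputs = model(inputs)              # Step 3: forward pass
    loss = criterion(outputs, targets)   # Step 4: compute the loss
    loss.backward()                      # Step 5: backpropagate
    optimizer.step()                     # Step 6: update the weights
    losses.append(loss.item())

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

The whole dataset is used as a single batch here to keep the sketch short; a real loop would draw mini-batches from a DataLoader.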

Defining a neural network in PyTorch

import torch
import torch.nn as nn
import torch.nn.functional as F

# Define a simple network class
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # First convolutional layer: 1 input channel, 6 output channels, 3x3 kernel
        self.conv1 = nn.Conv2d(1, 6, 3)
        # Second convolutional layer: 6 input channels, 16 output channels, 3x3 kernel
        self.conv2 = nn.Conv2d(6, 16, 3)
        # Three fully connected layers; 16*6*6 is the flattened size after the second pooling
        self.fc1 = nn.Linear(16*6*6, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Convolution, ReLU, then max pooling over a 2x2 window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # A single number is enough when the pooling window is square
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # Flatten everything except the batch dimension
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        # Multiply all dimensions except the batch dimension
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

net = Net()
print(net)

Net(
  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
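The value in_features=576 of fc1 follows from the layer sizes, assuming the 32 x 32 input used later in this section. The sketch below traces the side length through each layer with the usual output-size formula for unpadded convolution and pooling; conv_out is a helper name invented for this example.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

side = 32
side = conv_out(side, kernel=3)             # conv1:    32 -> 30
side = conv_out(side, kernel=2, stride=2)   # max pool: 30 -> 15
side = conv_out(side, kernel=3)             # conv2:    15 -> 13
side = conv_out(side, kernel=2, stride=2)   # max pool: 13 -> 6

flat_features = 16 * side * side            # 16 channels after conv2
print(side, flat_features)                  # 6 576
```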

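Note:

  • All trainable parameters of the model are available through net.parameters().

params = list(net.parameters())
print(len(params))       # number of parameter tensors (weights and biases)
print(params[0].size())  # shape of the conv1 weight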

10
torch.Size([6, 1, 3, 3])
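The count of 10 is one weight tensor and one bias tensor for each of the five layers. named_parameters() makes this visible. The sketch below uses a small stand-in container with two layers instead of the tutorial's Net, so the names and counts shown are those of the example only.

```python
import torch.nn as nn

# Stand-in container: it only holds the layers so their parameters can be listed.
layers = nn.ModuleList([nn.Conv2d(1, 6, 3), nn.Linear(4, 2)])

for name, p in layers.named_parameters():
    print(name, tuple(p.shape), p.requires_grad)

n_tensors = len(list(layers.parameters()))             # weight + bias per layer
n_scalars = sum(p.numel() for p in layers.parameters())
print(n_tensors, n_scalars)                            # 4 70
```

The Conv2d layer contributes 6*1*3*3 + 6 = 60 values and the Linear layer 4*2 + 2 = 10.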

Testing the output

# A random batch holding one single-channel 32x32 image
input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)
print(out.size())

tensor([[ 0.0089, -0.0182, -0.1752, -0.1430, -0.0278, 0.1290, -0.1476, 0.1454, -0.0877, -0.0579]], grad_fn=<AddmmBackward>)
torch.Size([1, 10])

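With the output tensor in hand, we can zero the gradients and run backpropagation.

# Clear the gradient buffers of all parameters
net.zero_grad()
# Backpropagate a random gradient with the same shape as the output
out.backward(torch.randn(1, 10))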

Note:

  • Networks built with torch.nn expect mini-batches as input, not single samples.
  • For example, nn.Conv2d takes a 4D Tensor of shape (nSamples, nChannels, Height, Width). For a single sample, call input.unsqueeze(0) to expand the 3D Tensor into a 4D Tensor.
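A short sketch of the unsqueeze(0) step described above. The tensor names are illustrative, and the layer matches conv1 from the network defined earlier.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 6, 3)

single_image = torch.randn(1, 32, 32)   # (nChannels, Height, Width)
batch = single_image.unsqueeze(0)       # add a batch dimension of size 1 in front
out = conv(batch)

print(batch.shape)  # torch.Size([1, 1, 32, 32])
print(out.shape)    # torch.Size([1, 6, 30, 30])
```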

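Loss functions

  • A loss function takes a pair (output, target) as input and computes a single value that measures how far output is from target.
  • torch.nn provides several loss functions. For example, nn.MSELoss measures the gap between the input and the target as the mean squared error.

output = net(input)
target = torch.randn(10)
# Reshape target into a 2D tensor so that it matches output
target = target.view(1, -1)
criterion = nn.MSELoss()
loss = criterion(output, target)
print(loss)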

tensor(0.8665, grad_fn=<MseLossBackward>)
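To make the mean squared error concrete, the sketch below compares nn.MSELoss (with its default mean reduction) against the formula written out by hand. The numbers are made up for the example.

```python
import torch
import torch.nn as nn

output = torch.tensor([[0.5, -1.0, 2.0]])
target = torch.tensor([[1.0, 0.0, 2.0]])

builtin = nn.MSELoss()(output, target)
# Mean of the squared element-wise differences: (0.25 + 1.0 + 0.0) / 3
manual = ((output - target) ** 2).mean()

print(builtin.item(), manual.item())
```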

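  • About the backpropagation chain: following loss backwards through its .grad_fn attribute reveals the computation graph below. Backpropagation traverses it in the direction opposite to the arrows:

input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d -> view -> linear -> relu -> linear -> relu -> linear -> MSELoss -> loss

  • When loss.backward() is called, autograd differentiates loss through the entire graph. Every Tensor with requires_grad=True takes part, and its gradient is accumulated into its .grad attribute.

print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # AccumulateGrad: a leaf parameter feeding the Linear node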

<MseLossBackward object at 0x7fefb0264390>
<AddmmBackward object at 0x7fefb027f350>
<AccumulateGrad object at 0x7fefb027f950>
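Instead of chaining next_functions by hand, the graph can be walked recursively. The sketch below does this for a tiny expression, not for the tutorial's network; walk is a helper written for this example, and the exact node names (for instance a trailing 0 such as SumBackward0) depend on the PyTorch version.

```python
import torch

def walk(fn, depth=0, found=None):
    """Collect (depth, node name) pairs by following next_functions."""
    if found is None:
        found = []
    if fn is None:  # inputs that do not require gradients have no node
        return found
    found.append((depth, type(fn).__name__))
    for next_fn, _ in fn.next_functions:
        walk(next_fn, depth + 1, found)
    return found

w = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)
b = torch.tensor([0.5, 0.5, 0.5], requires_grad=True)
x = torch.tensor([1.0, 1.0, 1.0])  # plain data, no gradient tracking
loss = torch.relu(w * x + b).sum()

nodes = walk(loss.grad_fn)
for depth, name in nodes:
    print("  " * depth + name)
```

Each leaf that requires gradients (here w and b) shows up as an AccumulateGrad node, which is where backward() deposits the gradient into .grad.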

Backpropagation

  • Running backpropagation in PyTorch is simple: the whole operation is loss.backward().
  • Zero the gradients before running backpropagation; otherwise gradients accumulate across batches.

# Zero the gradients
net.zero_grad()
print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)
loss.backward()  # backpropagation
print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)

conv1.bias.grad before backward
None
conv1.bias.grad after backward
tensor([ 0.0051, 0.0086, 0.0177, -0.0007, -0.0182, 0.0006])
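The accumulation behaviour is easy to see on a single tensor. In this sketch, w and the expression sum(3 * w) are made up for the example; the gradient of that expression is 3 for every element.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

(w * 3).sum().backward()
after_first = w.grad.clone()    # tensor([3., 3.])

(w * 3).sum().backward()        # no zeroing in between
after_second = w.grad.clone()   # tensor([6., 6.]): the new gradient was added

w.grad.zero_()                  # clear the stored gradient
(w * 3).sum().backward()
after_reset = w.grad.clone()    # tensor([3., 3.])

print(after_first, after_second, after_reset)
```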

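Updating the network parameters

  • The simplest update rule is SGD (stochastic gradient descent).
  • The update formula is: \(\text{weight} = \text{weight} - \text{learning\_rate} \times \text{gradient}\)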
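A worked example of the update rule in plain Python, assuming a one-dimensional function f(w) = w**2 chosen only for illustration. Its gradient is 2 * w, so each step with learning_rate = 0.1 multiplies the weight by 0.8.

```python
def sgd_step(weight, gradient, learning_rate):
    """One application of weight = weight - learning_rate * gradient."""
    return weight - learning_rate * gradient

weight = 1.0
history = [weight]
for _ in range(3):
    gradient = 2 * weight  # derivative of f(w) = w**2
    weight = sgd_step(weight, gradient, learning_rate=0.1)
    history.append(weight)

print(history)  # approximately [1.0, 0.8, 0.64, 0.512]
```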

Implementing SGD in plain Python

learning_rate = 0.01
# Update every parameter in place, outside of autograd tracking
with torch.no_grad():
    for f in net.parameters():
        f.sub_(f.grad * learning_rate)

The standard code recommended by the official PyTorch documentation

import torch.optim as optim
# Create the optimizer object
optimizer = optim.SGD(net.parameters(), lr=0.01)
# Zero the gradients through the optimizer
optimizer.zero_grad()
output = net(input)
loss = criterion(output, target)
# Backpropagate the loss
loss.backward()
# A single call performs the parameter update
optimizer.step()
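For plain SGD without momentum or weight decay, the two approaches should produce the same parameters. The sketch below checks this on a small stand-in model with made-up data; model_a and model_b are names chosen for the example.

```python
import copy
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model_a = nn.Linear(3, 1)
model_b = copy.deepcopy(model_a)   # identical starting weights
x = torch.randn(5, 3)
y = torch.randn(5, 1)
lr = 0.01

# Manual update: weight = weight - learning_rate * gradient
loss_a = nn.functional.mse_loss(model_a(x), y)
loss_a.backward()
with torch.no_grad():
    for p in model_a.parameters():
        p -= lr * p.grad

# The same step through the optimizer
optimizer = optim.SGD(model_b.parameters(), lr=lr)
optimizer.zero_grad()
loss_b = nn.functional.mse_loss(model_b(x), y)
loss_b.backward()
optimizer.step()

same = all(torch.allclose(pa, pb)
           for pa, pb in zip(model_a.parameters(), model_b.parameters()))
print(same)  # True
```

The optimizer becomes more convenient once options such as momentum are involved, because it keeps the extra state between steps.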

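Building a Classifier with PyTorch

The classification task and the data

  • Build a neural network classifier that assigns each input image to one of several classes.
  • This example uses the CIFAR10 dataset as the source of images.
  • About CIFAR10: every image has size 3 * 32 * 32, that is, 3 color channels at 32 x 32 pixels.
  • CIFAR10 has 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.

Samples:

[Figure: sample images from CIFAR10]

Steps for training the classifier

  1. Download the CIFAR10 dataset with torchvision
  2. Define the convolutional neural network
  3. Define the loss function
  4. Train the model on the training set
  5. Test the model on the test set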

Downloading the CIFAR10 dataset with torchvision

# Import torchvision
import torch
import torchvision
import torchvision.transforms as transforms

# Download the dataset and transform the images. torchvision datasets yield PIL images;
# ToTensor converts them to tensors in [0, 1], and Normalize then maps them to [-1, 1].
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
)
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)

classes = ("airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck")
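Normalize with mean 0.5 and standard deviation 0.5 computes (x - 0.5) / 0.5 for every channel, which is how [0, 1] becomes [-1, 1]. The plain-Python sketch below shows that mapping and its inverse; the function names are invented for this example and are not part of torchvision.

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-value formula applied by Normalize: (x - mean) / std."""
    return (x - mean) / std

def unnormalize(y, mean=0.5, std=0.5):
    """Inverse mapping: y * std + mean, which is y / 2 + 0.5 for these defaults."""
    return y * std + mean

print(normalize(0.0), normalize(0.5), normalize(1.0))  # -1.0 0.0 1.0
print(unnormalize(normalize(0.25)))                    # 0.25
```

The inverse is needed whenever a normalized image has to be displayed, since plotting expects values back in [0, 1].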

Displaying some images

import matplotlib.pyplot as plt
import numpy as np

# Helper that displays an image
def imshow(img):
    img = img / 2 + 0.5  # undo the normalization: [-1, 1] back to [0, 1]
    npimg = img.numpy()  # Tensor to NumPy array
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # (C, H, W) to (H, W, C)
    plt.show()

# Read one batch of images from the data iterator
dataiter = iter(trainloader)
images, labels = next(dataiter)
# Show the images
imshow(torchvision.utils.make_grid(images))
# Print the labels
print(' '.join('%10s' % classes[labels[j]] for j in range(4)))

Building the convolutional neural network

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 3 input channels (RGB), 6 output channels, 5x5 kernel
        self.conv1 = nn.Conv2d(3, 6, 5)
        # 2x2 max pooling with stride 2, shared by both convolutional blocks
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # 16 channels of 5x5 feature maps remain after the second pooling
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        # Flatten to (batch, 16*5*5) so the result matches the input size of fc1
        x = x.view(-1, 16*5*5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
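The flattened size 16*5*5 = 400 can be checked by pushing a dummy batch through the convolutional part and reading the shapes. This sketch recreates only those layers as standalone objects; the random batch stands in for real CIFAR10 images.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv1 = nn.Conv2d(3, 6, 5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(6, 16, 5)

x = torch.randn(4, 3, 32, 32)                     # stand-in batch of 4 images
after_block1 = pool(F.relu(conv1(x)))             # 32 -> 28 (5x5 conv) -> 14 (pool)
after_block2 = pool(F.relu(conv2(after_block1)))  # 14 -> 10 (5x5 conv) -> 5 (pool)
flat = after_block2.view(-1, 16 * 5 * 5)

print(after_block1.shape)  # torch.Size([4, 6, 14, 14])
print(after_block2.shape)  # torch.Size([4, 16, 5, 5])
print(flat.shape)          # torch.Size([4, 400])
```

If the second argument of view did not equal the true per-sample size, the batch dimension would silently change or the call would fail, so this check is worth running whenever the layer sizes change.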

Defining the loss function

import torch.optim as optim
criterion = nn.CrossEntropyLoss()  # cross-entropy loss
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)  # SGD optimizer with momentum
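nn.CrossEntropyLoss takes raw scores (logits) and integer class labels; it applies log-softmax internally and averages the negative log-probability of the correct class. The sketch below reproduces that by hand on made-up logits for two samples and three classes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 2])  # index of the correct class for each sample

builtin = nn.CrossEntropyLoss()(logits, labels)

# Manual version: log-softmax, pick the entry of the true class, negate, average
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(2), labels].mean()

print(builtin.item(), manual.item())
```

This is why the network's forward method ends with a plain linear layer and no softmax.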

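Training the model on the training set

for epoch in range(2):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # data holds one batch: [inputs, labels]
        inputs, labels = data
        # Zero the gradients
        optimizer.zero_grad()
        outputs = net(inputs)
        # Compute the loss
        loss = criterion(outputs, labels)
        # Backpropagate and update the parameters
        loss.backward()
        optimizer.step()

        # Every 2000 batches, print the epoch, the batch count, and the average loss
        running_loss += loss.item()
        if (i+1) % 2000 == 0:
            print('[%d, %5d] loss: %.3f' % (epoch+1, i+1, running_loss/2000))
            running_loss = 0.0

print("Finished Training")

The log below was recorded without resetting running_loss after each report, so the values within an epoch are cumulative. The average over one window is the difference between consecutive lines, for example 4.129 - 2.239 = 1.890 for batches 2001 to 4000 of the first epoch.

[1, 2000] loss: 2.239
[1, 4000] loss: 4.129
[1, 6000] loss: 5.800
[1, 8000] loss: 7.396
[1, 10000] loss: 8.933
[1, 12000] loss: 10.440
[2, 2000] loss: 1.436
[2, 4000] loss: 2.850
[2, 6000] loss: 4.255
[2, 8000] loss: 5.605
[2, 10000] loss: 6.959
[2, 12000] loss: 8.272
Finished Training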

Saving the model

PATH = './cifar_net.pth'
# Save the model's state dictionary
torch.save(net.state_dict(), PATH)
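A state dictionary holds only the parameter tensors, so loading requires creating the model object first. The sketch below shows the round trip on a small stand-in layer and writes to an in-memory buffer instead of a file, purely so the example leaves nothing on disk.

```python
import io
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)

# Save the state dictionary (here into a buffer rather than a .pth file)
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# Create a fresh model of the same architecture, then load the saved weights
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(buffer))

x = torch.randn(3, 4)
same_output = torch.equal(model(x), restored(x))
print(same_output)  # True
```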

Testing the model on the test set

Displaying a few test images

dataiter = iter(testloader)
images, labels = next(dataiter)
# Show the original images
imshow(torchvision.utils.make_grid(images))
# Print the ground-truth labels
print("GroundTruth: ", " ".join("%5s" % classes[labels[j]] for j in range(4)))

Loading the model and predicting on the test images

# Instantiate the model class
net = Net()
# Load the saved state dictionary
net.load_state_dict(torch.load(PATH))
# Predict
outputs = net(images)
# There are 10 classes; take the class with the highest score as the prediction
_, predicted = torch.max(outputs, 1)
# Print the result
print("Predicted:", " ".join("%5s" % classes[predicted[j]] for j in range(4)))

Predicted: cat automobile automobile airplane
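torch.max(outputs, 1) returns two tensors: the largest value in each row and the index where it occurs. Only the index is needed for the predicted class. The sketch below uses made-up scores and also shows that applying softmax first does not change which class wins.

```python
import torch

outputs = torch.tensor([[0.1, 2.0, -1.0],
                        [1.5, 0.3, 0.2]])

values, predicted = torch.max(outputs, 1)  # max along the class dimension
print(values)     # tensor([2.0000, 1.5000])
print(predicted)  # tensor([1, 0])

# Softmax is monotonic, so the most probable class is the highest-scoring one
probs = torch.softmax(outputs, dim=1)
same_choice = torch.equal(probs.argmax(dim=1), predicted)
print(same_choice)  # True
```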

# Measure the accuracy over the whole test set
correct = 0
total = 0
# No gradients are needed during evaluation
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print("Accuracy of the network on the 10000 test images: %d %%" % (100 * correct / total))

Accuracy of the network on the 10000 test images: 52 %

Analyzing the results

With 10 classes, random guessing would reach about 10% accuracy. The model reaches 52%, so it has learned something real from the data.

Computing the accuracy for each class

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels)
        # Tally every sample in the batch under its true class
        for i in range(labels.size(0)):
            label = labels[i].item()
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print("Accuracy of %5s: %2d %%" % (classes[i], 100 * class_correct[i] / class_total[i]))

Accuracy of airplane: 60 %
Accuracy of automobile: 85 %
Accuracy of bird: 47 %
Accuracy of cat: 24 %
Accuracy of deer: 58 %
Accuracy of dog: 38 %
Accuracy of frog: 59 %
Accuracy of horse: 51 %
Accuracy of ship: 53 %
Accuracy of truck: 45 %
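The per-class tally can also be written without an inner Python loop. This is an alternative sketch using torch.bincount, not the method used above, and the labels and predictions are made up (three classes, six samples).

```python
import torch

labels = torch.tensor([0, 0, 1, 1, 1, 2])
predicted = torch.tensor([0, 1, 1, 1, 0, 2])
num_classes = 3

# Number of samples per true class
class_total = torch.bincount(labels, minlength=num_classes)
# Number of correctly predicted samples per true class
class_correct = torch.bincount(labels[predicted == labels], minlength=num_classes)

per_class = class_correct.float() / class_total.float()
overall = (predicted == labels).float().mean().item()

print(class_total.tolist())    # [2, 3, 1]
print(class_correct.tolist())  # [1, 2, 1]
print(per_class, overall)
```

In a real evaluation the two counts would be summed over all test batches before dividing.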

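Training the model on a GPU

  • To take full advantage of PyTorch Tensors and speed up training, the training process can be moved to a GPU.

Define the device: use the GPU if CUDA is available, otherwise fall back to the CPU.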

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

cuda:0

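To train on the GPU, move the model to the device, and move each batch of input images and labels to the same device:

# Move the model's parameters and buffers to the device
net.to(device)
# Inside the training loop, move each batch to the device as well
inputs, labels = data[0].to(device), data[1].to(device)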
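Put together, one device-aware training step might look like the sketch below. It uses a small stand-in linear model and a random batch instead of the CIFAR10 network and loader, and it falls back to the CPU when CUDA is not available.

```python
import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

net = nn.Linear(8, 3).to(device)   # stand-in model, moved to the device
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Stand-in batch in the same [inputs, labels] layout a DataLoader yields
data = [torch.randn(4, 8), torch.tensor([0, 1, 2, 1])]
inputs, labels = data[0].to(device), data[1].to(device)

optimizer.zero_grad()
loss = criterion(net(inputs), labels)
loss.backward()
optimizer.step()

print(device, loss.item())
```

The model and every batch must sit on the same device; mixing CPU tensors with a GPU model raises a runtime error.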