The previous chapter implemented MNIST handwritten-digit recognition with a multilayer perceptron; this chapter implements MNIST handwritten-digit recognition with a convolutional neural network.
1. Preparing the Data
This example again uses the MNIST dataset; its data and labels were described in detail in earlier chapters. Unlike those chapters, which simply "flattened" the data, here the channel dimension of the data must be marked explicitly. The code is as follows:
import numpy as np
import einops.layers.torch as elt

# Load the data
x_train = np.load("../dataset/mnist/x_train.npy")
y_train_label = np.load("../dataset/mnist/y_train_label.npy")

x_train = np.expand_dims(x_train, axis=1)  # expand along the specified axis
print(x_train.shape)
This step adjusts the data layout: np.expand_dims inserts a new axis at the specified position, here the second axis (PyTorch's channel dimension). The result is as follows:
(60000, 1, 28, 28)
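As a quick sanity check, the effect of np.expand_dims can be verified on a small synthetic array (the random data below is only a stand-in for the real MNIST arrays); the same channel axis can also be added with indexing or reshape:

```python
import numpy as np

# Stand-in for x_train: 4 grayscale 28x28 images, laid out (batch, height, width)
fake_images = np.random.rand(4, 28, 28).astype(np.float32)

# Insert a channel axis at position 1, giving (batch, channel, height, width)
with_channel = np.expand_dims(fake_images, axis=1)
print(with_channel.shape)  # (4, 1, 28, 28)

# Indexing with None and an explicit reshape produce the same layout
assert with_channel.shape == fake_images[:, None].shape
assert with_channel.shape == fake_images.reshape(4, 1, 28, 28).shape
```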
2. Designing the Model
Next, the model is built with the PyTorch 2.0 framework. In this example, convolutional layers process the data; the complete model is as follows:
import torch
import torch.nn as nn
import numpy as np
import einops.layers.torch as elt

class MnistNetword(nn.Module):
    def __init__(self):
        super(MnistNetword, self).__init__()
        # Front-end feature-extraction module
        self.convs_stack = nn.Sequential(
            nn.Conv2d(1, 12, kernel_size=7),   # first convolutional layer
            nn.ReLU(),
            nn.Conv2d(12, 24, kernel_size=5),  # second convolutional layer
            nn.ReLU(),
            nn.Conv2d(24, 6, kernel_size=3)    # third convolutional layer
        )
        # Final classifier layer
        self.logits_layer = nn.Linear(in_features=1536, out_features=10)

    def forward(self, inputs):
        image = inputs
        x = self.convs_stack(image)
        # elt.Rearrange reshapes the input tensor; torch.nn.Flatten would do the same job
        x = elt.Rearrange("b c h w -> b (c h w)")(x)
        logits = self.logits_layer(x)
        return logits

model = MnistNetword()
torch.save(model, "model.pth")
Here three convolutional layers serve as the front-end feature extractor, and a final fully connected layer serves as the classifier. Note that the input dimension of the classifier's fully connected layer must be worked out by hand; alternatively, readers can print the feature extractor's output step by step and feed each resulting shape into the next layer as its input dimension. Finally, the model is saved.
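The step-by-step shape inspection described above can be sketched by pushing a dummy batch through the same convolution stack and reading off the flattened size; this recovers the 1536 figure without manual arithmetic:

```python
import torch
import torch.nn as nn

# The same convolution stack as in MnistNetword (no padding, stride 1)
convs_stack = nn.Sequential(
    nn.Conv2d(1, 12, kernel_size=7),   # 28x28 -> 22x22
    nn.ReLU(),
    nn.Conv2d(12, 24, kernel_size=5),  # 22x22 -> 18x18
    nn.ReLU(),
    nn.Conv2d(24, 6, kernel_size=3),   # 18x18 -> 16x16
)

dummy = torch.zeros(1, 1, 28, 28)      # one fake MNIST image
features = convs_stack(dummy)
print(features.shape)                  # torch.Size([1, 6, 16, 16])
print(features.flatten(1).shape[1])    # 1536 = 6 * 16 * 16
```

Each convolution without padding shrinks a side from H to H - kernel_size + 1, which is where the 22, 18, and 16 come from.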
3. The Convolution-Based MNIST Classification Model
We now come to the final example of this chapter: classifying MNIST handwritten digits. The complete training code is as follows:
import torch
import torch.nn as nn
import numpy as np
import einops.layers.torch as elt

# Load the data
x_train = np.load("../dataset/mnist/x_train.npy")
y_train_label = np.load("../dataset/mnist/y_train_label.npy")

x_train = np.expand_dims(x_train, axis=1)
print(x_train.shape)

class MnistNetword(nn.Module):
    def __init__(self):
        super(MnistNetword, self).__init__()
        self.convs_stack = nn.Sequential(
            nn.Conv2d(1, 12, kernel_size=7),
            nn.ReLU(),
            nn.Conv2d(12, 24, kernel_size=5),
            nn.ReLU(),
            nn.Conv2d(24, 6, kernel_size=3)
        )
        self.logits_layer = nn.Linear(in_features=1536, out_features=10)

    def forward(self, inputs):
        image = inputs
        x = self.convs_stack(image)
        x = elt.Rearrange("b c h w -> b (c h w)")(x)
        logits = self.logits_layer(x)
        return logits

device = "cuda" if torch.cuda.is_available() else "cpu"
# Remember to move the model to the GPU for computation
model = MnistNetword().to(device)
model = torch.compile(model)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

batch_size = 128
for epoch in range(63):
    train_num = len(x_train) // batch_size
    train_loss = 0.
    for i in range(train_num):
        start = i * batch_size
        end = (i + 1) * batch_size

        x_batch = torch.tensor(x_train[start:end]).to(device)
        y_batch = torch.tensor(y_train_label[start:end]).to(device)

        pred = model(x_batch)
        loss = loss_fn(pred, y_batch)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        train_loss += loss.item()  # accumulate the loss of each batch

    # Compute and print the average loss, plus the accuracy on the last batch
    train_loss /= train_num
    accuracy = (pred.argmax(1) == y_batch).type(torch.float32).sum().item() / batch_size
    print("epoch:", epoch, "train_loss:", round(train_loss, 2), "accuracy:", round(accuracy, 2))
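The accuracy printed inside the loop above is measured on the last mini-batch only. A helper that averages accuracy over all complete batches gives a steadier figure; the sketch below uses stand-in names and random data (not from the book) so it can be tried independently:

```python
import numpy as np
import torch
import torch.nn as nn

def batch_accuracy(model, x, y, batch_size=128, device="cpu"):
    # Average accuracy over all complete mini-batches of (x, y)
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for start in range(0, len(x) - batch_size + 1, batch_size):
            xb = torch.tensor(x[start:start + batch_size]).to(device)
            yb = torch.tensor(y[start:start + batch_size]).to(device)
            correct += (model(xb).argmax(1) == yb).sum().item()
            total += batch_size
    return correct / total

# Quick check with stand-in data and an untrained linear model;
# with random labels the accuracy stays near chance level
fake_x = np.random.rand(256, 1, 28, 28).astype(np.float32)
fake_y = np.random.randint(0, 10, size=256)
toy = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
print(batch_accuracy(toy, fake_x, fake_y))
```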
Here, the convolutional neural network module newly defined in this chapter serves as the local feature extractor, while the loss function and optimizer follow the same pattern used in earlier chapters for model training. The final results are shown below; readers are encouraged to verify them for themselves.
(60000, 1, 28, 28)
epoch: 0 train_loss: 2.3 accuracy: 0.11
epoch: 1 train_loss: 2.3 accuracy: 0.13
epoch: 2 train_loss: 2.3 accuracy: 0.2
epoch: 3 train_loss: 2.3 accuracy: 0.18
…
epoch: 58 train_loss: 0.5 accuracy: 0.98
epoch: 59 train_loss: 0.49 accuracy: 0.98
epoch: 60 train_loss: 0.49 accuracy: 0.98
epoch: 61 train_loss: 0.48 accuracy: 0.98
epoch: 62 train_loss: 0.48 accuracy: 0.98
Process finished with exit code 0
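Since torch.save(model, "model.pth") pickled the whole module rather than just a state_dict, reloading it requires the MnistNetword class to be importable in the loading script. The round trip can be sketched with a small stand-in module (TinyNet and its file name below are illustrative, not from the book):

```python
import torch
import torch.nn as nn

# Stand-in module; for the real model, MnistNetword must be defined
# (or importable) wherever torch.load is called.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(784, 10)

    def forward(self, x):
        return self.layer(x.flatten(1))

model = TinyNet()
torch.save(model, "tiny_model.pth")  # pickles the whole module

# weights_only=False is required on newer PyTorch versions
# to unpickle a full module rather than a plain state_dict
restored = torch.load("tiny_model.pth", weights_only=False)
restored.eval()
with torch.no_grad():
    pred = restored(torch.zeros(1, 1, 28, 28)).argmax(1)
print(pred.shape)  # torch.Size([1])
```

Saving only model.state_dict() and loading it into a freshly constructed instance is the more portable pattern, since it does not depend on pickling the class itself.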
This article is excerpted from 《PyTorch 2.0深度學習從零開始學》 (Learn Deep Learning from Scratch with PyTorch 2.0), a book rich in hands-on examples that helps readers quickly master deep learning algorithms and their common use cases.