Highlights of the ResNet network:
1. Very deep architecture (networks with more than 1,000 layers have been trained)
2. The residual module
3. Batch Normalization to accelerate training (dropout is dropped)
The left plot shows networks built by simply stacking convolutional and pooling layers.
The 20-layer network reaches a training error of roughly 1%–2%,
while the 56-layer network only reaches roughly 7%–8%.
So with plain stacking of convolutional and pooling layers, deeper does not automatically train better.
As the network grows deeper, vanishing and exploding gradients become more and more pronounced.
Suppose the error gradient contributed by each layer is a factor smaller than 1. During backpropagation,
every layer the error propagates back through multiplies it by such a factor, so the deeper the network, the closer the product gets to 0
and the smaller the gradient becomes.
If instead the per-layer factor is greater than 1, the repeated multiplication blows the gradient up: gradient explosion.
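A tiny numeric illustration of this compounding (the per-layer factors 0.9 and 1.1 are assumed, purely for intuition):

for depth in (10, 50, 100):
    # a factor below 1 shrinks the gradient toward 0; a factor above 1 blows it up
    print(depth, 0.9 ** depth, 1.1 ** depth)
# at depth 100: 0.9**100 ≈ 2.7e-5 (vanishing), 1.1**100 ≈ 1.4e4 (exploding)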
The usual remedies for vanishing and exploding gradients:
input normalization, careful weight initialization, and BN (Batch Normalization).
The degradation problem:
even with vanishing and exploding gradients under control, a deeper network can still perform worse than a shallower one.
To solve this, ResNet proposes the residual structure:
The residual block on the left is the one used in the shallower networks (e.g., ResNet-34).
Its main branch passes the input through two 3x3 convolutional layers, while an arc on the right, the shortcut, connects the input directly to the output.
The feature map produced by the convolutions is added element-wise to the input feature map, and the sum then passes through a ReLU activation; in other words, the block learns a residual F(x) and outputs H(x) = F(x) + x.
For this addition to work, the main branch and the shortcut must output feature maps of identical shape (height, width, and channels).
The block on the right, the bottleneck, is used in the deeper networks (ResNet-50/101/152).
Its main branch first applies a 1x1 convolution (to reduce the channel dimension), then a 3x3 convolution, then another 1x1 convolution (to expand the dimension back).
Comparing the parameters the two designs need shows that the bottleneck saves a substantial number of parameters, and the more residual blocks the network stacks, the larger the savings; a rough calculation follows.
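A back-of-the-envelope parameter count for one block at 256 input/output channels (biases and BN parameters ignored; an illustrative calculation):

# BasicBlock: two 3x3 convs at 256 channels
basic = 3 * 3 * 256 * 256 + 3 * 3 * 256 * 256                        # 1,179,648

# Bottleneck: 1x1 reduce to 64, 3x3 at 64, 1x1 expand back to 256
bottleneck = 1 * 1 * 256 * 64 + 3 * 3 * 64 * 64 + 1 * 1 * 64 * 256  # 69,632

print(basic, bottleneck)  # the bottleneck needs roughly 17x fewer parameters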
In the solid-line blocks of the architecture diagram, the input and output feature maps have the same shape, so they can be added directly.
In the dotted-line blocks, the input and output shapes differ, so the shortcut needs its own 1x1 convolution (the downsample branch in the code below) to match them.
Batch Normalization
The goal is to make each dimension (channel) of the feature maps computed for a batch of data follow a distribution with mean 0 and variance 1.
This accelerates the network's convergence (training) and improves accuracy.
For an input x with d dimensions, each dimension is standardized independently.
If the input x is an RGB three-channel color image, then d is the number of channels, i.e. d = 3.
When using BN, the training flag is set to True during training and to False during validation (in PyTorch this corresponds to calling model.train() and model.eval()).
Place the BN layer between the convolutional layer and the activation layer; a sketch of the computation follows.
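A minimal sketch of what BN computes per channel in training mode (illustrative, assuming PyTorch; not the library's internal code):

import torch

x = torch.randn(8, 3, 4, 4)                                 # a batch: [N, C, H, W]
mean = x.mean(dim=(0, 2, 3), keepdim=True)                   # per-channel mean over N, H, W
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)     # per-channel (biased) variance
eps = 1e-5
x_hat = (x - mean) / torch.sqrt(var + eps)                   # normalized: mean 0, variance 1
gamma = torch.ones(1, 3, 1, 1)                               # learnable scale (init 1)
beta = torch.zeros(1, 3, 1, 1)                               # learnable shift (init 0)
y = gamma * x_hat + beta

# matches nn.BatchNorm2d in training mode up to numerical tolerance
bn = torch.nn.BatchNorm2d(3)
print(torch.allclose(y, bn(x), atol=1e-5))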
A brief introduction to transfer learning
Advantages:
1. A good model can be trained quickly (fewer training epochs)
2. Good results can be reached even when the data set is small
Note: when using someone else's pretrained weights, pay attention to the preprocessing they used and replicate it.
Transfer learning takes the already-learned parameters of the shallow layers of an existing network and transfers them into our new network,
so the new network starts out able to recognize generic low-level features.
Common transfer-learning approaches (a sketch follows the list):
1. Load the pretrained weights, then train all parameters
2. Load the pretrained weights, then train only the parameters of the last few layers
3. Load the pretrained weights, add an extra fully connected layer on top of the original network, and train only that final layer
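A minimal sketch of approaches 1 and 2 in PyTorch (the weight-file name mirrors the training script below; treat the snippet as illustrative, not a drop-in script):

import torch
import torch.nn as nn
from model import resnet34  # the network definition shown below

net = resnet34()
# Approach 1: load the pretrained weights, then fine-tune every parameter.
net.load_state_dict(torch.load("./resnet34-pre.pth", map_location="cpu"))

# Approach 2: freeze all pretrained parameters, then retrain only what you replace.
for param in net.parameters():
    param.requires_grad = False
net.fc = nn.Linear(net.fc.in_features, 5)  # new head; its params default to requires_grad=True

# hand only the trainable parameters to the optimizer
params = [p for p in net.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(params, lr=0.0001)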
ResNeXt
ResNeXt replaces the 3x3 convolution in the bottleneck block with a group convolution.
g is the number of groups the channels are split into; each group convolves only its own slice of the channels, which divides that layer's parameter count by g, as the sketch below shows.
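A small sketch of the idea via nn.Conv2d's groups argument (the channel counts are assumed, purely illustrative):

import torch.nn as nn

standard = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)            # g = 1
grouped = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False, groups=32)  # g = 32

# 3*3*64*64 = 36,864 weights vs 32 groups of 3*3*2*2 = 1,152 weights in total
print(sum(p.numel() for p in standard.parameters()))  # 36864
print(sum(p.numel() for p in grouped.parameters()))   # 1152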
Code implementation
Network definition
import torch.nn as nn
import torch
class BasicBlock(nn.Module):  # the residual block used by the 18- and 34-layer networks
    expansion = 1  # for 18/34-layer nets the number of kernels does not change within a block

    def __init__(self, in_channel, out_channel, stride=1, downsample=None, **kwargs):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=out_channel,
                               kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channel)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2d(in_channels=out_channel, out_channels=out_channel,
                               kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channel)
        self.downsample = downsample

    def forward(self, x):
        identity = x  # output of the shortcut branch
        if self.downsample is not None:
            identity = self.downsample(x)

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        out += identity
        out = self.relu(out)

        return out
class Bottleneck(nn.Module):  # the residual block used by the 50/101/152-layer networks
    """
    Note: in the original paper, the dotted-line (downsampling) residual block uses
    stride 2 in the first 1x1 conv of the main branch and stride 1 in the 3x3 conv.
    The official PyTorch implementation instead uses stride 1 in the first 1x1 conv
    and stride 2 in the 3x3 conv, which improves top-1 accuracy by roughly 0.5%.
    See ResNet v1.5: https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch
    """
    expansion = 4  # the third conv uses 4x as many kernels as the first two

    def __init__(self, in_channel, out_channel, stride=1, downsample=None,
                 groups=1, width_per_group=64):
        super(Bottleneck, self).__init__()

        width = int(out_channel * (width_per_group / 64.)) * groups

        self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=width,
                               kernel_size=1, stride=1, bias=False)  # squeeze channels
        self.bn1 = nn.BatchNorm2d(width)
        # -----------------------------------------
        self.conv2 = nn.Conv2d(in_channels=width, out_channels=width, groups=groups,
                               kernel_size=3, stride=stride, bias=False, padding=1)
        self.bn2 = nn.BatchNorm2d(width)
        # -----------------------------------------
        self.conv3 = nn.Conv2d(in_channels=width, out_channels=out_channel*self.expansion,
                               kernel_size=1, stride=1, bias=False)  # unsqueeze channels
        self.bn3 = nn.BatchNorm2d(out_channel*self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample

    def forward(self, x):
        identity = x
        if self.downsample is not None:
            identity = self.downsample(x)

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        out += identity
        out = self.relu(out)

        return out
class ResNet(nn.Module):

    def __init__(self,
                 block,
                 blocks_num,
                 num_classes=1000,
                 include_top=True,
                 groups=1,
                 width_per_group=64):
        super(ResNet, self).__init__()
        self.include_top = include_top
        self.in_channel = 64

        self.groups = groups
        self.width_per_group = width_per_group

        self.conv1 = nn.Conv2d(3, self.in_channel, kernel_size=7, stride=2,
                               padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(self.in_channel)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, blocks_num[0])
        self.layer2 = self._make_layer(block, 128, blocks_num[1], stride=2)
        self.layer3 = self._make_layer(block, 256, blocks_num[2], stride=2)
        self.layer4 = self._make_layer(block, 512, blocks_num[3], stride=2)
        if self.include_top:
            self.avgpool = nn.AdaptiveAvgPool2d((1, 1))  # output size = (1, 1)
            self.fc = nn.Linear(512 * block.expansion, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')

    def _make_layer(self, block, channel, block_num, stride=1):
        downsample = None
        if stride != 1 or self.in_channel != channel * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.in_channel, channel * block.expansion, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(channel * block.expansion))

        layers = []
        layers.append(block(self.in_channel,
                            channel,
                            downsample=downsample,
                            stride=stride,
                            groups=self.groups,
                            width_per_group=self.width_per_group))
        self.in_channel = channel * block.expansion

        for _ in range(1, block_num):
            layers.append(block(self.in_channel,
                                channel,
                                groups=self.groups,
                                width_per_group=self.width_per_group))

        return nn.Sequential(*layers)
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        if self.include_top:
            x = self.avgpool(x)
            x = torch.flatten(x, 1)
            x = self.fc(x)

        return x
def resnet34(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnet34-333f7ec4.pth
    return ResNet(BasicBlock, [3, 4, 6, 3], num_classes=num_classes, include_top=include_top)


def resnet50(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnet50-19c8e357.pth
    return ResNet(Bottleneck, [3, 4, 6, 3], num_classes=num_classes, include_top=include_top)


def resnet101(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnet101-5d3b4d8f.pth
    return ResNet(Bottleneck, [3, 4, 23, 3], num_classes=num_classes, include_top=include_top)


def resnext50_32x4d(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth
    groups = 32
    width_per_group = 4
    return ResNet(Bottleneck, [3, 4, 6, 3],
                  num_classes=num_classes,
                  include_top=include_top,
                  groups=groups,
                  width_per_group=width_per_group)


def resnext101_32x8d(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth
    groups = 32
    width_per_group = 8
    return ResNet(Bottleneck, [3, 4, 23, 3],
                  num_classes=num_classes,
                  include_top=include_top,
                  groups=groups,
                  width_per_group=width_per_group)
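As a quick sanity check of the definitions above (a usage sketch, not part of the original file):

if __name__ == "__main__":
    net = resnet34(num_classes=5)
    x = torch.randn(1, 3, 224, 224)  # one dummy RGB image
    print(net(x).shape)              # expected: torch.Size([1, 5])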
Training script
import os
import sys
import json
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import transforms, datasets
from tqdm import tqdm
from model import resnet34
def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.".format(device))

    data_transform = {
        "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
        "val": transforms.Compose([transforms.Resize(256),
                                   transforms.CenterCrop(224),
                                   transforms.ToTensor(),
                                   transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])}

    data_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))  # get data root path
    image_path = os.path.join(data_root, "data_set", "flower_data")  # flower data set path
    assert os.path.exists(image_path), "{} path does not exist.".format(image_path)
    train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),
                                         transform=data_transform["train"])
    train_num = len(train_dataset)

    # {'daisy':0, 'dandelion':1, 'roses':2, 'sunflower':3, 'tulips':4}
    flower_list = train_dataset.class_to_idx
    cla_dict = dict((val, key) for key, val in flower_list.items())
    # write dict into json file
    json_str = json.dumps(cla_dict, indent=4)
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    batch_size = 16
    nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])  # number of workers
    print('Using {} dataloader workers per process'.format(nw))

    train_loader = torch.utils.data.DataLoader(train_dataset,
                                               batch_size=batch_size, shuffle=True,
                                               num_workers=nw)

    validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"),
                                            transform=data_transform["val"])
    val_num = len(validate_dataset)
    validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                                  batch_size=batch_size, shuffle=False,
                                                  num_workers=nw)

    print("using {} images for training, {} images for validation.".format(train_num,
                                                                            val_num))

    net = resnet34()
    # load pretrained weights for transfer learning
    # download url: https://download.pytorch.org/models/resnet34-333f7ec4.pth
    model_weight_path = "./resnet34-pre.pth"
    assert os.path.exists(model_weight_path), "file {} does not exist.".format(model_weight_path)
    net.load_state_dict(torch.load(model_weight_path, map_location='cpu'))
    # option: freeze all pretrained parameters and train only the new fc layer
    # for param in net.parameters():
    #     param.requires_grad = False

    # change fc layer structure: 5 output classes for the flower data set
    in_channel = net.fc.in_features
    net.fc = nn.Linear(in_channel, 5)
    net.to(device)

    # define loss function
    loss_function = nn.CrossEntropyLoss()

    # construct an optimizer over the trainable parameters only
    params = [p for p in net.parameters() if p.requires_grad]
    optimizer = optim.Adam(params, lr=0.0001)

    epochs = 3
    best_acc = 0.0
    save_path = './resNet34.pth'
    train_steps = len(train_loader)
    for epoch in range(epochs):
        # train
        net.train()
        running_loss = 0.0
        train_bar = tqdm(train_loader, file=sys.stdout)
        for step, data in enumerate(train_bar):
            images, labels = data
            optimizer.zero_grad()
            logits = net(images.to(device))
            loss = loss_function(logits, labels.to(device))
            loss.backward()
            optimizer.step()

            # print statistics
            running_loss += loss.item()
            train_bar.desc = "train epoch[{}/{}] loss:{:.3f}".format(epoch + 1,
                                                                     epochs,
                                                                     loss)

        # validate
        net.eval()
        acc = 0.0  # accumulate accurate number / epoch
        with torch.no_grad():
            val_bar = tqdm(validate_loader, file=sys.stdout)
            for val_data in val_bar:
                val_images, val_labels = val_data
                outputs = net(val_images.to(device))
                predict_y = torch.max(outputs, dim=1)[1]
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()

                val_bar.desc = "valid epoch[{}/{}]".format(epoch + 1,
                                                           epochs)

        val_accurate = acc / val_num
        print('[epoch %d] train_loss: %.3f  val_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))

        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)

    print('Finished Training')


if __name__ == '__main__':
    main()
Training results
Prediction script
import os
import json
import torch
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt
from model import resnet34
def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    data_transform = transforms.Compose(
        [transforms.Resize(256),
         transforms.CenterCrop(224),
         transforms.ToTensor(),
         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

    # load image
    img_path = "../tulip.jpg"
    assert os.path.exists(img_path), "file: '{}' does not exist.".format(img_path)
    img = Image.open(img_path)
    plt.imshow(img)
    # [N, C, H, W]
    img = data_transform(img)
    # expand batch dimension
    img = torch.unsqueeze(img, dim=0)

    # read class_indict
    json_path = './class_indices.json'
    assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)

    with open(json_path, "r") as f:
        class_indict = json.load(f)

    # create model
    model = resnet34(num_classes=5).to(device)

    # load model weights
    weights_path = "./resNet34.pth"
    assert os.path.exists(weights_path), "file: '{}' does not exist.".format(weights_path)
    model.load_state_dict(torch.load(weights_path, map_location=device))

    # prediction
    model.eval()
    with torch.no_grad():
        # predict class
        output = torch.squeeze(model(img.to(device))).cpu()
        predict = torch.softmax(output, dim=0)
        predict_cla = torch.argmax(predict).numpy()

    print_res = "class: {}   prob: {:.3}".format(class_indict[str(predict_cla)],
                                                 predict[predict_cla].numpy())
    plt.title(print_res)
    for i in range(len(predict)):
        print("class: {:10}   prob: {:.3}".format(class_indict[str(i)],
                                                  predict[i].numpy()))
    plt.show()


if __name__ == '__main__':
    main()