Highlights of the ResNet network:
1. Very deep architecture (networks with more than 1,000 layers have been trained)
2. The residual module
3. Batch Normalization to accelerate training (dropout is dropped)
The left plot shows networks built by simply stacking convolutional and pooling layers.
The 20-layer network reaches a training error of roughly 1%–2%,
while the 56-layer network only reaches roughly 7%–8%.
So with plain stacking of convolutional and pooling layers, deeper does not automatically train better.
As the network grows deeper, vanishing and exploding gradients become more and more pronounced.
Suppose the error gradient contributed by each layer is a factor smaller than 1. During backpropagation,
every layer the error propagates back through multiplies it by such a factor, so the deeper the network, the closer the product gets to 0
and the smaller the gradient becomes.
If instead the per-layer factor is greater than 1, the repeated multiplication blows the gradient up: gradient explosion.
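A tiny numeric illustration of this compounding (the per-layer factors 0.9 and 1.1 are assumed, purely for intuition):

for depth in (10, 50, 100):
    # a factor below 1 shrinks the gradient toward 0; a factor above 1 blows it up
    print(depth, 0.9 ** depth, 1.1 ** depth)
# at depth 100: 0.9**100 ≈ 2.7e-5 (vanishing), 1.1**100 ≈ 1.4e4 (exploding)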
The usual remedies for vanishing and exploding gradients:
input normalization, careful weight initialization, and BN (Batch Normalization).
The degradation problem:
even with vanishing and exploding gradients under control, a deeper network can still perform worse than a shallower one.
To solve this, ResNet proposes the residual structure:
The residual block on the left is the one used in the shallower networks (e.g., ResNet-34).
Its main branch passes the input through two 3x3 convolutional layers, while an arc on the right, the shortcut, connects the input directly to the output.
The feature map produced by the convolutions is added element-wise to the input feature map, and the sum then passes through a ReLU activation; in other words, the block learns a residual F(x) and outputs H(x) = F(x) + x.
For this addition to work, the main branch and the shortcut must output feature maps of identical shape (height, width, and channels).
The block on the right, the bottleneck, is used in the deeper networks (ResNet-50/101/152).
Its main branch first applies a 1x1 convolution (to reduce the channel dimension), then a 3x3 convolution, then another 1x1 convolution (to expand the dimension back).
Comparing the parameters the two designs need shows that the bottleneck saves a substantial number of parameters, and the more residual blocks the network stacks, the larger the savings; a rough calculation follows.
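A back-of-the-envelope parameter count for one block at 256 input/output channels (biases and BN parameters ignored; an illustrative calculation):

# BasicBlock: two 3x3 convs at 256 channels
basic = 3 * 3 * 256 * 256 + 3 * 3 * 256 * 256                        # 1,179,648

# Bottleneck: 1x1 reduce to 64, 3x3 at 64, 1x1 expand back to 256
bottleneck = 1 * 1 * 256 * 64 + 3 * 3 * 64 * 64 + 1 * 1 * 64 * 256  # 69,632

print(basic, bottleneck)  # the bottleneck needs roughly 17x fewer parameters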
In the solid-line blocks of the architecture diagram, the input and output feature maps have the same shape, so they can be added directly.
In the dotted-line blocks, the input and output shapes differ, so the shortcut needs its own 1x1 convolution (the downsample branch in the code below) to match them.
Batch Normalization
The goal is to make each dimension (channel) of the feature maps computed for a batch of data follow a distribution with mean 0 and variance 1.
This accelerates the network's convergence (training) and improves accuracy.
For an input x with d dimensions, each dimension is standardized independently.
If the input x is an RGB three-channel color image, then d is the number of channels, i.e. d = 3.
When using BN, the training flag is set to True during training and to False during validation (in PyTorch this corresponds to calling model.train() and model.eval()).
Place the BN layer between the convolutional layer and the activation layer; a sketch of the computation follows.
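A minimal sketch of what BN computes per channel in training mode (illustrative, assuming PyTorch; not the library's internal code):

import torch

x = torch.randn(8, 3, 4, 4)                                 # a batch: [N, C, H, W]
mean = x.mean(dim=(0, 2, 3), keepdim=True)                   # per-channel mean over N, H, W
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)     # per-channel (biased) variance
eps = 1e-5
x_hat = (x - mean) / torch.sqrt(var + eps)                   # normalized: mean 0, variance 1
gamma = torch.ones(1, 3, 1, 1)                               # learnable scale (init 1)
beta = torch.zeros(1, 3, 1, 1)                               # learnable shift (init 0)
y = gamma * x_hat + beta

# matches nn.BatchNorm2d in training mode up to numerical tolerance
bn = torch.nn.BatchNorm2d(3)
print(torch.allclose(y, bn(x), atol=1e-5))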
A brief introduction to transfer learning
Advantages:
1. A good model can be trained quickly (fewer training epochs)
2. Good results can be reached even when the data set is small
Note: when using someone else's pretrained weights, pay attention to the preprocessing they used and replicate it.
Transfer learning takes the already-learned parameters of the shallow layers of an existing network and transfers them into our new network,
so the new network starts out able to recognize generic low-level features.
Common transfer-learning approaches (a sketch follows the list):
1. Load the pretrained weights, then train all parameters
2. Load the pretrained weights, then train only the parameters of the last few layers
3. Load the pretrained weights, add an extra fully connected layer on top of the original network, and train only that final layer
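A minimal sketch of approaches 1 and 2 in PyTorch (the weight-file name mirrors the training script below; treat the snippet as illustrative, not a drop-in script):

import torch
import torch.nn as nn
from model import resnet34  # the network definition shown below

net = resnet34()
# Approach 1: load the pretrained weights, then fine-tune every parameter.
net.load_state_dict(torch.load("./resnet34-pre.pth", map_location="cpu"))

# Approach 2: freeze all pretrained parameters, then retrain only what you replace.
for param in net.parameters():
    param.requires_grad = False
net.fc = nn.Linear(net.fc.in_features, 5)  # new head; its params default to requires_grad=True

# hand only the trainable parameters to the optimizer
params = [p for p in net.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(params, lr=0.0001)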
ResNeXt
ResNeXt replaces the 3x3 convolution in the bottleneck block with a group convolution.
g is the number of groups the channels are split into; each group convolves only its own slice of the channels, which divides that layer's parameter count by g, as the sketch below shows.
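A small sketch of the idea via nn.Conv2d's groups argument (the channel counts are assumed, purely illustrative):

import torch.nn as nn

standard = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)            # g = 1
grouped = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False, groups=32)  # g = 32

# 3*3*64*64 = 36,864 weights vs 32 groups of 3*3*2*2 = 1,152 weights in total
print(sum(p.numel() for p in standard.parameters()))  # 36864
print(sum(p.numel() for p in grouped.parameters()))   # 1152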
Code implementation
Network definition
import torch.nn as nn
import torch
class BasicBlock(nn.Module):  # the residual block used by the 18- and 34-layer networks
    expansion = 1  # for 18/34-layer nets the number of kernels does not change within a block

    def __init__(self, in_channel, out_channel, stride=1, downsample=None, **kwargs):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=out_channel,
                               kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channel)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2d(in_channels=out_channel, out_channels=out_channel,
                               kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channel)
        self.downsample = downsample

    def forward(self, x):
        identity = x  # output of the shortcut branch
        if self.downsample is not None:
            identity = self.downsample(x)

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        out += identity
        out = self.relu(out)

        return out
class Bottleneck(nn.Module):  # the residual block used by the 50/101/152-layer networks
    """
    Note: in the original paper, the dotted-line (downsampling) residual block uses
    stride 2 in the first 1x1 conv of the main branch and stride 1 in the 3x3 conv.
    The official PyTorch implementation instead uses stride 1 in the first 1x1 conv
    and stride 2 in the 3x3 conv, which improves top-1 accuracy by roughly 0.5%.
    See ResNet v1.5: https://ngc.nvidia.com/catalog/model-scripts/nvidia:resnet_50_v1_5_for_pytorch
    """
    expansion = 4  # the third conv uses 4x as many kernels as the first two

    def __init__(self, in_channel, out_channel, stride=1, downsample=None,
                 groups=1, width_per_group=64):
        super(Bottleneck, self).__init__()

        width = int(out_channel * (width_per_group / 64.)) * groups

        self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=width,
                               kernel_size=1, stride=1, bias=False)  # squeeze channels
        self.bn1 = nn.BatchNorm2d(width)
        # -----------------------------------------
        self.conv2 = nn.Conv2d(in_channels=width, out_channels=width, groups=groups,
                               kernel_size=3, stride=stride, bias=False, padding=1)
        self.bn2 = nn.BatchNorm2d(width)
        # -----------------------------------------
        self.conv3 = nn.Conv2d(in_channels=width, out_channels=out_channel*self.expansion,
                               kernel_size=1, stride=1, bias=False)  # unsqueeze channels
        self.bn3 = nn.BatchNorm2d(out_channel*self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample

    def forward(self, x):
        identity = x
        if self.downsample is not None:
            identity = self.downsample(x)

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        out += identity
        out = self.relu(out)

        return out
class ResNet(nn.Module):

    def __init__(self,
                 block,
                 blocks_num,
                 num_classes=1000,
                 include_top=True,
                 groups=1,
                 width_per_group=64):
        super(ResNet, self).__init__()
        self.include_top = include_top
        self.in_channel = 64

        self.groups = groups
        self.width_per_group = width_per_group

        self.conv1 = nn.Conv2d(3, self.in_channel, kernel_size=7, stride=2,
                               padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(self.in_channel)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, blocks_num[0])
        self.layer2 = self._make_layer(block, 128, blocks_num[1], stride=2)
        self.layer3 = self._make_layer(block, 256, blocks_num[2], stride=2)
        self.layer4 = self._make_layer(block, 512, blocks_num[3], stride=2)
        if self.include_top:
            self.avgpool = nn.AdaptiveAvgPool2d((1, 1))  # output size = (1, 1)
            self.fc = nn.Linear(512 * block.expansion, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')

    def _make_layer(self, block, channel, block_num, stride=1):
        downsample = None
        if stride != 1 or self.in_channel != channel * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.in_channel, channel * block.expansion, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(channel * block.expansion))

        layers = []
        layers.append(block(self.in_channel,
                            channel,
                            downsample=downsample,
                            stride=stride,
                            groups=self.groups,
                            width_per_group=self.width_per_group))
        self.in_channel = channel * block.expansion

        for _ in range(1, block_num):
            layers.append(block(self.in_channel,
                                channel,
                                groups=self.groups,
                                width_per_group=self.width_per_group))

        return nn.Sequential(*layers)
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        if self.include_top:
            x = self.avgpool(x)
            x = torch.flatten(x, 1)
            x = self.fc(x)

        return x
def resnet34(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnet34-333f7ec4.pth
    return ResNet(BasicBlock, [3, 4, 6, 3], num_classes=num_classes, include_top=include_top)


def resnet50(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnet50-19c8e357.pth
    return ResNet(Bottleneck, [3, 4, 6, 3], num_classes=num_classes, include_top=include_top)


def resnet101(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnet101-5d3b4d8f.pth
    return ResNet(Bottleneck, [3, 4, 23, 3], num_classes=num_classes, include_top=include_top)


def resnext50_32x4d(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth
    groups = 32
    width_per_group = 4
    return ResNet(Bottleneck, [3, 4, 6, 3],
                  num_classes=num_classes,
                  include_top=include_top,
                  groups=groups,
                  width_per_group=width_per_group)


def resnext101_32x8d(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth
    groups = 32
    width_per_group = 8
    return ResNet(Bottleneck, [3, 4, 23, 3],
                  num_classes=num_classes,
                  include_top=include_top,
                  groups=groups,
                  width_per_group=width_per_group)
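As a quick sanity check of the definitions above (a usage sketch, not part of the original file):

if __name__ == "__main__":
    net = resnet34(num_classes=5)
    x = torch.randn(1, 3, 224, 224)  # one dummy RGB image
    print(net(x).shape)              # expected: torch.Size([1, 5])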
Training script
import os
import sys
import json
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import transforms, datasets
from tqdm import tqdm
from model import resnet34
def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.".format(device))

    data_transform = {
        "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
        "val": transforms.Compose([transforms.Resize(256),
                                   transforms.CenterCrop(224),
                                   transforms.ToTensor(),
                                   transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])}

    data_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))  # get data root path
    image_path = os.path.join(data_root, "data_set", "flower_data")  # flower data set path
    assert os.path.exists(image_path), "{} path does not exist.".format(image_path)
    train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),
                                         transform=data_transform["train"])
    train_num = len(train_dataset)

    # {'daisy':0, 'dandelion':1, 'roses':2, 'sunflower':3, 'tulips':4}
    flower_list = train_dataset.class_to_idx
    cla_dict = dict((val, key) for key, val in flower_list.items())
    # write dict into json file
    json_str = json.dumps(cla_dict, indent=4)
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    batch_size = 16
    nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])  # number of workers
    print('Using {} dataloader workers per process'.format(nw))

    train_loader = torch.utils.data.DataLoader(train_dataset,
                                               batch_size=batch_size, shuffle=True,
                                               num_workers=nw)

    validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"),
                                            transform=data_transform["val"])
    val_num = len(validate_dataset)
    validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                                  batch_size=batch_size, shuffle=False,
                                                  num_workers=nw)

    print("using {} images for training, {} images for validation.".format(train_num,
                                                                            val_num))

    net = resnet34()
    # load pretrained weights for transfer learning
    # download url: https://download.pytorch.org/models/resnet34-333f7ec4.pth
    model_weight_path = "./resnet34-pre.pth"
    assert os.path.exists(model_weight_path), "file {} does not exist.".format(model_weight_path)
    net.load_state_dict(torch.load(model_weight_path, map_location='cpu'))
    # option: freeze all pretrained parameters and train only the new fc layer
    # for param in net.parameters():
    #     param.requires_grad = False

    # change fc layer structure: 5 output classes for the flower data set
    in_channel = net.fc.in_features
    net.fc = nn.Linear(in_channel, 5)
    net.to(device)

    # define loss function
    loss_function = nn.CrossEntropyLoss()

    # construct an optimizer over the trainable parameters only
    params = [p for p in net.parameters() if p.requires_grad]
    optimizer = optim.Adam(params, lr=0.0001)

    epochs = 3
    best_acc = 0.0
    save_path = './resNet34.pth'
    train_steps = len(train_loader)
    for epoch in range(epochs):
        # train
        net.train()
        running_loss = 0.0
        train_bar = tqdm(train_loader, file=sys.stdout)
        for step, data in enumerate(train_bar):
            images, labels = data
            optimizer.zero_grad()
            logits = net(images.to(device))
            loss = loss_function(logits, labels.to(device))
            loss.backward()
            optimizer.step()

            # print statistics
            running_loss += loss.item()
            train_bar.desc = "train epoch[{}/{}] loss:{:.3f}".format(epoch + 1,
                                                                     epochs,
                                                                     loss)

        # validate
        net.eval()
        acc = 0.0  # accumulate accurate number / epoch
        with torch.no_grad():
            val_bar = tqdm(validate_loader, file=sys.stdout)
            for val_data in val_bar:
                val_images, val_labels = val_data
                outputs = net(val_images.to(device))
                predict_y = torch.max(outputs, dim=1)[1]
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()

                val_bar.desc = "valid epoch[{}/{}]".format(epoch + 1,
                                                           epochs)

        val_accurate = acc / val_num
        print('[epoch %d] train_loss: %.3f  val_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))

        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)

    print('Finished Training')


if __name__ == '__main__':
    main()
Training results
Prediction script
import os
import json
import torch
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt
from model import resnet34
def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    data_transform = transforms.Compose(
        [transforms.Resize(256),
         transforms.CenterCrop(224),
         transforms.ToTensor(),
         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

    # load image
    img_path = "../tulip.jpg"
    assert os.path.exists(img_path), "file: '{}' does not exist.".format(img_path)
    img = Image.open(img_path)
    plt.imshow(img)
    # [N, C, H, W]
    img = data_transform(img)
    # expand batch dimension
    img = torch.unsqueeze(img, dim=0)

    # read class_indict
    json_path = './class_indices.json'
    assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)

    with open(json_path, "r") as f:
        class_indict = json.load(f)

    # create model
    model = resnet34(num_classes=5).to(device)

    # load model weights
    weights_path = "./resNet34.pth"
    assert os.path.exists(weights_path), "file: '{}' does not exist.".format(weights_path)
    model.load_state_dict(torch.load(weights_path, map_location=device))

    # prediction
    model.eval()
    with torch.no_grad():
        # predict class
        output = torch.squeeze(model(img.to(device))).cpu()
        predict = torch.softmax(output, dim=0)
        predict_cla = torch.argmax(predict).numpy()

    print_res = "class: {}   prob: {:.3}".format(class_indict[str(predict_cla)],
                                                 predict[predict_cla].numpy())
    plt.title(print_res)
    for i in range(len(predict)):
        print("class: {:10}   prob: {:.3}".format(class_indict[str(i)],
                                                  predict[i].numpy()))
    plt.show()


if __name__ == '__main__':
    main()