Adding an attention mechanism to an object-detection network has become very common. As the name suggests, an attention mechanism lets the network focus on the important targets within the whole image. Commonly used attention mechanisms include SE, CA, ECA, CBAM, GAM, and NAM.
1. SE module
Paper: https://arxiv.org/pdf/1709.01507.pdf
Reference: CV領(lǐng)域常用的注意力機(jī)制模塊(SE、CBAM)_學(xué)學(xué)沒(méi)完的博客-CSDN博客_se注意力機(jī)制
The SE module consists of two parts, Squeeze and Excitation:
Squeeze is global average pooling, which compresses each feature map into a single channel descriptor;
Excitation uses a two-layer fully connected structure to produce a weight for each channel of the feature map, and the reweighted feature map is fed into the next layer of the network.
The ECA paper points out that the dimensionality-reduction step in the SE block has a side effect on channel attention.
import torch
import torch.nn as nn

class SELayer(nn.Module):
    def __init__(self, channel, reduction=16):
        super(SELayer, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)   # Squeeze: global average pooling
        self.fc = nn.Sequential(                  # Excitation: two-layer FC bottleneck
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid()
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)           # (b, c, h, w) -> (b, c)
        y = self.fc(y).view(b, c, 1, 1)           # per-channel weights in [0, 1]
        return x * y.expand_as(x)                 # rescale each channel
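A quick shape check for the SELayer above (a minimal sketch; the input size is an arbitrary example): since SE only reweights channels, the output shape matches the input.

if __name__ == '__main__':
    x = torch.randn(1, 64, 32, 32)  # b, c, h, w (arbitrary example size)
    se = SELayer(channel=64, reduction=16)
    y = se(x)
    print(y.shape)  # torch.Size([1, 64, 32, 32]) -- same shape, channels rescaled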
2. CA module (Coordinate Attention)
Paper: https://arxiv.org/abs/2103.02907
Reference: CA(Coordinate attention) 注意力機(jī)制 - 知乎 (zhihu.com)
Reference: CVPR 2021 | 即插即用! CA:新注意力機(jī)制,助力分類(lèi)/檢測(cè)/分割漲點(diǎn)!_Amusi(CVer)的博客-CSDN博客
CA performs global average pooling separately along the width and the height direction to obtain one feature map per direction, concatenates the two maps, and feeds them through a shared 1×1 convolution that reduces the channel dimension to C/r, followed by batch normalization and a non-linear activation. The result is then split back into the two directions; each part passes through its own 1×1 convolution and a sigmoid, and the two attention maps are multiplied onto the input feature map.
import torch
from torch import nn

class CA_Block(nn.Module):
    def __init__(self, channel, h, w, reduction=16):
        super(CA_Block, self).__init__()
        self.h = h
        self.w = w
        # Pool along one direction at a time: (b, c, h, 1) and (b, c, 1, w)
        self.avg_pool_x = nn.AdaptiveAvgPool2d((h, 1))
        self.avg_pool_y = nn.AdaptiveAvgPool2d((1, w))
        # Shared 1x1 convolution that reduces the channel dimension to C/r
        self.conv_1x1 = nn.Conv2d(in_channels=channel, out_channels=channel//reduction, kernel_size=1, stride=1, bias=False)
        self.relu = nn.ReLU()
        self.bn = nn.BatchNorm2d(channel//reduction)
        # Direction-specific 1x1 convolutions that restore the channel dimension
        self.F_h = nn.Conv2d(in_channels=channel//reduction, out_channels=channel, kernel_size=1, stride=1, bias=False)
        self.F_w = nn.Conv2d(in_channels=channel//reduction, out_channels=channel, kernel_size=1, stride=1, bias=False)
        self.sigmoid_h = nn.Sigmoid()
        self.sigmoid_w = nn.Sigmoid()

    def forward(self, x):
        x_h = self.avg_pool_x(x).permute(0, 1, 3, 2)   # (b, c, h, 1) -> (b, c, 1, h)
        x_w = self.avg_pool_y(x)                       # (b, c, 1, w)
        # Concatenate along the spatial axis, then shared conv + BN + ReLU
        x_cat_conv_relu = self.relu(self.bn(self.conv_1x1(torch.cat((x_h, x_w), 3))))
        # Split back into the height part and the width part
        x_cat_conv_split_h, x_cat_conv_split_w = x_cat_conv_relu.split([self.h, self.w], 3)
        s_h = self.sigmoid_h(self.F_h(x_cat_conv_split_h.permute(0, 1, 3, 2)))  # (b, c, h, 1)
        s_w = self.sigmoid_w(self.F_w(x_cat_conv_split_w))                      # (b, c, 1, w)
        out = x * s_h.expand_as(x) * s_w.expand_as(x)
        return out

if __name__ == '__main__':
    x = torch.randn(1, 16, 128, 64)  # b, c, h, w
    ca_model = CA_Block(channel=16, h=128, w=64)  # h and w must match the input size
    y = ca_model(x)
    print(y.shape)
3. ECA module
Paper: ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks (researchgate.net)
Reference: 注意力機(jī)制(SE、Coordinate Attention、CBAM、ECA,SimAM)、即插即用的模塊整理_吳大炮的博客-CSDN博客_se注意力機(jī)制
ECA first performs global average pooling and then applies a 1D convolution across the pooled channel descriptor to model cross-channel interaction, without the dimensionality reduction used in SE.
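The original notes give no code for ECA. Below is a minimal sketch based on the description above and the paper (global average pooling followed by a 1D convolution across channels, with the kernel size chosen adaptively from the channel count); the class name ECALayer and the default gamma/b values are assumptions here, not the authors' official implementation.

import math
import torch
import torch.nn as nn

class ECALayer(nn.Module):
    def __init__(self, channel, gamma=2, b=1):
        super(ECALayer, self).__init__()
        # Adaptive kernel size: grows with log2(channel), forced to be odd
        t = int(abs((math.log2(channel) + b) / gamma))
        k = t if t % 2 else t + 1
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        y = self.avg_pool(x)                            # (b, c, 1, 1)
        y = self.conv(y.squeeze(-1).transpose(-1, -2))  # 1D conv across the channel dimension
        y = y.transpose(-1, -2).unsqueeze(-1)           # back to (b, c, 1, 1)
        return x * self.sigmoid(y).expand_as(x)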
4. CBAM module
Paper: [1807.06521] CBAM: Convolutional Block Attention Module (arxiv.org)
Reference: 注意力機(jī)制之《CBAM: Convolutional Block Attention Module》論文閱讀_落櫻彌城的博客-CSDN博客
CBAM consists of a channel attention module and a spatial attention module. The channel attention is the same as the SE block except that a parallel max-pooling branch is added; the two pooled descriptors share one MLP, and the two outputs are summed and passed through a sigmoid.
The spatial attention compresses the input feature map along the channel dimension with average pooling and max pooling, concatenates the two maps, and then applies a convolution followed by a sigmoid.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, in_planes, ratio=16):
        super(ChannelAttention, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        # Shared MLP, implemented with 1x1 convolutions
        self.fc1 = nn.Conv2d(in_planes, in_planes // ratio, 1, bias=False)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Conv2d(in_planes // ratio, in_planes, 1, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = self.fc2(self.relu1(self.fc1(self.avg_pool(x))))
        max_out = self.fc2(self.relu1(self.fc1(self.max_pool(x))))
        out = avg_out + max_out
        return self.sigmoid(out)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super(SpatialAttention, self).__init__()
        assert kernel_size in (3, 7), 'kernel size must be 3 or 7'
        padding = 3 if kernel_size == 7 else 1
        self.conv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = torch.mean(x, dim=1, keepdim=True)    # channel-wise average
        max_out, _ = torch.max(x, dim=1, keepdim=True)  # channel-wise max
        x = torch.cat([avg_out, max_out], dim=1)        # (b, 2, h, w)
        x = self.conv1(x)
        return self.sigmoid(x)
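In the paper the two sub-modules are applied sequentially, channel attention first and then spatial attention. A minimal wrapper along those lines (the class name CBAM and this exact composition follow common usage, not code given above) could look like:

class CBAM(nn.Module):
    def __init__(self, in_planes, ratio=16, kernel_size=7):
        super(CBAM, self).__init__()
        self.ca = ChannelAttention(in_planes, ratio)
        self.sa = SpatialAttention(kernel_size)

    def forward(self, x):
        x = x * self.ca(x)  # reweight channels first
        x = x * self.sa(x)  # then reweight spatial positions
        return x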
5. GAM module
Paper: https://paperswithcode.com/paper/global-attention-mechanism-retain-information
GAM consists of two sub-modules, channel attention (CAM) and spatial attention (SAM). The channel attention learns a weight for each channel and uses it to reweight the channels; the spatial attention focuses on where the target lies in the image and selectively emphasizes the features at each spatial position through spatial weighting.
The channel attention sub-module first permutes the three dimensions of the input to retain cross-dimensional information, then uses an MLP to amplify the cross-dimension channel-spatial dependencies. The spatial attention sub-module uses two convolutional layers for spatial information fusion, which lets the attention also focus on spatial information.
import torch
import torch.nn as nn

class GAM_Attention(nn.Module):
    def __init__(self, in_channels, out_channels, rate=4):
        super(GAM_Attention, self).__init__()
        # Channel attention: MLP applied to the permuted (b, h*w, c) tensor
        self.channel_attention = nn.Sequential(
            nn.Linear(in_channels, int(in_channels / rate)),
            nn.ReLU(inplace=True),
            nn.Linear(int(in_channels / rate), in_channels)
        )
        # Spatial attention: two 7x7 convolutions with a channel bottleneck
        self.spatial_attention = nn.Sequential(
            nn.Conv2d(in_channels, int(in_channels / rate), kernel_size=7, padding=3),
            nn.BatchNorm2d(int(in_channels / rate)),
            nn.ReLU(inplace=True),
            nn.Conv2d(int(in_channels / rate), out_channels, kernel_size=7, padding=3),
            nn.BatchNorm2d(out_channels)
        )

    def forward(self, x):
        b, c, h, w = x.shape
        x_permute = x.permute(0, 2, 3, 1).view(b, -1, c)                  # (b, h*w, c)
        x_att_permute = self.channel_attention(x_permute).view(b, h, w, c)
        x_channel_att = x_att_permute.permute(0, 3, 1, 2)                 # back to (b, c, h, w)
        x = x * x_channel_att
        x_spatial_att = self.spatial_attention(x).sigmoid()
        out = x * x_spatial_att
        return out

if __name__ == '__main__':
    x = torch.randn(1, 64, 32, 48)  # b, c, h, w
    b, c, h, w = x.shape
    net = GAM_Attention(in_channels=c, out_channels=c)
    y = net(x)
    print(y.shape)
6. NAM module
Paper: https://arxiv.org/abs/2111.12419
Reference: https://cloud.tencent.com/developer/article/1909196
NAM keeps CBAM's arrangement of sub-modules but redesigns the channel and spatial attention. The channel attention module uses the scaling factors from batch normalization as a measure of channel importance, and the same idea is applied to the spatial dimension to measure the importance of pixels.
import torch
import torch.nn as nn

# Channel attention: the overall flow follows Fig. 1 of the NAM paper
class Channel_Att(nn.Module):
    def __init__(self, channels, t=16):
        super(Channel_Att, self).__init__()
        self.channels = channels
        self.bn2 = nn.BatchNorm2d(self.channels, affine=True)

    def forward(self, x):
        residual = x
        x = self.bn2(x)
        # Eq. (2) in the paper: channel weights Mc from the normalized BN scaling factors
        weight_bn = self.bn2.weight.data.abs() / torch.sum(self.bn2.weight.data.abs())
        x = x.permute(0, 2, 3, 1).contiguous()
        x = torch.mul(weight_bn, x)
        x = x.permute(0, 3, 1, 2).contiguous()
        x = torch.sigmoid(x) * residual
        return x

class Att(nn.Module):
    def __init__(self, channels, shape, out_channels=None, no_spatial=True):
        super(Att, self).__init__()
        self.Channel_Att = Channel_Att(channels)

    def forward(self, x):
        x_out1 = self.Channel_Att(x)
        return x_out1
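A quick shape check for the Att wrapper above (a minimal sketch; the input size is arbitrary, and shape is passed as None because this channel-only version never uses it):

if __name__ == '__main__':
    x = torch.randn(1, 64, 32, 32)       # b, c, h, w (arbitrary example size)
    nam = Att(channels=64, shape=None)   # shape is unused by the channel-only version
    y = nam(x)
    print(y.shape)  # torch.Size([1, 64, 32, 32])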