- We use the YOLOv5 / YOLOv7 code framework throughout, combining different modules to build different YOLO object detection models.
- The Coordinate Attention proposed in the paper is simple and can be flexibly plugged into classic mobile networks with almost no computational overhead. Extensive experiments show that Coordinate Attention not only benefits ImageNet classification but, more interestingly, also performs well on downstream tasks such as object detection and semantic segmentation. At the request of column readers, this post applies it to object detection: improving YOLOv7 with the CA (Coordinate attention) mechanism.
- Note: many readers have reported that this column's improvements deliver effective accuracy gains on their own datasets, and gains on COCO as well.
1. Coordinate Attention: Theory from the Paper
Recent studies on mobile network design have demonstrated the remarkable effectiveness of channel attention (e.g., Squeeze-and-Excitation attention) for lifting model performance, but it generally neglects positional information, which is important for generating spatially selective attention maps. In this paper, we propose a novel attention mechanism for mobile networks that embeds positional information into channel attention, which we call "coordinate attention". Unlike channel attention, which transforms a feature tensor into a single feature vector via 2D global pooling, coordinate attention factorizes channel attention into two 1D feature encoding processes that aggregate features along the two spatial directions, respectively. In this way, long-range dependencies can be captured along one spatial direction while precise positional information is preserved along the other. The resulting feature maps are then encoded separately into a pair of direction-aware and position-sensitive attention maps that can be applied complementarily to the input feature map to strengthen the representation of objects of interest. Our coordinate attention is simple and can be flexibly plugged into classic mobile networks, such as MobileNetV2, MobileNeXt, and EfficientNet, with almost no computational overhead. Extensive experiments show that our coordinate attention not only benefits ImageNet classification but, more interestingly, performs even better on downstream tasks such as object detection and semantic segmentation. The code is publicly available.
Coordinate Attention Overview
Coordinate Attention Design
Figure 2: Schematic comparison of the proposed coordinate attention block (c) with the classic SE channel attention block [18] (a) and CBAM [44] (b). Here, "GAP" and "GMP" refer to global average pooling and global max pooling, respectively. "X Avg Pool" and "Y Avg Pool" refer to 1D horizontal global pooling and 1D vertical global pooling, respectively.
Coordinate Attention Block
Experiments in the Paper
In the figure above, the authors also visualize the feature maps produced by models equipped with different attention methods. Clearly, CA attention helps localize the target object better than SE and CBAM.
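To make the decomposition concrete, the sketch below shows, at the shape level, how the two 1D poolings ("X Avg Pool" and "Y Avg Pool" in Figure 2) differ from SE-style 2D global pooling. This is illustrative only; the full CA module is given in Section 2.2.

import torch
import torch.nn.functional as F

x = torch.randn(1, 64, 32, 32)                  # (N, C, H, W) feature map

# "X Avg Pool": average over the width, one value per row -> (N, C, H, 1)
x_h = F.adaptive_avg_pool2d(x, (x.size(2), 1))
# "Y Avg Pool": average over the height, one value per column -> (N, C, 1, W)
x_w = F.adaptive_avg_pool2d(x, (1, x.size(3)))
print(x_h.shape, x_w.shape)                     # [1, 64, 32, 1] and [1, 64, 1, 32]

# SE-style channel attention instead collapses both directions at once,
# discarding all positional information:
x_se = F.adaptive_avg_pool2d(x, 1)              # (N, C, 1, 1)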
2. Integrating CA into YOLOv7
2.1 Network Configuration
1. Add a yolov7_CA.yaml file:
# YOLOv7, GPL-3.0 license
# Parameters
nc: 80 # number of classes
depth_multiple: 1.0 # model depth multiple
width_multiple: 1.0 # layer channel multiple
# anchors
anchors:
- [12,16, 19,36, 40,28] # P3/8
- [36,75, 76,55, 72,146] # P4/16
- [142,110, 192,243, 459,401] # P5/32
# yolov7 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [32, 3, 1]], # 0
[-1, 1, Conv, [64, 3, 2]], # 1-P1/2
[-1, 1, Conv, [64, 3, 1]],
[-1, 1, Conv, [128, 3, 2]], # 3-P2/4
[-1, 1, C3, [128]],
[-1, 1, Conv, [256, 3, 2]],
[-1, 1, MP, []],
[-1, 1, Conv, [128, 1, 1]],
[-3, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [128, 3, 2]],
[[-1, -3], 1, Concat, [1]], # 10
[-1, 1, Conv, [128, 1, 1]],
[-2, 1, Conv, [128, 1, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[-1, 1, Conv, [128, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [512, 1, 1]],
[-1, 1, MP, []],
[-1, 1, Conv, [256, 1, 1]],
[-3, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 2]],
[[-1, -3], 1, Concat, [1]],
[-1, 1, Conv, [256, 1, 1]],
[-2, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[-1, 1, Conv, [256, 3, 1]],
[[-1, -3, -5, -6], 1, Concat, [1]],
[-1, 1, Conv, [1024, 1, 1]],
[-1, 1, MP, []],
[-1, 1, Conv, [512, 1, 1]],
[-3, 1, Conv, [512, 1, 1]],
[-1, 1, Conv, [512, 3, 2]],
[[-1, -3], 1, Concat, [1]],
[-1, 1, C3, [1024]],
[-1, 1, Conv, [256, 3, 1]],
]
# yolov7 head by iscyy
head:
[[-1, 1, SPPCSPC, [512]],
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[31, 1, Conv, [256, 1, 1]],
[[-1, -2], 1, Concat, [1]],
[-1, 1, C3, [128]],
[-1, 1, Conv, [128, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[18, 1, Conv, [128, 1, 1]],
[[-1, -2], 1, Concat, [1]],
[-1, 1, C3, [128]],
[-1, 1, MP, []],
[-1, 1, Conv, [128, 1, 1]],
[-3, 1, CA, [128]],
[-1, 1, Conv, [128, 3, 2]],
[[-1, -3, 44], 1, Concat, [1]],
[-1, 1, C3, [256]],
[-1, 1, MP, []],
[-1, 1, Conv, [256, 1, 1]],
[-3, 1, Conv, [256, 1, 1]],
[-1, 1, Conv, [256, 3, 2]],
[[-1, -3, 39], 1, Concat, [1]],
[-1, 3, C3, [512]],
# detection heads -----------------------------
[49, 1, RepConv, [256, 3, 1]],
[55, 1, RepConv, [512, 3, 1]],
[61, 1, RepConv, [1024, 3, 1]],
[[62,63,64], 1, IDetect, [nc, anchors]], # Detect(P3, P4, P5)
]
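Before moving on, it can help to confirm that the yaml parses and builds. The following is a minimal sketch, assuming the unified YOLOv5-style codebase this column uses, where Model in models/yolo.py accepts a cfg path and the modules above (C3, SPPCSPC, MP, and the CA block from Section 2.2) are all registered; adjust the yaml path to wherever you saved the file.

import torch
from models.yolo import Model  # assumes the repo root is the working directory

model = Model('models/yolov7_CA.yaml')        # parse the yaml and build the graph
model.eval()
with torch.no_grad():
    out = model(torch.zeros(1, 3, 640, 640))  # dummy forward pass
print('yolov7_CA built and ran OK')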
2.2 Core Code
1. Add the following code to models/common.py:
# models/common.py (torch and torch.nn are already imported at the top of this file)

class h_sigmoid(nn.Module):
    def __init__(self, inplace=True):
        super(h_sigmoid, self).__init__()
        self.relu = nn.ReLU6(inplace=inplace)

    def forward(self, x):
        return self.relu(x + 3) / 6


class h_swish(nn.Module):
    def __init__(self, inplace=True):
        super(h_swish, self).__init__()
        self.sigmoid = h_sigmoid(inplace=inplace)

    def forward(self, x):
        return x * self.sigmoid(x)


class CA(nn.Module):
    # Coordinate Attention for Efficient Mobile Network Design
    '''
    Recent studies on mobile network design have demonstrated the remarkable
    effectiveness of channel attention (e.g., the Squeeze-and-Excitation attention)
    for lifting model performance, but they generally neglect the positional
    information, which is important for generating spatially selective attention
    maps. In this paper, we propose a novel attention mechanism for mobile networks
    by embedding positional information into channel attention, which we call
    "coordinate attention". Unlike channel attention that transforms a feature
    tensor to a single feature vector via 2D global pooling, the coordinate
    attention factorizes channel attention into two 1D feature encoding processes
    that aggregate features along the two spatial directions, respectively.
    '''
    def __init__(self, inp, oup, reduction=32):
        super(CA, self).__init__()
        mip = max(8, inp // reduction)
        self.conv1 = nn.Conv2d(inp, mip, kernel_size=1, stride=1, padding=0)
        self.bn1 = nn.BatchNorm2d(mip)
        self.act = h_swish()
        self.conv_h = nn.Conv2d(mip, oup, kernel_size=1, stride=1, padding=0)
        self.conv_w = nn.Conv2d(mip, oup, kernel_size=1, stride=1, padding=0)

    def forward(self, x):
        identity = x
        n, c, h, w = x.size()
        # 1D pooling along each spatial direction
        pool_h = nn.AdaptiveAvgPool2d((h, 1))  # "X Avg Pool": (n, c, h, 1)
        pool_w = nn.AdaptiveAvgPool2d((1, w))  # "Y Avg Pool": (n, c, 1, w)
        x_h = pool_h(x)
        x_w = pool_w(x).permute(0, 1, 3, 2)    # align shapes for concatenation along dim 2
        # shared 1x1 conv + BN + h_swish over the concatenated descriptors
        y = torch.cat([x_h, x_w], dim=2)
        y = self.conv1(y)
        y = self.bn1(y)
        y = self.act(y)
        # split back into the two directional attention maps
        x_h, x_w = torch.split(y, [h, w], dim=2)
        x_w = x_w.permute(0, 1, 3, 2)
        a_h = self.conv_h(x_h).sigmoid()
        a_w = self.conv_w(x_w).sigmoid()
        out = identity * a_w * a_h             # re-weight the input along both directions
        return out
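As a quick standalone sanity check (a sketch that assumes the three classes above are in scope along with torch and nn): CA preserves the spatial size of its input, and with inp == oup the channel count too, so it can be dropped between existing layers without disturbing their shapes.

x = torch.randn(2, 128, 40, 40)
ca = CA(128, 128)               # inp == oup keeps the channel count unchanged
y = ca(x)
print(x.shape, '->', y.shape)   # torch.Size([2, 128, 40, 40]) -> same shape
assert y.shape == x.shape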
2. Then configure it in yolo.py: find the parse_model function in ./models/yolo.py and register the class name there. Inside the loop

for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):

add the following branch at the corresponding position (reference code):
        elif m in [CA]:
            c1, c2 = ch[f], args[0]
            if c2 != no:  # if not an output layer
                c2 = make_divisible(c2 * gw, 8)
            args = [c1, c2, *args[1:]]
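The c2 = make_divisible(c2 * gw, 8) line is what lets the same yaml entry scale with model size: gw is the config's width_multiple, and make_divisible rounds the scaled channel count up to a multiple of 8. A small worked example (the helper below mirrors the one in utils/general.py):

import math

def make_divisible(x, divisor):
    # round up to the nearest multiple of divisor, as in utils/general.py
    return math.ceil(x / divisor) * divisor

# yolov7_CA.yaml uses width_multiple 1.0, so channels pass through unchanged;
# the yolov5_CA.yaml in Section 3 uses 0.50, so CA(1024) is built as CA(512, 512):
for gw, c2 in [(1.0, 128), (0.50, 1024)]:
    print(gw, c2, '->', make_divisible(c2 * gw, 8))  # 128 -> 128, 1024 -> 512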
2.3 Run Training
python train.py --cfg yolov7_CA.yaml
3. Integrating CA into YOLOv5
3.1 Network Configuration
1. Add a yolov5_CA.yaml file:
# YOLOv5, GPL-3.0 license
# Parameters
nc: 80 # number of classes
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.50 # layer channel multiple
anchors:
- [10,13, 16,30, 33,23] # P3/8
- [30,61, 62,45, 59,119] # P4/16
- [116,90, 156,198, 373,326] # P5/32
# YOLOv5 v6.0 backbone
backbone:
# [from, number, module, args]
[[-1, 1, Conv, [64, 6, 2, 2]], # 0-P1/2
[-1, 1, Conv, [128, 3, 2]], # 1-P2/4
[-1, 3, C3, [128]],
[-1, 1, Conv, [256, 3, 2]], # 3-P3/8
[-1, 6, C3, [256]],
[-1, 1, Conv, [512, 3, 2]], # 5-P4/16
[-1, 9, C3, [512]],
[-1, 1, Conv, [1024, 3, 2]], # 7-P5/32
[-1, 3, C3, [1024]],
[-1, 1, SPPF, [1024, 5]], # 9
]
# YOLOv5 v6.0 head
head:
[[-1, 1, Conv, [512, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 6], 1, Concat, [1]], # cat backbone P4
[-1, 3, C3, [512, False]], # 13
[-1, 1, Conv, [256, 1, 1]],
[-1, 1, nn.Upsample, [None, 2, 'nearest']],
[[-1, 4], 1, Concat, [1]], # cat backbone P3
[-1, 3, C3, [256, False]], # 17 (P3/8-small)
[-1, 1, Conv, [256, 3, 2]],
[[-1, 14], 1, Concat, [1]], # cat head P4
[-1, 3, C3, [512, False]], # 20 (P4/16-medium)
[-1, 1, Conv, [512, 3, 2]],
[[-1, 10], 1, Concat, [1]], # cat head P5
[-1, 3, C3, [1024, False]], # 23 (P5/32-large)
[-1, 1, CA, [1024]],
[[17, 20, 24], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5)
]
3.2 Core Code
1. The core code is identical to Section 2.2: add the h_sigmoid, h_swish, and CA classes to models/common.py, then register CA in the parse_model function of ./models/yolo.py with the same elif m in [CA] branch shown there.
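To check the paper's "almost no computational overhead" claim on your own setup, you can compare parameter counts against the baseline after building both models. A sketch, assuming a v6.x YOLOv5 repo with the changes above applied, run from the repo root:

from models.yolo import Model

for cfg in ['models/yolov5s.yaml', 'models/yolov5_CA.yaml']:
    model = Model(cfg)
    n = sum(p.numel() for p in model.parameters())
    print(f'{cfg}: {n / 1e6:.2f}M parameters')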
3.3 Run Training
python train.py --cfg yolov5_CA.yaml
That concludes this entry in the YOLOv7 attention-mechanism improvement series: integrating the plug-and-play CA (Coordinate attention) mechanism from CVPR 2021 (also applicable to YOLOv5) to boost classification and detection performance.