IoU loss function
IoU loss is the most common loss function in object detection. It is built on the intersection over union (IoU) of the ground-truth box and the predicted box, defined as follows:
$$IoU = \frac{|A \cap B|}{|A \cup B|}$$
$$Loss_{IoU} = 1 - IoU$$
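IoU loss has two main drawbacks:
- When the predicted box and the ground-truth box do not intersect, IoU is always 0 and the loss is always 1. In the original figure, however, predicted box 1 is clearly closer to the ground-truth box, so its loss should be smaller.
- When two predicted boxes have the same IoU with the ground-truth box but sit at different positions, the loss is identical, so it cannot tell which prediction is more accurate.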
IoU code implementation
import numpy as np

def IoU(bbox, prebox):
    # bbox, prebox = [x_center, y_center, width, height]
    # Top-left and bottom-right corners of both boxes
    # (int() truncates to whole pixels, so pixel coordinates are assumed)
    xmin1, ymin1 = int(bbox[0] - bbox[2] / 2.0), int(bbox[1] - bbox[3] / 2.0)
    xmax1, ymax1 = int(bbox[0] + bbox[2] / 2.0), int(bbox[1] + bbox[3] / 2.0)
    xmin2, ymin2 = int(prebox[0] - prebox[2] / 2.0), int(prebox[1] - prebox[3] / 2.0)
    xmax2, ymax2 = int(prebox[0] + prebox[2] / 2.0), int(prebox[1] + prebox[3] / 2.0)
    # Top-left and bottom-right corners of the intersection rectangle
    xx1 = np.max([xmin1, xmin2])
    yy1 = np.max([ymin1, ymin2])
    xx2 = np.min([xmax1, xmax2])
    yy2 = np.min([ymax1, ymax2])
    # Areas of the two boxes
    bbox_area = (xmax1 - xmin1) * (ymax1 - ymin1)
    prebox_area = (xmax2 - xmin2) * (ymax2 - ymin2)
    inter_area = (np.max([0, xx2 - xx1])) * (np.max([0, yy2 - yy1]))  # intersection area
    iou = inter_area / (bbox_area + prebox_area - inter_area + 1e-6)  # intersection over union
    return iou
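The first drawback can be checked numerically. The sketch below is an illustration rather than the article's code: `iou_xywh` is a hypothetical float-valued helper that uses the same `[x, y, width, height]` center format, and the box values are made up for the example. Both disjoint predictions receive a loss of exactly 1, although one is much closer to the ground truth.

```python
def iou_xywh(box_a, box_b):
    """IoU of two boxes given as [x_center, y_center, width, height]."""
    ax1, ay1 = box_a[0] - box_a[2] / 2.0, box_a[1] - box_a[3] / 2.0
    ax2, ay2 = box_a[0] + box_a[2] / 2.0, box_a[1] + box_a[3] / 2.0
    bx1, by1 = box_b[0] - box_b[2] / 2.0, box_b[1] - box_b[3] / 2.0
    bx2, by2 = box_b[0] + box_b[2] / 2.0, box_b[1] + box_b[3] / 2.0
    # Width and height of the overlap, clipped at zero for disjoint boxes
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


gt = [50, 50, 20, 20]     # ground-truth box (illustrative values)
near = [75, 50, 20, 20]   # disjoint, 5 px gap to the ground truth
far = [200, 50, 20, 20]   # disjoint, far away

loss_near = 1 - iou_xywh(gt, near)
loss_far = 1 - iou_xywh(gt, far)
print(loss_near, loss_far)  # both losses are 1.0
```

Because the loss is flat for every disjoint pair, it gives no indication of which direction the prediction should move.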
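GIoU loss function
To address the first IoU problem (IoU is always 0 and the loss always 1 when the predicted box and the ground-truth box do not intersect), GIoU introduces the smallest enclosing region: the smallest rectangle that contains both the predicted box and the ground-truth box.
GIoU is computed as: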
$$GIoU = IoU - \frac{|A_c - U|}{|A_c|}$$
$$Loss_{GIoU} = 1 - GIoU$$
where $A_c$ is the smallest enclosing region and $U$ is the union of the predicted box and the ground-truth box.
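Properties of GIoU:
Like IoU, GIoU can serve as a distance measure through $1 - GIoU$. IoU takes values in [0, 1], while GIoU takes values in (-1, 1]. It reaches its maximum of 1 when the two boxes coincide and approaches its minimum of -1 when they are disjoint and infinitely far apart, which makes GIoU a good distance indicator.
Unlike IoU, which only looks at the overlapping region, GIoU also takes the non-overlapping region into account, so it reflects the degree of overlap between the two boxes better.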
GIoU code implementation
import numpy as np

def GIoU(bbox, prebox):
    # bbox, prebox = [x_center, y_center, width, height]
    # Top-left and bottom-right corners of both boxes
    # (int() truncates to whole pixels, so pixel coordinates are assumed)
    xmin1, ymin1 = int(bbox[0] - bbox[2] / 2.0), int(bbox[1] - bbox[3] / 2.0)
    xmax1, ymax1 = int(bbox[0] + bbox[2] / 2.0), int(bbox[1] + bbox[3] / 2.0)
    xmin2, ymin2 = int(prebox[0] - prebox[2] / 2.0), int(prebox[1] - prebox[3] / 2.0)
    xmax2, ymax2 = int(prebox[0] + prebox[2] / 2.0), int(prebox[1] + prebox[3] / 2.0)
    # Top-left and bottom-right corners of the intersection rectangle
    xx1 = np.max([xmin1, xmin2])
    yy1 = np.max([ymin1, ymin2])
    xx2 = np.min([xmax1, xmax2])
    yy2 = np.min([ymax1, ymax2])
    # Areas of the two boxes
    bbox_area = (xmax1 - xmin1) * (ymax1 - ymin1)
    prebox_area = (xmax2 - xmin2) * (ymax2 - ymin2)
    inter_area = (np.max([0, xx2 - xx1])) * (np.max([0, yy2 - yy1]))  # intersection area
    iou = inter_area / (bbox_area + prebox_area - inter_area + 1e-6)  # intersection over union
    # Area of the smallest enclosing box A_c
    enclose_w = max(xmin1, xmax1, xmin2, xmax2) - min(xmin1, xmax1, xmin2, xmax2)
    enclose_h = max(ymin1, ymax1, ymin2, ymax2) - min(ymin1, ymax1, ymin2, ymax2)
    area_C = enclose_w * enclose_h
    # Area of the union U
    area_U = bbox_area + prebox_area - inter_area
    giou = iou - (area_C - area_U) / (area_C + 1e-6)  # epsilon guards against a zero-area enclosing box
    return giou
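To see how GIoU removes the flat region of IoU loss, the following sketch compares a near and a far disjoint prediction. It is a self-contained illustration under assumptions: `giou_xywh` is a hypothetical float-valued helper (no integer truncation), and the boxes are made-up values in `[x_center, y_center, width, height]` format.

```python
def giou_xywh(box_a, box_b):
    """GIoU of two boxes given as [x_center, y_center, width, height]."""
    ax1, ay1 = box_a[0] - box_a[2] / 2.0, box_a[1] - box_a[3] / 2.0
    ax2, ay2 = box_a[0] + box_a[2] / 2.0, box_a[1] + box_a[3] / 2.0
    bx1, by1 = box_b[0] - box_b[2] / 2.0, box_b[1] - box_b[3] / 2.0
    bx2, by2 = box_b[0] + box_b[2] / 2.0, box_b[1] + box_b[3] / 2.0
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    # Smallest enclosing box A_c
    area_c = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (area_c - union) / area_c


gt = [50, 50, 20, 20]     # ground-truth box (illustrative values)
near = [75, 50, 20, 20]   # disjoint, 5 px gap to the ground truth
far = [200, 50, 20, 20]   # disjoint, far away

giou_near = giou_xywh(gt, near)
giou_far = giou_xywh(gt, far)
print(giou_near, giou_far)
```

Both values are negative because the boxes do not overlap, but the nearer prediction scores higher, so `1 - GIoU` is smaller for it and the loss still carries a useful signal.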
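DIoU loss function
GIoU has problems of its own, as shown in the figure below.
In these two cases IoU and GIoU are identical, yet intuitively the first case is better and should have a smaller loss. DIoU was proposed to solve this problem.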
$$DIoU = IoU - \frac{\rho^2(b, b^{gt})}{c^2}$$
$$Loss_{DIoU} = 1 - DIoU$$
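where $b$ and $b^{gt}$ are the center points of the predicted box and the ground-truth box, $\rho$ is the Euclidean distance between the two center points, and $c$ is the diagonal length of the smallest enclosing region that contains both the predicted box and the ground-truth box.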
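Advantages of DIoU:
- Like GIoU loss, DIoU loss still provides a moving direction for the bounding box when it does not overlap the target box.
- DIoU loss directly minimizes the distance between the two boxes, so it converges much faster than GIoU loss.
- When one box encloses the other and the two are aligned horizontally or vertically, DIoU loss still regresses very quickly, whereas GIoU loss almost degenerates to IoU loss.
- DIoU can also replace plain IoU as the criterion in NMS, which makes the NMS results more reasonable and effective.
Drawbacks of DIoU:
- When the centers of the ground-truth box and the predicted box coincide and the IoU is the same but the aspect ratios differ, as in the figure below, DIoU, GIoU and IoU all give the same value.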
DIoU code implementation
import torch

def Diou(bboxes1, bboxes2):
    # Boxes are [xmin, ymin, xmax, ymax]. The arithmetic below is element-wise,
    # so row i of bboxes1 is compared with row i of bboxes2.
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    dious = torch.zeros((rows, cols))
    if rows * cols == 0:  # one of the inputs is empty
        return dious
    exchange = False
    if bboxes1.shape[0] > bboxes2.shape[0]:
        bboxes1, bboxes2 = bboxes2, bboxes1
        dious = torch.zeros((cols, rows))
        exchange = True
    # Widths, heights and areas
    w1 = bboxes1[:, 2] - bboxes1[:, 0]
    h1 = bboxes1[:, 3] - bboxes1[:, 1]
    w2 = bboxes2[:, 2] - bboxes2[:, 0]
    h2 = bboxes2[:, 3] - bboxes2[:, 1]
    area1 = w1 * h1
    area2 = w2 * h2
    # Box centers
    center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
    center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
    center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
    center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2
    # Corners of the intersection and of the smallest enclosing box
    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])
    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]
    # Squared distance between the two centers (rho^2)
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    outer = torch.clamp((out_max_xy - out_min_xy), min=0)
    # Squared diagonal of the enclosing box (c^2)
    outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
    union = area1 + area2 - inter_area
    dious = inter_area / union - inter_diag / outer_diag
    dious = torch.clamp(dious, min=-1.0, max=1.0)
    if exchange:
        dious = dious.T
    return dious
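A small numerical check makes the motivation for DIoU concrete. The sketch below uses plain Python instead of torch so that it runs without extra dependencies, and it assumes one configuration in which IoU and GIoU cannot separate two predictions: a small box nested inside the ground truth, once at the center and once in a corner. `box_metrics` and the box values are hypothetical and exist only for this illustration.

```python
def box_metrics(box, box_gt):
    """Return (IoU, GIoU, DIoU) for two boxes given as (xmin, ymin, xmax, ymax)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = box_gt
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = (x2 - x1) * (y2 - y1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union
    # Width and height of the smallest enclosing box
    c_w = max(x2, gx2) - min(x1, gx1)
    c_h = max(y2, gy2) - min(y1, gy1)
    giou = iou - (c_w * c_h - union) / (c_w * c_h)
    # rho^2: squared center distance, c^2: squared enclosing diagonal
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    c2 = c_w ** 2 + c_h ** 2
    diou = iou - rho2 / c2
    return iou, giou, diou


gt = (0, 0, 10, 10)
centered = (4, 4, 6, 6)   # nested, same center as the ground truth
cornered = (0, 0, 2, 2)   # nested, pushed into a corner

iou_c, giou_c, diou_c = box_metrics(centered, gt)
iou_k, giou_k, diou_k = box_metrics(cornered, gt)
print(iou_c, giou_c, diou_c)
print(iou_k, giou_k, diou_k)
```

IoU and GIoU agree for the two predictions, while DIoU is lower for the cornered box because of its center-distance penalty.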
CIoU loss function
CIoU builds on DIoU by also taking the aspect ratio into account. Its penalty term is:
$$R_{CIoU} = \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$
where $\alpha$ is a weighting function and $v$ measures the similarity of the aspect ratios, defined as
$$v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2$$
$$\alpha = \frac{v}{(1 - IoU) + v}$$
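The closer the aspect ratios of the ground-truth box and the predicted box, the smaller $v$ is. For a fixed $v$, a larger IoU gives a larger $\alpha$ and therefore a larger aspect-ratio penalty $\alpha v$. In other words, the loss pays more attention to the aspect ratio when IoU is high and more attention to IoU itself when IoU is low.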
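The behaviour of $v$ and $\alpha$ can be checked with a few numbers. This is a minimal sketch with hypothetical helper names (`aspect_ratio_term`, `trade_off_weight`) and made-up box sizes; the `eps` guard is an addition for numerical safety and is not part of the formula.

```python
import math


def aspect_ratio_term(w, h, w_gt, h_gt):
    """v: normalized squared difference between arctan(w_gt/h_gt) and arctan(w/h)."""
    return (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2


def trade_off_weight(v, iou, eps=1e-7):
    """alpha = v / ((1 - IoU) + v); eps is an added guard against 0/0."""
    return v / ((1 - iou) + v + eps)


v_same = aspect_ratio_term(4, 2, 8, 4)   # both boxes are 2:1, so v is 0
v_diff = aspect_ratio_term(2, 4, 8, 4)   # 1:2 against 2:1
alpha_low = trade_off_weight(v_diff, iou=0.2)
alpha_high = trade_off_weight(v_diff, iou=0.9)
print(v_same, v_diff, alpha_low, alpha_high)
```

With the same $v$, the weight at IoU 0.9 is larger than at IoU 0.2, which matches the statement that the aspect ratio matters more once the boxes already overlap well.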
The complete CIoU loss function is defined as:
$$Loss_{CIoU} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$
CIoU code implementation
import math
import torch

def bbox_overlaps_ciou(bboxes1, bboxes2):
    # Boxes are [xmin, ymin, xmax, ymax]. The arithmetic below is element-wise,
    # so row i of bboxes1 is compared with row i of bboxes2.
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    cious = torch.zeros((rows, cols))
    if rows * cols == 0:  # one of the inputs is empty
        return cious
    exchange = False
    if bboxes1.shape[0] > bboxes2.shape[0]:
        bboxes1, bboxes2 = bboxes2, bboxes1
        cious = torch.zeros((cols, rows))
        exchange = True
    # Widths, heights and areas
    w1 = bboxes1[:, 2] - bboxes1[:, 0]
    h1 = bboxes1[:, 3] - bboxes1[:, 1]
    w2 = bboxes2[:, 2] - bboxes2[:, 0]
    h2 = bboxes2[:, 3] - bboxes2[:, 1]
    area1 = w1 * h1
    area2 = w2 * h2
    # Box centers
    center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
    center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
    center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
    center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2
    # Corners of the intersection and of the smallest enclosing box
    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])
    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]
    # Squared center distance (rho^2) and squared enclosing diagonal (c^2)
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    outer = torch.clamp((out_max_xy - out_min_xy), min=0)
    outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
    union = area1 + area2 - inter_area
    u = inter_diag / outer_diag
    iou = inter_area / union
    # Aspect-ratio term v, as defined in the formula for v
    v = (4 / (math.pi ** 2)) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2)
    with torch.no_grad():
        # alpha is treated as a constant weight, so no gradient flows through it
        S = 1 - iou
        alpha = v / (S + v + 1e-7)  # epsilon avoids 0/0 when iou == 1 and v == 0
    # Penalty uses alpha * v so that 1 - cious matches Loss_CIoU
    cious = iou - (u + alpha * v)
    cious = torch.clamp(cious, min=-1.0, max=1.0)
    if exchange:
        cious = cious.T
    return cious
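SIoU loss function
None of the methods proposed and used so far considers the direction of the mismatch between the desired ground-truth box and the predicted box. This shortcoming leads to slower and less efficient convergence, because the predicted box can "wander around" during training and end up producing a worse model.
The SIoU paper proposes a new loss function, SIoU, which redefines the penalty metrics by taking into account the angle of the vector between the desired regression. Applied to conventional neural networks and datasets, SIoU is shown to improve both training speed and inference accuracy.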
The SIoU loss function consists of four cost functions:
- Angle cost
- Distance cost
- Shape cost
- IoU cost
Angle cost
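Let $\alpha$ be the angle between the line joining the two box centers and the horizontal axis, and let $\beta = \frac{\pi}{2} - \alpha$. If $\alpha < \frac{\pi}{4}$, the convergence process first minimizes $\alpha$; otherwise it minimizes $\beta$.
To achieve this, the following definition is introduced: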
$$\Lambda = 1 - 2\sin^2\left(\arcsin(x) - \frac{\pi}{4}\right)$$
where
$$x = \frac{c_h}{\sigma} = \sin(\alpha)$$
$$\sigma = \sqrt{(b_{c_x}^{gt} - b_{c_x})^2 + (b_{c_y}^{gt} - b_{c_y})^2}$$
$$c_h = \max(b_{c_y}^{gt}, b_{c_y}) - \min(b_{c_y}^{gt}, b_{c_y})$$
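Since $1 - 2\sin^2(t) = \cos(2t)$, the definition of $\Lambda$ simplifies to $\cos(2\alpha - \frac{\pi}{2}) = \sin(2\alpha)$, so the Angle cost is 0 when the centers are aligned horizontally or vertically and peaks at 1 when $\alpha = \frac{\pi}{4}$. The sketch below evaluates the definition directly for a few center pairs. `angle_cost` is a hypothetical helper, and the `eps` guard and the clipping of $x$ to 1 are additions for numerical safety.

```python
import math


def angle_cost(cx, cy, cx_gt, cy_gt, eps=1e-7):
    """Lambda = 1 - 2 * sin^2(arcsin(x) - pi/4), with x = c_h / sigma = sin(alpha)."""
    sigma = math.hypot(cx_gt - cx, cy_gt - cy)   # distance between the centers
    c_h = abs(cy_gt - cy)                        # max(...) - min(...) of the center y values
    x = min(1.0, c_h / (sigma + eps))            # clipped so that asin stays in its domain
    return 1 - 2 * math.sin(math.asin(x) - math.pi / 4) ** 2


horizontal = angle_cost(0, 0, 1, 0)   # alpha = 0
diagonal = angle_cost(0, 0, 1, 1)     # alpha = pi / 4
oblique = angle_cost(0, 0, 2, 1)      # sin(alpha) = 1/sqrt(5), so sin(2 * alpha) = 0.8
print(horizontal, diagonal, oblique)
```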
Distance cost
$$\Delta = \sum_{t=x,y}\left(1 - e^{-\gamma \rho_t}\right)$$
where
$$\rho_x = \left(\frac{b_{c_x}^{gt} - b_{c_x}}{c_w}\right)^2, \quad \rho_y = \left(\frac{b_{c_y}^{gt} - b_{c_y}}{c_h}\right)^2, \quad \gamma = 2 - \Lambda$$
Here $c_w$ and $c_h$ are the width and height of the smallest box enclosing the predicted box and the ground-truth box, which is a different quantity from the $c_h$ used in the Angle cost.
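As $\alpha \to 0$, the contribution of the Distance cost is greatly reduced. Conversely, the closer $\alpha$ is to $\frac{\pi}{4}$, the larger the contribution of the Distance cost. The problem becomes harder as the angle increases, so $\gamma$ gives the distance term increasing priority as the angle grows.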
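The Distance cost can be evaluated with a few lines of Python. The sketch below is an illustration under assumptions: `distance_cost` is a hypothetical helper, the center offsets are normalized by the width and height of the smallest enclosing box and squared so that each $\rho_t$ is non-negative, and the Angle cost $\Lambda$ is passed in as a plain number.

```python
import math


def distance_cost(cx, cy, cx_gt, cy_gt, c_w, c_h, angle):
    """Delta = sum over t in {x, y} of (1 - exp(-gamma * rho_t)), gamma = 2 - Lambda."""
    gamma = 2 - angle
    # Squared center offsets, normalized by the enclosing box size (assumption)
    rho_x = ((cx_gt - cx) / c_w) ** 2
    rho_y = ((cy_gt - cy) / c_h) ** 2
    return (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))


aligned = distance_cost(0, 0, 0, 0, 10, 10, angle=0.0)   # identical centers
shifted = distance_cost(0, 0, 5, 0, 10, 10, angle=0.0)   # horizontal offset only
print(aligned, shifted)
```

Each summand lies in [0, 1), so $\Delta$ is bounded by 2 regardless of how far apart the centers are.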
Shape cost
$$\Omega = \sum_{t=w,h}\left(1 - e^{-\omega_t}\right)^\theta$$
where
$$\omega_w = \frac{|w - w^{gt}|}{\max(w, w^{gt})}, \quad \omega_h = \frac{|h - h^{gt}|}{\max(h, h^{gt})}$$
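The value of $\theta$ defines the Shape cost and is unique to each dataset. It is a very important term in this equation because it controls how much attention is paid to the Shape cost. If $\theta$ is set to 1, the shape is optimized immediately, which harms the free movement of the shape. To find $\theta$, the authors run a genetic algorithm for each dataset; experimentally the value is close to 4, and the range the authors define for this parameter is 2 to 6.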
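The effect of $\theta$ is easy to see numerically. In the sketch below, `shape_cost` is a hypothetical helper and the box sizes are made up; the default `theta=4` follows the value the text reports as typical.

```python
import math


def shape_cost(w, h, w_gt, h_gt, theta=4):
    """Omega = sum over t in {w, h} of (1 - exp(-omega_t)) ** theta."""
    omega_w = abs(w - w_gt) / max(w, w_gt)
    omega_h = abs(h - h_gt) / max(h, h_gt)
    return (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta


same_shape = shape_cost(10, 10, 10, 10)             # identical size, no penalty
half_width_t1 = shape_cost(5, 10, 10, 10, theta=1)  # width is half the target
half_width_t4 = shape_cost(5, 10, 10, 10, theta=4)
print(same_shape, half_width_t1, half_width_t4)
```

For the same width mismatch, `theta=1` produces a much larger penalty than `theta=4`, which illustrates why a small $\theta$ pushes the optimizer to fix the shape immediately.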
IoU cost
$$L_{IoUCost} = 1 - IoU$$
The final regression loss is:
$$L_{box} = 1 - IoU + \frac{\Delta + \Omega}{2}$$
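Putting the four costs together gives a compact reference computation for a single pair of boxes. This is a plain-Python sketch under stated assumptions, not the paper's implementation: `siou_loss` is a hypothetical name, boxes are `(xmin, ymin, xmax, ymax)` with positive width and height, the distance offsets are squared and normalized by the enclosing box size, `theta` defaults to 4, and `eps` is an added numerical guard.

```python
import math


def siou_loss(box, box_gt, theta=4, eps=1e-7):
    """L_box = 1 - IoU + (Delta + Omega) / 2 for boxes (xmin, ymin, xmax, ymax)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = box_gt
    w, h = x2 - x1, y2 - y1
    w_gt, h_gt = gx2 - gx1, gy2 - gy1

    # IoU cost
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    iou = inter / (w * h + w_gt * h_gt - inter + eps)

    # Angle cost: Lambda = 1 - 2 * sin^2(arcsin(x) - pi/4), x = sin(alpha)
    dx = (gx1 + gx2) / 2 - (x1 + x2) / 2
    dy = (gy1 + gy2) / 2 - (y1 + y2) / 2
    sigma = math.hypot(dx, dy)
    sin_alpha = min(1.0, abs(dy) / (sigma + eps))
    angle = 1 - 2 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2

    # Distance cost, normalized by the smallest enclosing box
    c_w = max(x2, gx2) - min(x1, gx1)
    c_h = max(y2, gy2) - min(y1, gy1)
    gamma = 2 - angle
    delta = (1 - math.exp(-gamma * (dx / c_w) ** 2)) + (1 - math.exp(-gamma * (dy / c_h) ** 2))

    # Shape cost
    omega_w = abs(w - w_gt) / max(w, w_gt)
    omega_h = abs(h - h_gt) / max(h, h_gt)
    omega = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1 - iou + (delta + omega) / 2


gt = (0, 0, 10, 10)
loss_perfect = siou_loss(gt, gt)               # identical boxes
loss_shifted = siou_loss((5, 0, 15, 10), gt)   # same shape, shifted by half a width
print(loss_perfect, loss_shifted)
```

For the shifted box, the loss is made up of the IoU term (IoU is 1/3) and a distance term, while the shape term vanishes because both boxes have the same size.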
Experimental results
On COCO-val, SIoU reaches 52.7% mAP@0.5:0.95 (7.6 ms including preprocessing, inference and postprocessing) and 70% mAP@0.5, whereas CIoU reaches only 50.3% and 66.4%, respectively.