1. YOLOv7 really is impressive: both detection accuracy and speed are hugely improved. First, the fresh-off-the-press GitHub link:
https://github.com/WongKinYiu/yolov7
2. I won't go over training YOLOv7 here. This post focuses on how to convert the .pt file to ONNX and on the inference that follows:
2.1 First, the pip package versions matter a great deal. I tested this myself: only with the exact versions below does the ONNX export succeed, so that onnxruntime can then load the model properly (a quick sanity check follows the list):
pip install onnx==1.12.0
pip install onnx-simplifier==0.4.0
pip install coloredlogs==15.0.1
pip install humanfriendly==10.0
pip install onnxruntime-gpu==1.12.0
pip install onnxsim-no-ort==0.4.0
pip install opencv-python==4.5.2.52 (note: cv2 must NOT be 4.6.0)
pip install protobuf==3.19.4
pip install setuptools==63.2.0
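Before going further, it is worth verifying that the installed versions actually match. A minimal sketch (run it in the same environment you will use for export and inference):

import cv2
import onnx
import onnxruntime as ort

print(onnx.__version__)   # expect 1.12.0
print(ort.__version__)    # expect 1.12.0
print(cv2.__version__)    # expect 4.5.2, not 4.6.0
print(ort.get_device())   # 'GPU' if the onnxruntime-gpu build found your CUDA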
The torch and torchvision versions I ran with are:
torch 1.12.0+cu113 + torchvision 0.13.0+cu113 + Python 3.8
Note that YOLOv7's requirements.txt explicitly warns against training with this torch version, yet for inference I found it is precisely the version that works. So hard...
Here is a recommended link for downloading torch .whl files: https://download.pytorch.org/whl/torch_stable.html
Just pick the matching versions there (PS: all that really matters is that the torch and torchvision versions pair up; the cu prefix is less strict than it looks. My CUDA is 11.0, yet the torch wheel I pip-installed is cu113 and it runs fine):
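You can confirm the wheel pair and whether the driver accepts it with a few lines (a sketch; the expected values match my setup above):

import torch
import torchvision

print(torch.__version__)          # 1.12.0+cu113
print(torchvision.__version__)    # 0.13.0+cu113
print(torch.version.cuda)         # CUDA version the wheel was built against (11.3)
print(torch.cuda.is_available())  # True as long as the driver is recent enough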
2.2 With the conversion environment in place, the key step is running export.py:
There is an export.py file in the root of the YOLOv7 repo. No modifications to the file are needed; just run the command directly:
python export.py --grid --end2end --simplify --topk-all 100 --iou-thres 0.3 --conf-thres 0.8 --img-size 640 640 --max-wh 640 --weights weights/onnxruntime.pt
Be sure to include all of these flags: judging by their descriptions, they enable the end-to-end export and the graph simplification pass, so the related arguments must all be present. For --weights, point to your own .pt file, and adjust the other arguments to fit your model. Here I demonstrate with the official pretrained model (yolov7-tiny.pt):
Ignore a few warnings; the conversion goes through just that easily.
Open the resulting .onnx in Netron (https://netron.app/) and you can see the output has already been post-processed into a 7-column tensor:
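If you would rather check this from Python than in Netron, a short sketch works too (the .onnx path below is an assumption; use whatever export.py actually produced):

import onnx

model = onnx.load('weights/yolov7-tiny.onnx')
onnx.checker.check_model(model)  # raises if the exported graph is malformed
for out in model.graph.output:
    dims = [d.dim_value or d.dim_param for d in out.type.tensor_type.shape.dim]
    print(out.name, dims)  # with --end2end the single output has 7 columns per detection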
3. Inference and deployment
With the .onnx file ready, we run inference under the onnxruntime toolkit.
3.1 Load the inference model:
def init_engine(self):
    providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if self.device else ['CPUExecutionProvider']
    self.session = ort.InferenceSession(self.weights, providers=providers)
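ONNXRuntime walks the providers list in order and uses the first one that is actually available, so it silently falls back to CPU if the GPU build did not register. Worth a quick check:

import onnxruntime as ort

print(ort.get_available_providers())
# e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider'] when the GPU build works;
# if only 'CPUExecutionProvider' appears, inference will quietly run on the CPU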
3.2 Preprocess the input image (letterbox it with gray padding bars) and reshape it into the 4-D NCHW tensor the model expects:
def letterbox(self, im, color=(114, 114, 114), auto=True, scaleup=True, stride=32):
    # Resize and pad the image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    new_shape = self.img_new_shape
    # if a single int was given, expand it into an (h, w) pair
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)
    # scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)
    # compute padding
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle: only pad up to the nearest stride multiple
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    dw /= 2  # divide padding into 2 sides
    dh /= 2
    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    im = im.transpose((2, 0, 1))  # HWC -> CHW
    im = np.expand_dims(im, 0)  # add batch dimension -> NCHW
    im = np.ascontiguousarray(im)
    im = im.astype(np.float32)
    im /= 255  # scale 0..255 -> 0..1
    return im, r, (dw, dh)
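To make the arithmetic concrete, here is what letterbox does to a hypothetical 1280x720 frame with a 640x640 target (a sketch; engine is an ONNX_engine instance from the full script at the end, and the weights path is my assumption):

import numpy as np

engine = ONNX_engine('weights/yolov7-tiny.onnx', 640, False)
frame = np.zeros((720, 1280, 3), dtype=np.uint8)  # dummy 1280x720 image
im, r, (dw, dh) = engine.letterbox(frame, auto=False)
# r = min(640/720, 640/1280) = 0.5, so the frame is resized to 640x360,
# then 140 px of gray (114, 114, 114) is padded on top and bottom
print(im.shape)   # (1, 3, 640, 640)
print(r, dw, dh)  # 0.5 0.0 140.0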
3.3 Now we can feed it into onnxruntime and get back that 7-column output:
def preprocess(self, image_path):
    self.img = cv2.imread(image_path)
    self.img = cv2.cvtColor(self.img, cv2.COLOR_BGR2RGB)
    image = self.img.copy()
    im, ratio, dwdh = self.letterbox(image, auto=False)
    t1 = time.time()
    outputs = self.predict(im)
    print("inference time", (time.time() - t1) * 1000, ' ms')
    ori_images = [self.img.copy()]
    for i, (batch_id, x0, y0, x1, y1, cls_id, score) in enumerate(outputs):
        image = ori_images[int(batch_id)]
        box = np.array([x0, y0, x1, y1])
        box -= np.array(dwdh * 2)  # dwdh * 2 -> (dw, dh, dw, dh): strip the letterbox padding
        box /= ratio  # undo the resize ratio -> original-image coordinates
        box = box.round().astype(np.int32).tolist()
        cls_id = int(cls_id)
        score = round(float(score), 3)
        name = self.names[cls_id]
        color = self.colors[name]
        name += ' ' + str(score)
        cv2.rectangle(image, box[:2], box[2:], color, 2)
        cv2.putText(image, name, (box[0], box[1] - 2), cv2.FONT_HERSHEY_SIMPLEX, 0.75, [225, 255, 255], thickness=2)
    a = Image.fromarray(ori_images[0])
    return a
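A minimal usage sketch (file names are my assumptions; note that despite its name, preprocess also runs inference and draws the boxes):

engine = ONNX_engine('weights/yolov7-tiny.onnx', 640, True)
result = engine.preprocess('inference/images/horses.jpg')  # returns a PIL Image
result.save('inference/pre_img/horses.jpg')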
3.4 The prediction part is simple, since the exported ONNX graph already handles the post-processing; just feed the input dict to the session and take the result:
def predict(self, im):
    outname = [i.name for i in self.session.get_outputs()]
    inname = [i.name for i in self.session.get_inputs()]
    inp = {inname[0]: im}
    outputs = self.session.run(outname, inp)[0]
    return outputs
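For reference, each row of the returned array is one detection, and with the --end2end export the seven columns unpack as below (a sketch assuming the engine and letterboxed im from section 3.2, and at least one detection):

outputs = engine.predict(im)  # shape (num_detections, 7)
batch_id, x0, y0, x1, y1, cls_id, score = outputs[0]
# x0..y1 are coordinates in the 640x640 letterboxed image; subtract the
# padding and divide by the resize ratio (as preprocess does) to map them back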
3.5 Have a look at the inference time. It is very fast: around 10 ms per frame, a genuine 100 FPS!
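One caveat when timing: the very first call pays the CUDA/session initialization cost, so for an honest FPS figure, warm up once and average over many runs. A sketch (same assumed engine and im as above):

import time

_ = engine.predict(im)  # warm-up run absorbs session/CUDA start-up cost
t0 = time.time()
for _ in range(100):
    engine.predict(im)
print('avg %.1f ms/frame' % ((time.time() - t0) / 100 * 1000))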
3.6 And here is the classic horses demo image, haha:
September 21, 2022, one last addition: the snippets above already are the full source; stitch them together and you get a complete, self-contained script:
import os
import cv2
import time
import requests
import argparse
import random
import numpy as np
import onnxruntime as ort
from PIL import Image
names = ["1", "2", "3", "4", "5", "6", "unknow", 'truck', 'boat', 'traffic light',
         'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
         'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
         'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
         'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
         'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
         'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
         'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
         'hair drier', 'toothbrush']
class ONNX_engine():
    def __init__(self, weights, size, cuda) -> None:
        self.img_new_shape = (size, size)
        self.weights = weights
        self.device = cuda
        self.init_engine()
        self.names = names
        self.colors = {name: [random.randint(0, 255) for _ in range(3)] for i, name in enumerate(self.names)}
    def init_engine(self):
        providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if self.device else ['CPUExecutionProvider']
        self.session = ort.InferenceSession(self.weights, providers=providers)
    def predict(self, im):
        outname = [i.name for i in self.session.get_outputs()]
        inname = [i.name for i in self.session.get_inputs()]
        inp = {inname[0]: im}
        outputs = self.session.run(outname, inp)[0]
        # print(outputs.shape)
        return outputs
    def preprocess(self, image_path):
        print('----------', image_path, '---------------')
        self.img = cv2.imread(image_path)
        self.img = cv2.cvtColor(self.img, cv2.COLOR_BGR2RGB)
        image = self.img.copy()
        im, ratio, dwdh = self.letterbox(image, auto=False)
        t1 = time.time()
        outputs = self.predict(im)
        print("inference time", (time.time() - t1) * 1000, ' ms')
        ori_images = [self.img.copy()]
        for i, (batch_id, x0, y0, x1, y1, cls_id, score) in enumerate(outputs):
            image = ori_images[int(batch_id)]
            box = np.array([x0, y0, x1, y1])
            box -= np.array(dwdh * 2)  # (dw, dh, dw, dh): strip the letterbox padding
            box /= ratio  # undo the resize ratio -> original-image coordinates
            box = box.round().astype(np.int32).tolist()
            cls_id = int(cls_id)
            score = round(float(score), 3)
            name = self.names[cls_id]
            color = self.colors[name]
            name += ' ' + str(score)
            print("pre result is :", box, name)
            cv2.rectangle(image, box[:2], box[2:], color, 2)
            cv2.putText(image, name, (box[0], box[1] - 2), cv2.FONT_HERSHEY_SIMPLEX, 0.75, [225, 255, 255], thickness=1)
        a = Image.fromarray(ori_images[0])
        return a
    def letterbox(self, im, color=(114, 114, 114), auto=True, scaleup=True, stride=32):
        # Resize and pad the image while meeting stride-multiple constraints
        shape = im.shape[:2]  # current shape [height, width]
        new_shape = self.img_new_shape
        # if a single int was given, expand it into an (h, w) pair
        if isinstance(new_shape, int):
            new_shape = (new_shape, new_shape)
        # scale ratio (new / old)
        r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
        if not scaleup:  # only scale down, do not scale up (for better val mAP)
            r = min(r, 1.0)
        # compute padding
        new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
        dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
        if auto:  # minimum rectangle: only pad up to the nearest stride multiple
            dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
        dw /= 2  # divide padding into 2 sides
        dh /= 2
        if shape[::-1] != new_unpad:  # resize
            im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
        top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
        left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
        im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
        im = im.transpose((2, 0, 1))  # HWC -> CHW
        im = np.expand_dims(im, 0)  # add batch dimension -> NCHW
        im = np.ascontiguousarray(im)
        im = im.astype(np.float32)
        im /= 255  # scale 0..255 -> 0..1
        return im, r, (dw, dh)
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', type=str, default='./weights/coil_onnxruntime.onnx', help='path of the onnx weights')
    parser.add_argument('--cuda', type=bool, default=True, help='whether your PC has CUDA')
    parser.add_argument('--imgs_path', type=str, default='inference/images', help='path of the images to run inference on')
    parser.add_argument('--size', type=int, default=640, help='inference image size')
    opt = parser.parse_args()
    onnx_engine = ONNX_engine(opt.weights, opt.size, opt.cuda)
    save_path = './inference'
    for img_path in os.listdir(opt.imgs_path):
        img_path_file = opt.imgs_path + '/' + img_path
        # print('The img path is: ', img_path_file)
        a = onnx_engine.preprocess(img_path_file)
        # a.save(save_path + '/' + 'pre_img' + '/' + img_path)
        print('*' * 50)
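Assuming you save the script as, say, infer_onnx.py (the file name is my own choice), it runs like this:

python infer_onnx.py --weights weights/yolov7-tiny.onnx --imgs_path inference/images --size 640

One quirk worth knowing: argparse's type=bool treats any non-empty string as True, so passing --cuda False still enables CUDA; leave the flag off (or change the default) if you want CPU-only.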
Reference code: https://colab.research.google.com/github/WongKinYiu/yolov7/blob/main/tools/YOLOv7onnx.ipynb