Setting Up YOLOv5 on Jetson Nano for Real-Time Detection at 25 FPS (Detailed Step-by-Step Guide)
I. Versions
JetPack 4.6 (August 2021)
YOLOv5 v6.0
This guide uses YOLOv5's yolov5n.pt weights and accelerates inference with tensorrtx; real-time detection from a camera reaches about 25 FPS.
II. Configuring CUDA
sudo gedit ~/.bashrc
Add the following lines at the end of the file:
export CUDA_HOME=/usr/local/cuda-10.2
export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-10.2/bin:$PATH
Save and exit, then run in the terminal:
source ~/.bashrc
nvcc -V  # if the configuration succeeded, this prints the CUDA version
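As an optional extra check, the bundled deviceQuery sample can confirm that the GPU is visible. This is a minimal sketch assuming the default JetPack CUDA samples location:
cd /usr/local/cuda-10.2/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery  # should finish with "Result = PASS" if CUDA can see the GPU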
III. Increasing the Jetson Nano Swap Size
1. Open a terminal and run:
sudo gedit /etc/systemd/nvzramconfig.sh
2. Edit nvzramconfig.sh
Change
mem=$((("${totalmem}"/2/"${NRDEVICES}")*1024))
to
mem=$((("${totalmem}"*2/"${NRDEVICES}")*1024))
3. Reboot the Jetson Nano.
4. In a terminal, run:
free -h
The swap size should now be reported as about 7.7G.
IV. Installing PyTorch 1.8
1. Download torch-1.8.0-cp36-cp36m-linux_aarch64.whl
Download link: nvidia.box.com/shared/static/p57jwntv436lfrd78inwl7iml6p13fzh.whl
Baidu Netdisk mirror: https://pan.baidu.com/s/1tS51E3a-a-w9_OdCNraoAg
Extraction code: 30qr
Note: the download server is overseas, so the page may load slowly or not at all; it is usually easier to download the file on a PC (over a VPN if necessary) and copy it to a folder on the Jetson Nano.
2. Install the required dependencies and PyTorch
Open a terminal and run:
sudo apt-get update
sudo apt-get upgrade
sudo apt-get dist-upgrade
sudo apt-get install python3-pip libopenblas-base libopenmpi-dev
The pip commands below use the default (overseas) index, which can be slow, so switching to a domestic mirror is recommended. Several mirrors are listed; if a package fails to download with one mirror, try another. Steps:
Open a terminal and run:
mkdir ~/.pip
sudo gedit ~/.pip/pip.conf
Paste one of the following configurations into the empty file, then save and exit:
Douban mirror:
[global]
timeout=6000
index-url=https://pypi.doubanio.com/simple
trusted-host=pypi.doubanio.com
Aliyun mirror:
[global]
index-url=http://mirrors.aliyun.com/pypi/simple/
[install]
trusted-host=mirrors.aliyun.com
Tsinghua mirror:
[global]
index-url=https://pypi.tuna.tsinghua.edu.cn/simple/
[install]
trusted-host=pypi.tuna.tsinghua.edu.cn
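Alternatively, instead of editing pip.conf, a mirror can be selected per command with pip's -i option, for example:
pip3 install numpy -i https://pypi.tuna.tsinghua.edu.cn/simple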
In the terminal, run:
pip3 install --upgrade pip  # skip if pip is already up to date
pip3 install Cython
pip3 install numpy
pip3 install torch-1.8.0-cp36-cp36m-linux_aarch64.whl  # run this from the directory containing the .whl file
sudo apt-get install libjpeg-dev zlib1g-dev libpython3-dev libavcodec-dev libavformat-dev libswscale-dev
git clone --branch v0.9.0 https://github.com/pytorch/vision torchvision  # downloads torchvision into a new folder
cd torchvision  # or open a terminal inside that folder
export BUILD_VERSION=0.9.0
python3 setup.py install --user  # this takes quite a while
# verify that the torch and torchvision modules were installed correctly
python3
import torch
print(torch.__version__)  # note the double underscores before and after "version"
# the version number is printed if the installation succeeded
import torchvision
print(torchvision.__version__)
# the version number is printed if the installation succeeded
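To also confirm that PyTorch can use the GPU, a quick additional check (a minimal sketch, run in the same python3 session) is:
print(torch.cuda.is_available())  # should print True if CUDA is usable
print(torch.cuda.get_device_name(0))  # prints the GPU name reported by the Nano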
V. Setting Up the YOLOv5 Environment
In a terminal, run:
git clone https://github.com/ultralytics/yolov5.git  # this often fails without a VPN; downloading on a PC and copying to the Jetson Nano is usually easier
python3 -m pip install --upgrade pip
cd yolov5  # if downloaded manually, the archive is named yolov5-master.zip; unzip it with unzip yolov5-master.zip and enter that folder instead
pip3 install -r requirements.txt  # in my case matplotlib failed to install (fix below); if other packages fail, go back to the mirror step and switch to another domestic mirror
python3 -m pip list  # lists the installed Python packages
The following commands can be used to test YOLOv5:
python3 detect.py --source data/images/bus.jpg --weights yolov5n.pt --img 640  # image test
python3 detect.py --source video.mp4 --weights yolov5n.pt --img 640  # video test; supply your own video
python3 detect.py --source 0 --weights yolov5n.pt --img 640  # camera test
Problem 1: matplotlib fails to install
Fix: download the matplotlib .whl package (Baidu Netdisk link below) and install it with pip3, as sketched below.
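A minimal install sketch (the .whl filename here is hypothetical; use the name of the file you actually downloaded, and run the command in the folder containing it):
pip3 install matplotlib-3.3.4-cp36-cp36m-linux_aarch64.whl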
Problem 2: running YOLOv5's detect.py fails with "Illegal instruction (core dumped)"
Fix:
sudo gedit ~/.bashrc
Add at the end of the file:
export OPENBLAS_CORETYPE=ARMV8
Save and close, then run:
source ~/.bashrc
Baidu Netdisk links:
yolov5: https://pan.baidu.com/s/1oGLTyUZ9TzEWO1VfxV70yw
Extraction code: 3ran
yolov5n.pt: https://pan.baidu.com/s/1k-EDuIJgKc_9OYubWOhJcg
Extraction code: oe0w
The yolov5n.pt file can also be downloaded from the release assets of https://github.com/ultralytics/yolov5.
matplotlib: https://pan.baidu.com/s/19DanfBYMxKerxlDSuIF8MQ
Extraction code: fp4i
VI. Accelerating Inference with tensorrtx
1. Download tensorrtx
Download link: https://github.com/wang-xinyu/tensorrtx.git
or
git clone https://github.com/wang-xinyu/tensorrtx.git
Baidu Netdisk mirror: https://pan.baidu.com/s/14vCw3V74bWrT_3QQ-Yk–A
Extraction code: 3zom
2. Build
Copy yolov5/gen_wts.py from the downloaded tensorrtx project into the yolov5 folder set up in the previous section (note: NOT the yolov5 folder inside tensorrtx!), then open a terminal in that yolov5 folder.
Run:
python3 gen_wts.py -w yolov5n.pt -o yolov5n.wts  # generates the .wts file; place yolov5n.pt in this folder first
cd ~/tensorrtx/yolov5/  # if downloaded manually, the folder may be named tensorrtx-master
mkdir build
cd build
Copy the generated .wts file into the build directory (adjust folder names if the projects were downloaded manually, e.g. yolov5-master).
cmake ..
make -j4
sudo ./yolov5 -s yolov5n.wts yolov5n.engine n  # builds the TensorRT engine file
sudo ./yolov5 -d yolov5n.engine ../samples/  # test on the sample images. If detections are missed on zidane.jpg, go up one directory, set CONF_THRESH to 0.25 in yolov5.cpp, run make -j4 in build again, and rerun this command.
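For reference, the general usage of the built binary (per the tensorrtx README; check the README of your tensorrtx version, since the accepted model letters vary between releases) is roughly:
sudo ./yolov5 -s [.wts] [.engine] [n/s/m/l/x or c gd gw]  # serialize a .wts file into a TensorRT engine
sudo ./yolov5 -d [.engine] [image folder]  # deserialize the engine and run detection on a folder of images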
3. Using a USB Camera
This part follows https://blog.csdn.net/weixin_54603153/article/details/120079220
(1) Back up the original yolov5.cpp under tensorrtx/yolov5 first, since it is needed again whenever you switch models and rebuild the engine.
(2) Then replace the contents of yolov5.cpp with the code below.
The main changes are around lines 12 and 342 of the file.
#include <iostream>
#include <chrono>
#include "cuda_utils.h"
#include "logging.h"
#include "common.hpp"
#include "utils.h"
#include "calibrator.h"
#define USE_FP32 // set USE_INT8 or USE_FP16 or USE_FP32
#define DEVICE 0 // GPU id
#define NMS_THRESH 0.4 //0.4
#define CONF_THRESH 0.25 // confidence threshold; the default 0.5 missed detections, so 0.25 gave better results
#define BATCH_SIZE 1
// stuff we know about the network and the input/output blobs
static const int INPUT_H = Yolo::INPUT_H;
static const int INPUT_W = Yolo::INPUT_W;
static const int CLASS_NUM = Yolo::CLASS_NUM;
static const int OUTPUT_SIZE = Yolo::MAX_OUTPUT_BBOX_COUNT * sizeof(Yolo::Detection) / sizeof(float) + 1; // we assume the yololayer outputs no more than MAX_OUTPUT_BBOX_COUNT boxes that conf >= 0.1
const char* INPUT_BLOB_NAME = "data";
const char* OUTPUT_BLOB_NAME = "prob";
static Logger gLogger;
char* my_classes[] = { "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
"fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
"elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
"skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard","surfboard",
"tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
"sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
"potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
"microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
"hair drier", "toothbrush" };
static int get_width(int x, float gw, int divisor = 8) {
//return math.ceil(x / divisor) * divisor
if (int(x * gw) % divisor == 0) {
return int(x * gw);
}
return (int(x * gw / divisor) + 1) * divisor;
}
static int get_depth(int x, float gd) {
if (x == 1) {
return 1;
}
else {
return round(x * gd) > 1 ? round(x * gd) : 1;
}
}
ICudaEngine* build_engine(unsigned int maxBatchSize, IBuilder* builder, IBuilderConfig* config, DataType dt, float& gd, float& gw, std::string& wts_name) {
INetworkDefinition* network = builder->createNetworkV2(0U);
// Create input tensor of shape {3, INPUT_H, INPUT_W} with name INPUT_BLOB_NAME
ITensor* data = network->addInput(INPUT_BLOB_NAME, dt, Dims3{ 3, INPUT_H, INPUT_W });
assert(data);
std::map<std::string, Weights> weightMap = loadWeights(wts_name);
/* ------ yolov5 backbone------ */
auto focus0 = focus(network, weightMap, *data, 3, get_width(64, gw), 3, "model.0");
auto conv1 = convBlock(network, weightMap, *focus0->getOutput(0), get_width(128, gw), 3, 2, 1, "model.1");
auto bottleneck_CSP2 = C3(network, weightMap, *conv1->getOutput(0), get_width(128, gw), get_width(128, gw), get_depth(3, gd), true, 1, 0.5, "model.2");
auto conv3 = convBlock(network, weightMap, *bottleneck_CSP2->getOutput(0), get_width(256, gw), 3, 2, 1, "model.3");
auto bottleneck_csp4 = C3(network, weightMap, *conv3->getOutput(0), get_width(256, gw), get_width(256, gw), get_depth(9, gd), true, 1, 0.5, "model.4");
auto conv5 = convBlock(network, weightMap, *bottleneck_csp4->getOutput(0), get_width(512, gw), 3, 2, 1, "model.5");
auto bottleneck_csp6 = C3(network, weightMap, *conv5->getOutput(0), get_width(512, gw), get_width(512, gw), get_depth(9, gd), true, 1, 0.5, "model.6");
auto conv7 = convBlock(network, weightMap, *bottleneck_csp6->getOutput(0), get_width(1024, gw), 3, 2, 1, "model.7");
auto spp8 = SPP(network, weightMap, *conv7->getOutput(0), get_width(1024, gw), get_width(1024, gw), 5, 9, 13, "model.8");
/* ------ yolov5 head ------ */
auto bottleneck_csp9 = C3(network, weightMap, *spp8->getOutput(0), get_width(1024, gw), get_width(1024, gw), get_depth(3, gd), false, 1, 0.5, "model.9");
auto conv10 = convBlock(network, weightMap, *bottleneck_csp9->getOutput(0), get_width(512, gw), 1, 1, 1, "model.10");
auto upsample11 = network->addResize(*conv10->getOutput(0));
assert(upsample11);
upsample11->setResizeMode(ResizeMode::kNEAREST);
upsample11->setOutputDimensions(bottleneck_csp6->getOutput(0)->getDimensions());
ITensor* inputTensors12[] = { upsample11->getOutput(0), bottleneck_csp6->getOutput(0) };
auto cat12 = network->addConcatenation(inputTensors12, 2);
auto bottleneck_csp13 = C3(network, weightMap, *cat12->getOutput(0), get_width(1024, gw), get_width(512, gw), get_depth(3, gd), false, 1, 0.5, "model.13");
auto conv14 = convBlock(network, weightMap, *bottleneck_csp13->getOutput(0), get_width(256, gw), 1, 1, 1, "model.14");
auto upsample15 = network->addResize(*conv14->getOutput(0));
assert(upsample15);
upsample15->setResizeMode(ResizeMode::kNEAREST);
upsample15->setOutputDimensions(bottleneck_csp4->getOutput(0)->getDimensions());
ITensor* inputTensors16[] = { upsample15->getOutput(0), bottleneck_csp4->getOutput(0) };
auto cat16 = network->addConcatenation(inputTensors16, 2);
auto bottleneck_csp17 = C3(network, weightMap, *cat16->getOutput(0), get_width(512, gw), get_width(256, gw), get_depth(3, gd), false, 1, 0.5, "model.17");
// yolo layer 0
IConvolutionLayer* det0 = network->addConvolutionNd(*bottleneck_csp17->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.0.weight"], weightMap["model.24.m.0.bias"]);
auto conv18 = convBlock(network, weightMap, *bottleneck_csp17->getOutput(0), get_width(256, gw), 3, 2, 1, "model.18");
ITensor* inputTensors19[] = { conv18->getOutput(0), conv14->getOutput(0) };
auto cat19 = network->addConcatenation(inputTensors19, 2);
auto bottleneck_csp20 = C3(network, weightMap, *cat19->getOutput(0), get_width(512, gw), get_width(512, gw), get_depth(3, gd), false, 1, 0.5, "model.20");
//yolo layer 1
IConvolutionLayer* det1 = network->addConvolutionNd(*bottleneck_csp20->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.1.weight"], weightMap["model.24.m.1.bias"]);
auto conv21 = convBlock(network, weightMap, *bottleneck_csp20->getOutput(0), get_width(512, gw), 3, 2, 1, "model.21");
ITensor* inputTensors22[] = { conv21->getOutput(0), conv10->getOutput(0) };
auto cat22 = network->addConcatenation(inputTensors22, 2);
auto bottleneck_csp23 = C3(network, weightMap, *cat22->getOutput(0), get_width(1024, gw), get_width(1024, gw), get_depth(3, gd), false, 1, 0.5, "model.23");
IConvolutionLayer* det2 = network->addConvolutionNd(*bottleneck_csp23->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.2.weight"], weightMap["model.24.m.2.bias"]);
auto yolo = addYoLoLayer(network, weightMap, "model.24", std::vector<IConvolutionLayer*>{det0, det1, det2});
yolo->getOutput(0)->setName(OUTPUT_BLOB_NAME);
network->markOutput(*yolo->getOutput(0));
// Build engine
builder->setMaxBatchSize(maxBatchSize);
config->setMaxWorkspaceSize(16 * (1 << 20)); // 16MB
#if defined(USE_FP16)
config->setFlag(BuilderFlag::kFP16);
#elif defined(USE_INT8)
std::cout << "Your platform support int8: " << (builder->platformHasFastInt8() ? "true" : "false") << std::endl;
assert(builder->platformHasFastInt8());
config->setFlag(BuilderFlag::kINT8);
Int8EntropyCalibrator2* calibrator = new Int8EntropyCalibrator2(1, INPUT_W, INPUT_H, "./coco_calib/", "int8calib.table", INPUT_BLOB_NAME);
config->setInt8Calibrator(calibrator);
#endif
std::cout << "Building engine, please wait for a while..." << std::endl;
ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
std::cout << "Build engine successfully!" << std::endl;
// Don't need the network any more
network->destroy();
// Release host memory
for (auto& mem : weightMap)
{
free((void*)(mem.second.values));
}
return engine;
}
ICudaEngine* build_engine_p6(unsigned int maxBatchSize, IBuilder* builder, IBuilderConfig* config, DataType dt, float& gd, float& gw, std::string& wts_name) {
INetworkDefinition* network = builder->createNetworkV2(0U);
// Create input tensor of shape {3, INPUT_H, INPUT_W} with name INPUT_BLOB_NAME
ITensor* data = network->addInput(INPUT_BLOB_NAME, dt, Dims3{ 3, INPUT_H, INPUT_W });
assert(data);
std::map<std::string, Weights> weightMap = loadWeights(wts_name);
/* ------ yolov5 backbone------ */
auto focus0 = focus(network, weightMap, *data, 3, get_width(64, gw), 3, "model.0");
auto conv1 = convBlock(network, weightMap, *focus0->getOutput(0), get_width(128, gw), 3, 2, 1, "model.1");
auto c3_2 = C3(network, weightMap, *conv1->getOutput(0), get_width(128, gw), get_width(128, gw), get_depth(3, gd), true, 1, 0.5, "model.2");
auto conv3 = convBlock(network, weightMap, *c3_2->getOutput(0), get_width(256, gw), 3, 2, 1, "model.3");
auto c3_4 = C3(network, weightMap, *conv3->getOutput(0), get_width(256, gw), get_width(256, gw), get_depth(9, gd), true, 1, 0.5, "model.4");
auto conv5 = convBlock(network, weightMap, *c3_4->getOutput(0), get_width(512, gw), 3, 2, 1, "model.5");
auto c3_6 = C3(network, weightMap, *conv5->getOutput(0), get_width(512, gw), get_width(512, gw), get_depth(9, gd), true, 1, 0.5, "model.6");
auto conv7 = convBlock(network, weightMap, *c3_6->getOutput(0), get_width(768, gw), 3, 2, 1, "model.7");
auto c3_8 = C3(network, weightMap, *conv7->getOutput(0), get_width(768, gw), get_width(768, gw), get_depth(3, gd), true, 1, 0.5, "model.8");
auto conv9 = convBlock(network, weightMap, *c3_8->getOutput(0), get_width(1024, gw), 3, 2, 1, "model.9");
auto spp10 = SPP(network, weightMap, *conv9->getOutput(0), get_width(1024, gw), get_width(1024, gw), 3, 5, 7, "model.10");
auto c3_11 = C3(network, weightMap, *spp10->getOutput(0), get_width(1024, gw), get_width(1024, gw), get_depth(3, gd), false, 1, 0.5, "model.11");
/* ------ yolov5 head ------ */
auto conv12 = convBlock(network, weightMap, *c3_11->getOutput(0), get_width(768, gw), 1, 1, 1, "model.12");
auto upsample13 = network->addResize(*conv12->getOutput(0));
assert(upsample13);
upsample13->setResizeMode(ResizeMode::kNEAREST);
upsample13->setOutputDimensions(c3_8->getOutput(0)->getDimensions());
ITensor* inputTensors14[] = { upsample13->getOutput(0), c3_8->getOutput(0) };
auto cat14 = network->addConcatenation(inputTensors14, 2);
auto c3_15 = C3(network, weightMap, *cat14->getOutput(0), get_width(1536, gw), get_width(768, gw), get_depth(3, gd), false, 1, 0.5, "model.15");
auto conv16 = convBlock(network, weightMap, *c3_15->getOutput(0), get_width(512, gw), 1, 1, 1, "model.16");
auto upsample17 = network->addResize(*conv16->getOutput(0));
assert(upsample17);
upsample17->setResizeMode(ResizeMode::kNEAREST);
upsample17->setOutputDimensions(c3_6->getOutput(0)->getDimensions());
ITensor* inputTensors18[] = { upsample17->getOutput(0), c3_6->getOutput(0) };
auto cat18 = network->addConcatenation(inputTensors18, 2);
auto c3_19 = C3(network, weightMap, *cat18->getOutput(0), get_width(1024, gw), get_width(512, gw), get_depth(3, gd), false, 1, 0.5, "model.19");
auto conv20 = convBlock(network, weightMap, *c3_19->getOutput(0), get_width(256, gw), 1, 1, 1, "model.20");
auto upsample21 = network->addResize(*conv20->getOutput(0));
assert(upsample21);
upsample21->setResizeMode(ResizeMode::kNEAREST);
upsample21->setOutputDimensions(c3_4->getOutput(0)->getDimensions());
ITensor* inputTensors21[] = { upsample21->getOutput(0), c3_4->getOutput(0) };
auto cat22 = network->addConcatenation(inputTensors21, 2);
auto c3_23 = C3(network, weightMap, *cat22->getOutput(0), get_width(512, gw), get_width(256, gw), get_depth(3, gd), false, 1, 0.5, "model.23");
auto conv24 = convBlock(network, weightMap, *c3_23->getOutput(0), get_width(256, gw), 3, 2, 1, "model.24");
ITensor* inputTensors25[] = { conv24->getOutput(0), conv20->getOutput(0) };
auto cat25 = network->addConcatenation(inputTensors25, 2);
auto c3_26 = C3(network, weightMap, *cat25->getOutput(0), get_width(1024, gw), get_width(512, gw), get_depth(3, gd), false, 1, 0.5, "model.26");
auto conv27 = convBlock(network, weightMap, *c3_26->getOutput(0), get_width(512, gw), 3, 2, 1, "model.27");
ITensor* inputTensors28[] = { conv27->getOutput(0), conv16->getOutput(0) };
auto cat28 = network->addConcatenation(inputTensors28, 2);
auto c3_29 = C3(network, weightMap, *cat28->getOutput(0), get_width(1536, gw), get_width(768, gw), get_depth(3, gd), false, 1, 0.5, "model.29");
auto conv30 = convBlock(network, weightMap, *c3_29->getOutput(0), get_width(768, gw), 3, 2, 1, "model.30");
ITensor* inputTensors31[] = { conv30->getOutput(0), conv12->getOutput(0) };
auto cat31 = network->addConcatenation(inputTensors31, 2);
auto c3_32 = C3(network, weightMap, *cat31->getOutput(0), get_width(2048, gw), get_width(1024, gw), get_depth(3, gd), false, 1, 0.5, "model.32");
/* ------ detect ------ */
IConvolutionLayer* det0 = network->addConvolutionNd(*c3_23->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.33.m.0.weight"], weightMap["model.33.m.0.bias"]);
IConvolutionLayer* det1 = network->addConvolutionNd(*c3_26->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.33.m.1.weight"], weightMap["model.33.m.1.bias"]);
IConvolutionLayer* det2 = network->addConvolutionNd(*c3_29->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.33.m.2.weight"], weightMap["model.33.m.2.bias"]);
IConvolutionLayer* det3 = network->addConvolutionNd(*c3_32->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.33.m.3.weight"], weightMap["model.33.m.3.bias"]);
auto yolo = addYoLoLayer(network, weightMap, "model.33", std::vector<IConvolutionLayer*>{det0, det1, det2, det3});
yolo->getOutput(0)->setName(OUTPUT_BLOB_NAME);
network->markOutput(*yolo->getOutput(0));
// Build engine
builder->setMaxBatchSize(maxBatchSize);
config->setMaxWorkspaceSize(16 * (1 << 20)); // 16MB
#if defined(USE_FP16)
config->setFlag(BuilderFlag::kFP16);
#elif defined(USE_INT8)
std::cout << "Your platform support int8: " << (builder->platformHasFastInt8() ? "true" : "false") << std::endl;
assert(builder->platformHasFastInt8());
config->setFlag(BuilderFlag::kINT8);
Int8EntropyCalibrator2* calibrator = new Int8EntropyCalibrator2(1, INPUT_W, INPUT_H, "./coco_calib/", "int8calib.table", INPUT_BLOB_NAME);
config->setInt8Calibrator(calibrator);
#endif
std::cout << "Building engine, please wait for a while..." << std::endl;
ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
std::cout << "Build engine successfully!" << std::endl;
// Don't need the network any more
network->destroy();
// Release host memory
for (auto& mem : weightMap)
{
free((void*)(mem.second.values));
}
return engine;
}
void APIToModel(unsigned int maxBatchSize, IHostMemory** modelStream, float& gd, float& gw, std::string& wts_name) {
// Create builder
IBuilder* builder = createInferBuilder(gLogger);
IBuilderConfig* config = builder->createBuilderConfig();
// Create model to populate the network, then set the outputs and create an engine
ICudaEngine* engine = build_engine(maxBatchSize, builder, config, DataType::kFLOAT, gd, gw, wts_name);
assert(engine != nullptr);
// Serialize the engine
(*modelStream) = engine->serialize();
// Close everything down
engine->destroy();
builder->destroy();
config->destroy();
}
void doInference(IExecutionContext& context, cudaStream_t& stream, void** buffers, float* input, float* output, int batchSize) {
// DMA input batch data to device, infer on the batch asynchronously, and DMA output back to host
CUDA_CHECK(cudaMemcpyAsync(buffers[0], input, batchSize * 3 * INPUT_H * INPUT_W * sizeof(float), cudaMemcpyHostToDevice, stream));
context.enqueue(batchSize, buffers, stream, nullptr);
CUDA_CHECK(cudaMemcpyAsync(output, buffers[1], batchSize * OUTPUT_SIZE * sizeof(float), cudaMemcpyDeviceToHost, stream));
cudaStreamSynchronize(stream);
}
bool parse_args(int argc, char** argv, std::string& engine) {
if (argc < 3) return false;
if (std::string(argv[1]) == "-v" && argc == 3) {
engine = std::string(argv[2]);
}
else {
return false;
}
return true;
}
int main(int argc, char** argv) {
cudaSetDevice(DEVICE);
//std::string wts_name = "";
std::string engine_name = "";
//float gd = 0.0f, gw = 0.0f;
//std::string img_dir;
if (!parse_args(argc, argv, engine_name)) {
std::cerr << "arguments not right!" << std::endl;
std::cerr << "./yolov5 -v [.engine] // run inference with camera" << std::endl;
return -1;
}
std::ifstream file(engine_name, std::ios::binary);
if (!file.good()) {
std::cerr << " read " << engine_name << " error! " << std::endl;
return -1;
}
char* trtModelStream{ nullptr };
size_t size = 0;
file.seekg(0, file.end);
size = file.tellg();
file.seekg(0, file.beg);
trtModelStream = new char[size];
assert(trtModelStream);
file.read(trtModelStream, size);
file.close();
// prepare input data ---------------------------
static float data[BATCH_SIZE * 3 * INPUT_H * INPUT_W];
//for (int i = 0; i < 3 * INPUT_H * INPUT_W; i++)
// data[i] = 1.0;
static float prob[BATCH_SIZE * OUTPUT_SIZE];
IRuntime* runtime = createInferRuntime(gLogger);
assert(runtime != nullptr);
ICudaEngine* engine = runtime->deserializeCudaEngine(trtModelStream, size);
assert(engine != nullptr);
IExecutionContext* context = engine->createExecutionContext();
assert(context != nullptr);
delete[] trtModelStream;
assert(engine->getNbBindings() == 2);
void* buffers[2];
// In order to bind the buffers, we need to know the names of the input and output tensors.
// Note that indices are guaranteed to be less than IEngine::getNbBindings()
const int inputIndex = engine->getBindingIndex(INPUT_BLOB_NAME);
const int outputIndex = engine->getBindingIndex(OUTPUT_BLOB_NAME);
assert(inputIndex == 0);
assert(outputIndex == 1);
// Create GPU buffers on device
CUDA_CHECK(cudaMalloc(&buffers[inputIndex], BATCH_SIZE * 3 * INPUT_H * INPUT_W * sizeof(float)));
CUDA_CHECK(cudaMalloc(&buffers[outputIndex], BATCH_SIZE * OUTPUT_SIZE * sizeof(float)));
// Create stream
cudaStream_t stream;
CUDA_CHECK(cudaStreamCreate(&stream));
cv::VideoCapture capture("/home/cao-yolox/yolov5/tensorrtx-master/yolov5/samples/1.mp4"); #修改為自己要檢測的視頻或者圖片,注意要寫全路徑,如果調(diào)用攝像頭,則括號內(nèi)的參數(shù)設(shè)為0,注意引號要去掉。
//cv::VideoCapture capture("../overpass.mp4");
//int fourcc = cv::VideoWriter::fourcc('M','J','P','G');
//capture.set(cv::CAP_PROP_FOURCC, fourcc);
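// Optional: for a CSI camera (e.g. a Raspberry Pi camera module) instead of a USB camera,
// OpenCV on the Nano typically needs a GStreamer pipeline rather than a device index.
// A common pipeline sketch (resolution and framerate here are assumptions, adjust as needed):
// cv::VideoCapture capture("nvarguscamerasrc ! video/x-raw(memory:NVMM), width=1280, height=720, framerate=30/1, format=NV12 ! nvvidconv ! video/x-raw, format=BGRx ! videoconvert ! video/x-raw, format=BGR ! appsink", cv::CAP_GSTREAMER);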
if (!capture.isOpened()) {
std::cout << "Error opening video stream or file" << std::endl;
return -1;
}
int key;
int fcount = 0;
while (1)
{
cv::Mat frame;
capture >> frame;
if (frame.empty())
{
std::cout << "Fail to read image from camera!" << std::endl;
break;
}
fcount++;
//if (fcount < BATCH_SIZE && f + 1 != (int)file_names.size()) continue;
for (int b = 0; b < fcount; b++) {
//cv::Mat img = cv::imread(img_dir + "/" + file_names[f - fcount + 1 + b]);
cv::Mat img = frame;
if (img.empty()) continue;
cv::Mat pr_img = preprocess_img(img, INPUT_W, INPUT_H); // letterbox BGR to RGB
int i = 0;
for (int row = 0; row < INPUT_H; ++row) {
uchar* uc_pixel = pr_img.data + row * pr_img.step;
for (int col = 0; col < INPUT_W; ++col) {
data[b * 3 * INPUT_H * INPUT_W + i] = (float)uc_pixel[2] / 255.0;
data[b * 3 * INPUT_H * INPUT_W + i + INPUT_H * INPUT_W] = (float)uc_pixel[1] / 255.0;
data[b * 3 * INPUT_H * INPUT_W + i + 2 * INPUT_H * INPUT_W] = (float)uc_pixel[0] / 255.0;
uc_pixel += 3;
++i;
}
}
}
// Run inference
auto start = std::chrono::system_clock::now();
doInference(*context, stream, buffers, data, prob, BATCH_SIZE);
auto end = std::chrono::system_clock::now();
//std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;
int fps = 1000.0 / std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
std::vector<std::vector<Yolo::Detection>> batch_res(fcount);
for (int b = 0; b < fcount; b++) {
auto& res = batch_res[b];
nms(res, &prob[b * OUTPUT_SIZE], CONF_THRESH, NMS_THRESH);
}
for (int b = 0; b < fcount; b++) {
auto& res = batch_res[b];
//std::cout << res.size() << std::endl;
//cv::Mat img = cv::imread(img_dir + "/" + file_names[f - fcount + 1 + b]);
for (size_t j = 0; j < res.size(); j++) {
cv::Rect r = get_rect(frame, res[j].bbox);
cv::rectangle(frame, r, cv::Scalar(0x27, 0xC1, 0x36), 2);
std::string label = my_classes[(int)res[j].class_id];
cv::putText(frame, label, cv::Point(r.x, r.y - 1), cv::FONT_HERSHEY_PLAIN, 1.2, cv::Scalar(0xFF, 0xFF, 0xFF), 2);
std::string jetson_fps = "Jetson Nano FPS: " + std::to_string(fps);
cv::putText(frame, jetson_fps, cv::Point(11, 80), cv::FONT_HERSHEY_PLAIN, 3, cv::Scalar(0, 0, 255), 2, cv::LINE_AA);
}
//cv::imwrite("_" + file_names[f - fcount + 1 + b], img);
}
cv::imshow("yolov5", frame);
key = cv::waitKey(1);
if (key == 'q') {
break;
}
fcount = 0;
}
capture.release();
// Release stream and buffers
cudaStreamDestroy(stream);
CUDA_CHECK(cudaFree(buffers[inputIndex]));
CUDA_CHECK(cudaFree(buffers[outputIndex]));
// Destroy the engine
context->destroy();
engine->destroy();
runtime->destroy();
return 0;
}
4. Rebuild and run
Go back into the build directory and run make again (as sketched below); note that yolov5.cpp must be recompiled after every modification.
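A minimal rebuild sketch (assuming the folder layout from the earlier steps; use tensorrtx-master if the project was downloaded as a zip):
cd ~/tensorrtx/yolov5/build
make -j4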
Then run:
sudo ./yolov5 -v yolov5n.engine  # make sure the camera is plugged in first
Problem: "Failed to load module 'canberra-gtk-module'" appears
Fix:
sudo apt-get install libcanberra-gtk-module
5. Results
The test below was run on a publicly available pedestrian-detection video, which can be downloaded here:
Link: https://pan.baidu.com/s/1HivF1OifVA8pHnGKtkXPfg
Extraction code: jr7o
VII. References
1.https://www.bilibili.com/read/cv11869887?spm_id_from=333.999.0.0
2. https://blog.csdn.net/weixin_54603153/article/details/120079220
3. https://github.com/wang-xinyu/tensorrtx/blob/master/yolov5/README.md