
PaddleSeg Study Notes 4: Running a Paddle Model with TensorRT Inference (C++)

Author: waf13916 | Posted: 2 years ago | Category: Toy Blog


1 Adding softmax and argmax operators at the end of the model

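The semantic segmentation model in the previous post, "PaddleSeg C++ deployment of the OCRNet+HRNet model", outputs float32 class scores and contains no softmax or argmax step, so post-processing took a long time in the actual project.
Adding softmax and argmax operators at the end of the network through PaddleSeg/tools/export.py moves that work into the model and removes the post-processing cost from the application.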
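To make the effect of these two operators concrete, here is a minimal pure-Python sketch of what they compute per pixel. The helper names (`softmax`, `argmax`, `label_map`) and the tiny 3-class input are illustrative assumptions, not PaddleSeg code. It also shows that softmax does not change which class wins, so argmax alone is enough when only the label map is needed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def label_map(logits):
    """Convert logits[c][y][x] into labels[y][x], one class id per pixel."""
    num_classes = len(logits)
    height, width = len(logits[0]), len(logits[0][0])
    return [
        [argmax(softmax([logits[c][y][x] for c in range(num_classes)]))
         for x in range(width)]
        for y in range(height)
    ]


# Hypothetical scores for 3 classes on a 1x2 image (layout: class, row, column).
logits = [
    [[0.1, 2.0]],
    [[1.5, 0.3]],
    [[-0.2, 0.9]],
]
labels = label_map(logits)
print(labels)
```

With the operators inside the exported model, the application receives one integer per pixel instead of one float per class per pixel, so the per-class loop above no longer runs on the host.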

Export the inference model following PaddleSeg/docs/model_export_cn.md. The exported files are saved in the output/inference_model folder, as shown below. The model output type is int32.

./output/inference_model
  ├── deploy.yaml            # deployment config; mainly describes the data preprocessing
  ├── model.pdmodel          # topology of the inference model
  ├── model.pdiparams        # weights of the inference model
  └── model.pdiparams.info   # extra parameter info; can usually be ignored

python tools/export.py \
       --config configs/ocrnet/ocrnet_hrnetw18_cityscapes_1024x512_160k_lovasz_softmax.yml \
       --model_path output/iter_12000/model.pdparams \
       --save_dir output/inference_model \
       --output_op argmax

In PaddleSeg versions before v2.0, export.py has no argmax or softmax options. The following script adds softmax and argmax operators at the end of the model instead.

import argparse
import os
import paddle
import yaml
from paddleseg.cvlibs import Config
from paddleseg.utils import logger

def parse_args():
    parser = argparse.ArgumentParser(description='Model export.')
    # params of training
    parser.add_argument(
        "--config",
        dest="cfg",
        help="The config file.",
        default=None,
        type=str,
        required=True)
    parser.add_argument(
        '--save_dir',
        dest='save_dir',
        help='The directory for saving the model snapshot',
        type=str,
        default='./output')
    parser.add_argument(
        '--model_path',
        dest='model_path',
        help='The path of model for evaluation',
        type=str,
        default=None)

    return parser.parse_args()
    
class PostProcessor(paddle.nn.Layer):
    """Optionally applies softmax and/or argmax along the class axis."""

    def __init__(self, without_argmax, with_softmax):
        super().__init__()
        self.without_argmax = without_argmax
        self.with_softmax = with_softmax

    def forward(self, outs):
        new_outs = []
        for out in outs:
            if self.with_softmax:
                out = paddle.nn.functional.softmax(out, axis=1)
            if not self.without_argmax:
                # int32 matches the output type stated for the exported model
                out = paddle.argmax(out, axis=1, dtype='int32')
            new_outs.append(out)
        return new_outs


class SavedSegmentationNet(paddle.nn.Layer):
    """Wraps the trained network and appends the post-processing operators."""

    def __init__(self, net, without_argmax=False, with_softmax=False):
        super().__init__()
        self.net = net
        self.post_processor = PostProcessor(without_argmax, with_softmax)

    def forward(self, x):
        outs = self.net(x)
        outs = self.post_processor(outs)
        return outs

def main(args):
    os.environ['PADDLESEG_EXPORT_STAGE'] = 'True'
    cfg = Config(args.cfg)
    net = cfg.model

    if args.model_path:
        para_state_dict = paddle.load(args.model_path)
        net.set_dict(para_state_dict)
        logger.info('Loaded trained params of model successfully.')

    # Append softmax and argmax (without_argmax=False keeps the argmax operator)
    new_net = SavedSegmentationNet(net, without_argmax=False, with_softmax=True)
    
    new_net.eval()
    new_net = paddle.jit.to_static(
        new_net,
        input_spec=[
            paddle.static.InputSpec(
                shape=[None, 3, None, None], dtype='float32')
        ])
    save_path = os.path.join(args.save_dir, 'model')
    paddle.jit.save(new_net, save_path)

    yml_file = os.path.join(args.save_dir, 'deploy.yaml')
    with open(yml_file, 'w') as file:
        transforms = cfg.export_config.get('transforms', [{
            'type': 'Normalize'
        }])
        data = {
            'Deploy': {
                'transforms': transforms,
                'model': 'model.pdmodel',
                'params': 'model.pdiparams'
            }
        }
        yaml.dump(data, file)

    logger.info(f'Model is saved in {args.save_dir}.')

if __name__ == '__main__':
    args = parse_args()
    main(args)

2 Converting the Paddle model to an ONNX model

Reference: PaddleSeg/docs/model_export_onnx_cn.md
Reference: Paddle2ONNX

(1) Install Paddle2ONNX

pip install paddle2onnx

(2) Convert the model
Run the following command to convert the inference model in the output/inference_model folder to ONNX format with Paddle2ONNX. The converted model is saved as model.onnx.

paddle2onnx --model_dir output/inference_model \
            --model_filename model.pdmodel \
            --params_filename model.pdiparams \
            --opset_version 12 \
            --save_file model.onnx \
            --enable_dev_version True

3 Converting the ONNX model to a TensorRT engine

3.1 Installing TensorRT-8.5.3.1

See the TensorRT installation guide.

3.2 Using trtexec to compile and optimize the ONNX model into an engine

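Because the model takes dynamic input sizes, the command specifies the minimum, optimal, and maximum input shapes. The resulting engine is saved as model.trt.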

trtexec.exe ^
	--onnx=model.onnx ^
	--explicitBatch --fp16 ^
	--minShapes=x:1x3x540x960 ^
	--optShapes=x:1x3x720x1280 ^
	--maxShapes=x:1x3x1080x1920 ^
	--saveEngine=model.trt
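The shape range above determines which inputs the engine accepts and how large the device buffers must be. The following sketch works through that arithmetic. The function names are illustrative assumptions, and the 4-byte element size assumes a float32 input and an int32 output, as described in section 1.

```python
# Shapes copied from the trtexec command (N, C, H, W).
MIN_SHAPE = (1, 3, 540, 960)
OPT_SHAPE = (1, 3, 720, 1280)
MAX_SHAPE = (1, 3, 1080, 1920)


def in_profile(shape, min_shape=MIN_SHAPE, max_shape=MAX_SHAPE):
    """True if every dimension lies within [min, max] of the profile."""
    if len(shape) != len(min_shape):
        return False
    return all(lo <= d <= hi for lo, d, hi in zip(min_shape, shape, max_shape))


def num_elements(shape):
    """Product of all dimensions."""
    count = 1
    for dim in shape:
        count *= dim
    return count


def buffer_bytes(max_shape=MAX_SHAPE, bytes_per_element=4):
    """Bytes needed for the N*3*H*W input and the N*H*W label output."""
    n, _, h, w = max_shape
    input_bytes = num_elements(max_shape) * bytes_per_element
    output_bytes = n * h * w * bytes_per_element
    return input_bytes, output_bytes


print(in_profile(OPT_SHAPE))           # a 720p frame is inside the range
print(in_profile((1, 3, 2160, 3840)))  # a 4K frame is outside the range
print(buffer_bytes())
```

Sizing the buffers from the maximum shape lets one allocation serve every input the engine can accept. A buffer sized for a smaller resolution would be too small for the largest permitted frame.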

4 TensorRT inference test

See: testing a segmentation model with dynamic-size input in TensorRT.

5 Complete code

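// LaneSegInferTRT.hpp
#pragma once
#include <memory>
#include <string>
#include <vector>
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <opencv2/opencv.hpp>
// PaddleSegmentation::DataLane and LanePostProcess come from the project's own
// post-processing header, which is not shown in this article.

namespace TRTSegmentation {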

	class Logger : public nvinfer1::ILogger
	{
	public:
		Logger(Severity severity = Severity::kWARNING) :
			severity_(severity) {}

		virtual void log(Severity severity, const char* msg) noexcept override
		{
			// suppress info-level messages
			if (severity <= severity_) {
				//std::cout << msg << std::endl;
			}
		}

		nvinfer1::ILogger& getTRTLogger() noexcept
		{
			return *this;
		}
	private:
		Severity severity_;
	};

	struct InferDeleter
	{
		template <typename T>
		void operator()(T* obj) const
		{
			delete obj;
		}
	};

	template <typename T>
	using SampleUniquePtr = std::unique_ptr<T, InferDeleter>;

	class LaneSegInferTRT
	{
	public:
		LaneSegInferTRT(const std::string seg_model_dir = "") {
			this->seg_model_dir_ = seg_model_dir;
			InitPredictor();
		}

		~LaneSegInferTRT()
		{
			cudaFree(bindings_[0]);
			cudaFree(bindings_[1]);
		}
		void PredictSeg(
			const cv::Mat &image_mat, 
			std::vector<PaddleSegmentation::DataLane> &solLanes /* solid lines */,
			std::vector<PaddleSegmentation::DataLane> &dasLanes /* dashed lines */,
			std::vector<double>* times = nullptr);
	private:
		void InitPredictor();
		// Preprocess image and copy data to input buffer
		cv::Mat Preprocess(const cv::Mat& image_mat);
		// Postprocess image
		void Postprocess(int rows, 
						int cols, 
						std::vector<int> &out_data,
						std::vector<PaddleSegmentation::DataLane> &solLanes,
						std::vector<PaddleSegmentation::DataLane> &dasLanes);

	private:
		//static const int num_classes_ = 15;
		std::shared_ptr<nvinfer1::ICudaEngine> mEngine_;
		SampleUniquePtr<nvinfer1::IExecutionContext> context_seg_lane_;
		std::vector<void*> bindings_;
		std::string seg_model_dir_;
		int gpuMaxBufSize = 1920 * 1080; // max H*W, matching --maxShapes=x:1x3x1080x1920
	};

}//namespace TRTSegmentation

// LaneSegInferTRT.cpp
#include "LaneSegInferTRT.hpp"
#include <fstream>
#include <iostream>
namespace {
	class Logger : public nvinfer1::ILogger
	{
	public:
		Logger(Severity severity = Severity::kWARNING) :
			severity_(severity) {}

		virtual void log(Severity severity, const char* msg) noexcept override
		{
			// suppress info-level messages
			if (severity <= severity_) {
				//std::cout << msg << std::endl;
			}
		}

		nvinfer1::ILogger& getTRTLogger() noexcept
		{
			return *this;
		}
	private:
		Severity severity_;
	};
}

namespace TRTSegmentation {

#define CHECK(status)                                                                                                  \
    do                                                                                                                 \
    {                                                                                                                  \
        auto ret = (status);                                                                                           \
        if (ret != 0)                                                                                                  \
        {                                                                                                              \
            std::cerr << "Cuda failure: " << ret << std::endl;                                                         \
		}                                                                                                              \
	} while (0)

	void LaneSegInferTRT::InitPredictor()
	{
		if (seg_model_dir_.empty()) {
			throw "Predictor must receive seg_model!";
		}

		std::ifstream ifs(seg_model_dir_, std::ifstream::binary);
		if (!ifs) {
			throw "seg_model_dir error!";
		}

		ifs.seekg(0, std::ios_base::end);
		const std::streamsize size = ifs.tellg();
		ifs.seekg(0, std::ios_base::beg);

		// Use the char[] specialization so the buffer is released with delete[]
		std::unique_ptr<char[]> pData(new char[size]);
		ifs.read(pData.get(), size);

		ifs.close();

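		// Deserialize the engine. The logger is static because TensorRT keeps a
		// reference to it for as long as the runtime and engine are alive.
		static Logger logger(nvinfer1::ILogger::Severity::kVERBOSE);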

		SampleUniquePtr<nvinfer1::IRuntime> runtime{nvinfer1::createInferRuntime(logger.getTRTLogger()) };
		mEngine_ = std::shared_ptr<nvinfer1::ICudaEngine>(
			runtime->deserializeCudaEngine(pData.get(), size), InferDeleter());
			
		this->context_seg_lane_ = SampleUniquePtr<nvinfer1::IExecutionContext>(mEngine_->createExecutionContext());

		bindings_.resize(mEngine_->getNbBindings());

		CHECK(cudaMalloc(&bindings_[0], sizeof(float) * 3 * gpuMaxBufSize));    // n*3*h*w
		CHECK(cudaMalloc(&bindings_[1], sizeof(int) * 1 * gpuMaxBufSize));      // n*1*h*w
	}
	
	cv::Mat LaneSegInferTRT::Preprocess(const cv::Mat& image_mat)
	{
		cv::Mat img;
		cv::cvtColor(image_mat, img, cv::COLOR_BGR2RGB);

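		if (true/*is_normalize*/) {
			// (x / 255 - 0.5) / 0.5 == x * 2 / 255 - 1, applied to every channel.
			// Note: `img - 0.5` builds cv::Scalar(0.5, 0, 0, 0) and shifts only channel 0.
			img.convertTo(img, CV_32F, 2.0 / 255, -1.0);
		}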
		return img;
	}

	void LaneSegInferTRT::PredictSeg(
			const cv::Mat &image_mat,
			std::vector<PaddleSegmentation::DataLane> &solLanes ,
			std::vector<PaddleSegmentation::DataLane> &dasLanes,
			std::vector<double>* times)
	{
		// Preprocess image
		cv::Mat img = Preprocess(image_mat);		
		int rows = img.rows;
		int cols = img.cols;
		this->context_seg_lane_->setBindingDimensions(0, nvinfer1::Dims4{ 1, 3 , rows, cols });
		int chs = img.channels();
		std::vector<float> input_data(1 * chs * rows * cols, 0.0f);
		hwc_img_2_chw_data(img, input_data.data());		
		CHECK(cudaMemcpy(bindings_[0], static_cast<const void*>(input_data.data()), 3 * img.rows * img.cols * sizeof(float), cudaMemcpyHostToDevice));

		// Run inference
		context_seg_lane_->executeV2(bindings_.data());
		// Get output tensor		
		std::vector<int> out_data(1 * 1 * rows * cols);
		CHECK(cudaMemcpy(static_cast<void*>(out_data.data()), bindings_[1], out_data.size() * sizeof(int), cudaMemcpyDeviceToHost));
		// Postprocessing
		Postprocess(rows, cols, out_data, solLanes,dasLanes);
	}

	void LaneSegInferTRT::Postprocess(int rows, int cols, std::vector<int>& out_data,std::vector<PaddleSegmentation::DataLane> &solLanes,
		std::vector<PaddleSegmentation::DataLane> &dasLanes)
	{
		PaddleSegmentation::LanePostProcess laneNet(rows, cols);
		laneNet.lanePostprocessForTRT(out_data,solLanes,dasLanes);
	}	

}//namespace TRTSegmentation
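PredictSeg calls a helper named hwc_img_2_chw_data whose body is not included in the article. Judging by its name and by the N*3*H*W input buffer, it presumably reorders the interleaved OpenCV pixels (height, width, channel) into the planar channel-first layout the network expects. The sketch below shows only that index mapping on nested lists; it is an assumption about the helper's behavior, not the author's implementation.

```python
def hwc_to_chw(image):
    """Flatten image[y][x][c] into a channel-major (C, H, W) list."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    flat = [0.0] * (channels * height * width)
    for y in range(height):
        for x in range(width):
            for c in range(channels):
                # Each channel occupies one contiguous height*width plane.
                flat[c * height * width + y * width + x] = image[y][x][c]
    return flat


# Hypothetical 2x2 image with 3 channels; values encode channel and position.
image = [
    [[0, 10, 20], [1, 11, 21]],
    [[2, 12, 22], [3, 13, 23]],
]
chw = hwc_to_chw(image)
print(chw)
```

In the C++ code the same mapping would write into the float buffer that is later copied to bindings_[0] with cudaMemcpy.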

6 Test results

[Image: segmentation test result]
