torchvision: a computer vision toolkit
It contains:
- torchvision.transforms (common image preprocessing methods);
- torchvision.datasets (Dataset implementations of common datasets: MNIST, CIFAR-10, ImageNet, etc.);
- torchvision.models (common pretrained models: AlexNet, VGG, ResNet, GoogLeNet, etc.).
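A minimal sketch of how the datasets and models submodules are used (the './data' root is an illustrative choice, and newer torchvision versions replace pretrained=True with a weights argument):

from torchvision import datasets, models, transforms

# A standard dataset from torchvision.datasets (downloaded on first use)
mnist = datasets.MNIST(root='./data', train=True, download=True,
                       transform=transforms.ToTensor())

# A pretrained model from torchvision.models
alexnet = models.alexnet(pretrained=True)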
torchvision.transforms
Common data preprocessing methods that improve generalization, including: centering, standardization, scaling, cropping, rotation, padding, noise injection, grayscale conversion, linear transforms, affine transforms, and brightness/saturation/contrast adjustment.
Standardization: transforms.Normalize
Purpose: standardize an image channel by channel, which can speed up model convergence (with the right statistics, the mean becomes 0 and the standard deviation becomes 1).
- output = (input - mean) / std
- mean: per-channel mean
- std: per-channel standard deviation
- inplace: whether to operate in place
Question: what does this actually mean? And where do the values (0.5, 0.5, 0.5), (0.5, 0.5, 0.5) come from?
transforms.ToTensor(),
transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
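In practice these two lines sit inside a transforms.Compose pipeline, e.g.:

from torchvision import transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])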
1. transforms.ToTensor()
Converts data in the original PIL Image or numpy.ndarray format into a tensor type that PyTorch can process efficiently. It does two things:
- Reshapes the input from (H, W, C) to (C, H, W). An image read with OpenCV has its channels in the last dimension; after ToTensor it becomes a torch image with channels in the first dimension.
- Divides every value by 255, normalizing the data to [0, 1].
Example:
import numpy as np
from torchvision import transforms

# Build an image array by hand. The dtype must be 'uint8',
# otherwise transforms.ToTensor() will not rescale the values.
data = np.array([
[[1,1,1],[1,1,1],[1,1,1],[1,1,1],[1,1,1]],
[[2,2,2],[2,2,2],[2,2,2],[2,2,2],[2,2,2]],
[[3,3,3],[3,3,3],[3,3,3],[3,3,3],[3,3,3]],
[[4,4,4],[4,4,4],[4,4,4],[4,4,4],[4,4,4]],
[[5,5,5],[5,5,5],[5,5,5],[5,5,5],[5,5,5]]
],dtype='uint8')
print(data)
print(data.shape) #(5,5,3)
data = transforms.ToTensor()(data)
print(data)
print(data.shape) #(3,5,5)
Output (each value is the original integer divided by 255, e.g. 1/255 ≈ 0.0039):
tensor([[[0.0039, 0.0039, 0.0039, 0.0039, 0.0039],
[0.0078, 0.0078, 0.0078, 0.0078, 0.0078],
[0.0118, 0.0118, 0.0118, 0.0118, 0.0118],
[0.0157, 0.0157, 0.0157, 0.0157, 0.0157],
[0.0196, 0.0196, 0.0196, 0.0196, 0.0196]],
[[0.0039, 0.0039, 0.0039, 0.0039, 0.0039],
[0.0078, 0.0078, 0.0078, 0.0078, 0.0078],
[0.0118, 0.0118, 0.0118, 0.0118, 0.0118],
[0.0157, 0.0157, 0.0157, 0.0157, 0.0157],
[0.0196, 0.0196, 0.0196, 0.0196, 0.0196]],
[[0.0039, 0.0039, 0.0039, 0.0039, 0.0039],
[0.0078, 0.0078, 0.0078, 0.0078, 0.0078],
[0.0118, 0.0118, 0.0118, 0.0118, 0.0118],
[0.0157, 0.0157, 0.0157, 0.0157, 0.0157],
[0.0196, 0.0196, 0.0196, 0.0196, 0.0196]]])
2. transforms.Normalize()
x = (x - mean) / std
That is, each value has its channel's mean subtracted and is then divided by that channel's standard deviation, supposedly mapping the normalized data into [-1, 1]. But is that really true?
- Computing mean and std
We need to compute the mean and std over a batch of data, as follows:
import torch
import numpy as np
from torchvision import transforms

# Using the single sample created above as the example
data = np.array([
[[1,1,1],[1,1,1],[1,1,1],[1,1,1],[1,1,1]],
[[2,2,2],[2,2,2],[2,2,2],[2,2,2],[2,2,2]],
[[3,3,3],[3,3,3],[3,3,3],[3,3,3],[3,3,3]],
[[4,4,4],[4,4,4],[4,4,4],[4,4,4],[4,4,4]],
[[5,5,5],[5,5,5],[5,5,5],[5,5,5],[5,5,5]]
],dtype='uint8')
# Convert to (C, H, W) and rescale to [0, 1]
data = transforms.ToTensor()(data)
# Add a batch dimension
data = torch.unsqueeze(data,0)
nb_samples = 0.
# Per-channel accumulators
channel_mean = torch.zeros(3)  # tensor([0., 0., 0.])
channel_std = torch.zeros(3)   # tensor([0., 0., 0.])
print(data.shape)  # torch.Size([1, 3, 5, 5])
# tensor([[[[0.0039, 0.0039, 0.0039, 0.0039, 0.0039],
#           [0.0078, 0.0078, 0.0078, 0.0078, 0.0078],
#           [0.0118, 0.0118, 0.0118, 0.0118, 0.0118],
#           [0.0157, 0.0157, 0.0157, 0.0157, 0.0157],
#           [0.0196, 0.0196, 0.0196, 0.0196, 0.0196]],
#          [[0.0039, 0.0039, 0.0039, 0.0039, 0.0039],
#           [0.0078, 0.0078, 0.0078, 0.0078, 0.0078],
#           [0.0118, 0.0118, 0.0118, 0.0118, 0.0118],
#           [0.0157, 0.0157, 0.0157, 0.0157, 0.0157],
#           [0.0196, 0.0196, 0.0196, 0.0196, 0.0196]],
#          [[0.0039, 0.0039, 0.0039, 0.0039, 0.0039],
#           [0.0078, 0.0078, 0.0078, 0.0078, 0.0078],
#           [0.0118, 0.0118, 0.0118, 0.0118, 0.0118],
#           [0.0157, 0.0157, 0.0157, 0.0157, 0.0157],
#           [0.0196, 0.0196, 0.0196, 0.0196, 0.0196]]]])
N, C, H, W = data.shape[:4] #N=1 C=3 H=5 W=5
data = data.view(N, C, -1)  # flatten H and W: shape becomes (batch, channel, pixels), so mean and std can be taken per channel
print(data.shape)  # torch.Size([1, 3, 25])
# tensor([[[0.0039, 0.0039, 0.0039, 0.0039, 0.0039, 0.0078, 0.0078, 0.0078,
#           0.0078, 0.0078, 0.0118, 0.0118, 0.0118, 0.0118, 0.0118, 0.0157,
#           0.0157, 0.0157, 0.0157, 0.0157, 0.0196, 0.0196, 0.0196, 0.0196,
#           0.0196],
#          [0.0039, 0.0039, 0.0039, 0.0039, 0.0039, 0.0078, 0.0078, 0.0078,
#           0.0078, 0.0078, 0.0118, 0.0118, 0.0118, 0.0118, 0.0118, 0.0157,
#           0.0157, 0.0157, 0.0157, 0.0157, 0.0196, 0.0196, 0.0196, 0.0196,
#           0.0196],
#          [0.0039, 0.0039, 0.0039, 0.0039, 0.0039, 0.0078, 0.0078, 0.0078,
#           0.0078, 0.0078, 0.0118, 0.0118, 0.0118, 0.0118, 0.0118, 0.0157,
#           0.0157, 0.0157, 0.0157, 0.0157, 0.0196, 0.0196, 0.0196, 0.0196,
#           0.0196]]])
# After flattening, the pixels live in dim 2; mean(2) averages over them, and sum(0) accumulates over the batch dimension
channel_mean += data.mean(2).sum(0)  # tensor([0.0118, 0.0118, 0.0118])
# Likewise for the standard deviation
channel_std += data.std(2).sum(0)  # tensor([0.0057, 0.0057, 0.0057])
# Count the samples in this batch (here, 1)
nb_samples += N
# Divide by the sample count to get the per-channel mean and std
channel_mean /= nb_samples
channel_std /= nb_samples
print(channel_mean, channel_std) #tensor([0.0118, 0.0118, 0.0118]) tensor([0.0057, 0.0057, 0.0057])
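For a single batch like this, the same statistics can also be read off with direct tensor reductions (a sketch; note that averaging per-sample stds, as above, only approximates the std of the whole dataset):

# data still has shape (N, C, H*W) at this point
mean_direct = data.mean(dim=(0, 2))   # per-channel mean over batch and pixels
std_direct = data.std(dim=2).mean(0)  # per-sample std, averaged over the batch
print(mean_direct, std_direct)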
This gives us the mean and standard deviation. Now plug them into the formula:
x = (x - mean) / std
- Manual implementation:
data = np.array([
[[1,1,1],[1,1,1],[1,1,1],[1,1,1],[1,1,1]],
[[2,2,2],[2,2,2],[2,2,2],[2,2,2],[2,2,2]],
[[3,3,3],[3,3,3],[3,3,3],[3,3,3],[3,3,3]],
[[4,4,4],[4,4,4],[4,4,4],[4,4,4],[4,4,4]],
[[5,5,5],[5,5,5],[5,5,5],[5,5,5],[5,5,5]]
],dtype='uint8')
data = transforms.ToTensor()(data)
print(data)
# tensor([[[0.0039, 0.0039, 0.0039, 0.0039, 0.0039],
#          [0.0078, 0.0078, 0.0078, 0.0078, 0.0078],
#          [0.0118, 0.0118, 0.0118, 0.0118, 0.0118],
#          [0.0157, 0.0157, 0.0157, 0.0157, 0.0157],
#          [0.0196, 0.0196, 0.0196, 0.0196, 0.0196]],
#         [[0.0039, 0.0039, 0.0039, 0.0039, 0.0039],
#          [0.0078, 0.0078, 0.0078, 0.0078, 0.0078],
#          [0.0118, 0.0118, 0.0118, 0.0118, 0.0118],
#          [0.0157, 0.0157, 0.0157, 0.0157, 0.0157],
#          [0.0196, 0.0196, 0.0196, 0.0196, 0.0196]],
#         [[0.0039, 0.0039, 0.0039, 0.0039, 0.0039],
#          [0.0078, 0.0078, 0.0078, 0.0078, 0.0078],
#          [0.0118, 0.0118, 0.0118, 0.0118, 0.0118],
#          [0.0157, 0.0157, 0.0157, 0.0157, 0.0157],
#          [0.0196, 0.0196, 0.0196, 0.0196, 0.0196]]])
for i in range(3):
    data[i, :, :] = (data[i, :, :] - channel_mean[i]) / channel_std[i]
print(data)
Output:
tensor([[[-1.3856, -1.3856, -1.3856, -1.3856, -1.3856],
[-0.6928, -0.6928, -0.6928, -0.6928, -0.6928],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.6928, 0.6928, 0.6928, 0.6928, 0.6928],
[ 1.3856, 1.3856, 1.3856, 1.3856, 1.3856]],
[[-1.3856, -1.3856, -1.3856, -1.3856, -1.3856],
[-0.6928, -0.6928, -0.6928, -0.6928, -0.6928],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.6928, 0.6928, 0.6928, 0.6928, 0.6928],
[ 1.3856, 1.3856, 1.3856, 1.3856, 1.3856]],
[[-1.3856, -1.3856, -1.3856, -1.3856, -1.3856],
[-0.6928, -0.6928, -0.6928, -0.6928, -0.6928],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.6928, 0.6928, 0.6928, 0.6928, 0.6928],
[ 1.3856, 1.3856, 1.3856, 1.3856, 1.3856]]])
- Official implementation:
data = np.array([
[[1,1,1],[1,1,1],[1,1,1],[1,1,1],[1,1,1]],
[[2,2,2],[2,2,2],[2,2,2],[2,2,2],[2,2,2]],
[[3,3,3],[3,3,3],[3,3,3],[3,3,3],[3,3,3]],
[[4,4,4],[4,4,4],[4,4,4],[4,4,4],[4,4,4]],
[[5,5,5],[5,5,5],[5,5,5],[5,5,5],[5,5,5]]
],dtype='uint8')
data = transforms.ToTensor()(data)
data = transforms.Normalize(channel_mean, channel_std)(data)
print(data)
Output:
tensor([[[-1.3856, -1.3856, -1.3856, -1.3856, -1.3856],
[-0.6928, -0.6928, -0.6928, -0.6928, -0.6928],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.6928, 0.6928, 0.6928, 0.6928, 0.6928],
[ 1.3856, 1.3856, 1.3856, 1.3856, 1.3856]],
[[-1.3856, -1.3856, -1.3856, -1.3856, -1.3856],
[-0.6928, -0.6928, -0.6928, -0.6928, -0.6928],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.6928, 0.6928, 0.6928, 0.6928, 0.6928],
[ 1.3856, 1.3856, 1.3856, 1.3856, 1.3856]],
[[-1.3856, -1.3856, -1.3856, -1.3856, -1.3856],
[-0.6928, -0.6928, -0.6928, -0.6928, -0.6928],
[ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
[ 0.6928, 0.6928, 0.6928, 0.6928, 0.6928],
[ 1.3856, 1.3856, 1.3856, 1.3856, 1.3856]]])
Looking at the result, we see that standardizing with the computed mean and std does not confine the values to [-1, 1].
Conclusion:
Data processed this way follows a standard normal distribution, i.e. mean 0 and standard deviation 1, which makes the model easier to train. It is NOT normalized to [-1, 1]!
Before Normalize, the data lies in [0, 1].
After Normalize:
- (1) If mean = std = (0.5, 0.5, 0.5), the data is mapped into [-1, 1], because the minimum becomes (0 - mean)/std = (0 - 0.5)/0.5 = -1, and likewise the maximum becomes (1 - 0.5)/0.5 = 1.
- (2) If mean and std are the statistics computed from the data itself, then by the formula (x - mean)/std and the properties of mean and variance, the new data has mean 0 and variance 1.
Summary:
Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) really does normalize the data into [-1, 1];
Normalize(channel_mean, channel_std) is what yields mean 0 and standard deviation 1.
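A quick check of case (1) (a minimal sketch; any tensor in [0, 1], as produced by ToTensor, works):

import torch
from torchvision import transforms

x = torch.rand(3, 5, 5)  # values in [0, 1]
y = transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))(x)
print(y.min().item(), y.max().item())  # both lie within [-1, 1]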
transforms: data augmentation
Data augmentation (also called data expansion) applies transformations to the training set to make it richer, so the model generalizes better.
transforms: cropping
1. transforms.CenterCrop
- Purpose: crop a patch from the center of the image
- size: the desired crop size
2. transforms.RandomCrop
- Purpose: randomly crop a patch of the given size from the image (see the usage sketch after this list)
- size: the desired crop size
- padding: padding size. A single value a pads a pixels on all four sides; (a, b) pads a pixels left/right and b pixels top/bottom; (a, b, c, d) pads the left, top, right, and bottom by a, b, c, and d respectively
- pad_if_needed: pad the image if it is smaller than the requested size
- padding_mode: (1) constant: padded pixels take the value given by fill (the default); (2) edge: padded pixels replicate the image border; (3) reflect: mirror padding that does not repeat the edge pixel, e.g. [1, 2, 3, 4] -> [3, 2, 1, 2, 3, 4, 3, 2]; (4) symmetric: mirror padding that repeats the edge pixel, e.g. [1, 2, 3, 4] -> [2, 1, 1, 2, 3, 4, 4, 3]
- fill: the pixel value used when padding_mode is constant
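A minimal usage sketch of these two crop transforms (the size of 32 and padding of 4 are arbitrary illustrative choices; pil_image stands for any PIL image loaded elsewhere):

from torchvision import transforms

random_crop = transforms.RandomCrop(
    size=32,                 # crop a 32x32 patch
    padding=4,               # first pad 4 pixels on every side
    pad_if_needed=True,      # pad more if the image is still too small
    padding_mode='reflect',  # mirror padding, edge pixel not repeated
)
center_crop = transforms.CenterCrop(32)
# cropped = random_crop(pil_image)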
transforms: flips and rotation
- RandomHorizontalFlip: flip the image horizontally with a given probability
- RandomVerticalFlip: flip the image vertically with a given probability
- RandomRotation: rotate the image by a random angle (a combined pipeline sketch follows this list)
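These flips and rotations are typically chained with cropping in one Compose pipeline (a sketch; the probability and angle range are arbitrary choices):

from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(p=0.5),  # flip left-right with probability 0.5
    transforms.RandomRotation(degrees=15),   # rotate by a random angle in [-15, 15]
    transforms.ToTensor(),
])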
Custom transforms
class Compose(object):
    def __init__(self, transforms):
        self.transforms = transforms
    def __call__(self, img):
        for t in self.transforms:
            img = t(img)
        return img
The transforms methods are invoked inside Compose's __call__ method, which loops over the list of transforms and applies each one in turn.

Requirements for a custom transform:
- It accepts exactly one argument and returns exactly one value
- The output and input types of adjacent transforms must match
To pass in extra parameters, implement the transform as a class. The skeleton below shows the basic structure: an __init__ and a __call__.
class YourTransforms(object):
    def __init__(self, ...):
        ...
    def __call__(self, img):
        ...
        return img
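For instance, a toy transform following this skeleton (MultiplyBy is a hypothetical example, not part of torchvision):

from torchvision import transforms

class MultiplyBy(object):
    """Toy custom transform: multiply the input tensor by a constant factor."""
    def __init__(self, factor):
        self.factor = factor
    def __call__(self, img):
        return img * self.factor

pipeline = transforms.Compose([
    transforms.ToTensor(),  # PIL image -> tensor in [0, 1]
    MultiplyBy(2.0),        # the custom transform slots in like any built-in
])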
Salt-and-pepper noise
- Salt-and-pepper noise, also called impulse noise, appears as randomly scattered white or black dots; the white dots are salt noise and the black dots are pepper noise.
- Signal-to-noise ratio (SNR): measures the noise level; for an image, it is the proportion of pixels that remain unchanged signal.
class AddPepperNoise(object):
    def __init__(self, snr, p):
        self.snr = snr  # signal-to-noise ratio: fraction of pixels kept as signal
        self.p = p      # probability of applying the noise at all
    def __call__(self, img):
        '''
        The actual salt-and-pepper noise logic goes here
        '''
        return img
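One possible implementation of __call__ (a sketch, assuming img is an RGB PIL image; each pixel is randomly assigned to signal, salt, or pepper according to the SNR):

import random
import numpy as np
from PIL import Image

class AddPepperNoise(object):
    """Add salt-and-pepper noise to a PIL image with probability p."""
    def __init__(self, snr, p=0.9):
        self.snr = snr  # fraction of pixels kept as signal
        self.p = p      # probability of applying the noise at all
    def __call__(self, img):
        if random.uniform(0, 1) > self.p:
            return img  # skip augmentation this time
        img_ = np.array(img).copy()
        h, w, c = img_.shape
        noise_pct = 1. - self.snr
        # Per-pixel choice: 0 = signal, 1 = salt, 2 = pepper
        mask = np.random.choice((0, 1, 2), size=(h, w, 1),
                                p=[self.snr, noise_pct / 2., noise_pct / 2.])
        mask = np.repeat(mask, c, axis=2)  # apply the same choice to all channels
        img_[mask == 1] = 255  # salt: white dots
        img_[mask == 2] = 0    # pepper: black dots
        return Image.fromarray(img_.astype('uint8')).convert('RGB')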