Reference: Progressive Image Deraining Networks: A Better and Simpler Baseline. Dongwei Ren (Tianjin University), Wangmeng Zuo (Harbin Institute of Technology), Qinghua Hu (Tianjin University), Pengfei Zhu (Tianjin University), and Deyu Meng (Xi'an Jiaotong University).
Paper download: https://arxiv.org/abs/1901.09221
Official implementation by the paper's authors: https://github.com/csdwren/PReNet
About the paper: its main contribution is PReNet, a deraining network architecture that is simple to implement yet performs very well. Although the techniques it uses are not the most advanced, its deraining results are excellent, which is why the authors present it as a strong baseline for other researchers to study and compare against. In my view, these same qualities also make PReNet an ideal model for newcomers to deep-learning-based deraining to study and implement.
The rest of this post walks through a very straightforward implementation, along with some of the underlying principles.
Experimental Environment
First, the environment. Beginners are advised to register an account on the site below (registration may require bypassing network restrictions; subsequent use does not), write PyTorch code in its online environment, and train the model on the free cloud GPUs the site provides. This spares you the trouble of configuring a local CUDA environment.
Kaggle: https://www.kaggle.com/
Click Code in the left sidebar.
Then click New Notebook to open the online programming environment.
The online environment is similar to Jupyter. Once inside, you can adjust the interface from the top menu (enabling line numbers makes code easier to read). On the right you can choose an accelerator; I recommend the GPU P100.
Once your code is written and debugged, click Save Version in the top-right corner to train the model on a cloud GPU. (Note that saved versions cannot be deleted manually, so make sure the code is fully debugged before clicking Save Version, to avoid clutter from too many versions.)
That covers the basics of the environment; the implementation steps are described in detail below.
Implementation
Image deraining follows the same pipeline as image classification: data processing --> model construction --> training --> logging and model saving. Each step is explained below alongside the code.
Data processing:
'''
Dataset for training.
'''
import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class MyTrainDataset(Dataset):
    def __init__(self, input_path, label_path):
        self.input_path = input_path
        self.input_files = os.listdir(input_path)
        self.label_path = label_path
        self.label_files = os.listdir(label_path)
        self.transforms = transforms.Compose([
            transforms.CenterCrop([64, 64]),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.input_files)

    def __getitem__(self, index):
        label_image_path = os.path.join(self.label_path, self.label_files[index])
        label_image = Image.open(label_image_path).convert('RGB')
        '''
        Ensure input and label form a pair: the rainy input shares the
        label's file name plus an 'x2' suffix, e.g. '001.png' -> '001x2.png'.
        '''
        temp = self.label_files[index][:-4]
        self.input_files[index] = temp + 'x2.png'
        input_image_path = os.path.join(self.input_path, self.input_files[index])
        input_image = Image.open(input_image_path).convert('RGB')
        input = self.transforms(input_image)
        label = self.transforms(label_image)
        return input, label

'''
Dataset for testing.
'''
class MyValidDataset(Dataset):
    def __init__(self, input_path, label_path):
        self.input_path = input_path
        self.input_files = os.listdir(input_path)
        self.label_path = label_path
        self.label_files = os.listdir(label_path)
        self.transforms = transforms.Compose([
            transforms.CenterCrop([64, 64]),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.input_files)

    def __getitem__(self, index):
        label_image_path = os.path.join(self.label_path, self.label_files[index])
        label_image = Image.open(label_image_path).convert('RGB')
        temp = self.label_files[index][:-4]
        self.input_files[index] = temp + 'x2.png'
        input_image_path = os.path.join(self.input_path, self.input_files[index])
        input_image = Image.open(input_image_path).convert('RGB')
        input = self.transforms(input_image)
        label = self.transforms(label_image)
        return input, label
The code above has two parts: Dataset subclasses for the training set and the test set. Subclassing Dataset is the most convenient way to customize how PyTorch processes a dataset. Since the training and test sets are handled identically, only the training class is explained here.
First, why do we specifically override the three Dataset methods __init__, __len__, and __getitem__? Because the DataLoader used later calls exactly these three methods of the Dataset object to obtain the data. The relationship can be understood as follows: DataLoader splits the data into batches so training proceeds batch by batch, while Dataset records overall information about the data and produces each individual input/label pair within a batch. In other words, Dataset only describes the dataset as a whole and handles one input/label pair at a time, while DataLoader applies Dataset's logic in a loop over the entire dataset. Therefore, for a different dataset, we override these three methods to change how DataLoader consumes the data.
With that understood, the roles of the three methods are easy to grasp: __init__ and __len__ record basic information about the dataset. __len__ must return the number of input items (this contract cannot change), while __init__ initializes whatever member variables you need (fully customizable); these are then used in __getitem__. __getitem__ produces one matched pair of data; the key requirement is that the returned input and label genuinely belong together. A minimal sketch of this protocol follows.
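To make the contract concrete, here is a minimal toy example (not part of the deraining pipeline; the ToyPairs class and its shapes are purely illustrative) showing how DataLoader drives __len__ and __getitem__:

import torch
from torch.utils.data import Dataset, DataLoader

class ToyPairs(Dataset):
    def __len__(self):
        return 8                                  # the dataset has 8 samples

    def __getitem__(self, index):
        x = torch.full((3, 4, 4), float(index))  # fake "input image"
        y = x + 1.0                               # fake "label image"
        return x, y

loader = DataLoader(ToyPairs(), batch_size=4, shuffle=True)
for x_batch, y_batch in loader:
    # DataLoader called __getitem__ four times and stacked the results.
    print(x_batch.shape, y_batch.shape)           # torch.Size([4, 3, 4, 4]) each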
Here is how the Dataset and DataLoader objects are created:
'''
Paths of the dataset.
'''
input_path = "../input/jrdr-deraining-dataset/JRDR/rain_data_train_Heavy/rain/X2"
label_path = "../input/jrdr-deraining-dataset/JRDR/rain_data_train_Heavy/norain"
valid_input_path = '../input/jrdr-deraining-dataset/JRDR/rain_data_test_Heavy/rain/X2'
valid_label_path = '../input/jrdr-deraining-dataset/JRDR/rain_data_test_Heavy/norain'

'''
Prepare DataLoaders.
Attention:
    'pin_memory=True' can accelerate CUDA computing.
'''
# batch_size is set in the hyperparameter section of the full notebook.
dataset_train = MyTrainDataset(input_path, label_path)
dataset_valid = MyValidDataset(valid_input_path, valid_label_path)
train_loader = DataLoader(dataset_train, batch_size=batch_size, shuffle=True, pin_memory=True)
valid_loader = DataLoader(dataset_valid, batch_size=batch_size, shuffle=True, pin_memory=True)
Note that I used an online dataset hosted on Kaggle (the JRDR deraining dataset referenced in the paths above); you can search for it and add it to your notebook yourself.
You can train on either the Heavy or the Light subset, corresponding to synthetic heavy-rain and light-rain data respectively. I recommend Heavy: a model trained on it removes rain from real rainy scenes more noticeably.
Model construction:
# Network architecture
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

class PReNet_r(nn.Module):
    def __init__(self, recurrent_iter=6, use_GPU=True):
        super(PReNet_r, self).__init__()
        self.iteration = recurrent_iter
        self.use_GPU = use_GPU

        # f_in: fuses the rainy input and the current estimate (3 + 3 channels)
        self.conv0 = nn.Sequential(
            nn.Conv2d(6, 32, 3, 1, 1),
            nn.ReLU()
        )
        # Residual block shared by all 5 ResBlocks (recursive weight sharing)
        self.res_conv1 = nn.Sequential(
            nn.Conv2d(32, 32, 3, 1, 1),
            nn.ReLU(),
            nn.Conv2d(32, 32, 3, 1, 1),
            nn.ReLU()
        )
        # Convolutional LSTM gates: input, forget, cell and output
        self.conv_i = nn.Sequential(
            nn.Conv2d(32 + 32, 32, 3, 1, 1),
            nn.Sigmoid()
        )
        self.conv_f = nn.Sequential(
            nn.Conv2d(32 + 32, 32, 3, 1, 1),
            nn.Sigmoid()
        )
        self.conv_g = nn.Sequential(
            nn.Conv2d(32 + 32, 32, 3, 1, 1),
            nn.Tanh()
        )
        self.conv_o = nn.Sequential(
            nn.Conv2d(32 + 32, 32, 3, 1, 1),
            nn.Sigmoid()
        )
        # f_out: maps the features back to a 3-channel image
        self.conv = nn.Sequential(
            nn.Conv2d(32, 3, 3, 1, 1),
        )

    def forward(self, input):
        batch_size, row, col = input.size(0), input.size(2), input.size(3)

        x = input
        h = Variable(torch.zeros(batch_size, 32, row, col))
        c = Variable(torch.zeros(batch_size, 32, row, col))

        if self.use_GPU:
            h = h.cuda()
            c = c.cuda()

        x_list = []
        for _ in range(self.iteration):
            x = torch.cat((input, x), 1)
            x = self.conv0(x)

            # ConvLSTM update of the recurrent state (h, c)
            x = torch.cat((x, h), 1)
            i = self.conv_i(x)
            f = self.conv_f(x)
            g = self.conv_g(x)
            o = self.conv_o(x)
            c = f * c + i * g
            h = o * torch.tanh(c)

            # 5 residual blocks with shared weights
            x = h
            for j in range(5):
                resx = x
                x = F.relu(self.res_conv1(x) + resx)

            # Global residual: add the prediction back onto the rainy input
            x = self.conv(x)
            x = input + x
            x_list.append(x)

        return x, x_list
I took the network architecture directly from the paper's official code. In short, the network combines a convolutional LSTM with recursive (weight-shared) residual blocks: at each of its six stages it fuses the rainy input with the current estimate, updates the recurrent state, refines the features through five shared ResBlocks, and adds the result back onto the input. You can use it as-is without fully understanding it at first; for a deeper look, see the paper linked at the start of this post or search this site for a translation of the paper. A quick smoke test of the module appears below.
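As a sanity check that the module wires up correctly, you can run a quick forward pass on random data (an illustrative test run on CPU with use_GPU=False; it is not part of the training script):

net = PReNet_r(recurrent_iter=6, use_GPU=False)
dummy = torch.randn(2, 3, 64, 64)   # a batch of two 64x64 RGB crops
out, out_list = net(dummy)
print(out.shape)                    # torch.Size([2, 3, 64, 64])
print(len(out_list))                # 6: one intermediate estimate per stage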
Training:
'''
Define the optimizer and the loss function.
'''
import torch.optim as optim
from torch.optim.lr_scheduler import CosineAnnealingLR

# net, learning_rate and epoch are defined in the full notebook.
optimizer = optim.RAdam(net.parameters(), lr=learning_rate)
scheduler = CosineAnnealingLR(optimizer, T_max=epoch)
loss_f = SSIM()
First initialize the optimizer and the loss function. I use the RAdam optimizer (Rectified Adam, which rectifies the variance of the adaptive learning rate early in training and thus behaves like Adam with a built-in warm-up) together with CosineAnnealingLR (cosine annealing), which makes the learning rate follow a cosine curve over the training epochs to improve the final result. A small demonstration of the schedule follows.
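To see what the scheduler actually does, here is a minimal sketch (a toy parameter and illustrative numbers, separate from the real training setup) that prints the learning rate decaying along a cosine curve from 1e-3 towards 0 over T_max epochs:

import torch
import torch.optim as optim
from torch.optim.lr_scheduler import CosineAnnealingLR

params = [torch.nn.Parameter(torch.zeros(1))]
opt = optim.RAdam(params, lr=1e-3)
sched = CosineAnnealingLR(opt, T_max=10)
for e in range(10):
    opt.step()    # in real training this follows loss.backward()
    sched.step()
    print(e, opt.param_groups[0]['lr'])   # falls along a cosine curve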
The SSIM loss function uses the implementation provided by the paper's authors:
# SSIM loss implementation
import torch
import torch.nn.functional as F
from torch.autograd import Variable
from math import exp

def gaussian(window_size, sigma):
    gauss = torch.Tensor([exp(-(x - window_size//2)**2/float(2*sigma**2)) for x in range(window_size)])
    return gauss/gauss.sum()

def create_window(window_size, channel):
    _1D_window = gaussian(window_size, 1.5).unsqueeze(1)
    _2D_window = _1D_window.mm(_1D_window.t()).float().unsqueeze(0).unsqueeze(0)
    window = Variable(_2D_window.expand(channel, 1, window_size, window_size).contiguous())
    return window

def _ssim(img1, img2, window, window_size, channel, size_average=True):
    # Local means via a Gaussian-weighted convolution
    mu1 = F.conv2d(img1, window, padding=window_size//2, groups=channel)
    mu2 = F.conv2d(img2, window, padding=window_size//2, groups=channel)
    mu1_sq = mu1.pow(2)
    mu2_sq = mu2.pow(2)
    mu1_mu2 = mu1*mu2
    # Local variances and covariance
    sigma1_sq = F.conv2d(img1*img1, window, padding=window_size//2, groups=channel) - mu1_sq
    sigma2_sq = F.conv2d(img2*img2, window, padding=window_size//2, groups=channel) - mu2_sq
    sigma12 = F.conv2d(img1*img2, window, padding=window_size//2, groups=channel) - mu1_mu2
    C1 = 0.01**2
    C2 = 0.03**2
    ssim_map = ((2*mu1_mu2 + C1)*(2*sigma12 + C2))/((mu1_sq + mu2_sq + C1)*(sigma1_sq + sigma2_sq + C2))
    if size_average:
        return ssim_map.mean()
    else:
        return ssim_map.mean(1).mean(1).mean(1)

class SSIM(torch.nn.Module):
    def __init__(self, window_size=11, size_average=True):
        super(SSIM, self).__init__()
        self.window_size = window_size
        self.size_average = size_average
        self.channel = 1
        self.window = create_window(window_size, self.channel)

    def forward(self, img1, img2):
        (_, channel, _, _) = img1.size()
        # Rebuild the window only if the channel count or dtype changed
        if channel == self.channel and self.window.data.type() == img1.data.type():
            window = self.window
        else:
            window = create_window(self.window_size, channel)
            if img1.is_cuda:
                window = window.cuda(img1.get_device())
            window = window.type_as(img1)
            self.window = window
            self.channel = channel
        return _ssim(img1, img2, window, self.window_size, channel, self.size_average)

def ssim(img1, img2, window_size=11, size_average=True):
    (_, channel, _, _) = img1.size()
    window = create_window(window_size, channel)
    if img1.is_cuda:
        window = window.cuda(img1.get_device())
    window = window.type_as(img1)
    return _ssim(img1, img2, window, window_size, channel, size_average)
SSIM (structural similarity) is an algorithm for assessing how similar two images are; its details are omitted here. All you need to remember is that a higher value means the two images are more alike, and a value of 1 means they are identical. Since the optimizer minimizes the loss while we want to maximize similarity, we negate the SSIM value during training. A small sanity check follows.
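Here is a small sanity check of the loss on random tensors (purely illustrative): identical images score 1, and the training objective is the negated score:

loss_f = SSIM()
a = torch.rand(1, 3, 64, 64)
print(loss_f(a, a).item())    # ~1.0: identical images
b = torch.rand(1, 3, 64, 64)
score = loss_f(a, b)          # higher = more similar
loss = -score                 # minimizing -SSIM maximizes SSIM
print(score.item(), loss.item())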
Here is the training loop:
'''
START training ...
'''
from tqdm import tqdm

# device, epoch and the bookkeeping lists are set up in the full notebook.
for i in range(epoch):
    # ---------------Train----------------
    net.train()
    train_losses = []
    '''
    tqdm is a toolkit that draws progress bars.
    '''
    for batch in tqdm(train_loader):
        inputs, labels = batch
        outputs, _ = net(inputs.to(device))
        loss = loss_f(labels.to(device), outputs)
        # Negate SSIM: maximizing similarity == minimizing the loss.
        loss = -loss
        optimizer.zero_grad()
        loss.backward()
        '''
        Keep the gradients from growing too large.
        '''
        grad_norm = nn.utils.clip_grad_norm_(net.parameters(), max_norm=10)
        optimizer.step()
        '''
        Attention:
            'loss.item()' turns the Tensor into a Python number,
            otherwise plt will not work later.
        '''
        train_losses.append(loss.item())
    train_loss = sum(train_losses) / len(train_losses)
    Loss_list.append(train_loss)
    print(f"[ Train | {i + 1:03d}/{epoch:03d} ] SSIM_loss = {train_loss:.5f}")

    scheduler.step()
    for param_group in optimizer.param_groups:
        learning_rate_list.append(param_group["lr"])
        print('learning rate %f' % param_group["lr"])

    # -------------Validation-------------
    '''
    Validation ensures the training process is working.
    You can also use it to check whether the network is overfitting.
    First call net.eval() so the parameters are not in training mode.
    '''
    net.eval()
    valid_losses = []
    for batch in tqdm(valid_loader):
        inputs, labels = batch
        '''
        Disable gradient computation.
        '''
        with torch.no_grad():
            outputs, _ = net(inputs.to(device))
            loss = loss_f(labels.to(device), outputs)
            loss = -loss
            valid_losses.append(loss.item())
    valid_loss = sum(valid_losses) / len(valid_losses)
    Valid_Loss_list.append(valid_loss)
    break_point = i + 1

    '''
    Update the logs and save the best model.
    Patience is also checked for early stopping.
    '''
    if valid_loss < best_valid_loss:
        print(f"[ Valid | {i + 1:03d}/{epoch:03d} ] SSIM_loss = {valid_loss:.5f} -> best")
        print(f'Best model found at epoch {i+1}, saving model')
        torch.save(net.state_dict(), f'model_best.ckpt')
        best_valid_loss = valid_loss
        stale = 0
    else:
        print(f"[ Valid | {i + 1:03d}/{epoch:03d} ] SSIM_loss = {valid_loss:.5f}")
        stale += 1
        if stale > patience:
            print(f'No improvement in {patience} consecutive epochs, early stopping.')
            break
See the comments for the details. break_point records the epoch at which training stopped, stale counts the consecutive epochs in which the model has failed to improve, and patience is the preset upper bound on that count, beyond which training stops early. A plausible initialization of these variables is sketched below.
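The bookkeeping variables used above must be initialized before the loop. The following is one plausible setup (the exact values live in the full notebook; the patience value here is illustrative). Since the loss is negative SSIM, any real validation loss beats an initial +inf:

best_valid_loss = float('inf')   # any real (negative SSIM) loss improves on this
stale = 0                        # consecutive epochs without improvement
patience = 20                    # illustrative value; tune to taste
Loss_list, Valid_Loss_list, learning_rate_list = [], [], []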
Logging and model saving:
Most of this is excerpted from the code above; since the instances are scattered, only a few typical ones are explained below:
Loss_list.append(train_loss)  # used later to plot the loss curve
print(f"[ Train | {i + 1:03d}/{epoch:03d} ] SSIM_loss = {train_loss:.5f}")
Printing log messages.
print(f'Best model found at epoch {i+1}, saving model')
torch.save(net.state_dict(), f'model_best.ckpt')
Saving the model. (The difference between the .ckpt and .pth file suffixes is not covered here; feel free to look it up yourself.) A sketch of loading the checkpoint back appears below.
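For completeness, here is a sketch of how the saved weights can be restored later, e.g. for inference on new images (assuming the PReNet_r definition and a device as set up above):

net = PReNet_r(recurrent_iter=6, use_GPU=True).to(device)
net.load_state_dict(torch.load('model_best.ckpt', map_location=device))
net.eval()   # switch off training-specific behaviour before inference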
'''
Use plt to draw the loss curves.
'''
import matplotlib.pyplot as plt

plt.figure(dpi=500)

plt.subplot(211)
x = range(break_point)
y = Loss_list
plt.plot(x, y, 'ro-', label='Train Loss')
plt.plot(range(break_point), Valid_Loss_list, 'bs-', label='Valid Loss')
plt.ylabel('Loss')
plt.xlabel('epochs')
plt.legend()   # without this call, the first subplot has no legend

plt.subplot(212)
plt.plot(x, learning_rate_list, 'ro-', label='Learning rate')
plt.ylabel('Learning rate')
plt.xlabel('epochs')
plt.legend()

plt.show()
Use the matplotlib library to plot how the key quantities change during training.
Experimental results:
First, links to my complete code (note that the code shown above is incomplete; I omitted some details, so copy-pasting it directly will not run):
Kaggle: https://www.kaggle.com/code/leeding123/prenet
GitHub repo (stars welcome): https://github.com/DLee0102/Derain_platform/blob/f3249f6ee4f14055bf30c53239141bccecdcb0f2/prenet.ipynb
My training results on the Heavy training set:
Note: in the loss curves, red is the training-set loss and blue is the test-set loss.
Deraining results on the synthetic dataset:
Deraining results on the real-world dataset: