0 Preface
This installment of the quality competition project series covers:
Deep-learning CAPTCHA recognition - machine vision, Python, OpenCV
The project is fairly novel and well suited as a competition topic. I highly recommend it!
Here is my overall rating for the topic (each item out of 5):
- Difficulty: 3
- Workload: 3
- Novelty: 4
More resources and project sharing:
https://gitee.com/dancheng-senior/postgraduate
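1 Project overview
When scraping certain websites with a Python crawler, you may run into CAPTCHAs that have to be recognized. Today's CAPTCHAs mostly fall into four categories:
1. Arithmetic CAPTCHAs
2. Slider CAPTCHAs
3. Image-recognition CAPTCHAs
4. Audio CAPTCHAs
This post mainly covers image-recognition CAPTCHAs, and only simple ones. To raise the recognition rate and make recognition more accurate, you would need to put considerable effort into training your own font library.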
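2 CAPTCHA recognition steps
1. Grayscale conversion
2. Binarization
3. Border removal (if there is one)
4. Denoising
5. Character cutting or skew correction
6. Training the font library
7. Recognition
Of these 7 steps, the first three are basic. Steps 4 and 5 are optional, depending on the actual CAPTCHA. Cutting the CAPTCHA into characters does not necessarily raise the recognition rate much, and sometimes even lowers it.
This post does not cover training the font library (please look that up yourself), nor does it explain basic syntax.
The main Python libraries used: Pillow (Python imaging library), OpenCV (advanced image processing library), pytesseract (recognition library)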
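2.1 Grayscale conversion & binarization
Grayscale conversion turns the color CAPTCHA image into a gray one.
Binarization reduces the image to pure black and white, which helps the later processing and recognition steps.
OpenCV has ready-made methods for both grayscale conversion and binarization: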
import cv2

# Adaptive-threshold binarization
def _get_dynamic_binary_image(filedir, img_name):
    filename = './out_img/' + img_name.split('.')[0] + '-binary.jpg'
    img_name = filedir + '/' + img_name
    print('.....' + img_name)
    im = cv2.imread(img_name)
    im = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)  # convert to grayscale
    # Binarize: Gaussian-weighted 21x21 neighborhood, constant C = 1
    th1 = cv2.adaptiveThreshold(im, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 21, 1)
    cv2.imwrite(filename, th1)
    return th1
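To make the last two arguments of cv2.adaptiveThreshold (21 and 1) less opaque, here is a rough NumPy sketch of the idea behind adaptive thresholding: each pixel is compared against the average of its own neighborhood minus a constant. This is a simplified mean-based variant, not the Gaussian-weighted one used in the function, and it is not meant to reproduce OpenCV's output exactly. The name adaptive_mean_threshold and the tiny test image are made up for illustration.

```python
import numpy as np

def adaptive_mean_threshold(gray, block_size=3, c=1):
    """Binarize `gray` (2-D uint8) against the local mean minus `c`."""
    pad = block_size // 2
    # Replicate edge pixels so every pixel has a full neighborhood.
    padded = np.pad(gray.astype(np.float64), pad, mode='edge')
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for row in range(h):
        for col in range(w):
            window = padded[row:row + block_size, col:col + block_size]
            threshold = window.mean() - c
            out[row, col] = 255 if gray[row, col] > threshold else 0
    return out

# Dark stroke (value 40) on a background that gets brighter from left to right.
gray = np.array([[100, 150, 200, 250],
                 [100,  40, 200, 250],
                 [100, 150, 200, 250]], dtype=np.uint8)
binary = adaptive_mean_threshold(gray, block_size=3, c=1)
```

Because the threshold follows the local brightness, the dark stroke pixel ends up black even though the background is uneven. A single global threshold would have to be tuned per image.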
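2.2 Border removal
If the CAPTCHA has a border, we need to remove it. Removing it means walking over the pixels, finding every point on the four edges, and setting them to white. The border here is two pixels wide.
Note: in OpenCV the image matrix indices are "reversed": the first index is the row (height) and the second is the column (width), so img.shape is (height, width).
Code: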
# Remove the border
def clear_border(img, img_name):
    filename = './out_img/' + img_name.split('.')[0] + '-clearBorder.jpg'
    h, w = img.shape[:2]
    for y in range(0, w):        # y: column index
        for x in range(0, h):    # x: row index
            # the border is 2 pixels wide: clear the first and last two columns and rows
            if y < 2 or y >= w - 2:
                img[x, y] = 255
            if x < 2 or x >= h - 2:
                img[x, y] = 255
    cv2.imwrite(filename, img)
    return img
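Since OpenCV images are NumPy arrays indexed as [row, column], the same border clearing can also be written with slices instead of a pixel loop. The sketch below assumes a single-channel image and a border width of at least 1. The name clear_border_sliced is hypothetical, and unlike clear_border it works on a copy and writes no file.

```python
import numpy as np

def clear_border_sliced(img, border=2):
    """Paint a `border`-pixel frame white on a copy of a 2-D grayscale image."""
    out = img.copy()
    out[:border, :] = 255    # top rows
    out[-border:, :] = 255   # bottom rows
    out[:, :border] = 255    # left columns
    out[:, -border:] = 255   # right columns
    return out

img = np.zeros((6, 8), dtype=np.uint8)   # 6 rows (height) x 8 columns (width)
cleared = clear_border_sliced(img)
print(cleared)
```

Only the inner 2 x 4 block stays black in this example, which is a quick way to convince yourself that the first index really is the height.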
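2.3 Image denoising
Denoising is one of the more important steps in CAPTCHA processing. Here I use both point denoising and line denoising.
Line denoising looks at the four pixels adjacent to the current point (the green points marked in the figure) and counts how many of them are white. If more than two are white, the current point is set to white as well, which removes the interference line. The method has limits: it only removes thin interference lines and cannot remove a particularly thick one.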
# Interference-line denoising
def interference_line(img, img_name):
    filename = './out_img/' + img_name.split('.')[0] + '-interferenceline.jpg'
    h, w = img.shape[:2]
    # Note: OpenCV indexes the matrix as img[row, column]
    # img[1, 2] -> 1: position along the height, 2: position along the width
    for y in range(1, w - 1):
        for x in range(1, h - 1):
            count = 0
            if img[x, y - 1] > 245:
                count = count + 1
            if img[x, y + 1] > 245:
                count = count + 1
            if img[x - 1, y] > 245:
                count = count + 1
            if img[x + 1, y] > 245:
                count = count + 1
            # more than two white neighbors: treat this pixel as white too
            if count > 2:
                img[x, y] = 255
    cv2.imwrite(filename, img)
    return img
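A small experiment helps to see why the rule removes thin lines but not thick ones. The sketch below applies the same four-neighbor rule in place, without the file I/O. The name remove_thin_lines and the two toy images are made up for illustration. Note that a pixel in the middle of a one-pixel line has only two white neighbors at first, so the line is eaten away from its end: each whitened pixel gives the next one a third white neighbor.

```python
import numpy as np

def remove_thin_lines(img, white=245):
    """Apply the four-neighbor rule in place, column by column."""
    h, w = img.shape[:2]
    for y in range(1, w - 1):          # column
        for x in range(1, h - 1):      # row
            neighbors = (img[x, y - 1], img[x, y + 1], img[x - 1, y], img[x + 1, y])
            if sum(1 for value in neighbors if value > white) > 2:
                img[x, y] = 255
    return img

thin = np.full((5, 7), 255, dtype=np.uint8)
thin[2, 1:6] = 0        # one-pixel-high line next to a white left column
thick = np.full((6, 7), 255, dtype=np.uint8)
thick[2:4, 1:6] = 0     # two-pixel-high line

thin_result = remove_thin_lines(thin.copy())
thick_result = remove_thin_lines(thick.copy())
print((thin_result == 255).all(), (thick_result == thick).all())
```

In this toy case the thin line disappears completely while the two-pixel line is left untouched, which matches the limitation described above. It also suggests that running border removal first matters, since the erosion needs a white pixel to start from.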
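Point denoising follows much the same idea as line denoising, except that the set of neighboring points checked depends on where the pixel sits (corner, edge, or interior). The comments in the code spell this out.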
# Point denoising
def interference_point(img, img_name, x=0, y=0):
    """
    9-neighborhood box: the 3x3 window centered on the current pixel (smaller on
    edges and corners). If the window is mostly dark, the pixel is set to black.
    img is indexed as img[row, column]; here x is the row and y is the column.
    :param x: unused (overwritten by the loop)
    :param y: unused (overwritten by the loop)
    :return: the processed image
    """
    filename = './out_img/' + img_name.split('.')[0] + '-interferencePoint.jpg'
    # todo: check the lower bound of the image width and height
    height, width = img.shape[:2]
    for y in range(0, width):
        for x in range(0, height):
            cur_pixel = img[x, y]  # value of the current pixel
            if y == 0:  # leftmost column
                if x == 0:  # top-left corner, 4-neighborhood
                    # the current pixel plus the 3 points next to it
                    total = int(cur_pixel) \
                        + int(img[x, y + 1]) \
                        + int(img[x + 1, y]) \
                        + int(img[x + 1, y + 1])
                    if total <= 2 * 245:
                        img[x, y] = 0
                elif x == height - 1:  # bottom-left corner
                    total = int(cur_pixel) \
                        + int(img[x, y + 1]) \
                        + int(img[x - 1, y]) \
                        + int(img[x - 1, y + 1])
                    if total <= 2 * 245:
                        img[x, y] = 0
                else:  # left edge, not a corner, 6-neighborhood
                    total = int(img[x - 1, y]) \
                        + int(img[x - 1, y + 1]) \
                        + int(cur_pixel) \
                        + int(img[x, y + 1]) \
                        + int(img[x + 1, y]) \
                        + int(img[x + 1, y + 1])
                    if total <= 3 * 245:
                        img[x, y] = 0
            elif y == width - 1:  # rightmost column
                if x == 0:  # top-right corner
                    # the current pixel plus the 3 points next to it
                    total = int(cur_pixel) \
                        + int(img[x + 1, y]) \
                        + int(img[x + 1, y - 1]) \
                        + int(img[x, y - 1])
                    if total <= 2 * 245:
                        img[x, y] = 0
                elif x == height - 1:  # bottom-right corner
                    total = int(cur_pixel) \
                        + int(img[x, y - 1]) \
                        + int(img[x - 1, y]) \
                        + int(img[x - 1, y - 1])
                    if total <= 2 * 245:
                        img[x, y] = 0
                else:  # right edge, not a corner, 6-neighborhood
                    total = int(cur_pixel) \
                        + int(img[x - 1, y]) \
                        + int(img[x + 1, y]) \
                        + int(img[x, y - 1]) \
                        + int(img[x - 1, y - 1]) \
                        + int(img[x + 1, y - 1])
                    if total <= 3 * 245:
                        img[x, y] = 0
            else:  # y is not on the left or right edge
                if x == 0:  # top edge, not a corner
                    total = int(img[x, y - 1]) \
                        + int(cur_pixel) \
                        + int(img[x, y + 1]) \
                        + int(img[x + 1, y - 1]) \
                        + int(img[x + 1, y]) \
                        + int(img[x + 1, y + 1])
                    if total <= 3 * 245:
                        img[x, y] = 0
                elif x == height - 1:  # bottom edge, not a corner
                    total = int(img[x, y - 1]) \
                        + int(cur_pixel) \
                        + int(img[x, y + 1]) \
                        + int(img[x - 1, y - 1]) \
                        + int(img[x - 1, y]) \
                        + int(img[x - 1, y + 1])
                    if total <= 3 * 245:
                        img[x, y] = 0
                else:  # interior pixel with a full 9-neighborhood
                    total = int(img[x - 1, y - 1]) \
                        + int(img[x - 1, y]) \
                        + int(img[x - 1, y + 1]) \
                        + int(img[x, y - 1]) \
                        + int(cur_pixel) \
                        + int(img[x, y + 1]) \
                        + int(img[x + 1, y - 1]) \
                        + int(img[x + 1, y]) \
                        + int(img[x + 1, y + 1])
                    if total <= 4 * 245:
                        img[x, y] = 0
    cv2.imwrite(filename, img)
    return img
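The nine branches above all follow one pattern: sum the window of 4, 6 or 9 pixels and compare it with 2, 3 or 4 times 245, that is, half the window size rounded down. Under that reading, the whole function can be condensed by letting NumPy slicing clip the window at the image edges. This is only a sketch of an equivalent formulation for images of at least 2 x 2 pixels. The name fill_dark_points is hypothetical and no file is written.

```python
import numpy as np

def fill_dark_points(img, white=245):
    """In place: blacken a pixel when its clipped 3x3 window is mostly dark."""
    height, width = img.shape[:2]
    for y in range(width):                  # column
        for x in range(height):             # row
            # Slicing clips the window at the image edges automatically.
            window = img[max(x - 1, 0):x + 2, max(y - 1, 0):y + 2]
            # 4, 6 or 9 pixels -> limit of 2, 3 or 4 times the white level.
            limit = (window.size // 2) * white
            if window.sum(dtype=np.int64) <= limit:
                img[x, y] = 0
    return img

img = np.full((5, 5), 255, dtype=np.uint8)
img[1:4, 1:4] = 0       # a 3x3 black blob ...
img[2, 2] = 255         # ... with a white hole in the middle
filled = fill_dark_points(img.copy())
print(filled)
```

In this toy image the white hole inside the blob is filled while the white background around it is left alone, which is what the rule is meant to do for speckles inside character strokes.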
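In fact, at this point the characters can already be recognized without any character cutting. The recognition rate for these three types of CAPTCHA is already above 50%.
2.4 Character cutting
Character cutting is typically used when characters in the CAPTCHA touch each other. Touching characters are hard to recognize, so we cut them into single characters before recognition.
The idea is to find a black point, then traverse the black points adjacent to it until all connected black points have been visited. Among those points, record the topmost, bottommost, rightmost and leftmost ones, and treat the region they bound as one character. Then keep scanning onward until the next black point is found and repeat. Finally, cut each character out using its four points.
In the figure, the red points are the four points marked for each character after the code runs. The image is then cut according to them (the drawing is slightly off, but you get the idea).
As you can also see, "m2" is stuck together and the code treats it as a single character. So we check the width of each character: if it is too wide, we treat it as two touching characters and cut it down the middle.
Code for finding the four points of each character: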
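from queue import Queue

def cfs(im, x_fd, y_fd):
    '''Use a queue and a set of visited pixel coordinates instead of plain
    recursion, to avoid exceeding the recursion depth.
    Here x is the row index and y is the column index.
    '''
    # print('**********')
    h, w = im.shape[:2]
    # start with the seed pixel so that it is part of the bounding box
    xaxis = [x_fd]
    yaxis = [y_fd]
    visited = set()
    q = Queue()
    q.put((x_fd, y_fd))
    visited.add((x_fd, y_fd))
    offsets = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # 4-neighborhood
    while not q.empty():
        x, y = q.get()
        for xoffset, yoffset in offsets:
            x_neighbor, y_neighbor = x + xoffset, y + yoffset
            if (x_neighbor, y_neighbor) in visited:
                continue  # already visited
            visited.add((x_neighbor, y_neighbor))
            # skip out-of-range neighbors (a negative index would wrap around)
            if not (0 <= x_neighbor < h and 0 <= y_neighbor < w):
                continue
            if im[x_neighbor, y_neighbor] == 0:
                xaxis.append(x_neighbor)
                yaxis.append(y_neighbor)
                q.put((x_neighbor, y_neighbor))
    # print(xaxis)
    if len(xaxis) == 1:
        # isolated pixel: use a box one pixel wide and high
        xmax = x_fd + 1
        xmin = x_fd
        ymax = y_fd + 1
        ymin = y_fd
    else:
        xmax = max(xaxis)
        xmin = min(xaxis)
        ymax = max(yaxis)
        ymin = min(yaxis)
    # returned as (column max, column min, row max, row min)
    return ymax, ymin, xmax, xmin

def detectFgPix(im, xmax):
    '''Search for the starting point of the next block: the first black pixel
    to the right of column xmax. Returns None when there is none.
    '''
    h, w = im.shape[:2]
    for y_fd in range(xmax + 1, w):
        for x_fd in range(h):
            if im[x_fd, y_fd] == 0:
                return x_fd, y_fd

def CFS(im):
    '''Find the positions at which to cut the characters.
    In this function x is horizontal (column) and y is vertical (row).
    '''
    zoneL = []   # width L of each block
    zoneWB = []  # [start, end] of each block on the X axis
    zoneHB = []  # [start, end] of each block on the Y axis
    xmax = 0  # column of the last black pixel of the previous block (initial value)
    for i in range(10):
        try:
            x_fd, y_fd = detectFgPix(im, xmax)
            # print(y_fd, x_fd)
            xmax, xmin, ymax, ymin = cfs(im, x_fd, y_fd)
            L = xmax - xmin
            H = ymax - ymin
            zoneL.append(L)
            zoneWB.append([xmin, xmax])
            zoneHB.append([ymin, ymax])
        except TypeError:
            # detectFgPix returned None: no black pixels left
            return zoneL, zoneWB, zoneHB
    return zoneL, zoneWB, zoneHB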
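To see the traversal at work without any image files, here is a compact, library-free sketch of the same idea on a nested list: breadth-first search over 4-connected black cells, collecting one bounding box per connected group. The function name component_boxes and the toy grid are made up for illustration, and unlike CFS it returns (left, right, top, bottom) tuples directly.

```python
from collections import deque

def component_boxes(grid):
    """Return (left, right, top, bottom) for each 4-connected group of 0-cells."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    boxes = []
    for col in range(cols):                 # scan left to right, column by column
        for row in range(rows):
            if grid[row][col] != 0 or (row, col) in seen:
                continue
            queue = deque([(row, col)])
            seen.add((row, col))
            left = right = col
            top = bottom = row
            while queue:
                r, c = queue.popleft()
                left, right = min(left, c), max(right, c)
                top, bottom = min(top, r), max(bottom, r)
                for dr, dc in ((1, 0), (0, 1), (-1, 0), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] == 0 and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            boxes.append((left, right, top, bottom))
    return boxes

# 0 = black (character pixel), 1 = white (background)
grid = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
boxes = component_boxes(grid)
print(boxes)  # [(1, 2, 1, 2), (4, 5, 1, 2)]
```

The two blobs come out as two separate boxes, in left-to-right order, which is the order the characters need to be recognized in.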
Code for cutting out the characters:
def cutting_img(im, im_position, img, xoffset=1, yoffset=1):
    filename = './out_img/' + img.split('.')[0]
    # number of characters found
    im_number = len(im_position[1])
    # cut out each character
    for i in range(im_number):
        # clamp the start at 0 so a negative index cannot produce an empty slice
        im_start_X = max(im_position[1][i][0] - xoffset, 0)
        im_end_X = im_position[1][i][1] + xoffset
        im_start_Y = max(im_position[2][i][0] - yoffset, 0)
        im_end_Y = im_position[2][i][1] + yoffset
        cropped = im[im_start_Y:im_end_Y, im_start_X:im_end_X]
        cv2.imwrite(filename + '-cutting-' + str(i) + '.jpg', cropped)
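The width check described earlier (an over-wide region is treated as two touching characters and cut down the middle) can be sketched as a small post-processing step on the list of column ranges. The helper name split_wide_zones, the max_width value and the sample ranges are all hypothetical. In practice the matching row ranges would have to be duplicated for each split region as well.

```python
def split_wide_zones(zone_wb, max_width):
    """Split every [start, end] column range wider than max_width at its midpoint."""
    result = []
    for start, end in zone_wb:
        if end - start > max_width:
            middle = (start + end) // 2
            result.append([start, middle])
            result.append([middle + 1, end])
        else:
            result.append([start, end])
    return result

zones = [[3, 14], [18, 41], [45, 55]]   # made-up column ranges
split = split_wide_zones(zones, max_width=16)
print(split)  # [[3, 14], [18, 29], [30, 41], [45, 55]]
```

A reasonable max_width would probably be derived from the typical character width in the CAPTCHA set, for example a bit more than the median width of the detected regions.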
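2.5 Recognition
Recognition uses the pytesseract library. The main things to get right are the parameter settings for recognizing a line of text versus a single character, and for Chinese versus English. The recognition itself is a single line of code; most of what is shown here is file filtering.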
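import os
from fnmatch import fnmatch
from PIL import Image
from pytesseract import image_to_string

# Recognize the CAPTCHA (img_name is the file name of the original CAPTCHA image)
cutting_img_num = 0
str_img = ''
for file in os.listdir('./out_img'):
    if fnmatch(file, '%s-cutting-*.jpg' % img_name.split('.')[0]):
        cutting_img_num += 1
for i in range(cutting_img_num):
    try:
        file = './out_img/%s-cutting-%s.jpg' % (img_name.split('.')[0], i)
        # Recognize one character: psm 10 = single character, psm 7 = single line of text
        # (Tesseract 4 and later spell the flag '--psm'; older 3.x releases used '-psm')
        str_img = str_img + image_to_string(Image.open(file), lang='eng', config='--psm 10')
    except Exception as err:
        print('Failed to recognize %s: %s' % (file, err))
print('Cut images: %s' % cutting_img_num)
print('Recognized as: %s' % str_img)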
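The loop above counts the matching files and then assumes that indices 0 to n-1 all exist. A slightly more defensive variant reads the index out of each file name and sorts numerically, so the characters are still concatenated in the right order if a file is missing. This is only a sketch: the helper name ordered_cut_files and the sample names are made up, and in practice the list would come from os.listdir('./out_img').

```python
import re

def ordered_cut_files(file_names, stem):
    """Return the '<stem>-cutting-<i>.jpg' names sorted by their numeric index i."""
    pattern = re.compile(r'^%s-cutting-(\d+)\.jpg$' % re.escape(stem))
    indexed = []
    for name in file_names:
        match = pattern.match(name)
        if match:
            indexed.append((int(match.group(1)), name))
    return [name for _, name in sorted(indexed)]

names = ['abc-cutting-10.jpg', 'abc-cutting-2.jpg', 'abc-binary.jpg',
         'abc-cutting-0.jpg', 'xyz-cutting-1.jpg']
ordered = ordered_cut_files(names, 'abc')
print(ordered)  # ['abc-cutting-0.jpg', 'abc-cutting-2.jpg', 'abc-cutting-10.jpg']
```

Sorting on the parsed integer matters once there are ten or more pieces, because a plain string sort would put index 10 before index 2.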
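In the end, the recognition rate for this kind of touching-character CAPTCHA is around 30%. The approach only handles two touching characters; three or more stuck together still cannot be recognized. Judging by character width, though, extending it should not be hard. Give it a try if you are interested.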
[Figure: recognition results without character cutting]
[Figure: recognition results with character cutting]
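3 CAPTCHA recognition based on TensorFlow
- Python libraries: tensorflow, opencv, pandas; a GPU machine.
- Training set: 100,000 images; convergence starts at around 200 steps.
- Strategy: split each image and train single-character recognition; split the same way at prediction time. (PS: training and recognizing without splitting ran all night and did not converge.)
- Accuracy: with case sensitivity, single-character recognition rate 98%, whole-CAPTCHA recognition rate 75%+.
3.1 Dataset
Dataset preprocessing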
package com;

import java.awt.Color;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Random;

import org.patchca.color.ColorFactory;
import org.patchca.filter.predefined.CurvesRippleFilterFactory;
import org.patchca.filter.predefined.DiffuseRippleFilterFactory;
import org.patchca.filter.predefined.DoubleRippleFilterFactory;
import org.patchca.filter.predefined.MarbleRippleFilterFactory;
import org.patchca.filter.predefined.WobbleRippleFilterFactory;
import org.patchca.service.ConfigurableCaptchaService;
import org.patchca.utils.encoder.EncoderHelper;
import org.patchca.word.RandomWordFactory;

// Generates labelled CAPTCHA images with the patchca library
public class CreatePatcha {
    private static Random random = new Random();
    private static ConfigurableCaptchaService cs = new ConfigurableCaptchaService();

    static {
        // cs.setColorFactory(new SingleColorFactory(new Color(25, 60, 170)));
        cs.setColorFactory(new ColorFactory() {
            @Override
            public Color getColor(int x) {
                // random color with one randomly chosen channel kept below 71
                int[] c = new int[3];
                int i = random.nextInt(c.length);
                for (int fi = 0; fi < c.length; fi++) {
                    if (fi == i) {
                        c[fi] = random.nextInt(71);
                    } else {
                        c[fi] = random.nextInt(256);
                    }
                }
                return new Color(c[0], c[1], c[2]);
            }
        });
        RandomWordFactory wf = new RandomWordFactory();
        // wf.setCharacters("23456789abcdefghijklmnpqrstuvwxyzABCDEFGHIJKLMNPQRSTUVWXYZ");
        wf.setCharacters("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ");
        // every CAPTCHA is exactly 4 characters long
        wf.setMaxLength(4);
        wf.setMinLength(4);
        cs.setWordFactory(wf);
    }

    public static void main(String[] args) throws IOException {
        for (int i = 0; i < 100; i++) {
            // pick one of the five ripple filters at random
            switch (random.nextInt(5)) {
            case 0:
                cs.setFilterFactory(new CurvesRippleFilterFactory(cs
                        .getColorFactory()));
                break;
            case 1:
                cs.setFilterFactory(new MarbleRippleFilterFactory());
                break;
            case 2:
                cs.setFilterFactory(new DoubleRippleFilterFactory());
                break;
            case 3:
                cs.setFilterFactory(new WobbleRippleFilterFactory());
                break;
            case 4:
                cs.setFilterFactory(new DiffuseRippleFilterFactory());
                break;
            }
            OutputStream out = new FileOutputStream(new File(i + ".png"));
            String token = EncoderHelper.getChallangeAndWriteImage(cs, "png",
                    out);
            out.close();
            // move the image into checkdata/ (the directory must already exist),
            // putting the answer into the file name
            File f = new File(i + ".png");
            f.renameTo(new File("checkdata/" + token + "_" + i + ".png"));
            System.out.println(i + " CAPTCHA=" + token);
        }
    }
}
3.2 TensorFlow-based neural network training code
# coding: utf-8
# Written against the TensorFlow 1.x API (tf.placeholder, tf.Session, tf.train.*)
from gen_captcha import gen_captcha_text_and_image
from gen_captcha import number
from gen_captcha import alphabet
from gen_captcha import ALPHABET
import numpy as np
import tensorflow as tf
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0"

text, image = gen_captcha_text_and_image()
print("CAPTCHA image shape:", image.shape)  # (70, 70, 3)
# image size
IMAGE_HEIGHT = 70
IMAGE_WIDTH = 70
MAX_CAPTCHA = len(text)
print("Maximum number of characters in the CAPTCHA text:", MAX_CAPTCHA)
# The CAPTCHA is at most 4 characters long; I fixed it at 4, but it does not have to be fixed. If it is shorter than 4, pad with '_'

# Convert the color image to grayscale (color is of little use for recognizing CAPTCHAs)
def convert2gray(img):
    if len(img.shape) > 2:
        gray = np.mean(img, -1)
        # The conversion above is fast; the standard conversion is:
        # r, g, b = img[:,:,0], img[:,:,1], img[:,:,2]
        # gray = 0.2989 * r + 0.5870 * g + 0.1140 * b
        return gray
    else:
        return img

"""
A CNN performs best when the image size is a multiple of 2. If your image size is not a multiple of 2, you can pad the image edges with unused pixels.
np.pad(image, ((2, 3), (2, 2)), 'constant', constant_values=(255,))  # pad 2 rows on top, 3 rows at the bottom, 2 columns on the left, 2 columns on the right
"""

# Text to vector
# char_set = number + alphabet + ALPHABET + ['_']  # if the CAPTCHA is shorter than 4, '_' is used for padding
char_set = number + alphabet + ALPHABET  # the '_' padding character is left out here
CHAR_SET_LEN = len(char_set)  # 10 + 26*2 = 62 (63 when '_' is included)

def text2vec(text):
    text_len = len(text)
    if text_len > MAX_CAPTCHA:
        raise ValueError('The CAPTCHA is at most 4 characters long')

    vector = np.zeros(MAX_CAPTCHA*CHAR_SET_LEN)

    def char2pos(c):
        # digits -> 0..9, uppercase -> 10..35, lowercase -> 36..61
        if c == '_':
            k = 62
            return k
        k = ord(c) - 48
        if k > 9:
            k = ord(c) - 55
            if k > 35:
                k = ord(c) - 61
                if k > 61:
                    raise ValueError('No Map')
        return k

    for i, c in enumerate(text):
        idx = i * CHAR_SET_LEN + char2pos(c)
        vector[idx] = 1
    return vector

# Vector back to text
def vec2text(vec):
    char_pos = vec.nonzero()[0]
    text = []
    for i, c in enumerate(char_pos):
        char_at_pos = i  # c/63
        char_idx = c % CHAR_SET_LEN
        if char_idx < 10:
            char_code = char_idx + ord('0')
        elif char_idx < 36:
            char_code = char_idx - 10 + ord('A')
        elif char_idx < 62:
            char_code = char_idx - 36 + ord('a')
        elif char_idx == 62:
            char_code = ord('_')
        else:
            raise ValueError('error')
        text.append(chr(char_code))
    return "".join(text)

"""
# The vector (size MAX_CAPTCHA*CHAR_SET_LEN) is 0/1 encoded. Every CHAR_SET_LEN entries encode one character, so both the order and the characters are captured
vec = text2vec("F5Sd")
text = vec2text(vec)
print(text)  # F5Sd
vec = text2vec("SFd5")
text = vec2text(vec)
print(text)  # SFd5
"""

# Generate one training batch
def get_next_batch(batch_size=128, train=True):
    batch_x = np.zeros([batch_size, IMAGE_HEIGHT*IMAGE_WIDTH])
    batch_y = np.zeros([batch_size, MAX_CAPTCHA*CHAR_SET_LEN])

    # sometimes the generated image is not (70, 70, 3)
    def wrap_gen_captcha_text_and_image(train):
        while True:
            text, image = gen_captcha_text_and_image(train)
            if image.shape == (70, 70, 3):
                return text, image

    for i in range(batch_size):
        text, image = wrap_gen_captcha_text_and_image(train)
        image = convert2gray(image)

        batch_x[i,:] = image.flatten() / 255  # (image.flatten()-128)/128 would give mean 0
        batch_y[i,:] = text2vec(text)

    return batch_x, batch_y

####################################################################

X = tf.placeholder(tf.float32, [None, IMAGE_HEIGHT*IMAGE_WIDTH])
Y = tf.placeholder(tf.float32, [None, MAX_CAPTCHA*CHAR_SET_LEN])
keep_prob = tf.placeholder(tf.float32)  # dropout

# Define the CNN
def crack_captcha_cnn(w_alpha=0.01, b_alpha=0.1):
    x = tf.reshape(X, shape=[-1, IMAGE_HEIGHT, IMAGE_WIDTH, 1])

    #w_c1_alpha = np.sqrt(2.0/(IMAGE_HEIGHT*IMAGE_WIDTH)) #
    #w_c2_alpha = np.sqrt(2.0/(3*3*32))
    #w_c3_alpha = np.sqrt(2.0/(3*3*64))
    #w_d1_alpha = np.sqrt(2.0/(8*32*64))
    #out_alpha = np.sqrt(2.0/1024)

    # 3 conv layers
    w_c1 = tf.Variable(w_alpha*tf.random_normal([3, 3, 1, 32]))
    b_c1 = tf.Variable(b_alpha*tf.random_normal([32]))
    conv1 = tf.nn.relu(tf.nn.bias_add(tf.nn.conv2d(x, w_c1, strides=[1, 1, 1, 1], padding='SAME'), b_c1))
    conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv1 = tf.nn.dropout(conv1, keep_prob)

    w_c2 = tf.Variable(w_alpha*tf.random_normal([3, 3, 32, 64]))
    b_c2 = tf.Variable(b_alpha*tf.random_normal([64]))
    conv2 = tf.nn.relu(tf.nn.bias_add(tf.nn.conv2d(conv1, w_c2, strides=[1, 1, 1, 1], padding='SAME'), b_c2))
    conv2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv2 = tf.nn.dropout(conv2, keep_prob)

    w_c3 = tf.Variable(w_alpha*tf.random_normal([3, 3, 64, 64]))
    b_c3 = tf.Variable(b_alpha*tf.random_normal([64]))
    conv3 = tf.nn.relu(tf.nn.bias_add(tf.nn.conv2d(conv2, w_c3, strides=[1, 1, 1, 1], padding='SAME'), b_c3))
    conv3 = tf.nn.max_pool(conv3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv3 = tf.nn.dropout(conv3, keep_prob)

    # Fully connected layer
    # three 2x2 poolings with SAME padding shrink 70 -> 35 -> 18 -> 9
    w_d = tf.Variable(w_alpha*tf.random_normal([9*9*64, 1024]))
    b_d = tf.Variable(b_alpha*tf.random_normal([1024]))
    dense = tf.reshape(conv3, [-1, w_d.get_shape().as_list()[0]])
    dense = tf.nn.relu(tf.add(tf.matmul(dense, w_d), b_d))
    dense = tf.nn.dropout(dense, keep_prob)

    w_out = tf.Variable(w_alpha*tf.random_normal([1024, MAX_CAPTCHA*CHAR_SET_LEN]))
    b_out = tf.Variable(b_alpha*tf.random_normal([MAX_CAPTCHA*CHAR_SET_LEN]))
    out = tf.add(tf.matmul(dense, w_out), b_out)
    #out = tf.nn.softmax(out)
    return out

# Training
def train_crack_captcha_cnn():
    output = crack_captcha_cnn()
    # loss
    #loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(output, Y))
    with tf.device('/gpu:0'):
        loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=output, labels=Y))
        # What is the difference between softmax and sigmoid for the final classification layer?
        # optimizer: to speed up training, learning_rate should start large and then decay slowly
        optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)

    predict = tf.reshape(output, [-1, MAX_CAPTCHA, CHAR_SET_LEN])
    max_idx_p = tf.argmax(predict, 2)
    max_idx_l = tf.argmax(tf.reshape(Y, [-1, MAX_CAPTCHA, CHAR_SET_LEN]), 2)
    correct_pred = tf.equal(max_idx_p, max_idx_l)
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

    saver = tf.train.Saver()
    config = tf.ConfigProto(allow_soft_placement=True)
    config.gpu_options.allow_growth = True
    with tf.Session(config=config) as sess:
        sess.run(tf.global_variables_initializer())

        step = 0
        while True:
            batch_x, batch_y = get_next_batch(256)
            _, loss_ = sess.run([optimizer, loss], feed_dict={X: batch_x, Y: batch_y, keep_prob: 0.75})

            # compute the accuracy every 100 steps
            if step % 100 == 0:
                batch_x_test, batch_y_test = get_next_batch(100, False)
                acc = sess.run(accuracy, feed_dict={X: batch_x_test, Y: batch_y_test, keep_prob: 1.})
                print('step:%d,loss:%g' % (step, loss_))
                print('step:%d,acc:%g' % (step, acc))
                # once the accuracy exceeds 98%, save the model and finish training
                if acc > 0.98:
                    saver.save(sess, "crack_capcha.model", global_step=step)
                    break
            step += 1

def crack_captcha(captcha_image):
    output = crack_captcha_cnn()

    saver = tf.train.Saver()
    with tf.Session() as sess:
        saver.restore(sess, tf.train.latest_checkpoint('.'))

        predict = tf.argmax(tf.reshape(output, [-1, MAX_CAPTCHA, CHAR_SET_LEN]), 2)
        text_list = sess.run(predict, feed_dict={X: [captcha_image], keep_prob: 1})

        text = text_list[0].tolist()
        vector = np.zeros(MAX_CAPTCHA*CHAR_SET_LEN)
        i = 0
        for n in text:
            vector[i*CHAR_SET_LEN + n] = 1
            i += 1
        return vec2text(vector)

if __name__ == '__main__':
    #text, image = gen_captcha_text_and_image()
    #image = convert2gray(image)
    #image = image.flatten() / 255
    #predict_text = crack_captcha(image)
    #print("Correct: {}  Predicted: {}".format(text, predict_text))
    train_crack_captcha_cnn()
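The label encoding used by text2vec and vec2text is easier to follow with a lookup table than with ord() arithmetic. The sketch below is a standalone illustration that needs only NumPy, not TensorFlow or gen_captcha. It assumes the same index order as char2pos (digits, then uppercase, then lowercase, 62 symbols in total), and the names CHARS, encode and decode are made up for this example.

```python
import numpy as np

# Assumed character order: digits, uppercase, lowercase (62 symbols).
CHARS = ('0123456789'
         'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
         'abcdefghijklmnopqrstuvwxyz')
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(CHARS)}

def encode(text):
    """One-hot encode text: one block of len(CHARS) entries per character."""
    vector = np.zeros(len(text) * len(CHARS))
    for position, ch in enumerate(text):
        vector[position * len(CHARS) + CHAR_TO_INDEX[ch]] = 1
    return vector

def decode(vector):
    """Invert encode(): take the arg-max of each block of len(CHARS) entries."""
    blocks = vector.reshape(-1, len(CHARS))
    return ''.join(CHARS[i] for i in blocks.argmax(axis=1))

vec = encode('F5Sd')
print(vec.shape, int(vec.sum()), decode(vec))  # (248,) 4 F5Sd
```

Taking the arg-max of each block is also how the training code turns the network output into predicted characters (tf.argmax over the last axis after reshaping), so decode works on raw scores as well as on exact one-hot vectors.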
4 Closing
More resources and project sharing:
https://gitee.com/dancheng-senior/postgraduate