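Introduction
I am planning a blog series that walks through some publicly available hands-on machine learning projects, starting with beginner-level ones.
This post is the second beginner project: training a handwritten digit classifier on the MNIST dataset.
The original project page is: Deep Learning Project – Handwritten Digit Recognition using Python.
The first post was: Machine Learning in Practice | emojify: Create Your Own Emoji with Python (Deep Learning, Beginner)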
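Technical workflow
Project idea:
The MNIST digit classification project enables a machine to recognize handwritten digits, a classic computer vision task. Here we train a convolutional neural network (CNN) on the MNIST dataset.
After training, the GUI (gui.py) works as follows: the left side is a canvas on which you draw a digit with the mouse; clicking Recognise on the right shows the predicted digit and the recognition confidence.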
1. Load the dependencies and the dataset
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)
Besides the usual packages, Keras and TensorFlow must be installed beforehand. The install commands are:
pip install keras==2.10.0
pip install TensorFlow==2.10.0
Note how the MNIST handwritten digit dataset is imported: it is loaded directly from Keras through keras.datasets.mnist.
Calling mnist.load_data() returns the training data and the test data. The training images have shape 60000 × 28 × 28 and the test images have shape 10000 × 28 × 28.
2. Preprocess the data
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
input_shape = (28, 28, 1)
# convert class vectors to binary class matrices
num_classes = 10
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
- x_train.reshape: reshapes the image data into the input format expected by the network. The input has shape 60000 × 28 × 28 and the output has shape 60000 × 28 × 28 × 1 (a trailing channel axis is added).
- keras.utils.to_categorical: converts the 10 digit classes (0-9) into one-hot vectors of length 10, in which only the position of the class is 1. For example, digit 0 becomes 1000000000, digit 1 becomes 0100000000, and digit 2 becomes 0010000000.
- x_train /= 255: normalizes the image data. The data is first cast to float32 and then scaled into the range 0~1.
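The three preprocessing steps above can be sketched with NumPy alone. This is only an illustration on a synthetic stand-in for MNIST: the helper name preprocess and the random fake_images are assumptions made for this example, and np.eye is used in place of keras.utils.to_categorical to show what one-hot encoding produces.

```python
import numpy as np

def preprocess(images, labels, num_classes=10):
    """Mimic the tutorial's preprocessing on a batch of uint8 images."""
    # Add a trailing channel axis: (N, 28, 28) -> (N, 28, 28, 1)
    x = images.reshape(images.shape[0], 28, 28, 1)
    # Cast to float32 and scale pixel values from 0..255 into 0..1
    x = x.astype("float32") / 255
    # One-hot encode: row i of the identity matrix is the code for class i
    y = np.eye(num_classes, dtype="float32")[labels]
    return x, y

# Synthetic stand-in for MNIST (hypothetical data): 3 random images, labels 0, 1, 2
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)
fake_labels = np.array([0, 1, 2])

x, y = preprocess(fake_images, fake_labels)
print(x.shape, x.dtype)
print(y.astype(int))
```

Each row of y contains exactly one 1, at the index of the digit it encodes.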
3. Create the convolutional neural network model
batch_size = 128
num_classes = 10
epochs = 50
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,optimizer=keras.optimizers.Adadelta(),metrics=['accuracy'])
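The project uses a convolutional neural network (CNN) made up of two convolutional layers, a pooling layer, fully connected layers, and so on.
Function overview:
- Sequential: the sequential model, as opposed to the functional model. Imported with from keras.models import Sequential, it builds a deep neural network by stacking layers one after another.
- add(): appends a layer to the network. The argument can be a Conv2D convolutional layer, a MaxPooling2D two-dimensional max pooling layer, a Dropout layer (random deactivation, used against overfitting), or a Dense layer (fully connected, FC; Keras calls FC layers Dense). The meaning and parameters of these layers are described below.
- compile(): configures the model for training. Its parameters include: loss, a string name or a loss function (categorical cross-entropy here; MSE and others are also available); optimizer, the optimizer that determines how the weights are updated from the gradients (Adadelta here); metrics, a list of evaluation metrics reported during training and testing.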
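To make the loss passed to compile() more concrete, here is a minimal NumPy sketch of a softmax output followed by categorical cross-entropy. It illustrates the formulas only; the function names, the eps clipping value, and the sample logits are assumptions for this example, and the Keras implementation may differ in its details.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability; the result is unchanged
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0), then average -sum(y_true * log(y_pred)) over the batch
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))

logits = np.array([[2.0, 1.0, 0.1]])  # hypothetical scores for 3 classes
y_true = np.array([[1.0, 0.0, 0.0]])  # one-hot label: class 0

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.round(3), round(loss, 3))
```

With a one-hot label, the loss reduces to the negative log of the probability assigned to the correct class, so it approaches 0 as that probability approaches 1.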
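Network structure:
- Conv2D: two-dimensional convolutional layer. Its parameters include:
  - filters: the number of convolution kernels, which is also the depth (number of channels) of the layer's output. Typically earlier layers use fewer filters and later layers use more; powers of two are a common choice, for example [32, 64, 128] => [256, 512, 1024];
  - kernel_size: the width and height of the 2D convolution window, commonly (1×1), (3×3), (5×5), or (7×7);
  - activation: the activation function, for example sigmoid, tanh, or relu.
- MaxPooling2D: pooling layer. It is essentially downsampling: it compresses its input, is usually placed after convolutional layers, and speeds up training. It has no learnable parameters, and the reduction in dimensionality also helps limit overfitting.
- Dropout: a layer against overfitting. During training it randomly sets a fraction of its input units to 0, with each unit dropped independently following a Bernoulli (0-1) distribution B(1, p). The argument is the fraction of units to drop (0.25 and 0.5 in this model).
- Flatten: flattens multi-dimensional input into a one-dimensional vector and is used at the transition from convolutional layers to fully connected layers. It has no parameters and does not change the number of values.
- Dense: fully connected layer. Together with its activation function it maps the features nonlinearly onto the output space.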
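The effect of each layer on the tensor shape and on the parameter count can be traced by hand. The sketch below does this in plain Python for the model defined above, assuming the Keras defaults for Conv2D (stride 1, 'valid' padding); the helper functions are written for this illustration and are not part of Keras. Dropout is omitted because it changes neither the shape nor the parameter count.

```python
def conv2d(shape, filters, kernel=3):
    # 'valid' padding, stride 1: each spatial side shrinks by kernel - 1
    h, w, c = shape
    params = kernel * kernel * c * filters + filters  # weights + biases
    return (h - kernel + 1, w - kernel + 1, filters), params

def max_pool(shape, pool=2):
    h, w, c = shape
    return (h // pool, w // pool, c), 0  # no learnable parameters

def flatten(shape):
    h, w, c = shape
    return (h * w * c,), 0  # only rearranges values

def dense(shape, units):
    (n,) = shape
    return (units,), n * units + units  # weights + biases

layers = [
    ("Conv2D(32)", lambda s: conv2d(s, 32)),
    ("Conv2D(64)", lambda s: conv2d(s, 64)),
    ("MaxPooling2D", max_pool),
    ("Flatten", flatten),
    ("Dense(256)", lambda s: dense(s, 256)),
    ("Dense(10)", lambda s: dense(s, 10)),
]

shape, total = (28, 28, 1), 0
for name, layer in layers:
    shape, params = layer(shape)
    total += params
    print(f"{name:<13} -> {shape}, params={params}")
print("total params:", total)
```

Most of the parameters sit in the first Dense layer, which connects the 9216 flattened values to 256 units.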
4. Train the neural network
hist = model.fit(x_train, y_train,batch_size=batch_size,epochs=epochs,verbose=1,validation_data=(x_test, y_test))
print("The model has successfully trained")
model.save('mnist.h5')
print("Saving the model as mnist.h5")
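- model.fit: once the model is built, feeds the data into it for training. Its parameters include:
  - x: the training inputs;
  - y: the training targets;
  - batch_size: the number of samples per batch, that is, the number of samples used for one gradient update;
  - epochs: the number of passes over the full training set; training stops after the last pass;
  - verbose: controls the logging during training: 0 prints nothing, 1 shows a progress bar (the usual default), 2 prints one line per epoch;
  - validation_data: the validation dataset, evaluated at the end of each epoch (the test set is used here).
- model.save: saves the trained model (architecture, weights, and optimizer state).
After training succeeds, mnist.h5 is saved in the source directory; this is the model file that gui.py later loads with load_model.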
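As a small worked example of how batch_size and epochs interact, the following sketch counts the gradient updates for the settings used here (60000 training samples, batch_size=128, epochs=50). The helper name training_steps is an assumption made for this illustration.

```python
import math

def training_steps(num_samples, batch_size, epochs):
    # The last batch of an epoch may be smaller, hence the ceiling
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs

steps_per_epoch, total_steps = training_steps(60000, 128, 50)
# Size of the final, partial batch in each epoch
last_batch = 60000 - (steps_per_epoch - 1) * 128
print(steps_per_epoch, total_steps, last_batch)
```

The progress bar shown with verbose=1 counts these per-epoch steps rather than individual samples.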
5. Evaluate the network
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
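- model.evaluate: evaluates the network on the test data. Because the model was compiled with metrics=['accuracy'], it returns a list [loss, accuracy], so score[0] is the test loss and score[1] is the test accuracy. The verbose parameter controls whether progress is printed during evaluation; verbose=0 prints nothing.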
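The accuracy value in score[1] can be understood as the fraction of samples whose most probable predicted class matches the label. Below is a minimal NumPy sketch of that computation on made-up predictions; the function name accuracy and the sample arrays are assumptions for this illustration, not the Keras internals.

```python
import numpy as np

def accuracy(y_true_onehot, y_pred_probs):
    # A sample counts as correct when the most probable class matches the label
    true_cls = np.argmax(y_true_onehot, axis=1)
    pred_cls = np.argmax(y_pred_probs, axis=1)
    return float(np.mean(true_cls == pred_cls))

# Hypothetical 3-class example: labels 0, 1, 2, 1
y_true = np.eye(3)[[0, 1, 2, 1]]
y_pred = np.array([
    [0.7, 0.2, 0.1],  # predicts 0, correct
    [0.1, 0.8, 0.1],  # predicts 1, correct
    [0.3, 0.4, 0.3],  # predicts 1, wrong (label is 2)
    [0.2, 0.5, 0.3],  # predicts 1, correct
])
print(accuracy(y_true, y_pred))
```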
Full program
train.py: the complete training code.
gui.py: the GUI window, which provides an interactive interface.
train.py program
"""
Handwritten digit recognition
"""
"""
1. Import the libraries and load the dataset
"""
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)
"""
2. Preprocess the data
"""
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
input_shape = (28, 28, 1)
# convert class vectors to binary class matrices
num_classes = 10
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
"""
3. Create the model
"""
batch_size = 128
num_classes = 10
epochs = 50
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,optimizer=keras.optimizers.Adadelta(),metrics=['accuracy'])
"""
4. Train the model
"""
hist = model.fit(x_train, y_train,batch_size=batch_size,epochs=epochs,verbose=1,validation_data=(x_test, y_test))
print("The model has successfully trained")
model.save('mnist.h5')
print("Saving the model as mnist.h5")
"""
5. Evaluate the model
"""
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
The trained model is saved in the source directory under the file name mnist.h5.
gui.py program
from keras.models import load_model
from tkinter import *
import tkinter as tk
import win32gui  # provided by the pywin32 package; Windows only
from PIL import ImageGrab, Image
import numpy as np

model = load_model('mnist.h5')

def predict_digit(img):
    # resize image to 28x28 pixels
    img = img.resize((28, 28))
    # convert rgb to grayscale
    img = img.convert('L')
    img = np.array(img)
    # NOTE: MNIST digits are bright strokes on a dark background, while this canvas
    # is dark on white; inverting here (img = 255 - img) may improve predictions.
    # reshaping to support our model input and normalizing
    img = img.reshape(1, 28, 28, 1)
    img = img/255.0
    # predicting the class
    res = model.predict([img])[0]
    return np.argmax(res), max(res)

class App(tk.Tk):
    def __init__(self):
        tk.Tk.__init__(self)
        self.x = self.y = 0
        # Creating elements
        self.canvas = tk.Canvas(self, width=300, height=300, bg = "white", cursor="cross")
        self.label = tk.Label(self, text="Thinking..", font=("Helvetica", 48))
        self.classify_btn = tk.Button(self, text = "Recognise", command =self.classify_handwriting)
        self.button_clear = tk.Button(self, text = "Clear", command = self.clear_all)
        # Grid structure
        self.canvas.grid(row=0, column=0, pady=2, sticky=W, )
        self.label.grid(row=0, column=1,pady=2, padx=2)
        self.classify_btn.grid(row=1, column=1, pady=2, padx=2)
        self.button_clear.grid(row=1, column=0, pady=2)
        #self.canvas.bind("<Motion>", self.start_pos)
        self.canvas.bind("<B1-Motion>", self.draw_lines)

    def clear_all(self):
        self.canvas.delete("all")

    def classify_handwriting(self):
        HWND = self.canvas.winfo_id() # get the handle of the canvas
        rect = win32gui.GetWindowRect(HWND) # get the coordinate of the canvas
        im = ImageGrab.grab(rect)
        digit, acc = predict_digit(im)
        self.label.configure(text= str(digit)+', '+ str(int(acc*100))+'%')

    def draw_lines(self, event):
        # draw a filled black circle of radius r at the current mouse position
        self.x = event.x
        self.y = event.y
        r=8
        self.canvas.create_oval(self.x-r, self.y-r, self.x + r, self.y + r, fill='black')

app = App()
mainloop()
gui.py uses the tkinter package to render the GUI. The individual statements are not analyzed here; to learn more, see: Python GUI Programming (Tkinter)
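The data handling around the model call in gui.py can be sketched without tkinter or a trained model. The example below is an illustration under stated assumptions: prepare_canvas_image and format_prediction are hypothetical helper names, fake_probs stands in for the model's softmax output, and the color inversion is an addition that is not in the original gui.py (it is included because MNIST digits are bright strokes on a dark background, the opposite of the white canvas).

```python
import numpy as np

def prepare_canvas_image(gray):
    """gray: (28, 28) uint8 array, dark strokes on a white background."""
    # Invert so strokes become bright on a dark background, like MNIST
    # (assumption: this step is not part of the original gui.py)
    inverted = 255 - gray.astype("float32")
    # Scale to 0..1 and add batch and channel axes: (1, 28, 28, 1)
    return (inverted / 255.0).reshape(1, 28, 28, 1)

def format_prediction(probs):
    # The predicted digit is the most probable class; its probability is the confidence
    digit = int(np.argmax(probs))
    return f"{digit}, {int(probs[digit] * 100)}%"

canvas = np.full((28, 28), 255, dtype=np.uint8)  # blank white canvas
canvas[4:24, 13:15] = 0                          # a crude vertical stroke
batch = prepare_canvas_image(canvas)

# Hypothetical softmax output standing in for model.predict(batch)[0]
fake_probs = np.array([0.01, 0.9, 0.02, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01])
print(batch.shape, format_prediction(fake_probs))
```

The label string follows the same "digit, confidence%" layout that gui.py writes into the label widget.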
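After gui.py is run, the GUI window opens.
Draw a digit on the left canvas with the mouse, then click Recognise to display the recognition result.
If you find any problems, you are welcome to point them out and discuss them.
Source: http://www.zghlxwxcb.cn/news/detail-558426.html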