Face Recognition with SVM: A Case Study
1. Import packages
First, import everything we need:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score
2. Load the dataset
We use the face dataset that sklearn can fetch for us (the first call downloads it, which may take a while):
# Load the face data. LFW = Labeled Faces in the Wild
data = datasets.fetch_lfw_people(resize=1, min_faces_per_person=70)
data
查看結(jié)果:
我們?nèi)〕銎渲械臄?shù)據(jù)進行查看:
X = data['data']
y = data['target']
faces = data['images']
target_names = data['target_names']
display(X.shape,y.shape,faces.shape,target_names)
運行結(jié)果:
我們隨機選取一個人的圖片并通過索引獲取名字:
# 隨機取出一個人臉
index = np.random.randint(0,1288,size = 1)[0]
face = faces[index]
name = y[index] # 根據(jù)索引獲取名字
print(target_names[name])
display(face.shape)
plt.imshow(face, cmap = 'gray')
結(jié)果展示:
3. Modeling directly with SVM
The raw feature vectors are large and there are many samples, so we first reduce the dimensionality with PCA:
%%time
# Reduce dimensionality, keeping 95% of the variance
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X)
display(X.shape,X_pca.shape)
結(jié)果展示:
Then we train and evaluate on the reduced data. C is the penalty (regularization) coefficient that guards against overfitting; we start with the default value to establish a baseline. Note that train_test_split draws a random split (test_size=0.25 by default), so pass random_state if you need reproducible numbers:
%%time
# Split the reduced data
X_train, X_test, y_train, y_test, faces_train, faces_test = train_test_split(X_pca, y, faces)
# C is the penalty term: the larger C, the fewer mistakes the model tolerates.
# A very large C tries to separate every point and can overfit.
svc = SVC(C=1)
svc.fit(X_train, y_train)
# A high training score paired with a low test score signals overfitting
print('train score:', svc.score(X_train, y_train))
print('test score:', svc.score(X_test, y_test))
# Predictions on the test set
y_pred = svc.predict(X_test)
結(jié)果展示:
4. Visualizing the results
Next, display 50 of the test images and compare the predicted names against the true ones:
plt.figure(figsize=(5 * 2, 10 * 3))
for i in range(50):
    plt.subplot(10, 5, i + 1)  # 10 x 5 grid of subplots
    plt.imshow(faces_test[i], cmap='gray')
    plt.axis('off')  # hide the ticks
    # Title each image with the true vs. predicted surname
    true_name = target_names[y_test[i]].split(' ')[-1]
    predict_name = target_names[y_pred[i]].split(' ')[-1]
    plt.title(f'True:{true_name}\nPred:{predict_name}')
結(jié)果展示:
從結(jié)果來看,預(yù)測效果并不是很好,紅色框選出來的都是預(yù)測錯誤的名字,因此我們不得不對原來的性能優(yōu)化。
5. Grid search for the best parameters
sklearn provides grid search to find the best parameters: pass in the candidate values for each parameter and GridSearchCV tries the combinations (here with 5-fold cross-validation) and reports the best one.
%%time
svc = SVC()
# C: penalty coefficient (guards against overfitting); kernel: kernel type; tol: stopping tolerance
params = {'C':np.logspace(-10,10,50),'kernel':['linear', 'poly', 'rbf', 'sigmoid'],'tol':[0.01,0.001,0.0001]}
gc = GridSearchCV(estimator = svc,param_grid = params,cv = 5)
gc.fit(X_pca,y)
gc.best_params_
結(jié)果展示:
6. Modeling with the best parameters
The search reports the best penalty C as 1.8420699693267165e-07, the best kernel as linear, and the best tol as 0.001.
Refit the SVM with these optimal parameters:
svc = SVC(C = 1.8420699693267165e-07,kernel='linear',tol = 0.001)
# A fresh random train/test split
X_pca_train, X_pca_test, y_train, y_test, faces_train, faces_test = train_test_split(X_pca, y, faces)
svc.fit(X_pca_train, y_train)
print('train score:', svc.score(X_pca_train, y_train))
print('test score:', svc.score(X_pca_test, y_test))
# Recompute predictions with the tuned model for the visualization below
y_pred = svc.predict(X_pca_test)
結(jié)果展示:
Compared with the untuned SVM, the tuned model scores noticeably higher.
7. Visualizing the tuned results
plt.figure(figsize=(5 * 2, 10 * 3))
for i in range(50):
    plt.subplot(10, 5, i + 1)  # 10 x 5 grid of subplots
    plt.imshow(faces_test[i], cmap='gray')
    plt.axis('off')  # hide the ticks
    # Title each image with the true vs. predicted surname
    true_name = target_names[y_test[i]].split(' ')[-1]
    predict_name = target_names[y_pred[i]].split(' ')[-1]
    plt.title(f'True:{true_name}\nPred:{predict_name}')
結(jié)果展示:
從優(yōu)化后的結(jié)果來看,雖然還是有分錯的結(jié)果,但是準確率較原來的準確率提高了很多。文章來源地址http://www.zghlxwxcb.cn/news/detail-763233.html
8. Complete code
8.1 Without tuning
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the face data. LFW = Labeled Faces in the Wild
data = datasets.fetch_lfw_people(resize=1, min_faces_per_person=70)
X = data['data']
y = data['target']
faces = data['images']
target_names = data['target_names']
# Reduce dimensionality, keeping 95% of the variance
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X)
display(X.shape,X_pca.shape)
# Split the reduced data
X_train, X_test, y_train, y_test, faces_train, faces_test = train_test_split(X_pca, y, faces)
# C is the penalty term: the larger C, the fewer mistakes the model
# tolerates, which can lead to overfitting
svc = SVC(C=1)
svc.fit(X_train, y_train)
# A high training score paired with a low test score signals overfitting
print('train score:', svc.score(X_train, y_train))
print('test score:', svc.score(X_test, y_test))
# Predictions on the test set
y_pred = svc.predict(X_test)
plt.figure(figsize=(5 * 2, 10 * 3))
for i in range(50):
    plt.subplot(10, 5, i + 1)  # 10 x 5 grid of subplots
    plt.imshow(faces_test[i], cmap='gray')
    plt.axis('off')  # hide the ticks
    # Title each image with the true vs. predicted surname
    true_name = target_names[y_test[i]].split(' ')[-1]
    predict_name = target_names[y_pred[i]].split(' ')[-1]
    plt.title(f'True:{true_name}\nPred:{predict_name}')
8.2 With grid search tuning
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split, GridSearchCV

# Load the face data. LFW = Labeled Faces in the Wild
data = datasets.fetch_lfw_people(resize=1, min_faces_per_person=70)
X = data['data']
y = data['target']
faces = data['images']
target_names = data['target_names']
# Reduce dimensionality, keeping 95% of the variance
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X)
display(X.shape,X_pca.shape)
svc = SVC()
# C: penalty coefficient; kernel: kernel type; tol: stopping tolerance
params = {'C': np.logspace(-10, 10, 50),
          'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],
          'tol': [0.01, 0.001, 0.0001]}
gc = GridSearchCV(estimator=svc, param_grid=params, cv=5)
gc.fit(X_pca, y)
print(gc.best_params_)
# Refit with the best parameters found by the grid search
svc = SVC(C=1.8420699693267165e-07, kernel='linear', tol=0.001)
# A fresh random train/test split
X_pca_train, X_pca_test, y_train, y_test, faces_train, faces_test = train_test_split(X_pca, y, faces)
svc.fit(X_pca_train, y_train)
print('train score:', svc.score(X_pca_train, y_train))
print('test score:', svc.score(X_pca_test, y_test))
# Predictions with the tuned model (needed for the plot below)
y_pred = svc.predict(X_pca_test)

plt.figure(figsize=(5 * 2, 10 * 3))
for i in range(50):
    plt.subplot(10, 5, i + 1)  # 10 x 5 grid of subplots
    plt.imshow(faces_test[i], cmap='gray')
    plt.axis('off')  # hide the ticks
    # Title each image with the true vs. predicted surname
    true_name = target_names[y_test[i]].split(' ')[-1]
    predict_name = target_names[y_pred[i]].split(' ')[-1]
    plt.title(f'True:{true_name}\nPred:{predict_name}')
This concludes the walkthrough of building a face recognition model with an SVM.