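Hi everyone, I'm 帶我去滑雪 ("Take Me Skiing")!
Determining whether the lungs have developed lesions makes it possible to detect disease early, guide treatment, monitor disease progression, and prevent illness while promoting lung health. Regular lung assessment and examination are essential for protecting lung health, preventing disease, and improving quality of life. In this post we combine clinical data with logistic regression to predict whether a patient has lung lesions. The response variable is group (1 = lung lesion present, 0 = normal). The feature variables are ESR (erythrocyte sedimentation rate), CRP (C-reactive protein), ALB (albumin), Anti-SSA (anti-SSA antibody), Glandular involvement, gender, c-PSA (cancer-specific prostate-specific antigen), CA 15-3 (Cancer Antigen 15-3), TH17 (Th17 cells), ANA (antinuclear antibody), CA125 (Cancer Antigen 125), and LDH (lactate dehydrogenase). The dataset also contains an Age column, which enters the model as a feature as well. Let's start using logistic regression to predict lung lesions.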
(1) Import the required modules and data
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from sklearn.metrics import cohen_kappa_score  # import packages
import numpy as np
from scipy.stats import logistic
titanic = pd.read_csv('filename1.csv')
titanic  # load the data and display it
Output:
index  data.Age  impute.data.ESR..mean.  impute.data.CRP..mean.  impute.data.ALB..mean.  impute.data.Anti.SSA..median.  impute.data.Glandular.involvement..median.  impute.data.Gender..median.  impute.data.c.PSA..mean.  impute.data.CA153..mean.  impute.data.TH17..mean.  impute.data.ANA..median.  impute.data.CA125..mean.  impute.data.LDH..mean.  data.group
0    67  21.000000   4.810000  38.692661  0  0  0  0.300000   3.50000  10.330000  1   3.000000  212.210493  0
1    78  33.000000  12.089916  41.100000  0  0  0  0.610931  22.40000   7.465353  1  17.500000  485.000000  0
2    69  24.000000   2.250000  42.700000  0  0  0  0.300000   5.40000   8.020000  0   4.360000  236.000000  0
3    71  43.000000  21.800000  39.200000  0  0  0  0.300000  11.11000   5.500000  1   6.700000  166.000000  0
4    69  20.000000   2.430000  47.600000  3  0  0  0.300000   6.93000   4.310000  0   3.520000  223.000000  0
...
954  63  40.274914   2.370000  40.300000  2  0  0  0.430000   6.10000   6.560000  0   7.720000  234.000000  0
955  68  27.000000   3.520000  41.000000  3  0  0  0.320000   7.52000   4.780000  1   7.150000  254.000000  0
956  61  40.274914  12.089916  40.700000  0  0  0  0.610931  12.46303   1.790000  1   9.392344  161.000000  0
957  60  27.000000  35.400000  38.300000  0  0  0  0.200000   7.68000   5.700000  0   9.290000  256.000000  0
958  68  30.000000   2.280000  44.400000  0  0  0  0.200000   5.32000   4.430000  0   4.710000  172.000000  0
959 rows × 14 columns
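(2) Data processing
X = titanic.iloc[:, :-1]  # features: every column except the last
y = titanic.iloc[:, -1]  # response: the last column (data.group)
X = pd.get_dummies(X, drop_first=True)  # one-hot encode any non-numeric columns
X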
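Every column in the table above appears to be numeric already, in which case get_dummies returns X unchanged; the call only has an effect when text-valued columns are present. A minimal sketch on a made-up frame (the Site column is hypothetical and not part of this dataset) illustrates the behavior:

```python
import pandas as pd

# Toy frame: two numeric columns plus one hypothetical text column "Site".
df = pd.DataFrame({"Age": [67, 78], "Gender": [0, 1], "Site": ["a", "b"]})

# Numeric columns pass through untouched; "Site" becomes dummy columns,
# and drop_first=True drops the first level ("a") to avoid redundancy.
encoded = pd.get_dummies(df, drop_first=True)
print(encoded.columns.tolist())
```

Dropping the first level matters for an essentially unregularized logistic regression, because a full set of dummy columns would be collinear with the intercept.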
(3) Split into training and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=None, random_state=0)  # split into training and test sets
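The split above uses stratify=None, so the share of group = 1 in the test set can drift from its share in the full data. As a hedged illustration on synthetic labels (not the article's data), passing the labels to stratify keeps the class ratio the same in both subsets:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 samples, 30% of them in the positive class.
X_demo = np.arange(100).reshape(-1, 1)
y_demo = np.array([1] * 30 + [0] * 70)

# stratify=y_demo preserves the 30/70 class ratio in train and test.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=0
)

print("positive share in train:", y_tr.mean())
print("positive share in test: ", y_te.mean())
```

With the real data the equivalent call would be train_test_split(X, y, test_size=0.2, stratify=y, random_state=0); note that this would change the split and therefore the numbers reported below.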
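(4) Fit the logistic regression
model = LogisticRegression(C=1e10)  # very large C, so regularization is effectively switched off
model.fit(X_train, y_train)
model.intercept_  # model intercept
model.coef_  # model regression coefficients
Output: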
array([[ 0.03899236, 0.00458312, 0.000863 , -0.10140358, -0.09681747, 0.74167081, 0.56011254, 0.24636358, 0.0226635 , -0.02681392, 0.4987412 , -0.01932326, 0.00211805]])
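The coefficients are on the log-odds scale, so exponentiating them gives odds ratios, which are often easier to read. The sketch below copies the printed coefficients and pairs them with shortened feature labels in the column order of the data table (the short labels are my own abbreviations, not names from the code):

```python
import numpy as np

# Coefficients copied from the model.coef_ output above (log-odds scale).
coef = np.array([0.03899236, 0.00458312, 0.000863, -0.10140358, -0.09681747,
                 0.74167081, 0.56011254, 0.24636358, 0.0226635, -0.02681392,
                 0.4987412, -0.01932326, 0.00211805])

# Shortened labels, in the same order as the feature columns of the table.
features = ["Age", "ESR", "CRP", "ALB", "Anti-SSA", "Glandular involvement",
            "Gender", "c-PSA", "CA153", "TH17", "ANA", "CA125", "LDH"]

# exp(coefficient) is the multiplicative change in the odds of group = 1
# for a one-unit increase in that feature, holding the others fixed.
odds_ratios = np.exp(coef)

for name, beta, ratio in zip(features, coef, odds_ratios):
    print(f"{name:<22} coef={beta:+.4f}  odds ratio={ratio:.3f}")
```

Because the features are not standardized, the magnitudes are not directly comparable across features: a one-unit change in LDH is a much smaller step than a one-unit change in a 0/1 indicator.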
(5) Evaluate classification accuracy on the test set
model.score(X_test, y_test)
Output:
0.6822916666666666
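To put this accuracy in context, it helps to compare it with a classifier that always predicts the majority class. The following sketch uses only the counts from the confusion matrix in step (8):

```python
# Counts taken from the confusion matrix reported in step (8).
tn, fp = 99, 22  # actual 0: predicted 0, predicted 1
fn, tp = 39, 32  # actual 1: predicted 0, predicted 1

total = tn + fp + fn + tp
accuracy = (tn + tp) / total

# Baseline: always predict the majority class (group 0) in the test set.
baseline = max(tn + fp, fn + tp) / total

print(f"accuracy = {accuracy:.4f}, majority-class baseline = {baseline:.4f}")
```

The model beats the baseline by only a few percentage points, which is worth keeping in mind when reading the later metrics.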
(6) Predict the probability of each class for the test set
prob = model.predict_proba(X_test)
prob[:5]
Output:
array([[0.71336774, 0.28663226], [0.34959506, 0.65040494], [0.91506198, 0.08493802], [0.24008149, 0.75991851], [0.55969043, 0.44030957]])
(7) Model prediction
pred = model.predict(X_test)
pred[:5]  # predicted labels for the test set; show the first five
Output:
array([0, 1, 0, 1, 0], dtype=int64)
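For a binary logistic regression, predict() is equivalent to thresholding the probability of class 1 at 0.5. The sketch below checks this against the five probability rows printed in step (6); the alternative threshold of 0.4 is purely a hypothetical value for illustration:

```python
import numpy as np

# First five rows of the predict_proba output, copied from step (6).
prob_head = np.array([[0.71336774, 0.28663226],
                      [0.34959506, 0.65040494],
                      [0.91506198, 0.08493802],
                      [0.24008149, 0.75991851],
                      [0.55969043, 0.44030957]])

# Column 1 holds P(group = 1); label 1 is assigned when it exceeds 0.5.
pred_head = (prob_head[:, 1] > 0.5).astype(int)
print(pred_head)

# A lower, hypothetical threshold flags more patients as positive.
threshold = 0.4
pred_low = (prob_head[:, 1] > threshold).astype(int)
print(pred_low)
```

Lowering the threshold trades more false positives for fewer missed lesions, which may be preferable in a screening setting.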
(8) Compute the confusion matrix
table = pd.crosstab(y_test, pred, rownames=['Actual'], colnames=['Predicted'])
table
Output:
Predicted   0   1
Actual
0          99  22
1          39  32
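cohen_kappa_score is imported in step (1) but never called. As a sketch of how it could be used, the block below rebuilds label vectors that reproduce the confusion matrix counts and computes kappa from them; in the actual session the call would simply be cohen_kappa_score(y_test, pred):

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Rebuild label vectors that reproduce the confusion matrix above:
# 99 true negatives, 22 false positives, 39 false negatives, 32 true positives.
y_true = np.array([0] * 99 + [0] * 22 + [1] * 39 + [1] * 32)
y_pred = np.array([0] * 99 + [1] * 22 + [0] * 39 + [1] * 32)

# Kappa measures agreement between predictions and truth beyond chance.
kappa = cohen_kappa_score(y_true, y_pred)
print(f"Cohen's kappa = {kappa:.3f}")
```

Kappa corrects accuracy for the agreement expected by chance, so it is a useful complement when the classes are imbalanced, as they are here (121 versus 71).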
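(9) Compute evaluation metrics based on the confusion matrix
# target_names follow the sorted labels: 0 (normal) first, then 1 (lesion).
# The original names ['yes', 'no'] labeled the normal class as 'yes'.
print(classification_report(y_test, pred, target_names=['normal', 'lesion']))
Output:
              precision    recall  f1-score   support

      normal       0.72      0.82      0.76       121
      lesion       0.59      0.45      0.51        71

    accuracy                           0.68       192
   macro avg       0.65      0.63      0.64       192
weighted avg       0.67      0.68      0.67       192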
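The per-class figures in the report can be recomputed by hand from the confusion matrix in step (8), which makes clear what each metric measures. A short sketch, with class 0 as normal and class 1 as lesion:

```python
# Counts from the confusion matrix in step (8).
tn, fp, fn, tp = 99, 22, 39, 32

# Class 1 (lesion): how many predicted lesions are real, and how many real lesions are found.
precision_1 = tp / (tp + fp)
recall_1 = tp / (tp + fn)
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

# Class 0 (normal): the same quantities with the roles of the classes swapped.
precision_0 = tn / (tn + fn)
recall_0 = tn / (tn + fp)
f1_0 = 2 * precision_0 * recall_0 / (precision_0 + recall_0)

print(f"class 0: precision={precision_0:.2f} recall={recall_0:.2f} f1={f1_0:.2f}")
print(f"class 1: precision={precision_1:.2f} recall={recall_1:.2f} f1={f1_1:.2f}")
```

The recall for class 1 is the model's sensitivity: fewer than half of the patients with lesions in the test set are detected at the default threshold.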
(10) Plot the ROC curve
from scikitplot.metrics import plot_roc
plot_roc(y_test, prob)
x = np.linspace(0, 1, 100)
plt.plot(x, x, 'k--', linewidth=1)  # diagonal reference line (random classifier)
plt.title('ROC Curve (Test Set)')  # plot the ROC curve
# Raw string so the backslashes in the Windows path are not treated as escapes.
plt.savefig(r"E:\工作\碩士\博客\squares1.png",
            bbox_inches="tight",
            pad_inches=1,
            transparent=True,
            facecolor="w",
            edgecolor='w',
            dpi=300,
            orientation='landscape')
Output:
[Figure: ROC Curve (Test Set)]
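If scikitplot is not available in your environment, a similar curve can be drawn with scikit-learn's own roc_curve and auc. This is an alternative sketch rather than the method used above, and it runs on synthetic data so that it is self-contained; with the real data you would pass y_test and prob[:, 1] to roc_curve instead:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the article's dataset).
X_demo, y_demo = make_classification(n_samples=500, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]  # P(class = 1) for each test sample

# False/true positive rates over all thresholds, and the area under the curve.
fpr, tpr, _ = roc_curve(y_te, scores)
roc_auc = auc(fpr, tpr)

fig, ax = plt.subplots()
ax.plot(fpr, tpr, label=f"ROC (AUC = {roc_auc:.2f})")
ax.plot([0, 1], [0, 1], "k--", linewidth=1)  # chance line
ax.set_xlabel("False positive rate")
ax.set_ylabel("True positive rate")
ax.set_title("ROC Curve (Test Set)")
ax.legend(loc="lower right")
```

The AUC summarizes the curve in one number: 0.5 corresponds to the dashed chance line and 1.0 to perfect separation.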
If you need the dataset, you can get it from Baidu Netdisk (the link is valid permanently):
Link: https://pan.baidu.com/s/1E59qYZuGhwlrx6gn4JJZTg?pwd=2138
Extraction code: 2138
More quality content is on the way; please visit my homepage to see it.
Like and follow so you can find me next time!