[Mathematical Modeling] Data Processing and Visualization


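The NumPy numerical computing library

Two basic objects
ndarray (N-dimensional Array Object): a multidimensional array that stores elements of a single data type
ufunc (Universal Function Object): a universal function that operates on arrays element by element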

Creating arrays, array attributes, and array operations

Creating arrays

Pass a list or tuple to the array function
Generate arrays with functions such as arange, linspace, and empty

Array attributes
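The attribute table that originally followed this heading did not survive extraction. As a minimal sketch, the snippet below prints the attributes most commonly inspected on an ndarray; the array a is an illustrative example, not taken from the source.

```python
import numpy as np

# Illustrative 2x3 integer array (not from the original article)
a = np.array([[1, 2, 3], [4, 5, 6]])

print(a.ndim)      # number of dimensions: 2
print(a.shape)     # size along each dimension: (2, 3)
print(a.size)      # total number of elements: 6
print(a.dtype)     # element type (the exact integer type is platform dependent)
print(a.itemsize)  # bytes per element
print(a.T.shape)   # shape of the transpose: (3, 2)
```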

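Indexing array elements
Difference between an array and a list: a list may hold elements of different types, whereas an array stores data of a single type
One-dimensional array: name[start:end:step]
Two-dimensional array: name[i, j]

Basic indexing

import numpy as np
a = np.array([2,4,8,20,16,30])
b = np.array(((1,2,3,4,5),(6,7,8,9,10),
              (10,9,1,2,3),(4,5,6,8,9.0)))
print(a[[2,3,5]])  # index a 1-D array with a list, output: [ 8 20 30]
print(a[[-1,-2,-3]])   # negative indices count from the end, output: [30 16 20]
print(b[1,2])  # element in row 2, column 3: 8.0
print(b[2])    # row 3: [10.  9.  1.  2.  3.]
print(b[2,:])  # row 3: [10.  9.  1.  2.  3.]
print(b[:,1])  # all elements of column 2: [2. 7. 9. 5.]
print(b[[2,3],1:4])  # rows 3 and 4, columns 2 through 4
print(b[1:3,1:3])    # rows 2 and 3, columns 2 and 3

Notes:
A one-dimensional array can be indexed with a list of positions
A two-dimensional array is indexed as [rows, cols]: the part before the comma selects rows and the part after it selects columns; a bare : selects every element along that axis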

Boolean indexing

from numpy import array, nan, isnan
a=array([[1, nan, 2], [4, nan, 3]])
b=a[~isnan(a)]  # extract the non-NaN values of a
print("b=",b)
print("Elements of b greater than 2:", b[b>2])

Note: the result is a one-dimensional array

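Fancy indexing
The index is itself an integer array (or list)
Indexing a one-dimensional array with an integer array returns the elements at those positions; indexing a two-dimensional array with a single integer array returns the rows with those indices
Indexing a two-dimensional array with two one-dimensional integer arrays of equal length pairs them up as (row, column) coordinates and returns the selected elements as a one-dimensional array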
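
The three cases of fancy indexing can be sketched as follows. The arrays x and m and the index arrays are illustrative values chosen for this example, not data from the source.

```python
import numpy as np

x = np.array([10, 20, 30, 40, 50])
m = np.arange(12).reshape(3, 4)   # rows: [0..3], [4..7], [8..11]

# Case 1: 1-D array indexed by an integer array -> elements at those positions
idx = np.array([0, 2, 4])
print(x[idx])        # [10 30 50]

# Case 2: 2-D array indexed by one integer array -> whole rows
rows = np.array([2, 0])
print(m[rows])       # rows 2 and 0, in that order

# Case 3: 2-D array indexed by two equal-length arrays -> paired coordinates
r = np.array([0, 1, 2])
c = np.array([3, 2, 1])
print(m[r, c])       # elements (0,3), (1,2), (2,1): [3 6 9]
```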

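Modifying arrays

import numpy as np
x = np.array([[1,2],[3,4],[5,6]])
x[2,0] = -1  # set the element in row 3, column 1 to -1
y=np.delete(x,2,axis=0)   # delete row 3
z=np.delete(y,0, axis=1)  # delete column 1
t1=np.append(x,[[7,8]],axis=0) # append a row
t2=np.append(x,[[9],[10],[11]],axis=1) # append a column

Note: assigning through an index modifies the array in place, whereas np.delete and np.append return new arrays and leave x unchanged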

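Reshaping arrays

import numpy as np
a=np.arange(4).reshape(2,2)  # creates [[0,1],[2,3]]
b=np.arange(4).reshape(2,2)  # creates [[0,1],[2,3]]
print(a.reshape(4,),'\n',a)  # output: [0 1 2 3] and [[0,1],[2,3]]
print(b.resize(4,),'\n',b)   # output: None and [0 1 2 3]

Note: reshape takes the new shape (the size along each dimension) and returns a reshaped array without changing the original; resize changes the array in place and returns None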

Flattening arrays
Use the ravel(), flatten(), or reshape() methods

import numpy as np
a=np.arange(4).reshape(2,2)  # creates [[0,1],[2,3]]
b=np.arange(4).reshape(2,2)  # creates [[0,1],[2,3]]
c=np.arange(4).reshape(2,2)  # creates [[0,1],[2,3]]
print(a.reshape(-1),'\n',a)  # output: [0 1 2 3] and [[0,1],[2,3]]
print(b.ravel(),'\n',b)      # output: [0 1 2 3] and [[0,1],[2,3]]
print(c.flatten(),'\n',c)    # output: [0 1 2 3] and [[0,1],[2,3]]

Note: all three produce the same values. flatten() always allocates new memory and returns a copy, while ravel() and reshape(-1) return a view of the original data whenever possible

Stacking arrays
vstack() and r_[] stack arrays vertically; hstack() and c_[] stack them horizontally

import numpy as np
a=np.arange(4).reshape(2,2)  # creates [[0,1],[2,3]]
b=np.arange(4,8).reshape(2,2)  # creates [[4,5],[6,7]]
c1=np.vstack([a,b])   # vertical stacking
c2=np.r_[a,b]        # vertical stacking
d1=np.hstack([a,b])   # horizontal stacking
d2=np.c_[a,b]        # horizontal stacking

Splitting arrays
hsplit() splits an array horizontally (column-wise); vsplit() splits it vertically (row-wise)

import numpy as np
a=np.arange(4).reshape(2,2)  # creates [[0,1],[2,3]]
b=np.hsplit(a,2)  # split into 2 column blocks: [[0],[2]] and [[1],[3]]
c=np.vsplit(a,2)  # split into 2 row blocks: [[0,1]] and [[2,3]]
print(b[0],'\n',b[1],'\n',c[0],'\n',c[1])

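Array arithmetic, universal functions, and broadcasting

Arithmetic operations
Operators: +, -, *, /, %, //, **
Functions: add(), subtract(), multiply(), divide(), fmod(), modf(), power()

import numpy as np
a=np.arange(10,15); b=np.arange(5,10)
c=a+b; d=a*b  # element-wise sum and product
e1=np.modf(a/b)[0]  # fractional part of the element-wise quotient
e2=np.modf(a/b)[1]  # integer part of the element-wise quotient

Comparison operations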

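import numpy as np
a=np.array([[3,4,9],[12,15,1]])
b=np.array([[2,6,3],[7,8,12]])
print(a[a>b])  # elements of a greater than the corresponding element of b, output: [ 3  9 12 15]
print(a[a>10]) # elements of a greater than 10, output: [12 15]
print(np.where(a>10,-1,a)) # elements of a greater than 10 replaced by -1
print(np.where(a>10,-1,0)) # -1 where a is greater than 10, otherwise 0

Note: boolean indexing on a multidimensional array returns a one-dimensional array, whereas np.where preserves the original shape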

ufunc functions
Functions that operate on an array element by element

# Compare the speed of math functions and ufunc functions
import numpy as np, time, math
x=[i*0.01 for i in range(1000000)]
start=time.time()  # seconds since the epoch (1970), as a float
for (i,t) in enumerate(x): x[i]=math.sin(t)
print("math.sin:", time.time()-start)
y=np.array([i*0.01 for i in range(1000000)])
start=time.time()
y=np.sin(y)
print("numpy.sin:", time.time()-start)

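Broadcasting
The requirement on the shapes of two arrays in an element-wise operation
Compared from right to left, each pair of corresponding dimensions must be equal, or one of them must be 1

import numpy as np
a=np.arange(0, 20, 10).reshape(-1, 1)  # reshape to a single column; the number of rows is inferred
b=np.arange(0, 3)
print(a+b)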
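
To make the shape rule concrete, the sketch below checks the broadcast shape of the same two arrays and then shows a pair of shapes that violates the rule. It assumes NumPy 1.20 or later, where np.broadcast_shapes is available; the failing pair of arrays is an illustrative addition.

```python
import numpy as np

a = np.arange(0, 20, 10).reshape(-1, 1)  # shape (2, 1)
b = np.arange(0, 3)                      # shape (3,)

# Shapes are compared from the trailing dimension: (2, 1) vs (3,) -> (2, 3)
print(np.broadcast_shapes(a.shape, b.shape))  # (2, 3)
print((a + b).tolist())                       # [[0, 1, 2], [10, 11, 12]]

# Trailing dimensions 3 and 2 are neither equal nor 1, so this raises ValueError
try:
    np.ones((2, 3)) + np.ones(2)
except ValueError as err:
    print("broadcast failed:", err)
```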

Random number generation with numpy.random

Compared with Python's built-in random module, numpy.random can generate whole arrays of random numbers in one call and offers a wider range of distributions
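The function table that followed this paragraph was lost in extraction. As a rough sketch of typical usage, the snippet below draws a few kinds of random arrays with the Generator interface (np.random.default_rng); the seed, sizes, and distribution parameters are assumptions made for the example. The older module-level functions such as np.random.rand and np.random.randint remain available as well.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seed chosen only for reproducibility

u = rng.random(5)                                 # 5 uniform samples on [0, 1)
n = rng.normal(loc=0.0, scale=1.0, size=(2, 3))   # 2x3 standard normal samples
k = rng.integers(low=1, high=7, size=10)          # 10 integers from 1 to 6 (high is exclusive)
p = rng.permutation(5)                            # random permutation of 0..4

print(u.shape, n.shape, k.min(), k.max(), sorted(p.tolist()))
```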

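Reading and writing text and binary files

File formats fall into two kinds: binary and text
Binary files are further divided into NumPy's own formatted binary files and unformatted (raw) binary files

Text files
The savetxt() and loadtxt() functions

import numpy as np
a=np.arange(0,3,0.5).reshape(2,3)  # a 2×3 array
np.savetxt("Pdata2_18_1.txt", a)  # by default values are written in '%.18e' format, separated by spaces
b=np.loadtxt("Pdata2_18_1.txt")  # returns a float array
print("b=",b)
np.savetxt("Pdata2_18_2.txt", a, fmt="%d", delimiter=",")  # write as integers (fractional parts are dropped), comma-separated
c=np.loadtxt("Pdata2_18_2.txt",delimiter=",")  # the delimiter must also be given when reading
print("c=",c)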

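Raw binary files
The tofile() method and the fromfile() function

import numpy as np
a=np.arange(6).reshape(2,3)
a.tofile('Pdata2_22.bin')  # writes raw bytes only; shape and dtype are not stored
b=np.fromfile('Pdata2_22.bin',dtype=a.dtype).reshape(2,3)  # dtype and shape must be supplied again
print(b)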

NumPy's own binary format
The save() and load() functions
savez() stores several arrays in a single file

import numpy as np
a=np.arange(6).reshape(2,3)
np.save("Pdata2_23_1.npy",a)
b=np.load("Pdata2_23_1.npy")
c=np.arange(6,12).reshape(2,3)
d=np.sin(c)
np.savez("Pdata2_23_2.npz",c,d)
e=np.load("Pdata2_23_2.npz")
f1=e["arr_0"]  # the first saved array
f2=e["arr_1"]  # the second saved array

File operations

File types
Text file: each byte stores an ASCII code that represents one character
Binary file: the in-memory representation of the data is written to disk unchanged

Basic file operations

Opening a file
Use the open() function, which returns a file object for the given file
file_object = open(filename[, mode[, buffering]])
Note: when a file is opened in a with statement, the explicit close step can be omitted; the file is closed automatically when the block ends

File object attributes

File object methods

Closing a file
file_object.close()
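
The tables of open modes, attributes, and methods did not survive extraction. The following is a small sketch of the usual write-then-read cycle using the with statement; the file name demo.txt and its contents are hypothetical and live in a temporary directory.

```python
import os
import tempfile

# Hypothetical file name inside a temporary directory, used only for this sketch
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# "w" opens for writing; the file is closed automatically when the block ends
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\n")
    f.writelines(["second line\n", "third line\n"])

# "r" opens for reading; iterating over the file object yields one line at a time
with open(path, "r", encoding="utf-8") as f:
    lines = [line.rstrip("\n") for line in f]

print(lines)                       # ['first line', 'second line', 'third line']
print(f.name == path, f.mode, f.closed)   # attributes: file name, mode 'r', closed is True
```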

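File management functions

Listing files and directories
os.listdir("directory")
Renaming a file
os.rename("current_name", "new_name")
Directory operations
Create a directory: os.mkdir("new_directory")
Change the working directory: os.chdir("new_directory")
Show the current working directory: os.getcwd()
Remove an empty directory: os.rmdir("directory") (rmdir() can only remove an empty directory, so all of its contents must be deleted first)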
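
The calls above can be combined as in the sketch below, which works inside a temporary scratch directory so that nothing outside it is touched. The directory and file names are hypothetical, and os.remove (not listed above) is used to empty the directory before rmdir().

```python
import os
import tempfile

base = tempfile.mkdtemp()            # scratch directory for the demonstration
start = os.getcwd()                  # remember the current working directory

os.chdir(base)                       # change the working directory
os.mkdir("data")                     # create a new directory
with open(os.path.join("data", "old.txt"), "w") as f:
    f.write("demo")
os.rename(os.path.join("data", "old.txt"), os.path.join("data", "new.txt"))
listing = os.listdir("data")
print(listing)                       # ['new.txt']

os.remove(os.path.join("data", "new.txt"))  # rmdir() requires an empty directory
os.rmdir("data")
remaining = os.listdir(".")          # the scratch directory is empty again
os.chdir(start)                      # restore the original working directory
os.rmdir(base)
```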

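The Pandas data-processing library

One of the most powerful data analysis and exploration tools in Python
Supports SQL-like operations and time series analysis
Handles data preprocessing tasks such as type conversion, missing-value handling, descriptive statistics, and data aggregation
The most important classes in Pandas are Series and DataFrame

Computing summary statistics
Arithmetic mean: mean()
Standard deviation: std()
Covariance matrix: cov()
Variance: var()
Overview of the data: describe()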
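
A quick sketch of these methods on a small DataFrame follows. The data are made up for illustration; note that pandas uses the sample (ddof=1) definitions of standard deviation and variance by default.

```python
import pandas as pd

# Illustrative data (not from the original article)
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0],
                   "y": [2.0, 4.0, 6.0, 8.0]})

print(df.mean())      # column means: x 2.5, y 5.0
print(df.std())       # sample standard deviation of each column
print(df.var())       # sample variance of each column
print(df.cov())       # covariance matrix of the columns
print(df.describe())  # count, mean, std, min, quartiles, max
```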

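Series and DataFrame

Series: a labeled one-dimensional array
DataFrame: a labeled two-dimensional array
Panel: multidimensional data, typically different measurements at different times (Panel was removed in pandas 0.25; a DataFrame with a MultiIndex is the usual replacement)

Constructing a Series
A Series can be built from a list or tuple of same-typed values, from a dictionary, from a one-dimensional NumPy array, or from a column of a DataFrame

import pandas as pd
import numpy as np
s1=pd.Series(np.array([10.5,20.5,30.5]))  # build a Series from an array
s2=pd.Series({"Beijing":10.5,"Shanghai":20.5,"Guangdong":30.5})  # build a Series from a dictionary
s3= pd.Series([10.5,20.5,30.5],index=['b','c','d'])   # supply the row labels explicitly
print(s1); print("--------------"); 
print(s2)
print("--------------"); 
print(s3)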

Indexing and computing with a Series

import pandas as pd
import numpy as np
s=pd.Series([10.5,20.5,98],index=['a','b','c'])
a=s['b']  # the second element of the series: 20.5
b1=np.mean(s)  # 43.0
b2=s.mean()  # the mean via the Series method: 43.0

Creating a DataFrame
DataFrame(data=2-D data[, index=row labels[, columns=column labels[, dtype=data type]]])

import pandas as pd
import numpy as np
a=np.arange(1,7).reshape(3,2)
df1=pd.DataFrame(a)
df2=pd.DataFrame(a,index=['a','b','c'], columns=['x1','x2'])
df3=pd.DataFrame({'x1':a[:,0],'x2':a[:,1]})
print(df1); 
print("---------"); 
print(df2)
print("---------"); 
print(df3)

Reading and writing external files

Reading from text files and Excel workbooks
Reading text files
The read_csv() function reads txt or csv files

import pandas as pd
a=pd.read_csv("Pdata2_32.txt",sep=',',parse_dates={'birthday':[0,1,2]},
# the parse_dates dictionary parses the first three columns as a date and merges them into a new field named birthday
# (combining columns this way is deprecated as of pandas 2.2; pd.to_datetime on the loaded columns is the suggested replacement)
skiprows=2,skipfooter=2,comment='#',thousands='&',engine='python')
print(a)

Reading Excel files

import pandas as pd
a=pd.read_excel("Pdata2_33.xlsx",usecols=range(1,4))  # read columns 2 through 4
b=a.values  # the underlying data as a NumPy array
c=a.describe()  # descriptive statistics
print(c)

Writing data to an Excel file

# Read the data in Pdata2_33.xlsx,
# then write it to two sheets, "sheet1" and "sheet2", of another file, Pdata2_34.xlsx
import pandas as pd
import numpy as np
a=pd.read_excel("Pdata2_33.xlsx",usecols=range(1,4)) # read columns 2 through 4
b=a.values  # the underlying data
# build a DataFrame
c=pd.DataFrame(b,index=np.arange(1,11),columns=["用戶A","用戶B","用戶C"])
with pd.ExcelWriter('Pdata2_34.xlsx') as f:  # the file is saved when the block exits (ExcelWriter.save() was removed in pandas 2.0)
    c.to_excel(f,sheet_name="sheet1")  # write c to the first sheet
    c.to_excel(f,sheet_name="sheet2")  # write c again to a second sheet

Selecting subsets of data
The iloc and loc indexers select data in the form [rows_select, cols_select]
iloc selects by integer row and column position
loc selects by row and column label

# Read the first 6 values for users A and B
import pandas as pd
import numpy as np
a=pd.read_excel("Pdata2_33.xlsx",usecols=range(1,4)) # read columns 2 through 4
b1=a.iloc[np.arange(6),[0,1]]  # select by position
b2=a.loc[np.arange(6),["用戶A","用戶B"]]  # select by label

Visualization with Matplotlib

A powerful data visualization library for Python, with an interface similar to MATLAB

Basic usage

Four kinds of container objects
Figure: the overall figure, including its size and position
Axes: the plotting area, including its position and the plotting commands
Axis: the settings of an individual coordinate axis
Tick: the formatting and style of tick marks

Line plots
plot(x, y, s): x holds the $x$ coordinates, y holds the $y$ coordinates, and s is a format string specifying line color, line style, and marker shape
plot(x, y, linestyle, linewidth, color, marker, markersize, markeredgecolor, markerfacecolor, markeredgewidth, label, alpha)

Common plotting functions
pie(): pie chart
bar(): bar chart
hist(): histogram
scatter(): scatter plot

Importing the module
Option 1: import matplotlib.pyplot as plt or from matplotlib import pyplot as plt; plotting functions are then called as plt.plot()
Option 2: from matplotlib.pyplot import *; plotting functions are then called as plot()

Plotting steps
(1) Import the matplotlib.pyplot module.
(2) Set up the data and parameters for the plot.
(3) Call plotting functions from matplotlib.pyplot such as plot(), pie(), bar(), hist(), and scatter().
(4) Set the x-axis, y-axis, title, grid lines, legend, and so on.
(5) Call show() to display the figure.
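
The five steps can be sketched with a single sine curve as below. The data, labels, and title are illustrative, and the Agg backend is selected only as an assumption so that the script also runs on a machine without a display; drop that line for interactive use.

```python
import matplotlib
matplotlib.use("Agg")              # assumption: non-interactive backend for headless runs
import matplotlib.pyplot as plt    # step 1: import the module
import numpy as np

x = np.linspace(0, 2 * np.pi, 100) # step 2: prepare the data
y = np.sin(x)

fig, ax = plt.subplots()
line, = ax.plot(x, y, "r-", label="sin(x)")  # step 3: draw the curve

ax.set_xlabel("x")                 # step 4: axis labels, title, grid, legend
ax.set_ylabel("y")
ax.set_title("A sine curve")
ax.grid(True)
ax.legend()

plt.show()                         # step 5: display (nothing is shown under the Agg backend)
```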

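Displaying Chinese characters and minus signs

from matplotlib import rcParams
rcParams['font.sans-serif']=['SimHei']   # needed to render Chinese labels correctly
rcParams['axes.unicode_minus']=False  # needed to render minus signs correctly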

Visualization examples

Scatter plot

import numpy as np
from matplotlib.pyplot import *
x=np.array(range(8))
y='27.0 26.8 26.5 26.3 26.1 25.7 25.3 24.8'  # data pasted in as text
y=np.array(y.split(), dtype=float)  # split on whitespace and convert to floats (avoids eval)
scatter(x,y)
savefig('figure2_23.png',dpi=500); show()

Several curves in one figure

import numpy as np
from matplotlib.pyplot import *
x=np.linspace(0,2*np.pi,200)
y1=np.sin(x); y2=np.cos(pow(x,2))
rc('font',size=16); rc('text', usetex=True)  # render text with TeX (requires a LaTeX installation)
plot(x,y1,'r',label='$sin(x)$',linewidth=2)  # labels are typeset as LaTeX formulas
plot(x,y2,'b--',label='$cos(x^2)$')
xlabel('$x$')
ylabel('$y$',rotation=0)
legend()  # add the legend before saving so that it appears in the saved file
savefig('figure2_38.png',dpi=500)
show()

Several plots in separate subplots

import numpy as np
from matplotlib.pyplot import *
x=np.linspace(0,2*np.pi,200)
y1=np.sin(x)
y2=np.cos(x)
y3=np.sin(x*x)
rc('font',size=16)
rc('text', usetex=True)  # render text with TeX (requires a LaTeX installation)
ax1=subplot(2,2,1)  # create subplot 1 (top left)
ax1.plot(x,y1,'r',label='$sin(x)$') # draw the curve
legend()  # add a legend
ax2=subplot(2,2,2)  # create subplot 2 (top right)
ax2.plot(x,y2,'b--',label='$cos(x)$')
legend()
ax3=subplot(2,1,2)  # create the bottom subplot of a 2-row, 1-column grid
ax3.plot(x,y3,'k--',label='$sin(x^2)$')
legend()
savefig('figure2_39.png',dpi=500)
show()

Curves in three-dimensional space

from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt
import numpy as np
ax=plt.axes(projection='3d')  # create three-dimensional axes
z=np.linspace(0, 100, 1000)
x=np.sin(z)*z; 
y=np.cos(z)*z
ax.plot3D(x, y, z, 'k')
plt.savefig('figure2_40.png',dpi=500); 
plt.show()

Three-dimensional surfaces

from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt
import numpy as np
x=np.linspace(-6,6,30)
y=np.linspace(-6,6,30)
X,Y=np.meshgrid(x, y)
Z= np.sin(np.sqrt(X ** 2 + Y ** 2))
ax1=plt.subplot(1,2,1,projection='3d')
ax1.plot_surface(X, Y, Z,cmap='viridis')
ax1.set_xlabel('x'); ax1.set_ylabel('y'); ax1.set_zlabel('z')
ax2=plt.subplot(1,2,2,projection='3d'); 
ax2.plot_wireframe(X, Y, Z,color='c')
ax2.set_xlabel('x'); ax2.set_ylabel('y'); ax2.set_zlabel('z')
plt.savefig('figure2_41.png',dpi=500); plt.show()

Contour plots
Draw the contour lines and the three-dimensional surface of a region

from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt
import numpy as np
z=np.loadtxt("Pdata2_42.txt")  # load the elevation data
x=np.arange(0,1500,100)
y=np.arange(1200,-10,-100)
contr=plt.contour(x,y,z); plt.clabel(contr)  # draw the contour lines and label them
plt.xlabel('$x$'); plt.ylabel('$y$',rotation=0)
plt.savefig('figure2_42_1.png',dpi=500)
plt.figure()  # create a new figure
ax=plt.axes(projection='3d') # create three-dimensional axes in that figure
X,Y=np.meshgrid(x,y)
ax.plot_surface(X, Y, z,cmap='viridis')
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
plt.savefig('figure2_42_2.png',dpi=500); plt.show()

Combined visualization examples

Use the subplot function to combine several kinds of plots in one figure

import numpy as np
from matplotlib.pyplot import *
x=np.linspace(0,2*np.pi,200)
y1=np.sin(x); y2=np.cos(x); y3=np.sin(x*x); y4=x*np.sin(x)
rc('font',size=16); rc('text', usetex=True)  # render text with TeX (requires a LaTeX installation)
ax1=subplot(2,3,1)  # create subplot 1 (top left)
ax1.plot(x,y1,'r',label='$sin(x)$') # draw the curve
legend()  # add a legend
ax2=subplot(2,3,2)  # create subplot 2
ax2.plot(x,y2,'b--',label='$cos(x)$'); legend() 
ax3=subplot(2,3,(3,6))  # merge cells 3 and 6
ax3.plot(x,y3,'k--',label='$sin(x^2)$'); legend()
ax4=subplot(2,3,(4,5))  # merge cells 4 and 5
ax4.plot(x,y4,'k--',label='$xsin(x)$'); legend()
savefig('figure2_44.png',dpi=500); show()


import numpy as np
import pandas as pd
from matplotlib.pyplot import *
a=pd.read_excel("Trade.xlsx")
a['year']=a.Date.dt.year  # add a transaction-year field
a['month']=a.Date.dt.month  # add a transaction-month field
rc('font',family='SimHei')  # needed if labels coming from the data are in Chinese
ax1=subplot(2,3,1)   # first subplot
Class_Counts=a.Order_Class[a.year==2012].value_counts()
Class_Percent=Class_Counts/Class_Counts.sum()
ax1.set_aspect(aspect='equal')  # equal scaling on both axes
ax1.pie(Class_Percent,labels=Class_Percent.index,
        autopct="%.1f%%")  # show formatted percentages
ax1.set_title("Share of orders by class, 2012")
ax2=subplot(232)  # second subplot
# total sales for each month of 2012
Month_Sales=a[a.year==2012].groupby(by='month').aggregate({'Sales':'sum'})
# plot with Pandas
Month_Sales.plot(title="Monthly sales trend, 2012",ax=ax2, legend=False)
ax2.set_xlabel('')
ax3=subplot(2,3,(3,6))
cost=a['Trans_Cost'].groupby(a['Transport'])
ts = list(cost.groups.keys())
dd = [cost.get_group(t) for t in ts]  # one entry per group; a list is used because groups may differ in length
boxplot(dd); gca().set_xticklabels(ts)
ax4=subplot(2,3,(4,5))
hist(a.Sales[a.year==2012],bins=40, density=True)
ax4.set_title("Distribution of sales, 2012")
ax4.set_xlabel("Sales")
savefig("figure2_45.png"); show()

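Introduction to the scipy.stats module

Random variables and distributions

The scipy.stats module provides random variables for many probability distributions; they are either continuous or discrete
Continuous random variables are instances of classes derived from rv_continuous
Discrete random variables are instances of classes derived from rv_discrete

Continuous random variables

# List all continuous random variables in scipy.stats
from scipy import stats
[k for k, v in stats.__dict__.items() if isinstance(v, stats.rv_continuous)]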

Common methods:
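The table of methods that belonged here was lost in extraction. As a hedged illustration, the sketch below calls the methods most often used on a continuous random variable; the exponential distribution, its scale, and the sample size are assumptions chosen for the example.

```python
from scipy import stats

# Exponential distribution with scale=2 (mean 2), chosen only as an illustration
X = stats.expon(scale=2)

print(X.pdf(1.0))               # probability density at x = 1
print(X.cdf(2.0))               # P(X <= 2) = 1 - exp(-1)
print(X.sf(2.0))                # survival function, 1 - cdf
print(X.ppf(0.5))               # quantile function (inverse cdf): the median
print(X.stats(moments="mv"))    # mean and variance

sample = X.rvs(size=1000, random_state=0)       # 1000 random variates
loc, scale = stats.expon.fit(sample, floc=0)    # maximum-likelihood fit with loc fixed at 0
print(loc, scale)
```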

Common probability density functions:

Main functions for the normal distribution:
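The original table is missing, so the following is only a short sketch of the functions typically used with scipy.stats.norm; the parameters mu and sigma are illustrative.

```python
from scipy.stats import norm

mu, sigma = 0.0, 1.0                    # illustrative parameters (standard normal)

print(norm.pdf(0, mu, sigma))           # density at 0, equal to 1/sqrt(2*pi)
print(norm.cdf(1.96, mu, sigma))        # P(X <= 1.96), approximately 0.975
print(norm.ppf(0.975, mu, sigma))       # 97.5% quantile, approximately 1.96
print(norm.rvs(mu, sigma, size=3, random_state=0))  # three random variates
```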

Discrete random variables

# List all discrete random variables in scipy.stats
from scipy import stats
[k for k, v in stats.__dict__.items() if isinstance(v, stats.rv_discrete)]

Visualizing probability density functions

An example with the Gamma distribution

# Four Gamma density curves
from pylab import plot, legend, xlabel, ylabel, savefig, show, rc
from scipy.stats import gamma
from numpy import linspace
x=linspace(0,15,100); rc('font',size=15); rc('text', usetex=True) 
plot(x,gamma.pdf(x,4,0,2),'r*-',label="$\\alpha=4, \\beta=2$")
plot(x,gamma.pdf(x,4,0,1),'bp-',label="$\\alpha=4, \\beta=1$")
plot(x,gamma.pdf(x,4,0,0.5),'.k-',label="$\\alpha=4, \\beta=0.5$")
plot(x,gamma.pdf(x,2,0,0.5),'>g-',label="$\\alpha=2, \\beta=0.5$")
legend(); xlabel('$x$'); ylabel('$f(x)$')
savefig("figure2_46.png",dpi=500); show()

Normal distribution

# Plots of four normal distributions
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm
mu0 = [-1, 0]; s0 = [0.5, 1]
x = np.linspace(-7, 7, 100); plt.rc('font',size=15)
plt.rc('text', usetex=True); plt.rc('axes',unicode_minus=False)
f, ax = plt.subplots(len(mu0), len(s0), sharex=True, sharey=True)
for i in range(2):
    for j in range(2):
        mu = mu0[i]; s = s0[j]
        y = norm(mu, s).pdf(x)
        ax[i,j].plot(x, y)
        # plot a single point only to attach a legend entry showing mu and sigma
        ax[i,j].plot(1,0,label="$\\mu$ = {:3.2f}\n$\\sigma$ = {:3.2f}".format(mu,s))
        ax[i,j].legend(fontsize=12)
ax[1,1].set_xlabel('$x$')
ax[0,0].set_ylabel('pdf($x$)')
plt.savefig('figure2_47.png'); plt.show()

Binomial distribution

# Binomial distribution
from scipy.stats import binom
import matplotlib.pyplot as plt
import numpy as np
n, p=5, 0.4
x=np.arange(6); y=binom.pmf(x,n,p)
plt.subplot(121); plt.plot(x, y, 'ro')
plt.vlines(x, 0, y, 'k', lw=3, alpha=0.5)  # vlines(x, ymin, ymax) draws vertical lines
# lw sets the line width and alpha sets the transparency
plt.subplot(122); plt.stem(x, y)  # the use_line_collection argument was removed in Matplotlib 3.8
plt.savefig("figure2_48.png", dpi=500); plt.show()