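Preface
This article covers two things:
1. How to fetch weather data: the method, the steps, the problems encountered, and how they were solved
2. How to clean the fetched weather data and present it visually
In short, a crawler saves the weather information from a website to a file, and the text data in that file is then processed and presented as charts.
I. The crawler
1. Choosing the URL
(1) Selecting a site
- To fetch data from a website, the first step is to find the URL that actually holds the information. A weather website is used here.
- According to the site's robots.txt, crawling this site is permitted.
(2) Analysis
- The home page does not contain the bulk weather data we need.
- Consideration: what we want is the weather for a specific city over a specific range of days.
- From the home page, find the following URL (using Chengdu as the example): xxx/weather40d/101270101.shtml
- On this site, clicking the 40-day forecast makes the first path segment of the URL weather40d.
- Clicking the other tabs shows, in order: today (1d), 7 days (no suffix, a pitfall to watch for), and 8-15 days (15d).
- In addition, the number at the end of the path identifies the city: 101270101 is Chengdu and 101110101 is Xi'an.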
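The URL pattern described above can be captured in a small helper. This is only a sketch: the names `build_forecast_url` and `PATH_BY_RANGE` are hypothetical, the host is the one used in the request examples later in this article, and the `weather1d` and `weather15d` segments are inferred from the tab names rather than verified.

```python
# Hypothetical helper illustrating the URL pattern described above.
BASE = "http://www.weather.com.cn"

# Path segment for each forecast range (assumed from the tab names);
# note that the 7-day page uses the bare "weather" segment.
PATH_BY_RANGE = {
    "1d": "weather1d",
    "7d": "weather",
    "15d": "weather15d",
    "40d": "weather40d",
}


def build_forecast_url(city_code: str, forecast_range: str) -> str:
    """Return the forecast page URL for a city code and a forecast range."""
    segment = PATH_BY_RANGE[forecast_range]
    return f"{BASE}/{segment}/{city_code}.shtml"


print(build_forecast_url("101270101", "40d"))  # Chengdu, 40-day page
print(build_forecast_url("101110101", "7d"))   # Xi'an, 7-day page
```

Keeping the range-to-segment mapping in one table makes the 7-day exception explicit instead of hiding it in string formatting.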
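2. Fetching Chengdu's 7-day weather
(1) Requesting the page with Chengdu's weather for the next 7 days
- First send a request and check that the page responds normally.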
# coding:utf-8
import requests


def get_data(url):
    # Send a browser-like User-Agent with the request
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36"
    }
    r = requests.get(url=url, headers=headers)
    if r.status_code == 200:
        print('Request succeeded')
    else:
        print('Request failed')


URL = 'http://www.weather.com.cn/weather7d/101270101.shtml'
get_data(URL)
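- The output (a screenshot in the original post) confirms that the page can be requested normally.
- Everything looks fine so far, so we can start fetching Chengdu's weather for the next 7 days.
(2) Fetching Chengdu's 7-day weather
The weather information is stored directly in the page source, so we only need to fetch the source and parse it to get the data.
a. Getting the page source via the text attribute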
# coding:utf-8
import requests


def get_data(url):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36"
    }
    r = requests.get(url=url, headers=headers)
    if r.status_code == 200:
        # Set the response encoding
        r.encoding = 'UTF-8'
        # Return the page source via the text attribute
        return r.text
    else:
        return 'Request failed'


URL = 'http://www.weather.com.cn/weather7d/101270101.shtml'
print(get_data(URL))
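b. Parsing the page source
- The information we need sits under the div tag with class c7d and id 7d. That div contains a single ul (list) tag, and the ul holds the 7 days of weather as 7 li tags.
- In short: find the div with id 7d, then the ul inside it, then iterate over its li tags and extract the information.
c. Extracting the data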
from bs4 import BeautifulSoup

URL = 'http://www.weather.com.cn/weather/101270101.shtml'
# Call get_data (defined above) to fetch the page source
html_code = get_data(URL)
soup = BeautifulSoup(html_code, "html.parser")
div = soup.find("div", id="7d")
# The div can also be located like this:
# div = soup.find('div', attrs={'id': '7d', 'class': 'c7d'})
ul = div.find("ul")
lis = ul.find_all("li")
# This line prints the time at which the site last updated its data
# print(soup.find("div", id='around').find("h1").find("i").text)
li_today = lis[0]
# When the site is visited in the evening, today's entry has no high temperature,
# so today's row is built by hand instead of inside the loop
date_today = li_today.find('h1').text  # date
wea_today = li_today.find('p', class_="wea").text  # weather
tem_h_today = 'NONE'  # placeholder for the missing high temperature
tem_l_today = li_today.find('p', class_="tem").find("i").text  # low temperature
spans_today = li_today.find('p', attrs={"class": "win"}).find_all("span")
win1_today = ''  # wind direction(s); '且' ("and") is used as the separator
for s in spans_today:
    win1_today += s.get('title') + '且'
win2_today = li_today.find('p', attrs={"class": "win"}).find("i").text  # wind force
weather_today = [date_today, wea_today, tem_h_today, tem_l_today, win1_today + win2_today]
weather_all = []
# Add the data for the remaining 6 days
for li in lis[1:]:
    date = li.find('h1').text  # date
    wea = li.find('p', class_="wea").text  # weather
    tem_h = li.find('p', class_="tem").find("span").text  # high temperature
    tem_l = li.find('p', class_="tem").find("i").text  # low temperature
    span = li.find('p', attrs={"class": "win"}).find("span")  # find_all is not needed here
    win1 = span.get('title') + '且'  # wind direction
    win2 = li.find('p', attrs={"class": "win"}).find("i").text  # wind force
    weather = [date, wea, tem_h, tem_l, win1 + win2]
    weather_all.append(weather)
# Insert today's row at the front
weather_all.insert(0, weather_today)
print(weather_all)
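- This yields the data for all 7 days.
3. Fetching Chengdu's 40-day weather
(1) Analyzing the site
- 7 days of weather is too little; we need more data.
- Applying the 7-day approach to the 40-day page turns up none of the weather data we need.
- Comparing the two views: the weather data is visible in the browser's Inspect panel, but it is absent from the page source.
- Weather data as shown in the Inspect panel (screenshot in the original post):
- Weather data as shown in the page source (screenshot in the original post):
- The comparison shows that the weather data is blank in the page source, which tells us it is rendered dynamically. We therefore need to find the file the dynamic page loads its data from.
(2) Where the dynamic page keeps its data
- Data for a dynamic page is usually served as a separate JSON file that the page requests, so we only need to locate that file.
- Still in the Inspect panel (right-click, Inspect), open the Network tab.
- It is empty at first because the page has already finished rendering; refresh the page to capture the requests again.
- Filtering by XHR to look for a JSON file shows nothing.
- Filtering by JS instead reveals the file we need.
(3) Trying to fetch the dynamic data
- Find the URL to request in the Headers tab of that entry.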
# coding:utf-8
import requests


def get_data(web_url):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36"
    }
    r = requests.get(url=web_url, headers=headers)
    if r.status_code == 200:
        # Return the JSON-decoded content of the response
        weather_data = r.json()
        return weather_data
    else:
        return 'Request failed!'


url = 'http://d1.weather.com.cn/calendar_new/2022/101270101_202207.html'
data = get_data(url)
print(data)
- This approach does not succeed: the request returns status code 403.
a. Test 1: use a random User-Agent (this approach fails)
# coding:utf-8
import requests
import random


def get_data(web_url):
    # Pool of User-Agent strings to choose from
    my_headers = [
        "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36",
        "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/537.75.14",
        "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; Win64; x64; Trident/6.0)",
        'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11',
        'Opera/9.25 (Windows NT 5.1; U; en)',
        'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)',
        'Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Kubuntu)',
        'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.12) Gecko/20070731 Ubuntu/dapper-security Firefox/1.5.0.12',
        'Lynx/2.8.5rel.1 libwww-FM/2.14 SSL-MM/1.4.1 GNUTLS/1.2.9',
        "Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.7 (KHTML, like Gecko) Ubuntu/11.04 Chromium/16.0.912.77 Chrome/16.0.912.77 Safari/535.7",
        "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:10.0) Gecko/20100101 Firefox/10.0"
    ]
    # Pick one User-Agent at random for this request
    random_header = random.choice(my_headers)
    headers = {
        'User-Agent': random_header
    }
    r = requests.get(url=web_url, headers=headers)
    if r.status_code == 200:
        # Return the JSON-decoded content of the response
        html_data = r.json()
        return html_data
    else:
        return 'Fetch failed!'


url = 'http://d1.weather.com.cn/calendar_new/2022/101270101_202207.html'
data = get_data(url)
print(data)
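- It still fails.
b. Test 2: set the Referer header (this approach succeeds)
- The request headers shown in the browser include a Referer field, which states the page the current request was navigated from. The server checks it to block malicious requests, so adding it lets the request go through normally.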
# coding:utf-8
import requests
import random
import json


def get_data(web_url):
    # Pool of User-Agent strings to choose from
    my_headers = [
        "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36",
        "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/537.75.14",
        "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; Win64; x64; Trident/6.0)",
        'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11',
        'Opera/9.25 (Windows NT 5.1; U; en)',
        'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)',
        'Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Kubuntu)',
        'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.12) Gecko/20070731 Ubuntu/dapper-security Firefox/1.5.0.12',
        'Lynx/2.8.5rel.1 libwww-FM/2.14 SSL-MM/1.4.1 GNUTLS/1.2.9',
        "Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.7 (KHTML, like Gecko) Ubuntu/11.04 Chromium/16.0.912.77 Chrome/16.0.912.77 Safari/535.7",
        "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:10.0) Gecko/20100101 Firefox/10.0"
    ]
    # Pick one User-Agent at random for this request
    random_header = random.choice(my_headers)
    headers = {
        # Tell the server the request comes from the weather site itself
        "Referer": "http://www.weather.com.cn/",
        'User-Agent': random_header
    }
    r = requests.get(url=web_url, headers=headers)
    if r.status_code == 200:
        content = r.content.decode(encoding='utf-8')
        # The payload starts with a JavaScript variable name; drop the first
        # 11 characters and parse only the array that follows
        weathers = json.loads(content[11:])
        return weathers
    else:
        return 'Fetch failed!'


url = 'http://d1.weather.com.cn/calendar_new/2022/101270101_202207.html'
data = get_data(url)
print(data)
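The fixed slice `content[11:]` in the listing above only works while the variable-name prefix stays exactly 11 characters long. As a sketch of a less brittle alternative, the hypothetical helper below looks for the first `[` or `{` instead. The sample payload, including the variable name `fc40`, is made up for illustration and is not real data from the site.

```python
import json


def parse_js_assignment(text: str):
    """Parse a payload such as 'var name = [...]' into Python objects."""
    # Locate the first '[' or '{', whichever comes first.
    starts = [i for i in (text.find("["), text.find("{")) if i != -1]
    if not starts:
        raise ValueError("no JSON array or object found in payload")
    body = text[min(starts):].strip()
    # Tolerate a trailing semicolon after the literal.
    if body.endswith(";"):
        body = body[:-1]
    return json.loads(body)


# Made-up sample shaped like the response described above (an assumption).
sample = 'var fc40 = [{"date": "20220701", "hmax": "33", "hmin": "24"}]'
print(parse_js_assignment(sample))
```

Inside `get_data`, this would replace `json.loads(content[11:])` with `parse_js_assignment(content)`.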
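- Looking closely, the end of this URL is the year and month, so changing that year-month value is all it takes to fetch a large amount of weather data.
4. Fetching 180 days of Chengdu weather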
def get_y_m_url():
    # List that collects the monthly URLs
    url_list = []
    # Build one URL per month (January to June 2022) with str.format
    for month_2022 in range(1, 7):
        url_2022 = 'http://d1.weather.com.cn/calendar_new/2022/101270101_20220{}.html'.format(month_2022)
        # Store each monthly URL in url_list
        url_list.append(url_2022)
    return url_list


url_list_all = get_y_m_url()
# Loop over the monthly URLs
for url in url_list_all:
    # Call get_data to fetch each month's data
    weather_data = get_data(url)
    # Print each month's data
    print(weather_data)
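The hard-coded `20220{}` pattern in the listing above only covers single-digit months of 2022. A minimal sketch of a more general builder follows. The name `month_urls` is hypothetical, and the assumption that the year directory in the path always matches the year of the month is inferred from the URLs shown in this article, not verified.

```python
def month_urls(city_code: str, start: tuple, months: int) -> list:
    """Build calendar URLs for `months` consecutive months from (year, month)."""
    year, month = start
    urls = []
    for _ in range(months):
        # Zero-pad the month so that October to December also format correctly.
        urls.append(
            "http://d1.weather.com.cn/calendar_new/"
            f"{year}/{city_code}_{year}{month:02d}.html"
        )
        month += 1
        if month > 12:  # roll over into the next year
            year, month = year + 1, 1
    return urls


for u in month_urls("101270101", (2022, 1), 6):
    print(u)
```

Called with `("101270101", (2022, 1), 6)`, it produces the same six URLs as `get_y_m_url()`.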
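II. Data processing and visualization
1. Analyzing the data
- Start by looking at the first month of data. Only four fields are actually needed: date, rainfall probability, high temperature, and low temperature.
2. Extracting the data
(1) Fetching and processing one month of data
- Fetch one month of data and tidy it up.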
# Empty list that will hold the per-day weather records
weather_info = []
url = 'http://d1.weather.com.cn/calendar_new/2022/101110801_202206.html'
# Call get_data to fetch the data
weather_data = get_data(url)
for every_day_weather in weather_data:
    # Date
    date = every_day_weather['date']
    # Rainfall probability
    rainfall_probability = every_day_weather['hgl']
    # High temperature
    tem_max = every_day_weather['hmax']
    # Low temperature
    tem_min = every_day_weather['hmin']
    # Store the four values in a dict representing one day
    one_day_weather = {'date': date, 'rainfall_probability': rainfall_probability, 'tem_max': tem_max, 'tem_min': tem_min}
    # Append each day's record to the list
    weather_info.append(one_day_weather)
print(weather_info)
- The one month of weather data obtained this way is shown as a screenshot in the original post.
(2) Processing 180 days of data
# Empty list that will hold the per-day weather records
weather_info = []
# Loop over the monthly URLs
for url in url_list_all:
    # Call get_data to fetch each month's data
    weather_data = get_data(url)
    # get_data returns an error string on failure; skip that month
    if not isinstance(weather_data, list):
        continue
    for every_day_weather in weather_data:
        # Date
        date = every_day_weather['date']
        # Rainfall probability
        rainfall_probability = every_day_weather['hgl']
        # High temperature
        tem_max = every_day_weather['hmax']
        # Low temperature
        tem_min = every_day_weather['hmin']
        # Store the four values in a dict representing one day
        one_day_weather = {'date': date, 'rainfall_probability': rainfall_probability, 'tem_max': tem_max, 'tem_min': tem_min}
        # Append each day's record to the list, skipping duplicates
        if one_day_weather not in weather_info:
            weather_info.append(one_day_weather)
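The listing above de-duplicates with a list membership test and keeps every value as a string. A sketch of an alternative is shown below: it keys the records by date, which differs slightly from the original because it compares dates only rather than whole records, and it converts the values to numbers. The helper names are hypothetical, and the value formats in the sample (`"60%"`, `"33"`, empty strings) are assumptions about what the site returns.

```python
def to_number(value):
    """Convert strings like '33' or '60%' to float; return None if blank or invalid."""
    text = str(value).strip().rstrip("%")
    try:
        return float(text)
    except ValueError:
        return None


def clean_days(raw_days):
    """Keep the four fields of interest, one row per date, in first-seen order."""
    by_date = {}
    for day in raw_days:
        date = day["date"]
        if date in by_date:
            continue  # date already collected from an overlapping month
        by_date[date] = {
            "date": date,
            "rainfall_probability": to_number(day.get("hgl", "")),
            "tem_max": to_number(day.get("hmax", "")),
            "tem_min": to_number(day.get("hmin", "")),
        }
    return list(by_date.values())


# Made-up sample records (not real data from the site).
sample = [
    {"date": "20220630", "hgl": "60%", "hmax": "33", "hmin": "24"},
    {"date": "20220630", "hgl": "60%", "hmax": "33", "hmin": "24"},
    {"date": "20220701", "hgl": "", "hmax": "35", "hmin": ""},
]
print(clean_days(sample))
```

A dict lookup keeps the duplicate check constant-time per record, and the `None` values make missing readings explicit for later steps.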
3. Saving the data
- Save the 180 days of data in CSV format.
import csv


# Save the weather data to a CSV file
def save_csv(weather_data):
    # Column headers
    fieldnames = ['date', 'rainfall_probability', 'tem_max', 'tem_min']
    # UTF-8-SIG writes a BOM so the file opens cleanly in Excel;
    # newline='' leaves line endings to the csv module
    with open('weather_info.csv', 'w', encoding='UTF-8-SIG', newline='') as csv_file:
        # Create the DictWriter
        dict_writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        # Write the header row
        dict_writer.writeheader()
        # Write all data rows
        dict_writer.writerows(weather_data)


# weather_info is the de-duplicated list built in the previous step
save_csv(weather_info)
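The chart code in the next section uses the variables `x`, `y_cd` and `y_sz`, but the post does not show how they are computed. One possible way to derive monthly averages from the saved rows is sketched below. It assumes that "average temperature" means the mean of each day's high and low, and that dates look like `YYYYMMDD`; neither is confirmed by the source. The name `monthly_average` is hypothetical.

```python
from collections import defaultdict


def monthly_average(rows):
    """Return (month labels, monthly mean of the daily (max + min) / 2)."""
    buckets = defaultdict(list)
    for row in rows:
        try:
            daily_mean = (float(row["tem_max"]) + float(row["tem_min"])) / 2
        except (TypeError, ValueError):
            continue  # skip days with a missing or non-numeric temperature
        # The first six characters of the date are the year and month.
        buckets[row["date"][:6]].append(daily_mean)
    months = sorted(buckets)
    averages = [round(sum(buckets[m]) / len(buckets[m]), 1) for m in months]
    return months, averages


# Made-up rows in the same shape as the CSV columns (not real data).
rows = [
    {"date": "20220101", "tem_max": "10", "tem_min": "4"},
    {"date": "20220102", "tem_max": "12", "tem_min": "6"},
    {"date": "20220201", "tem_max": "15", "tem_min": ""},
    {"date": "20220202", "tem_max": "14", "tem_min": "8"},
]
x, y_cd = monthly_average(rows)
print(x, y_cd)
```

In practice the rows could be read back with `csv.DictReader` from `weather_info.csv` (opened with `encoding='utf-8-sig'`), once for each city.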
4. Visualizing the weather data
(1) Chengdu vs. Shenzhen: average temperature comparison chart
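import pyecharts.options as opts
from pyecharts.charts import Line
from pyecharts.globals import ThemeType

# x (month labels) and y_cd / y_sz (average temperatures for Chengdu and
# Shenzhen) must be prepared beforehand; that step is not shown in this post
line = (
    Line(
        init_opts=opts.InitOpts(
            animation_opts=opts.AnimationOpts(animation_duration=5000),
            bg_color='rgba(255,250,205,0.2)',
            width='1000px',
            height='600px',
            page_title='Chengdu vs. Shenzhen: average temperature comparison',
            # Set the theme
            theme=ThemeType.MACARONS,
        )
    )
    .add_xaxis(xaxis_data=x)
    .add_yaxis(series_name="Chengdu", y_axis=y_cd, is_smooth=True)
    .add_yaxis(series_name="Shenzhen", y_axis=y_sz, is_smooth=True)
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Chengdu vs. Shenzhen: average temperature comparison"),
        xaxis_opts=opts.AxisOpts(name='Month'),
        yaxis_opts=opts.AxisOpts(name='Temperature (°C)'),
    )
    .render('compare_average_tem.html')
)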
(2) Chengdu vs. Shenzhen: high temperature comparison chart
import csv
import pyecharts.options as opts
from pyecharts.charts import Line
from pyecharts.globals import ThemeType
import numpy

# x, y_cd and y_sz hold the month labels and each city's high temperatures
line = (
    Line(
        init_opts=opts.InitOpts(
            animation_opts=opts.AnimationOpts(animation_duration=5000),
            bg_color='rgba(255,250,205,0.2)',
            width='1000px',
            height='600px',
            page_title='Chengdu vs. Shenzhen: high temperature comparison',
            theme=ThemeType.ROMANTIC,
        )
    )
    .add_xaxis(xaxis_data=x)
    .add_yaxis(series_name="Chengdu", y_axis=y_cd, is_smooth=True)
    .add_yaxis(series_name="Shenzhen", y_axis=y_sz, is_smooth=True)
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Chengdu vs. Shenzhen: high temperature comparison"),
        xaxis_opts=opts.AxisOpts(name='Month'),
        yaxis_opts=opts.AxisOpts(name='Temperature (°C)'),
    )
    .render('compare_max_tem.html')
)
(3) Chengdu vs. Shenzhen: low temperature comparison chart
import csv
import pyecharts.options as opts
from pyecharts.charts import Line
from pyecharts.globals import ThemeType
import numpy

# x, y_cd and y_sz hold the month labels and each city's low temperatures
line = (
    Line(
        init_opts=opts.InitOpts(
            animation_opts=opts.AnimationOpts(animation_duration=5000),
            bg_color='rgba(255,250,205,0.2)',
            width='1000px',
            height='600px',
            page_title='Chengdu vs. Shenzhen: low temperature comparison',
            theme=ThemeType.WESTEROS,
        )
    )
    .add_xaxis(xaxis_data=x)
    .add_yaxis(series_name="Chengdu", y_axis=y_cd, is_smooth=True)
    .add_yaxis(series_name="Shenzhen", y_axis=y_sz, is_smooth=True)
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Chengdu vs. Shenzhen: low temperature comparison"),
        xaxis_opts=opts.AxisOpts(name='Month'),
        yaxis_opts=opts.AxisOpts(name='Temperature (°C)'),
    )
    .render('compare_min_tem.html')
)
(4) Chengdu vs. Shenzhen: number of days per temperature interval
import csv
from pyecharts.globals import ThemeType
import pyecharts.options as opts
from pyecharts.charts import Bar
import pandas as pd

# x holds the temperature interval labels; cd_max_count and sz_max_count
# hold the number of days each city spent in each interval
bar = (
    Bar(
        # Use an elastic ("jelly") animation effect
        init_opts=opts.InitOpts(
            animation_opts=opts.AnimationOpts(animation_delay=500, animation_easing="elasticOut"),
            bg_color='rgba(255,250,205,0.2)',
            width='1000px',
            height='600px',
            page_title='Chengdu vs. Shenzhen: days per temperature interval',
            theme=ThemeType.INFOGRAPHIC,
        )
    )
    .add_xaxis(xaxis_data=x)
    .add_yaxis(series_name="Chengdu", y_axis=cd_max_count)
    .add_yaxis(series_name="Shenzhen", y_axis=sz_max_count)
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Chengdu vs. Shenzhen: days per temperature interval"),
        xaxis_opts=opts.AxisOpts(name='Temperature interval'),
        yaxis_opts=opts.AxisOpts(name='Days'),
    )
    .render('compare_tem_count.html')
)
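The bar chart above and the pie charts that follow take per-interval day counts (`cd_max_count`, `sz_max_count`) whose computation is not shown in the post. A minimal sketch of one way to bin daily high temperatures is given below. The interval edges and labels are hypothetical, chosen only so that there are five bins to match the five colours used in the pie charts, and `count_by_interval` is an illustrative name.

```python
from bisect import bisect_right

# Hypothetical interval edges in degrees Celsius (an assumption, not from the post).
EDGES = [10, 20, 30, 35]
# Each bin includes its lower edge and excludes its upper edge.
LABELS = ["<10", "10-20", "20-30", "30-35", ">=35"]


def count_by_interval(temps, edges=EDGES):
    """Count how many temperatures fall into each interval defined by edges."""
    counts = [0] * (len(edges) + 1)
    for t in temps:
        # bisect_right gives the index of the bin whose lower edge is <= t.
        counts[bisect_right(edges, t)] += 1
    return counts


# Made-up daily highs (not real data).
highs = [5, 10, 19.5, 20, 31, 35, 40]
print(list(zip(LABELS, count_by_interval(highs))))
```

The resulting list can serve as `cd_max_count`, with `LABELS` playing the role of `x` in the bar chart and `attr_tem_interval` in the pie charts.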
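(5) Share of daily high temperatures in Chengdu over half a year
import pyecharts.options as opts
from pyecharts.charts import Pie

# attr_tem_interval holds the interval labels and cd_max_count the days per interval
pie = (
    Pie(
        # Use an elastic ("jelly") animation effect
        init_opts=opts.InitOpts(
            animation_opts=opts.AnimationOpts(animation_delay=500, animation_easing="elasticOut"),
            bg_color='rgba(255,250,205,0.2)',
            width='1000px',
            height='600px',
            page_title='Share of daily high temperatures in Chengdu over half a year',
            # theme=ThemeType.INFOGRAPHIC
        )
    )
    .add(
        'Share of high temperatures in Chengdu over 180 days',
        list(zip(attr_tem_interval, cd_max_count)),
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Share of daily high temperatures in Chengdu over half a year"),
        legend_opts=opts.LegendOpts(pos_left='center', pos_bottom='bottom', orient="horizontal"),
    )
    # a: series name (title), b: data item name, c: value, d: percentage
    .set_series_opts(label_opts=opts.LabelOpts(formatter='{b}: {c} days ({d}%)'))
    # Set the colour of each slice
    .set_colors(['#00FFFF', '#00BFFF', '#FFD700', '#FFA500', '#FF0000'])
    .render('cd_tem_pie.html')
)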
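(6) Share of daily high temperatures in Chengdu over half a year (Nightingale rose chart)
import pyecharts.options as opts
from pyecharts.charts import Pie

pie = (
    Pie(
        # Use an elastic ("jelly") animation effect
        init_opts=opts.InitOpts(
            animation_opts=opts.AnimationOpts(animation_delay=500, animation_easing="elasticOut"),
            bg_color='rgba(255,250,205,0.2)',
            width='1000px',
            height='600px',
            page_title='Share of daily high temperatures in Chengdu (Nightingale rose chart)',
            # theme=ThemeType.INFOGRAPHIC
        )
    )
    .add(
        'Share of high temperatures in Chengdu over 180 days',
        list(zip(attr_tem_interval, cd_max_count)),
        # Whether to draw a Nightingale rose chart, which uses the radius to
        # distinguish data sizes. Two modes are available:
        # 'radius': the sector angle shows the percentage and the radius shows the size.
        # 'area': all sectors have the same angle and only the radius shows the size.
        rosetype="radius",
        # Pie radius. When given as a list, the first item is the inner radius and
        # the second the outer radius (setting both produces a ring chart).
        # Percentages are relative to half of the smaller container dimension.
        radius="55%",
        # Centre of the pie: the first item is the x coordinate, the second the y coordinate.
        # Percentages are relative to the container width and height respectively.
        center=["50%", "50%"],
    )
    .set_global_opts(
        title_opts=opts.TitleOpts(title="Share of daily high temperatures in Chengdu (Nightingale rose chart)"),
        legend_opts=opts.LegendOpts(pos_left='center', pos_bottom='bottom', orient="horizontal"),
    )
    # a: series name (title), b: data item name, c: value, d: percentage
    .set_series_opts(label_opts=opts.LabelOpts(formatter='{b}: {c} days ({d}%)'))
    # Set the colour of each slice
    .set_colors(['#00FFFF', '#00BFFF', '#FFD700', '#FFA500', '#FF0000'])
    .render('cd_tem_pie_coxcomb.html')
)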
If this article helped you, a like, follow, or bookmark is appreciated; your support is what keeps me writing. See you in the next article!
圈圈仔OvO
If you find any mistakes in this post, corrections are very welcome.