Ministry of Agriculture and Rural Affairs of the People's Republic of China: http://www.moa.gov.cn/
- Click 数据 (Data) → click 周度数据 (Weekly Data) → the site redirects to http://zdscxx.moa.gov.cn:8080/nyb/pc/frequency.jsp
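Analysis
- Capture the page's network traffic: the getFrequencyData response contains the data we want.
- Inspect the parameters submitted with that request.
- Test the getFrequencyData URL on its own with Postman, an API testing tool: the data list in the response comes back empty.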
- Further analysis shows that updateFrequencyConditions has to be requested first, and only then getFrequencyData.
Scraping
import requests
import time
- Scrape the first page of data
url1 = 'http://zdscxx.moa.gov.cn:8080/nyb/updateFrequencyConditions'
url2 = 'http://zdscxx.moa.gov.cn:8080/nyb/getFrequencyData'
# Form parameters copied from the captured request; the Chinese values are sent to the server unchanged.
data = {
    'page': '1',
    'rows': '20',
    'type': '周度数据',                 # weekly data
    'subType': '农产品批发价格',        # wholesale prices of agricultural products
    'level': '0',
    'time': '["2019-37","2023-38"]',    # time range as captured (appears to be year-week)
    'product': '蔬菜'                   # vegetables
}
# Headers copied from the browser. The Cookie belongs to the original browser session and has probably expired; replace it with your own.
headers = {
    'Cookie': 'JSESSIONID=9EDB9C447A01905C7893BDE4C220CF65; yfx_c_g_u_id_10002896=_ck23091319002016340778405571397; yfx_f_l_v_t_10002896=f_t_1694602820630__r_t_1694602820630__v_t_1694602820630__r_c_0; _trs_uv=lmhmrkth_299_3qsk; wdcid=5dbb601a9ccf2804; wdses=369f04c5d15e94ad; _va_ref=%5B%22%22%2C%22%22%2C1694602920%2C%22http%3A%2F%2Fzdscxx.moa.gov.cn%3A8080%2F%22%5D; _va_ses=*; _va_id=34f0e583bc02483c.1694602920.1.1694602960.1694602920.; wdlast=1694603152',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.76',
    'Host': 'zdscxx.moa.gov.cn:8080',
    'Origin': 'http://zdscxx.moa.gov.cn:8080',
    'Referer': 'http://zdscxx.moa.gov.cn:8080/nyb/pc/frequency.jsp',
    'X-Requested-With': 'XMLHttpRequest'
}
s = requests.Session()
r1 = s.post(url1, data=data, headers=headers)  # step 1: submit the query conditions; <Response [200]>
r2 = s.post(url2, data=data, headers=headers)  # step 2: request the data; <Response [200]>
content = r2.json()  # parse the JSON body
data_list = content['result']['pageInfo']['table']
for item in data_list:
    v_data = {}
    v_data['时间'] = item['time']      # time
    v_data['品类'] = item['product']   # category
    v_data['指标'] = item['item']      # indicator
    v_data['地区'] = item['area']      # region
    v_data['单位'] = item['unit']      # unit
    v_data['数值'] = item['value']     # value
    print(v_data)
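The field mapping can also be pulled into a small function, which makes it possible to check the parsing without touching the network. This is only a minimal sketch: `FIELD_MAP` and `extract_rows` are hypothetical names that do not appear in the original code, the `result` → `pageInfo` → `table` layout is taken from the code above, and returning an empty list when any of those keys is missing is an assumption about the desired behaviour (it mirrors the empty list seen in the Postman test).

```python
# Hypothetical helper: relabels the six fields printed by the code above.
FIELD_MAP = {
    'time': '时间',      # time
    'product': '品类',   # category
    'item': '指标',      # indicator
    'area': '地区',      # region
    'unit': '单位',      # unit
    'value': '数值',     # value
}


def extract_rows(content):
    """Return one dict per table row, or [] if the table is missing or empty."""
    result = content.get('result') or {}
    page_info = result.get('pageInfo') or {}
    table = page_info.get('table') or []
    # Missing fields become None instead of raising KeyError.
    return [
        {label: item.get(key) for key, label in FIELD_MAP.items()}
        for item in table
    ]
```

With the variables above, `extract_rows(r2.json())` would yield the same dictionaries that the loop prints.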
- To scrape every page, only the page value in data needs to change
url1 = 'http://zdscxx.moa.gov.cn:8080/nyb/updateFrequencyConditions'
url2 = 'http://zdscxx.moa.gov.cn:8080/nyb/getFrequencyData'
# Headers copied from the browser. The Cookie belongs to the original browser session and has probably expired; replace it with your own.
headers = {
    'Cookie': 'JSESSIONID=9EDB9C447A01905C7893BDE4C220CF65; yfx_c_g_u_id_10002896=_ck23091319002016340778405571397; yfx_f_l_v_t_10002896=f_t_1694602820630__r_t_1694602820630__v_t_1694602820630__r_c_0; _trs_uv=lmhmrkth_299_3qsk; wdcid=5dbb601a9ccf2804; wdses=369f04c5d15e94ad; _va_ref=%5B%22%22%2C%22%22%2C1694602920%2C%22http%3A%2F%2Fzdscxx.moa.gov.cn%3A8080%2F%22%5D; _va_ses=*; _va_id=34f0e583bc02483c.1694602920.1.1694602960.1694602920.; wdlast=1694603152',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.76',
    'Host': 'zdscxx.moa.gov.cn:8080',
    'Origin': 'http://zdscxx.moa.gov.cn:8080',
    'Referer': 'http://zdscxx.moa.gov.cn:8080/nyb/pc/frequency.jsp',
    'X-Requested-With': 'XMLHttpRequest'
}
s = requests.Session()  # one session reused for every page
for page in range(1, 11):  # 10 pages in total
    data = {
        'page': page,
        'rows': '20',
        'type': '周度数据',                 # weekly data
        'subType': '农产品批发价格',        # wholesale prices of agricultural products
        'level': '0',
        'time': '["2019-37","2023-38"]',    # time range as captured (appears to be year-week)
        'product': '蔬菜'                   # vegetables
    }
    r1 = s.post(url1, data=data, headers=headers)  # step 1: submit the query conditions
    r2 = s.post(url2, data=data, headers=headers)  # step 2: request the data
    content = r2.json()  # parse the JSON body
    data_list = content['result']['pageInfo']['table']
    for item in data_list:
        v_data = {}
        v_data['时间'] = item['time']      # time
        v_data['品类'] = item['product']   # category
        v_data['指标'] = item['item']      # indicator
        v_data['地区'] = item['area']      # region
        v_data['单位'] = item['unit']      # unit
        v_data['数值'] = item['value']     # value
        print(v_data)
    time.sleep(5)  # pause between pages to go easy on the server
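The page count of 10 is hard-coded for this particular query. One possible alternative is to keep requesting pages until one comes back empty. The sketch below rests on an unverified assumption: that the server returns an empty table for a page number past the end. `iter_pages` and `fetch_table` are hypothetical names; `fetch_table` stands for any function that takes a page number and returns that page's list of rows.

```python
import time


def iter_pages(fetch_table, max_pages=100, delay=5.0):
    """Yield rows page by page, stopping at the first empty page.

    fetch_table: callable taking a 1-based page number and returning a list of rows.
    max_pages:   safety cap so the loop always terminates.
    delay:       seconds to wait before requesting the next page.
    """
    for page in range(1, max_pages + 1):
        rows = fetch_table(page)
        if not rows:
            break  # assumed end-of-data signal: an empty page
        yield from rows
        time.sleep(delay)
```

Here `fetch_table` could wrap the two POST requests from the loop above and return `content['result']['pageInfo']['table']` for the given page.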