How do you scrape web page results with Python and save them to an Excel file?

When writing a web scraper, a common question is how to return the scraped data and save it to a file. The Python example below extracts product data from a specific website and saves the results as an Excel file. It collects the data in a pandas DataFrame so that later processing is easy.

from bs4 import BeautifulSoup as soup
from selenium import webdriver
import time
import pandas as pd

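def checkproduct(url):
    """Open url in Chrome and return [titles, prices, urls, images] for the listed products."""
    driver = webdriver.Chrome()
    driver.get(url)

    # Scroll down so lazily loaded products render, then give the page time to finish loading.
    driver.execute_script("window.scrollTo(0, 3000);")
    time.sleep(10)

    # Parse the rendered HTML with BeautifulSoup.
    page_html = driver.page_source
    data = soup(page_html, 'html.parser')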

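    # Class names such as 'c16H9d' are specific to the site and may change over time.
    allproduct = data.find_all('div', {'class':'c16H9d'})
    list_title = []
    list_url = []
    list_price = []
    list_image = []

    # Use a loop variable other than pd so the pandas alias is not shadowed.
    for product in allproduct:
        product_title = product.text
        product_url = 'https:' + product.a['href']
        list_title.append(product_title)
        list_url.append(product_url)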

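    allprice = data.find_all('span',{'class':'c13VH6'})
    for pc in allprice:
        # Strip the currency symbol and thousands separators before converting.
        # The symbol appeared as a garbled '?' in the source; the baht sign is an assumption based on the .co.th site.
        pc_price = pc.text.replace('฿','').replace(',','')
        list_price.append(float(pc_price))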

    allimages = data.find_all('img',{'class':'c1ZEkM'})
    for productimages in allimages:
        list_image.append(productimages['src'])

    # quit() ends the whole browser session; close() only closes the current window.
    driver.quit()
    return [list_title, list_price, list_url, list_image]

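base_url = "https://www.lazada.co.th/shop-smart-tv?pages="
n = 3  # number of result pages to scrape
rows = []

for i in range(1, n+1):
    url = f"{base_url}{i}"
    print(url)
    results = checkproduct(url)
    # results holds four lists; .T turns them into four columns.
    rows.append(pd.DataFrame(results).T)

# Combine the per-page frames and renumber the rows from 0.
df = pd.concat(rows).reset_index(drop=True)
df.columns = ['Product', 'Price', 'URL', 'Images']
# Writing .xlsx files requires the openpyxl package.
df.to_excel("Lazada_Product.xlsx")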
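
The price loop in checkproduct calls float() directly on the cleaned text, so a single price in an unexpected format raises ValueError and stops the whole run. Below is a minimal sketch of a more forgiving helper that uses only the standard library. The name parse_price is hypothetical and not part of the original code, and the sample strings are made up.

```python
import re

def parse_price(text):
    """Return the first number found in text as a float, or None if there is none."""
    # Drop thousands separators, then take the first run of digits with optional decimals.
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None

print(parse_price("฿12,990.00"))
print(parse_price("N/A"))
```

Inside the loop, a None result could be appended as-is, so the price list keeps the same length as the other lists.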

Code walkthrough

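Importing libraries: BeautifulSoup parses the HTML, Selenium drives the browser, and pandas handles data processing and saving.
Defining the function: checkproduct opens the page, extracts the product information, and returns it as a list of four lists (titles, prices, URLs, images).
Collecting the data: the main loop builds each page URL and calls checkproduct to fetch the data. Each page's result is converted to a DataFrame and appended to a list.
Merging and saving: finally, pandas concatenates all the DataFrames and saves the result as an Excel file.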
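
The collecting and merging steps can be tried without opening a browser. The sketch below feeds made-up data, in the same shape that checkproduct returns, through the same pandas calls. The product names and example.com URLs are placeholders, and the pd.to_numeric step is an addition not present in the original code.

```python
import pandas as pd

# Made-up results in the shape checkproduct returns: [titles, prices, urls, images].
page_1 = [["TV A", "TV B"],
          [8990.0, 12990.0],
          ["https://example.com/a", "https://example.com/b"],
          ["https://example.com/a.jpg", "https://example.com/b.jpg"]]
page_2 = [["TV C"], [5490.0], ["https://example.com/c"], ["https://example.com/c.jpg"]]

rows = []
for results in (page_1, page_2):
    # Four lists become four rows; .T turns them into four columns.
    rows.append(pd.DataFrame(results).T)

df = pd.concat(rows).reset_index(drop=True)
df.columns = ["Product", "Price", "URL", "Images"]
# After the transpose every column has object dtype, so restore the numeric type.
df["Price"] = pd.to_numeric(df["Price"])
print(df.shape)
```

Converting Price back to a numeric column matters if the data will be sorted or summed before it is written to Excel.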

With this approach you can scrape web page data and use pandas to process and save it, which makes the data easier to manage. Article source: http://www.zghlxwxcb.cn/article/782.html
