
Question: Web scraping in Python

Stack Overflow user

Asked on 2019-06-02 17:22:46

Answers: 1 · Views: 40 · Followers: 0 · Votes: 0

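I'm trying to scrape a website to get product titles, descriptions, and so on, purely for practice. I've already scraped the product name, but I'm confused about how to scrape the following.

Here I just want to get the product name and its description. I have already managed to get the title.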

from requests_html import HTML, HTMLSession
session = HTMLSession()
r = session.get('https://www.newegg.com/Video-Cards-Video-Devices/Category/ID-38?Tpk=graphics%20card')
containers = r.html.find('.item-container', first=True)
#print(containers.html)
title = containers.find('.item-branding img', first=True).attrs['title']
#print(title)
description = containers.find('.item-title', first=True).html
print(description)

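The problem is with description: I want to get the text that sits next to the i element inside this a tag, which is the product description, but I can't work out how to do it. Any help would be appreciated.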

Starting from this:

<a class="item-title" href="https://www.newegg.com/evga-geforce-rtx-2080-ti-11g-p4-2281-kr/p/N82E16814487418?Item=N82E16814487418" title="View Details"><i class="icon-premier icon-premier-xsm"/>EVGA GeForce RTX 2080 Ti DirectX 12 11G-P4-2281-KR BLACK EDITION GAMING Video Card, Dual HDB Fans &amp; RGB LED</a>

I want to get this:

EVGA GeForce RTX 2080 Ti DirectX 12 11G-P4-2281-KR BLACK EDITION GAMING Video Card, Dual HDB Fans &amp; RGB LED

web-scraping

python

1 Answer

Stack Overflow user

Answered on 2019-06-02 23:34:06

I recommend using BeautifulSoup to parse this site's content. Your code would look something like this:

from requests_html import HTMLSession
from bs4 import BeautifulSoup

session = HTMLSession()
r = session.get('https://www.newegg.com/Video-Cards-Video-Devices/Category/ID-38?Tpk=graphics%20card')
soup = BeautifulSoup(r.content, "lxml")

# First product container on the page
containers = soup.find("div", {"class": "item-container"})
# title attribute of the second lazy-img image in that container
title = containers.find_all("img", {"class": "lazy-img"})[1]["title"]
# Text of the a.item-title link, i.e. the product description
description = containers.find("a", {"class": "item-title"}).get_text()
print(description)
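
To check the text-extraction step without hitting the network, here is a minimal sketch that parses the anchor tag quoted in the question as a literal string. It assumes the `bs4` package is installed and uses Python's built-in `html.parser` rather than `lxml`; the names `SAMPLE` and `link` are illustrative only.

```python
from bs4 import BeautifulSoup

# The anchor tag quoted in the question, used here as a literal sample
# so that no request to the live site is needed.
SAMPLE = (
    '<a class="item-title" '
    'href="https://www.newegg.com/evga-geforce-rtx-2080-ti-11g-p4-2281-kr/p/N82E16814487418?Item=N82E16814487418" '
    'title="View Details"><i class="icon-premier icon-premier-xsm"/>'
    'EVGA GeForce RTX 2080 Ti DirectX 12 11G-P4-2281-KR BLACK EDITION '
    'GAMING Video Card, Dual HDB Fans &amp; RGB LED</a>'
)

soup = BeautifulSoup(SAMPLE, "html.parser")

# Locate the link by its class, as in the answer above.
link = soup.find("a", {"class": "item-title"})

# get_text() returns only the text content: the tags are dropped
# and the &amp; entity is decoded to a plain ampersand.
description = link.get_text(strip=True)
print(description)
```

The same `get_text()` call should behave the same way on the `a.item-title` element found inside a live container, assuming the page markup still matches the snippet in the question.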

Hope this helps. BeautifulSoup documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/
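
The code in this answer reads only the first product. As a rough sketch of how the same lookup might be applied to every listing, the helper below loops over all `item-container` blocks. `extract_descriptions` is a hypothetical name, not part of the original answer, and the sample markup is a simplified stand-in made up for illustration; only the two class names come from this thread.

```python
from bs4 import BeautifulSoup


def extract_descriptions(html):
    """Return the link text of every a.item-title inside a div.item-container."""
    soup = BeautifulSoup(html, "html.parser")
    descriptions = []
    for container in soup.find_all("div", class_="item-container"):
        link = container.find("a", class_="item-title")
        if link is None:
            # Skip containers that have no title link instead of raising.
            continue
        descriptions.append(link.get_text(strip=True))
    return descriptions


# Made-up markup for illustration only; the real page is more complex.
SAMPLE_PAGE = """
<div class="item-container"><a class="item-title" href="#">Card A &amp; Fan</a></div>
<div class="item-container"><span>no title link here</span></div>
<div class="item-container"><a class="item-title" href="#"><i class="icon"></i>Card B</a></div>
"""

print(extract_descriptions(SAMPLE_PAGE))
```

Against the live site, one would presumably pass `r.content` from the session request instead of `SAMPLE_PAGE`.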

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/56413773
