
Can't scrape two fields together from a webpage

Stack Overflow user
Asked 2021-09-23 19:18:51
Answers 1 · Views 37 · Followers 0 · Votes 1

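I'm trying to scrape the series name together with the product code from a webpage. The script I've written fetches the product code flawlessly, but I can't figure out how to grab the series name along with it.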

I've tried with:

import requests
from bs4 import BeautifulSoup

link = 'https://www.theimagingsource.com/products/industrial-cameras/usb-3.1-monochrome/'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
    res = s.get(link)
    soup = BeautifulSoup(res.text,"lxml")
    for item in soup.select("td.product-code > a[title]"):
        print(item.get_text(strip=True))

Output I'm getting:

DMK 38UX267
DMK 38UX255
DMK 38UX304
DMK 38UX253
DMK 37AUX287
DMK 37AUX273
DMK 37AUX290

Output I'd like to get:

38 series - USB 3.1 monochrome industrial cameras DMK 38UX267
38 series - USB 3.1 monochrome industrial cameras DMK 38UX255
38 series - USB 3.1 monochrome industrial cameras DMK 38UX304
38 series - USB 3.1 monochrome industrial cameras DMK 38UX253
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX287
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX273
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX290

and so on.


1 Answer

Stack Overflow user

Accepted answer

Answered 2021-09-23 19:23:42

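Use .find_previous('h3'). Called on each product-code link, it returns the nearest <h3> that precedes the link in the document, which on this page is the series heading.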

import requests
from bs4 import BeautifulSoup

link = "https://www.theimagingsource.com/products/industrial-cameras/usb-3.1-monochrome/"

with requests.Session() as s:
    s.headers[
        "User-Agent"
    ] = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36"
    res = s.get(link)
    soup = BeautifulSoup(res.text, "lxml")
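    for item in soup.select("td.product-code > a[title]"):
        # find_previous("h3") searches backwards from this link and returns
        # the closest preceding <h3>, i.e. the heading of the link's series.
        print(
            item.find_previous("h3").get_text(strip=True),
            item.get_text(strip=True),
        )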

Prints:

38 series - USB 3.1 monochrome industrial cameras DMK 38UX267
38 series - USB 3.1 monochrome industrial cameras DMK 38UX255
38 series - USB 3.1 monochrome industrial cameras DMK 38UX304
38 series - USB 3.1 monochrome industrial cameras DMK 38UX253
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX287
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX273
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX290
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX252
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX265
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX250
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX264
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX178
37 A series - USB 3.1 monochrome industrial cameras DMK 37AUX226
37 B series - USB 3.1 monochrome industrial cameras DMK 37BUX287
37 B series - USB 3.1 monochrome industrial cameras DMK 37BUX273
37 B series - USB 3.1 monochrome industrial cameras DMK 37BUX290
37 B series - USB 3.1 monochrome industrial cameras DMK 37BUX252
37 B series - USB 3.1 monochrome industrial cameras DMK 37BUX265
37 B series - USB 3.1 monochrome industrial cameras DMK 37BUX250
37 B series - USB 3.1 monochrome industrial cameras DMK 37BUX264
37 B series - USB 3.1 monochrome industrial cameras DMK 37BUX178
37 B series - USB 3.1 monochrome industrial cameras DMK 37BUX226
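
To see why find_previous gives the right series for every row, it can help to try it on a small inline document without any network access. The sketch below is only an illustration: the SAMPLE_HTML markup and the helper name pair_codes_with_series are hypothetical, and the markup is an assumption about the page's shape (an <h3> heading followed by a table of product codes), not a copy of the real page. It uses the built-in html.parser instead of lxml so that no extra parser is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed to resemble the real page: each series has an
# <h3> heading followed by a table whose cells hold the product-code links.
SAMPLE_HTML = """
<h3>38 series</h3>
<table>
  <tr><td class="product-code"><a title="DMK 38UX267" href="#">DMK 38UX267</a></td></tr>
  <tr><td class="product-code"><a title="DMK 38UX255" href="#">DMK 38UX255</a></td></tr>
</table>
<h3>37 A series</h3>
<table>
  <tr><td class="product-code"><a title="DMK 37AUX287" href="#">DMK 37AUX287</a></td></tr>
</table>
"""


def pair_codes_with_series(html):
    """Return (series name, product code) tuples in document order."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for link in soup.select("td.product-code > a[title]"):
        # Nearest <h3> before this link; None if the document has no such heading.
        heading = link.find_previous("h3")
        series = heading.get_text(strip=True) if heading else ""
        pairs.append((series, link.get_text(strip=True)))
    return pairs


for series, code in pair_codes_with_series(SAMPLE_HTML):
    print(series, code)
```

Returning tuples instead of printing directly keeps the two fields separate, which is convenient if the rows are later written to a CSV file. The None check is a defensive choice for links that appear before any heading; the answer's code would raise AttributeError in that case.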
Votes 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/69305836
