
I can't scrape the image src link with BeautifulSoup

Stack Overflow user
Asked 2021-04-23 06:19:46
Answers: 1, Views: 39, Followers: 0, Votes: 1

I'm trying to scrape the food image links for restaurants on Postmates. The restaurant I'm testing with is: https://postmates.com/merchant/fruitive-washington-96807

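I'm having trouble extracting the src attribute that holds the image link. I've tried everything I know, without success. I always end up with an empty list ([]), a "list index out of range" error, a NoneType error, or some other error.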

The relevant markup on the page looks like this:

<div id="" class="e1tw3vxs2 css-aktk0j e1qfcze90">
  <div>
    <img alt="Spring Pesto from Fruitive. Order online."
         src="https://raster-static.postmates.com/?url=https%3A%2F%2Fitems-static.postmates.com%2Fuploads%2Fmedia%2F7b289988-5d19-4cfc-80a6-ce88a7a05f41%2Foriginal.jpg%3Fv%3D63784935843&amp;quality=85&amp;w=320&amp;h=0&amp;mode=auto&amp;format=webp&amp;v=4"
         class="css-1hyfx7x e1qfcze94">
    <div title="Spring Pesto from Fruitive. Order online." class="css-1ggm7mr e1qfcze91"
         style="background-image: url(&quot;https://raster-static.postmates.com/?url=https%3A%2F%2Fitems-static.postmates.com%2Fuploads%2Fmedia%2F7b289988-5d19-4cfc-80a6-ce88a7a05f41%2Foriginal.jpg%3Fv%3D63784935843&amp;quality=85&amp;w=320&amp;h=0&amp;mode=auto&amp;format=webp&amp;v=4&quot;); opacity: 1;"></div>
  </div>
  <div class="css-f85l49 e1qfcze92"></div>
</div>
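
A side observation on the markup above: the `src` value is a `raster-static.postmates.com` URL that carries another, percent-encoded address in its `url` query parameter. Once the attribute value is in hand, that inner address can be recovered with the standard library. The sketch below assumes the attribute text is still HTML-escaped (`&amp;`), as in the snippet; the helper name `original_image_url` is made up for illustration.

```python
from html import unescape
from urllib.parse import urlsplit, parse_qs


def original_image_url(src_attr):
    """Return the decoded `url` query parameter of a proxy-style src, or None."""
    # Undo HTML escaping (&amp; -> &) in case the raw attribute text was copied.
    query = urlsplit(unescape(src_attr)).query
    # parse_qs splits on "&" and percent-decodes each value.
    values = parse_qs(query).get("url")
    return values[0] if values else None


# The src attribute from the markup above, joined back into one string.
src = (
    "https://raster-static.postmates.com/?url=https%3A%2F%2Fitems-static.postmates.com"
    "%2Fuploads%2Fmedia%2F7b289988-5d19-4cfc-80a6-ce88a7a05f41%2Foriginal.jpg%3Fv%3D63784935843"
    "&amp;quality=85&amp;w=320&amp;h=0&amp;mode=auto&amp;format=webp&amp;v=4"
)

print(original_image_url(src))
```

This only helps when the `<img>` tag is actually present in the HTML that was downloaded.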

My scraping code is:

import requests
from bs4 import BeautifulSoup as bs

header = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36"}
page_code = requests.get('https://postmates.com/merchant/fruitive-washington-96807', headers=header)
soup = bs(page_code.content, 'html.parser')

# Walk from the outer container down to each product, then look for the image wrapper.
page_code = soup.find_all('div', {'class': 'css-135ydxp e1u06svg2'})
for i in page_code:
    all_element_products = i.find_all('div', {'class': 'product-container css-1kry540 e1tw3vxs3'})
    for a_e_p in all_element_products:
        try:
            img_link = a_e_p.find_all('div', {'class': 'e1tw3vxs2 css-aktk0j e1qfcze90'})
        except Exception as Err:
            print(Err)
            print()
            img_link = '-'
        print(img_link)

Does anyone here have a solution?


1 Answer

Stack Overflow user

Accepted answer

Posted 2021-04-23 06:29:37

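The information you see on the page is rendered dynamically, and the data is embedded in the page in JSON format. You can use the following example to see how to load it with the re and json modules: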

import re
import json
import requests


url = "https://postmates.com/merchant/fruitive-washington-96807"
html_doc = requests.get(url).text

data = re.search(r"window\.__PRELOADED_STATE__ = ({.*?});", html_doc).group(1)
data = json.loads(data)

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

for cat in data["cart"]["categories"]:
    for product in cat["products"]:
        # print only products with image:
        if "img" in product:
            print(
                "{:<30} {}".format(
                    product["name"], product["img"]["originalUrl"]
                )
            )

Prints:

Loaded Avocado Toast           https://items-static.postmates.com/uploads/media/a9f25be8-fd4a-4615-8f50-5f767d76ade9/original.jpg?v=63784935508
Pink Punch                     https://items-static.postmates.com/uploads/media/1db56bd1-9128-4ee6-837f-63c2db004494/original.jpg?v=63784935614
Tropical Bowl                  https://items-static.postmates.com/uploads/media/150f8376-ab6d-45e0-b21f-35c648f31814/original.jpg?v=63784935705
Beach Breeze                   https://items-static.postmates.com/uploads/media/c8bb194b-d12f-4684-a542-76c17310538f/original.jpg?v=63784935751
Spring Pesto                   https://items-static.postmates.com/uploads/media/7b289988-5d19-4cfc-80a6-ce88a7a05f41/original.jpg?v=63784935843
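
If the page layout changes, the `re.search(...)` call above returns `None` and `.group(1)` raises an `AttributeError`. A slightly more defensive variant of the same idea might look like the sketch below. The function names and the `sample_html` fragment are hypothetical and exist only so the code can run without a network request; the key names (`cart`, `categories`, `products`, `img`, `originalUrl`) are the ones used in the example above.

```python
import json
import re

# Same pattern as above; DOTALL lets the embedded JSON span several lines.
STATE_RE = re.compile(r"window\.__PRELOADED_STATE__ = ({.*?});", re.DOTALL)


def extract_preloaded_state(html_doc):
    """Return the embedded state as a dict, or None if it is missing or invalid."""
    match = STATE_RE.search(html_doc)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


def iter_product_images(state):
    """Yield (name, original image URL) for every product that has an image."""
    for cat in state.get("cart", {}).get("categories", []):
        for product in cat.get("products", []):
            img = product.get("img")
            if img and "originalUrl" in img:
                yield product.get("name"), img["originalUrl"]


# Hypothetical hand-written fragment, NOT real Postmates data.
sample_html = '''<script>window.__PRELOADED_STATE__ = {"cart": {"categories": [
  {"products": [{"name": "Spring Pesto", "img": {"originalUrl": "https://example.com/pesto.jpg"}},
                {"name": "No Photo Item"}]}]}};</script>'''

state = extract_preloaded_state(sample_html)
images = list(iter_product_images(state)) if state else []
print(images)
```

One caveat: the non-greedy pattern stops at the first `};`, so a `};` inside a JSON string value would truncate the match. In that case `json.loads` fails and the function returns `None` rather than partial data.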

Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/67221523
