
Error when scraping Amazon data with Beautiful Soup: object returns None

Stack Overflow user
Asked on 2020-05-09 11:26:40
Answers: 1, Views: 246, Followers: 0, Votes: 0

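No matter what I do, the Amazon id object returns None. As an experiment, I tried this code on an eBay id object and it worked. What is different about Amazon? I have also tried changing html.parser to lxml, but it still returns: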

AttributeError: 'NoneType' object has no attribute 'get_text'

The problem is in the getPrice() definition.

import smtplib
import time

import requests  # needed by getPrice(); missing from the original listing
from bs4 import BeautifulSoup

URL = 'https://www.lego.com/en-us/product/darth-vader-s-castle-75251'

headers = {'Users-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36'}

wanted = 80

email = "help@gmail.com"
password = 'password'

Server_name = 'mail.gmail.com'


MAIL_USE_SSL=True

def sendMail():
    subject = 'Ebay Price has Dropped!!'
    mailtext = "Subject:"+subject+"\n\n"+URL
    server = smtplib.SMTP(host='smtp.gmail.com', port=587)
    server.ehlo()
    server.starttls()
    server.login(email,password)
    server.sendmail(email,email,mailtext)
    server.quit()  # close the SMTP connection once the message is sent
    print("Sent Email")


def trackPrice():
    price = getPrice()
    if price > wanted:
        diff = (price - wanted)
        diff = round(diff,5)
        print(f"it's still ${diff} over-priced")
    else:
        print('cheaper')
        sendMail()


def getPrice():

    page = requests.get(URL, headers=headers)

    soup = BeautifulSoup(page.content,"html.parser")

    price = soup.find(id="priceblock_ourprice").get_text().strip()[4:]

    price = float(price)

    print(price)
    return price


if __name__ == "__main__":
    while True:
        trackPrice()
        time.sleep(100)

1 Answer

Stack Overflow user

Answered on 2020-05-09 11:57:42

Assuming your actual URL looks something like:

URL = "https://www.amazon.com/LEGO-Vaders-Castle-Building-Pieces/dp/B07J6F8H3M"

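Then, if you print the soup variable, you will see that Amazon has detected that you are trying to scrape their page and has served you an error page, because the content begins with: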

<!--
        To discuss automated access to Amazon data please contact api-services-support@amazon.com.
        For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.com/ref=rm_5_sv, or our Product Advertising API at https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html/ref=rm_5_ac for advertising use cases.
-->
...
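
Rather than reading the printed soup by eye, the same check can be done in code before any parsing. This is only a minimal sketch: `looks_blocked` is a hypothetical helper name, and the marker string is copied from the comment above on the assumption that blocked responses keep containing it, which Amazon could change at any time.

```python
# Hypothetical helper: detect the "automated access" notice shown above.
# Assumption: blocked responses contain this contact address somewhere
# in the page text. This is not a documented or guaranteed signal.
BLOCK_MARKER = "api-services-support@amazon.com"


def looks_blocked(html: str) -> bool:
    """Return True if the page text contains the automated-access notice."""
    return BLOCK_MARKER in html


# Two small stand-in pages, so the check can be tried without a network call.
blocked_page = (
    "<!--\n"
    "  To discuss automated access to Amazon data please contact "
    "api-services-support@amazon.com.\n"
    "-->"
)
product_page = '<span id="priceblock_ourprice">$129.99</span>'

print(looks_blocked(blocked_page))  # True
print(looks_blocked(product_page))  # False
```

In the question's getPrice(), such a check could run on page.text before the soup is built. Separately, the headers dict in the question spells the key 'Users-Agent'; the standard HTTP header name is User-Agent, so as written requests sends its own default agent string. Whether correcting the key is enough to avoid this page is not something that can be confirmed here.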

That error page explains why no HTML tag with id="priceblock_ourprice" is found: find(...) returns None, so the call to get_text() fails.
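
The lookup can be written defensively so that a missing element is reported instead of raising AttributeError. The sketch below is not the asker's code: `parse_price` is a hypothetical function that takes HTML text directly, returns None when the element is absent, and uses a regular expression in place of the `[4:]` slice. The price format it expects (digits with optional thousands commas and a decimal part, such as `$129.99`) is an assumption.

```python
import re
from typing import Optional

from bs4 import BeautifulSoup


def parse_price(html: str) -> Optional[float]:
    """Return the price found in the page, or None if it cannot be found."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find(id="priceblock_ourprice")
    if tag is None:
        # Element missing, e.g. because an error page was served instead.
        return None
    # Assumed format: a number like "129.99" or "1,299.99" after a currency symbol.
    match = re.search(r"\d[\d,]*(?:\.\d+)?", tag.get_text())
    if match is None:
        return None
    return float(match.group().replace(",", ""))


print(parse_price('<span id="priceblock_ourprice">$129.99</span>'))  # 129.99
print(parse_price("<html><!-- automated access notice --></html>"))  # None
```

With this shape, trackPrice() would need to handle a None result (for example by skipping that round) before comparing the price against wanted.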

Votes: 0
Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link: https://stackoverflow.com/questions/61691398
