Error scraping Amazon data with Beautiful Soup: object returns None

Stack Overflow user

Asked 2020-05-09 11:26:40


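No matter what I do, looking up the Amazon element by id returns None. As an experiment, I tried the same code against an element id on eBay, and it worked. What is different about Amazon? I also tried switching the parser from html.parser to lxml, but it still raises:

AttributeError: 'NoneType' object has no attribute 'get_text'

The problem is in the getPrice() definition.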

import smtplib
import time

import requests
from bs4 import BeautifulSoup

URL = 'https://www.lego.com/en-us/product/darth-vader-s-castle-75251'

# Browser-like User-Agent header sent with each request
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36'}

wanted = 80

email = "help@gmail.com"
password = 'password'

Server_name = 'mail.gmail.com'


MAIL_USE_SSL=True

def sendMail():
    subject = 'Ebay Price has Dropped!!'
    mailtext = "Subject:"+subject+"\n\n"+URL
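    # Connect to Gmail's SMTP server and upgrade the connection to TLS
    server = smtplib.SMTP(host='smtp.gmail.com', port=587)
    server.ehlo()
    server.starttls()
    server.login(email,password)
    server.sendmail(email,email,mailtext)
    # Close the SMTP session once the message has been handed off
    server.quit()
    print("Sent Email")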




def trackPrice():
    price = getPrice()
    if price > wanted:
        diff = (price - wanted)
        diff = round(diff,5)
        print(f"it's still ${diff} over-priced")
    else:
        print('cheaper')
        sendMail()


def getPrice():

    page = requests.get(URL, headers=headers)

    soup = BeautifulSoup(page.content,"html.parser")

    price = soup.find(id="priceblock_ourprice").get_text().strip()[4:]

    price = float(price)

    print(price)
    return price







if __name__ == "__main__":
    while True:
        trackPrice()
        time.sleep(100)

python

object

web-scraping

beautifulsoup

1 Answer

Stack Overflow user

Answered 2020-05-09 11:57:42

Assuming your actual URL is something like:

URL = "https://www.amazon.com/LEGO-Vaders-Castle-Building-Pieces/dp/B07J6F8H3M"

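Then, if you print the soup variable, you will see that Amazon has detected that you are trying to scrape their page and has served you an error page, since the content begins with: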

<!--
        To discuss automated access to Amazon data please contact api-services-support@amazon.com.
        For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.com/ref=rm_5_sv, or our Product Advertising API at https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html/ref=rm_5_ac for advertising use cases.
-->
...

This explains why no HTML tag with id="priceblock_ourprice" is found: find(...) returns None, and calling get_text() on that None fails.
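As a rough illustration of how getPrice() could guard against this case, here is a minimal sketch. It is not the asker's code: `extract_price` and `BLOCK_MARKER` are hypothetical names introduced for the example, the marker text is simply copied from the error page quoted above (Amazon may change it at any time), and the digit-filtering step is an assumed replacement for the original `[4:]` slice. It only shows how to avoid the AttributeError; it does not make Amazon serve the real product page.

```python
from typing import Optional

from bs4 import BeautifulSoup

# Assumption: this text comes from the error page quoted in the answer;
# it is not a documented or stable signal.
BLOCK_MARKER = "To discuss automated access to Amazon data"


def extract_price(html: str) -> Optional[float]:
    """Return the price found in html, or None if it cannot be located."""
    if BLOCK_MARKER in html:
        # The server returned its automated-access notice, not the product page.
        return None

    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find(id="priceblock_ourprice")
    if tag is None:
        # find() returns None when no element has this id.
        return None

    # Keep only digits and the decimal point, e.g. "$79.99" -> "79.99".
    text = "".join(ch for ch in tag.get_text() if ch.isdigit() or ch == ".")
    try:
        return float(text)
    except ValueError:
        # The element exists but does not contain a parseable number.
        return None


if __name__ == "__main__":
    blocked = "<!--\nTo discuss automated access to Amazon data please contact ...\n-->"
    product = '<span id="priceblock_ourprice">$79.99</span>'
    print(extract_price(blocked))  # None
    print(extract_price(product))  # 79.99
```

With a helper like this, trackPrice() would need to check for None before comparing the result against wanted, for example by skipping that iteration and trying again after the sleep.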


Original page content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/61691398
