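No matter what I do, looking up the Amazon id returns None. As an experiment, I tried the same code against an eBay id and it worked. What is different about Amazon? I have also tried changing html.parser to lxml, but it still fails with:
AttributeError: 'NoneType' object has no attribute 'get_text'
The problem is in the getPrice() definition.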
import smtplib
import time

import requests  # was missing: getPrice() calls requests.get()
from bs4 import BeautifulSoup

URL = 'https://www.lego.com/en-us/product/darth-vader-s-castle-75251'
# The header name is 'User-Agent'; with 'Users-Agent' requests still sends its default User-Agent.
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.113 Safari/537.36'}
wanted = 80
email = "help@gmail.com"
password = 'password'
Server_name = 'mail.gmail.com'
MAIL_USE_SSL=True

def sendMail():
    subject = 'Ebay Price has Dropped!!'
    mailtext = "Subject:" + subject + "\n\n" + URL
    server = smtplib.SMTP(host='smtp.gmail.com', port=587)
    server.ehlo()
    server.starttls()
    server.login(email, password)
    server.sendmail(email, email, mailtext)
    print("Sent Email")

def trackPrice():
    price = getPrice()
    if price > wanted:
        diff = (price - wanted)
        diff = round(diff, 5)
        print(f"it's still ${diff} over-priced")
    else:
        print('cheaper')
        sendMail()

def getPrice():
    page = requests.get(URL, headers=headers)
    soup = BeautifulSoup(page.content, "html.parser")
    price = soup.find(id="priceblock_ourprice").get_text().strip()[4:]
    price = float(price)
    print(price)
    return price

if __name__ == "__main__":
    while True:
        trackPrice()
        time.sleep(100)

Posted on 2020-05-09 11:57:42
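Assuming your actual URL is something like:
URL = "https://www.amazon.com/LEGO-Vaders-Castle-Building-Pieces/dp/B07J6F8H3M"
then if you print the soup variable, you will see that Amazon has detected that you are trying to scrape their page and has served you an error page instead, since the content begins with: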
<!--
To discuss automated access to Amazon data please contact api-services-support@amazon.com.
For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.com/ref=rm_5_sv, or our Product Advertising API at https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html/ref=rm_5_ac for advertising use cases.
-->
...which explains why no HTML tag with id="priceblock_ourprice" was found: find(...) returns None, and the call to get_text() on it fails.
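
One way to make that failure explicit is to check what find(...) returned before calling get_text(). The following is only a sketch under stated assumptions: extract_price and BLOCK_MARKER are hypothetical names that do not appear in the question, the marker text is copied from the error page quoted above (Amazon may change it at any time), and the digit filtering is swapped in for the original [4:] slice, which depends on the exact currency prefix.

```python
from typing import Optional

from bs4 import BeautifulSoup

# Assumption: this text comes from the error page quoted above and may change.
BLOCK_MARKER = "To discuss automated access to Amazon data"


def extract_price(html: str) -> Optional[float]:
    """Return the price found in html, or None if it cannot be located."""
    if BLOCK_MARKER in html:
        # Amazon served its anti-automation page instead of the product page.
        return None

    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find(id="priceblock_ourprice")
    if tag is None:
        # find() returns None when nothing matches, so get_text() must not be called.
        return None

    # Keep only digits and the decimal point, so "$79.99" becomes "79.99".
    digits = "".join(ch for ch in tag.get_text() if ch.isdigit() or ch == ".")
    try:
        return float(digits)
    except ValueError:
        # The element existed but did not contain a parseable number.
        return None
```

In getPrice() this could be called as extract_price(page.text). trackPrice() would then need to skip the comparison when the result is None, because comparing None with wanted raises a TypeError.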
https://stackoverflow.com/questions/61691398