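I am trying to write code that checks whether a product on Amazon is available. I scrape data from Amazon and then check whether the string "In stock" is part of the scraped data.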
import requests
from lxml import html

# part of a function called check
# (headers is defined elsewhere in my script and is not shown here)
def check(url):
    page = requests.get(url, headers=headers)
    # parsing the html content
    doc = html.fromstring(page.content)
    # checking availability
    xpath_availability = '//*[@id="availability"]/span/text()'
    raw_availability = doc.xpath(xpath_availability)
    print(raw_availability)
    if "Is Stock" in raw_availability:
        print('Hello')

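check('https://www.amazon.com/PlayStation-4-Slim-1TB-Console/dp/B071CV8CG2/ref=sr_1_2?keywords=ps4&qid=1559836554&s=videogames&sr=1-2&th=1')

My problem is that "Hello" is never printed. The text I get back is either empty or the following: ['\n \n \n In Stock.\n \n \n ']. What am I doing wrong? Also, if anyone has suggestions for a better way to do this, I would greatly appreciate it!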
Posted on 2019-06-07 03:15:29
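Try changing the request headers: use a User-Agent string that matches your operating system and browser (you can look one up at https://developers.whatismybrowser.com/useragents/explore/operating_system_name/mac-os-x/). I was able to scrape the URL with the following code: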
import requests
from lxml import html

url = 'https://www.amazon.com/PlayStation-4-Slim-1TB-Console/dp/B071CV8CG2/ref=sr_1_2?keywords=ps4&qid=1559836554&s=videogames&sr=1-2&th=1'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36'
}
page = requests.get(url, headers=headers)
# parsing the html content (this step is required before doc can be queried)
doc = html.fromstring(page.content)
# checking availability
xpath_availability = '//*[@id="availability"]/span/text()'
raw_availability = doc.xpath(xpath_availability)
print(raw_availability)
Output: ['\n \n \n In Stock.\n \n \n ']

https://stackoverflow.com/questions/56482994
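
The printed list also suggests why "Hello" never appears in the question's code, independent of the headers. The XPath query returns a list of strings, and `in` on a list tests whether some element equals the search text exactly. Here the only element is 'In Stock.' surrounded by whitespace, and the question's code searches for "Is Stock" rather than "In Stock". A minimal sketch of a more forgiving check follows; the helper name is_in_stock is hypothetical, and matching on the lowercase phrase "in stock" is an assumption about how the page words availability.

```python
def is_in_stock(raw_availability):
    """Return True if the scraped text fragments mention 'in stock'."""
    # Join all fragments, then collapse runs of whitespace and newlines
    # so the text becomes a single clean string such as "In Stock."
    text = " ".join(" ".join(raw_availability).split())
    # Substring test on a string, case-insensitive
    return "in stock" in text.lower()


# The list printed in the answer above
sample = ['\n \n \n In Stock.\n \n \n ']

print("In Stock" in sample)   # False: no list element equals "In Stock"
print(is_in_stock(sample))    # True
print(is_in_stock([]))        # False: the XPath matched nothing
```

Inside check, the condition would then read `if is_in_stock(raw_availability):`. An empty list, which the question also reports seeing, simply yields False here; it may mean the request was served a different page, which is what the User-Agent change above is meant to address.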