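Below is my code for scraping product links from Amazon, but I am getting an error. The code works fine when scraping links across multiple pages, but after 3 pages it gives the error mentioned below.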
# Assumed import: the original snippet does not show how `wb` was defined
from selenium import webdriver as wb

wbD = wb.Chrome('chromedriver.exe')
wbD.get('https://www.amazon.com/s?i=specialty-aps&bbn=16225007011&rh=n%3A16225007011%2Cn%3A193870011&ref=nav_em__nav_desktop_sa_intl_computer_components_0_2_6_3')
links = []
condition = True
while condition:
    # Product title containers on the current results page
    productlist = wbD.find_elements_by_class_name('a-size-mini')
    for elem in productlist:
        if elem.text != '' and elem.text != 'Sponsored':
            pp2 = elem.find_element_by_tag_name('a')
            links.append(pp2.get_property('href'))
    try:
        # Raises if there is no "next page" link, which ends the loop
        wbD.find_element_by_class_name('a-last').find_element_by_tag_name('a').get_property('href')
        wbD.find_element_by_class_name('a-last').click()
    except:
        condition = False
print(links)

Posted on 2020-10-14 13:11:57
Oddly enough, the tag_name lookup was what was failing. I also added time.sleep() to deal with any max-retries errors.
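import time

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

while True:
    # Wait up to 10 seconds for the product title elements to be present
    productlist = WebDriverWait(wbD, 10).until(EC.presence_of_all_elements_located((By.CLASS_NAME, "a-size-mini")))
    for elem in productlist:
        if elem.text != '' and elem.text != 'Sponsored':
            # Note: '//a' is evaluated from the document root; './/a' would search only inside elem
            pp2 = elem.find_element_by_xpath('//a')
            links.append(pp2.get_property('href'))
    try:
        # Raises if there is no "next page" link, which ends the loop
        wbD.find_element_by_class_name('a-last').find_element_by_tag_name('a').get_property('href')
        wbD.find_element_by_class_name('a-last').click()
    except:
        break
    # Pause between pages to avoid max-retries errors
    time.sleep(5)
print(links)

Posted on 2020-10-14 01:29:50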
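The cause may be that the page has not finished loading, or simply that some of these products contain no such element (a). You may want to add exception handling for the case where no element with the tag name a is found: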
links = []
condition = True
while condition:
    productlist = wbD.find_elements_by_class_name("a-size-mini")
    for elem in productlist:
        if elem.text != "" and elem.text != "Sponsored":
            try:
                pp2 = elem.find_element_by_tag_name('a')
                links.append(pp2.get_property('href'))
            except Exception as e:
                # Log and skip products that have no <a> element
                print('Error', e)
    try:
        # Raises if there is no "next page" link, which ends the loop
        wbD.find_element_by_class_name('a-last').find_element_by_tag_name('a').get_property('href')
        wbD.find_element_by_class_name('a-last').click()
    except:
        condition = False
print(links)

https://stackoverflow.com/questions/64339919
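
As a minimal sketch of the per-element exception handling suggested above, the link extraction can be pulled into a small helper so that one product without an anchor does not abort the whole page. The name `collect_hrefs` and the `skip_texts` parameter are hypothetical and not from the thread. It assumes each element exposes `.text` and a Selenium 4 style `find_element(by, value)` method; the `find_element_by_*` helpers used in the thread were removed in later Selenium 4 releases.

```python
def collect_hrefs(elements, skip_texts=("", "Sponsored")):
    """Return the href of the first <a> inside each usable element.

    `elements` is any iterable of objects with a `.text` attribute and a
    `find_element(by, value)` method, as Selenium WebElements have.
    """
    hrefs = []
    for elem in elements:
        try:
            if elem.text in skip_texts:
                continue  # empty titles and sponsored placements are ignored
            # "tag name" is the string value of selenium's By.TAG_NAME
            anchor = elem.find_element("tag name", "a")
            href = anchor.get_property("href")
        except Exception as exc:
            # Typically a missing anchor or an element that went stale
            print("Skipping element:", exc)
            continue
        if href:
            hrefs.append(href)
    return hrefs
```

Inside the pagination loop this would be called as `links.extend(collect_hrefs(productlist))`, where `productlist` and `links` are the variables from the snippets above. Catching broadly here is a deliberate choice for a scraper that should keep going, though narrowing it to the specific Selenium exceptions would make real bugs easier to spot.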