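I am trying to collect product ASINs from the Amazon.in website. My code opens a web driver, searches for a product name, and lands on the first page of results. It only collects data from that first page. How do I move to the next page and collect the same data there? Here is my code: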
import time
import json
import re
import numpy as np
from bs4 import BeautifulSoup
from selenium import webdriver
import urllib.request
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.keys import Keys
import pandas as pd
temp = []
def init_driver():
    driver = webdriver.Chrome(executable_path = "C:\\Users\\Desktop\\chromedriver")
    driver.wait = WebDriverWait(driver, 10)
    return driver
def get_asin(driver):
    driver.get("https://www.amazon.in")
    print ('Getting the URL')
    HTML = driver.page_source
    search_button = driver.find_element_by_id("twotabsearchtextbox")
    search_button.send_keys("Mobiles")
    select_button = driver.find_element_by_class_name("nav-input")
    select_button.click()
    HTML1=driver.page_source
    soup = BeautifulSoup(HTML1, "html.parser")
    styles = soup.find_all('li')
    #print(styles)
    #print(type(styles))
    ASIN=[]
    for link in styles:
        if link.has_attr('data-asin'):
            ASIN.append(link['data-asin'])
    return(ASIN)
    #print(ASIN)
if __name__ == "__main__":
    driver = init_driver()
    ASIN_NO = get_asin(driver)
    #time.sleep(3)
    #print ('opening search page')
    #for i in range(0,len(ASIN_NO)):
        #scrape(driver,ASIN_NO[i])
    print (ASIN_NO)
    time.sleep(5)
I tried the following two approaches, and both raised errors:
select_button = driver.find_element_by_id('pagnNextString')
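select_button.click()
Exception in the log:
WebDriverException: Message: unknown error: Element is not clickable at point (778, 606). Other element would receive the click:
select_button = driver.find_element_by_class_name('srSprite pagnNextArrow')
select_button.click()
InvalidSelectorException: Message: invalid selector: Compound class names not permitted
Please help me find the correct approach. Thanks in advance.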
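Answered 2017-09-20 08:48:04
I think you need to maximize the window. The element is not in view, which is why the "element is not clickable" error appears.
driver.maximize_window()
Use this XPath for the button (for the InvalidSelector issue):
.//*[@id='nav-search']/form/div[2]/div/input
I don't know Python well. The Java code below works fine on my system; convert it to Python.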
WebDriver driver=new FirefoxDriver();
driver.get("https://www.amazon.in");
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
WebElement search_txt=driver.findElement(By.xpath("//*[@id='twotabsearchtextbox']"));
search_txt.sendKeys("Mobiles");
driver.manage().window().maximize();
driver.findElement(By.xpath(".//*[@id='nav-search']/form/div[2]/div/input")).click();
WebElement select_btn=driver.findElement(By.xpath("//*[@id='pagnNextString']"));
select_btn.click();
Answered 2017-09-20 09:06:26
To be able to click the Next button, you can use the following code:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
next_button = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.ID, "pagnNextString")))
next_button.location_once_scrolled_into_view
next_button.click()
This should let you wait for the button to appear on the page, scroll down to it, and click it successfully.
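The wait, scroll, and click steps above can be repeated in a loop to walk through several result pages. The following is a minimal sketch, not a definitive implementation. The names `extract_asins` and `collect_asins` and the `max_pages` and `timeout` parameters are hypothetical. It assumes that clicking Next triggers a full page reload (so the old button goes stale) and that the `pagnNextString` id from the question is still present on the page. To keep the sketch free of extra dependencies, it parses the HTML with the standard library's `html.parser` rather than the BeautifulSoup used in the question.

```python
from html.parser import HTMLParser


class AsinParser(HTMLParser):
    """Collects the data-asin attribute of every <li> start tag."""

    def __init__(self):
        super().__init__()
        self.asins = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            asin = dict(attrs).get("data-asin")
            if asin:  # skips missing and empty values
                self.asins.append(asin)


def extract_asins(html):
    """Return the data-asin values of all <li> tags in an HTML string."""
    parser = AsinParser()
    parser.feed(html)
    return parser.asins


def collect_asins(driver, max_pages=3, timeout=10):
    """Hypothetical helper: gather ASINs from up to max_pages result pages.

    Assumes the driver is already showing the first page of search results.
    """
    # Imported here so extract_asins can be used without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    asins = []
    for page in range(max_pages):
        asins.extend(extract_asins(driver.page_source))
        if page == max_pages - 1:
            break  # last requested page, no need to click Next
        try:
            next_button = wait.until(
                EC.visibility_of_element_located((By.ID, "pagnNextString"))
            )
            # Accessing this property scrolls the element into view.
            next_button.location_once_scrolled_into_view
            next_button.click()
            # Assumption: the click reloads the page, detaching the old button.
            wait.until(EC.staleness_of(next_button))
        except TimeoutException:
            break  # no Next button appeared, or the page did not change
    return asins
```

In the question's script, a call such as `collect_asins(driver, max_pages=3)` could replace the single `page_source` parse that follows the search click. Separately, the second attempt in the question fails because `find_element_by_class_name` accepts only one class name; matching both classes is usually done with a CSS selector such as `.srSprite.pagnNextArrow`.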
https://stackoverflow.com/questions/46316389