Q: Unable to locate an element using XPath, and I'm sure it is there before the driver looks for it

Stack Overflow user

Asked on 2020-08-23 15:52:23

1 answer · 430 views · 0 followers · 2 votes

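I am trying to download Excel files from a website using Selenium in headless mode. It works fine in most cases, but for some cases (certain months of a year) driver.find_element_by_xpath() does not work as expected. I have gone through many posts suggesting that the element might not be present yet when the driver looks for it, but that is not the case here: I checked thoroughly and tried slowing the process down with time.sleep(). On a side note, I also use driver.implicitly_wait() to make things easier, because the website takes a while to load content onto the page. I cannot use requests, because the response to a GET request does not contain any data. My script is as follows: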

from selenium import webdriver
import datetime
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import Select
import os
import shutil
import time
import calendar

currentdir = os.path.dirname(__file__)
Initial_path = 'whateveritis'
chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-dev-shm-usage')
chrome_options.add_experimental_option("prefs", {                                                                                       
"download.default_directory": f"{Initial_path}",
"download.prompt_for_download": False,
"download.directory_upgrade": True,
"safebrowsing.enabled": True
})


def save_hist_data(year, months):
    def waitUntilDownloadCompleted(maxTime=1200):
        driver.execute_script("window.open()")
        # switch to new tab
        driver.switch_to.window(driver.window_handles[-1])
        # navigate to chrome downloads
        driver.get('chrome://downloads')
        # define the endTime
        endTime = time.time() + maxTime
        while True:
            try:
                # get the download percentage
                downloadPercentage = driver.execute_script(
                    "return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('#progress').value")
                # check if downloadPercentage is 100 (otherwise the script will keep waiting)
                if downloadPercentage == 100:
                    # exit the method once it's completed
                    return downloadPercentage
            except:
                pass
            # wait for 1 second before checking the percentage next time
            time.sleep(1)
            # exit method if the download not completed with in MaxTime.
            if time.time() > endTime:
                break

    starts_on = 1
    for month in months:
        no_month = datetime.datetime.strptime(month, "%b").month
        no_of_days = calendar.monthrange(year, no_month)[1]
        print(f"{no_of_days} days in {month}-{year}")

        driver = webdriver.Chrome(executable_path="whereeveritexists", options=chrome_options)
        driver.maximize_window() #For maximizing window
        driver.implicitly_wait(20)
        driver.get("https://www.iexindia.com/marketdata/areaprice.aspx")

        select = Select(driver.find_element_by_name('ctl00$InnerContent$ddlPeriod'))
        select.select_by_visible_text('-Select Range-')

        driver.find_element_by_xpath("//input[@name='ctl00$InnerContent$calFromDate$txt_Date']").click()
        select = Select(driver.find_element_by_xpath("//td[@class='scwHead']/select[@id='scwYears']"))
        select.select_by_visible_text(str(year))
        select = Select(driver.find_element_by_xpath("//td[@class='scwHead']/select[@id='scwMonths']"))
        select.select_by_visible_text(month)

        # The problem is in this block
        test=None
        while not test:
            try:
                driver.find_element_by_xpath(f"//td[@class='scwCells' and contains(text(),'{starts_on}')]").click()
                test=True
            except IndentationError:
                print('Entered except block -IE')
                driver.find_element_by_xpath(f"//td[@class='scwCellsWeekend'  and contains(text(), '{starts_on}')]").click()
                test=True
            except:
                print('Entered except block -IE-2')
                driver.find_element_by_xpath(f"//td[@class='scwInputDate'  and contains(text(), '{starts_on}')]").click()
                test=True

        driver.find_element_by_xpath("//input[@name='ctl00$InnerContent$calToDate$txt_Date']").click()
        select = Select(driver.find_element_by_xpath("//td[@class='scwHead']/select[@id='scwYears']"))
        select.select_by_visible_text(str(year))
        select = Select(driver.find_element_by_xpath("//td[@class='scwHead']/select[@id='scwMonths']"))
        select.select_by_visible_text(month)

        # The problem is in this block
        test=None
        while not test:
            try:
                driver.find_element_by_xpath(f"//td[@class='scwCells'  and contains(text(), '{no_of_days}')]").click()
                # time.sleep(4)
                test=True
            except IndentationError:
                print('Entered except block -IE')
                driver.find_element_by_xpath(f"//td[@class='scwCellsWeekend'  and contains(text(), '{no_of_days}')]").click()
                # time.sleep(4)
                test=True
            except:
                # time.sleep(2)
                driver.find_element_by_xpath(f"//td[@class='scwInputDate'  and contains(text(), '{no_of_days}')]").click()
                test=True

        driver.find_element_by_xpath("//input[@name='ctl00$InnerContent$btnUpdateReport']").click()
        driver.find_element_by_xpath("//a[@title='Export drop down menu']").click()
        print("Right before excel button click")
        driver.find_element_by_xpath("//a[@title='Excel']").click()
        waitUntilDownloadCompleted(180)
        print("After the download potentially!")
        
        filename = max([Initial_path + f for f in os.listdir(Initial_path)],key=os.path.getctime)
        shutil.move(filename,os.path.join(Initial_path,f"{month}{year}.xlsx"))

        driver.quit()


def main():

    # years = list(range(2013,2015))
    # months = ['Jan', 'Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec']
    # for year in years:
    #         try:
    save_hist_data(2018, ['Mar'])
            # except:
            #     pass

if __name__== '__main__':
    main()

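The while loop is basically there to select the date element on the calendar (the month and year have already been selected from the dropdowns). Because the website uses different tags depending on whether the date falls on a weekday or a weekend, I used try and except blocks to try all the possible XPaths, but the strange thing is that some months of the year do not work as expected at all. This is the link, by the way: "https://www.iexindia.com/marketdata/areaprice.aspx". Specifically, for March 2018, the element for 31 March 2018 is found when I search for the XPath in the Chrome browser, but when the Python script runs it throws an error saying selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//td[@class='scwInputDate'  and contains(text(), '31')]"} (Session info: chrome=84.0.4147.105).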

python

selenium-webdriver

web-scraping

1 Answer

Stack Overflow user

Accepted answer

Answered on 2020-08-24 13:50:26

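The problem is the except: exception handling. Following your code block, the first lookup, "//td[@class='scwCells' and contains(text(), '{no_of_days}')]", does not find an element, because the cell for 31 March has the class scwCellsWeekend.

First, except IndentationError: is checked. Since "element not found" is not an IndentationError, that clause is skipped and control moves on to the next one, the bare except: with no condition, which is where the NoSuchElementException is handled. As written, that handler searches for an element with the XPath //td[@class='scwInputDate' and contains(text(), '31')]. That element is not found either, so you get a NoSuchElementException.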
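The clause matching described above can be reproduced without a browser. The following is a minimal sketch, not the asker's code: ElementNotFound is a hypothetical stand-in for Selenium's NoSuchElementException, and find is a hypothetical lookup that succeeds only for classes listed in the set it is given. click_day mirrors the try/except layout from the question, while click_day_fixed shows one possible way to try each class in turn.

```python
class ElementNotFound(Exception):
    """Hypothetical stand-in for selenium's NoSuchElementException."""


def find(css_class, present):
    """Hypothetical lookup: succeed only if css_class is in `present`."""
    if css_class not in present:
        raise ElementNotFound(css_class)
    return css_class


def click_day(present):
    """Mirror the try/except layout from the question."""
    try:
        return find("scwCells", present)
    except IndentationError:
        # Never reached: a failed lookup does not raise IndentationError.
        return find("scwCellsWeekend", present)
    except Exception:
        # The question uses a bare except, which behaves the same here.
        # scwCellsWeekend is never tried, so a weekend date fails.
        return find("scwInputDate", present)


def click_day_fixed(present):
    """Try each candidate class in turn, catching the matching exception."""
    for css_class in ("scwCells", "scwCellsWeekend", "scwInputDate"):
        try:
            return find(css_class, present)
        except ElementNotFound:
            continue
    raise ElementNotFound("no candidate class matched")
```

With only a weekend cell present, click_day({"scwCellsWeekend"}) raises ElementNotFound for scwInputDate, which matches the error in the question, whereas click_day_fixed({"scwCellsWeekend"}) returns the weekend class.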

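Instead of using so many exception-handling branches, you can combine the alternatives with a logical OR (the XPath union operator |), as shown: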

driver.find_element_by_xpath(f"//td[@class='scwCellsWeekend' and contains(text(), '{no_of_days}')] | //td[@class='scwCells' and contains(text(), '{no_of_days}')] | //td[@class='scwInputDate' and contains(text(), '{no_of_days}')]").click()
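To avoid repeating the long expression for the start and end dates, the union XPath could be built by a small helper. This is only a sketch under assumptions: the name build_day_xpath and its parameters are hypothetical, and the three class names are the ones that appear in the question.

```python
def build_day_xpath(day, classes=("scwCells", "scwCellsWeekend", "scwInputDate")):
    """Return a union XPath matching a <td> of any given class whose text contains `day`."""
    parts = [
        f"//td[@class='{css_class}' and contains(text(), '{day}')]"
        for css_class in classes
    ]
    # The XPath union operator "|" selects nodes matched by any of the parts.
    return " | ".join(parts)


xpath = build_day_xpath(31)
```

The result can then be passed to the same call the answer uses, for example driver.find_element_by_xpath(build_day_xpath(no_of_days)).click(). Note that contains(text(), '1') also matches cells such as 10 or 21, so a short day number may match more than one cell.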

Votes: 1

Original page content provided by Stack Overflow; translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/63549005
