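I've written a script in Python with Selenium that downloads a few files from a webpage. I download them by clicking the links that point to .docx files. Once the files are downloaded, the script renames them with a silly prefix. My script does all of this flawlessly.
To store the downloaded files in a folder I used os.chdir(), which I would like to replace with os.path.join(). However, this is where I'm stuck: I can't figure out how to use it. Once I can use os.path.join() correctly, I'll be able to rename the downloaded files.
How can I use os.path.join() instead of os.chdir() to download and rename the files in this case?
This is what I've written so far: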
import time
import os
from selenium import webdriver

link = 'https://www.online-convert.com/file-format/doc'
desk_location = r'C:\Users\WCS\Desktop\file_container'

if not os.path.exists(desk_location): os.mkdir(desk_location)
os.chdir(desk_location) #I wish to kick out this line to replace with os.path.join() somewhere within the script

def download_files(url):
    driver.get(url)
    for item in driver.find_elements_by_css_selector("a[href$='.doc']")[:2]:
        filename = item.get_attribute("href").split("/")[-1]
        item.click()

        time_to_wait = 10
        time_counter = 0

        try:
            while not os.path.exists(filename):
                time.sleep(1)
                time_counter += 1
                if time_counter > time_to_wait: break
            os.rename(filename, "its_"+filename) #It's a silly renaming in order to check whether this line is working
        except Exception: pass

if __name__ == '__main__':
    chromeOptions = webdriver.ChromeOptions()
    prefs = {'download.default_directory' : desk_location,
             'profile.default_content_setting_values.automatic_downloads': 1
             }
    chromeOptions.add_experimental_option('prefs', prefs)
    driver = webdriver.Chrome(chrome_options=chromeOptions)
    download_files(link)
    driver.quit()

Posted on 2019-02-12 07:10:31
To make the script work with os.path.join() instead of os.chdir(), I needed to change a few lines in it. Thanks for the suggestions in the comments.
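The underlying idea can be shown without Selenium: rather than changing the working directory, build the full path to each file with os.path.join() and pass that path to os.path.exists() and os.rename(). The following is only a minimal sketch under that assumption. The helper name wait_and_rename and its prefix, timeout and poll parameters are hypothetical and not part of the original script, and a temporary folder stands in for the download directory.

```python
import os
import tempfile
import time


def wait_and_rename(directory, filename, prefix="its_", timeout=10.0, poll=0.1):
    """Wait for `filename` to appear in `directory`, then rename it with `prefix`.

    Hypothetical helper: returns the new full path, or None if the file
    did not appear before the timeout expired.
    """
    # Full paths are built once, so the working directory never has to change.
    source = os.path.join(directory, filename)
    target = os.path.join(directory, prefix + filename)

    deadline = time.monotonic() + timeout
    while not os.path.exists(source):
        if time.monotonic() >= deadline:
            return None  # give up without attempting the rename
        time.sleep(poll)

    os.rename(source, target)
    return target


if __name__ == "__main__":
    # A temporary folder plays the role of the download directory here.
    with tempfile.TemporaryDirectory() as folder:
        open(os.path.join(folder, "example.doc"), "w").close()
        print(wait_and_rename(folder, "example.doc"))
        print(os.listdir(folder))
```

Returning None on timeout keeps the rename from being attempted on a file that never arrived, which the original loop would only catch through the blanket except.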
The rectified portion:
def download_files(url):
    driver.get(url)
    for item in driver.find_elements_by_css_selector("a[href$='.doc']")[:2]:
        filename = item.get_attribute("href").split("/")[-1]
        #Define the path in the following line in order to reuse later
        file_location = os.path.join(desk_location, filename)
        item.click()

        time_to_wait = 10
        time_counter = 0

        try:
            while not os.path.exists(file_location): #use the file_location here
                time.sleep(1)
                time_counter += 1
                if time_counter > time_to_wait: break
            #Now rename the file once it is downloaded
            os.rename(file_location, os.path.join(desk_location, "its_"+filename))
        except Exception: pass

https://stackoverflow.com/questions/54635093