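I have a script that is almost 100% complete, but there is one step I can't figure out. My script currently checks whether a file already exists in the destination and, if it does, the file in the source location is not moved. The problem I'm having is that the code doesn't check all of the subdirectories, only the root directory.
I am using os.walk to go through all the files in the source folder, but I'm not sure how to os.walk both the destination folder and the source folder.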

import time
import sys
import logging
import logging.config

def main():
    purge_files

def move_files(src_file):
    try:
        #Attempt to move files to dest
        shutil.move(src_file, dest)
    #making use of the OSError exception instead of FileExistsError due to older version of python not contaning that exception
    except OSError as e:
        #Log the files that have not been moved to the console
        logging.info(f'Files File already exists: {src_file}')
        print(f'File already exists: {src_file}')
        #os.remove to delete files that are already in dest repo
        os.remove(src_file)
        logging.warning(f'Deleting: {src_file}')

def file_loop(files, root):
    for file in files:
        #src_file is used to get the full path of everyfile
        src_file = os.path.join(root,file)
        #The two variables below are used to get the files creation date
        t = os.stat(src_file)
        c = t.st_ctime
        #If the file is older then cutoff code within the if statement executes
        if c<cutoff:
            move_files(src_file)
        #Log the file names that are not older then the cutoff and continue loop
        else:
            logging.info(f'File is not older than 14 days: {src_file}')
            continue

def purge_files():
    logging.info('invoke purge_files method')
    #Walk through root directory and all subdirectories
    for root, subdirs, files in os.walk(source):
        dst_dir = root.replace(source, dest)
        #Loop through files to grab every file
        file_loop(files, root)
    return files, root, subdirs
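files, root, subdirs = purge_files()

I expect the output to move all files from the source to dest. Before moving the files, I want to check all files in the dest location, including the subdirectories of dest; if any of them is the same as a file in the source, that file should not be moved to dest. I don't want the folders from the source. I just want all the files moved into the root directory.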
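Answered 2019-05-30 22:20:10
I can see that you have already written a large part of the code, but as currently posted it contains quite a few errors, for example modules that are used but never imported (such as shutil) and variables that are never defined (such as source).
If I copy-paste your code into my IDE, I get 26 errors from pep8 and pylint, and after fixing the indentation errors I get 49 errors. This makes me wonder whether this is your actual code or whether you made copy-paste mistakes. In any case, using an IDE will definitely help you validate your code and catch errors earlier. Give it a try!
Because I cannot run your code, I cannot say exactly why it does not work, but I can give you some pointers.
One thing that raises a lot of questions is the following line: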
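
dst_dir = root.replace(source, dest)

Apart from the bad indentation, the variable dst_dir is not used anywhere. So what is the point of this statement? Also note that this replaces every occurrence of source inside root. For trivial cases this is not a problem, but it is not robust in all circumstances. So use the path operations from the standard library wherever possible, and try to avoid manual string manipulation on paths. The pathlib module was introduced in Python 3.4. I recommend using it.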
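To make the point about str.replace() concrete, here is a minimal sketch. The paths and the two helper names are made up for illustration, and PurePosixPath is used only so the example behaves the same on every platform.

```python
from pathlib import PurePosixPath


def map_with_replace(root: str, source: str, dest: str) -> str:
    """Naive mapping: replaces every occurrence of source inside root."""
    return root.replace(source, dest)


def map_with_pathlib(root: str, source: str, dest: str) -> str:
    """Swap only the leading source prefix, using path-aware operations."""
    relative = PurePosixPath(root).relative_to(source)
    return str(PurePosixPath(dest) / relative)


# Hypothetical paths in which the source path happens to occur twice.
SOURCE = "/data/src"
DEST = "/backup"
ROOT = "/data/src/project/data/src"

naive_result = map_with_replace(ROOT, SOURCE, DEST)
pathlib_result = map_with_pathlib(ROOT, SOURCE, DEST)
print(naive_result)    # /backup/project/backup
print(pathlib_result)  # /backup/project/data/src
```

The string version also rewrites the second occurrence of the source path, while relative_to() only strips the leading prefix (and raises ValueError if root is not under source at all).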
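Using os.walk() is very convenient in some cases, but it may not be the best solution for your use case. Using os.listdir() recursively might be a lot easier, especially since the destination directory will be flat (that is, a fixed directory without subdirectories).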
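As an example of a case where os.walk() is convenient: the check the question asks for (is a file with this name already somewhere under the destination, subdirectories included) could be sketched as below. The function name is hypothetical, and comparing by file name only is an assumption about what "the same file" means here.

```python
import os


def existing_file_names(dest_dir):
    """Return the names of all files under dest_dir, subdirectories included."""
    names = set()
    for _root, _subdirs, files in os.walk(dest_dir):
        # 'files' holds bare file names (no directory part) for this level.
        names.update(files)
    return names
```

A source file would then be skipped when its name is in the returned set. Building the set once up front avoids walking the destination again for every source file.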
One possible implementation (using pathlib and os.listdir()) might look like this:

import logging
import os
import pathlib
import shutil
import time

SOURCE_DIR_PATH = pathlib.Path('C:\\Temp')
DESTINATION_DIR_PATH = pathlib.Path('D:\\archive')

CUTOFF_DAYS = 14
CUTOFF_TIME = time.time() - CUTOFF_DAYS * 24 * 3600  # two weeks


def move_file(src_file_path, dst_dir_path):
    logging.debug('Moving file %s to directory %s', src_file_path,
                  dst_dir_path)
    return  # REMOVE THIS LINE TO ACTUALLY PERFORM FILE OPERATIONS
    try:
        shutil.move(str(src_file_path), str(dst_dir_path))
    except OSError:
        logging.info('File already exists in destination directory: %s',
                     src_file_path)
        logging.warning('Deleting file %s', src_file_path)
        src_file_path.unlink()


def move_files(src_file_paths, dst_dir_path):
    for src_file_path in src_file_paths:
        if src_file_path.stat().st_ctime < CUTOFF_TIME:
            logging.info('Moving file older than %d days: %s', CUTOFF_DAYS,
                         src_file_path)
            move_file(src_file_path, dst_dir_path)
        else:
            logging.info('Not moving file less than %d days old: %s',
                         CUTOFF_DAYS, src_file_path)


def purge_files(src_dir_path, dst_dir_path):
    logging.info('Scanning directory %s', src_dir_path)
    names = os.listdir(src_dir_path)
    paths = [src_dir_path.joinpath(name) for name in names]
    file_paths = [path for path in paths if path.is_file()]
    dir_paths = [path for path in paths if path.is_dir()]

    # Cleanup files
    move_files(file_paths, dst_dir_path)

    # Cleanup directories, recursively.
    for dir_path in dir_paths:
        purge_files(dir_path, dst_dir_path)


def main():
    logging.basicConfig(format='%(message)s', level=logging.DEBUG)
    purge_files(SOURCE_DIR_PATH, DESTINATION_DIR_PATH)


if __name__ == '__main__':
    main()

I tested this code and it works.
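Note that I used the same error handling in move_file as in your example. However, I don't think it is very robust. What if two files with the same name exist in the source directory (in different subdirectories, or at different times)? Then the second file is deleted without being backed up. Also, if some other error occurs (such as "disk full" or "network error"), the code simply assumes the file has already been backed up and deletes the original. I don't know your use case, but I would seriously consider rewriting this function.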
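One way such a rewrite could look is sketched below. This is only an illustration under assumptions: the function name move_file_safely and the three return labels are invented here, and "already backed up" is taken to mean that a byte-for-byte identical file with the same name exists in the destination.

```python
import filecmp
import logging
import shutil
from pathlib import Path


def move_file_safely(src_file_path: Path, dst_dir_path: Path) -> str:
    """Move a file into dst_dir_path without silently losing data.

    Returns 'moved', 'duplicate' or 'conflict' (labels made up for this sketch).
    """
    target_path = dst_dir_path / src_file_path.name
    if not target_path.exists():
        # Any OSError raised here (disk full, network error, ...) propagates
        # to the caller, so the source file is never deleted by mistake.
        shutil.move(str(src_file_path), str(target_path))
        return 'moved'
    if filecmp.cmp(str(src_file_path), str(target_path), shallow=False):
        # An identical copy is already archived, so deleting is safe.
        logging.warning('Deleting duplicate %s', src_file_path)
        src_file_path.unlink()
        return 'duplicate'
    # Same name but different content: leave the source file untouched.
    logging.warning('Name conflict, keeping %s', src_file_path)
    return 'conflict'
```

The source file is only deleted after its content has been compared with the archived copy, and a name clash with different content is reported instead of being treated as "already exists".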
Still, I hope these suggestions and the example code put you on the right track.
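Answered 2019-05-30 04:33:30
You might want to clean up your code; it is full of bugs. For example, 'purge_files' instead of 'purge_files()' in main, indentation errors in purge_files, and so on. The seemingly random blank lines between the code also make it a bit awkward to read (at least for me) :)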
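To show why the missing parentheses matter, here is a tiny stand-alone sketch. The calls list and the two main_* functions are made up for the demonstration and are not part of the original script.

```python
calls = []


def purge_files():
    calls.append('purge_files')


def main_without_call():
    purge_files      # only names the function object, nothing runs


def main_with_call():
    purge_files()    # the parentheses actually invoke the function


main_without_call()
count_after_reference = len(calls)
main_with_call()
count_after_call = len(calls)
print(count_after_reference, count_after_call)  # 0 1
```

Python raises no error for the bare name, which is why a script written that way appears to run but does nothing.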
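Edit: I quickly went over your code and changed a few things, mainly variable names. I noticed several variables with non-descriptive names ('i', 't', etc.), each accompanied by a comment describing what the variable means. If you simply change the variable names to something more descriptive, you don't need the comments and your code becomes even easier to read. Note that I did not test this code; to be honest, it may not even run (that was not my goal; the goal was to show some of the style changes I suggest) :)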

import os
import shutil
import time
import errno
import sys
import logging
import logging.config

# NOTE: It is a convention to write constants in all caps
SOURCE = r'C:\Users\Desktop\BetaSource'
DEST = r'C:\Users\Desktop\BetaDest'

#Gets the current time from the time module
now = time.time()
#Timer of when to purge files
cutoff = now - (14 * 86400)

all_sources = []
all_dest_dirty = []

logging.basicConfig(level = logging.INFO,
                    filename = time.strftime("main-%Y-%m-%d.log"))


def main():
    # NOTE: dest_files() only fills the global 'all_dest_dirty' list;
    # returning the names instead would make the global unnecessary.
    dest_files()
    purge_files()


# I used the dest_files function to get all of the destination files
def dest_files():
    for root, subdirs, files in os.walk(DEST):
        for file in files:
            # NOTE: Is it really necessary to use a global here?
            global all_dest_dirty
            all_dest_dirty.append(file)


def purge_files():
    logging.info('invoke purge_files method')
    # I removed all duplicates from dest because cleaning up duplicates in
    # dest is out of the scope
    # NOTE: This is the perfect usecase for a set
    all_dest_clean = set(all_dest_dirty)
    # os.walk used to get all files in the source location
    for source_root, source_subdirs, source_files in os.walk(SOURCE):
        # looped through every file in source_files
        for file in source_files:
            # appending all_sources to get the application name from the
            # file path
            all_sources.append(os.path.abspath(file).split('\\')[-1])
        # looping through each element of all_sources (over a copy, because
        # entries are removed from the list inside the loop)
        for source in list(all_sources):
            # logical check to see if file in the source folder exists
            # in the destination folder
            if source not in all_dest_clean:
                # src is used to get the path of the source file this
                # will be needed to move the file in shutil.move
                src = os.path.abspath(os.path.join(source_root, source))
                # the two variables used below are to get the creation
                # time of the files
                metadata = os.stat(src)
                creation_time = metadata.st_ctime
                # logical check to see if the file is older than the cutoff
                if creation_time < cutoff:
                    shutil.move(src, DEST)
                    logging.info(f'File has been successfully moved: {source}')
                    print(f'File has been successfully moved: {source}')
                    # removing the already checked source files for the
                    # list this is also used in other spots within the loop
                    all_sources.remove(source)
                else:
                    logging.info(f'File is not older than 14 days: {source}')
                    print(f'File is not older than 14 days: {source}')
                    all_sources.remove(source)
            else:
                all_sources.remove(source)
                logging.info(f'File: {source} already exists in the destination')
                print(f'File: {source} already exists in the destination')


if __name__ == '__main__':
    main()

https://stackoverflow.com/questions/56360794