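I use this script to get a list of all the photos in a folder and its subfolders. However, the program seems really slow.
There are 50,000 .jpg images in this folder and its subfolders. I can reduce the for loops, but the program still runs at more or less the same speed.
I am also open to using lambda, but I would prefer to get the best speed out of Python's basic features. Can anyone suggest improvements?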

    import os
    from fnmatch import fnmatch
    import sys

    root = "C:\\Users\\Agartha\\Desktop\\photos"
    pattern = "*.jpg"

    with open("./files\\list.txt", "w") as a:
        for path, subdirs, files in os.walk(root):
            for filename in files:
                if fnmatch(filename, pattern):
                    a.write(str(os.path.join(filename)) + '\n')

Posted on 2018-03-12 12:56:13
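You are doing four things at once here. You are:

- finding all the files under root
- filtering the filenames against the pattern
- formatting each filename
- writing the result to the output file

To find out which step is the slow one, you should decouple them.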
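
    import os
    from fnmatch import fnmatch

    def find_files(root):
        # Yield every file name found under root (name only, without its directory).
        for path, subdirs, files in os.walk(root):
            for filename in files:
                yield filename

    def filter_filename(files, pattern):
        # Keep only the names matching the glob pattern, e.g. "*.jpg".
        for filename in files:
            if fnmatch(filename, pattern):
                yield filename

    def format_filenames(files, root):
        # os.path.join with a single argument returns it unchanged, as in the question.
        for filename in files:
            yield str(os.path.join(filename)) + '\n'

    def writelines(out_file, files):
        for filename in files:
            out_file.write(filename)
        # or:
        # out_file.writelines(files)

    def main(root, pattern, filename_out):
        files = find_files(root)
        files_filtered = filter_filename(files, pattern)
        # Format the filtered names, not the unfiltered ones.
        files_formatted = format_filenames(files_filtered, root)
        with open(filename_out, 'w') as out_file:
            writelines(out_file, files_formatted)

Then you can time the four pieces individually to find the biggest culprit and go from there: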

    files = list(find_files(root))
    files_filtered = list(filter_filename(files, pattern))
    files_formatted = list(format_filenames(files_filtered, root))
    with open(filename_out, 'w') as out_file:
        writelines(out_file, files_formatted)

https://codereview.stackexchange.com/questions/189390
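
A minimal sketch of how the four stages could be timed separately, assuming `time.perf_counter` for the measurements. The helpers `timed`, `write_lines`, and `profile_stages`, the stage labels, and the temporary demo folder are hypothetical names introduced here for illustration; they are not part of the original answer. Wrapping each generator in `list()` forces the stage to finish inside its own timing window.

```python
import os
import tempfile
import time
from fnmatch import fnmatch


def find_files(root):
    # Walk the tree and yield bare file names.
    for path, subdirs, files in os.walk(root):
        for filename in files:
            yield filename


def filter_filename(files, pattern):
    # Keep only the names matching the glob pattern.
    for filename in files:
        if fnmatch(filename, pattern):
            yield filename


def format_filenames(files):
    # Turn each name into one output line.
    for filename in files:
        yield filename + '\n'


def write_lines(filename_out, lines):
    with open(filename_out, 'w') as out_file:
        out_file.writelines(lines)


def timed(label, func, timings):
    # Hypothetical helper: run func() and record its wall-clock time under label.
    start = time.perf_counter()
    result = func()
    timings[label] = time.perf_counter() - start
    return result


def profile_stages(root, pattern, filename_out):
    # Run the stages one after another, fully materializing each result,
    # so that every stage is measured on its own.
    timings = {}
    files = timed("find", lambda: list(find_files(root)), timings)
    filtered = timed("filter", lambda: list(filter_filename(files, pattern)), timings)
    formatted = timed("format", lambda: list(format_filenames(filtered)), timings)
    timed("write", lambda: write_lines(filename_out, formatted), timings)
    return timings


if __name__ == "__main__":
    # Demo on a small throwaway folder (an assumption, so the sketch runs anywhere).
    with tempfile.TemporaryDirectory() as demo_root:
        for name in ("one.jpg", "two.jpg", "notes.txt"):
            open(os.path.join(demo_root, name), "w").close()
        out_path = os.path.join(demo_root, "list.out")
        for label, seconds in profile_stages(demo_root, "*.jpg", out_path).items():
            print(f"{label}: {seconds:.6f} s")
```

Calling `profile_stages(root, pattern, filename_out)` on the real photo folder should show which of the four stages dominates, which is worth knowing before optimizing any of them.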