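I am designing a tool that should pick up only those EXR image files from an input folder that follow the naming convention u#_v#.exr or u#_v#_#.exr (where # represents an integer, specifically a positive non-zero integer). All other files should be ignored. My working code is given below. However, is there a better or more efficient way to do this?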
import argparse
import glob
import os

import numpy as np


def main():
    # Add and read command line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_folder', type=str, help='Directory where input images are located')
    parser.add_argument('--output_folder', type=str, help='Directory where output image should be written')
    args = parser.parse_args()
    # Change directory to input folder and check all filenames belonging to our convention
    os.chdir(args.input_folder)
    all_files = check_combinatons_of_numeric_characters('u_v.exr', 'u_v_.exr')
    print(all_files)


def check_combinatons_of_numeric_characters(convention1, convention2):
    # Combinations for first convention which was supplied
    split_convention1 = convention1.split('_')
    convention1_combinations_alpha = np.array([])
    convention1_combination1 = glob.glob(split_convention1[0] + '[0-9]_' +
                                         split_convention1[1].split('.')[0] + '[0-9].' +
                                         split_convention1[1].split('.')[1]
                                         )
    convention1_combination2 = glob.glob(split_convention1[0] + '[0-9][0-9]_' +
                                         split_convention1[1].split('.')[0] + '[0-9][0-9].' +
                                         split_convention1[1].split('.')[1]
                                         )
    convention1_combination3 = glob.glob(split_convention1[0] + '[0-9][0-9]_' +
                                         split_convention1[1].split('.')[0] + '[0-9].' +
                                         split_convention1[1].split('.')[1]
                                         )
    convention1_combination4 = glob.glob(split_convention1[0] + '[0-9]_' +
                                         split_convention1[1].split('.')[0] + '[0-9][0-9].' +
                                         split_convention1[1].split('.')[1]
                                         )
    convention1_combinations_alpha = np.concatenate((convention1_combination1,
                                                     convention1_combination2,
                                                     convention1_combination3,
                                                     convention1_combination4),
                                                    )
    # Combinations for second convention supplied
    split_convention2 = convention2.split('_')
    convention2_combinations_alpha = np.array([])
    convention2_combination1 = glob.glob(split_convention2[0] + '[0-9]_' +
                                         split_convention2[1] + '[0-9]_[0-9]' +
                                         split_convention2[2]
                                         )
    convention2_combination2 = glob.glob(split_convention2[0] + '[0-9][0-9]_' +
                                         split_convention2[1] + '[0-9]_[0-9]' +
                                         split_convention2[2]
                                         )
    convention2_combination3 = glob.glob(split_convention2[0] + '[0-9]_' +
                                         split_convention2[1] + '[0-9]_[0-9][0-9]' +
                                         split_convention2[2]
                                         )
    convention2_combination4 = glob.glob(split_convention2[0] + '[0-9][0-9]_' +
                                         split_convention2[1] + '[0-9][0-9]_[0-9]' +
                                         split_convention2[2]
                                         )
    convention2_combination5 = glob.glob(split_convention2[0] + '[0-9]_' +
                                         split_convention2[1] + '[0-9][0-9]_[0-9][0-9]' +
                                         split_convention2[2]
                                         )
    convention2_combination6 = glob.glob(split_convention2[0] + '[0-9][0-9]_' +
                                         split_convention2[1] + '[0-9]_[0-9][0-9]' +
                                         split_convention2[2]
                                         )
    convention2_combinations_alpha = np.concatenate((convention2_combination1,
                                                     convention2_combination2,
                                                     convention2_combination3,
                                                     convention2_combination4,
                                                     convention2_combination5,
                                                     convention2_combination6),
                                                    )
    list_of_files = np.concatenate((convention1_combinations_alpha, convention2_combinations_alpha))
    return list_of_files


if __name__ == '__main__':
    main()

Posted on 2022-09-28 09:24:14
I would simply match all *.exr files and then skip the ones that do not follow the pattern.
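import glob
import re

list_of_files = [file for file in glob.glob('*.exr')
                 if re.match(r'^u\d{1,2}_v\d{1,2}\.exr$', file)]

If zero strictly needs to be excluded, the regex would need to use (?!0)\d{1,2} instead of \d{1,2} (in both places), which rejects any number that starts with 0; or, if you want to permit leading zeros (but not a zero followed by a non-digit), use (?!0\D)\d{1,2}.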
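In more detail, \d matches one digit, {1,2} says to match one or two repetitions of the preceding expression, and \D matches a character which is not a digit. (?!something) is a negative lookahead which prevents a match if the text at this point matches the regex something. \. matches a literal dot, ^ matches the beginning of the file name, and $ matches the end; most other characters simply match themselves. For a more detailed explanation, see the documentation for the Python re module and/or the beginner resources on the Stack Overflow regex tag info page.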
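To see how the three numeric sub-patterns differ in practice, here is a minimal sketch that runs each one over a handful of file names. The names `build_pattern`, `matching`, and `samples` are hypothetical helpers introduced only for this illustration, and the sample file names are made up.

```python
import re

# Three variants of the numeric part, as discussed in the answer.
ANY_DIGITS = r'\d{1,2}'            # one or two digits, zero included
NO_LEADING_ZERO = r'(?!0)\d{1,2}'  # first digit must not be 0
NO_BARE_ZERO = r'(?!0\D)\d{1,2}'   # a lone 0 is rejected, 05 is allowed


def build_pattern(number):
    """Compile the u#_v#.exr pattern around one numeric sub-pattern."""
    return re.compile(rf'^u{number}_v{number}\.exr$')


def matching(names, number):
    """Return the names that match the convention for this sub-pattern."""
    pattern = build_pattern(number)
    return [name for name in names if pattern.match(name)]


# Hypothetical sample names, chosen only to exercise each variant.
samples = ['u1_v2.exr', 'u0_v3.exr', 'u05_v10.exr',
           'u123_v1.exr', 'u1_v2.png', 'xu1_v2.exr']

print(matching(samples, ANY_DIGITS))
print(matching(samples, NO_LEADING_ZERO))
print(matching(samples, NO_BARE_ZERO))
```

Note that the last variant still accepts a name such as u00_v1.exr, because the first 0 is followed by a digit; if that matters, the stricter (?!0) form is probably the safer choice.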
Convert the resulting list to a data frame at your leisure.
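As one possible way to prepare the matches for tabular use, the sketch below captures the numbers with named groups and returns plain dictionaries. It assumes that the second convention is u#_v#_#.exr, as the glob patterns in the question suggest, and it does not exclude zero. The names `FILENAME_RE`, `parse_exr_name`, and `collect_exr_records` are hypothetical and do not come from the original answer.

```python
import re
from pathlib import Path

# Assumption: the optional third number covers the u#_v#_#.exr convention.
FILENAME_RE = re.compile(
    r'u(?P<u>\d{1,2})_v(?P<v>\d{1,2})(?:_(?P<index>\d{1,2}))?\.exr'
)


def parse_exr_name(name):
    """Return the parsed numbers for a conforming file name, else None."""
    match = FILENAME_RE.fullmatch(name)
    if match is None:
        return None
    index = match.group('index')
    return {
        'file': name,
        'u': int(match.group('u')),
        'v': int(match.group('v')),
        'index': int(index) if index is not None else None,
    }


def collect_exr_records(folder):
    """Scan `folder` for *.exr files and keep only the conforming ones."""
    records = []
    for path in sorted(Path(folder).glob('*.exr')):
        record = parse_exr_name(path.name)
        if record is not None:  # silently ignore non-conforming files
            records.append(record)
    return records
```

Using `Path(folder).glob` avoids the `os.chdir` call from the question. If pandas is installed, passing the returned list to `pandas.DataFrame` should give one row per file.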
https://stackoverflow.com/questions/73878633