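Suppose I have a list of files:

files = ['s1.txt', 'ai1.txt', 's2.txt', 'ai3.txt']

I need to sort them into sublists according to their number, so that

files = [['s1.txt', 'ai1.txt'], ['s2.txt'], ['ai3.txt']]

I could write a bunch of loops, but I was wondering if there is a better way to do this?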
Answered 2012-02-27 21:49:15
Here is a complete working example based on defaultdict:

import re
from collections import defaultdict

files = ['s1.txt', 'ai1.txt', 's2.txt', 'ai3.txt']

def get_key(fname):
    # Use the first run of digits in the filename as the grouping key.
    return int(re.findall(r'\d+', fname)[0])

d = defaultdict(list)
for f in files:
    d[get_key(f)].append(f)

# Emit the groups in ascending numeric order.
out = [d[k] for k in sorted(d.keys())]
print(out)

This produces the following result:

[['s1.txt', 'ai1.txt'], ['s2.txt'], ['ai3.txt']]

Answered 2012-02-27 21:46:56
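import itertools
import re

# Match the first run of digits. The greedy pattern "^.*([0-9]+).*$"
# would capture only the last digit of a multi-digit number.
r_number = re.compile(r"[0-9]+")

def key_for_filename(filename):
    # Edit: This doesn't check for missing numbers.
    # int() makes the sort numeric, so that 10 sorts after 2.
    return int(r_number.search(filename).group(0))

# files is the list from the question.
grouped = [list(v) for k, v in
           itertools.groupby(sorted(files, key=key_for_filename),
                             key_for_filename)]

Answered 2012-02-27 21:48:26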
First, write a function that extracts the number from a filename:
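import re

def file_number(name):
    # Search the name that was passed in; int() keeps the later sort numeric.
    return int(re.search(r"\d+", name).group(0))

(Note that this function will raise an error if the name contains no digits at all.)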
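One way to avoid that error is to fall back to a default key when no digits are found. This is a minimal sketch and not part of the original answer; the name `file_number_or_default` and its `default` parameter are hypothetical.

```python
import re

def file_number_or_default(name, default=-1):
    # Hypothetical helper: like file_number, but tolerates names without digits.
    match = re.search(r"\d+", name)
    # re.search returns None when nothing matches, so check before calling group().
    return int(match.group(0)) if match else default

print(file_number_or_default("s1.txt"))      # prints 1
print(file_number_or_default("readme.txt"))  # prints -1
```

With a default of -1, files without a number would all land in a single group that sorts before the numbered ones.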
Sort the list using this function as the key:
files.sort(key=file_number)

Then use itertools.groupby() to group by this key:
import itertools

for number, group in itertools.groupby(files, file_number):
    pass  # whatever

https://stackoverflow.com/questions/9466017
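The sort-then-group steps from the last answer can be put together as a short runnable sketch. It assumes every filename contains at least one digit, and it converts the key to an int so that the order is numeric.

```python
import itertools
import re

def file_number(name):
    # First run of digits in the name, as an int so that sorting is numeric.
    return int(re.search(r"\d+", name).group(0))

files = ['s1.txt', 'ai1.txt', 's2.txt', 'ai3.txt']

# groupby only merges adjacent items, so sort by the same key first.
files.sort(key=file_number)

grouped = [list(group) for number, group in itertools.groupby(files, file_number)]
print(grouped)  # prints [['s1.txt', 'ai1.txt'], ['s2.txt'], ['ai3.txt']]
```

Because list.sort() is stable, files that share a number keep their original relative order within each group.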