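I have a very large folder containing 19 subfolders, each holding images of a single class. How can I split them into training/test/validation sets and also create an annotation file for the validation and test sets, so that I can train a model?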
Posted on 2022-07-28 15:04:11
For the split:
import splitfolders
# Split with a ratio given as (train, val, test); here 40% / 10% / 50%.
# To split into only training and validation sets, pass a 2-tuple to `ratio`, e.g. `(.8, .2)`.
splitfolders.ratio(
    "/home/marouane/dev/mdl-py-classification/model/data",
    output="/home/marouane/dev/mdl-py-classification/model/data_s",
    seed=1337, ratio=(.4, .1, .5),
    group_prefix=None, move=False,  # group_prefix and move are left at their defaults
)
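After splitting, it can be worth checking that each class ended up with roughly the expected share of images. The following is a minimal sketch using only the standard library; the helper name `count_images` is hypothetical, and it assumes the output folder has the `<split>/<class>/<image>` layout produced by the call above.

```python
from collections import Counter
from pathlib import Path


def count_images(split_root):
    """Return {split_name: Counter({class_name: n_files})} for a split output tree.

    Hypothetical helper: assumes split_root contains one folder per split
    (e.g. train, val, test), each holding one folder per class.
    """
    counts = {}
    for split_dir in sorted(Path(split_root).iterdir()):
        if not split_dir.is_dir():
            continue  # skip stray files such as val.txt / test.txt
        per_class = Counter()
        for class_dir in sorted(split_dir.iterdir()):
            if class_dir.is_dir():
                per_class[class_dir.name] = sum(
                    1 for p in class_dir.iterdir() if p.is_file()
                )
        counts[split_dir.name] = per_class
    return counts
```

Calling `count_images("/home/marouane/dev/mdl-py-classification/model/data_s")` and printing the result per split should make a badly unbalanced or empty class easy to spot.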
For the annotations:
import os

def train_test_split(name):
    """
    Write an annotation file for one split.

    parameter : name of the split folder ("val" or "test")
    return : None; writes <name>.txt with one "<class>/<file> <class index>" line per image
    """
    classes_dir = ["advert","box_start_horse","end_carriage","end_horse","group_heat_carriage","group_heat_horse","heat_carriage","heat_horse","interview",
        "orthogonal_start_carriage","orthogonal_start_horse","paddock","presenters","race_carriage","race_horse","slide","truck_start_carriage","walking_horse","winnerslide"
    ]
    split_dir = '/home/marouane/dev/mdl-py-classification/model/data_s/' + name + "/"
    dest_file = '/home/marouane/dev/mdl-py-classification/model/data_s/' + name + ".txt"
    # Open the file once in write mode so that re-running does not append duplicate lines.
    with open(dest_file, 'w') as f:
        for label, cls in enumerate(classes_dir):
            for file in sorted(os.listdir(split_dir + cls)):
                f.write(cls + "/" + file + " " + str(label) + "\n")

train_test_split("val")
train_test_split("test")
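To use the annotation files during training or evaluation, they have to be read back into (path, label) pairs. The following is a minimal sketch of one way to do that; the helper name `read_annotations` is hypothetical, and it assumes each line has the form `<class>/<file> <class index>` as written by the function above.

```python
def read_annotations(annotation_path):
    """Parse an annotation file into a list of (relative_path, label) tuples.

    Hypothetical helper: assumes one "<class>/<file> <label>" entry per line.
    """
    samples = []
    with open(annotation_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            # Split on the last space only, so file names containing spaces survive.
            rel_path, label = line.rsplit(" ", 1)
            samples.append((rel_path, int(label)))
    return samples
```

The returned paths are relative to the split folder (for example `data_s/val/`), so they need to be joined with that folder before the images are loaded.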
https://stackoverflow.com/questions/73152789