I have a tfrecord file of roughly 8 GB and I want to split it into 4 files of roughly 2 GB each. How can I do that?
Posted on 2018-04-11 21:09:52
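I am not aware of any way to specify the resulting size of a tfrecord file. However, you can certainly limit the number of features written to each tfrecord file. I know this is not exactly what you asked for, but it gets the job done in a similar way.
Here is sample code showing how I have handled this in the past (see the full code here):
(fragment_size is the number of features in one tfrecord file; in this code, one tf.train.Example is written per video)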
writer = None  # no tfrecord file is open yet
feature = {}   # feature dict, overwritten for every video
for video_count in range(num_videos):
    # Start a new tfrecord file every fragment_size videos.
    if video_count % fragment_size == 0:
        if writer is not None:
            writer.close()
        filename = os.path.join(destination_path, name + str(
            current_batch_number) + '_of_' + str(
            total_batch_number) + '.tfrecords')
        print('Writing', filename)
        # TF 1.x API; TF 2.x exposes the same class as tf.io.TFRecordWriter.
        writer = tf.python_io.TFRecordWriter(filename)
    # Store every frame of the video as a raw byte string.
    for image_count in range(num_images):
        path = 'blob' + '/' + str(image_count)
        image = data[video_count, image_count, :, :, :]
        image = image.astype(color_depth)
        image_raw = image.tobytes()  # tostring() is a deprecated alias of tobytes()
        feature[path] = _bytes_feature(image_raw)
        feature['height'] = _int64_feature(height)
        feature['width'] = _int64_feature(width)
        feature['depth'] = _int64_feature(num_channels)
    # One Example per video, appended to the currently open file.
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    writer.write(example.SerializeToString())
# Close the last file.
if writer is not None:
    writer.close()

https://stackoverflow.com/questions/48184831
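
The code above shards the data while it is being written. For the case in the question, where the 8 GB file already exists, the same idea can be applied by copying the serialized records into new files and starting a new file whenever a byte budget would be exceeded. The following is only a minimal sketch, not part of the original answer. It assumes TensorFlow 2.x with eager execution (`tf.data.TFRecordDataset` and `tf.io.TFRecordWriter`) and an uncompressed input file. The names `split_records` and `split_tfrecord_file` and the output filename pattern are hypothetical. The 16 bytes of overhead per record come from the TFRecord framing (an 8-byte length, a 4-byte CRC of the length, and a 4-byte CRC of the data).

```python
# Per-record framing overhead in a TFRecord file:
# 8-byte length + 4-byte length CRC + 4-byte data CRC.
TFRECORD_OVERHEAD = 16


def split_records(records, open_writer, max_shard_bytes):
    """Copy serialized records into shards of at most about max_shard_bytes.

    records:      iterable of bytes objects (serialized examples)
    open_writer:  callable taking a shard index and returning an object
                  with write(bytes) and close() methods (hypothetical hook)
    Returns the number of shards written. A single record larger than the
    budget still gets a shard of its own.
    """
    writer, shard, used = None, -1, 0
    for payload in records:
        on_disk = len(payload) + TFRECORD_OVERHEAD
        # Open the first shard, or roll over when the budget would be exceeded.
        if writer is None or used + on_disk > max_shard_bytes:
            if writer is not None:
                writer.close()
            shard += 1
            used = 0
            writer = open_writer(shard)
        writer.write(payload)
        used += on_disk
    if writer is not None:
        writer.close()
    return shard + 1


def split_tfrecord_file(src_path, dst_pattern, max_shard_bytes):
    """Split src_path into files named dst_pattern.format(shard_index)."""
    import tensorflow as tf  # imported here so split_records has no TF dependency

    # Records are copied as raw bytes, so the examples never need to be parsed.
    records = (r.numpy() for r in tf.data.TFRecordDataset(src_path))
    return split_records(
        records,
        lambda i: tf.io.TFRecordWriter(dst_pattern.format(i)),
        max_shard_bytes,
    )
```

Under these assumptions, a call such as `split_tfrecord_file('big.tfrecords', 'part_{}.tfrecords', 2 * 1024**3)` would produce files of at most about 2 GiB each. The number of output files then depends on the record sizes and is not fixed at 4.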