I am trying to add more than 70,000 new features to a GenBank file using Biopython.
I have this code:
    from Bio import SeqIO
    from Bio.SeqFeature import SeqFeature, FeatureLocation

    fi = "myoriginal.gbk"
    fo = "mynewfile.gbk"
    for result in results:
        start = 0
        end = 0
        result = result.split("\t")
        start = int(result[0])
        end = int(result[1])
        # The input file is re-parsed for every single result
        for record in SeqIO.parse(fi, "gb"):
            record.features.append(SeqFeature(FeatureLocation(start, end), type = "misc_feat"))
            SeqIO.write(record, fo, "gb")

results is just a list containing the start and end of each feature I need to add to the original gbk file.
This solution is very expensive for my computer, and I don't know how to improve its performance. Any good ideas?
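Posted on 2015-07-22 11:59:57
You should parse the GenBank file only once. Leaving aside what results contains (I don't know, because some code is missing from your example), I think something like this would improve performance, modifying your code: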
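    from Bio import SeqIO
    from Bio.SeqFeature import SeqFeature, FeatureLocation

    fi = "myoriginal.gbk"
    fo = "mynewfile.gbk"
    # Parse the GenBank file once and keep the records in memory
    original_records = list(SeqIO.parse(fi, "gb"))
    for result in results:
        result = result.split("\t")
        start = int(result[0])
        end = int(result[1])
        for record in original_records:
            record.features.append(SeqFeature(FeatureLocation(start, end), type = "misc_feat"))
    # Write all records once, after every feature has been added
    SeqIO.write(original_records, fo, "gb")

https://stackoverflow.com/questions/31290902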
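
The parse-once idea can also be sketched as two small functions. This is only an illustration under stated assumptions: `parse_intervals` and `add_features` are hypothetical names that do not appear in the thread, results is assumed to hold tab-separated "start<TAB>end" strings, and every interval is added to every record, as in the code above. The default feature type is the standard GenBank key `misc_feature`, whereas the thread uses the string "misc_feat"; pass whichever you actually need.

```python
def parse_intervals(lines):
    """Turn 'start<TAB>end' strings into (start, end) integer tuples."""
    intervals = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        intervals.append((int(fields[0]), int(fields[1])))
    return intervals


def add_features(in_path, out_path, intervals, feature_type="misc_feature"):
    """Append one feature per interval to every record, then write once.

    Hypothetical helper: it mirrors the thread's logic but is not taken from it.
    """
    # Imported here so parse_intervals can be used without Biopython installed.
    from Bio import SeqIO
    from Bio.SeqFeature import SeqFeature, FeatureLocation

    # Parse the input file a single time and keep the records in memory.
    records = list(SeqIO.parse(in_path, "gb"))
    for record in records:
        record.features.extend(
            SeqFeature(FeatureLocation(start, end), type=feature_type)
            for start, end in intervals
        )
    # One write call for all records, after every feature has been added.
    return SeqIO.write(records, out_path, "gb")
```

With the file names from the question, a call might look like `add_features("myoriginal.gbk", "mynewfile.gbk", parse_intervals(results))`. Doing the string-to-integer conversion once, up front, keeps it out of the per-record loop.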