Q: BioPython: how to parse a GenBank file by the "Locus" key

Stack Overflow user

Asked 2019-10-31 10:02:20

Answers: 1 · Views: 286 · Followers: 0 · Votes: 1

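I have a GenBank file containing many sequences. I also have a second text file, a TSV, that holds the names of these sequences along with some other information, and I read it in as a pandas DataFrame. I used the .sample function to pick one name at random from that data and assigned it to the variable n_name, as shown in the code block below.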

n = df_bp_pos_2.sample(n = 1)
n_value = n.iloc[:2]
n_name = n.iloc[:1]

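n_name is identical to the locus name in the GenBank file, including case. I am trying to parse the GenBank file and extract the sequence whose locus equals n_name. The GenBank file is named all.gb. So far I have: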

from Bio import SeqIO
for seq_record in SeqIO.parse("all.gb", "genbank"):

But I am not sure what the next line or two should be in order to select by locus. Any ideas?

python

pandas

bioinformatics

biopython

genbank

1 Answer

Stack Overflow user

Answered 2019-11-29 06:21:58

You can also match against a list of locus tags rather than just a single one.

from Bio import SeqIO

locus_tags = ["b0001", "b0002"]  # Example list of locus tags
records = []

for record in SeqIO.parse('all.gb', 'genbank'):
    for feature in record.features:
        # Qualifier values are lists; features without a locus_tag give None
        tag = feature.qualifiers.get('locus_tag')
        if tag and tag[0] in locus_tags:
            # Extract the feature's DNA sequence from the parent record.
            # Alternatively, read the protein from the 'translation'
            # qualifier, or translate the extracted sequence on the fly.
            records.append(feature.extract(record.seq))
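
The three options named in the comment above (extracting the feature's DNA, reading the stored `translation` qualifier, or translating on the fly) can be sketched on a small hand-built record. The record, its sequence, and the `b0001` feature here are made up for illustration and do not come from all.gb; the sketch assumes a reasonably recent Biopython.

```python
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

# Hypothetical record: a 12-base CDS (ATG GCC AAA TAA) flanked by two bases on each side.
record = SeqRecord(Seq("GGATGGCCAAATAACC"), id="demo", name="demo")
feature = SeqFeature(
    FeatureLocation(2, 14, strand=1),
    type="CDS",
    qualifiers={"locus_tag": ["b0001"], "translation": ["MAK"]},
)
record.features.append(feature)

# Option 1: the feature's DNA sequence, cut out of the parent sequence.
dna = feature.extract(record.seq)

# Option 2: the protein as stored in the file (qualifier values are lists).
stored = feature.qualifiers.get("translation", [None])[0]

# Option 3: translate the extracted DNA on the fly, stopping at the stop codon.
computed = dna.translate(to_stop=True)

print(dna, stored, computed)
```

Which option fits depends on whether the DNA or the protein is wanted, and on whether the file actually carries a `translation` qualifier for the feature.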

You can find more information about the extract method and feature qualifiers in the Biopython Cookbook.
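
The question itself asks about the name on the LOCUS line rather than a `locus_tag` qualifier. Biopython's GenBank parser exposes that name as `record.name`, so a direct comparison may be all that is needed. Below is a minimal sketch; `find_by_locus` is a hypothetical helper, and the two demo records are written to an in-memory GenBank file only so the example runs without all.gb (this assumes a Biopython version that uses the `molecule_type` annotation when writing GenBank).

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord


def find_by_locus(source, locus_name):
    """Return the first record whose LOCUS name equals locus_name, else None."""
    for record in SeqIO.parse(source, "genbank"):
        # The name from the LOCUS line is stored in record.name
        if record.name == locus_name:
            return record
    return None


# Made-up demo records standing in for the contents of all.gb.
demo = [
    SeqRecord(Seq("ATGGCCTAA"), id="SEQ_A", name="SEQ_A",
              annotations={"molecule_type": "DNA"}),
    SeqRecord(Seq("ATGAAATAG"), id="SEQ_B", name="SEQ_B",
              annotations={"molecule_type": "DNA"}),
]
handle = StringIO()
SeqIO.write(demo, handle, "genbank")
handle.seek(0)

hit = find_by_locus(handle, "SEQ_B")
print(hit.name, hit.seq)
```

With the real file the call would be `find_by_locus("all.gb", n_name)`, where `n_name` has to be a plain string. Note that `n.iloc[:1]` on the sampled DataFrame returns a one-row DataFrame rather than a string, so a scalar lookup such as `n.iloc[0, 1]` may be needed first; the column position there is a guess, since the TSV layout is not shown.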

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/58635996
