Q: Query NCBI for a sequence via Biopython

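Stack Overflow user

Asked 2014-06-27 13:56:00

Answers: 3, Views: 1.5K, Followers: 0, Votes: 0

How can I query NCBI for sequences using Biopython, given a chromosome GenBank identifier and start and stop positions?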

CP001665    NAPP    TILE    6373    6422    .   +   .   cluster=9; 
CP001665    NAPP    TILE    6398    6447    .   +   .   cluster=3; 
CP001665    NAPP    TILE    6423    6472    .   +   .   cluster=3; 
CP001665    NAPP    TILE    6448    6497    .   +   .   cluster=3;
CP001665    NAPP    TILE    7036    7085    .   +   .   cluster=10; 
CP001665    NAPP    TILE    7061    7110    .   +   .   cluster=3; 
CP001665    NAPP    TILE    7073    7122    .   +   .   cluster=3;

Tags: ncbi, sequence, bioinformatics, biopython

3 Answers

Stack Overflow user

Accepted answer

Posted 2014-07-01 12:35:48

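from Bio import Entrez
from Bio import SeqIO

# NCBI asks for a contact address with every E-utilities request
Entrez.email = "sample@example.org"

# Request the full GenBank record for the accession as plain text
handle = Entrez.efetch(db="nuccore",
                       id="CP001665",
                       rettype="gb",
                       retmode="text")

# Parse the single record in the handle into a SeqRecord
whole_sequence = SeqIO.read(handle, "genbank")
handle.close()

# Slice like a list (0-based, end-exclusive) and print the sub-record
print(whole_sequence[6373:6422])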

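Once you know the id and the database to fetch from, use Entrez.efetch to get a handle to the record. You should specify the return type (rettype="gb") and the mode (retmode="text") so that the handle yields file-like data.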
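
If only one region is needed, the whole record may not have to be downloaded. The following is a minimal sketch, assuming the NCBI E-utilities efetch parameters seq_start, seq_stop and strand, which Biopython forwards as keyword arguments. The helper names region_query and fetch_region are hypothetical (not part of Biopython), and the coordinates are assumed to be 1-based and inclusive.

```python
def region_query(accession, start, end, strand="+"):
    """Build efetch keyword arguments for one 1-based, inclusive region."""
    return {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
        "seq_start": start,
        "seq_stop": end,
        # E-utilities encodes the plus strand as 1 and the minus strand as 2
        "strand": 1 if strand == "+" else 2,
    }


def fetch_region(accession, start, end, strand="+", email="sample@example.org"):
    """Fetch one region as a SeqRecord (hypothetical helper, needs network access)."""
    # Imported here so region_query can be used without Biopython installed
    from Bio import Entrez, SeqIO

    Entrez.email = email
    handle = Entrez.efetch(**region_query(accession, start, end, strand))
    try:
        return SeqIO.read(handle, "fasta")
    finally:
        handle.close()
```

Calling fetch_region("CP001665", 6373, 6422) would then request only that region rather than the complete genome.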

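Then pass the handle returned by Entrez.efetch to SeqIO, which should return a SeqRecord object. A nice feature of SeqRecord objects is that they can be sliced cleanly, just like lists. If you can retrieve the start and end points from somewhere, the print line above returns: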

ID: CP001665.1
Name: CP001665
Description: Escherichia coli 'BL21-Gold(DE3)pLysS AG', complete genome.
Number of features: 0
Seq('GCGCTAACCATGCGAGCGTGCCTGATGCGCTACGCTTATCAGGCCTACG', IUPACAmbiguousDNA())
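
The sequence printed above is 49 bases long, while the first row in the question spans 6373 to 6422. If those rows are GFF-style with 1-based, inclusive coordinates (an assumption, the question does not say so), the matching Python slice would be [6372:6422], which gives 50 bases. A minimal sketch of that conversion follows; parse_tile and slice_tile are hypothetical helper names, and a plain string stands in for the downloaded record so the example runs offline.

```python
def parse_tile(line):
    """Return (seqid, start, end, strand) from one whitespace-delimited GFF-style row."""
    fields = line.split()
    return fields[0], int(fields[3]), int(fields[4]), fields[6]


def slice_tile(record, start, end):
    """Slice a 1-based, inclusive region out of any sliceable sequence or SeqRecord."""
    return record[start - 1:end]


rows = [
    "CP001665    NAPP    TILE    6373    6422    .   +   .   cluster=9;",
    "CP001665    NAPP    TILE    6398    6447    .   +   .   cluster=3;",
]

# Stand-in for the downloaded record so the sketch runs offline;
# pass the SeqRecord returned by SeqIO.read here instead.
demo_sequence = "ACGT" * 2000

for row in rows:
    seqid, start, end, strand = parse_tile(row)
    tile = slice_tile(demo_sequence, start, end)
    print(seqid, start, end, len(tile))
```

Every row in the question is on the "+" strand. For a "-" row, calling reverse_complement() on the sliced SeqRecord would probably be the extra step.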

Votes: 1

Stack Overflow user

Posted 2014-06-28 07:02:15

Maybe something like this?

    from Bio import Entrez
    Entrez.email = "Your.Name.Here@example.org"
    handle = Entrez.efetch(db="genome", id="56", rettype="fasta")

You need to work out the correct database and query it. I suggest using the query builder to see whether you can solve the problem that way:

http://www.ncbi.nlm.nih.gov/assembly/advanced

Votes: 0

Stack Overflow user

Posted 2016-08-26 04:37:08

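You don't need Biopython to download nucleotide or protein sequences. You can use urllib2 (urllib.request in Python 3), or you can use Biopython or BioPerl. Here gi_list contains NCBI GI IDs.

import urllib.request

# NCBI GI identifiers of the protein records to download
gi_list = ['440906003', '440909279', '440901052']
for gi in gi_list:
    # Sequence viewer URL that returns the record for this GI as FASTA
    url = ('https://www.ncbi.nlm.nih.gov/sviewer/viewer.cgi?'
           'tool=portal&sendto=on&log$=seqview&db=protein&dopt=fasta'
           '&sort=&val=' + gi + '&from=begin&to=end&maxplex=1')
    with urllib.request.urlopen(url) as resp:
        page = resp.read().decode()
    print(page, end='')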
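
As a variation on the loop above, the same IDs could be requested in one call through the E-utilities efetch endpoint, using only the standard library. This is a sketch under assumptions: efetch_url and fetch_fasta are hypothetical helper names, and whether these GI numbers still resolve has not been checked here.

```python
import urllib.parse
import urllib.request

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(ids, db="protein", rettype="fasta"):
    """Build one efetch URL that requests every ID in a single call."""
    params = {
        "db": db,
        "id": ",".join(ids),  # efetch accepts a comma-separated ID list
        "rettype": rettype,
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urllib.parse.urlencode(params)


def fetch_fasta(ids, db="protein"):
    """Download the records as one FASTA-formatted string (needs network access)."""
    with urllib.request.urlopen(efetch_url(ids, db=db)) as resp:
        return resp.read().decode()


# Only builds and prints the URL; call fetch_fasta(...) to download
url = efetch_url(["440906003", "440909279", "440901052"])
print(url)
```

Building the query with urllib.parse.urlencode avoids the stray spaces and manual string concatenation that make hand-written URLs easy to break.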

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/24453617
