文章/答案/技术大牛

发布

问Rosalind共识和简介
EN

Stack Overflow用户

提问于 2016-11-10 04:03:13

回答 1查看 598关注 0票数 0

我的脚本填充计数器列表并输出相同长度的一致字符串。我不认为有数学或索引错误正在发生，也不会遇到困难。我的代码：

 with open('C:/users/steph/downloads/rosalind_cons (3).txt') as f:
    seqs = f.read().splitlines()

#remove all objects that are not sequences of interest
for s in seqs:
    if s[0] == '>':
        seqs.remove(s)

n = range(len(seqs[0])+1)

#lists to store counts for each nucleotide
A, C, G, T = [0 for i in n], [0 for i in n], [0 for i in n], [0 for i in n]

#see what nucleotide is at each index and augment the 
#same index of the respective list
def counter(Q):
    for q in Q:
        for k in range(len(q)):
            if q[k] == 'A':
                A[k] += 1
            elif q[k] == 'C':
                C[k] += 1
            elif q[k] == 'G':
                G[k] += 1
            elif q[k] == 'T':
                T[k] += 1
counter(seqs)

#find the max of all the counter lists at every index 
#and add the respective nucleotide to the consensus sequence
def consensus(a,t,c,g):
        consensus = ''
        for k in range(len(a)):
            if (a[k] > t[k]) and (a[k]>c[k]) and (a[k]>g[k]):
                consensus = consensus+"A"
            elif (t[k] > a[k]) and (t[k]>c[k]) and (t[k]>g[k]):
                consensus = consensus+ 'T'
            elif (c[k] > t[k]) and (c[k]>a[k]) and (c[k]>g[k]):
                consensus = consensus+ 'C'
            elif (g[k] > t[k]) and (g[k]>c[k]) and (g[k]>a[k]):
                consensus = consensus+ 'G'
            #ensure a nucleotide is added to consensus sequence
            #when more than one index has the max value
            else:
                if max(a[k],c[k],t[k],g[k]) in a:
                    consensus = consensus + 'A'
                elif max(a[k],c[k],t[k],g[k]) in c:
                    consensus = consensus + 'C'
                elif max(a[k],c[k],t[k],g[k]) in t:
                    consensus = consensus + 'T'
                elif max(a[k],c[k],t[k],g[k]) in g:
                    consensus = consensus + 'G'
        print(consensus)
        #debugging, ignore this --> print('len(consensus)',len(consensus))
consensus(A,T,C,G)

#debugging, ignore this --> print('len(A)',len(A))

print('A: ',*A, sep=' ')
print('C: ',*C, sep=' ')
print('G: ',*G, sep=' ')
print('T: ',*T, sep=' ')

谢谢您抽时间见我

consensus

rosalind

python

bioinformatics

回答 1

Stack Overflow用户

发布于 2016-11-10 04:56:04

以下行中有一个错误：

N= range(len(seqs)+1)

这导致序列太长(填充了额外的A和4倍的0)。删除+1，它应该可以正常工作。

另外，你的输出中有两个空格，在你的print statements.

If中删除后的空格，你修复了这两行，它将对例子起作用，但对于超过一行的序列将失败(就像在
中的真实例子一样)。

尝试将这些行合并，如下所示：

new_seqs = list()
for s in seqs:
    if s.startswith('>'):
        new_seqs.append('')
    else:
        new_seqs[-1]+=s
seqs = new_seqs

再试一次。

票数 0

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/40515061

复制

相似问题

问Rosalind共识和简介
EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问Rosalind共识和简介EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问Rosalind共识和简介
EN