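I am using a simple function to convert DNA sequences into amino acid sequences. At a high level the code looks fine, but whenever I run the program I get the error KeyError: 'mtD', and the error apparently comes from line 26 (if table[seq[i:i+3]] == "_" :). The only other place 'mtD' appears in my program is where I simply print my dataset to the console, which makes the problem even more confusing. My code is shown below.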
#Creating the protein sequence column for the data
Protein_Sequence = []
#dna to protein sequence function
def translate11(seq):
    table = {"TTT" : "F", "CTT" : "L", "ATT" : "I", "GTT" : "V",
             "TTC" : "F", "CTC" : "L", "ATC" : "I", "GTC" : "V",
             "TTA" : "L", "CTA" : "L", "ATA" : "I", "GTA" : "V",
             "TTG" : "L", "CTG" : "L", "ATG" : "M", "GTG" : "V",
             "TCT" : "S", "CCT" : "P", "ACT" : "T", "GCT" : "A",
             "TCC" : "S", "CCC" : "P", "ACC" : "T", "GCC" : "A",
             "TCA" : "S", "CCA" : "P", "ACA" : "T", "GCA" : "A",
             "TCG" : "S", "CCG" : "P", "ACG" : "T", "GCG" : "A",
             "TAT" : "Y", "CAT" : "H", "AAT" : "N", "GAT" : "D",
             "TAC" : "Y", "CAC" : "H", "AAC" : "N", "GAC" : "D",
             "TAA" : "_", "CAA" : "Q", "AAA" : "K", "GAA" : "E",
             "TAG" : "_", "CAG" : "Q", "AAG" : "K", "GAG" : "E",
             "TGT" : "C", "CGT" : "R", "AGT" : "S", "GGT" : "G",
             "TGC" : "C", "CGC" : "R", "AGC" : "S", "GGC" : "G",
             "TGA" : "_", "CGA" : "R", "AGA" : "R", "GGA" : "G",
             "TGG" : "W", "CGG" : "R", "AGG" : "R", "GGG" : "G"
             }
    pro_sequence =" "
    for i in range(0, len(seq)-(3+len(seq)%3), 3):
        if table[seq[i:i+3]] == "_" :
            break
        pro_sequence += table[seq[i:i+3]]
    return pro_sequence
newthang = df.mtDNA_Sequence
for thang in newthang:
    x = translate11(thang)
    Protein_Sequence.append(x)

Posted on 2020-08-23 23:08:00
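Your function works for me. I tried it with a short nucleotide sequence and it gave the proper translation.
The for loop stops one amino acid short, so you can remove the 3+:
    for i in range(0, len(seq)-(len(seq)%3), 3):
When declaring pro_sequence, start with an empty string "" rather than a space character " ".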
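The effect of dropping the 3+ can be checked without any translation logic. The sketch below only lists the codon start indices that each loop bound visits; `codon_starts` and its `buggy` flag are hypothetical names introduced for illustration, not part of the original code.

```python
def codon_starts(seq_len, buggy):
    """Return the codon start indices visited by the for loop."""
    if buggy:
        stop = seq_len - (3 + seq_len % 3)  # bound used in the question
    else:
        stop = seq_len - (seq_len % 3)      # bound suggested in this answer
    return list(range(0, stop, 3))

print(codon_starts(9, buggy=True))    # [0, 3]     last codon is skipped
print(codon_starts(9, buggy=False))   # [0, 3, 6]  all three codons
print(codon_starts(10, buggy=False))  # [0, 3, 6]  trailing partial codon ignored
```

Because `range` already excludes its stop value, subtracting only the remainder is enough to avoid reading a partial codon at the end.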
So, after making these minor changes, I tried the following:
sequence = "tactgtggctactcagctgtgcgcatggcccgcctgctgtcaccaggggcgaggctcatcaccatcgagatcaaccccgactgtgccgccatcacccagcggatggtggatttcgctggcatgaaggacaag"
print(translate11(sequence.upper()))
# YCGYSAVRMARLLSPGARLITIEINPDCAAITQRMVDFAGMKDK
which is the correct translation.
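So one of the inputs you are passing to the function (from df.mtDNA_Sequence) must start with or contain the letters "mtD", rather than being just a string of nucleotides.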
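One way to confirm this is to scan the inputs for anything that is not a pure A/C/G/T string before translating. The sketch below is a rough diagnostic under stated assumptions: `find_invalid_sequences`, `VALID_BASES` and the `sample` list are hypothetical, and the sample values are made up. Since the column name mtDNA_Sequence itself begins with "mtD", a header string that ended up among the data values is one plausible cause, though nothing in the question confirms it.

```python
VALID_BASES = set("ACGT")

def find_invalid_sequences(sequences):
    """Return (position, value) pairs for entries that are not pure A/C/G/T strings."""
    bad = []
    for pos, seq in enumerate(sequences):
        # Non-string entries (for example missing values) are flagged as well.
        if not isinstance(seq, str) or not set(seq.upper()) <= VALID_BASES:
            bad.append((pos, seq))
    return bad

# Made-up sample data; with the question's DataFrame you would pass df.mtDNA_Sequence.
sample = ["ATGGCC", "mtDNA_Sequence", "atgtaa", None]
print(find_invalid_sequences(sample))  # [(1, 'mtDNA_Sequence'), (3, None)]
```

The check upper-cases each entry first, so lower-case sequences are not reported here. They would still raise a KeyError in the original function unless they are upper-cased before the table lookup.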
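Try adding another condition that breaks out of the for loop when the characters are not a recognized codon:
    for i in range(0, len(seq)-(len(seq)%3), 3):
        if seq[i:i+3] not in table.keys() :
            break
        if table[seq[i:i+3]] == "_" :
            break
        pro_sequence += table[seq[i:i+3]]
https://stackoverflow.com/questions/63387440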
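For reference, the changes suggested in this answer can be combined into one function. This is a minimal sketch rather than a definitive implementation: the codon table is copied from the question, while the module-level name `CODON_TABLE` and the upper-casing inside the function are assumptions added here for convenience.

```python
# Codon table copied from the question ("_" marks a stop codon).
CODON_TABLE = {
    "TTT": "F", "CTT": "L", "ATT": "I", "GTT": "V",
    "TTC": "F", "CTC": "L", "ATC": "I", "GTC": "V",
    "TTA": "L", "CTA": "L", "ATA": "I", "GTA": "V",
    "TTG": "L", "CTG": "L", "ATG": "M", "GTG": "V",
    "TCT": "S", "CCT": "P", "ACT": "T", "GCT": "A",
    "TCC": "S", "CCC": "P", "ACC": "T", "GCC": "A",
    "TCA": "S", "CCA": "P", "ACA": "T", "GCA": "A",
    "TCG": "S", "CCG": "P", "ACG": "T", "GCG": "A",
    "TAT": "Y", "CAT": "H", "AAT": "N", "GAT": "D",
    "TAC": "Y", "CAC": "H", "AAC": "N", "GAC": "D",
    "TAA": "_", "CAA": "Q", "AAA": "K", "GAA": "E",
    "TAG": "_", "CAG": "Q", "AAG": "K", "GAG": "E",
    "TGT": "C", "CGT": "R", "AGT": "S", "GGT": "G",
    "TGC": "C", "CGC": "R", "AGC": "S", "GGC": "G",
    "TGA": "_", "CGA": "R", "AGA": "R", "GGA": "G",
    "TGG": "W", "CGG": "R", "AGG": "R", "GGG": "G",
}

def translate11(seq):
    """Translate DNA codon by codon, stopping at a stop codon or an unknown codon."""
    seq = seq.upper()  # assumption: accept lower-case input as well
    pro_sequence = ""  # empty string, not a space
    for i in range(0, len(seq) - (len(seq) % 3), 3):
        codon = seq[i:i + 3]
        if codon not in CODON_TABLE:  # unrecognized codon, e.g. "MTD"
            break
        if CODON_TABLE[codon] == "_":  # stop codon
            break
        pro_sequence += CODON_TABLE[codon]
    return pro_sequence

print(translate11("ATGGCCTAAGGG"))    # MA
print(translate11("mtDNA_Sequence"))  # prints an empty line
```

Breaking on an unknown codon silently truncates the protein, which hides bad input. If you would rather find the offending rows, raising an error or logging the sequence at that point may be the better choice.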