I am using the wrapped version of Smith-Waterman from skbio (0.5.4), but I get an unexpected error:
_, score, _ = local_pairwise_align_ssw(protein_listidx1, protein_listidx2, substitution_matrix=blosum62)
  File "/anaconda3/lib/python3.6/site-packages/skbio/alignment/_pairwise.py", line 732, in local_pairwise_align_ssw
    validate=False)
  File "/anaconda3/lib/python3.6/site-packages/skbio/alignment/_tabular_msa.py", line 785, in __init__
    reset_index=minter is None and index is None)
  File "/anaconda3/lib/python3.6/site-packages/skbio/alignment/_tabular_msa.py", line 1956, in extend
    self._assert_valid_sequences(sequences)
  File "/anaconda3/lib/python3.6/site-packages/skbio/alignment/_tabular_msa.py", line 2035, in _assert_valid_sequences
    % (length, expected_length))
ValueError: Each sequence's length must match the number of positions in the MSA: 232 != 231
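Strangely, the error sometimes appears for protein pair 0-10 and other times for pair 0-116, so I don't think the error comes from the proteins themselves.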
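One way to check whether the failure depends on the sequences themselves or on the order of the arguments is to run every pair in both orders and record which calls raise. The sketch below is only an illustration: `find_failing_pairs` and its `align` parameter are hypothetical names, not part of skbio, and `align` is assumed to be any callable that takes two sequences.

```python
from itertools import permutations


def find_failing_pairs(sequences, align):
    """Run align(a, b) on every ordered pair and collect the ValueErrors.

    `align` is a hypothetical callable supplied by the caller.
    Returns a list of (i, j, message) tuples, one per failing call.
    """
    failures = []
    # permutations yields both (i, j) and (j, i), so an order-dependent
    # failure shows up as one direction failing and the other passing.
    for (i, a), (j, b) in permutations(enumerate(sequences), 2):
        try:
            align(a, b)
        except ValueError as exc:
            failures.append((i, j, str(exc)))
    return failures
```

With skbio, `align` could be something like `lambda a, b: local_pairwise_align_ssw(a, b, substitution_matrix=blosum62)`, assuming `blosum62` is the matrix used in the question.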
Posted on 2020-01-28 15:52:57
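I have a similar problem. However, I was able to narrow the error down to the optimized SSW version, so there is no error in the sequence format.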
import warnings

from skbio.sequence import Protein

with warnings.catch_warnings():
    # The warning message is elided in the original post.
    # Assumption: only the Bio import belongs inside this block.
    warnings.filterwarnings("ignore", message="...")
    from Bio.Align import substitution_matrices

from skbio.alignment import local_pairwise_align_ssw
from skbio.alignment import local_pairwise_align

peptide1 = Protein("CGAGDNQAGTALIF")
peptide2 = Protein("CAGEEGGGADGLTF")
# Defined here but not used: every call below passes 1 explicitly.
gap_open_penalty = 10
gap_extend_penalty = 10
substitution_matrix = substitution_matrices.load("BLOSUM45")

## works correctly
rv = local_pairwise_align_ssw(
    sequence1=peptide1,
    sequence2=peptide2,
    gap_open_penalty=1,
    gap_extend_penalty=1,
    substitution_matrix=substitution_matrix,
)
print(rv)

## but if I swap peptide1 and peptide2, the ValueError occurs
rv = local_pairwise_align_ssw(
    sequence1=peptide2,
    sequence2=peptide1,
    gap_open_penalty=1,
    gap_extend_penalty=1,
    substitution_matrix=substitution_matrix,
)
print(rv)

## if I do the same with local_pairwise_align, it works!
rv = local_pairwise_align(
    seq1=peptide2,
    seq2=peptide1,
    gap_open_penalty=1,
    gap_extend_penalty=1,
    substitution_matrix=substitution_matrix,
)
print(rv)

https://stackoverflow.com/questions/54895330
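Since the pure-Python `local_pairwise_align` succeeds where the SSW version raises, one possible workaround is to try the fast function first and fall back only on a `ValueError`. This is a rough sketch, not a fix for the underlying bug: `align_with_fallback`, `fast` and `slow` are hypothetical names, and the two callables are assumed to accept the same keyword arguments, as they do in the calls above.

```python
def align_with_fallback(seq1, seq2, fast, slow, **kwargs):
    """Try `fast` first; if it raises ValueError, retry with `slow`.

    The sequences are passed positionally because the two skbio functions
    name their first two parameters differently (sequence1/seq1).
    Returns (result, used_fallback).
    """
    try:
        return fast(seq1, seq2, **kwargs), False
    except ValueError:
        # Only the error reported in this thread is caught; anything else
        # still propagates to the caller.
        return slow(seq1, seq2, **kwargs), True
```

It could be called as `align_with_fallback(peptide2, peptide1, local_pairwise_align_ssw, local_pairwise_align, gap_open_penalty=1, gap_extend_penalty=1, substitution_matrix=substitution_matrix)`. The pure-Python implementation is considerably slower, so the fallback is best kept for the pairs that actually fail, and the two implementations are not guaranteed to return identical alignments.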