
Q: Determine whether a combination of protein fragments covers the complete protein sequence

Stack Overflow user

Asked 2019-08-02 13:31:50

3 answers, 270 views, 0 followers, 1 vote

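One FASTA file contains a single protein sequence. A second FASTA file contains fragments of the sequence in the first file. Compute the molecular weight of each sequence, and use those weights to determine whether there is a combination of fragments that covers the entire protein sequence without any overlap between the fragments.

I tried to write the script below, but I have not been able to turn it into working code.

In seqs I put the weights of the protein fragments, and in total_weight I put the weight of the complete sequence, so that I can test the function I am trying to use.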

seqs = [50,70,30]
total_weight = 100
current_weight = 0
for weight in seqs:
    if current_weight + weight == total_weight:
        print(True)
    elif current_weight + weight < total_weight:
        current_weight += weight
    if current_weight > total_weight:
        current_weight -= weight

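Obviously, in this case I want the code to return True. To solve this, I would like to drop the first element of the seqs list and then rerun the "for" loop I wrote. Somehow I cannot complete the code by omitting the first element and running the loop again on the new seqs list. Can someone point me in the right direction?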
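A note on the task itself: equal weights are a necessary condition for coverage, but they do not show that the fragments actually tile the sequence. The following is a minimal sketch of checking the tiling directly on the sequences, assuming the fragments are available as plain strings. The function name `covers` and the toy sequences are made up for illustration and do not come from the original post.

```python
def covers(protein, fragments):
    """Return a list of fragments that tile `protein` end to end, or None."""
    if not protein:
        return []  # nothing left to cover
    for i, frag in enumerate(fragments):
        # A usable fragment must match the start of what is still uncovered.
        if frag and protein.startswith(frag):
            # Each fragment is used at most once, so drop it from the pool.
            rest = covers(protein[len(frag):], fragments[:i] + fragments[i + 1:])
            if rest is not None:
                return [frag] + rest
    return None


# Toy data for illustration only, not real sequences.
protein = "MKTAYIAKQR"
fragments = ["MKTA", "YIAK", "QR", "TAYI"]
print(covers(protein, fragments))  # ['MKTA', 'YIAK', 'QR']
```

Because every chosen fragment has to start exactly where the previous one ended, a result from this sketch is non-overlapping by construction.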

python

bioinformatics

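3 Answers

Stack Overflow user

Answered 2019-08-02 14:14:31

Here is another, recursive approach. It looks for any values in the list that add up to 100 and returns the resulting list together with a "True" marker: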

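seqs = [50, 70, 30]
total_weight = 100

def protein_summation_check(target, lst, newLst=None):
    # Start a fresh list on the top-level call instead of sharing a mutable default.
    if newLst is None:
        newLst = []
    print(newLst)  # trace the current selection
    for index, protein in enumerate(lst):
        newLst.append(protein)  # tentatively include this weight
        if sum(newLst) == target:
            return ("True", newLst)
        # Try to extend the current selection with the remaining weights.
        result = protein_summation_check(target, lst[index+1:], newLst)
        if result:
            return result
        newLst.pop()  # backtrack
    return False

print(protein_summation_check(total_weight, seqs))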

A loop-based version follows. It does not work for every input, but it does work for the one you provided:

seqs = [50,70,30]
total_weight = 100
current_weight = 0

for index, item in enumerate(seqs):
    if  current_weight == total_weight or item == total_weight:
        print("True")
        break
    for otheritem in seqs[index+1:]:
        if otheritem == total_weight:
            current_weight = total_weight
            break
        if current_weight < total_weight:
            current_weight += otheritem + item
        if current_weight > total_weight:
            if otheritem >= total_weight:
                current_weight -= item
            else:
                current_weight -= otheritem

2 votes

Stack Overflow user

Answered 2019-08-02 13:50:51

Take a look at permutations from itertools, applied to the seqs list:

from itertools import permutations

seqs = [50, 70, 30]
perm_list = list(permutations(seqs))
perm_list

This gives the following output:

[(50, 70, 30),
 (50, 30, 70),
 (70, 50, 30),
 (70, 30, 50),
 (30, 50, 70),
 (30, 70, 50)]

You can then iterate over these combinations to see which values add up to the total weight.
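Since the order of the weights does not change their sum, iterating over subsets is enough. A minimal sketch of that idea uses `itertools.combinations` rather than `permutations`; the helper name `matching_subsets` is an assumption for illustration and is not part of the original answer.

```python
from itertools import combinations

seqs = [50, 70, 30]
total_weight = 100


def matching_subsets(weights, target):
    """Yield every subset of `weights` (as a tuple) whose sum equals `target`."""
    for size in range(1, len(weights) + 1):
        for combo in combinations(weights, size):
            if sum(combo) == target:
                yield combo


print(list(matching_subsets(seqs, total_weight)))  # [(70, 30)]
```

This still enumerates every subset, so it is only practical for a small number of fragments.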

Hope this helps, cheers!

1 vote

Stack Overflow user

Answered 2019-08-02 13:51:59

Your code clearly will not print True, because:

0 + 50 = 50
50 & 70 => Nothing happens
50 + 30 = 80
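The trace above can be reproduced with a small instrumented copy of the loop from the question. This is only a sketch: the function name `trace_loop` and the recorded tuple layout are assumptions made for illustration.

```python
def trace_loop(seqs, total_weight):
    """Replay the question's loop; record (weight, current_weight, hit) per step."""
    current_weight = 0
    steps = []
    for weight in seqs:
        hit = current_weight + weight == total_weight  # where the loop prints True
        if not hit and current_weight + weight < total_weight:
            current_weight += weight
        # The question's third branch never fires for a positive total_weight:
        # current_weight only grows while the sum stays below the total.
        steps.append((weight, current_weight, hit))
    return steps


print(trace_loop([50, 70, 30], 100))
# [(50, 50, False), (70, 50, False), (30, 80, False)]
print(trace_loop([70, 30, 50], 100))
# [(70, 70, False), (30, 70, True), (50, 70, False)]
```

The second call shows that the greedy loop only finds the answer when the list happens to be in a favourable order, which is why a search that can skip items is needed.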

For each entry you can either add the next item or skip it, so your function takes two arguments: the weight grouped so far, and the remaining items:

def calculate(current: int, next: list):  # next: the remaining weights
  pass

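You want to check whether the current value already equals your total weight. If it does not and there is nothing left to add, you cannot get any further.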

total_weight = 100
current_weight = 0
data = [50, 70, 30]

def calculate(current: int, next: list):  # next: the remaining weights
  if current == total_weight:
    return True   # target reached
  if not next:
    return False  # nothing left to add

Now you check whether either choice can reach your total:

def calculate(current: int, next: list):  # next: the remaining weights
  if current == total_weight:
    return True   # target reached
  if not next:
    return False  # nothing left to add
  # Edit: x does not need to be computed in every case
  x = False
  if current + next[0] <= total_weight:
    x = calculate(current + next[0], next[1:])  # with the next item
  y = calculate(current, next[1:])  # without the next item
  return x or y

print(calculate(current_weight, data))

On large data sets you may want threads, so that the search runs faster and later computation steps can be aborted.
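As an alternative to threading, a common way to keep this kind of subset-sum check fast on larger inputs is to track the set of reachable sums instead of recursing over every with/without choice. The sketch below swaps in that technique and is not the answer author's method. It assumes non-negative integer (or pre-rounded) weights, and the name `can_reach` is hypothetical.

```python
def can_reach(weights, target):
    """Subset-sum check: can some subset of `weights` add up to exactly `target`?"""
    reachable = {0}  # sums obtainable from the weights processed so far
    for w in weights:
        # Build the new sums from the previous set only, so each weight is used
        # at most once; sums above the target are pruned (weights are >= 0).
        reachable |= {s + w for s in reachable if s + w <= target}
        if target in reachable:
            return True  # stop early as soon as the target is hit
    return target in reachable


print(can_reach([50, 70, 30], 100))  # True
```

The work is bounded by the number of weights times the number of distinct sums up to the target, rather than by the number of subsets. Real molecular weights are floating-point values, so exact equality would need rounding or a tolerance first.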

1 vote

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/57327540
