这是对我以前的问题的后续,它是关于查找可迭代的最小值和最大值的。针对这一问题,提出了算法.下面是我使用阿糖皮图书馆的解决方案。
问题的短期再讨论:
你会得到两个数组(
genes和health),其中一个有“基因”的名字,另一个是“基因”的重量(也就是健康)。然后,您给出了一串字符串,每个字符串包含值m和n,它们表示要应用于genes和health数组的切片的开始和结束,以及“gene”-字符串,我们需要对它们确定是否健康。然后,我们需要为最健康的字符串和最不健康的字符串返回健康值。
我想这段代码可能有问题,但不确定是什么。它对小型测试用例很好,给出了与以前版本的解决方案大致相同的时间,但对于大型测试用例,我的PC基本上是挂起的。
一个小型测试案例的例子:
genes = ['a', 'b', 'c', 'aa', 'd', 'b']
health = [1, 2, 3, 4, 5, 6]
gene1 = "1 5 caaab" (result = 19 = max)
gene2 = "0 4 xyz" (result = 0 = min)
gene3 = "2 4 bcdybc" (result = 11)大型testcase (2个列出每个100 K元素;testcase 41K+元素):txt在我的dropbox (2,80 MB) (对pastebin来说太大了)
因此,我有两个问题:( 1)我的代码有什么问题,我如何才能赋予它的性能( 2)如何应用Aho-Corasick而不转到任何非标准库(因为,很可能,它不能安装在HackerRank服务器上)
def geneshealth(genes, health, testcase):
from ahocorapy.keywordtree import KeywordTree
import math
min_weight = math.inf
max_weight = -math.inf
for case in testcase:
#construct the keyword tree from appropriately sliced "genes" list
kwtree = KeywordTree(case_insensitive=True)
fl, ceil, g = case.split()
for i in genes[int(fl):int(ceil)+1]:
kwtree.add(i)
kwtree.finalize()
#search the testcase list for matches
result = list(kwtree.search_all(g))
hea = 0
for gn, _ in result:
for idx, val in enumerate(genes):
if val == gn:
hea += health[idx]
if hea < min_weight:
min_weight = hea
if hea > max_weight:
max_weight = hea
return(min_weight, max_weight)发布于 2020-06-07 06:03:08
此代码很慢,因为:
KeywordTree.search_all()是一个生成器,直接循环它。和,像这样的东西(未经测试):
import math
from collections import defaultdict
from ahocorapy.keywordtree import KeywordTree
def geneshealth(genes, health, testcases):
# build the kwtree using all the genes
kwtree = KeywordTree(case_insensitive=True)
for gene in genes:
kwtree.add(gene)
kwtree.finalize()
# build a dict that maps a gene to a list of (index, health) tuples
index_and_health = defaultdict(list)
for gene, data in zip(genes, enumerate(health)):
index_and_health[gene].append(data)
min_dna_health = math.inf
max_dna_health = -math.inf
for case in testcases:
start, end, dna = case.split()
start = int(start)
end = int(end)
dna_health = 0
# search the dna for any genes in the kwtree
# note: we don't care where the gene is in the dna
for gene, _ in kwtree.search_all(dna):
for gene_index, gene_health in index_and_health[gene]:
# only genes that are within the testcase limits
# contribute dna_health
if start <= gene_index <= end:
dna_health += gene_health
# keep the min/max weight
if dna_health < min_dna_health:
min_dna_health = dna_health
if dna_health > max_dna_health:
max_dna_health = dna_health
return(min_dna_health, max_dna_health)https://codereview.stackexchange.com/questions/243500
复制相似问题