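Here are two functions that do exactly the same thing. Does anyone know why the one that uses the count() method is so much faster than the other? (I mean, how does it work? How is it built?)
If possible, I'd like an answer that is easier to understand than the one found here: Algorithm used to implement the Python str.count function, or than what is in the source code: https://hg.python.org/cpython/file/tip/Objects/stringlib/fastsearch.h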
def scoring1(seq):
    score = 0
    for i in range(len(seq)):
        if seq[i] == '0':
            score += 1
    return score

def scoring2(seq):
    score = 0
    score = seq.count('0')
    return score

seq = 'AATTGGCCGGGGAG0CTTC0CTCC000TTTCCCCGGAAA'
# takes 1min15 when applied to 100 sequences larger than 100 000 characters
score1 = scoring1(seq)
# takes 10 sec when applied to 100 sequences larger than 100 000 characters
score2 = scoring2(seq)

Thank you very much for your answers.
Posted on 2016-12-06 12:58:32
@CodeMonkey has already given the answer, but it is worth noting that your first function can be improved to run about 20% faster:
import time, random

def scoring1(seq):
    score = 0
    for i in range(len(seq)):
        if seq[i] == '0':
            score += 1
    return score

def scoring2(seq):
    score = 0
    for x in seq:
        # a bool adds as 0 or 1, so no branch is needed
        score += (x == '0')
    return score

def scoring3(seq):
    score = 0
    score = seq.count('0')
    return score

def test(n):
    seq = ''.join(random.choice(['0', '1']) for i in range(n))
    functions = [scoring1, scoring2, scoring3]
    for i, f in enumerate(functions):
        # time.clock() was removed in Python 3.8; time.perf_counter() replaces it
        start = time.perf_counter()
        s = f(seq)
        elapsed = time.perf_counter() - start
        print('scoring' + str(i+1) + ': ' + str(s) + ' computed in ' + str(elapsed) + ' seconds')

test(10**7)

Typical output:

scoring1: 5000742 computed in 0.9651326495293333 seconds
scoring2: 5000742 computed in 0.7998054195159483 seconds
scoring3: 5000742 computed in 0.03732172598339578 seconds

Both of the first two methods are blown away by the built-in count().
Moral of the story: when you are not using an already-optimized built-in method, you need to optimize your own code.
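As one way to read that moral, here is a minimal sketch of a few variants that hand more of the per-character work to built-ins instead of an explicit loop. The function names (count_genexpr, count_map, count_counter, count_builtin) and the test string are made up for illustration and do not come from the answer above. No ranking is claimed, since the timings depend on the Python version and the machine.

```python
import timeit
from collections import Counter

def count_genexpr(seq):
    # Generator expression feeding the built-in sum().
    return sum(1 for ch in seq if ch == '0')

def count_map(seq):
    # map() calls '0'.__eq__ on every character; True adds as 1 in sum().
    return sum(map('0'.__eq__, seq))

def count_counter(seq):
    # Counter tallies every distinct character; only one entry is read.
    return Counter(seq)['0']

def count_builtin(seq):
    # The built-in method from the question, for reference.
    return seq.count('0')

# Hypothetical test data: 1,000,000 characters, two '0' per repetition.
seq = 'AATTGG0CC0' * 100_000

variants = [count_genexpr, count_map, count_counter, count_builtin]
results = {f.__name__: f(seq) for f in variants}

for f in variants:
    # Best of three single runs, to reduce noise from other processes.
    seconds = min(timeit.repeat(lambda: f(seq), number=1, repeat=3))
    print(f'{f.__name__}: {results[f.__name__]} in {seconds:.4f} s')
```

All four variants return the same count, so the only thing that changes between them is where the loop runs.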
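Posted on 2016-12-06 12:39:25
Because the counting is done in the underlying native implementation, while the for loop runs as interpreted code, which is slower.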
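One way to make this visible is the standard dis module, which lists the bytecode that the interpreter executes. In the sketch below (loop_count and builtin_count are illustrative names, not taken from the answers), the loop version steps through several bytecode instructions for every character, while the count() version is a single method call whose scanning happens, in CPython, inside compiled C code. The exact instruction listing differs between Python versions, so no output is shown here.

```python
import dis

def loop_count(seq):
    # Every iteration is dispatched instruction by instruction by the interpreter.
    score = 0
    for ch in seq:
        if ch == '0':
            score += 1
    return score

def builtin_count(seq):
    # One method call; the scan over the string is not visible as bytecode.
    return seq.count('0')

loop_ops = [ins.opname for ins in dis.get_instructions(loop_count)]
builtin_ops = [ins.opname for ins in dis.get_instructions(builtin_count)]

print(f'loop_count: {len(loop_ops)} bytecode instructions')
print(f'builtin_count: {len(builtin_ops)} bytecode instructions')

# Full listings, for comparing the two functions side by side.
dis.dis(loop_count)
dis.dis(builtin_count)
```

The instruction counts describe the function bodies, not the running time: the loop body is executed once per character, whereas the call in builtin_count runs once per string.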
https://stackoverflow.com/questions/40995614