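I have a piece of code that computes how well templates fit a user's details. It is fairly long, so I have removed all the code that works well and included only the bottleneck and the code around it. My goal is speed!
A few notes on the variables:
Computed at run time.
The last 4 all have the same dimensions.
The five above all have the same dimensions.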
import itertools

import numpy as np
import pandas as pd

# totaldocs and scoredTemplates are defined in the code that was removed.
for x in range(4,8,1):
    docSecSizes = docSecSizesFull[x-4]
    shortSecSizes = shortSecSizesFull[x-4]
    cutPoints = cutPointsFull[x-4]
    tmpltNum = tmplNumFull[x-4]
    layoutNums = 0
    numTemps = len(docSecSizes)
    tmpsplits = []
    tmpsplits = [AllDocSplitsFull[x-4][z] for z in range(numTemps)]
    alltmplIds = [tmplNumFull[x-4][z] for z in range(numTemps)]
    # One layout per ordering of the headings after the first.
    for y in list(itertools.permutations(TopSecHead[x-4][1:])):
        tmpHeadSec = []
        tmpHeadSec.append('BasicInfo')
        headingIDs = []
        headingIDs.append(str(0))
        for z in y:
            tmpHeadSec.append(z)
            headingIDs.append(str(headingLookups.index(z)))
        SectionIDs = ','.join(headingIDs)
        tmpvals = []
        tmpArray = []
        for key in allSecScoreDicP1:
            tmpArray.append(allSecScoreDicP1[key])
        nparr = np.array(tmpArray)
        print(nparr.transpose())
        # Bottleneck: score every template for this layout.
        for z in range(numTemps):
            docScore = 0
            docScoreReducer = 1
            for q in range(len(shortSecSizes[z])):
                if q < cutPoints[z]:
                    indexVal = shortSecSizes[z][q]
                    docScore+= allSecScoreDicP1[tmpHeadSec[q]][indexVal]
                    docScoreReducer *= allSecReducerDicP1[tmpHeadSec[q]][indexVal]
                else:
                    indexVal = shortSecSizes[z][q]
                    docScore+= allSecScoreDicP2[tmpHeadSec[q]][indexVal]
                    docScoreReducer *= allSecReducerDicP2[tmpHeadSec[q]][indexVal]
            docScore = docScore * docScoreReducer
            tmpvals.append(docScore)
        numTemplate = len(tmpvals)
        totaldocs += numTemplate
        sectionNum = [x] * numTemplate
        layoutNumIterable = [layoutNums] * numTemplate
        SectionIDsIterable = [SectionIDs] * numTemplate
        scoredTemplates.append(pd.DataFrame(list(zip(sectionNum,alltmplIds,layoutNumIterable,tmpvals,SectionIDsIterable,tmpsplits)),columns = ['#Sections','TemplateID','LayoutID','Score','SectionIDs','Splits']))
        layoutNums +=1
allScoredTemplates = pd.concat(scoredTemplates,ignore_index=True)

The problem code is this:
for z in range(numTemps):
    docScore = 0
    docScoreReducer = 1
    for q in range(len(shortSecSizes[z])):
        if q < cutPoints[z]:
            indexVal = shortSecSizes[z][q]
            docScore+= allSecScoreDicP1[tmpHeadSec[q]][indexVal]
            docScoreReducer *= allSecReducerDicP1[tmpHeadSec[q]][indexVal]
        else:
            indexVal = shortSecSizes[z][q]
            docScore+= allSecScoreDicP2[tmpHeadSec[q]][indexVal]
            docScoreReducer *= allSecReducerDicP2[tmpHeadSec[q]][indexVal]
    docScore = docScore * docScoreReducer
    tmpvals.append(docScore)

I tried rewriting it as list comprehensions, but that was somewhat slower:
docScore = [sum([allSecScoreDicP1[tmpHeadSec[q]][shortSecSizes[z][q]] if q < cutPoints[z] else allSecScoreDicP2[tmpHeadSec[q]][shortSecSizes[z][q]] for q in range(len(shortSecSizes[z]))]) for z in range(numTemps)]
docReducer = [np.prod([allSecReducerDicP1[tmpHeadSec[q]][shortSecSizes[z][q]] if q < cutPoints[z] else allSecReducerDicP2[tmpHeadSec[q]][shortSecSizes[z][q]] for q in range(len(shortSecSizes[z]))]) for z in range(numTemps)]
tmpvals = [docScore[x] * docReducer[x] for x in range(len(docScore))]

Any suggestions on how to optimize this would be greatly appreciated. I have also tried converting the code to Cython. I got it converted and working, but it ran about 10 times slower!
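Posted on 2020-01-26 07:55:31
If "My goal is speed!" (assuming you mean speed of execution), then the best advice is to move the code to a compiled language. Python has many, many benefits, but execution speed is rarely one of them.
As Andre O suggested, the first step is to get a good algorithm. Python can be very helpful here. The standard profile module can help you find where your code spends its time, so that you can focus your optimization on that part of the code.
As an intermediate step toward a fully compiled language, you can take your Python program and move it to Cython, which translates Python code to C that is then compiled to machine code.
If your profiling shows that certain parts are both the most heavily used and the slowest, you can write those parts in C and call them from Python.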
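As a rough illustration of the profiling step, here is a minimal sketch using cProfile, the C implementation of the standard profiler that shares its interface with the profile module. The function score_templates and all of the data are made-up stand-ins for the question's scoring loop, not the original code.

```python
import cProfile
import io
import pstats


def score_templates(sizes, cut_points, p1, p2):
    """Hypothetical stand-in for the question's inner scoring loop."""
    results = []
    for row, cut in zip(sizes, cut_points):
        total = 0.0
        for q, index_val in enumerate(row):
            # Sections before the cut point use table p1, the rest use p2.
            table = p1 if q < cut else p2
            total += table[q][index_val]
        results.append(total)
    return results


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


# Made-up data: 3 templates, 4 sections each, 5 possible sizes per section.
p1 = [[float(q + s) for s in range(5)] for q in range(4)]
p2 = [[float(q * s) for s in range(5)] for q in range(4)]
sizes = [[0, 1, 2, 3], [4, 3, 2, 1], [2, 2, 2, 2]]
cut_points = [2, 1, 3]

scores, report = profile_call(score_templates, sizes, cut_points, p1, p2)
print(report)
```

On the real program, the same report would be read from the top down: the entries with the largest cumulative time are the ones worth optimizing first.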
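Before dropping down to C, it may be worth checking how far the pure-Python loop itself can be tightened. The sketch below is only an illustration under assumptions: it wraps the question's loop in a function (score_naive) and compares it with a variant (score_hoisted) that resolves the per-heading tables once per layout instead of once per template and section. The function names, parameter names, and sample data are all hypothetical, and whether this gives a meaningful speedup would have to be measured on the real data.

```python
def score_naive(short_sizes, cut_points, head_sec,
                score_p1, score_p2, red_p1, red_p2):
    """The question's loop, wrapped in a function for comparison."""
    values = []
    for z in range(len(short_sizes)):
        doc_score = 0
        reducer = 1
        for q in range(len(short_sizes[z])):
            index_val = short_sizes[z][q]
            if q < cut_points[z]:
                doc_score += score_p1[head_sec[q]][index_val]
                reducer *= red_p1[head_sec[q]][index_val]
            else:
                doc_score += score_p2[head_sec[q]][index_val]
                reducer *= red_p2[head_sec[q]][index_val]
        values.append(doc_score * reducer)
    return values


def score_hoisted(short_sizes, cut_points, head_sec,
                  score_p1, score_p2, red_p1, red_p2):
    """Same arithmetic, with the heading lookups hoisted out of the loops."""
    # The heading order is fixed for a layout, so look up each table once.
    s1 = [score_p1[h] for h in head_sec]
    s2 = [score_p2[h] for h in head_sec]
    r1 = [red_p1[h] for h in head_sec]
    r2 = [red_p2[h] for h in head_sec]
    values = []
    for sizes, cut in zip(short_sizes, cut_points):
        doc_score = 0
        reducer = 1
        for q, index_val in enumerate(sizes):
            if q < cut:
                doc_score += s1[q][index_val]
                reducer *= r1[q][index_val]
            else:
                doc_score += s2[q][index_val]
                reducer *= r2[q][index_val]
        values.append(doc_score * reducer)
    return values


# Made-up sample data: three headings, three possible sizes per heading.
score_p1 = {"BasicInfo": [1.0, 2.0, 3.0], "Work": [0.5, 1.5, 2.5], "Edu": [2.0, 1.0, 0.0]}
score_p2 = {"BasicInfo": [0.0, 1.0, 2.0], "Work": [1.0, 1.0, 1.0], "Edu": [3.0, 2.0, 1.0]}
red_p1 = {"BasicInfo": [1.0, 0.9, 0.8], "Work": [1.0, 0.5, 0.25], "Edu": [1.0, 1.0, 0.5]}
red_p2 = {"BasicInfo": [1.0, 0.8, 0.6], "Work": [1.0, 0.75, 0.5], "Edu": [1.0, 0.9, 0.7]}
head_sec = ["BasicInfo", "Edu", "Work"]
short_sizes = [[0, 1, 2], [2, 2, 0], [1, 0, 1]]
cut_points = [1, 2, 3]

args = (short_sizes, cut_points, head_sec, score_p1, score_p2, red_p1, red_p2)
naive = score_naive(*args)
hoisted = score_hoisted(*args)
print(naive == hoisted)
```

Both functions perform the additions and multiplications in the same order, so their results should match exactly. Timing the two with the timeit module on realistic input sizes would show whether the change is worth keeping.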
https://codereview.stackexchange.com/questions/236133