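Q: Improving the performance of the `compute_optimal_weights` function in Python

Stack Overflow user

Asked 2017-04-20 03:18:25

Answers: 2, Views: 134, Followers: 0, Votes: 1

Is there a faster way to write the `compute_optimal_weights` function in Python? I call it hundreds of millions of times, so any speedup helps. The function's arguments are different on every call.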

# Python 2 code, as posted (note xrange and the print statement below)
import random
import time

c1 = 0.25
c2 = 0.67

def compute_optimal_weights(input_prices):
    input_weights_optimal = {}
    for i in input_prices:
        price = input_prices[i]
        input_weights_optimal[i] = c2 / sum([(price/n) ** c1 for n in input_prices.values()])
    return input_weights_optimal

input_sellers_ID = range(10)
input_prices = {}
for i in input_sellers_ID:
    input_prices[i] = random.uniform(0,1)


t0 = time.time()
for i in xrange(1000000):
    compute_optimal_weights(input_prices)
t1 = time.time()
print "old time", (t1 - t0)

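The number of elements in the list and the dictionary varies, but averages about 10. The keys in input_prices are the same across all calls, but the values change, so the same key has a different value from one call to the next.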

python

performance

dictionary

list-comprehension

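2 Answers

Stack Overflow user

Accepted answer

Answered 2017-04-20 08:49:53

With a little math, you can compute part of sum_price_ratio_scaled as a constant outside the loop and speed the program up by roughly 80% (for an average input size of 10).

Optimized implementation (Python 3):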

def compute_optimal_weights(ids, prices):
    scaled_sum = 0
    for i in ids:
        scaled_sum += prices[i] ** -0.25
    result = {}
    for i in ids:
        result[i] = 0.67 * (prices[i] ** -0.25) / scaled_sum
    return result

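In response to the other answer, which uses numpy: given that the input_sellers_ID list has "10 elements on average", I doubt that approach is worth considering for your particular application.

While the brevity of generator expressions and dict comprehensions can be tempting, I found on my machine that the best performance comes from plain for-in loops and from avoiding function calls such as sum(...). For completeness, though, here is what the implementation above looks like written in a more "pythonic" style: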

def compute_optimal_weights(ids, prices):
    scaled_sum = sum(prices[i] ** -0.25 for i in ids)
    return {i: 0.67 * (prices[i] ** -0.25) / scaled_sum for i in ids}

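Reasoning / math:

Based on the algorithm you posted, you are trying to build a dictionary whose values are given by a function f(i), where i is one of the elements of the input_sellers_ID list. As the code in the question computes it, f(i) = c2 / (sum over j of (prices[i] / prices[j]) ** c1).

When you first write out the formula for f(i), it looks as though the term involving prices[i] has to be recomputed at every step of the summation, which is expensive. However, simplifying the expression with the rules of exponents shows that the only summation needed to determine f(i) is independent of i (it uses only the index j). That term is therefore a constant and can be computed outside the loop that sets the dictionary values.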
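Written out, the simplification can be sketched as follows. The symbols are assumptions chosen to match the code: $p_i$ stands for prices[i], $c_1 = 0.25$ and $c_2 = 0.67$ are the constants from the question, and all prices are assumed to be strictly positive.

```latex
f(i) = \frac{c_2}{\sum_{j} \left( \frac{p_i}{p_j} \right)^{c_1}}
     = \frac{c_2}{p_i^{c_1} \sum_{j} p_j^{-c_1}}
     = \frac{c_2 \, p_i^{-c_1}}{\sum_{j} p_j^{-c_1}}
```

The sum $\sum_{j} p_j^{-c_1}$ does not depend on $i$, so it only needs to be evaluated once per call; it corresponds to scaled_sum in the optimized implementation.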

Note that above I refer to input_prices as prices and to input_sellers_ID as ids.

Performance profile (80% faster on my machine for an input size of 10):

import time
import random

def compute_optimal_weights(ids, prices):
    scaled_sum = 0
    for i in ids:
        scaled_sum += prices[i] ** -0.25
    result = {}
    for i in ids:
        result[i] = 0.67 * (prices[i] ** -0.25) / scaled_sum
    return result

def compute_optimal_weights_old(input_sellers_ID, input_prices):
    input_weights_optimal = {}
    for i in input_sellers_ID:
        sum_price_ratio_scaled = 0
        for j in input_sellers_ID:
            price_ratio = input_prices[i] / input_prices[j]
            scaled_price_ratio = price_ratio ** c1
            sum_price_ratio_scaled += scaled_price_ratio
        input_weights_optimal[i] = c2 / sum_price_ratio_scaled
    return input_weights_optimal


c1 = 0.25
c2 = 0.67
input_sellers_ID = range(10)
input_prices = {i: random.uniform(0,1) for i in input_sellers_ID}

start = time.perf_counter()  # time.clock() was removed in Python 3.8
for _ in range(1000000):
    compute_optimal_weights_old(input_sellers_ID, input_prices)

old_time = time.perf_counter() - start

start = time.perf_counter()
for _ in range(1000000):
    compute_optimal_weights(input_sellers_ID, input_prices)

new_time = time.perf_counter() - start

print('Old:', compute_optimal_weights_old(input_sellers_ID, input_prices))
print('New:', compute_optimal_weights(input_sellers_ID, input_prices))
print('New algorithm is {:.2%} faster.'.format(1 - new_time / old_time))
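Before trusting the speedup, it may be worth confirming that the two forms return the same numbers. The following is a minimal sketch of such a check, not part of the original answer; the names weights_original, weights_factored and agree are hypothetical, and prices are drawn from [0.01, 1.0] on the assumption that they are strictly positive.

```python
import math
import random

C1 = 0.25  # exponent from the question
C2 = 0.67  # scale factor from the question

def weights_original(prices):
    # Direct O(n^2) form from the question: the sum is redone for every key.
    return {i: C2 / sum((p / q) ** C1 for q in prices.values())
            for i, p in prices.items()}

def weights_factored(prices):
    # Factored O(n) form: the sum over all prices is computed once.
    scaled_sum = sum(p ** -C1 for p in prices.values())
    return {i: C2 * p ** -C1 / scaled_sum for i, p in prices.items()}

def agree(prices, rel_tol=1e-9):
    # Hypothetical helper: True if both forms match on every key.
    old, new = weights_original(prices), weights_factored(prices)
    return old.keys() == new.keys() and all(
        math.isclose(old[k], new[k], rel_tol=rel_tol) for k in old)

random.seed(0)
sample = {i: random.uniform(0.01, 1.0) for i in range(10)}
print(agree(sample))
```

A side effect of the factored form is that the weights always sum to c2, which gives a second cheap sanity check.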

Votes: 1

Stack Overflow user

Answered 2017-04-20 08:32:22

I believe we can speed this function up by factoring the loop. Let a = price, b = n and c = c1. If my math is right (for example, (5/6)**3 == 5**3 / 6**3):

(5./6.)**2 + (5./4.)**2
== 
5**2 / 6.**2 + 5**2 / 4.**2
== 
5**2 * (1/6.**2 + 1/4.**2)

With variables:

sum( (a / b) ** c for each b)
==
sum( a**c * (1/b) ** c for each b)
==
a**c * sum((1./b)**c for each b)

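The second factor is a constant and can be pulled out of the loop. That leaves:

Faster implementation - plain Python

Using a generator expression and a dict comprehension: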

def compute_optimal_weights(input_prices):
    sconst = sum(1/w**c1 for w in input_prices.values())
    return {k: c2 / (v**c1 * sconst) for k, v in input_prices.items()}

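Note: if you are using Python 2, replace .values() and .items() with .itervalues() and .iteritems() for an extra speedup (a few ms for large lists).

Faster - NumPy

Also, if you don't care much about the dictionary and just want the values, you can use numpy (for large inputs, > 100) to speed it up: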

import numpy as np

def compute_optimal_weights_np(input_prices):
    # list(...) is needed on Python 3, where .values() returns a view
    data = np.asarray(list(input_prices.values())) ** c1
    return c2 / (data * np.sum(1./data))

A few timings for different input sizes:

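N = 10 inputs:
MINE: 100000 loops, best of 3: 6.02 µs per loop
NUMPY: 100000 loops, best of 3: 10.6 µs per loop
ORIGINAL: 10000 loops, best of 3: 23.8 µs per loop

N = 100 inputs:
MINE: 10000 loops, best of 3: 49.1 µs per loop
NUMPY: 10000 loops, best of 3: 22.6 µs per loop
ORIGINAL: 1000 loops, best of 3: 1.86 ms per loop

N = 1000 inputs:
MINE: 1000 loops, best of 3: 458 µs per loop
NUMPY: 10000 loops, best of 3: 121 µs per loop
ORIGINAL: 10 loops, best of 3: 173 ms per loop

N = 100000 inputs:
MINE: 10 loops, best of 3: 54.2 ms per loop
NUMPY: 100 loops, best of 3: 11.1 ms per loop
ORIGINAL: did not finish within a few minutes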
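Timings of this kind can be reproduced roughly with the standard timeit module. The sketch below is an assumption about how one might measure it, not the script that produced the numbers above; best_time_per_call is a hypothetical helper, and absolute results will differ by machine and Python version.

```python
import random
import timeit

C1 = 0.25  # exponent from the question
C2 = 0.67  # scale factor from the question

def weights_factored(prices):
    # Factored plain-Python form: one pass for the sum, one for the result.
    scaled_sum = sum(p ** -C1 for p in prices.values())
    return {i: C2 * p ** -C1 / scaled_sum for i, p in prices.items()}

def best_time_per_call(func, prices, number=1000, repeat=3):
    # Hypothetical helper: best of `repeat` runs, in seconds per call.
    totals = timeit.repeat(lambda: func(prices), number=number, repeat=repeat)
    return min(totals) / number

random.seed(0)
for n in (10, 100, 1000):
    prices = {i: random.uniform(0.01, 1.0) for i in range(n)}
    per_call = best_time_per_call(weights_factored, prices, number=200)
    print("N = {}: {:.2f} us per call".format(n, per_call * 1e6))
```

Taking the minimum over several repeats, as timeit's documentation suggests, reduces the influence of other processes on the measurement.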

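Both options are considerably faster than the approach in the question. If you can supply the input in a consistent form (as an array rather than a dictionary), the benefit of using numpy becomes clear as the input size grows.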
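Since the question states that the keys stay the same across calls, one way to supply "consistent input" is to fix the key order once and pass only an array of prices. This is a minimal sketch under that assumption; compute_optimal_weights_array, keys and price_array are illustrative names that do not appear in the original answers.

```python
import numpy as np

C1 = 0.25  # exponent from the question
C2 = 0.67  # scale factor from the question

def compute_optimal_weights_array(price_array):
    # price_array: 1-D array of strictly positive prices in a fixed key order.
    data = np.asarray(price_array, dtype=float) ** C1
    return C2 / (data * np.sum(1.0 / data))

# Assumed setup: the keys never change, so their order is fixed once.
keys = list(range(10))
rng = np.random.default_rng(0)
price_array = rng.uniform(0.01, 1.0, size=len(keys))

weights = compute_optimal_weights_array(price_array)

# Build a dictionary only where one is actually needed.
weights_by_key = dict(zip(keys, weights.tolist()))
```

This avoids converting a dictionary to an array on every call, which is part of the per-call overhead the numpy version pays for small inputs.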

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/43509602
