Q: Why is assigning values to a C-contiguous array slow in my Cython example?

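Stack Overflow user

Asked 2016-04-06 10:03:16

1 answer · 461 views · 0 followers · 2 votes

I ran into a problem when assigning intermediate results to an array in Cython. I declare a test_array, a sample_size, and a weight_array, and in a for loop I save each weighted result into res_array. In Cython, both test_array and weight_array are defined as C-contiguous arrays. The test.pyx and setup.py files are listed below: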

# test.pyx
import numpy as np
cimport numpy as np
import random
cimport cython
from cython cimport boundscheck, wraparound


@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
@cython.cdivision(True)
@cython.profile(True)
def cython_sample(int res_size, int sample_size, double[::1] all_data, double[::1] weight_array):
    # using c-contiguous array can speed up a little bit
    cdef int ii, jj
    cdef double tmp_res, dot_result
    cdef double[::1] tmp_sample = np.ones(sample_size, dtype=np.double)
    cdef double[::1] res_array = np.ones(res_size, dtype=np.double)

    ran = random.normalvariate   # generate random value as a test
    for ii in range(res_size):
        tmp_sample = all_data[ii:(ii + sample_size)]

        # inner product operation
        dot_result = 0.0
        for jj in range(sample_size):
            dot_result += tmp_sample[jj]*weight_array[jj]

        # save inner product result into array 
        res_array[ii] = dot_result
        #res_array[ii] = ran(10000,20000)

    return res_array

# setup.py
from setuptools import setup,find_packages
from distutils.extension import Extension
from Cython.Build import cythonize
import numpy as np

ext = Extension("mycython.test", sources=["mycython/test.pyx"])
setup(ext_modules=cythonize(ext),
      include_dirs=[np.get_include()],
      name="mycython",     
      version="0.1",
      packages=find_packages(),
      author="me",
      author_email="me@example.com",
      url="http://example.com/")

The Python script test.py is:

import time
import random
import numpy as np
from mycython import test as __cyn__  # the extension module built by setup.py above

sample_size = 3000
test_array = [random.random() for _ in range(300000)]
res_size = len(test_array) - sample_size + 1
weight_array = [random.random() for _ in range(sample_size)]
c_contig_store_array = np.ascontiguousarray(test_array, dtype=np.double)
c_contig_weigh_array = np.ascontiguousarray(weight_array, dtype=np.double)


replay = 100
start_time = time.time()
for ii in range(int(replay)):
    __cyn__.cython_sample(res_size, sample_size, c_contig_store_array, c_contig_weigh_array)
per_elapsed_time = (time.time() - start_time) / replay
print('Elapse time :: %g sec' % (per_elapsed_time))

I then tested two scenarios:

# 1. when saving dot_result into 'res_array':
     res_array[ii] = dot_result

The timing test shows: Elapse time :: 0.821084 sec

# 2. when saving a random value ran(10000,20000) into 'res_array':
     res_array[ii] = ran(10000,20000)

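The timing test shows: Elapse time :: 0.214591 sec

The reason I tested the code with ran(*,*) is that if I comment out both res_array[ii] = dot_result and res_array[ii] = ran(10000,20000) in my original code, it runs roughly 30-100 times faster (Elapse time :: 0.00633394 sec). I therefore suspected that the problem lies in assigning the dot_result value to res_array. That seemed to be confirmed, because assigning a randomly generated double, ran(10000,20000), to res_array is quite fast (almost 4 times faster, as shown above).

Is there a way to solve this problem? Thanks.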

python

numpy

cython

1 Answer

Stack Overflow user

Accepted answer

Answered 2016-04-06 11:07:45

If the value of dot_result is never used, the compiler removes the loop that computes it:

dot_result = 0.0
for jj in range(sample_size):
    dot_result += tmp_sample[jj]*weight_array[jj]

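This inner loop accounts for the vast majority of the computation time, so the assignment to res_array is not what makes the code slow.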
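As a rough sanity check on that claim, the sketch below counts how many multiply-adds the inner loop performs for the sizes used in the question's test script, and compares that with the number of stores into res_array. The names n_data and inner_iterations are illustrative and do not appear in the original code; the elapsed time is the figure reported in the question.

```python
# Sizes taken from the question's test.py
sample_size = 3000
n_data = 300000                      # length of test_array (illustrative name)
res_size = n_data - sample_size + 1  # number of outer-loop iterations

# Each outer iteration runs the inner loop sample_size times.
inner_iterations = res_size * sample_size

# Elapsed time per call reported in the question for scenario 1.
elapsed_sec = 0.821084

print("stores into res_array:", res_size)
print("inner multiply-adds:  ", inner_iterations)
print("multiply-adds per sec: %.3g" % (inner_iterations / elapsed_sec))
```

With these sizes there are 297001 stores but 891003000 multiply-adds, so the store is about one part in 3000 of the work. Once the store is commented out, nothing depends on dot_result and the compiler is free to drop the inner loop entirely, which would explain the large speedup seen in the question.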

Your Cython code looks like correlate(); you can speed it up by using an FFT:

from scipy import signal
res = signal.fftconvolve(c_contig_store_array, c_contig_weigh_array[::-1], mode="valid")
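To illustrate why an FFT applies here, the following is a minimal NumPy-only sketch, not the answerer's code. It assumes the Cython function computes res[i] = sum_j data[i + j] * weights[j], as the loops in the question suggest. The helper names sliding_dot_naive and sliding_dot_fft are hypothetical, and the arrays are much smaller than in the question so the check runs quickly.

```python
import numpy as np


def sliding_dot_naive(data, weights):
    """Explicit sliding dot product, mirroring the loops in the question."""
    n, m = len(data), len(weights)
    out = np.empty(n - m + 1)
    for i in range(n - m + 1):
        out[i] = np.dot(data[i:i + m], weights)
    return out


def sliding_dot_fft(data, weights):
    """Same result via FFT: convolve data with the reversed weights."""
    n, m = len(data), len(weights)
    size = n + m - 1  # length of the full linear convolution
    spectrum = np.fft.rfft(data, size) * np.fft.rfft(weights[::-1], size)
    full = np.fft.irfft(spectrum, size)
    # Keep only the positions where the window fully overlaps the data.
    return full[m - 1:n]


rng = np.random.default_rng(0)
data = rng.random(2000)     # small stand-in for c_contig_store_array
weights = rng.random(50)    # small stand-in for c_contig_weigh_array

naive = sliding_dot_naive(data, weights)
direct = np.correlate(data, weights, mode="valid")
via_fft = sliding_dot_fft(data, weights)

print(np.allclose(naive, direct), np.allclose(naive, via_fft))
```

Reversing the weights turns the convolution into a correlation, which is why the answer passes c_contig_weigh_array[::-1] to fftconvolve. The FFT route costs on the order of n log n operations instead of n times sample_size, at the price of small floating-point differences from the direct sum.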

3 votes

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/36447822
