Q: Using scipy.sparse.csc_matrix in place of NumPy broadcasting

Stack Overflow user

Asked 2013-04-17 01:26:16

Answers: 2 | Views: 4.1K | Followers: 0 | Votes: 10

I have the following expression in my code:

a = (b / x[:, np.newaxis]).sum(axis=1)

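where b is an ndarray of shape (M, N) and x is an ndarray of shape (M,). In practice b is sparse, so for memory efficiency I would like to store it as a scipy.sparse.csc_matrix or csr_matrix instead. However, this kind of broadcasting is not implemented for sparse matrices and raises NotImplementedError, even though the division (or a multiplication) is guaranteed to preserve sparsity because the entries of x are nonzero. Is there a sparse function I am not aware of that does what I want? (dot() would sum along the wrong axis.)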

python

numpy

scipy

sparse-matrix

2 Answers

Stack Overflow user

Accepted answer

Posted 2013-04-17 04:47:26

If b is in CSC format, then b.data holds the nonzero entries of b and b.indices holds the row index of each of those entries, so you can do the division as follows:

b.data /= np.take(x, b.indices)
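To make the role of b.data and b.indices concrete, here is a minimal sketch on a small matrix. The 3x3 values and the variable names (dense, scaled, expected) are illustrative assumptions, not part of the original answer. It prints the CSC arrays and checks the division against the dense broadcast:

```python
import numpy as np
import scipy.sparse as sps

# Illustrative 3x3 example; the values are arbitrary.
dense = np.array([[1., 0., 2.],
                  [0., 3., 0.],
                  [4., 0., 5.]])
x = np.array([2., 3., 4.])

b = sps.csc_matrix(dense)

# In CSC, values are stored column by column, and indices[k] is the
# row index of data[k].
print(b.data)     # stored nonzero values
print(b.indices)  # row index of each stored value

# Work on a copy so that b itself is left untouched.
scaled = b.copy()
scaled.data /= np.take(x, scaled.indices)  # divide each value by x[row]

# Compare with the dense broadcasting expression from the question.
expected = dense / x[:, np.newaxis]
print(np.allclose(scaled.toarray(), expected))
```

Note that the one-liner modifies b in place, and it assumes b.data has a floating-point dtype; with an integer dtype, in-place true division would raise an error.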

This approach is more involved than Warren's elegant solution, but it will probably also be faster in most cases:

import numpy as np
import scipy.sparse as sps

b = sps.rand(1000, 1000, density=0.01, format='csc')
x = np.random.rand(1000)

def row_divide_col_reduce(b, x):
    data = b.data.copy() / np.take(x, b.indices)
    ret = sps.csc_matrix((data, b.indices.copy(), b.indptr.copy()),
                         shape=b.shape)
    return ret.sum(axis=1)

def row_divide_col_reduce_bis(b, x):
    d = sps.spdiags(1.0/x, 0, len(x), len(x))
    return (d * b).sum(axis=1)

In [2]: %timeit row_divide_col_reduce(b, x)
1000 loops, best of 3: 210 us per loop

In [3]: %timeit row_divide_col_reduce_bis(b, x)
1000 loops, best of 3: 697 us per loop

In [4]: np.allclose(row_divide_col_reduce(b, x),
   ...:             row_divide_col_reduce_bis(b, x))
Out[4]: True

In the example above, doing the division in place cuts the time by nearly half (note that this modifies b itself):

def row_divide_col_reduce(b, x):
    b.data /= np.take(x, b.indices)
    return b.sum(axis=1)

In [2]: %timeit row_divide_col_reduce(b, x)
10000 loops, best of 3: 131 us per loop
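The np.take trick relies on b.indices holding row indices, which is only true for CSC. For a CSR matrix, b.indices holds column indices instead, so the row of each stored value has to be recovered from b.indptr. The following is a sketch of one way to do that; the function name row_divide_csr is hypothetical, and it was not part of the original answer or its timings:

```python
import numpy as np
import scipy.sparse as sps

def row_divide_csr(b, x):
    """Return a copy of the CSR matrix b with row i divided by x[i]."""
    # Row i owns indptr[i+1] - indptr[i] consecutive entries of data.
    counts = np.diff(b.indptr)
    # Copy as float so the caller's matrix is not modified.
    out = sps.csr_matrix(b, dtype=np.float64, copy=True)
    # Repeat x[i] once per stored value in row i, then divide elementwise.
    out.data /= np.repeat(x, counts)
    return out

b = sps.random(50, 40, density=0.1, format='csr', random_state=0)
x = np.random.default_rng(0).random(50) + 0.5  # strictly nonzero
a = np.asarray(row_divide_csr(b, x).sum(axis=1)).ravel()
print(np.allclose(a, (b.toarray() / x[:, np.newaxis]).sum(axis=1)))
```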

Votes: 10

Stack Overflow user

Posted 2013-04-17 04:23:42

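To compute a = (b / x[:, np.newaxis]).sum(axis=1), you can use a = b.sum(axis=1).A1 / x. The A1 attribute returns a 1-D ndarray, so the result is a 1-D ndarray rather than a matrix. This concise expression works because you are both scaling by x and summing along axis 1: x[i] is constant within row i, so it can be factored out of the row sum. For example: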

In [190]: b
Out[190]: 
<3x3 sparse matrix of type '<type 'numpy.float64'>'
        with 5 stored elements in Compressed Sparse Row format>

In [191]: b.A
Out[191]: 
array([[ 1.,  0.,  2.],
       [ 0.,  3.,  0.],
       [ 4.,  0.,  5.]])

In [192]: x
Out[192]: array([ 2.,  3.,  4.])

In [193]: b.sum(axis=1).A1 / x
Out[193]: array([ 1.5 ,  1.  ,  2.25])
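As a sanity check on a larger input, the sketch below compares the sum-then-divide shortcut with the dense broadcasting expression from the question. The matrix size, density, and seeds are arbitrary assumptions. It uses np.asarray(...).ravel() in place of .A1, which gives the same 1-D result and should also work if sum returns a plain ndarray:

```python
import numpy as np
import scipy.sparse as sps

# Arbitrary random test data; x is shifted to keep it away from zero.
b = sps.random(200, 300, density=0.05, format='csr', random_state=0)
x = np.random.default_rng(1).random(200) + 0.5

# Sum each row of the sparse matrix first, then divide by x.
a_sparse = np.asarray(b.sum(axis=1)).ravel() / x

# Reference: the dense broadcasting expression from the question.
a_dense = (b.toarray() / x[:, np.newaxis]).sum(axis=1)

print(np.allclose(a_sparse, a_dense))
```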

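More generally, to scale the rows of a sparse matrix by a vector x, you can left-multiply b by a sparse matrix that has 1.0/x on its diagonal. The function scipy.sparse.spdiags can be used to create such a matrix. For example: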

In [71]: from scipy.sparse import csc_matrix, spdiags

In [72]: b = csc_matrix([[1,0,2],[0,3,0],[4,0,5]], dtype=np.float64)

In [73]: b.A
Out[73]: 
array([[ 1.,  0.,  2.],
       [ 0.,  3.,  0.],
       [ 4.,  0.,  5.]])

In [74]: x = np.array([2., 3., 4.])

In [75]: d = spdiags(1.0/x, 0, len(x), len(x))

In [76]: d.A
Out[76]: 
array([[ 0.5       ,  0.        ,  0.        ],
       [ 0.        ,  0.33333333,  0.        ],
       [ 0.        ,  0.        ,  0.25      ]])

In [77]: p = d * b

In [78]: p.A
Out[78]: 
array([[ 0.5 ,  0.  ,  1.  ],
       [ 0.  ,  1.  ,  0.  ],
       [ 1.  ,  0.  ,  1.25]])

In [79]: a = p.sum(axis=1)

In [80]: a
Out[80]: 
matrix([[ 1.5 ],
        [ 1.  ],
        [ 2.25]])

Votes: 7

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/16043299
