I am looking for a concise way to go from:
a = numpy.array([1,4,1,numpy.nan,2,numpy.nan])
to:
b = numpy.array([1,5,6,numpy.nan,8,numpy.nan])
The best I can do at the moment is:
b = numpy.insert(numpy.cumsum(a[numpy.isfinite(a)]), numpy.flatnonzero(numpy.isnan(a)) - numpy.arange(len(numpy.flatnonzero(numpy.isnan(a)))), numpy.nan)
Is there a shorter way to accomplish the same thing? And what about doing the cumsum along an axis of a 2D array?
Answered 2012-10-24 22:26:33
How about this (for arrays that are not too large):
In [34]: import numpy as np
In [35]: a = np.array([1,4,1,np.nan,2,np.nan])
In [36]: a*0 + np.nan_to_num(a).cumsum()
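Out[36]: array([ 1., 5., 6., nan, 8., nan])
Answered 2012-10-25 03:51:08
Pandas is a library built on top of numpy. Its Series class has a cumsum method that preserves the NaNs and is considerably faster than the solution proposed by DSM: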
In [15]: a = np.arange(10000.0)
In [16]: a[1] = np.nan
In [17]: %timeit a*0 + np.nan_to_num(a).cumsum()
1000 loops, best of 3: 465 us per loop
In [18]: s = pd.Series(a)
In [19]: s.cumsum()
Out[19]:
0 0
1 NaN
2 2
3 5
...
9996 49965005
9997 49975002
9998 49985000
9999 49994999
Length: 10000
In [20]: %timeit s.cumsum()
10000 loops, best of 3: 175 us per loop
Answered 2012-10-24 22:42:59
Masked arrays are designed for exactly this kind of situation.
>>> import numpy as np
>>> from numpy import ma
>>> a = np.array([1,4,1,np.nan,2,np.nan])
>>> b = ma.masked_array(a,mask = (np.isnan(a) | np.isinf(a)))
>>> b
masked_array(data = [1.0 4.0 1.0 -- 2.0 --],
mask = [False False False True False True],
fill_value = 1e+20)
>>> c = b.cumsum()
>>> c
masked_array(data = [1.0 5.0 6.0 -- 8.0 --],
mask = [False False False True False True],
fill_value = 1e+20)
>>> c.filled(np.nan)
array([ 1., 5., 6., nan, 8., nan])
https://stackoverflow.com/questions/13051103
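The second part of the question (cumulating along an axis of a 2D array) is not covered by the answers above. Below is a minimal sketch of how the same ideas might extend to that case. The helper name `nan_cumsum` and the sample array `a2` are made up for illustration and do not come from the thread. The sketch zeroes out only the NaNs with `np.where` rather than calling `np.nan_to_num`, on the assumption that infinities should not be silently turned into large finite numbers.

```python
import numpy as np
from numpy import ma


def nan_cumsum(a, axis=0):
    """Cumulative sum along `axis` that skips NaNs while summing
    and puts NaN back at the positions where the input had NaN.

    Hypothetical helper, not part of NumPy.
    """
    a = np.asarray(a, dtype=float)
    nan_mask = np.isnan(a)
    # Treat NaN as 0 for the running total.
    total = np.cumsum(np.where(nan_mask, 0.0, a), axis=axis)
    # Restore NaN where the input was NaN.
    return np.where(nan_mask, np.nan, total)


# Illustrative 2D input (assumed data, not from the thread).
a2 = np.array([[1, 4, 1, np.nan, 2, np.nan],
               [np.nan, 1, 1, 1, np.nan, 3]])

along_rows = nan_cumsum(a2, axis=1)
# [[ 1.  5.  6. nan  8. nan]
#  [nan  1.  2.  3. nan  6.]]

along_cols = nan_cumsum(a2, axis=0)
# [[ 1.  4.  1. nan  2. nan]
#  [nan  5.  2.  1. nan  3.]]

# The masked-array route accepts the same axis argument.
masked_rows = ma.masked_invalid(a2).cumsum(axis=1).filled(np.nan)
```

For inputs without infinities, both routes should give the same result. Note that `ma.masked_invalid` also masks infinities, while the helper above leaves them in the sum. Newer NumPy versions also provide `np.nancumsum`, which treats NaN as zero but does not restore NaN in the output, so it would still need the final `np.where` step.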