I am trying to calculate a percentage relative to the first down in a dataframe.
This is the dataframe:
down distance
1 1.0 10.0
2 2.0 13.0
3 3.0 15.0
4 3.0 20.0
5 4.0 1.0
6 1.0 10.0
7 2.0 9.0
8 3.0 3.0
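9 1.0 10.0
I want to calculate the percentage change relative to the first down. For the second down, that is the percentage by which the yardage changed compared with the first down; for the third down, the percentage for the third relative to the first.
For example, I would like the following output.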
down distance percentage
1 1.0 10.0 NaN
2 2.0 13.0 (13-10)/13
3 3.0 15.0 (15-10)/15
4 3.0 20.0 (20-10)/20
5 4.0 1.0 (1-10)/1
6 1.0 10.0 NaN # New calculation
7 2.0 9.0 (9-10)/9
8 3.0 3.0 (3-10)/3
9 1.0 10.0 NaN
Thanks.
The current solutions all work fine for the first question.
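The answers below use `df` without defining it. A minimal sketch for reproducing them, assuming the sample data from the question with an index starting at 1, might look like this. The helper name `reference_percentage` is hypothetical (it is not from the original post); it is a plain loop that applies `(distance - first) / distance` within each series of downs, which can serve as a cross-check for the vectorized answers.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; the index starts at 1 as in the post.
df = pd.DataFrame(
    {
        "down": [1.0, 2.0, 3.0, 3.0, 4.0, 1.0, 2.0, 3.0, 1.0],
        "distance": [10.0, 13.0, 15.0, 20.0, 1.0, 10.0, 9.0, 3.0, 10.0],
    },
    index=range(1, 10),
)


def reference_percentage(frame):
    """Hypothetical helper: loop version of (distance - first) / distance."""
    out = []
    first = np.nan  # distance of the most recent first down seen so far
    for down, distance in zip(frame["down"], frame["distance"]):
        if down == 1:
            first = distance    # a new series of downs starts here
            out.append(np.nan)  # the first down has nothing to compare against
        else:
            out.append((distance - first) / distance)
    return pd.Series(out, index=frame.index, name="percentage")


expected = reference_percentage(df)
print(df.assign(percentage=expected))
```

The loop is slower than the vectorized answers on large frames, but it states the intended rule directly and makes it easy to compare results with `np.allclose(..., equal_nan=True)`.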
Answered 2018-08-10 08:40:34
Here is a vectorized solution:
# define condition
cond = df['down'] == 1
# calculate value to subtract
first = df['distance'].where(cond).ffill().mask(cond)
# perform calculation
df['percentage'] = (df['distance'] - first) / df['distance']
print(df)
down distance percentage
1 1.0 10.0 NaN
2 2.0 13.0 0.230769
3 3.0 15.0 0.333333
4 3.0 20.0 0.500000
5 4.0 1.0 -9.000000
6 1.0 10.0 NaN
7 2.0 9.0 -0.111111
8 3.0 3.0 -2.333333
9 1.0 10.0 NaN
Answered 2018-08-10 08:47:46
Using groupby and transform:
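# label each series of downs (a new group starts at every down == 1)
# and broadcast the first distance of the group to all of its rows
s = df.groupby(df.down.eq(1).cumsum()).distance.transform('first')
# (distance - first) / distance
s = df.distance.sub(s).div(df.distance)
# a zero result marks the first row of each group; replace it with NaN
df['percentage'] = s.mask(s.eq(0))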
down distance percentage
1 1.0 10.0 NaN
2 2.0 13.0 0.230769
3 3.0 15.0 0.333333
4 3.0 20.0 0.500000
5 4.0 1.0 -9.000000
6 1.0 10.0 NaN
7 2.0 9.0 -0.111111
8 3.0 3.0 -2.333333
9 1.0 10.0 NaN
Answered 2018-08-10 12:56:10
Using a bit of Numpy
This should be quite fast!
import numpy as np

m = df.down.values == 1 # mask where equal to 1
i = np.flatnonzero(m) # positions where equal to 1
d = df.distance.values # Numpy array of distances
j = np.diff(np.append(i, len(df))) # use diff to find distances between
# values equal to 1. Note that I append
# the length of the df as a terminal value
k = i.repeat(j) # I repeat the positions where equal to 1
# a number of times in order to fill in.
p = np.where(m, np.nan, 1 - d[k] / d) # (d - d[k]) / d simplified to 1 - d[k] / d; NaN where down == 1
df.assign(percentage=p)
down distance percentage
1 1.0 10.0 NaN
2 2.0 13.0 0.230769
3 3.0 15.0 0.333333
4 3.0 20.0 0.500000
5 4.0 1.0 -9.000000
6 1.0 10.0 NaN
7 2.0 9.0 -0.111111
8 3.0 3.0 -2.333333
9 1.0 10.0 NaN
https://stackoverflow.com/questions/51777429
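One edge case worth checking in the groupby answer above: `s.mask(s.eq(0))` hides every row whose result is exactly zero, not only the first-down rows. The sketch below uses made-up data (not from the question) in which a later down has the same distance as the first down, and compares that mask with one based on `down == 1`. Which behaviour is wanted depends on the use case, so treat this as an illustration rather than a correction.

```python
import pandas as pd

# Hypothetical data: the third row repeats the first down's distance.
df = pd.DataFrame({"down": [1, 2, 3], "distance": [10.0, 13.0, 10.0]})

# First distance of each series of downs, broadcast to every row.
first = df.groupby(df["down"].eq(1).cumsum())["distance"].transform("first")
pct = df["distance"].sub(first).div(df["distance"])

masked_on_zero = pct.mask(pct.eq(0))         # as in the groupby answer
masked_on_down = pct.mask(df["down"].eq(1))  # mask only the first-down rows

print(pd.DataFrame({"on_zero": masked_on_zero, "on_down": masked_on_down}))
```

With this input the third row becomes NaN under the zero-based mask but stays at 0.0 when masking on `down == 1`.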