我有一个这样的数据帧:
Id F M R
7 1 286 907
12 1 286 907
17 1 186 1271
21 1 296 905
30 1 308 908
32 1 267 905
40 2 591 788
41 1 486 874
47 1 686 906
74 1 230 907对于每一行,如果f> f's mean()、M> M's mean()和R>R's mean(),则新列中的输出为"1“。
如下所示:
Id F M R score
7 1 286 907 1
12 1 286 907 0
17 1 186 1271 1
21 1 296 905
30 1 308 908
32 1 267 905
40 2 591 788
41 1 486 874
47 1 686 906
74 1 230 907发布于 2016-06-27 16:08:54
您可以对通过将3列与其mean进行比较而创建的mask使用numpy.where,然后使用all检查所有行都是True
# I modify last value in row with index 6 to 1000
print (df)
Id F M R
0 7 1 286 907
1 12 1 286 907
2 17 1 186 1271
3 21 1 296 905
4 30 1 308 908
5 32 1 267 905
6 40 2 591 1000
7 41 1 486 874
8 47 1 686 906
9 74 1 230 907print (df.F.mean())
1.1
print (df.M.mean())
362.2
print (df.R.mean())
949.0
print (df[['F','M','R']] > df[['F','M','R']].mean())
F M R
0 False False False
1 False False False
2 False False True
3 False False False
4 False False False
5 False False False
6 True True True
7 False True False
8 False True False
9 False False False
mask = (df[['F','M','R']] > df[['F','M','R']].mean()).all(1)
print (mask)
0 False
1 False
2 False
3 False
4 False
5 False
6 True
7 False
8 False
9 False
dtype: booldf['score'] = np.where(mask,1,0)
print (df)
Id F M R score
0 7 1 286 907 0
1 12 1 286 907 0
2 17 1 186 1271 0
3 21 1 296 905 0
4 30 1 308 908 0
5 32 1 267 905 0
6 40 2 591 1000 1
7 41 1 486 874 0
8 47 1 686 906 0
9 74 1 230 907 0如果条件发生变化:
mask = (df.F > df.F.mean()) & (df.M < df.M.mean()) & (df.R < df.R.mean())
print (mask)
0 False
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 False
9 False
dtype: bool
df['score'] = np.where(mask,2,0)
print (df)
Id F M R score
0 7 1 286 907 0
1 12 1 286 907 0
2 17 1 186 1271 0
3 21 1 296 905 0
4 30 1 308 908 0
5 32 1 267 905 0
6 40 2 591 1000 0
7 41 1 486 874 0
8 47 1 686 906 0
9 74 1 230 907 0编辑:
我认为你可以首先检查在某些条件下是否有多个值不在某一行中,方法是:
mask1 = (df.F > df.F.mean()) & (df.M > df.M.mean()) & (df.R > df.R.mean())
mask2 = (df.F > df.F.mean()) & (df.M < df.M.mean()) & (df.R < df.R.mean())
mask3 = (df.F < df.F.mean()) & (df.M < df.M.mean()) & (df.R < df.R.mean())
df['score1'] = np.where(mask1,1,0)
df['score2'] = np.where(mask2,2,0)
df['score3'] = np.where(mask3,3,0)如果不是,请使用:
df.loc[mask1, 'score'] = 1
df.loc[mask2, 'score'] = 2
df.loc[mask3, 'score'] = 3
df.score.fillna(0, inplace=True)发布于 2016-06-27 16:06:42
df.loc[df['f']>df['f'].mean(),['f']] += 1
df.loc[df['m']>df['m'].mean(),['m']] += 1
df.loc[df['r']>df['r'].mean(),['r']] += 1还没有测试过,如果有效,请尝试并评论。
或者试试这个
df['f'] = [x+1 for x in df['f'] if x>df['f'].mean()]
df['m'] = [x+1 for x in df['m'] if x>df['m'].mean()]
df['r'] = [x+1 for x in df['r'] if x>df['r'].mean()]https://stackoverflow.com/questions/38048665
复制相似问题