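I have a pandas DataFrame in which I need to iterate over the values of two columns (Location and Event) and replace the strings "Gate-3" and "NO Access" with NaN.
Below is a sample of the DataFrame.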
Time Location Event Badge ID
18:28:59 Gate-2 Access Granted 81002
18:28:12 Gate-1 Access Granted 80557
18:27:55 Gate-3 Access Granted 80557
18:27:44 Gate-3 NO Access 80398
18:25:38 Gate-1 NO Access 80978
18:25:30 Gate-2 Access Granted 73680
18:23:56 Gate-1 Access Granted 73680
18:23:52 Gate-2 Access Granted 80557
18:23:19 Gate-2 NO Access 128
18:23:16 Gate-1 Access Granted 80557
The expected output is:
Time Location Event Badge ID
0 18:28:59 Gate-2 Access Granted 81002
1 18:28:12 Gate-1 Access Granted 80557
2 18:27:55 NaN Access Granted 80557
3 18:27:44 NaN NaN 80398
4 18:25:38 Gate-1 NaN 80978
5 18:25:30 Gate-2 Access Granted 73680
6 18:23:56 Gate-1 Access Granted 73680
7 18:23:52 Gate-2 Access Granted 80557
8 18:23:19 Gate-2 NaN 128
9 18:23:16 Gate-1 Access Granted 80557
Posted on 2018-12-14 17:12:20
You can set this when loading the XLS file by specifying the na_values parameter.
df = pd.read_excel('file.xls', na_values=['Gate-3', 'NO Access'])
print(df)
Time Location Event Badge ID
0 18:28:59 Gate-2 Access Granted 81002
1 18:28:12 Gate-1 Access Granted 80557
2 18:27:55 NaN Access Granted 80557
3 18:27:44 NaN NaN 80398
4 18:25:38 Gate-1 NaN 80978
5 18:25:30 Gate-2 Access Granted 73680
6 18:23:56 Gate-1 Access Granted 73680
7 18:23:52 Gate-2 Access Granted 80557
8 18:23:19 Gate-2 NaN 128
9 18:23:16 Gate-1 Access Granted 80557
This is, IMO, better than having to clean your data after loading it.
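One caveat with a flat list is that pandas treats those strings as missing in every column, not just Location and Event. As a hedged sketch, na_values also accepts a dict keyed by column name, which limits each sentinel to its own column. The example below uses pd.read_csv on an in-memory string instead of pd.read_excel so that it runs without an Excel file. The sample rows come from the question, and the names csv_text and df_sample are illustrative assumptions.

```python
import io

import pandas as pd

# A few rows from the question, written as CSV so the sketch is self-contained.
csv_text = """Time,Location,Event,Badge ID
18:27:55,Gate-3,Access Granted,80557
18:27:44,Gate-3,NO Access,80398
18:25:38,Gate-1,NO Access,80978
18:25:30,Gate-2,Access Granted,73680
"""

# Dict form of na_values: each sentinel only counts as NaN in its own column.
df_sample = pd.read_csv(
    io.StringIO(csv_text),
    na_values={"Location": ["Gate-3"], "Event": ["NO Access"]},
)
print(df_sample)
```

The same dict can be passed to pd.read_excel, which also accepts per-column na_values.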
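Posted on 2018-12-14 17:12:41
You can get a boolean mask that is True where the condition holds.
mask = df.Location.eq('Gate-3') & df.Event.eq('NO Access') # df is your dataframe
You can then use that mask to set whichever columns you want to NaN, like this:
df.loc[mask, ['Location', 'Event']] = np.nan # imported numpy as np
Edit:
It looks like you changed the spec. If you want to set the Location or Event column to NaN wherever that column matches its sentinel value, use two masks.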
locmask = df.Location.eq('Gate-3')
df.loc[locmask, 'Location'] = np.nan
evmask = df.Event.eq('NO Access')
df.loc[evmask, 'Event'] = np.nan
Posted on 2018-12-14 16:58:49
If I haven't misunderstood your question, how about this?
import pandas as pd
import numpy as np
df.loc[df.Location == 'Gate-3', 'Location'] = np.nan
df.loc[df.Event == 'NO Access', 'Event'] = np.nan
https://stackoverflow.com/questions/53783937
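For anyone who wants to try the per-column replacement without the original file, here is a minimal runnable sketch. It assumes the DataFrame is built by hand from a few rows of the question's sample, then applies the same df.loc assignments, one column at a time.

```python
import numpy as np
import pandas as pd

# Hand-built subset of the question's sample data (assumption: no file is loaded).
df = pd.DataFrame(
    {
        "Time": ["18:28:59", "18:27:55", "18:27:44", "18:25:38", "18:23:19"],
        "Location": ["Gate-2", "Gate-3", "Gate-3", "Gate-1", "Gate-2"],
        "Event": ["Access Granted", "Access Granted", "NO Access", "NO Access", "NO Access"],
        "Badge ID": [81002, 80557, 80398, 80978, 128],
    }
)

# Each column is checked against its own sentinel, independently of the other.
df.loc[df["Location"] == "Gate-3", "Location"] = np.nan
df.loc[df["Event"] == "NO Access", "Event"] = np.nan

print(df)
```

Because the two assignments are independent, a row such as 18:27:55 loses only its Location, while 18:27:44 loses both values. This matches the expected output in the question.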