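I have JSON in this format:
{
"userobject": ["Nike", "NY", "", "Actor", "Artist"]
}
I have a DataFrame in which every column has a value except the Country column, which is empty; that is what the "" in the JSON above represents.
I want to get all rows from the DataFrame whose values match those in the JSON. My code: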
org = userobject[0]
region = userobject[1]
country = userobject[2]
title = userobject[3]
dept = userobject[4]
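newdf = df[(df['Organization'] == org) & (df['Region'] == region) & (df['Country'] == country) & (df['Title'] == title) & (df['Department'] == dept)]
But this breaks my code; it only works when I remove the (df['Country'] == country) part. I still need that clause, though, because the data has to be filtered dynamically from the JSON, and for some records the country value will be present. Can anyone help?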
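One likely reason the Country clause matches nothing, assuming the empty Country cells are stored as NaN rather than as empty strings (the question does not say which), is that NaN never compares equal to "". A minimal sketch on made-up rows:

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame; the Country cell of the first row is missing (NaN).
df = pd.DataFrame({
    "Organization": ["Nike", "Nike"],
    "Country": [np.nan, "US"],
})

country = ""  # the empty value taken from the JSON

# NaN never compares equal to "", so this mask is False for every row.
mask_raw = df["Country"] == country

# Filling the missing cells with "" first lets the empty value match them.
mask_filled = df["Country"].fillna("") == country
```

If the column really holds empty strings instead of NaN, the plain comparison already works and the cause lies elsewhere.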
Update: the suggested code fails in this case. The data looks like this:
User Organization Region Country Title Department Output
0 rp990.john1001@gmail.com ABC.com BR IN DEV Engineering 1
1 rp1007.john1007@gmail.com ABC.com BR IN DEV Engineering 1
2 rp1012.john1012@gmail.com ABC.com BR IN DEV Engineering 1
3 rp1001.john1001@gmail.com ABC.com BR IN DEV Engineering 1
4 rp1017.john1017@gmail.com ABC.com BR IN DEV Engineering 1
5 rp1006.john1006@gmail.com ABC.com BR IN DEV Engineering 1
6 rp1011.john1011@gmail.com ABC.com BR IN DEV Engineering 1
7 rp1016.john1016@gmail.com ABC.com BR IN DEV Engineering 1
8 rp1005.john1005@gmail.com ABC.com BR IN DEV Engineering 1
9 rp1010.john1010@gmail.com -1
10 rp1015.john1015@gmail.com ABC.com BR IN DEV Engineering 1
11 rp1004.john1004@gmail.com ABC.com BR IN DEV Engineering 1
12 rp1020.john1020@gmail.com ABC.com BR IN DEV Engineering 1
13 rp1009.john1009@gmail.com ABC.com BR IN DEV Engineering 1
14 rp1014.john1014@gmail.com ABC.com BR IN DEV Engineering 1
15 rp1003.john1003@gmail.com ABC.com BR IN DEV Engineering 1
16 rp1019.john1019@gmail.com ABC.com BR IN DEV Engineering 1
17 rp1008.john1008@gmail.com ABC.com BR IN DEV Engineering 1
18 rp1013.john1013@gmail.com ABC.com BR IN DEV Engineering 1
19 rp1002.john1002@gmail.com ABC.com BR IN DEV Engineering 1
20 rp1018.john1018@gmail.com ABC.com BR IN DEV Engineering 1
21 rp1027.john1027@gmail.com ABC.com BR IN DEV Engineering 1
22 rp1021.john1021@gmail.com ABC.com BR IN DEV Engineering 1
23 rp1026.john1026@gmail.com ABC.com BR IN DEV Engineering 1
24 rp1025.john1025@gmail.com ABC.com BR IN DEV Engineering 1
25 rp1024.john1024@gmail.com ABC.com BR IN DEV Engineering 1
26 rp1029.john1029@gmail.com ABC.com BR IN DEV Engineering 1
27 rp1023.john1023@gmail.com ABC.com BR IN DEV Engineering 1
28 rp1028.john1028@gmail.com ABC.com BR IN DEV Engineering 1
29 rp1022.john1022@gmail.com ABC.com BR IN DEV Engineering 1
30 rp1036.john1036@gmail.com ABC.com BR IN DEV Engineering 1
31 rp1052.john1052@gmail.com -1
32 rp1041.john1041@gmail.com ABC.com BR IN DEV Engineering 1
33 rp1057.john1057@gmail.com -1
34 rp1030.john1030@gmail.com ABC.com BR DEV Engineering -1
35 rp1046.john1046@gmail.com ABC.com BR IN DEV MCA -1
36 rp1035.john1035@gmail.com ABC.com BR IN DEV Engineering 1
37 rp1051.john1051@gmail.com ABC.com BR IN DEV Engineering 1
38 rp1040.john1040@gmail.com ABC.com BR IN DEV MCA -1
39 rp1056.john1056@gmail.com -1
40 rp1045.john1045@gmail.com ABC.com BR DEV Engineering -1
41 rp1034.john1034@gmail.com ABC.com BR IN DEV MAC -1
42 rp1050.john1050@gmail.com ABC.com BR IN DEV Engineering 1
43 rp1039.john1039@gmail.com ABC.com BR IN DEV Engineering 1
44 rp1055.john1055@gmail.com -1
45 rp1044.john1044@gmail.com ABC.com BR IN DEV Engineering 1
46 rp1060.john1060@gmail.com -1
47 rp1033.john1033@gmail.com ABC.com BR IN DEV Engineering 1
48 rp1049.john1049@gmail.com ABC.com BR IN DEV Engineering 1
49 rp1038.john1038@gmail.com ABC.com BR IN DEV Engineering 1
50 rp1054.john1054@gmail.com -1
51 rp1043.john1043@gmail.com ABC.com KRI IN DEV Engineering -1
52 rp1059.john1059@gmail.com -1
53 rp1032.john1032@gmail.com ABC.com KRI IN DEV Engineering -1
54 rp1048.john1048@gmail.com ABC.com BR IN DEV Engineering 1
55 rp1037.john1037@gmail.com ABC.com PH DEV Engineering -1
56 rp1053.john1053@gmail.com -1
57 rp1042.john1042@gmail.com ABC.com BR IN DEV Engineering 1
58 rp1058.john1058@gmail.com -1
59 rp1031.john1031@gmail.com ABC.com BR IN DEV Engineering 1
60 rp1047.john1047@gmail.com ABC.com BR IN DEV Engineering 1
The filter condition is:
"ABC.com", "BR", "", "DEV", "Engineering" for user rp1045.john1045@gmail.com
Answered 2022-03-12 14:29:15
You can use reduce to build the condition dynamically:
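from functools import reduce
import pandas as pd

cols = ['Organization', 'Region', 'Country', 'Title', 'Department']
json_dict = {"userobject": ["Nike", "NY", "", "Actor", "Artist"]}
cond = reduce(
    # AND the condition built so far with one more column == value test
    lambda r, p: r & (df[p[0]] == p[1]),
    # (column, value) pairs, skipping any pair whose value is ""
    ((c, v) for c, v in zip(cols, json_dict["userobject"]) if v != ""),
    # initial value: a mask that keeps every row
    pd.Series(True, index=df.index)
)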
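newdf = df[cond]
reduce starts from the initial value r, which is the last argument: here, a Series filled with True (pd.Series(True, index=df.index)). It then "reduces" the given iterable (the second argument) one item at a time through the function passed as the first argument. The items of the iterable are pairs (c, v) of a column name and the corresponding value from userobject, filtered by value: a pair whose value is "" is dropped. The lambda takes the condition r built so far and appends & (df[c] == v) to it.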
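To make the reduction concrete, here is a self-contained sketch on a small made-up DataFrame (the three rows are hypothetical, not the asker's data). Because Country is "" in userobject, it is skipped, so rows match regardless of their Country value:

```python
from functools import reduce

import pandas as pd

# Hypothetical sample rows, made up for illustration.
df = pd.DataFrame({
    "Organization": ["Nike", "Nike", "Adidas"],
    "Region": ["NY", "NY", "NY"],
    "Country": ["US", None, "US"],
    "Title": ["Actor", "Actor", "Actor"],
    "Department": ["Artist", "Artist", "Artist"],
})
cols = ["Organization", "Region", "Country", "Title", "Department"]
userobject = ["Nike", "NY", "", "Actor", "Artist"]

# Keep only the (column, value) pairs whose value is not the empty string.
pairs = [(c, v) for c, v in zip(cols, userobject) if v != ""]

# Start from an all-True mask and AND in one equality test per remaining pair.
cond = reduce(
    lambda r, p: r & (df[p[0]] == p[1]),
    pairs,
    pd.Series(True, index=df.index),
)
newdf = df[cond]  # keeps the two Nike rows, with or without a Country
```

If every value in userobject were "", pairs would be empty and reduce would return the initial all-True mask, so no row would be filtered out.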
If you don't like this solution, here is an adaptation of @ShubhamSharma's approach:
cols = ['Organization', 'Region', 'Country', 'Title', 'Department']
json_dict = {"userobject": ["ABC.com", "BR", "", "DEV", "Engineering"]}
# Drop the (column, value) pairs whose value is ""
col_val = [(c, v) for c, v in zip(cols, json_dict["userobject"]) if v != ""]
if col_val:
    cols_used, values = map(list, zip(*col_val))
else:
    cols_used, values = [], []
newdf = df[df[cols_used].eq(values).all(axis=1)]
Answered 2022-03-12 14:40:15
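First you have to fill the null values with empty strings; only then can you create the mask. You can also simplify the code by comparing the columns against the userobject list with eq, and then using all to reduce the boolean mask along the column axis.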
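A small sketch of what each step produces, on two made-up rows (hypothetical data). Note that with this approach an empty string in userobject is not a wildcard: it only matches rows whose Country is empty or null.

```python
import pandas as pd

# Hypothetical rows: the first has no Country, the second has "IN".
df = pd.DataFrame({
    "Organization": ["ABC.com", "ABC.com"],
    "Region": ["BR", "BR"],
    "Country": [None, "IN"],
    "Title": ["DEV", "DEV"],
    "Department": ["Engineering", "Engineering"],
})
cols = ["Organization", "Region", "Country", "Title", "Department"]
userobject = ["ABC.com", "BR", "", "DEV", "Engineering"]

filled = df[cols].fillna("")      # null cells become ""
matches = filled.eq(userobject)   # element-wise test, list aligned with the columns
mask = matches.all(axis=1)        # True only where every column matched
result = df[mask]                 # keeps only the row with no Country
```

The same three steps, written as a single expression for the question's columns: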
cols = ['Organization', 'Region', 'Country', 'Title', 'Department']
df[df[cols].fillna('').eq(userobject).all(axis=1)]
https://stackoverflow.com/questions/71450242