Filter rows based on multiple column values that can contain one or more null values in Pandas

Stack Overflow user
Asked on 2022-03-12 13:55:47
Answers: 2 · Views: 1.3K · Following: 0 · Votes: 0

I have JSON in this format:

{
    "userobject": ["Nike", "NY", "", "Actor", "Artist"]
}

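I have a dataframe in which every column has a value except the Country column, which is empty; that is what the "" in the JSON above represents.

I want to get all rows from the dataframe whose values match those in the JSON. My code: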

# userobject is the list stored under the "userobject" key of the JSON above
org = userobject[0]
region = userobject[1]
country = userobject[2]
title = userobject[3]
dept = userobject[4]
newdf = df[(df['Organization'] == org) & (df['Region'] == region) & (df['Country'] == country) & (df['Title'] == title) & (df['Department'] == dept)]

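But this breaks my code; it only works when I remove the (df['Country'] == country) part. I need that clause, though, because my data has to be filtered dynamically from the JSON, and for some records the Country value will be present. Can anyone help?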

Update: the suggested code fails for the following case. The data looks like this:

    User    Organization    Region  Country Title   Department  Output
0   rp990.john1001@gmail.com    ABC.com BR  IN  DEV Engineering 1
1   rp1007.john1007@gmail.com   ABC.com BR  IN  DEV Engineering 1
2   rp1012.john1012@gmail.com   ABC.com BR  IN  DEV Engineering 1
3   rp1001.john1001@gmail.com   ABC.com BR  IN  DEV Engineering 1
4   rp1017.john1017@gmail.com   ABC.com BR  IN  DEV Engineering 1
5   rp1006.john1006@gmail.com   ABC.com BR  IN  DEV Engineering 1
6   rp1011.john1011@gmail.com   ABC.com BR  IN  DEV Engineering 1
7   rp1016.john1016@gmail.com   ABC.com BR  IN  DEV Engineering 1
8   rp1005.john1005@gmail.com   ABC.com BR  IN  DEV Engineering 1
9   rp1010.john1010@gmail.com                                  -1
10  rp1015.john1015@gmail.com   ABC.com BR  IN  DEV Engineering 1
11  rp1004.john1004@gmail.com   ABC.com BR  IN  DEV Engineering 1
12  rp1020.john1020@gmail.com   ABC.com BR  IN  DEV Engineering 1
13  rp1009.john1009@gmail.com   ABC.com BR  IN  DEV Engineering 1
14  rp1014.john1014@gmail.com   ABC.com BR  IN  DEV Engineering 1
15  rp1003.john1003@gmail.com   ABC.com BR  IN  DEV Engineering 1
16  rp1019.john1019@gmail.com   ABC.com BR  IN  DEV Engineering 1
17  rp1008.john1008@gmail.com   ABC.com BR  IN  DEV Engineering 1
18  rp1013.john1013@gmail.com   ABC.com BR  IN  DEV Engineering 1
19  rp1002.john1002@gmail.com   ABC.com BR  IN  DEV Engineering 1
20  rp1018.john1018@gmail.com   ABC.com BR  IN  DEV Engineering 1
21  rp1027.john1027@gmail.com   ABC.com BR  IN  DEV Engineering 1
22  rp1021.john1021@gmail.com   ABC.com BR  IN  DEV Engineering 1
23  rp1026.john1026@gmail.com   ABC.com BR  IN  DEV Engineering 1
24  rp1025.john1025@gmail.com   ABC.com BR  IN  DEV Engineering 1
25  rp1024.john1024@gmail.com   ABC.com BR  IN  DEV Engineering 1
26  rp1029.john1029@gmail.com   ABC.com BR  IN  DEV Engineering 1
27  rp1023.john1023@gmail.com   ABC.com BR  IN  DEV Engineering 1
28  rp1028.john1028@gmail.com   ABC.com BR  IN  DEV Engineering 1
29  rp1022.john1022@gmail.com   ABC.com BR  IN  DEV Engineering 1
30  rp1036.john1036@gmail.com   ABC.com BR  IN  DEV Engineering 1
31  rp1052.john1052@gmail.com                                  -1
32  rp1041.john1041@gmail.com   ABC.com BR  IN  DEV Engineering 1
33  rp1057.john1057@gmail.com                                  -1
34  rp1030.john1030@gmail.com   ABC.com BR      DEV Engineering -1
35  rp1046.john1046@gmail.com   ABC.com BR  IN  DEV MCA        -1
36  rp1035.john1035@gmail.com   ABC.com BR  IN  DEV Engineering 1
37  rp1051.john1051@gmail.com   ABC.com BR  IN  DEV Engineering 1
38  rp1040.john1040@gmail.com   ABC.com BR  IN  DEV MCA        -1
39  rp1056.john1056@gmail.com                                  -1
40  rp1045.john1045@gmail.com   ABC.com BR      DEV Engineering -1
41  rp1034.john1034@gmail.com   ABC.com BR  IN  DEV MAC         -1
42  rp1050.john1050@gmail.com   ABC.com BR  IN  DEV Engineering 1
43  rp1039.john1039@gmail.com   ABC.com BR  IN  DEV Engineering 1
44  rp1055.john1055@gmail.com                                  -1
45  rp1044.john1044@gmail.com   ABC.com BR  IN  DEV Engineering 1
46  rp1060.john1060@gmail.com                                  -1
47  rp1033.john1033@gmail.com   ABC.com BR  IN  DEV Engineering 1
48  rp1049.john1049@gmail.com   ABC.com BR  IN  DEV Engineering 1
49  rp1038.john1038@gmail.com   ABC.com BR  IN  DEV Engineering  1
50  rp1054.john1054@gmail.com                                   -1
51  rp1043.john1043@gmail.com   ABC.com KRI IN  DEV Engineering -1
52  rp1059.john1059@gmail.com                                   -1
53  rp1032.john1032@gmail.com   ABC.com KRI IN  DEV Engineering -1
54  rp1048.john1048@gmail.com   ABC.com BR  IN  DEV Engineering 1
55  rp1037.john1037@gmail.com   ABC.com     PH  DEV Engineering -1
56  rp1053.john1053@gmail.com                                   -1
57  rp1042.john1042@gmail.com   ABC.com BR  IN  DEV Engineering 1
58  rp1058.john1058@gmail.com                                   -1
59  rp1031.john1031@gmail.com   ABC.com BR  IN  DEV Engineering 1
60  rp1047.john1047@gmail.com   ABC.com BR  IN  DEV Engineering 1

The filter condition is:

"ABC.com", "BR", "", "DEV", "Engineering" for user rp1045.john1045@gmail.com


2 Answers

Stack Overflow user

Accepted answer

Posted on 2022-03-12 14:29:15

You can use reduce to build the condition dynamically:

from functools import reduce

import pandas as pd

cols = ['Organization', 'Region', 'Country', 'Title', 'Department']
json_dict = {"userobject": ["Nike", "NY", "", "Actor", "Artist"]}

# Start from an all-True mask and AND in one equality test per non-empty value.
cond = reduce(
    lambda r, p: r & (df[p[0]] == p[1]),
    ((c, v) for c, v in zip(cols, json_dict["userobject"]) if v != ""),
    pd.Series(True, index=df.index)
)
newdf = df[cond]

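reduce starts from the initial value r, which is its last argument: here a Series filled with True (pd.Series(True, index=df.index)). It then folds the iterable given as the second argument into that value, one item at a time, using the function given as the first argument. The items of the iterable are pairs (column name, corresponding value from userobject), written (c, v), filtered by value: a pair whose value is "" is dropped. The lambda receives the condition built so far, r, and the current pair p, and returns r & (df[p[0]] == p[1]).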
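
To make the fold concrete, here is a minimal sketch on a made-up four-row frame. The sample data and the helper name build_mask are assumptions for illustration, not part of the original answer. It builds the same kind of mask with an explicit loop, which is roughly what reduce does step by step.

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's table.
df = pd.DataFrame({
    "Organization": ["ABC.com", "ABC.com", "ABC.com", "XYZ.com"],
    "Region": ["BR", "BR", "KRI", "BR"],
    "Country": ["IN", None, "IN", "IN"],
    "Title": ["DEV", "DEV", "DEV", "DEV"],
    "Department": ["Engineering", "Engineering", "Engineering", "Engineering"],
})

cols = ["Organization", "Region", "Country", "Title", "Department"]
userobject = ["ABC.com", "BR", "", "DEV", "Engineering"]


def build_mask(frame, columns, values):
    """Loop equivalent of the reduce call: AND one comparison per non-empty value."""
    mask = pd.Series(True, index=frame.index)  # initial value: keep every row
    for col, val in zip(columns, values):
        if val == "":  # empty filter value: do not filter on this column
            continue
        mask &= frame[col] == val
    return mask


newdf = df[build_mask(df, cols, userobject)]
print(newdf.index.tolist())
```

Note that skipping a column means rows with any value in that column can match, so in this sketch both the row with Country "IN" and the row with a missing Country are kept. If an empty filter value should match only rows where the column is empty, a different test is needed.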

If you don't like this solution, here is a tweak of @ShubhamSharma's approach:

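cols = ['Organization', 'Region', 'Country', 'Title', 'Department']
json_dict = {"userobject": ["ABC.com", "BR", "", "DEV", "Engineering"]}

# Keep only the (column, value) pairs whose filter value is not empty.
col_val = [(c, v) for c, v in zip(cols, json_dict["userobject"]) if v != ""]
if col_val:
    # Unzip the pairs into a list of column names and a list of values.
    cols_used, values = map(list, zip(*col_val))
else:
    # Every filter value was empty, so no columns are compared.
    cols_used, values = [], []
newdf = df[df[cols_used].eq(values).all(axis=1)]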
Votes: 1

Stack Overflow user

Posted on 2022-03-12 14:40:15

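You first have to fill the null values with empty strings before you can create the mask. You can also simplify your code by using eq to compare the columns with the userobject list, then all to reduce the boolean mask along the column axis: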

cols = ['Organization', 'Region', 'Country', 'Title', 'Department']
# userobject is the list from the question's JSON; NaN cells become "" so they can match it
df[df[cols].fillna('').eq(userobject).all(axis=1)]
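
As a rough illustration of why the fillna step matters, the sketch below compares the mask with and without it. The three-row frame is invented for the example and assumes the empty Country cells are stored as NaN, which the question does not state explicitly.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; NaN stands for the missing Country values.
df = pd.DataFrame({
    "Organization": ["ABC.com", "ABC.com", "ABC.com"],
    "Region": ["BR", "BR", "BR"],
    "Country": ["IN", np.nan, np.nan],
    "Title": ["DEV", "DEV", "DEV"],
    "Department": ["Engineering", "Engineering", "MCA"],
})

cols = ["Organization", "Region", "Country", "Title", "Department"]
userobject = ["ABC.com", "BR", "", "DEV", "Engineering"]

# Without fillna, NaN == "" is False, so no row can match on Country.
raw_mask = df[cols].eq(userobject).all(axis=1)

# After fillna(""), missing cells compare equal to the "" in userobject.
filled_mask = df[cols].fillna("").eq(userobject).all(axis=1)

print(raw_mask.tolist())
print(filled_mask.tolist())
```

With this approach an empty filter value matches only rows where the column is itself empty, which differs from skipping the column altogether.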
Votes: -1
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/71450242
