Shift values up relative to a groupby

Stack Overflow user
Asked 2022-01-11 14:37:56
Answers: 1, Views: 57, Followers: 0, Score: 1

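I want to drop the NaN values in a pandas DataFrame and shift the remaining values up relative to a groupby on Category and Gender. Here is an example I created that resembles the data I am working with: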

import pandas as pd
test = {'Price':
        [20, 10, 'NaN', 'NaN',  'NaN', 'NaN',21, 11,'NaN', 'NaN', 'NaN','NaN'], 
        'Gender':
        ['womens-clothing','womens-clothing','womens-clothing','womens-clothing','womens-clothing','womens-clothing','mens-clothing','mens-clothing','mens-clothing','mens-clothing','mens-clothing','mens-clothing'],
        'Category':['dresses','dresses','dresses', 'dresses',  'dresses', 'dresses', 'jackets','jackets', 'jackets', 'jackets', 'jackets', 'jackets'],
        'Title':['NaN', 'NaN', 'Cheap Dress', 'First Dress', 'NaN', 'NaN','NaN', 'NaN','Main Jacket', 'Black Jacket','NaN', 'NaN'],
        'Review':['NaN','NaN','NaN','NaN',203,12,'NaN','NaN','NaN','NaN',201, 15]}

df = pd.DataFrame(test)

It looks like this:

    Price   Gender     Category Title         Review
0   20  womens-clothing dresses NaN             NaN
1   10  womens-clothing dresses NaN             NaN
2   NaN womens-clothing dresses Cheap Dress     NaN
3   NaN womens-clothing dresses First Dress     NaN
4   NaN womens-clothing dresses NaN             203
5   NaN womens-clothing dresses NaN             12
6   21  mens-clothing   jackets NaN             NaN
7   11  mens-clothing   jackets NaN             NaN
8   NaN mens-clothing   jackets Main Jacket     NaN
9   NaN mens-clothing   jackets Black Jacket    NaN
10  NaN mens-clothing   jackets NaN             201
11  NaN mens-clothing   jackets NaN             15

I want to drop the NaN values within each Gender and Category group and then shift the remaining cells up so that they line up like this:

    Price   Gender     Category Title         Review
0   20  womens-clothing dresses Cheap Dress     203
2   10  womens-clothing dresses First Dress     12
3   21  mens-clothing   jackets Main Jacket     201
4   11  mens-clothing   jackets Black Jacket    15

I tried:

data = df.apply(lambda x: pd.Series(x.drop(index=x[x[0] == 'NaN'], inplace=True).values))

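However, I cannot seem to drop specific rows this way, because these NaNs are strings. (In my real data they are actual NAs; I just don't know how to produce them in a dict for a reproducible example.)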

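If the NaNs are actual NAs, how do I get the expected output? I have tried including a groupby in the function above, but I can't use it on a numpy array. I can include it outside the function, but that did not help.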
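On the side question of how to put real missing values into a reproducible dict: the following is a minimal sketch, not part of the original post. The names `sample`, `sample_str`, `df_real`, and `df_converted` are made up for illustration. It shows two options: writing `np.nan` directly in the dict, or converting the 'NaN' placeholder strings after the frame is built.

```python
import numpy as np
import pandas as pd

# Option 1: write real missing values directly into the dict.
sample = {'Price': [20, np.nan], 'Title': [np.nan, 'Cheap Dress']}
df_real = pd.DataFrame(sample)

# Option 2: build the frame with 'NaN' strings, then mask them out.
# DataFrame.mask replaces every cell where the condition is True with NaN.
sample_str = {'Price': [20, 'NaN'], 'Title': ['NaN', 'Cheap Dress']}
df_str = pd.DataFrame(sample_str)
df_converted = df_str.mask(df_str == 'NaN')

# Both frames now report one missing value per column.
print(df_real.isna().sum().to_dict())
print(df_converted.isna().sum().to_dict())
```

With real missing values in place, `isna()` and `dropna()` work as expected, whereas a comparison such as `x != 'NaN'` is only meaningful for the string placeholders.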


1 Answer

Stack Overflow user

Accepted answer

Answered 2022-01-11 14:50:16

For the ideal data sample, where every column has the same number of non-NaN values in each group, use:

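# Within each group, keep only the values of each column that are not the 'NaN' string
f = lambda x: x.apply(lambda x: x[x!='NaN'])
# Move the keys into the index so only the value columns are filtered, then restore them
df = df.set_index(['Gender','Category']).groupby(['Gender','Category'], group_keys=False).apply(f).reset_index()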
print (df)
            Gender Category Price         Title Review
0    mens-clothing  jackets    21   Main Jacket    201
1    mens-clothing  jackets    11  Black Jacket     15
2  womens-clothing  dresses    20   Cheap Dress    203
3  womens-clothing  dresses    10   First Dress     12
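Whether a data set is "ideal" in this sense can be checked before choosing an approach. The sketch below is an illustration rather than part of the original answer: it assumes the NaNs are real missing values and uses a small made-up frame (`df_check`, with shortened Gender labels 'w' and 'm'). It counts the non-missing values per group and column and flags groups where the counts differ.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature data set with real missing values.
df_check = pd.DataFrame({
    'Gender': ['w', 'w', 'w', 'm', 'm', 'm'],
    'Price': [20, np.nan, np.nan, 21, 45, np.nan],
    'Title': [np.nan, 'Cheap Dress', np.nan, np.nan, np.nan, 'Main Jacket'],
})

# count() ignores missing values, so this is the number of
# non-missing entries per group and per column.
counts = df_check.groupby('Gender').count()

# A group is "ideal" when all of its columns hold the same count.
is_ideal = counts.nunique(axis=1).eq(1)

print(counts)
print(is_ideal)
```

Here group 'm' has two prices but only one title, so it is not ideal and the values cannot be lined up without padding.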

For general data, where the number of non-NaN values may differ between columns:

test = {'Price':
        [20, 10, 'NaN', 'NaN',  'NaN', 'NaN',21, 11,45, 'NaN', 'NaN','NaN'], 
        'Gender':
        ['womens-clothing','womens-clothing','womens-clothing','womens-clothing','womens-clothing','womens-clothing','mens-clothing','mens-clothing','mens-clothing','mens-clothing','mens-clothing','mens-clothing'],
        'Category':['dresses','dresses','dresses', 'dresses',  'dresses', 'dresses', 'jackets','jackets', 'jackets', 'jackets', 'jackets', 'jackets'],
        'Title':['NaN', 'NaN', 'Cheap Dress', 'First Dress', 'NaN', 'NaN','NaN', 'NaN','Main Jacket', 'Black Jacket','NaN', 'NaN'],
        'Review':['NaN','NaN','NaN','NaN',203,12,'NaN','NaN','NaN','NaN',201, 15]}

df = pd.DataFrame(test)

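# Rebuild each filtered column as a new Series so it is renumbered from 0;
# shorter columns are then padded with NaN when the columns are combined
f = lambda x: x.apply(lambda x: pd.Series(x[x!='NaN'].to_numpy()))
# If the NaNs are real missing values, use dropna() instead:
#f = lambda x: x.apply(lambda x: pd.Series(x.dropna().to_numpy()))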
df = (df.set_index(['Gender','Category'])
        .groupby(['Gender','Category'])
        .apply(f)
        .droplevel(-1)
        .reset_index())
print (df)
            Gender Category Price         Title Review
0    mens-clothing  jackets    21   Main Jacket    201
1    mens-clothing  jackets    11  Black Jacket     15
2    mens-clothing  jackets    45           NaN    NaN
3  womens-clothing  dresses    20   Cheap Dress    203
4  womens-clothing  dresses    10   First Dress     12
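For readers who prefer an explicit loop over the nested `apply`, the same idea can be sketched as follows. This is an alternative formulation, not the answerer's code: it assumes the NaNs are real missing values, and the helper `compact_group` and the variable names are hypothetical. Each group's columns are compacted with `dropna()` and renumbered from 0, so values line up row by row and shorter columns are padded with NaN.

```python
import numpy as np
import pandas as pd

def compact_group(group, value_cols):
    # Drop missing values column by column and renumber from 0;
    # concat along axis=1 aligns on the new index and pads with NaN.
    return pd.concat(
        {col: group[col].dropna().reset_index(drop=True) for col in value_cols},
        axis=1,
    )

keys = ['Gender', 'Category']
value_cols = ['Price', 'Title', 'Review']
nan = np.nan

# Same layout as the general example, with real missing values.
df_general = pd.DataFrame({
    'Price': [20, 10, nan, nan, nan, nan, 21, 11, 45, nan, nan, nan],
    'Gender': ['womens-clothing'] * 6 + ['mens-clothing'] * 6,
    'Category': ['dresses'] * 6 + ['jackets'] * 6,
    'Title': [nan, nan, 'Cheap Dress', 'First Dress', nan, nan,
              nan, nan, 'Main Jacket', 'Black Jacket', nan, nan],
    'Review': [nan, nan, nan, nan, 203, 12, nan, nan, nan, nan, 201, 15],
})

pieces = []
# sort=False keeps the groups in their order of first appearance.
for key_values, group in df_general.groupby(keys, sort=False):
    block = compact_group(group, value_cols)
    # Re-attach the group keys as ordinary columns.
    for name, value in zip(keys, key_values):
        block[name] = value
    pieces.append(block)

result = pd.concat(pieces, ignore_index=True)[keys + value_cols]
print(result)
```

Because `sort=False` is passed to `groupby`, the womens-clothing rows come first here, unlike in the output above where the groups are sorted by key.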
Score: 2
Original content provided by Stack Overflow. Original link: https://stackoverflow.com/questions/70668600
