首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >迭代多个列,并在行(增量)后用空值替换这些列中的值。

迭代多个列,并在行(增量)后用空值替换这些列中的值。
EN

Stack Overflow用户
提问于 2022-08-25 10:53:33
回答 2查看 88关注 0票数 1

给定数据df,如下所示:

代码语言:javascript
复制
          date      value     20211003     20211010     20211017
0    2021-9-19  3613.9663          NaN          NaN          NaN
1    2021-9-26  3613.0673          NaN          NaN          NaN
2    2021-10-3  3568.1668          NaN          NaN          NaN
3   2021-10-10  3592.1666  3510.221000          NaN          NaN
4   2021-10-17  3572.3662  3465.737012  3534.220800          NaN
5   2021-10-24  3582.6036  3479.107035  3539.856801  3514.420400
6   2021-10-31  3547.3361  3421.161235  3481.911001  3456.474600
7    2021-11-7  3491.5677  3370.140147  3439.284539  3416.621024
8   2021-11-14  3539.1002  3319.289523  3391.930037  3370.079953
9   2021-11-21  3560.3734  3261.343723  3333.984237  3312.134153
10  2021-11-28  3564.0894  3255.328902  3338.967086  3305.054247
11   2021-12-5  3607.4320  3313.274702  3396.912886  3363.000047
12  2021-12-12  3666.3479  3371.220502  3450.172564  3412.234440
13  2021-12-19  3632.3638          NaN  3466.930383  3428.683490
14  2021-12-26  3618.0535          NaN          NaN  3370.737690

假设value列之后的列(202110032021101020211017)是value的滚动预测结果,而不是每个列的10个值,我只需要保留3个值并得到如下结果:

代码语言:javascript
复制
         date      value     20211003     20211010     20211017
0   2021-9-19  3613.9663          NaN          NaN          NaN
1   2021-9-26  3613.0673          NaN          NaN          NaN
2   2021-10-3  3568.1668          NaN          NaN          NaN
3  2021-10-10  3592.1666  3510.221000          NaN          NaN
4  2021-10-17  3572.3662  3465.737012  3534.220800          NaN
5  2021-10-24  3582.6036  3479.107035  3539.856801  3514.420400
6  2021-10-31  3547.3361          NaN  3481.911001  3456.474600
7   2021-11-7  3491.5677          NaN          NaN  3416.621024

我怎么能在潘达斯实现这一点?谢谢。

EN

回答 2

Stack Overflow用户

回答已采纳

发布于 2022-08-25 11:09:26

假设每个列都有一个扩展,则使用非NA值的cumcount来确定要保留的值,然后对每一列进行切片和重新索引:

代码语言:javascript
复制
cols = ['20211003', '20211010', '20211017']
N = 3

# compute cumcount of non-NAs, leave NAs as NA
cc = df[cols].apply(lambda s: s.groupby((m:=s.notna())).cumcount().where(m))

# get the mask for the date/values columns
m1 = ~cc.min(1).gt(N)
# compute the mask for the first N non-NA
masks = ~cc.ge(N)

# mask and reindex per column with default m1 for date/values
out = df.apply(lambda s: s[masks[s.name] if s.name in masks else m1]
                         .reset_index(drop=True))

产出:

代码语言:javascript
复制
         date      value     20211003     20211010     20211017
0   2021-9-19  3613.9663          NaN          NaN          NaN
1   2021-9-26  3613.0673          NaN          NaN          NaN
2   2021-10-3  3568.1668          NaN          NaN          NaN
3  2021-10-10  3592.1666  3510.221000          NaN          NaN
4  2021-10-17  3572.3662  3465.737012  3534.220800          NaN
5  2021-10-24  3582.6036  3479.107035  3539.856801  3514.420400
6  2021-10-31  3547.3361          NaN  3481.911001  3456.474600
7   2021-11-7  3491.5677          NaN          NaN  3416.621024
8  2021-11-14  3539.1002          NaN          NaN          NaN
票数 1
EN

Stack Overflow用户

发布于 2022-08-26 05:29:02

备选案文1:

代码语言:javascript
复制
a = df.iloc[:, 2:].apply(lambda x:x.dropna().head(3))
df.iloc[df.index <= a.index.max(),:2].join(a)

退出:

代码语言:javascript
复制
        date      value     20211003     20211010     20211017
0   2021-9-19  3613.9663          NaN          NaN          NaN
1   2021-9-26  3613.0673          NaN          NaN          NaN
2   2021-10-3  3568.1668          NaN          NaN          NaN
3  2021-10-10  3592.1666  3510.221000          NaN          NaN
4  2021-10-17  3572.3662  3465.737012  3534.220800          NaN
5  2021-10-24  3582.6036  3479.107035  3539.856801  3514.420400
6  2021-10-31  3547.3361          NaN  3481.911001  3456.474600
7   2021-11-7  3491.5677          NaN          NaN  3416.621024

备选案文2:

代码语言:javascript
复制
df.iloc[:, 2:].apply(lambda x:x.dropna().head(3)).pipe(lambda x: df.iloc[df.index<=x.index.max(), :2].join(x))

退出:

代码语言:javascript
复制
         date      value     20211003     20211010     20211017
0   2021-9-19  3613.9663          NaN          NaN          NaN
1   2021-9-26  3613.0673          NaN          NaN          NaN
2   2021-10-3  3568.1668          NaN          NaN          NaN
3  2021-10-10  3592.1666  3510.221000          NaN          NaN
4  2021-10-17  3572.3662  3465.737012  3534.220800          NaN
5  2021-10-24  3582.6036  3479.107035  3539.856801  3514.420400
6  2021-10-31  3547.3361          NaN  3481.911001  3456.474600
7   2021-11-7  3491.5677          NaN          NaN  3416.621024

备选方案3:

代码语言:javascript
复制
df.iloc[:, 2:].apply(lambda x:x.dropna().head(3)).join(df.iloc[:,:2])

退出:

代码语言:javascript
复制
      20211003     20211010     20211017        date      value
3  3510.221000          NaN          NaN  2021-10-10  3592.1666
4  3465.737012  3534.220800          NaN  2021-10-17  3572.3662
5  3479.107035  3539.856801  3514.420400  2021-10-24  3582.6036
6          NaN  3481.911001  3456.474600  2021-10-31  3547.3361
7          NaN          NaN  3416.621024   2021-11-7  3491.5677
票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/73486252

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档