首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >为每个客户机添加缺少的行- Python / Pandas

为每个客户机添加缺少的行- Python / Pandas
EN

Stack Overflow用户
提问于 2022-03-31 20:12:19
回答 1查看 44关注 0票数 0

我有几个星期,几个月,几年的df。

代码语言:javascript
复制
week = ['01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022', '01/17/2022 - 01/23/2022', '01/24/2022 - 01/30/2022']
month = ["January", "January", "January", "January"]
year = [2022, 2022, 2022, 2022]

myDict = {}

myDict["Week"] = week
myDict["Month"] = month 
myDict["Year"] = year

dates_df = pd.DataFrame(data=myDict)
dates_df
代码语言:javascript
复制
         Week              Month    Year

01/03/2022 - 01/09/2022   January   2022
01/10/2022 - 01/16/2022   January   2022
01/17/2022 - 01/23/2022   January   2022
01/24/2022 - 01/30/2022   January   2022

数据看起来类似于(以前分组):

代码语言:javascript
复制
test_data = {'CLient Id': [1,1,1,1,2,2,2,3,3],
    'Client Name': ['Tom Holland', 'Tom Holland', 'Tom Holland', 'Tom Holland', 'Brad Pitt', 'Brad Pitt', 'Brad Pitt', 'Anna Delvey', 'Anna Delvey'],
    'City': ['New York', 'New York', 'New York', 'New York', 'Los Angeles', 'Los Angeles', 'Los Angeles', 'New York', 'New York'],    
    'Week': ['01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022', '01/17/2022 - 01/23/2022', '01/24/2022 - 01/30/2022',
            '01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022', '01/24/2022 - 01/30/2022', '01/03/2022 - 01/09/2022',
            '01/24/2022 - 01/30/2022'], 
    'Month': ['January', 'January', 'January', 'January', 'January', 'January', 'January', 'January', 'January'], 
    'Year': [2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022],
    'Spent': [1000, 750, 1200, 850, 777, 1500, 1200, 1500, 1200]}

test_df = pd.DataFrame(data=test_data)
test_df
代码语言:javascript
复制
Client Id    Client Name          City                Week               Month    Year   Spent

1            Tom Holland       New York      01/03/2022 - 01/09/2022    January   2022   1000
1            Tom Holland       New York      01/10/2022 - 01/16/2022    January   2022   750
1            Tom Holland       New York      01/17/2022 - 01/23/2022    January   2022   1200
1            Tom Holland       New York      01/24/2022 - 01/30/2022    January   2022   850
2            Brad Pitt         Los Angeles   01/03/2022 - 01/09/2022    January   2022   777
2            Brad Pitt         Los Angeles   01/10/2022 - 01/16/2022    January   2022   1500
2            Brad Pitt         Los Angeles   01/24/2022 - 01/30/2022    January   2022   1200
3            Anna Delvey       New York      01/03/2022 - 01/09/2022    January   2022   1500
3            Anna Delvey       New York      01/24/2022 - 01/30/2022    January   2022   1200

我需要为每个缺少一周的客户端(Id)创建额外的客户端名称行,即城市、周、月和年,在本例中,df + be列应该是0。

我需要的产出:

代码语言:javascript
复制
Client Id    Client Name          City                  Week               Month    Year   Spent

1            Tom Holland        New York       01/03/2022 - 01/09/2022    January   2022   1000
1            Tom Holland        New York       01/10/2022 - 01/16/2022    January   2022   750
1            Tom Holland        New York       01/17/2022 - 01/23/2022    January   2022   1200
1            Tom Holland        New York       01/24/2022 - 01/30/2022    January   2022   850
2            Brad Pitt          Los Angeles    01/03/2022 - 01/09/2022    January   2022   777
2            Brad Pitt          Los Angeles    01/10/2022 - 01/16/2022    January   2022   1500
2            Brad Pitt          Los Angeles    01/17/2022 - 01/23/2022    January   2022   0
2            Brad Pitt          Los Angeles    01/24/2022 - 01/30/2022    January   2022   1200
3            Anna Delvey        New York       01/03/2022 - 01/09/2022    January   2022   1500
3            Anna Delvey        New York       01/10/2022 - 01/16/2022    January   2022   0
3            Anna Delvey        New York       01/17/2022 - 01/23/2022    January   2022   0
3            Anna Delvey        New York       01/24/2022 - 01/30/2022    January   2022   1200

我试着对它进行编码,但没有运气。

EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2022-03-31 20:19:51

您可以使用pivot + reindex + fillna (获取丢失的数据)+ stack (返回到以前的形状):

代码语言:javascript
复制
columns = ['Week','Month','Year']
out = (test_df.pivot(['CLient Id', 'Client Name', 'City'], columns, ['Spent'])
       .reindex(pd.MultiIndex.from_arrays(dates_df.assign(Spent='Spent').to_numpy()[:, [-1,0,1,2]].T, 
                                          names=[None]+columns), axis=1)
       .fillna(0).stack(level=columns).reset_index())

输出:

代码语言:javascript
复制
    CLient Id  Client Name         City                     Week    Month  Year   Spent  
0           1  Tom Holland     New York  01/03/2022 - 01/09/2022  January  2022  1000.0  
1           1  Tom Holland     New York  01/10/2022 - 01/16/2022  January  2022   750.0  
2           1  Tom Holland     New York  01/17/2022 - 01/23/2022  January  2022  1200.0  
3           1  Tom Holland     New York  01/24/2022 - 01/30/2022  January  2022   850.0  
4           2    Brad Pitt  Los Angeles  01/03/2022 - 01/09/2022  January  2022   777.0  
5           2    Brad Pitt  Los Angeles  01/10/2022 - 01/16/2022  January  2022  1500.0  
6           2    Brad Pitt  Los Angeles  01/17/2022 - 01/23/2022  January  2022     0.0  
7           2    Brad Pitt  Los Angeles  01/24/2022 - 01/30/2022  January  2022  1200.0  
8           3  Anna Delvey     New York  01/03/2022 - 01/09/2022  January  2022  1500.0  
9           3  Anna Delvey     New York  01/10/2022 - 01/16/2022  January  2022     0.0  
10          3  Anna Delvey     New York  01/17/2022 - 01/23/2022  January  2022     0.0  
11          3  Anna Delvey     New York  01/24/2022 - 01/30/2022  January  2022  1200.0  

如果有多个列要填充0,则可以使用:

代码语言:javascript
复制
columns = ['Week','Month','Year']
value_columns = ['Spent', ...]
new_df = (pd.DataFrame({'key':[1]*len(value_columns),'New':value_columns})
          .merge(dates_df.assign(key=1)).drop(columns='key'))
out = (test_df.pivot(['CLient Id', 'Client Name', 'City'], columns, value_columns)
       .reindex(pd.MultiIndex.from_arrays(zip(*new_df.to_numpy()), 
                                          names=[None] + columns), axis=1)
       .fillna(0).stack(level=columns).reset_index())
票数 2
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/71698633

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档