首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >每月为与客户端关联的所有行指定正确的限定条件- Python / Pandas

每月为与客户端关联的所有行指定正确的限定条件- Python / Pandas
EN

Stack Overflow用户
提问于 2022-04-01 20:43:35
回答 2查看 45关注 0票数 2

如果满足所有关联行的条件,我需要每月为与客户关联的所有行分配正确的值(限定的不限定的)。

代码语言:javascript
复制
test_data = {'Client Id': [1,1,1,1,1,1,1,1, 
                           2,2,2,2,2,2,2,2],
            'Client Name': ['Tom Holland', 'Tom Holland', 'Tom Holland', 'Tom Holland', 
                           'Tom Holland', 'Tom Holland', 'Tom Holland', 'Tom Holland',
                           'Brad Pitt', 'Brad Pitt', 'Brad Pitt', 'Brad Pitt',       
                           'Brad Pitt', 'Brad Pitt', 'Brad Pitt', 'Brad Pitt',],
             'Week': ['01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022', 
                     '01/17/2022 - 01/23/2022', '01/24/2022 - 01/30/2022',
                     '01/31/2022 - 02/06/2022', '02/07/2022 - 02/13/2022',     
                     '02/14/2022 - 02/20/2022','02/21/2022 - 02/27/2022',
                     '01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022', 
                     '01/17/2022 - 01/23/2022', '01/24/2022 - 01/30/2022',
                     '01/31/2022 - 02/06/2022', '02/07/2022 - 02/13/2022', 
                     '02/14/2022 - 02/20/2022','02/21/2022 - 02/27/2022'], 
            'Month': ['January', 'January', 'January', 'January',              
                      "February", "February", "February", "February",
                      'January', 'January', 'January', 'January', 
                      "February", "February", "February", "February"], 
            'Year': [2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022,
                     2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022],
            'Payment Status': ["Pending", "Paid in Full", "Didn't Paid", "Paid in Full", 
                              "Paid in Full", "Paid in Full", "Paid in Full", 
                              "Paid in Full", "Paid in Full", "Paid in Full", 
                              "Paid in Full", "Paid in Full", "Paid in Full", 
                              "Paid in Full", "Paid in Full", "Pending"]}

test_df = pd.DataFrame(data=test_data)

数据:

代码语言:javascript
复制
Client Id      Client Name           Week              Month    Year    Payment Status

    1         Tom Holland   01/03/2022 - 01/09/2022   January   2022       Pending
    1         Tom Holland   01/10/2022 - 01/16/2022   January   2022    Paid in Full
    1         Tom Holland   01/17/2022 - 01/23/2022   January   2022     Didn't Paid
    1         Tom Holland   01/24/2022 - 01/30/2022   January   2022    Paid in Full
    1         Tom Holland   01/31/2022 - 02/06/2022   February  2022    Paid in Full
    1         Tom Holland   02/07/2022 - 02/13/2022   February  2022    Paid in Full
    1         Tom Holland   02/14/2022 - 02/20/2022   February  2022    Paid in Full
    1         Tom Holland   02/21/2022 - 02/27/2022   February  2022    Paid in Full
    2          Brad Pitt    01/03/2022 - 01/09/2022   January   2022    Paid in Full
    2          Brad Pitt    01/10/2022 - 01/16/2022   January   2022    Paid in Full
    2          Brad Pitt    01/17/2022 - 01/23/2022   January   2022    Paid in Full
    2          Brad Pitt    01/24/2022 - 01/30/2022   January   2022    Paid in Full
    2          Brad Pitt    01/31/2022 - 02/06/2022   February  2022    Paid in Full
    2          Brad Pitt    02/07/2022 - 02/13/2022   February  2022    Paid in Full
    2          Brad Pitt    02/14/2022 - 02/20/2022   February  2022    Paid in Full
    2          Brad Pitt    02/21/2022 - 02/27/2022   February  2022      Pending

如果每个月与客户端关联的每一行(周)都是完全支付的,那么每个月分配给与客户关联的所有行(周)的限定。即使1周不是全额支付(3周可以全额支付,但1周没有支付或未决),那么分配给的所有行都不是合格的。

期望产出:

代码语言:javascript
复制
Client Id  Client Name           Week              Month    Year    Payment Status  Qualification

    1      Tom Holland  01/03/2022 - 01/09/2022   January   2022       Pending     Not Qualified
    1      Tom Holland  01/10/2022 - 01/16/2022   January   2022    Paid in Full   Not Qualified
    1      Tom Holland  01/17/2022 - 01/23/2022   January   2022     Didn't Paid   Not Qualified
    1      Tom Holland  01/24/2022 - 01/30/2022   January   2022    Paid in Full   Not Qualified
    1      Tom Holland  01/31/2022 - 02/06/2022   February  2022    Paid in Full    Qualified
    1      Tom Holland  02/07/2022 - 02/13/2022   February  2022    Paid in Full    Qualified
    1      Tom Holland  02/14/2022 - 02/20/2022   February  2022    Paid in Full    Qualified
    1      Tom Holland  02/21/2022 - 02/27/2022   February  2022    Paid in Full    Qualified
    2       Brad Pitt   01/03/2022 - 01/09/2022   January   2022    Paid in Full    Qualified
    2       Brad Pitt   01/10/2022 - 01/16/2022   January   2022    Paid in Full    Qualified
    2       Brad Pitt   01/17/2022 - 01/23/2022   January   2022    Paid in Full    Qualified
    2       Brad Pitt   01/24/2022 - 01/30/2022   January   2022    Paid in Full    Qualified
    2       Brad Pitt   01/31/2022 - 02/06/2022   February  2022    Paid in Full   Not Qualified
    2       Brad Pitt   02/07/2022 - 02/13/2022   February  2022    Paid in Full   Not Qualified
    2       Brad Pitt   02/14/2022 - 02/20/2022   February  2022    Paid in Full   Not Qualified
    2       Brad Pitt   02/21/2022 - 02/27/2022   February  2022      Pending      Not Qualified

我不知道如何实现这一点,我认为循环中的value_counts:

代码语言:javascript
复制
for name, month in zip(list(test_df["Client Name"].unique()), list(test_df["Month"])):
    print(test_df[(test_df["Client Name"] == name) & (test_df["Month"] == month)].value_counts(["Payment Status"]))
EN

回答 2

Stack Overflow用户

回答已采纳

发布于 2022-04-01 21:04:46

关键是创建一个布尔掩码:如果Payment Status是“全额支付”,那么True else False。现在您可以按Client IdMonthYear分组,以检查是否所有值都是True。使用transform将结果广播到组的每一行。最后,将True/False替换为其各自的值。

通过向dataframe添加一个新列is_paid来动态创建布尔掩码:

代码语言:javascript
复制
df['Qualification'] = (
    df.assign(is_paid=df['Payment Status'] == 'Paid in Full')
      .groupby(['Client Id', 'Month', 'Year'])['is_paid']
      .transform('all').replace({True: 'Qualified', False: 'Not Qualified'})
)
print(df)

# Output
    Client Id  Client Name                     Week     Month  Year Payment Status  Qualification
0           1  Tom Holland  01/03/2022 - 01/09/2022   January  2022        Pending  Not Qualified
1           1  Tom Holland  01/10/2022 - 01/16/2022   January  2022   Paid in Full  Not Qualified
2           1  Tom Holland  01/17/2022 - 01/23/2022   January  2022    Didn't Paid  Not Qualified
3           1  Tom Holland  01/24/2022 - 01/30/2022   January  2022   Paid in Full  Not Qualified
4           1  Tom Holland  01/31/2022 - 02/06/2022  February  2022   Paid in Full      Qualified
5           1  Tom Holland  02/07/2022 - 02/13/2022  February  2022   Paid in Full      Qualified
6           1  Tom Holland  02/14/2022 - 02/20/2022  February  2022   Paid in Full      Qualified
7           1  Tom Holland  02/21/2022 - 02/27/2022  February  2022   Paid in Full      Qualified
8           2    Brad Pitt  01/03/2022 - 01/09/2022   January  2022   Paid in Full      Qualified
9           2    Brad Pitt  01/10/2022 - 01/16/2022   January  2022   Paid in Full      Qualified
10          2    Brad Pitt  01/17/2022 - 01/23/2022   January  2022   Paid in Full      Qualified
11          2    Brad Pitt  01/24/2022 - 01/30/2022   January  2022   Paid in Full      Qualified
12          2    Brad Pitt  01/31/2022 - 02/06/2022  February  2022   Paid in Full  Not Qualified
13          2    Brad Pitt  02/07/2022 - 02/13/2022  February  2022   Paid in Full  Not Qualified
14          2    Brad Pitt  02/14/2022 - 02/20/2022  February  2022   Paid in Full  Not Qualified
15          2    Brad Pitt  02/21/2022 - 02/27/2022  February  2022        Pending  Not Qualified
票数 4
EN

Stack Overflow用户

发布于 2022-04-01 21:04:19

首先将Payment Status转换为bools:

代码语言:javascript
复制
test_df['Paid'] = test_df['Payment Status'] == 'Paid in Full'
代码语言:javascript
复制
>>> test_df
Client Id   Client Name Week    Month   Year    Payment         Status          Paid
1           Tom Holland 01/03/2022 - 01/09/2022 January 2022    Pending         FALSE
1           Tom Holland 01/10/2022 - 01/16/2022 January 2022    Paid in Full    TRUE
1           Tom Holland 01/17/2022 - 01/23/2022 January 2022    Didnt Paid      FALSE
1           Tom Holland 01/24/2022 - 01/30/2022 January 2022    Paid in Full    TRUE
1           Tom Holland 01/31/2022 - 02/06/2022 February2022    Paid in Full    TRUE
1           Tom Holland 02/07/2022 - 02/13/2022 February2022    Paid in Full    TRUE
1           Tom Holland 02/14/2022 - 02/20/2022 February2022    Paid in Full    TRUE
1           Tom Holland 02/21/2022 - 02/27/2022 February2022    Paid in Full    TRUE
2           Brad Pitt   01/03/2022 - 01/09/2022 January 2022    Paid in Full    TRUE
2           Brad Pitt   01/10/2022 - 01/16/2022 January 2022    Paid in Full    TRUE
2           Brad Pitt   01/17/2022 - 01/23/2022 January 2022    Paid in Full    TRUE
2           Brad Pitt   01/24/2022 - 01/30/2022 January 2022    Paid in Full    TRUE
2           Brad Pitt   01/31/2022 - 02/06/2022 February2022    Paid in Full    TRUE
2           Brad Pitt   02/07/2022 - 02/13/2022 February2022    Paid in Full    TRUE
2           Brad Pitt   02/14/2022 - 02/20/2022 February2022    Paid in Full    TRUE
2           Brad Pitt   02/21/2022 - 02/27/2022 February2022    Pending         FALSE

MonthClient Id对它们进行分组并检查组中的所有Paid值的是True

代码语言:javascript
复制
status = test_df[["Client Id", "Month", "Paid"]].groupby(["Month", "Client Id"]).all()
代码语言:javascript
复制
>>> status
                      Paid
Month     Client Id 
February  1           True
          2           False
January   1           False
          2           True

现在重置索引并将Paid转换回文本(QualifiedNot Qualified):

代码语言:javascript
复制
status = status.reset_index()
status['Paid'] = status['Paid'].map({True: 'Qualified', False:"Not Qualified"})
代码语言:javascript
复制
>>> status

Month   Client Id   Paid
February    1       Qualified
February    2       Not Qualified
January     1       Not Qualified
January     2       Qualified

现在与原始表合并以获得所需的结果(并删除通过合并生成的不必要的列。还重命名新列:

代码语言:javascript
复制
output = pd.merge(test_df, a, on=['Client Id', 'Month']).drop(columns='Paid_x')
output = output.rename(columns={'Paid_y': 'Qualification'})
代码语言:javascript
复制
>>> output
Client Id   Client Name Week                    Month   Year    Payment Status  Qualification
1           Tom Holland 01/03/2022 - 01/09/2022 January 2022    Pending         Not Qualified
1           Tom Holland 01/10/2022 - 01/16/2022 January 2022    Paid in Full    Not Qualified
1           Tom Holland 01/17/2022 - 01/23/2022 January 2022    Didnt Paid      Not Qualified
1           Tom Holland 01/24/2022 - 01/30/2022 January 2022    Paid in Full    Not Qualified
1           Tom Holland 01/31/2022 - 02/06/2022 February2022    Paid in Full    Qualified
1           Tom Holland 02/07/2022 - 02/13/2022 February2022    Paid in Full    Qualified
1           Tom Holland 02/14/2022 - 02/20/2022 February2022    Paid in Full    Qualified
1           Tom Holland 02/21/2022 - 02/27/2022 February2022    Paid in Full    Qualified
2           Brad Pitt   01/03/2022 - 01/09/2022 January 2022    Paid in Full    Qualified
2           Brad Pitt   01/10/2022 - 01/16/2022 January 2022    Paid in Full    Qualified
2           Brad Pitt   01/17/2022 - 01/23/2022 January 2022    Paid in Full    Qualified
2           Brad Pitt   01/24/2022 - 01/30/2022 January 2022    Paid in Full    Qualified
2           Brad Pitt   01/31/2022 - 02/06/2022 February2022    Paid in Full    Not Qualified
2           Brad Pitt   02/07/2022 - 02/13/2022 February2022    Paid in Full    Not Qualified
2           Brad Pitt   02/14/2022 - 02/20/2022 February2022    Paid in Full    Not Qualified
2           Brad Pitt   02/21/2022 - 02/27/2022 February2022    Pending Not     Qualified
票数 2
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/71712570

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档