如果满足所有关联行的条件,我需要每月为与客户关联的所有行分配正确的值(限定的或不限定的)。
test_data = {'Client Id': [1,1,1,1,1,1,1,1,
2,2,2,2,2,2,2,2],
'Client Name': ['Tom Holland', 'Tom Holland', 'Tom Holland', 'Tom Holland',
'Tom Holland', 'Tom Holland', 'Tom Holland', 'Tom Holland',
'Brad Pitt', 'Brad Pitt', 'Brad Pitt', 'Brad Pitt',
'Brad Pitt', 'Brad Pitt', 'Brad Pitt', 'Brad Pitt',],
'Week': ['01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022',
'01/17/2022 - 01/23/2022', '01/24/2022 - 01/30/2022',
'01/31/2022 - 02/06/2022', '02/07/2022 - 02/13/2022',
'02/14/2022 - 02/20/2022','02/21/2022 - 02/27/2022',
'01/03/2022 - 01/09/2022', '01/10/2022 - 01/16/2022',
'01/17/2022 - 01/23/2022', '01/24/2022 - 01/30/2022',
'01/31/2022 - 02/06/2022', '02/07/2022 - 02/13/2022',
'02/14/2022 - 02/20/2022','02/21/2022 - 02/27/2022'],
'Month': ['January', 'January', 'January', 'January',
"February", "February", "February", "February",
'January', 'January', 'January', 'January',
"February", "February", "February", "February"],
'Year': [2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022,
2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022],
'Payment Status': ["Pending", "Paid in Full", "Didn't Paid", "Paid in Full",
"Paid in Full", "Paid in Full", "Paid in Full",
"Paid in Full", "Paid in Full", "Paid in Full",
"Paid in Full", "Paid in Full", "Paid in Full",
"Paid in Full", "Paid in Full", "Pending"]}
test_df = pd.DataFrame(data=test_data)数据:
Client Id Client Name Week Month Year Payment Status
1 Tom Holland 01/03/2022 - 01/09/2022 January 2022 Pending
1 Tom Holland 01/10/2022 - 01/16/2022 January 2022 Paid in Full
1 Tom Holland 01/17/2022 - 01/23/2022 January 2022 Didn't Paid
1 Tom Holland 01/24/2022 - 01/30/2022 January 2022 Paid in Full
1 Tom Holland 01/31/2022 - 02/06/2022 February 2022 Paid in Full
1 Tom Holland 02/07/2022 - 02/13/2022 February 2022 Paid in Full
1 Tom Holland 02/14/2022 - 02/20/2022 February 2022 Paid in Full
1 Tom Holland 02/21/2022 - 02/27/2022 February 2022 Paid in Full
2 Brad Pitt 01/03/2022 - 01/09/2022 January 2022 Paid in Full
2 Brad Pitt 01/10/2022 - 01/16/2022 January 2022 Paid in Full
2 Brad Pitt 01/17/2022 - 01/23/2022 January 2022 Paid in Full
2 Brad Pitt 01/24/2022 - 01/30/2022 January 2022 Paid in Full
2 Brad Pitt 01/31/2022 - 02/06/2022 February 2022 Paid in Full
2 Brad Pitt 02/07/2022 - 02/13/2022 February 2022 Paid in Full
2 Brad Pitt 02/14/2022 - 02/20/2022 February 2022 Paid in Full
2 Brad Pitt 02/21/2022 - 02/27/2022 February 2022 Pending如果每个月与客户端关联的每一行(周)都是完全支付的,那么每个月分配给与客户关联的所有行(周)的限定。即使1周不是全额支付(3周可以全额支付,但1周没有支付或未决),那么分配给的所有行都不是合格的。
期望产出:
Client Id Client Name Week Month Year Payment Status Qualification
1 Tom Holland 01/03/2022 - 01/09/2022 January 2022 Pending Not Qualified
1 Tom Holland 01/10/2022 - 01/16/2022 January 2022 Paid in Full Not Qualified
1 Tom Holland 01/17/2022 - 01/23/2022 January 2022 Didn't Paid Not Qualified
1 Tom Holland 01/24/2022 - 01/30/2022 January 2022 Paid in Full Not Qualified
1 Tom Holland 01/31/2022 - 02/06/2022 February 2022 Paid in Full Qualified
1 Tom Holland 02/07/2022 - 02/13/2022 February 2022 Paid in Full Qualified
1 Tom Holland 02/14/2022 - 02/20/2022 February 2022 Paid in Full Qualified
1 Tom Holland 02/21/2022 - 02/27/2022 February 2022 Paid in Full Qualified
2 Brad Pitt 01/03/2022 - 01/09/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/10/2022 - 01/16/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/17/2022 - 01/23/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/24/2022 - 01/30/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/31/2022 - 02/06/2022 February 2022 Paid in Full Not Qualified
2 Brad Pitt 02/07/2022 - 02/13/2022 February 2022 Paid in Full Not Qualified
2 Brad Pitt 02/14/2022 - 02/20/2022 February 2022 Paid in Full Not Qualified
2 Brad Pitt 02/21/2022 - 02/27/2022 February 2022 Pending Not Qualified我不知道如何实现这一点,我认为循环中的value_counts:
for name, month in zip(list(test_df["Client Name"].unique()), list(test_df["Month"])):
print(test_df[(test_df["Client Name"] == name) & (test_df["Month"] == month)].value_counts(["Payment Status"]))发布于 2022-04-01 21:04:46
关键是创建一个布尔掩码:如果Payment Status是“全额支付”,那么True else False。现在您可以按Client Id、Month和Year分组,以检查是否所有值都是True。使用transform将结果广播到组的每一行。最后,将True/False替换为其各自的值。
通过向dataframe添加一个新列is_paid来动态创建布尔掩码:
df['Qualification'] = (
df.assign(is_paid=df['Payment Status'] == 'Paid in Full')
.groupby(['Client Id', 'Month', 'Year'])['is_paid']
.transform('all').replace({True: 'Qualified', False: 'Not Qualified'})
)
print(df)
# Output
Client Id Client Name Week Month Year Payment Status Qualification
0 1 Tom Holland 01/03/2022 - 01/09/2022 January 2022 Pending Not Qualified
1 1 Tom Holland 01/10/2022 - 01/16/2022 January 2022 Paid in Full Not Qualified
2 1 Tom Holland 01/17/2022 - 01/23/2022 January 2022 Didn't Paid Not Qualified
3 1 Tom Holland 01/24/2022 - 01/30/2022 January 2022 Paid in Full Not Qualified
4 1 Tom Holland 01/31/2022 - 02/06/2022 February 2022 Paid in Full Qualified
5 1 Tom Holland 02/07/2022 - 02/13/2022 February 2022 Paid in Full Qualified
6 1 Tom Holland 02/14/2022 - 02/20/2022 February 2022 Paid in Full Qualified
7 1 Tom Holland 02/21/2022 - 02/27/2022 February 2022 Paid in Full Qualified
8 2 Brad Pitt 01/03/2022 - 01/09/2022 January 2022 Paid in Full Qualified
9 2 Brad Pitt 01/10/2022 - 01/16/2022 January 2022 Paid in Full Qualified
10 2 Brad Pitt 01/17/2022 - 01/23/2022 January 2022 Paid in Full Qualified
11 2 Brad Pitt 01/24/2022 - 01/30/2022 January 2022 Paid in Full Qualified
12 2 Brad Pitt 01/31/2022 - 02/06/2022 February 2022 Paid in Full Not Qualified
13 2 Brad Pitt 02/07/2022 - 02/13/2022 February 2022 Paid in Full Not Qualified
14 2 Brad Pitt 02/14/2022 - 02/20/2022 February 2022 Paid in Full Not Qualified
15 2 Brad Pitt 02/21/2022 - 02/27/2022 February 2022 Pending Not Qualified发布于 2022-04-01 21:04:19
首先将Payment Status转换为bools:
test_df['Paid'] = test_df['Payment Status'] == 'Paid in Full'>>> test_df
Client Id Client Name Week Month Year Payment Status Paid
1 Tom Holland 01/03/2022 - 01/09/2022 January 2022 Pending FALSE
1 Tom Holland 01/10/2022 - 01/16/2022 January 2022 Paid in Full TRUE
1 Tom Holland 01/17/2022 - 01/23/2022 January 2022 Didnt Paid FALSE
1 Tom Holland 01/24/2022 - 01/30/2022 January 2022 Paid in Full TRUE
1 Tom Holland 01/31/2022 - 02/06/2022 February2022 Paid in Full TRUE
1 Tom Holland 02/07/2022 - 02/13/2022 February2022 Paid in Full TRUE
1 Tom Holland 02/14/2022 - 02/20/2022 February2022 Paid in Full TRUE
1 Tom Holland 02/21/2022 - 02/27/2022 February2022 Paid in Full TRUE
2 Brad Pitt 01/03/2022 - 01/09/2022 January 2022 Paid in Full TRUE
2 Brad Pitt 01/10/2022 - 01/16/2022 January 2022 Paid in Full TRUE
2 Brad Pitt 01/17/2022 - 01/23/2022 January 2022 Paid in Full TRUE
2 Brad Pitt 01/24/2022 - 01/30/2022 January 2022 Paid in Full TRUE
2 Brad Pitt 01/31/2022 - 02/06/2022 February2022 Paid in Full TRUE
2 Brad Pitt 02/07/2022 - 02/13/2022 February2022 Paid in Full TRUE
2 Brad Pitt 02/14/2022 - 02/20/2022 February2022 Paid in Full TRUE
2 Brad Pitt 02/21/2022 - 02/27/2022 February2022 Pending FALSE由Month和Client Id对它们进行分组并检查组中的所有Paid值的是True。
status = test_df[["Client Id", "Month", "Paid"]].groupby(["Month", "Client Id"]).all()>>> status
Paid
Month Client Id
February 1 True
2 False
January 1 False
2 True现在重置索引并将Paid转换回文本(Qualified或Not Qualified):
status = status.reset_index()
status['Paid'] = status['Paid'].map({True: 'Qualified', False:"Not Qualified"})>>> status
Month Client Id Paid
February 1 Qualified
February 2 Not Qualified
January 1 Not Qualified
January 2 Qualified现在与原始表合并以获得所需的结果(并删除通过合并生成的不必要的列。还重命名新列:
output = pd.merge(test_df, a, on=['Client Id', 'Month']).drop(columns='Paid_x')
output = output.rename(columns={'Paid_y': 'Qualification'})>>> output
Client Id Client Name Week Month Year Payment Status Qualification
1 Tom Holland 01/03/2022 - 01/09/2022 January 2022 Pending Not Qualified
1 Tom Holland 01/10/2022 - 01/16/2022 January 2022 Paid in Full Not Qualified
1 Tom Holland 01/17/2022 - 01/23/2022 January 2022 Didnt Paid Not Qualified
1 Tom Holland 01/24/2022 - 01/30/2022 January 2022 Paid in Full Not Qualified
1 Tom Holland 01/31/2022 - 02/06/2022 February2022 Paid in Full Qualified
1 Tom Holland 02/07/2022 - 02/13/2022 February2022 Paid in Full Qualified
1 Tom Holland 02/14/2022 - 02/20/2022 February2022 Paid in Full Qualified
1 Tom Holland 02/21/2022 - 02/27/2022 February2022 Paid in Full Qualified
2 Brad Pitt 01/03/2022 - 01/09/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/10/2022 - 01/16/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/17/2022 - 01/23/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/24/2022 - 01/30/2022 January 2022 Paid in Full Qualified
2 Brad Pitt 01/31/2022 - 02/06/2022 February2022 Paid in Full Not Qualified
2 Brad Pitt 02/07/2022 - 02/13/2022 February2022 Paid in Full Not Qualified
2 Brad Pitt 02/14/2022 - 02/20/2022 February2022 Paid in Full Not Qualified
2 Brad Pitt 02/21/2022 - 02/27/2022 February2022 Pending Not Qualifiedhttps://stackoverflow.com/questions/71712570
复制相似问题