我一直在努力将下面的x字典转换成下面的表/字典。
x = {'John': 0,
'Dan': 0,
'Tim': 0,
'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}我想把名字(键)放到0列中,我想给出的日期仅是每年每个月的计数,列。
结果:
df = pd.DataFrame(data = {'Name':['John', 'Dan', 'Tim', 'Andrew', 'Jack'],
'2021-12':[0,0,0,0,1],
'2022-01' :[0,0,0,0,2],
'2022-02' :[0,0,0,0,0],
'2022-03' :[0,0,0,0,3],
'2022-04' :[0,0,0,1,0],
'2022-05' :[0,0,0,0,1],
'2022-06' :[0,0,0,3,3],
'2022-07' :[0,0,0,0,0],
'2022-08' :[0,0,0,2,1],
'2022-09' :[0,0,0,0,0],
'2022-10' :[0,0,0,0,0],
'2022-11' :[0,0,0,1,0],
'2022-12' :[0,0,0,1,0]})这是最终的df结果(值表示每个月的计数):

发布于 2021-12-31 13:53:14
你可以试试这样的方法:
x = {'John': 0,
'Dan': 0,
'Tim': 0,
'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}
dfe = pd.DataFrame.from_dict(x, 'index').explode(0).set_axis(['dates'], axis=1).rename_axis('name').reset_index()
dfe['dates'] = pd.to_datetime(dfe['dates'], format='%Y-%m-%d')
df_out = dfe.groupby(['name',pd.Grouper(key='dates', freq='M')]).size().unstack(fill_value=0)
df_out = df_out.drop('1970-01-31', axis=1)
df_out = df_out.reindex(pd.date_range(df_out.columns.min(), df_out.columns.max(), freq='M'), axis=1, fill_value=0)
df_out.columns = df_out.columns.strftime('%Y-%m')
print(df_out)输出:
2021-12 2022-01 2022-02 2022-03 2022-04 2022-05 2022-06 2022-07 2022-08 2022-09 2022-10 2022-11 2022-12
name
Andrew 0 0 0 0 2 0 3 0 2 0 0 1 1
Dan 0 0 0 0 0 0 0 0 0 0 0 0 0
Jack 1 2 0 3 0 1 3 0 1 0 0 0 0
John 0 0 0 0 0 0 0 0 0 0 0 0 0
Tim 0 0 0 0 0 0 0 0 0 0 0 0 0处理句点而不是日期和转换为字符串..。
x = {'John': 0,
'Dan': 0,
'Tim': 0,
'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}
dfe = pd.DataFrame.from_dict(x, 'index').explode(0).set_axis(['dates'], axis=1).rename_axis('name').reset_index()
dfe['dates'] = pd.to_datetime(dfe['dates'], format='%Y-%m-%d').dt.to_period('M')
df_out = dfe.groupby(['name', 'dates']).size().unstack(fill_value=0)
df_out = df_out.drop('1970-01', axis=1)
df_out = df_out.reindex(pd.period_range(df_out.columns.min(), df_out.columns.max(), freq='M'), axis=1, fill_value=0)
print(df_out)输出:
2021-12 2022-01 2022-02 2022-03 2022-04 2022-05 2022-06 2022-07 2022-08 2022-09 2022-10 2022-11 2022-12
name
Andrew 0 0 0 0 2 0 3 0 2 0 0 1 1
Dan 0 0 0 0 0 0 0 0 0 0 0 0 0
Jack 1 2 0 3 0 1 3 0 1 0 0 0 0
John 0 0 0 0 0 0 0 0 0 0 0 0 0
Tim 0 0 0 0 0 0 0 0 0 0 0 0 0https://stackoverflow.com/questions/70542676
复制相似问题