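I am using the split-apply-combine pattern in pandas to create a new column that measures the difference between two timestamps.
Below is a simplified example of my problem.
Say I have this df: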
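import pandas as pd
df = pd.DataFrame({'ssn_start_utc': pd.date_range('1/1/2011', periods=6, freq='D'), 'fld_id': [100,100,100,101,101,101], 'task_name': ['sowing','fungicide','insecticide','combine','combine','sowing']})
df
I want to group by fld_id and apply a function that creates a column measuring, for each row, the difference between two timestamps. Something like this: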
def pasttime(group):
    val = group['ssn_start_utc'] - group['ssn_start_utc'][0]
    # Why group['ssn_start_utc'][0]?
    # Because it measures the time difference of each row relative to the first row of its group,
    # specifically the *sowing* entry of that group. I have moved all *sowing* entries to the first row of each group in df.
    return val
df["PastTime"] = df.groupby('fld_id', group_keys=False).apply(pasttime)
The resulting df should look like this:
df_new = pd.DataFrame({'ssn_start_utc': pd.date_range('1/1/2011', periods=6, freq='D'), 'fld_id': [100,100,100,101,101,101], 'task_name': ['sowing','fungicide','insecticide','combine','combine','sowing'], 'pasttime': pd.to_timedelta(['0 days', '1 days', '2 days', '3 days', '-1 days', '0 days'])})
df_new
Instead, I get an error: KeyError: 0
I also tried using groupby:
df['pasttime'] = df.groupby(['fld_id'])['ssn_start_utc'].transform(df['ssn_start_utc'] - df.loc[df['name']=='sowing','ssn_start_utc'].values[0])
How can I apply a custom group function and get the desired df?
Posted on 2021-07-29 15:16:28
In your function, you can use Series.iat to select the first value by position.
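The KeyError: 0 comes from label-based lookup: inside apply, each group keeps its original index labels, so the second group here is labelled 3, 4, 5 and has no label 0. A minimal sketch of the difference, using a small hypothetical Series `s` rather than the data above:

```python
import pandas as pd

# Hypothetical Series whose index labels start at 3, like the second group.
s = pd.Series(pd.date_range("2011-01-04", periods=3, freq="D"), index=[3, 4, 5])

try:
    s[0]  # label-based lookup on an integer index: there is no label 0
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Positional lookup ignores the index labels entirely.
first_by_position = s.iat[0]

print(label_lookup_failed)
print(first_by_position)
```

Applied to the function from the question, the positional lookup looks like this: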
def pasttime(group):
    val = group['ssn_start_utc'] - group['ssn_start_utc'].iat[0]
    return val
df["PastTime"] = df.groupby('fld_id', group_keys=False).apply(pasttime)
An alternative is to combine GroupBy.first with GroupBy.transform:
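s = df.groupby('fld_id')['ssn_start_utc'].transform('first')
df['pasttime'] = df['ssn_start_utc'].sub(s)
If you need to subtract the sowing row within each group, use the same solution as above, but first replace the non-matching datetimes with missing values using Series.where: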
m = df['task_name']=='sowing'
s = df['ssn_start_utc'].where(m).groupby(df['fld_id']).transform('first')
df['pasttime1'] = df['ssn_start_utc'].sub(s)
print (df)
  ssn_start_utc  fld_id    task_name PastTime pasttime pasttime1
0    2011-01-01     100       sowing   0 days   0 days    0 days
1    2011-01-02     100    fungicide   1 days   1 days    1 days
2    2011-01-03     100  insecticide   2 days   2 days    2 days
3    2011-01-04     101      combine   0 days   0 days   -2 days
4    2011-01-05     101      combine   1 days   1 days   -1 days
5    2011-01-06     101       sowing   2 days   2 days    0 days
https://stackoverflow.com/questions/68571617
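For reference, the last variant can be run end to end as sketched below. It assumes the sample data from the question (with the stray quote in task_name removed) and exactly one sowing row per field; the `expected` values are simply the pasttime1 column shown above, and the intermediate name `sowing_only` is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ssn_start_utc": pd.date_range("1/1/2011", periods=6, freq="D"),
    "fld_id": [100, 100, 100, 101, 101, 101],
    "task_name": ["sowing", "fungicide", "insecticide", "combine", "combine", "sowing"],
})

# Keep only the sowing timestamps; every other row becomes a missing value (NaT).
m = df["task_name"] == "sowing"
sowing_only = df["ssn_start_utc"].where(m)

# 'first' skips missing values, so each row receives its group's sowing timestamp.
s = sowing_only.groupby(df["fld_id"]).transform("first")
df["pasttime1"] = df["ssn_start_utc"].sub(s)

# Compare against the pasttime1 column from the printed output.
expected = pd.to_timedelta([0, 1, 2, -2, -1, 0], unit="D")
print(df["pasttime1"].tolist() == list(expected))
```

Because the sowing row is located by a mask and not by position, this variant does not require the rows to be sorted so that sowing comes first in each group.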