I have a data file like this:
Path_Version commitdates Year-Month API Age api_spec_id
168 NaN 2018-10-19 2018-10 39 521
169 NaN 2018-10-19 2018-10 39 521
170 NaN 2018-10-12 2018-10 39 521
171 NaN 2018-10-12 2018-10 39 521
172 NaN 2018-10-12 2018-10 39 521
173 NaN 2018-10-11 2018-10 39 521
174 NaN 2018-10-11 2018-10 39 521
175 NaN 2018-10-11 2018-10 39 521
176 NaN 2018-10-11 2018-10 39 521
177 NaN 2018-10-11 2018-10 39 521
178 NaN 2018-09-26 2018-09 39 521
179 NaN 2018-09-25 2018-09 39 521
I want to compute the time from the first commit date to the last commit date, with the commit dates sorted, so the result looks like this:
Path_Version commitdates Year-Month API Age api_spec_id Days_difference
168 NaN 2018-10-19 2018-10 39 521 25
169 NaN 2018-10-19 2018-10 39 521 25
170 NaN 2018-10-12 2018-10 39 521 18
171 NaN 2018-10-12 2018-10 39 521 18
172 NaN 2018-10-12 2018-10 39 521 18
173 NaN 2018-10-11 2018-10 39 521 16
174 NaN 2018-10-11 2018-10 39 521 16
175 NaN 2018-10-11 2018-10 39 521 16
176 NaN 2018-10-11 2018-10 39 521 16
177 NaN 2018-10-11 2018-10 39 521 16
178 NaN 2018-09-26 2018-09 39 521 1
179 NaN 2018-09-25 2018-09 39 521 0
I first tried sorting the commits by api_spec_id, since it is unique to each API, and then computing the difference.
final_api['commitdates'] = final_api.groupby('api_spec_id')['commitdate'].apply(lambda x: x.sort_values())
final_api['diff'] = final_api.groupby('api_spec_id')['commitdates'].diff() / np.timedelta64(1, 'D')
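final_api['diff'] = final_api['diff'].fillna(0)
This just returns zero for the whole column. I don't want to group the rows; I only want to compute the difference based on the sorted commit dates: from the first commit to the last across the whole dataset, in days.
Do you know how I can do this?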
Posted on 2022-11-24 22:44:06
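Use pandas.to_datetime, sub, min, and dt.days:
import pandas as pd

t = pd.to_datetime(df['commitdates'])  # parse the date strings as datetimes
df['Days_difference'] = t.sub(t.min()).dt.days  # whole days since the earliest commit
If you need to group per API: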
t = pd.to_datetime(df['commitdates'])
df['Days_difference'] = t.sub(t.groupby(df['api_spec_id']).transform('min')).dt.days
Output:
Path_Version commitdates Year-Month API Age api_spec_id Days_difference
168 NaN 2018-10-19 2018-10 39 521 24
169 NaN 2018-10-19 2018-10 39 521 24
170 NaN 2018-10-12 2018-10 39 521 17
171 NaN 2018-10-12 2018-10 39 521 17
172 NaN 2018-10-12 2018-10 39 521 17
173 NaN 2018-10-11 2018-10 39 521 16
174 NaN 2018-10-11 2018-10 39 521 16
175 NaN 2018-10-11 2018-10 39 521 16
176 NaN 2018-10-11 2018-10 39 521 16
177 NaN 2018-10-11 2018-10 39 521 16
178 NaN 2018-09-26 2018-09 39 521 1
179 NaN 2018-09-25 2018-09 39 521 0
https://stackoverflow.com/questions/74566819
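To make the difference between the two variants easier to see, here is a minimal self-contained sketch. The DataFrame is a hypothetical sample: three dates for api_spec_id 521 are taken from the question, while the two rows for api_spec_id 522 and the column names days_overall and days_per_api are invented for illustration. The final sort is an optional extra, assuming you also want the rows in chronological order.

```python
import pandas as pd

# Hypothetical sample: three rows from the question plus an invented second API (522)
df = pd.DataFrame({
    "commitdates": ["2018-10-19", "2018-10-12", "2018-09-25", "2018-11-05", "2018-11-01"],
    "api_spec_id": [521, 521, 521, 522, 522],
})

# Parse the strings so that subtraction yields timedeltas
t = pd.to_datetime(df["commitdates"])

# Days since the earliest commit in the whole dataset
df["days_overall"] = t.sub(t.min()).dt.days

# Days since the earliest commit within each api_spec_id
df["days_per_api"] = t.sub(t.groupby(df["api_spec_id"]).transform("min")).dt.days

# Optional: order the rows from oldest to newest commit
df = df.assign(commitdates=t).sort_values("commitdates", ignore_index=True)
print(df)
```

With a single api_spec_id, as in the question's data, both columns are identical. They only diverge once a second API with its own first commit appears, which is why the per-API values for 522 restart at 0 here.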