Trying to take the difference between two date columns

Stack Overflow user
Asked 2019-06-27 07:04:41
Answers: 4, Views: 58, Followers: 0, Votes: 1

I have data like the following:

name   country    Join Date      End date 
Wrt     IND        1-2-2016      8-9-2017
Grt     China      3-2-2015     12-6-2018
frt     France     8-3-2017     continuing 
srt     Scottland   9-4-2018     continuing
crt     china       9-7-2016     7-8-2018

I am trying to find the difference between the Join Date and the End date. I tried f9['Num of days'] = f9['End date '] - f9['Join Date'], but I get the following error:

TypeError: unsupported operand type(s) for -: 'DatetimeIndex' and 'float'

My expected output is:

   name   country    Join Date      End date   diff 
   Wrt     IND        1-2-2016      8-9-2017   395
   Grt     China      3-2-2017      12-6-2018  160
   frt     France     8-3-2017     continuing  continuing
   srt     Scottland   9-4-2018     continuing  continuing
   crt     china       9-7-2017     7-8-2018     280

4 Answers

Stack Overflow user

Accepted answer

Posted 2019-06-27 07:05:43

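First convert both columns to datetimes with the parameter errors='coerce', which turns values that are not valid dates (such as the string continuing) into missing values (NaT); add dayfirst=True if the day comes first. Then subtract the columns, get the number of days from the resulting timedeltas with Series.dt.days, and finally replace the missing values with Series.fillna if necessary.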

f9['Join Date'] = pd.to_datetime(f9['Join Date'], errors='coerce', dayfirst=True)
f9['End date'] = pd.to_datetime(f9['End date'], errors='coerce', dayfirst=True)

f9['Num of days'] = (f9['End date'] - f9['Join Date']).dt.days.fillna('continuing')
print (f9)
  name    country  Join Date   End date Num of days
0  Wrt        IND 2016-02-01 2017-09-08         585
1  Grt      China 2015-02-03 2018-06-12        1225
2  frt     France 2017-03-08        NaT  continuing
3  srt  Scottland 2018-04-09        NaT  continuing
4  crt      china 2016-07-09 2018-08-07         759

Or, if the month comes first:

f9['Join Date'] = pd.to_datetime(f9['Join Date'], errors='coerce')
f9['End date'] = pd.to_datetime(f9['End date'], errors='coerce')

f9['Num of days'] = (f9['End date'] - f9['Join Date']).dt.days.fillna('continuing')
print (f9)
  name    country  Join Date   End date Num of days
0  Wrt        IND 2016-01-02 2017-08-09         585
1  Grt      China 2015-03-02 2018-12-06        1375
2  frt     France 2017-08-03        NaT  continuing
3  srt  Scottland 2018-09-04        NaT  continuing
4  crt      china 2016-09-07 2018-07-08         669
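
The two variants above differ only in dayfirst, which is why the parsed dates (and some of the day counts) change. A minimal sketch on a single value from the question, to check which reading matches your data:

```python
import pandas as pd

raw = '1-2-2016'

month_first = pd.to_datetime(raw)                 # default: month comes first
day_first = pd.to_datetime(raw, dayfirst=True)    # day comes first

print(month_first.date())   # 2016-01-02
print(day_first.date())     # 2016-02-01

# A value that is not a date becomes NaT instead of raising an error.
not_a_date = pd.to_datetime('continuing', errors='coerce')
print(not_a_date)           # NaT
```

Which variant is right depends on how the source data was written; the question does not say, so both are shown.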

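The last step could be to replace the missing values in End date as well, but then the column is no longer a datetime column: it holds strings mixed with datetimes, so datetime-like functions will fail on it later: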

f9['End date'] = f9['End date'].fillna('continuing')
print (f9)
  name    country  Join Date             End date Num of days
0  Wrt        IND 2016-01-02  2017-08-09 00:00:00         585
1  Grt      China 2015-03-02  2018-12-06 00:00:00        1375
2  frt     France 2017-08-03           continuing  continuing
3  srt  Scottland 2018-09-04           continuing  continuing
4  crt      china 2016-09-07  2018-07-08 00:00:00         669
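
To illustrate what "datetime-like functions fail" means here, a small sketch on a two-row Series (the variable names end and mixed are only for illustration, and the explicit astype(object) is my addition to make the dtype change visible):

```python
import pandas as pd

end = pd.to_datetime(pd.Series(['8-9-2017', 'continuing']), errors='coerce')
print(end.dt.year.tolist())      # works: the column is datetime-like

# Mixing the placeholder string into the column makes it a plain object column.
mixed = end.astype(object).fillna('continuing')
print(mixed.dtype)               # object

dt_failed = False
try:
    mixed.dt.year                # the .dt accessor needs datetime-like values
except AttributeError as exc:
    dt_failed = True
    print('no .dt accessor:', exc)
```

If the dates are still needed for calculations, one option is to keep End date as datetimes and put the 'continuing' text only in a separate display column.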

Edit:

You can add several conditions with lower or upper limits on the number of days; the Series.between function can also be used here:

f9['Join Date'] = pd.to_datetime(f9['Join Date'], errors='coerce')
f9['End date'] = pd.to_datetime(f9['End date'], errors='coerce')

f9['Num of days'] = (f9['End date'] - f9['Join Date']).dt.days

m1 = f9['Num of days'] > 730
m2 = f9['Num of days'].between(365, 730)
m3 = f9['Num of days'] < 365 
m4 = f9['Num of days'].isna()

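# an explicit string default avoids mixing the string choices with the implicit default 0,
# which recent NumPy versions reject
f9['Status'] = np.select([m1, m2, m3, m4], ['U', 'L', 'N', 'EOL'], default='')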

f9[['End date','Num of days']] = f9[['End date','Num of days']].fillna('continuing')
print (f9)

  name    country  Join Date             End date Num of days Status
0  Wrt        IND 2016-01-02  2017-08-09 00:00:00         585      L
1  Grt      China 2015-03-02  2018-12-06 00:00:00        1375      U
2  frt     France 2017-08-03           continuing  continuing    EOL
3  srt  Scottland 2018-09-04           continuing  continuing    EOL
4  crt      china 2016-09-07  2018-07-08 00:00:00         669      L

Another idea is to use cut for binning:

f9['Join Date'] = pd.to_datetime(f9['Join Date'], errors='coerce')
f9['End date'] = pd.to_datetime(f9['End date'], errors='coerce')

f9['Num of days'] = (f9['End date'] - f9['Join Date']).dt.days

f9['Status']=pd.cut(f9['Num of days'],bins=[-np.inf, 365, 730, np.inf],labels=['N','L', 'U'])
f9['Status'] = f9['Status'].cat.add_categories(['EOL']).fillna('EOL')
f9[['End date','Num of days']] = f9[['End date','Num of days']].fillna('continuing')
print (f9)
  name    country  Join Date             End date Num of days Status
0  Wrt        IND 2016-01-02  2017-08-09 00:00:00         585      L
1  Grt      China 2015-03-02  2018-12-06 00:00:00        1375      U
2  frt     France 2017-08-03           continuing  continuing    EOL
3  srt  Scottland 2018-09-04           continuing  continuing    EOL
4  crt      china 2016-09-07  2018-07-08 00:00:00         669      L
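
One thing to watch when switching between the two approaches is how the boundary values are treated. A small sketch, with made-up day counts and hypothetical labels (short, medium, long), comparing pd.cut with Series.between:

```python
import numpy as np
import pandas as pd

days = pd.Series([364, 365, 366, 730, 731, np.nan])

# pd.cut uses right-closed intervals by default:
# (-inf, 365], (365, 730], (730, inf)
binned = pd.cut(days, bins=[-np.inf, 365, 730, np.inf],
                labels=['short', 'medium', 'long'])   # hypothetical labels

# Series.between is inclusive on both ends by default: [365, 730]
in_middle = days.between(365, 730)

print(pd.DataFrame({'days': days, 'cut': binned, 'between': in_middle}))
```

So exactly 365 days lands in the lowest bin with cut but inside the 365 to 730 range with between, and the missing value stays NaN after cut, which is why the answer fills it with 'EOL' afterwards.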
Votes: 2

Stack Overflow user

Posted 2019-06-27 07:12:59

First convert both columns to dates with to_datetime.

Then subtract them and use .dt.days to get the number of days.

import pandas as pd

df = pd.DataFrame(data={'name':['wrt','grt','frt'],
                   'country':['ind','china','france'],
                   'join_date':['1-2-2016','3-2-2015','8-3-2017'],
                   'end_date':['8-9-2017','12-6-2018','continuing']})

# keep both columns as datetime64 so the subtraction yields timedeltas;
# 'continuing' becomes NaT because of errors='coerce'
df['join_date'] = pd.to_datetime(df['join_date'],errors='coerce')
df['end_date'] = pd.to_datetime(df['end_date'],errors='coerce')

df['diff'] = (df['end_date'] - df['join_date']).dt.days
# replace the missing values only at the very end, for display
df = df[['join_date','end_date','diff']].fillna('continuing')
print(df)
Votes: 1

Stack Overflow user

Posted 2019-06-27 07:11:20

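Here you can convert the "Join Date" and "End date" Series to NumPy arrays with dtype np.datetime64, take the difference, and store the resulting array back in the DataFrame. You can also fill the "continuing" cells with the current date, or with whatever date you want to use instead (depending on your situation).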
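
A minimal sketch of that idea follows. It assumes the strings are first parsed with pd.to_datetime (NumPy itself only parses ISO-style strings), that the day comes first, and that open-ended rows are measured up to a cut-off date; the cutoff value and the two-row frame are hypothetical:

```python
import numpy as np
import pandas as pd

f9 = pd.DataFrame({
    'Join Date': ['1-2-2016', '8-3-2017'],
    'End date': ['8-9-2017', 'continuing'],
})

# Assumption: rows that are still "continuing" are measured up to this date.
# pd.Timestamp.today().normalize() would be another choice.
cutoff = pd.Timestamp('2019-06-27')

join = pd.to_datetime(f9['Join Date'], dayfirst=True)
end = pd.to_datetime(f9['End date'], dayfirst=True, errors='coerce').fillna(cutoff)

# Plain NumPy arrays with day resolution
join_arr = join.to_numpy().astype('datetime64[D]')
end_arr = end.to_numpy().astype('datetime64[D]')

# Subtracting gives timedelta64[D]; cast to plain integers for storage
f9['Num of days'] = (end_arr - join_arr).astype(int)
print(f9)
```

With a fill date every row gets a number, so there is no 'continuing' placeholder in the result; whether that is acceptable depends on how the column is used.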

Votes: 0

The original content of this page is provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/56785691
