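I have a data frame like the one shown below.
ID Number is the person's ID, Position is their job position, Dt_Alter is the date they changed roles, and Education is their background.
I need to calculate the average time they spent in each position (until they reached the manager position), as well as how many times they changed roles before getting the manager position.
I'd appreciate any tips, since I'm new to R and am struggling with this part of the analysis. The data frame is huge.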
ID Number  Position     Dt_Alter    Education
2          MANAGER      2019-02-01  BUSINESS, MANAGEMENT AND ADMINISTRATION
2          COORDINATOR  2019-01-01  BUSINESS, MANAGEMENT AND ADMINISTRATION
2000261    MANAGER      2018-12-01  BUSINESS, MANAGEMENT AND ADMINISTRATION
2000261    SUPERVISOR   2016-12-01  BUSINESS, MANAGEMENT AND ADMINISTRATION
2000553    MANAGER      2018-12-01  ENGINEERING
2000553    COORDINATOR  2016-04-01  ENGINEERING
structure(list(Matricula = c(2L, 2L, 2L, 2L, 2L),
Desc2 = c("GERENTE", "COORDENADOR SEGUROS", "COORDENADOR SEGUROS", "COORDENADOR SEGUROS", "COORDENADOR SEGUROS"),
Dt_Alteracao = c("01/02/2019", "01/01/2019", "01/01/2018", "01/09/2017", "01/09/2016"),
Education = c("BUSINESS, MANAGEMENT AND ADMINISTRATION", "BUSINESS, MANAGEMENT AND ADMINISTRATION", "BUSINESS, MANAGEMENT AND ADMINISTRATION", "BUSINESS, MANAGEMENT AND ADMINISTRATION", "BUSINESS, MANAGEMENT AND ADMINISTRATION")),
row.names = c("2.10823", "2.10824", "2.10825", "2.10826", "2.10827"), class = "data.frame")
Posted on 2019-09-10 03:21:08
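Here is my rough approach to this problem, combining the ifelse and lag functions (lag here is dplyr::lag, which shifts a vector by one position). Basically, once you have made sure the data is sorted by ID number and date, the order of the records lets you compare each record with the one before it. I create a flag for whether someone changed position and, where it is true, compute the difftime between those two records.
Hope this helps.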
library(dplyr)  # dplyr::lag() shifts a vector; base stats::lag() does not
df$Matricula <- as.character(df$Matricula)
# parse as Date (avoids the POSIXlt column and time-zone effects of strptime)
df$Dt_Alteracao <- as.Date(df$Dt_Alteracao, format = "%d/%m/%Y")
df <- df[order(df$Matricula, df$Dt_Alteracao), ]
# indicator for whether a position change occurred within the same ID
# (NA on the very first row, which has no previous record)
df$changePos <- ifelse(df$Matricula == lag(df$Matricula, 1) & df$Desc2 != lag(df$Desc2, 1),
                       "Changed Position", "Same")  # review this logic for a variety of row groupings
# weeks elapsed since the previous record, computed only at a position change
# (current date minus previous date, so the result is positive after sorting)
df$time_in_pos <- ifelse(df$changePos == "Changed Position",
                         as.numeric(difftime(df$Dt_Alteracao, lag(df$Dt_Alteracao, 1), units = "weeks")),
                         NA)
https://stackoverflow.com/questions/57857840
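The row-by-row flag above measures the gap to the previous record only, so a role with several records is not yet summed up per person. One possible way to get the per-person figures the question asks for is a grouped dplyr pipeline; the following is a minimal sketch, not the answer author's method. It assumes the manager role is labelled exactly "GERENTE" as in the dput() sample, that dates are in day/month/year format, and it runs on made-up toy rows; the names manager_title, spell, start, days and summary_df are hypothetical.

```r
library(dplyr)

# Toy data in the layout of the question's dput() sample (ID 7 is invented).
df <- data.frame(
  Matricula = c(2L, 2L, 2L, 2L, 2L, 7L, 7L, 7L),
  Desc2 = c("GERENTE", "COORDENADOR SEGUROS", "COORDENADOR SEGUROS",
            "COORDENADOR SEGUROS", "COORDENADOR SEGUROS",
            "GERENTE", "SUPERVISOR", "ANALISTA"),
  Dt_Alteracao = c("01/02/2019", "01/01/2019", "01/01/2018", "01/09/2017",
                   "01/09/2016", "01/12/2018", "01/12/2016", "01/12/2015"),
  stringsAsFactors = FALSE
)

manager_title <- "GERENTE"  # assumption: exact label of the manager role

summary_df <- df %>%
  mutate(Dt_Alteracao = as.Date(Dt_Alteracao, format = "%d/%m/%Y")) %>%
  arrange(Matricula, Dt_Alteracao) %>%
  group_by(Matricula) %>%
  # keep only people who became manager at some point ...
  filter(any(Desc2 == manager_title)) %>%
  # ... and only their records up to the first manager record
  filter(Dt_Alteracao <= min(Dt_Alteracao[Desc2 == manager_title])) %>%
  # a new spell starts whenever the title differs from the previous record
  mutate(spell = cumsum(Desc2 != lag(Desc2, default = first(Desc2))) + 1L) %>%
  group_by(Matricula, spell) %>%
  summarise(Desc2 = first(Desc2), start = min(Dt_Alteracao),
            .groups = "drop_last") %>%
  # a spell ends when the next one starts; the manager spell has no end here
  mutate(days = as.numeric(lead(start) - start)) %>%
  summarise(
    roles_before_manager = sum(Desc2 != manager_title),
    avg_days_per_role = mean(days, na.rm = TRUE),
    .groups = "drop"
  )

print(summary_df)
```

Here roles_before_manager counts the distinct spells held before the manager role, which also equals the number of role changes if the final promotion is counted; subtract one to exclude it. Because everything runs as vectorised grouped operations, the same pipeline should scale to a large data frame, though that has not been measured here.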