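I have a dataset that lists people's daily activities in chronological order. The first activity of each day (starting at midnight) is called "Start"; it usually comes before waking up, but not always. For each person (group_by(id)), I want to change "Start" to "Sleep" if "Start" immediately precedes "Wake". If "Start" does not precede "Wake", as for id 22, I want to keep "Start".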
id <- c(11,11,11,11,22,22,22,22,22,22,33,33,33)
activity <-c("Start","Wake","TV","Eat","Start","TV","Sleep","Wake","Eat","Dressed","Start","Wake","BrushTeeth")
DF<- data.frame(id,activity)
DF
id activity
1 11 Start
2 11 Wake
3 11 TV
4 11 Eat
5 22 Start
6 22 TV
7 22 Sleep
8 22 Wake
9 22 Eat
10 22 Dressed
11 33 Start
12 33 Wake
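13 33 BrushTeeth
This is what I would like the final data to look like (note that "Start" in rows 1 and 11 has been replaced by "Sleep", but it is still "Start" in row 5, because there it does not precede "Wake"):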
id activity
1 11 Sleep
2 11 Wake
3 11 TV
4 11 Eat
5 22 Start
6 22 TV
7 22 Sleep
8 22 Wake
9 22 Eat
10 22 Dressed
11 33 Sleep
12 33 Wake
13 33 BrushTeeth
Posted on 2019-07-24 13:03:25
Try this:
library(dplyr)
DF %>%
  mutate(new = replace(activity, activity == 'Start' & lead(activity) == 'Wake', 'Sleep'))
This gives:
id activity new
1 11 Start Sleep
2 11 Wake Wake
3 11 TV TV
4 11 Eat Eat
5 22 Start Start
6 22 TV TV
7 22 Sleep Sleep
8 22 Wake Wake
9 22 Eat Eat
10 22 Dressed Dressed
11 33 Start Sleep
12 33 Wake Wake
13 33 BrushTeeth BrushTeeth
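The question asks for the replacement to happen per person, while the snippet above looks ahead across the whole data frame. A minimal sketch of a grouped variant follows, assuming the same DF as in the question. The use of `if_else()` and of `default = ""` in `lead()` are my own choices for illustration, not part of the original answer; the empty-string default only serves to avoid an NA comparison on the last row of each id.

```r
library(dplyr)

# Same data as in the question, with activity kept as character
DF <- data.frame(
  id = c(11, 11, 11, 11, 22, 22, 22, 22, 22, 22, 33, 33, 33),
  activity = c("Start", "Wake", "TV", "Eat",
               "Start", "TV", "Sleep", "Wake", "Eat", "Dressed",
               "Start", "Wake", "BrushTeeth"),
  stringsAsFactors = FALSE
)

result <- DF %>%
  group_by(id) %>%
  mutate(
    # lead() looks one row ahead within the current id only;
    # default = "" (an assumption) fills the position after the last row
    new = if_else(
      activity == "Start" & lead(activity, default = "") == "Wake",
      "Sleep",
      activity
    )
  ) %>%
  ungroup()

print(result)
```

With this data the grouped and ungrouped versions should agree, since every id begins with "Start"; grouping mainly guards against a match that would straddle two different ids.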
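Posted on 2019-07-24 13:21:42
An option with data.table is to specify the logical condition in i and assign (:=) 'Sleep' to activity for the rows where that condition is TRUE: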
library(data.table)
setDT(DF)[activity == 'Start' & shift(activity, type = 'lead') == 'Wake',
activity := 'Sleep'][]
DF
# id activity
# 1: 11 Sleep
# 2: 11 Wake
# 3: 11 TV
# 4: 11 Eat
# 5: 22 Start
# 6: 22 TV
# 7: 22 Sleep
# 8: 22 Wake
# 9: 22 Eat
#10: 22 Dressed
#11: 33 Sleep
#12: 33 Wake
#13: 33 BrushTeeth
https://stackoverflow.com/questions/57183555
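The data.table answer above can also be restricted to each person with `by = id`, which is closer to the per-id requirement stated in the question. The following is only a sketch under assumptions: the table name `DT` is hypothetical, and `fill = ""` in `shift()` is my addition so that the last row of each id compares against an empty string instead of NA.

```r
library(data.table)

# Same data as in the question; DT is a hypothetical name for this sketch
DT <- data.table(
  id = c(11, 11, 11, 11, 22, 22, 22, 22, 22, 22, 33, 33, 33),
  activity = c("Start", "Wake", "TV", "Eat",
               "Start", "TV", "Sleep", "Wake", "Eat", "Dressed",
               "Start", "Wake", "BrushTeeth")
)

# Within each id, replace "Start" by "Sleep" when the next row is "Wake".
# shift(type = "lead") looks one row ahead inside the group only.
DT[, activity := replace(
  activity,
  activity == "Start" & shift(activity, type = "lead", fill = "") == "Wake",
  "Sleep"
), by = id]

print(DT)
```

Because the look-ahead is computed inside `j` with `by = id`, a "Start" on the last row of one person can never be matched against a "Wake" on the first row of the next.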