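I have a data.frame with two variables, es and id. I want to add a new variable called weeks. The weeks values should be generated separately for each es, but be unique to each id.
For example, for all rows with es == "SHORT" and id == 1, I want the same number (e.g., 3). For id == 2, it should be a different number (e.g., 1).
Can I achieve this in base R? (See the desired output structure below.)
Note: within each id, the numeric values must satisfy SHORT < DEL1 < DEL2.
Here are the data and the code I tried, without success: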
D <- data.frame(es = c("SHORT", "SHORT", "SHORT","DEL1", "DEL1","DEL1","SHORT",
"SHORT", "SHORT", "DEL1", "DEL1", "DEL1","DEL2","DEL2","DEL2"),
id = c(rep(1, 6), rep(2, 9)) )
weeks <- ifelse(D$es == "SHORT", sample(1:5, 6, T), ifelse(D$es == "DEL1",
sample(4:8, 7, T),
sample(7:12, 2, T)))

Desired output structure (the values are random):
es id weeks
SHORT 1 3
SHORT 1 3
SHORT 1 3
DEL1 1 5
DEL1 1 5
DEL1 1 5
SHORT 2 1
SHORT 2 1
SHORT 2 1
DEL1 2 6
DEL1 2 6
DEL1 2 6
DEL2 2 8
DEL2 2 8
DEL2 2 8

Posted on 2019-08-13 20:53:04
This is essentially what Markus meant. You can replace seq_along with sample or some other function if you need the weeks to be random.
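For the random case, here is a minimal sketch of one way the swap could look. It is not part of the original answer. It assumes the values come from a single pool, 1:12 (the union of the ranges in the question's attempt), and that sorting the draws within each id is an acceptable way to enforce SHORT < DEL1 < DEL2. The names lev and key are made up for illustration, and the seed is arbitrary.

```r
# Data from the question
D <- data.frame(es = c("SHORT", "SHORT", "SHORT", "DEL1", "DEL1", "DEL1", "SHORT",
                       "SHORT", "SHORT", "DEL1", "DEL1", "DEL1", "DEL2", "DEL2", "DEL2"),
                id = c(rep(1, 6), rep(2, 9)))

set.seed(1)  # arbitrary seed, only so the draw is reproducible

lev <- c("SHORT", "DEL1", "DEL2")  # required ordering within each id

# One row per (es, id) pair, ordered by id and then by the required es order
weeksTbl <- unique(D)
weeksTbl <- weeksTbl[order(weeksTbl$id, match(weeksTbl$es, lev)), ]

# Per id: draw as many distinct values as there are es levels, then sort them,
# so the smallest goes to SHORT, the next to DEL1, and so on.
# Assumption: 1:12 is the pool to draw from.
weeksTbl$weeks <- ave(weeksTbl$id, weeksTbl$id,
                      FUN = function(x) sort(sample(1:12, length(x))))

# Look up each row's value by its (es, id) key; this keeps D's row order
key <- paste(weeksTbl$es, weeksTbl$id)
D$weeks <- weeksTbl$weeks[match(paste(D$es, D$id), key)]
D
```

A merge call would attach the values as well. The match lookup is used here only because it leaves the rows of D in their original order.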
D <- data.frame(es = c("SHORT", "SHORT", "SHORT","DEL1", "DEL1","DEL1","SHORT",
"SHORT", "SHORT", "DEL1", "DEL1", "DEL1","DEL2","DEL2","DEL2"),
id = c(rep(1, 6), rep(2, 9)) )
weeksTbl <- unique(D)
weeksTbl$weeks <- seq_along(weeksTbl[[1]])
merge(D, weeksTbl, all = TRUE, sort = FALSE)
#> es id weeks
#> 1 SHORT 1 1
#> 2 SHORT 1 1
#> 3 SHORT 1 1
#> 4 DEL1 1 2
#> 5 DEL1 1 2
#> 6 DEL1 1 2
#> 7 SHORT 2 3
#> 8 SHORT 2 3
#> 9 SHORT 2 3
#> 10 DEL1 2 4
#> 11 DEL1 2 4
#> 12 DEL1 2 4
#> 13 DEL2 2 5
#> 14 DEL2 2 5
#> 15 DEL2 2 5

Posted on 2019-08-13 21:09:23
Consider diff and cumsum to number the different groupings sequentially:
set.seed(8132019)
rand <- sample(1:10, 10, replace=FALSE)
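D <- within(D, {
  # Flag rows where es or id differs from the previous row.
  # as.integer(factor()) lets diff() work when es is a character column
  # (the default since R 4.0); comparing each diff with 0 separately keeps
  # an es change and an id change from cancelling each other out.
  changed <- c(0, diff(as.integer(factor(es)))) != 0 | c(0, diff(id)) != 0
  # The running count of changes gives each consecutive group its number
  weeks <- cumsum(changed) + 1
  rm(changed)
})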
D
# es id weeks
# 1 SHORT 1 1
# 2 SHORT 1 1
# 3 SHORT 1 1
# 4 DEL1 1 2
# 5 DEL1 1 2
# 6 DEL1 1 2
# 7 SHORT 2 3
# 8 SHORT 2 3
# 9 SHORT 2 3
# 10 DEL1 2 4
# 11 DEL1 2 4
# 12 DEL1 2 4
# 13 DEL2 2 5
# 14 DEL2 2 5
# 15 DEL2 2 5

https://stackoverflow.com/questions/57484742