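I have a data frame in which each row is a household, and I have a list of interviewers. The goal is to assign interviewers to households so that each interviewer gets the same number of households in every city. For example: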
example <- data.frame(city = c("Los Angeles", "Los Angeles", "Los Angeles",
                               "San Diego", "San Diego", "San Diego", "San Diego", "San Diego",
                               "Santa Barbara", "Santa Barbara", "Santa Barbara", "Santa Barbara", "Santa Barbara"),
                      household_id = seq(1, 13))
interviewer <- c("A", "B", "C")

All 3 interviewers will go to every city, and each will interview roughly one third of the households in each city, so the expected output is:
output <- data.frame(city = c("Los Angeles", "Los Angeles", "Los Angeles",
                              "San Diego", "San Diego", "San Diego", "San Diego", "San Diego",
                              "Santa Barbara", "Santa Barbara", "Santa Barbara", "Santa Barbara", "Santa Barbara"),
                     household_id = seq(1, 13),
                     interviewer = c("A", "B", "C",
                                     "A", "A", "B", "B", "C",
                                     "A", "A", "B", "B", "C"))

Answered 2020-12-06 09:53:57
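You can try using rep to repeat interviewer within each city; length.out = n() cuts the repeated vector to the number of households in that city.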
library(dplyr)

example %>%
  group_by(city) %>%
  mutate(interviewer = rep(interviewer, length.out = n()))
#   city          household_id interviewer
#   <chr>                <int> <chr>
# 1 Los Angeles              1 A
# 2 Los Angeles              2 B
# 3 Los Angeles              3 C
# 4 San Diego                4 A
# 5 San Diego                5 B
# 6 San Diego                6 C
# 7 San Diego                7 A
# 8 San Diego                8 B
# 9 Santa Barbara            9 A
#10 Santa Barbara           10 B
#11 Santa Barbara           11 C
#12 Santa Barbara           12 A
#13 Santa Barbara           13 B

https://stackoverflow.com/questions/65163898
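The rep() result gives every interviewer the right number of households per city, but in the order A, B, C, A, B, whereas the expected output in the question lists them in blocks (A, A, B, B, C). If that exact ordering matters, one possible variation is to sort the recycled positions before indexing into interviewer. This is only a sketch under that assumption, not part of the original answer; the name blocked is made up for illustration.

```r
library(dplyr)

example <- data.frame(
  city = rep(c("Los Angeles", "San Diego", "Santa Barbara"), times = c(3, 5, 5)),
  household_id = seq(1, 13)
)
interviewer <- c("A", "B", "C")

# Recycle the positions 1..3 to the size of each city, then sort them so that
# each interviewer's households are contiguous (1, 1, 2, 2, 3 for 5 households).
# Indexing by position keeps the order of `interviewer` whatever the names are.
blocked <- example %>%
  group_by(city) %>%
  mutate(interviewer = interviewer[sort(rep(seq_along(interviewer), length.out = n()))]) %>%
  ungroup()

# Households per interviewer in each city; counts differ by at most one.
table(blocked$city, blocked$interviewer)
```

With either ordering, any leftover households in a city go to the interviewers listed first, so in this example A and B end up with 5 households each and C with 3.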