Suppose I have the following data:
SAMPN PERNO loop car bus walk mode
1 1 1 3.4 2.5 1.5 1
1 1 1 3 2 1 2
1 1 1 4 2 5 3
1 1 2 14 1 3 1
1 1 2 5 8 2 1
2 1 1 1 5 5 3
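2 1 1 9 4 3 3
The mode column corresponds to car, bus, and walk:
mode==1 walk
mode==2 car
mode==3 bus
SAMPN is the household index, PERNO identifies the person within the household, and loop is each person's tour. Within each loop, I want to add up the value selected by mode for each person in each household.
For example, for the first household (SAMPN==1) and first person (PERNO==1), there are 3 rows for the first tour (loop==1). In that tour, the first row is walk (mode==1), the second row is car (mode==2), and the third row is bus (mode==3).
So I would add the first row's walk, the second row's car, and the third row's bus: 3.4+2+5=10.4. The same applies to the other groups.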
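To make the arithmetic concrete, here is a minimal sketch for the first group only. It assumes the mapping implied by the worked example (mode 1 picks the car column, mode 2 picks bus, mode 3 picks walk), which differs from the mode list above. The names `g1`, `cols`, and `picked` are only illustrative.

```r
# First group (SAMPN == 1, PERNO == 1, loop == 1), copied from the table above.
g1 <- data.frame(
  car  = c(3.4, 3, 4),
  bus  = c(2.5, 2, 2),
  walk = c(1.5, 1, 5),
  mode = c(1L, 2L, 3L)
)

# Assumed mapping, taken from the worked example: 1 -> car, 2 -> bus, 3 -> walk.
cols <- c("car", "bus", "walk")

# For each row, pick the value in the column selected by that row's mode.
picked <- mapply(function(i, m) g1[i, cols[m]], seq_len(nrow(g1)), g1$mode)

picked       # 3.4 2.0 5.0
sum(picked)  # 10.4
```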
Expected output:
SAMPN PERNO loop car bus walk mode utility
1 1 1 3.4 2.5 1.5 1 10.4
1 1 1 3 2 1 2 10.4
1 1 1 4 2 5 3 10.4
1 1 2 14 1 3 1 19
1 1 2 5 8 2 1 19
2 1 1 1 5 5 3 8
2 1 1 9 4 3 3 8
Answered 2019-09-27 03:26:38
library(dplyr)

# df is the data frame from the question
df %>%
  mutate(utility = case_when(mode == 1 ~ car,   # using the order in the example,
                             mode == 2 ~ bus,   # not the order in the table
                             mode == 3 ~ walk,
                             TRUE ~ 0)) %>%
  count(SAMPN, PERNO, loop, wt = utility, name = "utility")
## A tibble: 3 x 4
# SAMPN PERNO loop utility
# <int> <int> <int> <dbl>
#1 1 1 1 10.4
#2 1 1 2 19
#3 2 1 1 8
Or, to get the exact output:
df %>%
  mutate(utility = case_when(mode == 1 ~ car,
                             mode == 2 ~ bus,
                             mode == 3 ~ walk,
                             TRUE ~ 0)) %>%
  group_by(SAMPN, PERNO, loop) %>%
  mutate(utility = sum(utility))
## A tibble: 7 x 8
## Groups: SAMPN, PERNO, loop [3]
# SAMPN PERNO loop car bus walk mode utility
# <int> <int> <int> <dbl> <dbl> <dbl> <int> <dbl>
#1 1 1 1 3.4 2.5 1.5 1 10.4
#2 1 1 1 3 2 1 2 10.4
#3 1 1 1 4 2 5 3 10.4
#4 1 1 2 14 1 3 1 19
#5 1 1 2 5 8 2 1 19
#6 2 1 1 1 5 5 3 8
#7 2 1 1 9 4 3 3 8
Answered 2019-09-27 03:26:44
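Here is an option using base R. Build a named vector of column names ('nm1') that maps each 'mode' value to its column, then cbind the row index with the matching column index ('i1') and use that matrix to extract the corresponding element from each row. Finally, use ave to get the sum grouped by 'SAMPN', 'PERNO', and 'loop', and assign it to 'utility'.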
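The matrix-indexing step described above can be tried in isolation first. This is a minimal sketch on a hypothetical toy data frame (`toy`, `idx`, `vals`, and `grp` are illustrative names, not part of the answer), assuming all columns are numeric so the extracted values stay numeric.

```r
# Hypothetical toy data, not the data from the question.
toy <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30), pick = c(2L, 1L, 2L))

# Two-column index matrix: column 1 is the row number, column 2 is the
# column position to read in that row.
idx <- cbind(seq_len(nrow(toy)), toy$pick)

# Indexing a data frame with such a matrix returns one element per row.
vals <- toy[idx]
vals  # 10 2 30

# ave() returns a vector as long as its input, with each group's sum repeated.
grp <- c("x", "x", "y")
ave(vals, grp, FUN = sum)  # 12 12 30
```

The answer's code follows the same pattern on df1, with the column position looked up from the 'mode' value via match.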
nm1 <- setNames(names(df1)[4:6], 1:3)[as.character(df1$mode)]
i1 <- cbind(seq_len(nrow(df1)), match(nm1, names(df1)))
df1$utility <- ave(df1[i1], df1$SAMPN, df1$PERNO, df1$loop, FUN = sum)
df1$utility
#[1] 10.4 10.4 10.4 19.0 19.0 8.0 8.0
Data
df1 <- structure(list(SAMPN = c(1L, 1L, 1L, 1L, 1L, 2L, 2L), PERNO = c(1L,
1L, 1L, 1L, 1L, 1L, 1L), loop = c(1L, 1L, 1L, 2L, 2L, 1L, 1L),
car = c(3.4, 3, 4, 14, 5, 1, 9), bus = c(2.5, 2, 2, 1, 8,
5, 4), walk = c(1.5, 1, 5, 3, 2, 5, 3), mode = c(1L, 2L,
3L, 1L, 1L, 3L, 3L)), class = "data.frame", row.names = c(NA,
-7L))
https://stackoverflow.com/questions/58123186