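I have a data frame with two columns. The first column, prev_veg, holds the items of a single category seen so far (vegetables, in this example). The second column, new_item, holds the incoming grocery items, which come from different categories (meat, fruit, vegetables, and so on).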
library(tidyverse)
current <- tibble::tribble(
  ~prev_veg, ~new_item,
  "cabbage", "lettuce",
  NA, "apple",
  NA, "beef",
  NA, "spinach",
  NA, "broccoli",
  NA, "mango"
)
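current

I want to loop over the new_item column and add only the vegetables to prev_veg. Each new vegetable must be appended to the existing list. Importantly, I have a vector of all the vegetables that could appear in this list. The desired data frame is shown below.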
target_veg <- c("cabbage", "lettuce", "spinach", "broccoli")
desired <- tibble::tribble(
  ~prev_veg, ~new_item,
  "cabbage", "lettuce",
  "cabbage, lettuce", "apple",
  "cabbage, lettuce", "beef",
  "cabbage, lettuce", "spinach",
  "cabbage, lettuce, spinach", "broccoli",
  "cabbage, lettuce, spinach, broccoli", "mango"
)
desired

Finally, this data frame has several other columns that I have not included here (only the relevant ones are shown). Ideally, I am looking for a dplyr solution.
Posted on 2022-02-24 20:49:53
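This can also be done by using match to look up each item's index in target_veg, and then using rowwise to paste the corresponding elements together row by row.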
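library(dplyr)
library(tidyr)

current %>%
  # position of each new item in target_veg (NA if not a vegetable), shifted down one row
  mutate(ind = lag(match(new_item, target_veg))) %>%
  # carry the last known position forward, then back-fill the leading NA
  fill(ind, .direction = "downup") %>%
  rowwise() %>%
  # paste the first `ind` vegetables into one comma-separated string
  mutate(ind = toString(target_veg[seq(ind)])) %>%
  ungroup() %>%
  # keep an existing prev_veg, otherwise use the generated string; drop ind
  mutate(prev_veg = coalesce(prev_veg, ind), .keep = "unused")

Output: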
# A tibble: 6 × 2
  prev_veg                            new_item
  <chr>                               <chr>
1 cabbage                             lettuce
2 cabbage, lettuce                    apple
3 cabbage, lettuce                    beef
4 cabbage, lettuce                    spinach
5 cabbage, lettuce, spinach           broccoli
6 cabbage, lettuce, spinach, broccoli mango

Note: rowwise may be slower than the accumulate approach from @IceCreamToucan.
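For comparison, here is a minimal sketch of what an accumulate-based version could look like. This is not @IceCreamToucan's actual code (that answer is not reproduced here), just one way to write it with purrr::accumulate. The names add_if_veg, running and result are made up for illustration. It assumes the first row of prev_veg holds the starting list and that no vegetable shows up twice in new_item (a repeat would be appended again). Unlike the match version, which takes the first ind elements of target_veg, this one should not depend on the order of target_veg.

```r
library(dplyr)
library(purrr)

current <- tibble::tribble(
  ~prev_veg, ~new_item,
  "cabbage", "lettuce",
  NA, "apple",
  NA, "beef",
  NA, "spinach",
  NA, "broccoli",
  NA, "mango"
)
target_veg <- c("cabbage", "lettuce", "spinach", "broccoli")

# Hypothetical helper: append `item` to the running string only if it is a vegetable
add_if_veg <- function(seen, item) {
  if (item %in% target_veg) paste(seen, item, sep = ", ") else seen
}

# Running list after each row, seeded with the first prev_veg value.
# With .init the result has one more element than new_item.
running <- accumulate(current$new_item, add_if_veg, .init = current$prev_veg[1])

# prev_veg for a row is the list *before* that row's new_item is processed,
# so drop the last element of `running`
result <- current %>%
  mutate(prev_veg = head(running, -1))

result
```

On the sample data this should give the same prev_veg column as the output above. Since it only overwrites prev_veg, any other columns in the data frame are left as they are.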
https://stackoverflow.com/questions/71257290