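I am trying to expand rows using a banded variable. Below are the input data and the desired output. Basically, I want to expand my rows by tonnage: in the input data set, each tonnage value covers a wider band than the bands I need in the output.
I tried expand.grid and splitstackshape but could not get anywhere. Please help.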


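Posted on 2021-06-07 13:00:37
One solution is to create a master table of your bands. You could wrap the logic in a custom function and call it with lapply; I used a for loop so that the steps are easy to debug and follow.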
library(tidyverse)

bands <- c(0, 7.5, 12, 20, 40, Inf)

# Master df of all the bands: one row per (lower bound, upper bound) pair
df_bands_master <- tibble(
  lb = head(bands, -1),
  ub = bands[-1]
)

# Your input df
df <- tibble(
  Key = c("FDE", "GED"),
  Tonnage = c("0-40", "7.5-40")
)

# Your output df
df_final <- tibble()

for (i in seq_len(nrow(df))) {
  temp_row <- df %>% slice(i)
  # Split "lower-upper" into a numeric vector of length 2
  temp_full_band <- as.numeric(str_split(temp_row$Tonnage, pattern = "-") %>% pluck(1))
  # Largest band edge that is <= the lower limit
  temp_min_band <- bands[which(temp_full_band[1] >= bands)] %>% pluck(length(.))
  # Last band edge that is >= the upper limit; because bands ends in Inf this
  # is always Inf, so the open-ended top band is kept
  temp_max_band <- bands[which(temp_full_band[2] <= bands)] %>% pluck(length(.))
  df_final <- df_final %>%
    bind_rows(
      df_bands_master %>%
        filter(lb >= temp_min_band, ub <= temp_max_band) %>%
        unite(Tonnage_Split, c("lb", "ub"), remove = TRUE, sep = "-") %>%
        mutate(Key = temp_row$Key)
    )
}

df_final %>%
  select(Key, Tonnage_Split)
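
The answer mentions that a custom function with lapply would also work. A minimal base R sketch of that idea follows. The helper name expand_band is hypothetical, and the sketch assumes that both limits of every range fall exactly on band edges. Unlike the loop above, it stops at the stated upper limit and does not add the open-ended top band.

```r
bands <- c(0, 7.5, 12, 20, 40, Inf)

# Hypothetical helper: expand one "lower-upper" range into the bands it covers.
# Assumes both limits coincide with entries of `bands`.
expand_band <- function(key, tonnage, bands) {
  limits <- as.numeric(strsplit(as.character(tonnage), "-", fixed = TRUE)[[1]])
  lb <- head(bands, -1)  # lower edge of each band
  ub <- bands[-1]        # upper edge of each band
  keep <- lb >= limits[1] & ub <= limits[2]
  data.frame(
    Key = rep(as.character(key), sum(keep)),
    Tonnage_Split = paste(lb[keep], ub[keep], sep = "-")
  )
}

df <- data.frame(Key = c("FDE", "GED"), Tonnage = c("0-40", "7.5-40"))

# One small data frame per input row, stacked at the end
pieces <- lapply(seq_len(nrow(df)), function(i) {
  expand_band(df$Key[i], df$Tonnage[i], bands)
})
df_expanded <- do.call(rbind, pieces)
```

Building the pieces in a list and binding once avoids growing a data frame inside a loop, which matters mostly when the input has many rows.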

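You can change the output so that the last band reads 40+; I will leave that to you. Hint: combine mutate and ifelse inside the for loop when binding the rows.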
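
One way the hint could play out is sketched below. It relabels after the loop rather than inside it, and the small df_final here is a hypothetical stand-in for the loop's result, assuming the open-ended band shows up as a label ending in "-Inf".

```r
library(dplyr)

# Hypothetical stand-in for the df_final built by the loop
df_final <- tibble(
  Key = c("FDE", "FDE", "GED"),
  Tonnage_Split = c("0-7.5", "40-Inf", "40-Inf")
)

# Rewrite labels such as "40-Inf" to "40+"; leave all other labels unchanged
df_labelled <- df_final %>%
  mutate(
    Tonnage_Split = ifelse(
      grepl("-Inf$", Tonnage_Split),
      paste0(sub("-Inf$", "", Tonnage_Split), "+"),
      Tonnage_Split
    )
  )
```

The same mutate call could instead be placed in the pipeline inside the loop, right after unite, which is closer to what the hint describes.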
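Posted on 2021-06-07 14:18:07
Here is a suggestion. I wrote a function with three arguments: start, end, and cutoffs. The function is then run once per row. The key is to use cut() to generate the strings for the bands.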
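
To see what cut() contributes on its own, here is a small base R sketch of just the label step. The variable names raw_labels and band_labels are illustrative and do not appear in the answer's code.

```r
cutoffs <- c(0, 7.5, 12, 20, 40)

# cut() is called only for its level labels, so the value being cut (1)
# does not matter; the levels come out in the form "(lower,upper]"
raw_labels <- levels(cut(1, cutoffs))

# Strip the brackets, then swap the comma for a dash
band_labels <- sub(",", "-", gsub("\\(|\\]", "", raw_labels), fixed = TRUE)
```

Note that cut() interprets a single number passed as breaks as a count of intervals rather than a cutoff, so this trick needs at least two cutoffs.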
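library(tidyverse)

dat <- data.frame(key = c("FDE", "GED"), tonnage = c("0-40", "7.5-40"))

split_numbers <- function(start, end, cutoffs) {
  # Keep only the cutoffs that lie inside [start, end]
  cutoffs <- cutoffs[between(cutoffs, start, end)]
  # cut() is used only for its level labels, e.g. "(0,7.5]"
  cut(1, cutoffs) %>%
    levels() %>%
    str_remove_all("\\(|\\]") %>%
    str_replace(",", "-") %>%
    c(paste0(end, "+"))  # append the open-ended top band
}

dat %>%
  # convert = TRUE turns start and end into numbers so that between()
  # compares them numerically
  separate(tonnage, c("start", "end"), sep = "-", convert = TRUE) %>%
  group_by(key) %>%
  summarise(
    tonnage_split = list(split_numbers(start, end, c(0, 7.5, 12, 20, 40)))
  ) %>%
  unnest(tonnage_split)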
# # A tibble: 9 x 2
#   key   tonnage_split
#   <chr> <chr>
# 1 FDE   0-7.5
# 2 FDE   7.5-12
# 3 FDE   12-20
# 4 FDE   20-40
# 5 FDE   40+
# 6 GED   7.5-12
# 7 GED   12-20
# 8 GED   20-40
# 9 GED   40+

https://stackoverflow.com/questions/67865475