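I have a column named "full_name" that lists two teams, for example "Man U to win Liverpool to win", "Liverpool to win Man U to win", "Chelsea to win Arsenal to win", and so on.
I would like to classify the teams as North or South: if the value is "Man U to win Liverpool to win" or "Liverpool to win Man U to win", it should be coded as "North", while "Chelsea to win Arsenal to win" should be coded as "South", and so on.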
levels(raw_data$full_name)[levels(raw_data$full_name)== "Man U to win Liverpool to win"] <- 'North'
levels(raw_data$full_name)[levels(raw_data$full_name)== "Liverpool to win Man U to win"] <- 'North'
levels(raw_data$full_name)[levels(raw_data$full_name)== "Chelsea to win Arsenal to win"] <- 'South'
The code above produces no errors, but the data frame stays unchanged and the desired output is not produced. Is there a way to do this?
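One possible explanation for the silent no-op, offered only as an assumption because the question does not show the column's type: if full_name is a character column rather than a factor, levels() returns NULL, so there are no levels to match and nothing gets renamed. The sketch below uses a hypothetical two-row data frame, toy_data, to show the check and the conversion to a factor before the levels are renamed.

```r
# Hypothetical toy data (not from the question): a character column.
toy_data <- data.frame(
  full_name = c("Man U to win Liverpool to win",
                "Chelsea to win Arsenal to win"),
  stringsAsFactors = FALSE
)

# A character vector has no levels, so levels() returns NULL.
no_levels <- is.null(levels(toy_data$full_name))

# Convert to a factor first; after that the levels can be renamed in place.
toy_data$full_name <- factor(toy_data$full_name)
levels(toy_data$full_name)[levels(toy_data$full_name) == "Man U to win Liverpool to win"] <- "North"
levels(toy_data$full_name)[levels(toy_data$full_name) == "Chelsea to win Arsenal to win"] <- "South"

recoded <- as.character(toy_data$full_name)
```

Running class(raw_data$full_name) on the real data would confirm or rule out this explanation.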
Posted on 2022-11-27 18:33:26
Here is an example of a tidyverse approach that may help.
library(dplyr)
north <- c("Man U to win Liverpool to win","Liverpool to win Man U to win")
south <- c("Chelsea to win Arsenal to win")
df <- data.frame(full_name = sample(c(north, south), size = 5, replace = TRUE))
df %>%
  mutate(region = case_when(
    full_name %in% north ~ "North",
    full_name %in% south ~ "South"
  ))
                      full_name region
1 Chelsea to win Arsenal to win  South
2 Man U to win Liverpool to win  North
3 Chelsea to win Arsenal to win  South
4 Man U to win Liverpool to win  North
5 Man U to win Liverpool to win  North
Posted on 2022-11-27 18:34:31
Here is an option with fct_recode:
library(forcats)
raw_data$full_name <- with(raw_data, fct_recode(full_name,
  North = "Man U to win Liverpool to win",
  North = "Liverpool to win Man U to win",
  South = "Chelsea to win Arsenal to win"))
Or using base R:
# Returns the recoded factor; assign it back to raw_data$full_name to keep it
factor(raw_data$full_name,
       levels = c("Chelsea to win Arsenal to win",
                  "Liverpool to win Man U to win",
                  "Man U to win Liverpool to win"),
       labels = c("South", "North", "North"))
Or, if we want to use levels:
# Old level names and their replacements, in matching order
lvls_to_change <- c("Man U to win Liverpool to win",
                    "Liverpool to win Man U to win",
                    "Chelsea to win Arsenal to win")
lvsl_new <- c("North", "North", "South")
# Which of the existing levels should be renamed
i1 <- levels(raw_data$full_name) %in% lvls_to_change
levels(raw_data$full_name)[i1] <- lvsl_new[match(levels(raw_data$full_name)[i1], lvls_to_change)]
Data
raw_data <- structure(list(full_name = structure(c(2L, 2L, 3L, 2L,
1L), levels = c("Chelsea to win Arsenal to win",
"Liverpool to win Man U to win", "Man U to win Liverpool to win"
), class = "factor")), row.names = c(NA, -5L), class = "data.frame")
Posted on 2022-11-27 18:39:12
Here is another approach:
library(dplyr)
library(stringr)
north <- c("Liverpool|Man")
south <- c("Chelsea|Arsenal")
# df is a data frame with a full_name column, as in the first answer
df %>%
  mutate(region = case_when(str_detect(full_name, north) ~ "North",
                            str_detect(full_name, south) ~ "South",
                            TRUE ~ NA_character_))
                      full_name region
1 Liverpool to win Man U to win  North
2 Chelsea to win Arsenal to win  South
3 Man U to win Liverpool to win  North
4 Chelsea to win Arsenal to win  South
5 Liverpool to win Man U to win  North
https://stackoverflow.com/questions/74592707