I have a list of different mask types, and I want to classify each one as N95, Surgical, Cloth, or Other.
df <- data.frame(mask_type =
  c("Surgical Mask (3M 1800)",
    "N95 FFR (Wilson 1105N) (2x 3mm leaks)",
    "N95 FFR (San Huei United Company 1895N) (2x 3mm leaks)",
    "Surgical Mask (Primed PG4-1073) (2x 3mm leaks)",
    "Surgical Mask (3M 1800) (2x 3mm leaks)",
    "N95 FFR (Wilson 1105N) (4x 3mm leaks)",
    "Cloth FFR (San Huei United Company 1895N) (4x 3mm leaks)",
    "Cloth Mask (Primed PG4-1073) (4x 3mm leaks)"))
The code below filters the masks, but it doesn't create an "Other" category. Am I far off?
require(dplyr)
require(tidyr)
df %>%
mutate(TYPE=stringr::str_detect(mask_type,"N95 | surgical | cloth")) %>%
filter(TYPE == TRUE) %>%
select(mask_type)
Posted on 2021-02-19 18:24:50
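Use str_extract to pull out whichever of the patterns in 'Surgical|N95|Cloth' is present in the string. If none of them is present, it returns NA, which can then be replaced with 'Other'.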
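As a quick check of the NA behaviour described here, a minimal sketch on a hypothetical three-element vector (the values of `x` are made up for illustration and are not rows from the question's data frame):

```r
library(stringr)

# Hypothetical sample values, chosen to cover a match, a lowercase match, and no match
x <- c("N95 FFR (Wilson 1105N)", "cloth mask", "Face shield")

# str_extract returns the first match in each element, or NA when nothing matches
type <- str_extract(x, regex("Surgical|N95|Cloth", ignore_case = TRUE))

# Replace the NA entries with "Other"
type[is.na(type)] <- "Other"
type
```

Because the match is case-insensitive, the extracted text keeps whatever case appears in the input. If the real data mixes cases, normalising the result (for example with `str_to_title`) may be worth considering.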
library(dplyr)
library(stringr)
df %>%
mutate(TYPE= str_extract(mask_type, regex('Surgical|N95|Cloth', ignore_case = TRUE)),
       TYPE = replace(TYPE, is.na(TYPE), 'Other'))
Posted on 2021-02-20 02:42:19
We can also do this in base R:
lst1 <- with(df, regmatches(mask_type, gregexpr('Surgical|N95|Cloth', mask_type)))
df$TYPE <- sapply(lst1, function(x) if(length(x) == 0) 'Other' else x)
https://stackoverflow.com/questions/66275681