I am trying to remove the terms "Arts and Humanities" and "Social Sciences" from a string that lists different disciplines separated by "/", as follows:
string = "Arts and Humanities Other Topics/Social Sciences Other Topics/Arts and Humanities/Social Sciences/Sociology"
I tried this with the stringr package:
sapply(strsplit(string, "/"), function(x) paste(str_remove(x, "\\bArts and Humanities\\b|\\bSocial Sciences\\b"), collapse = "/"))
But the resulting output is " Other Topics/ Other Topics///Sociology", whereas I need this output:
"Arts and Humanities Other Topics/Social Sciences Other Topics/Sociology"
Thanks in advance.
Answered on 2022-01-05 20:20:05
It only needs a slight tweak, and it now generalizes to strings, a vector of such strings.
Solution
sapply(
  # Split each string by "/" into its components.
  X = strsplit(x = strings, split = "/"),
  # Remove undesired components and then reassemble the strings.
  FUN = function(v){paste0(
    # Use subscripting to filter out matches.
    v[!grepl(x = v, pattern = "^\\s*(Arts and Humanities|Social Sciences)\\s*$")],
    # Reassemble components as separated by "/".
    collapse = "/"
  )},
  # Make the result a vector like the original 'string' (rather than a list).
  simplify = TRUE,
  USE.NAMES = FALSE
)
Result
Given a vector strings like this
strings <- c(
  "Arts and Humanities Other Topics/Social Sciences Other Topics/Arts and Humanities/Social Sciences/Sociology",
  "Sociology/Arts and Humanities"
)
this solution should produce the following result:
[1] "Arts and Humanities Other Topics/Social Sciences Other Topics/Sociology"
[2] "Sociology"
Note
Solutions that use unlist() will collapse everything into one giant string, rather than reassembling each string in strings separately.
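To make the note concrete, here is a minimal sketch that runs both approaches on the example vector. The names unwanted, parts, collapsed and per_string are introduced only for this illustration, and the per-string version uses a plain %in% filter with vapply() as a stand-in rather than the regex solution itself.

```r
# Example vector from above; 'unwanted' is a helper name used only in this sketch.
strings <- c(
  "Arts and Humanities Other Topics/Social Sciences Other Topics/Arts and Humanities/Social Sciences/Sociology",
  "Sociology/Arts and Humanities"
)
unwanted <- c("Arts and Humanities", "Social Sciences")

# unlist() merges the components of every string into one flat vector,
# so the per-string boundaries are lost and paste0() returns a single string.
parts <- unlist(strsplit(strings, "/"))
collapsed <- paste0(parts[!parts %in% unwanted], collapse = "/")

# Filtering each list element separately keeps one result per input string.
per_string <- vapply(
  strsplit(strings, "/"),
  function(v) paste0(v[!v %in% unwanted], collapse = "/"),
  character(1)
)

length(collapsed)   # 1
length(per_string)  # 2
```

Unlike the regex above, the %in% filter only drops exact matches, so a component padded with spaces would survive it.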
Answered on 2022-01-05 20:08:26
One approach is to split the whole string into its parts and then exclude the parts you are not interested in:
paste0(unlist(strsplit(string, '/'))[!unlist(strsplit(string, '/')) %in% c("Arts and Humanities", "Social Sciences")],
       collapse = '/')
or
paste0(base::setdiff(unlist(strsplit(string, '/')),
                     c("Arts and Humanities", "Social Sciences")), collapse = '/')
#"Arts and Humanities Other Topics/Social Sciences Other Topics/Sociology"
https://stackoverflow.com/questions/70598940
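One difference between the two variants in this answer is worth knowing: setdiff() treats its arguments as sets, so it also drops repeated components, while the %in% filter keeps them. The sketch below uses a made-up input in which "Sociology" appears twice; the names parts, unwanted, kept_in and kept_setdiff are assumptions for illustration only.

```r
# Hypothetical input in which "Sociology" appears twice.
string <- "Sociology/Arts and Humanities/Sociology"
parts <- unlist(strsplit(string, "/"))
unwanted <- c("Arts and Humanities", "Social Sciences")

# %in% filtering keeps every remaining component, duplicates included.
kept_in <- paste0(parts[!parts %in% unwanted], collapse = "/")

# setdiff() returns unique values, so the repeated component is dropped.
kept_setdiff <- paste0(setdiff(parts, unwanted), collapse = "/")

kept_in       # "Sociology/Sociology"
kept_setdiff  # "Sociology"
```

If the discipline lists never repeat an entry, the two variants give the same result.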