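I can't seem to get my head around a seemingly simple task: how to filter rows based on a pattern in one column, but only when a pattern in another column does not also match.
Data: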
df <- data.frame(
  Speaker = c("A", NA, "B", "C", "A", "B", "A", "B", "C"),
  Utterance = c("uh-huh",
                "(0.666)",
                "WOW!",
                "#yeah#",
                "=right=",
                "oka::y¿",
                "okay",
                "some stuff",
                "!more! £TAlk£"),
  Orthographic = c("uh-huh", "NA", "wow", "yeah", "right", "okay", "okay", "some stuff", "more talk")
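)
I want to remove from df the rows in which the pattern ^(yeah|okay|right|mhm|mm|uh(-| )?huh)$ matches in column Orthographic, but keep those rows if column Utterance contains any character from the character class [A-Z:↑↓£#¿?!].
Expected result: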
df
  Speaker     Utterance Orthographic
3       B          WOW!          wow
4       C        #yeah#         yeah
6       B       oka::y¿         okay
8       B    some stuff   some stuff
9       C !more! £TAlk£    more talk
Attempt so far (this filters out too much!):
library(dplyr)
df %>%
  filter(!is.na(Speaker)) %>%
  filter(!grepl("^(yeah|okay|right|mhm|mm|uh(-| )?huh)$", Orthographic)
         & grepl("[A-Z:↑↓£#¿?!]", Utterance))
  Speaker     Utterance Orthographic
1       B          WOW!          wow
2       C !more! £TAlk£    more talk
Posted on 2021-04-30 09:45:59
I think you need |:
library(dplyr)
df %>%
  filter(!is.na(Speaker)) %>%
  filter(!grepl("^(yeah|okay|right|mhm|mm|uh(-| )?huh)$", Orthographic)
         | grepl("[A-Z:↑↓£#¿?!]", Utterance))
#  Speaker     Utterance Orthographic
#1       B          WOW!          wow
#2       C        #yeah#         yeah
#3       B       oka::y¿         okay
#4       B    some stuff   some stuff
#5       C !more! £TAlk£    more talk
This keeps rows whose Orthographic does not match ^(yeah|okay|right|mhm|mm|uh(-| )?huh)$, or whose Utterance contains any character from [A-Z:↑↓£#¿?!].
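Why | rather than & can be seen by writing the removal condition out explicitly: a row should be dropped when Orthographic matches the pattern and Utterance has none of the special characters, and negating that gives "no match, or has a special character". The following is a minimal base-R sketch of that reasoning, assuming the df from the question; the names is_filler, is_marked, to_drop, and result are illustrative and do not appear in the original post.

```r
# Data frame copied from the question.
df <- data.frame(
  Speaker = c("A", NA, "B", "C", "A", "B", "A", "B", "C"),
  Utterance = c("uh-huh", "(0.666)", "WOW!", "#yeah#", "=right=",
                "oka::y¿", "okay", "some stuff", "!more! £TAlk£"),
  Orthographic = c("uh-huh", "NA", "wow", "yeah", "right", "okay",
                   "okay", "some stuff", "more talk")
)

# TRUE where Orthographic consists only of one of the listed words.
is_filler <- grepl("^(yeah|okay|right|mhm|mm|uh(-| )?huh)$", df$Orthographic)
# TRUE where Utterance contains at least one character from the class.
is_marked <- grepl("[A-Z:↑↓£#¿?!]", df$Utterance)

# Rows to drop: pattern matches AND no special character is present.
to_drop <- is_filler & !is_marked

# Negating the drop condition gives the | form used in the answer.
stopifnot(identical(!to_drop, !is_filler | is_marked))

# Keep rows with a known Speaker that are not marked for dropping.
result <- df[!is.na(df$Speaker) & !to_drop, ]
print(result)
```

Unlike the dplyr pipeline, base subsetting keeps the original row names, so the printed rows are labelled as in the expected result.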
https://stackoverflow.com/questions/67331605