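Building on this answer, how can we list the results in a more compact single column, for the case where we match against many patterns but expect only very few hits per string? (I'm not sure what the most orthodox format for the "hits" column is, whether a vector as below or a delimited string.)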
streets = c("Berberichweg", "Otto-Klemperer-Weg", "Feldmeierbogen" , "Altostraße")
streets = tolower(streets) #Lowercase all
names = c("Berber", "Weg")
names = tolower(names)
#The original solution and output
sapply(names, function (y) sapply(streets, function (x) grepl(y, x)))
#                   berber   weg
#berberichweg         TRUE  TRUE
#otto-klemperer-weg  FALSE  TRUE
#feldmeierbogen      FALSE FALSE
#altostraße          FALSE FALSE
#The desired output instead
#streets            hits
#berberichweg       c("berber", "weg")
#otto-klemperer-weg "weg"
#feldmeierbogen     NA
#altostraße         NA

Posted on 2021-04-06 12:16:26
res <- sapply(names, function (y) sapply(streets, function (x) grepl(y, x)))
res
#                    berber   weg
# berberichweg         TRUE  TRUE
# otto-klemperer-weg  FALSE  TRUE
# feldmeierbogen      FALSE FALSE
# altostraße          FALSE FALSE
dat <- data.frame(streets = streets)
dat$hits1 <- names[apply(res, 1, function(z) if (any(z)) which.max(z) else NA)]
dat
#              streets  hits1
# 1       berberichweg berber
# 2 otto-klemperer-weg    weg
# 3     feldmeierbogen   <NA>
# 4         altostraße   <NA>
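Note that hits1 keeps only the first matching pattern per street. A minimal sketch of why, using a hypothetical toy row `z` (not part of the original code) shaped like one row of `res`: on a logical vector, `which.max()` gives the position of the first `TRUE`, while `which()` gives all of them. It also shows why the `any(z)` guard is needed.

```r
# Toy logical row, shaped like the row of `res` for a street matching both patterns
z <- c(berber = TRUE, weg = TRUE)

first_hit <- which.max(z)  # position of the first TRUE only
all_hits  <- which(z)      # positions of every TRUE

names(first_hit)  # "berber"
names(all_hits)   # "berber" "weg"

# On an all-FALSE row which.max() still returns position 1,
# which is why the code above checks any(z) before using it
no_hit <- which.max(c(berber = FALSE, weg = FALSE))
unname(no_hit)    # 1
```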
dat$hits1
# [1] "berber" "weg"    NA       NA
If you want a single string per result, then perhaps:
dat$hits2 <- apply(res, 1, function(z) toString(names(which(z))))
dat
#              streets  hits1       hits2
# 1       berberichweg berber berber, weg
# 2 otto-klemperer-weg    weg         weg
# 3     feldmeierbogen   <NA>
# 4         altostraße   <NA>
dat$hits2
# [1] "berber, weg" "weg"         ""            ""
Note that the first element is a single comma-delimited string, not a vector of strings. Another option is to use a list-column:
dat$hits3 <- apply(res, 1, function(z) names(which(z)))
dat
#              streets  hits1       hits2       hits3
# 1       berberichweg berber berber, weg berber, weg
# 2 otto-klemperer-weg    weg         weg         weg
# 3     feldmeierbogen   <NA>
# 4         altostraße   <NA>
dat$hits3
# $berberichweg
# [1] "berber" "weg"
# $`otto-klemperer-weg`
# [1] "weg"
# $feldmeierbogen
# character(0)
# $altostraße
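# character(0)
This is a list, which can be assigned into a data frame. Two things to note about this:
Getting a single "cell" out of hits3 requires [[ rather than [:
dat$hits1[1]
# [1] "berber"
dat$hits2[1]
# [1] "berber, weg"
dat$hits3[1]
# $berberichweg   # <-- this is a list of length 1, not a vector
# [1] "berber" "weg"
dat$hits3[[1]]
# [1] "berber" "weg"
Anything that works on this column must be list-aware.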
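As an illustration of what list-aware handling can look like, here is a minimal sketch under a few assumptions that are mine rather than the answer's: the pattern vector is renamed `pats` (it is `names` in the post), the list is built with `lapply()` instead of `apply()`, and matching uses `fixed = TRUE` so the patterns are treated as literal text. `lapply()` always returns a list, whereas `apply()` only does so when the per-row results differ in length.

```r
streets <- tolower(c("Berberichweg", "Otto-Klemperer-Weg", "Feldmeierbogen", "Altostraße"))
pats    <- tolower(c("Berber", "Weg"))  # hypothetical name; `names` in the post

# One character vector of matching patterns per street
hits <- lapply(streets, function(s) {
  pats[vapply(pats, grepl, logical(1), x = s, fixed = TRUE)]
})

dat <- data.frame(streets = streets)
dat$hits3 <- hits                       # list-column: one list element per row

# lengths() counts the hits in each cell
n_hits <- lengths(dat$hits3)

# Replace empty results with NA, matching the desired output in the question
dat$hits3[n_hits == 0] <- list(NA_character_)

# Element-wise test: which streets matched "weg"?
has_weg <- vapply(dat$hits3, function(h) "weg" %in% h, logical(1))
```

Functions such as `lengths()`, `lapply()` and `vapply()` operate on each cell in turn, which is what makes them suitable for a list-column; ordinary vectorised string functions generally are not.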
https://stackoverflow.com/questions/66967619