
Searching a data frame in R for a specific set of character values

Stack Overflow user
Asked 2018-01-03 18:17:13
1 answer, 288 views, 0 followers, 0 votes

I created a character vector that contains some missing values, as shown below:

bp <- rep(NA, 5)
bp[c(2,4)] <- c("sugar","milk")
bp

> bp
[1] NA  "sugar" NA  "milk" NA 

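I am looking for a way to use bp to search a larger data frame and find the sequences that match bp (along with their positions), but with the NAs filled in by whatever values happen to be there.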

For example:

[1] any1  "sugar" any2  "milk" any3 
[2] any2  "sugar" any5  "milk" any1 
[3] any6  "sugar" any1  "milk" any3 
[4] any8  "sugar" any7  "milk" any6
[5] any1  "sugar" any2  "milk" any3 

Edit: part of the data frame looks like this:

c("milk", "sugar", "sugar", "creme", "carw", "milk", "creme", "carw", 
"sugar", "carw", "creme", "sugar", "sugar", "milk", "milk", "creme", 
"sugar", "sugar", "carw", "carw", "carw", "milk", "sugar", "sugar", 
"carw", "sugar", "milk", "sugar", "creme", "carw", "carw", "carw", 
"creme", "carw", "carw", "creme", "creme", "milk", "carw", "milk", 
"milk", "creme", "creme", "creme", "milk", "milk", "creme", "carw", 
"carw", "milk", "milk", "creme", "creme", "carw", "carw", "milk", 
"sugar", "carw", "milk", "carw", "creme", "sugar", "sugar", "creme", 
"sugar", "sugar", "creme", "sugar", "carw", "sugar", "carw", 
"carw", "creme", "sugar", "milk", "milk", "carw", "carw", "milk", 
"creme", "sugar", "carw", "milk", "sugar", "sugar", "milk", "sugar", 
"creme", "milk", "milk", "carw", "milk", "sugar", "carw", "sugar", 
"carw", "creme", "creme", "carw", "milk", "milk", "milk", "milk", 
"carw", "carw", "milk", "milk", "carw", "sugar", "milk", "milk", 
"milk", "creme", "carw", "creme", "milk", "milk", "milk", "creme", 
"carw", "milk", "carw", "carw", "carw", "carw", "carw", "carw"
)

Normally I would use the following to search the whole data frame, but in this case it is tricky:

library(data.table)

n1 <- length(bp)
bp.pos <- setDT(data.frame)[, which(Reduce(`&`, Map(`==`,
                                shift(value1, seq(n1) - 1, type = "lead"),
                                bp)))]
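
For context, here is a minimal base R sketch of why the NAs make this approach tricky. Comparing anything with NA using == gives NA rather than TRUE or FALSE, so the NA slots of bp carry through the Reduce step and which() finds nothing. The vector x_demo is made up for illustration and is not taken from the data above.

```r
# Hypothetical toy data, not from the question
x_demo <- c("tea", "sugar", "salt", "milk", "rice")
bp <- c(NA, "sugar", NA, "milk", NA)

# Comparing against NA gives NA, not TRUE or FALSE
na_cmp <- NA == "tea"

# Element-wise comparison of the pattern with a window that "should" match:
# the NA slots of bp produce NA in the result
cmp <- bp == x_demo

# Combining with `&` keeps the NA, so which() reports no match
combined <- Reduce(`&`, as.list(cmp))
hits <- which(combined)

# Treating the NA slots of the pattern as wildcards fixes the window test
is_match <- all(cmp | is.na(bp))
```

Here combined is NA and hits is empty, while is_match is TRUE once the NA slots are treated as wildcards.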

Any help would be greatly appreciated.

1 Answer

Stack Overflow user (accepted answer)

Answered 2018-01-03 19:24:31

Here is my understanding of your question. I'll call the vector you shared x.

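# Compare each element of bp with the correspondingly shifted slice of x
# (one column per element of bp, one row per candidate start position)
test = sapply(seq_along(bp), function(i) bp[i] == x[i:(length(x) - length(bp) + i)])
# NA comparisons (the NA slots in bp) count as matches
test = test | is.na(test)
# Rows where every column matches are the starting positions
res = which(apply(test, 1, all))
# Expand each start into its full index range, then pull the values from x
res = lapply(res, function(s) s + seq_along(bp) - 1)
final = lapply(res, function(z) x[z])
names(final) = sapply(res, "[", 1)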

# $`11`
# [1] "creme" "sugar" "sugar" "milk"  "milk" 
# 
# $`12`
# [1] "sugar" "sugar" "milk"  "milk"  "creme"
# 
# $`56`
# [1] "milk"  "sugar" "carw"  "milk"  "carw" 
# 
# $`73`
# [1] "creme" "sugar" "milk"  "milk"  "carw" 
# 
# $`80`
# [1] "creme" "sugar" "carw"  "milk"  "sugar"
# 
# $`83`
# [1] "milk"  "sugar" "sugar" "milk"  "sugar"
# 
# $`86`
# [1] "milk"  "sugar" "creme" "milk"  "milk" 
# 
# $`108`
# [1] "carw"  "sugar" "milk"  "milk"  "milk" 

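The result is a named list in which each name is the starting index in x and each value is the matched vector. This gives you the "where" as well as the matches in a single object.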
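
The same idea can be wrapped in a reusable function. The following is only a sketch: find_pattern is a hypothetical helper name, not part of base R or any package, and x_demo is toy data made up for the example. It treats NA entries of the pattern as wildcards and compares only the non-NA slots.

```r
# Hypothetical helper: find every window of `x` that matches `pattern`,
# treating NA entries of `pattern` as wildcards.
find_pattern <- function(x, pattern) {
  n <- length(pattern)
  if (length(x) < n) return(list())
  starts <- seq_len(length(x) - n + 1)   # candidate start positions
  fixed <- which(!is.na(pattern))        # slots that must match exactly
  ok <- rep(TRUE, length(starts))
  for (j in fixed) {
    # Compare the j-th pattern slot against x shifted by j - 1
    ok <- ok & (x[starts + j - 1] == pattern[j])
  }
  ok[is.na(ok)] <- FALSE                 # an NA in x never matches a fixed slot
  hits <- starts[ok]
  out <- lapply(hits, function(s) x[s:(s + n - 1)])
  names(out) <- hits                     # names are the starting indices
  out
}

# Toy data made up for this sketch
x_demo <- c("a", "sugar", "b", "milk", "c", "sugar", "d", "milk", "e", "sugar")
bp <- c(NA, "sugar", NA, "milk", NA)
found <- find_pattern(x_demo, bp)
```

One design difference from the code above: because test | is.na(test) turns every NA comparison into a match, an NA inside x would also be accepted there, whereas this sketch accepts NA only in the pattern. Which behavior is wanted depends on the data.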

Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/48082983
