
Counting multiple occurrences of words in a text vector in R

Stack Overflow user

Asked on 2015-10-11 12:49:32

Answers: 1, Views: 740, Followers: 0, Votes: 0

I have the following:

text <- c('I am a human','It is an animal and not a human, I am a human','Cant think of something else to write','and and is am')
words <- c('and','am','is')

I want to count, for each element of text, the total number of occurrences of these words. The output should therefore be:

[1] 1 3 0 4

The code I am using is clearly not the most elegant:

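library(stringr)  # provides str_count()

# For each text, accumulate the number of matches of every word
TotalCount <- integer(length(text))
for (ii in seq_along(text)) {
    for (jj in seq_along(words)) {
        wordCount <- str_count(text[ii], words[jj])
        TotalCount[ii] <- wordCount + TotalCount[ii]
    }
}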

Is there a more efficient, more elegant, and better way to do this?

string

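1 Answer

Stack Overflow user

Accepted answer

Answered on 2015-10-11 12:53:10

You can use the str_count function from the stringr package, passing all the words as a single regular expression joined with "|".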

library(stringr)
text <- c('I am a human','It is an animal and not a human, I am a human','Cant think of something else to write','and and is am')
words <- c('and','am','is')
str_count(text, paste(words, collapse="|"))
# [1] 1 3 0 4
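To see how each total is made up, the counts can also be broken down per word. The following is only a sketch in base R: `count_matches` is a hypothetical helper introduced here for illustration (it is not part of stringr), built on `gregexpr` and `regmatches`.

```r
text <- c('I am a human',
          'It is an animal and not a human, I am a human',
          'Cant think of something else to write',
          'and and is am')
words <- c('and', 'am', 'is')

# Hypothetical helper (not from stringr): number of regex matches per string
count_matches <- function(x, pattern) {
  lengths(regmatches(x, gregexpr(pattern, x, perl = TRUE)))
}

# Matrix with one row per element of text and one column per word
per_word <- sapply(words, function(w) count_matches(text, w))
per_word

# Summing across the columns gives the totals asked for in the question
total <- rowSums(per_word)
total
# 1 3 0 4
```

The row sums agree with the single alternation pattern here because none of these three words can overlap one another. With words such as "a" and "am", per-word sums and the alternation count could differ, since the alternation consumes each stretch of text only once.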

Or, to count whole words only, wrap the alternation in word boundaries:

str_count(text, paste0("\\b(", paste(words, collapse = "|"), ")\\b"))
# [1] 1 3 0 4
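Both patterns give the same result on the sample data, so the difference only shows up when one of the words appears inside a longer word. The sketch below illustrates this in base R; `count_matches` is a hypothetical helper (not part of stringr) and `sample_text` is made-up input that does not come from the question.

```r
# Hypothetical helper (not from stringr): number of regex matches per string
count_matches <- function(x, pattern) {
  lengths(regmatches(x, gregexpr(pattern, x, perl = TRUE)))
}

words <- c('and', 'am', 'is')

# Plain alternation: also matches inside longer words
loose <- paste(words, collapse = "|")
# Alternation wrapped in word boundaries: whole words only
strict <- paste0("\\b(", loose, ")\\b")

# Made-up input: 'This', 'island' and 'sandy' contain the target words
sample_text <- c('This island is sandy', 'and and is am')

loose_counts <- count_matches(sample_text, loose)
strict_counts <- count_matches(sample_text, strict)

loose_counts
# 5 4
strict_counts
# 1 4
```

In the first string the plain alternation counts "is" in "This", "is" and "and" in "island", the word "is" itself, and "and" in "sandy", while the bounded pattern counts only the standalone "is". Which behaviour is wanted depends on whether substrings should count as occurrences.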

Votes: 1

Original content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/33065147
