tm package removeWords function concatenates words in R
Stack Overflow user
Asked 2021-07-20 15:41:58

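I am cleaning sample data with removeWords from the tm package, but removeWords joins the remaining words together after the removal. The result should be "environmental dead frog" and "environmental dead mouse". Can anyone point me in the right direction?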

library(tm)
dc<-c("environmental dead frog still","environmental dead mouse come")

manualremovelist<-c("the","does","doesn't","please","new","ok","one","cant",
                "doesnt","can","still","done","will","without","seen",
                "also","danfoss","case","doesn´t","due","need","occurs","made",
                "using","now","make","makes","needs","put","okay","sno","since","therefore",
                "found","milwaukee","probably","got","finally","isnt","per","two",
                "obvious","unable","must","nos","3nos","1no",".","phone","tel","attached",
                "given","find","have","see","be","give","do","come","use","make","get",
                "try","call","request")

dc<-removeWords(dc,manualremovelist)

# Actual output:
#[1] "environmentaldeadfrog"  "environmentaldeadmouse"
1 Answer

Stack Overflow user

Accepted answer

Posted 2021-07-20 16:03:02

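removeWords is meant to be applied to individual words. You can split each string into words, apply removeWords to the words of each phrase/sentence, and paste what is left back together.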

library(tm)

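# Split each string on whitespace, remove the listed words one token at a time,
# then rejoin the tokens with single spaces and trim the ends.
dc  <- sapply(strsplit(dc, '\\s+'), function(x)
        trimws(paste0(removeWords(x, manualremovelist), collapse = ' ')))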

dc

#[1] "environmental dead frog"  "environmental dead mouse"
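
A possible explanation for the glued words, offered as a sketch rather than a confirmed diagnosis: removeWords appears to join the word list into one regular expression wrapped in word boundaries, so the "." entry in manualremovelist would act as a wildcard and match the single space between two words. The base-R code below imitates that behaviour without loading tm. The helper remove_words_sketch and the shortened word list are hypothetical names introduced only for illustration, and the pattern is an assumption about how removeWords works internally.

```r
# Base-R sketch; the pattern imitates what tm::removeWords is assumed to
# build internally (an assumption, not taken from the question or answer).
dc <- c("environmental dead frog still", "environmental dead mouse come")
words <- c("still", "come", ".")  # shortened, hypothetical removal list

# Hypothetical helper: join the words into one alternation between \b anchors.
remove_words_sketch <- function(x, words) {
  pattern <- sprintf("(*UCP)\\b(%s)\\b", paste(words, collapse = "|"))
  gsub(pattern, "", x, perl = TRUE)
}

# "." is a regex wildcard, so it also matches a lone space between two words.
glued <- remove_words_sketch(dc, words)

# Dropping the "." entry keeps the spaces; trimws() removes the trailing one.
fixed <- trimws(remove_words_sketch(dc, setdiff(words, ".")))

print(glued)
print(fixed)
```

If that assumption holds, removing "." from manualremovelist (and stripping punctuation separately, for example with removePunctuation from tm) may be enough, without splitting the strings first. Note that removing a word from the middle of a sentence leaves a double space behind with either approach, which something like gsub("\\s+", " ", x) can squeeze.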
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/68451246
