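Suppose I have the following data frames. The first is a list of the most frequent words (table "imp"), and below it is a table of different car models. What I want to do is create a second column (named "words") that holds the most frequent words found in each model, sorted in decreasing order of frequency (as shown below).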
imp<-data.frame(word=c("ls","lxl","mec","hatch","bi"),frec=c(100,90,80,85,70))
word   frec
ls     100
lxl    90
mec    80
hatch  85
bi     70
table=data.frame(code=c(1,2,3,4,5),
                 model=c("hatch ls 1.0 8v",
                         " onix 2016 ls 1.0 ar condicionado + direcao hidraulica",
                         "onix hatch ls 1.0 8v flexpower 5p mec.",
                         "volvo xc bi turbo blindada",
                         "honda civic sedan lxl 1.8 flex 16v mec 4p aceita troca"),
                 words=c("ls hatch", "ls", "ls hatch", "bi", "lxl mec"))
code  model                                                   words
1     hatch ls 1.0 8v                                         ls hatch
2     onix 2016 ls 1.0 ar condicionado + direcao hidraulica   ls
3     onix hatch ls 1.0 8v flexpower 5p mec.                  ls hatch
4     volvo xc bi turbo blindada                              bi
5     honda civic sedan lxl 1.8 flex 16v mec 4p aceita troca  lxl mec

Posted on 2017-05-21 15:08:35
We can try:
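library(stringr)
# Extract every listed word from each model string, order the matches by
# decreasing frequency, and keep the top two.
sapply(str_extract_all(table$model, paste(imp$word, collapse="|")),
       function(x) paste(head(x[order(-imp$frec[match(x, imp$word)])], 2), collapse= " "))

https://stackoverflow.com/questions/44098346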
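The pattern "ls|lxl|mec|hatch|bi" matches substrings, so a listed word would also be picked up inside a longer word if one occurred. A minimal base-R sketch of an alternative that compares whole tokens instead is given here. It assumes the data from the question, stores the model strings in a plain vector called `models`, and uses a hypothetical helper `top_words()` that is not part of any package. Unlike the answer above, it also drops repeated words within a model.

```r
# Assumed inputs, copied from the question. stringsAsFactors = FALSE keeps
# the words as character vectors on older R versions.
imp <- data.frame(word = c("ls", "lxl", "mec", "hatch", "bi"),
                  frec = c(100, 90, 80, 85, 70),
                  stringsAsFactors = FALSE)
models <- c("hatch ls 1.0 8v",
            " onix 2016 ls 1.0 ar condicionado + direcao hidraulica",
            "onix hatch ls 1.0 8v flexpower 5p mec.",
            "volvo xc bi turbo blindada",
            "honda civic sedan lxl 1.8 flex 16v mec 4p aceita troca")

# top_words() is a hypothetical helper written for this illustration.
top_words <- function(model, imp, n = 2) {
  # Split on anything that is not a letter or digit, so "mec." yields "mec".
  tokens <- unique(strsplit(tolower(model), "[^[:alnum:]]+")[[1]])
  # Keep only the tokens that appear in the frequency list.
  hits <- tokens[tokens %in% imp$word]
  # Order the hits by decreasing frequency and keep at most n of them.
  hits <- hits[order(-imp$frec[match(hits, imp$word)])]
  paste(head(hits, n), collapse = " ")
}

words <- vapply(models, top_words, character(1), imp = imp, USE.NAMES = FALSE)
print(words)
# [1] "ls hatch" "ls"       "ls hatch" "bi"       "lxl mec"
```

The result can be attached to the original data frame with `table$words <- words`. The limit of two words per model mirrors the `head(..., 2)` call in the answer; pass a different `n` to change it.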