[1] NA NA
[3] NA NA
[5] "kilo130.9" "kilo5075.69"
[7] "kilo465" "kilo34.8"
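[9] "kilo607.195" "kilo1362.7"
Above is a column of a data frame that I copied and pasted from R. I ran the code below to remove the word "kilo" from the column, but it does not work. I get no error, yet the word "kilo" is not removed. This is the code I used: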
stopwords = readLines('stopwords.txt') #I put the word kilo in this txt file
x = df$Dist
x = removeWords(x,stopwords)
df$newdist <- x
What could be the reason?
Posted on 2018-05-31 07:37:45
removeWords() only removes whole words that match "kilo" exactly, with no other characters attached:
x <- c("kilo", "kilo2", "pound")
tm::removeWords(x, "kilo")
#> [1] ""      "kilo2" "pound"
Here is another option:
library("stringr")
x <- c(NA, NA, NA, NA, "kilo130.9", "kilo5075.69", "kilo465", "kilo34.8", "kilo607.195", "kilo1362.7")
str_replace(x, "kilo", "")
#> [1] NA NA NA NA "130.9" "5075.69" "465"
#> [8] "34.8" "607.195" "1362.7"
Posted on 2018-05-31 07:42:14
Here is a base R solution using gsub:
# Sample data
w <- c(
  NA, NA,
  NA, NA,
  "kilo130.9", "kilo5075.69",
  "kilo465", "kilo34.8",
  "kilo607.195", "kilo1362.7")
# Strings that should be deleted
stopwords <- c("kilo", "something")
# Build the alternation pattern "(kilo|something)" and strip it from each element
sapply(w, function(x)
  gsub(sprintf("(%s)", paste(stopwords, collapse = "|")), "", x))
# <NA> <NA> <NA> <NA> kilo130.9 kilo5075.69
# NA NA NA NA "130.9" "5075.69"
# kilo465 kilo34.8 kilo607.195 kilo1362.7
# "465" "34.8" "607.195" "1362.7"
https://stackoverflow.com/questions/50614585
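As a side note on the two answers above, here is a minimal base R sketch. It illustrates two points: gsub() is already vectorized and leaves NA values as NA, so the pattern can be applied to the whole vector without sapply(); and a whole-word pattern does not match "kilo130.9". The \\b pattern is only an illustrative stand-in for whole-word matching and is not claimed to be how tm implements removeWords(). The names dist, pattern, cleaned, and whole_word are hypothetical and do not come from the original posts.

```r
# Sample data mirroring the question's column (hypothetical name `dist`)
dist <- c(NA, NA, "kilo130.9", "kilo5075.69", "kilo465")

# Strings that should be deleted, joined into one alternation pattern
stopwords <- c("kilo", "something")
pattern <- paste(stopwords, collapse = "|")

# gsub() works on the whole vector at once and keeps NA as NA
cleaned <- gsub(pattern, "", dist)
print(cleaned)

# Whole-word matching: \\b needs a transition between a word character and a
# non-word character. "o" and "1" are both word characters, so there is no
# boundary after "kilo" in "kilo130.9" and that string is left unchanged.
whole_word <- gsub("\\bkilo\\b", "", c("kilo", "kilo130.9"))
print(whole_word)
```

If the cleaned values are meant to be used as numbers, as.numeric(cleaned) would be a natural next step, but that is an assumption about the asker's goal.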