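I want to replace NA values with random values. The data frame has columns like "Day of week", and I don't know how to complete it. I tried the missForest function, but it seems to work only on columns with integers. Do you know how I can complete all the columns?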
travel <- read.csv("https://openmv.net/file/travel-times.csv")
library(missForest)
summary(travel)
set.seed(82)
travel1 <- prodNA(travel, noNA = 0.2)
travel2 <- missForest(travel1)

Posted on 2020-11-28 16:10:24
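First, if you want empty strings ("") to be read in as NAs, you need to add the argument na.strings = "" to read.csv. Second, do you mean replacing the NA observations of a variable with other randomly chosen observations of the same variable? If so, consider the following procedure: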
travel <- read.csv("https://openmv.net/file/travel-times.csv", na.strings = "")
set.seed(82)
res <- data.frame(lapply(travel, function(x) {
  is_na <- is.na(x)
  replace(x, is_na, sample(x[!is_na], sum(is_na), replace = TRUE))
}))

res looks like this:
         Date StartTime DayOfWeek GoingTo Distance MaxSpeed AvgSpeed AvgMovingSpeed FuelEconomy TotalTime MovingTime Take407All Comments
 1   1/6/2012     16:37    Friday    Home    51.29    127.4     78.3           84.8         8.5      39.3       36.3         No Medium amount of rain
 2   1/6/2012     08:20    Friday     GSK    51.63    130.3     81.8           88.9         8.5      37.9       34.9         No Put snow tires on
 3   1/4/2012     16:17 Wednesday    Home    51.27    127.4     82.0           85.8         8.5      37.5       35.9         No Heavy rain
 4   1/4/2012     07:53 Wednesday     GSK    49.17    132.3     74.2           82.9        8.31      39.8       35.6         No Accident blocked 407 exit
 5   1/3/2012     18:57   Tuesday    Home    51.15    136.2     83.4           88.1        9.08      36.8       34.8         No Rain, rain, rain
 6   1/3/2012     07:57   Tuesday     GSK    51.80    135.8     84.5           88.8        8.37      36.8       35.0         No Backed up at Bronte
 7   1/2/2012     17:31    Monday    Home    51.37    123.2     82.9           87.3           -      37.2       35.3         No Pumped tires up: check fuel economy improved?
 8   1/2/2012     07:34    Monday     GSK    49.01    128.3     77.5           85.9           -      37.9       34.3         No Pumped tires up: check fuel economy improved?
 9 12/23/2011     08:01    Friday     GSK    52.91    130.3     80.9           88.3        8.89      39.3       36.0         No Police slowdown on 403
10 12/22/2011     17:19  Thursday    Home    51.17    122.3     70.6           78.1        8.89      43.5       39.3         No Start early to run a batch

Posted on 2020-11-30 10:07:35
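You can use the imputeTS package to insert random values into a time series. The function na_random does this. It works on numeric columns; other columns are left unchanged, which can be useful because you probably don't want random text in the Comments column.
You can call it like this: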
library("imputeTS")
na_random(yourData)

The function finds the lowest and highest value of each column and inserts random values between these bounds.
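To make the min/max behaviour described above concrete, here is a rough base R sketch of the same idea. It is not the imputeTS implementation: `fill_na_uniform` and the `toy` data frame are hypothetical names made up for illustration, and drawing from a uniform distribution is an assumption.

```r
# Hypothetical helper (not part of imputeTS): fill NAs in numeric columns
# with uniform random draws between the column's observed min and max.
# Non-numeric columns, and columns with nothing to fill, are returned as-is.
fill_na_uniform <- function(df) {
  df[] <- lapply(df, function(x) {
    if (!is.numeric(x) || !anyNA(x) || all(is.na(x))) {
      return(x)
    }
    is_na <- is.na(x)
    x[is_na] <- runif(sum(is_na),
                      min = min(x, na.rm = TRUE),
                      max = max(x, na.rm = TRUE))
    x
  })
  df
}

set.seed(82)
# Toy data standing in for the travel data set (values are illustrative only)
toy <- data.frame(
  Distance  = c(51.29, NA, 49.17, 52.91, NA),
  DayOfWeek = c("Friday", NA, "Monday", "Tuesday", "Friday")
)
filled <- fill_na_uniform(toy)
```

In this sketch the NA in DayOfWeek is deliberately left alone, mirroring the "numeric columns only" behaviour described above; a text column would need a different strategy, such as sampling from its observed values.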
However, you can also define your own bounds for the random values, like this:
library("imputeTS")
na_random(yourData, lower_bound = 0, upper_bound = 25)

For your data, it could look like this:
library("imputeTS")
# To read the input correctly and have the right data types
travel <- read.csv("https://openmv.net/file/travel-times.csv", na.strings = "")
travel$FuelEconomy <- as.numeric(travel$FuelEconomy)
# To perform the missing data replacement
travel <- na_random(travel)

Source: https://stackoverflow.com/questions/65051396