
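How to eliminate "NA/NaN/Inf in foreign function call (arg 7)" when running predict with randomForest

Stack Overflow user

Asked 2014-02-23 12:05:40

2 answers · 86.2K views · 0 followers · 16 votes

I have researched this extensively without finding a solution. I cleaned my data set as follows: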

library("raster")
impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x) , 
mean(x, na.rm = TRUE))
losses <- apply(losses, 2, impute.mean)
colSums(is.na(losses))
isinf <- function(x) (NA <- is.infinite(x))
infout <- apply(losses, 2, is.infinite)
colSums(infout)
isnan <- function(x) (NA <- is.nan(x))
nanout <- apply(losses, 2, is.nan)
colSums(nanout)

The problem appears when I run the prediction:

options(warn=2)
p  <-   predict(default.rf, losses, type="prob", inf.rm = TRUE, na.rm=TRUE, nan.rm=TRUE)

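Everything I have read says the cause should be NA, Inf, or NaN values in the data, but I cannot find any. I made the data and the randomForest summary available for sleuthing at [deleted]. The traceback does not reveal much (to me, anyway):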

4: .C("classForest", mdim = as.integer(mdim), ntest = as.integer(ntest), 
       nclass = as.integer(object$forest$nclass), maxcat = as.integer(maxcat), 
       nrnodes = as.integer(nrnodes), jbt = as.integer(ntree), xts = as.double(x), 
       xbestsplit = as.double(object$forest$xbestsplit), pid = object$forest$pid, 
       cutoff = as.double(cutoff), countts = as.double(countts), 
       treemap = as.integer(aperm(object$forest$treemap, c(2, 1, 
           3))), nodestatus = as.integer(object$forest$nodestatus), 
       cat = as.integer(object$forest$ncat), nodepred = as.integer(object$forest$nodepred), 
       treepred = as.integer(treepred), jet = as.integer(numeric(ntest)), 
       bestvar = as.integer(object$forest$bestvar), nodexts = as.integer(nodexts), 
       ndbigtree = as.integer(object$forest$ndbigtree), predict.all = as.integer(predict.all), 
       prox = as.integer(proximity), proxmatrix = as.double(proxmatrix), 
       nodes = as.integer(nodes), DUP = FALSE, PACKAGE = "randomForest")
3: predict.randomForest(default.rf, losses, type = "prob", inf.rm = TRUE, 
       na.rm = TRUE, nan.rm = TRUE)
2: predict(default.rf, losses, type = "prob", inf.rm = TRUE, na.rm = TRUE, 
       nan.rm = TRUE)
1: predict(default.rf, losses, type = "prob", inf.rm = TRUE, na.rm = TRUE, 
       nan.rm = TRUE)

runtime-error

random-forest

predict

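2 Answers

Stack Overflow user

Accepted answer

Answered 2014-02-23 12:50:54

Your code is not entirely reproducible (it never runs the actual randomForest algorithm), but you are not replacing the Inf values with the column means. The na.rm = TRUE argument in the call to mean() inside your impute.mean function does exactly what it says: it removes NA values, not Inf values.

You can see this, for example, with: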

impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x), mean(x, na.rm = TRUE))
losses <- apply(losses, 2, impute.mean)
sum( apply( losses, 2, function(.) sum(is.infinite(.))) )
# [1] 696
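
To make the mechanism concrete, here is a minimal sketch on a toy vector (the vector `x` and its values are made up for illustration, not taken from the asker's data). Because `mean(x, na.rm = TRUE)` keeps Inf, the replacement value is itself Inf, so the "imputed" vector still contains infinite entries. Counting the arguments of the `.C("classForest", ...)` call in the traceback, the seventh appears to be `xts = as.double(x)`, the new data handed to predict, which fits this explanation.

```r
# Same definition as in the question.
impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x),
                                   mean(x, na.rm = TRUE))

# Hypothetical toy column with one NA and one Inf.
x <- c(1, 2, NA, Inf)

# na.rm = TRUE drops the NA but keeps Inf, so the mean is Inf.
m <- mean(x, na.rm = TRUE)

# Both bad entries are replaced by Inf rather than by a finite mean.
imputed <- impute.mean(x)

print(m)
print(sum(is.infinite(imputed)))
```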

To get rid of the infinite values, use:

impute.mean <- function(x) replace(x, is.na(x) | is.nan(x) | is.infinite(x), mean(x[!is.na(x) & !is.nan(x) & !is.infinite(x)]))
losses <- apply(losses, 2, impute.mean)
sum(apply( losses, 2, function(.) sum(is.infinite(.)) ))
# [1] 0
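
The same idea can be written more compactly with `is.finite()`, which is FALSE for NA, NaN, Inf, and -Inf alike. The sketch below is only one possible variant: the helper name `impute_finite_mean` and the data frame `toy` are hypothetical. It uses `lapply()` rather than `apply()` so the result stays a data frame instead of being coerced to a matrix.

```r
# Hypothetical helper: replace every non-finite entry with the mean
# of the finite entries in the same column.
impute_finite_mean <- function(x) {
  ok <- is.finite(x)        # FALSE for NA, NaN, Inf and -Inf
  x[!ok] <- mean(x[ok])
  x
}

# Made-up data standing in for the asker's `losses`.
toy <- data.frame(u = c(1, NA, 3, Inf),
                  v = c(-Inf, 2, NaN, 4))

# Assigning into toy[] keeps the data.frame structure.
toy[] <- lapply(toy, impute_finite_mean)

# Count of remaining non-finite values per column.
print(colSums(!is.finite(as.matrix(toy))))
```

A column with no finite values at all would still end up as NaN, so that case needs separate handling.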

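16 votes

Stack Overflow user

Answered 2016-01-08 10:30:34

One cause of the error message

NA/NaN/Inf in foreign function call (arg X)

when training a randomForest is having character-class variables in your data.frame. If it comes with the warning

NAs introduced by coercion

check that all of your character variables have been converted to factors.

Example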

set.seed(1)
dat <- data.frame(
  a = runif(100),
  b = rpois(100, 10),
  c = rep(c("a","b"), 50),   # 100 values, matching the length of a and b
  stringsAsFactors = FALSE   # c stays a character column
)

library(randomForest)
randomForest(a ~ ., data = dat)

Yields:

Error in randomForest.default(m, y, ...) : 
  NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In data.matrix(x) : NAs introduced by coercion

But switch it to stringsAsFactors = TRUE and it runs.

13 votes
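
If the data frame already exists, or comes from a reader where `stringsAsFactors` is not under your control (since R 4.0.0 the default in `data.frame()` is FALSE), the character columns can be converted afterwards. This is a minimal sketch with a made-up data frame `dat2`, not the only way to do it:

```r
# Hypothetical data frame with one character column.
dat2 <- data.frame(
  a = c(0.1, 0.5, 0.9, 0.3),
  c = c("a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# Find the character columns and convert each one to a factor.
is_chr <- vapply(dat2, is.character, logical(1))
dat2[is_chr] <- lapply(dat2[is_chr], factor)

# Column c should now be a factor with levels "a" and "b".
str(dat2)
```

The same conversion would need to be applied to any new data passed to predict, so that its columns have the same types as the training data.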

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/21964078
