Thanks in advance for your help. I am trying to implement a deep neural network to predict several variables (a kind of multivariate nonlinear regression). As a first step, I looked at the Darch package in R and worked through
http://cran.r-project.org/web/packages/darch/darch.pdf
When I run the following code from p. 10, which appears to train on XOR, the resulting neural network seems unable to learn the function. It learns either the (1,0) pattern or the (0,1) pattern as true, but never both, and sometimes it also learns the (1,1) pattern, which should be false. My understanding is that networks of this kind should be able to learn almost any function, including this introductory "exclusive or": wasn't that exactly what was solved by the original backpropagation work, which this network uses in fine-tuning? I suspect I am missing something, so any advice or help would be much appreciated. (I even increased the number of epochs to 10,000, to no avail.)
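For what it's worth, XOR is certainly learnable by a 2-4-1 sigmoid network under plain batch backpropagation. The sketch below (plain NumPy, nothing to do with darch's internals; all names are illustrative) demonstrates this, using random restarts precisely because a single unlucky initialisation can land in the kind of local minimum described above:

```python
import numpy as np

# XOR truth table: the same data the darch example uses.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

def train(seed, epochs=8000, lr=2.0):
    """Batch backprop on a 2-4-1 sigmoid network, squared-error loss."""
    rng = np.random.default_rng(seed)
    W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)
    W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)
    for _ in range(epochs):
        h = sig(X @ W1 + b1)                 # hidden activations
        out = sig(h @ W2 + b2)               # network output
        d2 = (out - y) * out * (1 - out)     # output-layer delta
        d1 = (d2 @ W2.T) * h * (1 - h)       # hidden-layer delta
        W2 -= lr * h.T @ d2; b2 -= lr * d2.sum(axis=0)
        W1 -= lr * X.T @ d1; b1 -= lr * d1.sum(axis=0)
    return out

# Restart until one initialisation converges: single runs can get stuck,
# which is the same failure mode described above.
for seed in range(20):
    out = train(seed)
    if np.all(np.abs(out - y) < 0.4):
        break
print(out.round().ravel())   # → [0. 1. 1. 0.]
```

So the function itself is within reach of this architecture; the question is why the darch run does not get there.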
# Generating the datasets
inputs <- matrix(c(0,0,0,1,1,0,1,1),ncol=2,byrow=TRUE)
outputs <- matrix(c(0,1,1,0),nrow=4)
# Generating the darch
darch <- newDArch(c(2,4,1),batchSize=2)
# Pre-Train the darch
darch <- preTrainDArch(darch,inputs,maxEpoch=100)
# Prepare the layers for backpropagation training: the layer
# functions must be set to unit functions that also compute
# the derivatives of the function result.
layers <- getLayers(darch)
for(i in length(layers):1){
layers[[i]][[2]] <- sigmoidUnitDerivative
}
setLayers(darch) <- layers
rm(layers)
# Setting and running the Fine-Tune function
setFineTuneFunction(darch) <- backpropagation
darch <- fineTuneDArch(darch,inputs,outputs,maxEpoch=100)
# Running the darch
darch <- getExecuteFunction(darch)(darch,inputs)
outputs <- getExecOutputs(darch)
cat(outputs[[length(outputs)]])
## End(Not run)
#### Example results
> cat(outputs[[length(outputs)]])
0.02520016 0.8923063 0.1264799 0.9803244
## Different run
> cat(outputs[[length(outputs)]])
0.02702418 0.1061477 0.9833059 0.9813462

Posted on 2014-12-05 04:42:04
For what it's worth, the following worked for me:
# Generating the datasets
inputs <- matrix(c(0,0,0,1,1,0,1,1),ncol=2,byrow=TRUE)
print(inputs)
outputs <- matrix(c(0,1,1,0),nrow=4)
print(outputs)
# Generating the darch
darch <- newDArch(c(2,4,1),batchSize=4,ff=F)
# Pre-Train the darch
darch <- preTrainDArch(darch,inputs,maxEpoch=200,numCD=4)
# Prepare the layers for backpropagation training: the layer
# functions must be set to unit functions that also compute
# the derivatives of the function result.
layers <- getLayers(darch)
for(i in length(layers):1){
layers[[i]][[2]] <- sigmoidUnitDerivative
}
setLayers(darch) <- layers
rm(layers)
# Setting and running the Fine-Tune function
setFineTuneFunction(darch) <- rpropagation
darch <- fineTuneDArch(darch,trainData=inputs,targetData=outputs,
maxEpoch=200,
isBin=T)
# Running the darch
darch <- getExecuteFunction(darch)(darch,inputs)
outputs2 <- getExecOutputs(darch)
cat(outputs2[[length(outputs2)]])
## End(Not run)
This gave the following results:
> # Running the darch
> darch <- getExecuteFunction(darch)(darch,inputs)
> outputs2 <- getExecOutputs(darch)
> cat(outputs2[[length(outputs2)]])
1.213234e-21 1 1 1.213234e-21
> ## End(Not run)
So, relative to the manual's example, the following changes were made: batchSize was raised from 2 to 4, ff=F was set on newDArch, pre-training was run for 200 epochs with numCD=4, the fine-tune function was switched from backpropagation to rpropagation, and isBin=T was passed to fineTuneDArch with maxEpoch=200.
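The most consequential of these changes is swapping backpropagation for rpropagation. Rprop adapts a per-weight step size from gradient *signs* only, ignoring magnitudes, which makes it far less sensitive to step-size choice than plain backprop. A minimal one-dimensional sketch (Python, purely illustrative; darch's rpropagation is the real implementation):

```python
import numpy as np

# Rprop in one dimension, minimising f(w) = (w - 3)^2: grow the step while
# the gradient sign is stable, shrink it after an overshoot flips the sign.
def rprop(grad_fn, w=10.0, step=0.5, iters=100,
          up=1.2, down=0.5, step_max=50.0, step_min=1e-6):
    prev_sign = 0.0
    for _ in range(iters):
        s = np.sign(grad_fn(w))
        if s * prev_sign > 0:                # same direction: accelerate
            step = min(step * up, step_max)
        elif s * prev_sign < 0:              # overshot the minimum: back off
            step = max(step * down, step_min)
        w -= s * step
        prev_sign = s
    return w

w = rprop(lambda w: 2.0 * (w - 3.0))
print(round(w, 3))   # → 3.0
```

Because only the sign of the gradient is used, Rprop shrugs off the plateaus of saturated sigmoid units that can stall plain gradient descent.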
Since I was basically doing voodoo here (until I get some more practice), I couldn't seem to get the error rate below 17%.
Edit:
So I have been reading around, and I am inclined to think that each unique state of the system relates to a single internal neuron. With two-bit logic there are four unique input combinations, hence four unique input states; if you want a system that can handle this, you need 4 internal nodes. That suggests that for an 8-bit operation you might need 256 internal nodes.
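The "one hidden unit per unique input state" idea can be made concrete by hand-wiring such a network. In the Python sketch below (weights chosen by hand, not learned; all names are illustrative), hidden unit j fires only when the input equals pattern j, and the output layer simply adds up the (0,1) and (1,0) detectors:

```python
import numpy as np

sig = lambda z: 1.0 / (1.0 + np.exp(-z))
patterns = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
W = 20.0

W1 = (2 * patterns - 1).T * W            # +W on a pattern's 1-bits, -W on its 0-bits
b1 = -W * (patterns.sum(axis=1) - 0.5)   # pre-activation +W/2 on an exact match, <= -W/2 otherwise
W2 = np.array([[0.], [W], [W], [0.]])    # sum the (0,1) and (1,0) detectors
b2 = np.array([-W / 2])

h = sig(patterns @ W1 + b1)              # near-one-hot state detectors
out = sig(h @ W2 + b2)
print(out.round().ravel())   # → [0. 1. 1. 0.]
```

This shows 4 hidden units are certainly *sufficient* for a state-per-unit encoding, though trained networks can often get away with fewer by sharing structure between states.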
The atari-game folks use model-adaptive control: with one network they predict the next state of the system, and with another they determine the best control strategy given the current state and the expected next state.
When I re-ran this a few thousand times, roughly 18% of the outputs after long training were wrong. I really don't like that.
Thoughts?
Posted on 2016-06-06 07:49:07
I was able to tune the baseline example.xor in darch so that it reliably learns simple XOR correctly. Here is the baseline version:
> tmp <- mclapply(1:50, function(x) example.xor())
> table(sapply(tmp, function(x) tail(x@stats$dataErrors$class, 1)))

 0 25
30 20

Here is a tuned variant:
trainingData <- matrix(
c(0,0,
0,1,
1,0,
1,1), ncol=2, byrow=T)
trainingTargets <- matrix(c(0,1,1,0),nrow=4)
tuned.xor <- function() {
  darch(trainingData, trainingTargets,
        # These settings are different
        layers=c(2,6,1),
        darch.batchSize=4,
        darch.fineTuneFunction=function(...) rpropagation(..., weightDecay=0.0001),
        # These settings are all as in example.xor
        darch.bootstrap=F,
        darch.learnRateWeights=1.0,
        darch.learnRateBiases=1.0,
        darch.isBin=T,
        darch.stopClassErr=0,
        darch.numEpochs=1000
  )
}
> tmp <- mclapply(1:50, function(x) tuned.xor())
> table(sapply(tmp, function(x) tail(x@stats$dataErrors$class, 1)))

 0
50

https://stackoverflow.com/questions/24782006
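Beyond the extra hidden units and Rprop, the tuned variant passes weightDecay=0.0001 to rpropagation, i.e. an L2 penalty that shrinks weights toward zero. A one-dimensional sketch of the effect (Python, illustrative only; `descend` is a made-up name):

```python
# Gradient descent on f(w) = (w - 3)^2 plus an L2 penalty decay * w^2.
# The penalty biases the solution away from w = 3 toward zero, trading a
# little fit for smaller weights, which is what discourages overfitting.
def descend(decay, w=10.0, steps=200, lr=0.1):
    for _ in range(steps):
        grad = 2.0 * (w - 3.0) + 2.0 * decay * w   # loss gradient + penalty gradient
        w -= lr * grad
    return w

print(round(descend(0.0), 6))   # → 3.0 (no decay: exact minimum)
print(round(descend(0.1), 4))   # → 2.7273 (fixed point of 2(w-3) + 0.2w = 0, i.e. w = 6/2.2)
```

On a tiny 4-row dataset like XOR, even a small decay term keeps the saturating weights from growing without bound during 1000 epochs of fine-tuning.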