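I am using the framework that Daniel Nouri provides on his eponymous website. Here is the code I used. It looks fine to me; the only changes I made were to set output_nonlinearity=lasagne.nonlinearities.softmax and regression to False. Otherwise it looks pretty straightforward.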
from lasagne import layers
import theano
from lasagne.updates import sgd,nesterov_momentum
from nolearn.lasagne import NeuralNet
from sklearn.metrics import classification_report
import lasagne
import cv2
import numpy as np
from sklearn.cross_validation import train_test_split
from sklearn.datasets import fetch_mldata
import sys

# float32 is not defined in the posted code; this helper is assumed here
# (Daniel Nouri's tutorial defines an equivalent one).
def float32(k):
    return np.float32(k)

mnist = fetch_mldata('MNIST original')
X = np.asarray(mnist.data, dtype='float32')
y = np.asarray(mnist.target, dtype='int32')
(trainX, testX, trainY, testY) = train_test_split(X,y,test_size =0.3,random_state=42)
trainX = trainX.reshape(-1, 1, 28, 28)
testX = testX.reshape(-1, 1, 28, 28)
clf = NeuralNet(
    layers=[
        ('input', layers.InputLayer),
        ('conv1', layers.Conv2DLayer),
        ('pool1', layers.MaxPool2DLayer),
        ('dropout1', layers.DropoutLayer),  # !
        ('conv2', layers.Conv2DLayer),
        ('pool2', layers.MaxPool2DLayer),
        ('dropout2', layers.DropoutLayer),  # !
        ('hidden4', layers.DenseLayer),
        ('dropout4', layers.DropoutLayer),  # !
        ('hidden5', layers.DenseLayer),
        ('output', layers.DenseLayer),
    ],
    input_shape=(None, 1, 28, 28),
    conv1_num_filters=20, conv1_filter_size=(3, 3), pool1_pool_size=(2, 2),
    dropout1_p=0.1,  # !
    conv2_num_filters=50, conv2_filter_size=(3, 3), pool2_pool_size=(2, 2),
    dropout2_p=0.2,  # !
    hidden4_num_units=500,
    dropout4_p=0.5,  # !
    hidden5_num_units=500,
    output_num_units=10,
    output_nonlinearity=lasagne.nonlinearities.softmax,
    update=nesterov_momentum,
    update_learning_rate=theano.shared(float32(0.03)),
    update_momentum=theano.shared(float32(0.9)),
    regression=False,
    max_epochs=3000,
    verbose=1,
)
clf.fit(trainX, trainY)

However, when I run it, the losses come out as NaN:

input (None, 1, 28, 28) produces 784 outputs
conv1 (None, 20, 26, 26) produces 13520 outputs
pool1 (None, 20, 13, 13) produces 3380 outputs
dropout1 (None, 20, 13, 13) produces 3380 outputs
conv2 (None, 50, 11, 11) produces 6050 outputs
pool2 (None, 50, 6, 6) produces 1800 outputs
dropout2 (None, 50, 6, 6) produces 1800 outputs
hidden4 (None, 500) produces 500 outputs
dropout4 (None, 500) produces 500 outputs
hidden5 (None, 500) produces 500 outputs
output (None, 10) produces 10 outputs
  epoch    train loss    valid loss    train/val    valid acc  dur
-------  ------------  ------------  -----------  -----------  ------
      1           nan           nan          nan      0.09923  16.18s
      2           nan           nan          nan      0.09923  16.45s

Thanks in advance.

Posted on 2016-04-26 21:42:41
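I'm very late to the party, but hopefully someone finds this answer useful!

In my experience, a number of things could be going wrong here. These are the steps I follow to debug this kind of problem in nolearn/lasagne:
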
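- Using Theano's fast_compile optimizer [...] nan output (in my case this was the ultimate problem).
- If the output starts with nan values, or if nan values start appearing soon after training begins, the learning rate may be too high. If it is 0.01, try 0.001.
- [...] regression=True.
- [...] nan output, it is probably not specific to the dataset you are training on.

Hope this helps!
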
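To illustrate the learning-rate point, here is a minimal NumPy sketch rather than nolearn code. It runs plain gradient descent in float32 on a toy quadratic loss and reports when the loss stops being finite. The names run_sgd, first_nan_step and curvature are hypothetical, and the curvature value of 500 is an arbitrary assumption chosen so that a rate of 0.01 diverges while 0.001 converges; a real network will behave differently in detail.

```python
import numpy as np


def run_sgd(learning_rate, curvature=500.0, steps=100):
    """Minimise 0.5 * curvature * w**2 with plain gradient descent in float32.

    Returns the loss recorded after every update.
    """
    w = np.array([1.0], dtype=np.float32)
    losses = []
    # Overflow is expected in the diverging case, so silence the warnings.
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(steps):
            grad = curvature * w          # d/dw of 0.5 * curvature * w**2
            w = w - learning_rate * grad  # gradient-descent update
            losses.append(float(0.5 * curvature * w[0] ** 2))
    return losses


def first_nan_step(losses):
    """Index of the first non-finite loss, or None if every loss is finite."""
    for step, loss in enumerate(losses):
        if not np.isfinite(loss):
            return step
    return None


high = run_sgd(learning_rate=0.01)   # each update overshoots the minimum
low = run_sgd(learning_rate=0.001)   # each update halves the weight
print("lr=0.01  first non-finite loss at step:", first_nan_step(high))
print("lr=0.001 first non-finite loss at step:", first_nan_step(low))
```

With the higher rate the weight grows in magnitude on every step until float32 overflows, and the loss then turns into inf and finally nan. That matches the symptom of nan appearing shortly after training starts, which is why lowering the rate by a factor of ten is a cheap first experiment.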
https://stackoverflow.com/questions/30821123