When I train my model, the loss increases with every epoch. I feel there is a simple fix and I am missing something obvious, but I can't work out what it is. Any help would be greatly appreciated.
The neural network:
def neural_network(data):
    hidden_L1 = {'weights': tf.Variable(tf.random_normal([784, neurons_L1])),
                 'biases': tf.Variable(tf.random_normal([neurons_L1]))}
    hidden_L2 = {'weights': tf.Variable(tf.random_normal([neurons_L1, neurons_L2])),
                 'biases': tf.Variable(tf.random_normal([neurons_L2]))}
    output_L = {'weights': tf.Variable(tf.random_normal([neurons_L2, num_of_classes])),
                'biases': tf.Variable(tf.random_normal([num_of_classes]))}
    L1 = tf.add(tf.matmul(data, hidden_L1['weights']), hidden_L1['biases'])  # matrix multiplication
    L1 = tf.nn.relu(L1)
    L2 = tf.add(tf.matmul(L1, hidden_L2['weights']), hidden_L2['biases'])  # matrix multiplication
    L2 = tf.nn.relu(L2)
    output = tf.add(tf.matmul(L2, output_L['weights']), output_L['biases'])  # matrix multiplication
    output = tf.nn.softmax(output)
    return output

My loss, optimiser, and per-epoch loop:
output = neural_network(x)
loss = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(logits=output, labels=y) )
optimiser = tf.train.AdamOptimizer().minimize(loss)
init = tf.global_variables_initializer()
epochs = 5
total_batch_count = 60000//batch_size
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(epochs):
        avg_loss = 0
        for i in range(total_batch_count):
            batch_x, batch_y = next_batch(batch_size, x_train, y_train)
            _, c = sess.run([optimiser, loss], feed_dict={x: batch_x, y: batch_y})
            avg_loss += c / total_batch_count
        print("epoch = ", epoch + 1, "loss =", avg_loss)
sess.close()

I have a feeling my problem lies in the loss function or in the per-epoch loop I have written, but I am not familiar enough with TensorFlow to work it out.
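(For context, `next_batch` is not shown in the question. A minimal sketch of such a helper, assuming `x_train` and `y_train` are NumPy arrays with matching first dimensions, could look like this; the sampling strategy is an assumption, not the asker's actual code:)

```python
import numpy as np

def next_batch(batch_size, x_data, y_data):
    """Return a random mini-batch of (inputs, labels).

    Assumes x_data and y_data are NumPy arrays with the same first
    dimension; sampling is with replacement for simplicity.
    """
    idx = np.random.randint(0, len(x_data), size=batch_size)
    return x_data[idx], y_data[idx]

# Example: 100 fake MNIST-shaped samples with one-hot labels
x_train = np.random.rand(100, 784).astype(np.float32)
y_train = np.eye(10)[np.random.randint(0, 10, size=100)]
bx, by = next_batch(32, x_train, y_train)
print(bx.shape, by.shape)  # (32, 784) (32, 10)
```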
Posted on 2019-10-03 08:20:58
You are using the function softmax_cross_entropy_with_logits which, according to TensorFlow's documentation, expects unscaled logits (the function applies the softmax internally).
Therefore, you should pass the pre-activation values, before the non-linearity (in your case, softmax) is applied. You can fix it by doing the following:
def neural_network(data):
    hidden_L1 = {'weights': tf.Variable(tf.random_normal([784, neurons_L1])),
                 'biases': tf.Variable(tf.random_normal([neurons_L1]))}
    hidden_L2 = {'weights': tf.Variable(tf.random_normal([neurons_L1, neurons_L2])),
                 'biases': tf.Variable(tf.random_normal([neurons_L2]))}
    output_L = {'weights': tf.Variable(tf.random_normal([neurons_L2, num_of_classes])),
                'biases': tf.Variable(tf.random_normal([num_of_classes]))}
    L1 = tf.add(tf.matmul(data, hidden_L1['weights']), hidden_L1['biases'])  # matrix multiplication
    L1 = tf.nn.relu(L1)
    L2 = tf.add(tf.matmul(L1, hidden_L2['weights']), hidden_L2['biases'])  # matrix multiplication
    L2 = tf.nn.relu(L2)
    logits = tf.add(tf.matmul(L2, output_L['weights']), output_L['biases'])  # matrix multiplication
    output = tf.nn.softmax(logits)
    return output, logits

Then, outside the function, you can retrieve the logits and pass them to the loss function, as in the example below:
output, logits = neural_network(x)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                           labels=y))

I notice you may still be interested in the output tensor, to compute your network's accuracy. If this replacement does not work on its own, you should also experiment with the learning-rate parameter of AdamOptimizer (see the documentation here).
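To see numerically why the original code hurts training, here is a small NumPy sketch (not TensorFlow code, values illustrative only) comparing the cross-entropy computed from raw logits against the same computation after an accidental extra softmax, which is what happens when the network's softmax output is fed to softmax_cross_entropy_with_logits:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy_with_logits(logits, labels):
    # Softmax followed by cross-entropy, as
    # tf.nn.softmax_cross_entropy_with_logits computes it internally.
    return -np.sum(labels * np.log(softmax(logits)))

logits = np.array([4.0, 1.0, -2.0])   # a confident, correct prediction
labels = np.array([1.0, 0.0, 0.0])

correct = cross_entropy_with_logits(logits, labels)
# The bug from the question: softmax applied before the loss function,
# so the loss sees probabilities (all in [0, 1]) instead of logits.
# The second softmax squashes them toward uniform, so even a confident,
# correct prediction keeps a large loss and weak gradients.
buggy = cross_entropy_with_logits(softmax(logits), labels)

print(correct, buggy)  # buggy is roughly an order of magnitude larger
```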
https://datascience.stackexchange.com/questions/61163