I train neural networks with TensorFlow. I learned that batch normalization is very helpful for DNNs, so I used it in my DNN.
I use tf.layers.batch_normalization and follow the instructions of the API documentation to build the network: when training, set its parameter training=True; when validating, set training=False. And add the ops from tf.get_collection(tf.GraphKeys.UPDATE_OPS) as control dependencies of the train step.
Here is my code:
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np
input_node_num=257*7
output_node_num=257
tf_X = tf.placeholder(tf.float32,[None,input_node_num])
tf_Y = tf.placeholder(tf.float32,[None,output_node_num])
dropout_rate=tf.placeholder(tf.float32) # note: tf.nn.dropout's 2nd argument is keep_prob, so feeding 1 means no dropout
flag_training=tf.placeholder(tf.bool)
hid_node_num=2048
h1=tf.contrib.layers.fully_connected(tf_X, hid_node_num, activation_fn=None)
h1_2=tf.nn.relu(tf.layers.batch_normalization(h1,training=flag_training))
h1_3=tf.nn.dropout(h1_2,dropout_rate)
h2=tf.contrib.layers.fully_connected(h1_3, hid_node_num, activation_fn=None)
h2_2=tf.nn.relu(tf.layers.batch_normalization(h2,training=flag_training))
h2_3=tf.nn.dropout(h2_2,dropout_rate)
h3=tf.contrib.layers.fully_connected(h2_3, hid_node_num, activation_fn=None)
h3_2=tf.nn.relu(tf.layers.batch_normalization(h3,training=flag_training))
h3_3=tf.nn.dropout(h3_2,dropout_rate)
tf_Y_pre=tf.contrib.layers.fully_connected(h3_3, output_node_num, activation_fn=None)
loss=tf.reduce_mean(tf.square(tf_Y-tf_Y_pre))
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i1 in range(3000*num_batch): # num_batch (batches per epoch) is defined elsewhere
        train_feature=... # Some processing
        train_label=... # Some processing
        # training=True here and training=False at validation gives a bad result;
        # training=False in both places gives a better result (see below)
        sess.run(train_step,feed_dict={tf_X:train_feature,tf_Y:train_label,flag_training:True,dropout_rate:1})
        if((i1+1)%277200==0): # print validation loss every 0.1 epoch
            validate_feature=... # Some processing
            validate_label=... # Some processing
            validate_loss = sess.run(loss,feed_dict={tf_X:validate_feature,tf_Y:validate_label,flag_training:False,dropout_rate:1})
            print(validate_loss)

Is there a mistake in my code? If the code is correct, I think I got a strange result:
When training, I set training=True, and when validating, training=False, but the result is not good. I print the validation loss every 0.1 epoch; the validation losses of epochs 1 to 3 are:
0.929624
0.992692
0.814033
0.858562
1.042705
0.665418
0.753507
0.700503
0.508338
0.761886
0.787044
0.817034
0.726586
0.901634
0.633383
0.783920
0.528140
0.847496
0.804937
0.828761
0.802314
0.855557
0.702335
0.764318
0.776465
0.719034
0.678497
0.596230
0.739280
0.970555
However, when I change the training call to sess.run(train_step,feed_dict={tf_X:train_feature,tf_Y:train_label,flag_training:False,dropout_rate:1}), that is: set training=False when training as well as when validating, the result is good. The validation losses of the first epoch are:
0.474313
0.391002
0.369357
0.366732
0.383477
0.346027
0.336518
0.368153
0.330749
0.322070
0.335551
Why does this happen? Is it necessary to set training=True when training and training=False when validating?
Posted on 2018-09-30 15:31:05
The reason that setting training=False improves performance is that batch normalization has four variables (beta, gamma, moving mean, moving variance). It is true that the moving mean and variance are not updated when training=False. However, gamma and beta are still trainable and still get updated by the optimizer. So your model has two extra learnable variables per layer and hence gets better performance.
In addition, I guess that your model would have relatively good performance even without batch normalization.
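To see this concretely, here is a small sketch (TF 1.x; the variable names are the defaults created by tf.layers.batch_normalization) showing which batch-norm variables are trainable and which are only updated through UPDATE_OPS:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])
is_training = tf.placeholder(tf.bool)
y = tf.layers.batch_normalization(x, training=is_training)

# gamma and beta are ordinary trainable variables: the optimizer updates
# them no matter what value is fed to the training flag
print([v.name for v in tf.trainable_variables()])
# e.g. ['batch_normalization/gamma:0', 'batch_normalization/beta:0']

# moving_mean and moving_variance are not trainable; they change only when
# the assign ops in UPDATE_OPS actually run (i.e. when training=True)
non_trainable = set(tf.global_variables()) - set(tf.trainable_variables())
print(sorted(v.name for v in non_trainable))
# e.g. ['batch_normalization/moving_mean:0', 'batch_normalization/moving_variance:0']
print(tf.get_collection(tf.GraphKeys.UPDATE_OPS))

With training=False the layer normalizes with the stored moving statistics (initially mean 0, variance 1), so early in training it behaves like a simple learned per-feature affine transform gamma * x + beta.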
https://stackoverflow.com/questions/50047653