I experimented with a VAE implementation in TensorFlow for the MNIST dataset. First, I trained a VAE with an MLP-based encoder and decoder. It trained fine: the loss decreased and it produced plausible-looking digits. Here is the decoder code of this MLP-based VAE:
x = sampled_z
x = tf.layers.dense(x, 200, tf.nn.relu)
x = tf.layers.dense(x, 200, tf.nn.relu)
x = tf.layers.dense(x, np.prod(data_shape))
img = tf.reshape(x, [-1] + data_shape)

As the next step, I decided to add convolutional layers. Changing only the encoder worked fine, but when I use deconvolutions in the decoder (instead of fc layers), I don't get any training at all. The loss function never decreases, and the output is always black. Here is the code of the deconvolutional decoder:
x = tf.layers.dense(sampled_z, 24, tf.nn.relu)
x = tf.layers.dense(x, 7 * 7 * 64, tf.nn.relu)
x = tf.reshape(x, [-1, 7, 7, 64])
x = tf.layers.conv2d_transpose(x, 64, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 32, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 1, 3, 1, 'SAME', activation=tf.nn.sigmoid)
img = tf.reshape(x, [-1, 28, 28])

This seems weird, and the code looks fine to me. I have narrowed it down to the deconvolution layers in the decoder; something in there breaks it. For example, if I add a fully connected layer (even without a non-linearity!) after the last deconvolution, it works again! Here is the code:
x = tf.layers.dense(sampled_z, 24, tf.nn.relu)
x = tf.layers.dense(x, 7 * 7 * 64, tf.nn.relu)
x = tf.reshape(x, [-1, 7, 7, 64])
x = tf.layers.conv2d_transpose(x, 64, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 32, 3, 2, 'SAME', activation=tf.nn.relu)
x = tf.layers.conv2d_transpose(x, 1, 3, 1, 'SAME', activation=tf.nn.sigmoid)
x = tf.contrib.layers.flatten(x)
x = tf.layers.dense(x, 28 * 28)
img = tf.reshape(x, [-1, 28, 28])

I'm really somewhat stuck at this point. Does anybody have an idea of what could be going on here? I use tf 1.8.0, the Adam optimizer, and a 1e-4 learning rate.
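As a side note, the transposed-convolution output shapes themselves do line up: with `'SAME'` padding, `tf.layers.conv2d_transpose` maps a spatial size H to H * stride. A quick sanity check (plain Python, no TensorFlow needed) confirms the decoder above ends at 28x28, so the shapes are not the problem:

```python
# With padding='SAME', a transposed conv with stride s maps size H -> H * s.
def deconv_out_size(h, stride):
    return h * stride

h = 7  # spatial size after the reshape to [-1, 7, 7, 64]
for stride in (2, 2, 1):  # strides of the three conv2d_transpose layers
    h = deconv_out_size(h, stride)

print(h)  # 28, matching the final reshape to [-1, 28, 28]
```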
Edit:
As @Agost pointed out, I should perhaps clarify things about my loss function and the training process. I model the output distribution as a Bernoulli distribution and maximize the ELBO as my loss, inspired by this post. Here is the full code of the encoder, decoder and loss:
def make_prior():
    mu = tf.zeros(N_LATENT)
    sigma = tf.ones(N_LATENT)
    return tf.contrib.distributions.MultivariateNormalDiag(mu, sigma)

def make_encoder(x_input):
    x_input = tf.reshape(x_input, shape=[-1, 28, 28, 1])
    x = conv(x_input, 32, 3, 2)
    x = conv(x, 64, 3, 2)
    x = conv(x, 128, 3, 2)
    x = tf.contrib.layers.flatten(x)
    mu = dense(x, N_LATENT)
    sigma = dense(x, N_LATENT, activation=tf.nn.softplus)  # softplus is log(exp(x) + 1)
    return tf.contrib.distributions.MultivariateNormalDiag(mu, sigma)

def make_decoder(sampled_z):
    x = tf.layers.dense(sampled_z, 24, tf.nn.relu)
    x = tf.layers.dense(x, 7 * 7 * 64, tf.nn.relu)
    x = tf.reshape(x, [-1, 7, 7, 64])
    x = tf.layers.conv2d_transpose(x, 64, 3, 2, 'SAME', activation=tf.nn.relu)
    x = tf.layers.conv2d_transpose(x, 32, 3, 2, 'SAME', activation=tf.nn.relu)
    x = tf.layers.conv2d_transpose(x, 1, 3, 1, 'SAME')
    img = tf.reshape(x, [-1, 28, 28])
    img_distribution = tf.contrib.distributions.Bernoulli(img)
    img = img_distribution.probs
    img_distribution = tf.contrib.distributions.Independent(img_distribution, 2)
    return img, img_distribution

def main():
    mnist = input_data.read_data_sets(os.path.join(experiment_dir(EXPERIMENT), 'MNIST_data'))
    tf.reset_default_graph()
    batch_size = 128
    x_input = tf.placeholder(dtype=tf.float32, shape=[None, 28, 28], name='X')
    prior = make_prior()
    posterior = make_encoder(x_input)
    mu, sigma = posterior.mean(), posterior.stddev()
    z = posterior.sample()
    generated_img, output_distribution = make_decoder(z)
    likelihood = output_distribution.log_prob(x_input)
    divergence = tf.distributions.kl_divergence(posterior, prior)
    elbo = tf.reduce_mean(likelihood - divergence)
    loss = -elbo
    global_step = tf.train.get_or_create_global_step()
    optimizer = tf.train.AdamOptimizer(1e-3).minimize(loss, global_step=global_step)

Posted on 2018-09-03 23:09:24
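For reference, the KL term that `tf.distributions.kl_divergence` computes for a diagonal-Gaussian posterior against the standard normal prior has a well-known closed form. A small numpy sketch of that formula (the function name here is illustrative, not from the code above):

```python
import numpy as np

def kl_to_std_normal(mu, sigma):
    # Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ):
    # 0.5 * sum(sigma^2 + mu^2 - 1 - log(sigma^2))
    return 0.5 * np.sum(sigma ** 2 + mu ** 2 - 1.0 - np.log(sigma ** 2))

# When the posterior equals the prior, the divergence vanishes:
print(kl_to_std_normal(np.zeros(8), np.ones(8)))  # 0.0
```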
Could it be your use of sigmoid in the final deconv layer, restricting the output to 0-1? You do that neither in the MLP-based autoencoder nor when adding a fully connected layer after the deconvs, so possibly a data-range issue?
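That would fit the symptoms: in TF 1.8, `tf.contrib.distributions.Bernoulli` treats its first positional argument as logits, not probabilities. If the final layer already applies a sigmoid, values in [0, 1] get interpreted as logits, so the implied pixel probabilities are squashed into roughly [0.5, 0.73] and the model can never make a confident black-or-white prediction. A quick numeric illustration (plain numpy, assuming the logits interpretation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# If sigmoid outputs in [0, 1] are fed to Bernoulli(logits=...), the
# implied pixel probability is sigmoid(logit), which is then confined
# to the narrow range [sigmoid(0), sigmoid(1)]:
lo, hi = sigmoid(0.0), sigmoid(1.0)
print(lo, hi)  # 0.5 and ~0.731: no pixel can be modeled as near-black or near-white
```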
https://stackoverflow.com/questions/52146591