I am building a seq2seq model with LSTMs in TensorFlow, using softmax cross-entropy as the loss. The problem is that my input sequences have different lengths, so I pad them. The model's output has shape [max_length, batch_size, vocab_size]. How can I compute the loss so that the zero padding does not affect it? tf.nn.softmax_cross_entropy_with_logits has an axis argument, so it can handle 3-D inputs, but it takes no weights; tf.losses.softmax_cross_entropy takes a weights argument, but it expects inputs of shape [batch_size, num_classes (vocab_size)]. Please help!
Posted on 2017-12-29 18:17:16
I think you will have to write your own loss function. Take a look at https://danijar.com/variable-sequence-lengths-in-tensorflow/.
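The idea from that article — compute the per-timestep cross entropy, zero out the padded timesteps with a mask built from the true sequence lengths, and average only over the real timesteps — can be sketched in plain NumPy. The function name and shapes below are illustrative, not taken from the linked post:

```python
import numpy as np

def masked_seq_loss(logits, targets, lengths):
    """Per-example cross entropy that ignores padded timesteps.

    logits:  [batch, max_len, vocab] unnormalized scores
    targets: [batch, max_len] int token ids (arbitrary at padded steps)
    lengths: [batch] true sequence lengths
    """
    # softmax over the vocab axis (shifted for numerical stability)
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs = e / e.sum(axis=-1, keepdims=True)
    batch, max_len, _ = logits.shape
    # per-timestep negative log likelihood of the target token
    nll = -np.log(probs[np.arange(batch)[:, None],
                        np.arange(max_len)[None, :],
                        targets])
    # mask: 1.0 for real timesteps, 0.0 for padding
    mask = (np.arange(max_len)[None, :] < lengths[:, None]).astype(float)
    # average only over the real timesteps of each sequence
    return (nll * mask).sum(axis=1) / lengths
```

Because the mask zeroes the padded positions before the sum, the logits at those positions can be anything without changing the loss.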
Posted on 2019-02-06 23:52:46
In this case you need to pad both the logits and the labels so that they have the same length. Say the logits have shape (batch_size, length, vocab_size) and the labels have shape (batch_size, length), where length is the sequence length. First, pad them to the same length:
def _pad_tensors_to_same_length(logits, labels):
    """Pad logits and labels so that the results have the same length (second dimension)."""
    with tf.name_scope("pad_to_same_length"):
        logits_length = tf.shape(logits)[1]
        labels_length = tf.shape(labels)[1]
        max_length = tf.maximum(logits_length, labels_length)
        logits = tf.pad(logits, [[0, 0], [0, max_length - logits_length], [0, 0]])
        labels = tf.pad(labels, [[0, 0], [0, max_length - labels_length]])
        return logits, labels

Then you can compute the padded cross entropy:
def padded_cross_entropy_loss(logits, labels, vocab_size):
    """Calculate cross entropy loss while ignoring padding.

    Args:
      logits: Tensor of size [batch_size, length_logits, vocab_size]
      labels: Tensor of size [batch_size, length_labels]
      vocab_size: int size of the vocabulary

    Returns:
      Returns the cross entropy loss
    """
    with tf.name_scope("loss", values=[logits, labels]):
        logits, labels = _pad_tensors_to_same_length(logits, labels)
        # Calculate cross entropy; the labels are token ids, so turn them
        # into one-hot vectors over the vocabulary first
        with tf.name_scope("cross_entropy", values=[logits, labels]):
            xentropy = tf.nn.softmax_cross_entropy_with_logits_v2(
                logits=logits, labels=tf.one_hot(labels, vocab_size))
            # zero weight wherever the label is the padding id 0
            weights = tf.to_float(tf.not_equal(labels, 0))
        return xentropy * weights

Posted on 2020-03-07 09:19:23
The function below takes two tensors of shape (batch_size, time_steps, vocab_len). It computes a mask that zeroes out the timesteps corresponding to padding, so the padding contributes nothing to the categorical cross entropy.
# padding labels are one-hot vectors with a 1 as the first element
def mask_loss(y_true, y_pred):
    mask_value = np.zeros((vocab_len))
    mask_value[0] = 1
    # find out which timesteps in `y_true` are not the padding character
    mask = K.equal(y_true, mask_value)
    mask = 1 - K.cast(mask, K.floatx())
    # a real one-hot row differs from `mask_value` in exactly 2 positions,
    # a padding row in 0, so this yields 1 for real steps and 0 for padding
    mask = K.sum(mask, axis=2) / 2
    # multiplying the loss by the mask: the loss for padding will be zero
    loss = tf.keras.layers.multiply([K.categorical_crossentropy(y_true, y_pred), mask])
    return K.sum(loss) / K.sum(mask)

https://stackoverflow.com/questions/48025004
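Why the division by 2 produces a clean 0/1 mask can be checked with a small NumPy sketch (the shapes and sequence below are illustrative): a real one-hot label disagrees with the padding one-hot in exactly two positions, while a padding label disagrees in none.

```python
import numpy as np

vocab_len = 4
mask_value = np.zeros(vocab_len)
mask_value[0] = 1.0  # padding rows are one-hot at index 0

# one sequence of 3 timesteps; the last one is padding
y_true = np.array([[[0, 1, 0, 0],    # real token (index 1)
                    [0, 0, 0, 1],    # real token (index 3)
                    [1, 0, 0, 0]]],  # padding
                  dtype=float)

# elementwise comparison against the padding one-hot
eq = (y_true == mask_value).astype(float)  # shape [1, 3, vocab_len]
# real rows differ in exactly 2 positions, padding rows in 0,
# hence dividing the disagreement count by 2 gives a 0/1 mask
mask = (1.0 - eq).sum(axis=2) / 2          # shape [1, 3]
print(mask)  # → [[1. 1. 0.]]
```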