Q: TensorFlow cross-entropy loss for sequences of different lengths

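Stack Overflow user

Asked 2017-12-29 15:46:58

Answers: 3, Views: 3.7K, Followers: 0, Votes: 3

I'm building a seq2seq model with an LSTM in TensorFlow. The loss function I'm using is the softmax cross-entropy loss. The problem is that my input sequences have different lengths, so I pad them. The output of the model has shape [max_length, batch_size, vocab_size]. How can I compute the loss so that the zero-padded values don't affect it? tf.nn.softmax_cross_entropy_with_logits provides an axis argument, so it can compute the loss on 3-dimensional input, but it doesn't accept weights. tf.losses.softmax_cross_entropy provides a weights argument, but it expects input of shape [batch_size, nclasses (vocab_size)]. Please help!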

tensorflow

3 Answers

Stack Overflow user

Accepted answer

Posted 2017-12-29 18:17:16

I think you'll have to write your own loss function. Take a look at https://danijar.com/variable-sequence-lengths-in-tensorflow/.

Votes: 2
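
As a rough illustration of what such a custom loss has to do, here is a framework-free sketch in plain Python. It assumes time-major logits, as in the question, and a list of true sequence lengths; the names `log_softmax`, `masked_sequence_loss` and `lengths` are made up for this example and are not taken from the linked article. In TensorFlow itself, a similar mask can be built with `tf.sequence_mask` (which returns a batch-major mask, so it would need transposing for this layout).

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax over one list of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def masked_sequence_loss(logits, labels, lengths):
    """Mean cross entropy over real (non-padded) positions only.

    logits:  nested list, time-major [max_length][batch_size][vocab_size]
    labels:  nested list of int ids, [max_length][batch_size]
    lengths: true length of each sequence, [batch_size]
    """
    total, count = 0.0, 0
    for t, step in enumerate(logits):
        for b, row in enumerate(step):
            if t < lengths[b]:  # mask: skip padded time steps
                total += -log_softmax(row)[labels[t][b]]
                count += 1
    return total / count

# Two sequences (lengths 2 and 1), max_length 2, vocab_size 3.
logits = [
    [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]],  # t = 0
    [[0.0, 0.0, 2.0], [9.0, 9.0, 9.0]],  # t = 1 (second entry is padding)
]
labels = [[0, 1], [2, 0]]
loss = masked_sequence_loss(logits, labels, lengths=[2, 1])
```

Because the padded position is skipped entirely, whatever the model outputs there has no effect on `loss`, and the average is taken over the three real tokens rather than over all four positions.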

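Stack Overflow user

Posted 2019-02-06 23:52:46

In this case you need to pad both the logits and the labels so that they have the same length. Suppose you have a tensor logits of size (batch_size, length, vocab_size) and a tensor labels of size (batch_size, length), where length is the size of your sequence. First, you have to pad them to the same length: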

import tensorflow as tf  # written against the TensorFlow 1.x API

def _pad_tensors_to_same_length(logits, labels):
    """Pad logits and labels so that they have the same length (second dimension)."""
    with tf.name_scope("pad_to_same_length"):
        logits_length = tf.shape(logits)[1]
        labels_length = tf.shape(labels)[1]

        max_length = tf.maximum(logits_length, labels_length)

        # Append zeros along the length axis only; batch and vocab axes are untouched
        logits = tf.pad(logits, [[0, 0], [0, max_length - logits_length], [0, 0]])
        labels = tf.pad(labels, [[0, 0], [0, max_length - labels_length]])
        return logits, labels

Then you can compute the padded cross entropy:

def padded_cross_entropy_loss(logits, labels, vocab_size):
  """Calculate cross entropy loss while ignoring padding.

  Args:
    logits: Tensor of size [batch_size, length_logits, vocab_size]
    labels: Tensor of integer token ids, size [batch_size, length_labels];
      id 0 is treated as padding
    vocab_size: int size of the vocabulary
  Returns:
    Tensor of size [batch_size, max(length_logits, length_labels)] holding
    the per-position cross entropy, with padded positions set to 0
  """
  with tf.name_scope("loss", values=[logits, labels]):
    logits, labels = _pad_tensors_to_same_length(logits, labels)

    # Calculate cross entropy against one-hot targets built from the ids
    with tf.name_scope("cross_entropy", values=[logits, labels]):
      targets = tf.one_hot(tf.cast(labels, tf.int32), depth=vocab_size)
      xentropy = tf.nn.softmax_cross_entropy_with_logits_v2(
          logits=logits, labels=targets)

    # Weight is 1.0 for real tokens and 0.0 for padding (label id 0)
    weights = tf.to_float(tf.not_equal(labels, 0))
    return xentropy * weights
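
The function above returns one loss value per position rather than a scalar, so it still has to be reduced before it can be minimized. The following plain-Python sketch shows one reasonable reduction, assuming padding uses label id 0 as above: divide the summed loss by the number of real tokens instead of by the total number of positions. The helper name `mean_over_real_tokens` is hypothetical. In TensorFlow the equivalent would be along the lines of `tf.reduce_sum(loss) / tf.reduce_sum(weights)`.

```python
def mean_over_real_tokens(per_token_loss, labels, pad_id=0):
    """Average the already-masked loss over non-padding positions only."""
    total = 0.0
    n_real = 0
    for loss_row, label_row in zip(per_token_loss, labels):
        for loss, label in zip(loss_row, label_row):
            if label != pad_id:
                total += loss
                n_real += 1
    # Guard against a batch that contains only padding
    return total / max(n_real, 1)

# Masked per-position losses for a batch of 2 sequences, padded to length 3.
per_token_loss = [[1.0, 2.0, 0.0], [3.0, 0.0, 0.0]]
labels = [[5, 7, 0], [4, 0, 0]]

mean_loss = mean_over_real_tokens(per_token_loss, labels)
# A plain mean over all 6 positions is diluted by the padded zeros.
naive_mean = sum(map(sum, per_token_loss)) / 6
```

Here the three real tokens contribute a total loss of 6.0, so normalizing by real tokens gives 2.0, while the plain mean over all positions gives 1.0 and would shrink further as more padding is added.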

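Votes: 1

Stack Overflow user

Posted 2020-03-07 09:19:23

The function below takes two tensors of shape (batch_size, time_steps, vocab_len). It computes a mask that zeroes out the time steps corresponding to padding. This mask removes the padding's contribution from the categorical cross entropy.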

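import numpy as np
from tensorflow.keras import backend as K

# Padding labels are one-hot vectors with 1 as the first element (index 0).
# vocab_len is expected to be defined elsewhere as the vocabulary size.
def mask_loss(y_true, y_pred):
    mask_value = np.zeros(vocab_len, dtype=K.floatx())
    mask_value[0] = 1
    # find out which timesteps in `y_true` are not the padding character
    mask = K.equal(y_true, mask_value)
    mask = 1 - K.cast(mask, K.floatx())
    # a non-padding one-hot vector differs from mask_value in exactly two
    # positions, so sum / 2 is 1 for real timesteps and 0 for padding
    mask = K.sum(mask, axis=2) / 2
    # multiplying the loss by the mask; the loss for padding will be zero
    loss = K.categorical_crossentropy(y_true, y_pred) * mask
    return K.sum(loss) / K.sum(mask)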
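
The division by 2 in the mask is easy to misread, so here is a small plain-Python check of that step on its own. It assumes, as the answer does, that labels are strictly one-hot and that padding is the one-hot vector for index 0; the helper name `padding_mask` is made up for this illustration.

```python
def padding_mask(y_true, vocab_len):
    """Return 1.0 for real timesteps and 0.0 for padding timesteps.

    y_true: nested list of one-hot vectors, [batch_size][time_steps][vocab_len]
    """
    pad = [1.0] + [0.0] * (vocab_len - 1)  # one-hot vector for the padding id
    return [
        # count the elements that differ from the padding vector, then halve
        [sum(1.0 for a, b in zip(step, pad) if a != b) / 2 for step in seq]
        for seq in y_true
    ]

# One sequence of 3 timesteps: token 1, token 2, then padding (token 0).
y_true = [[[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]]
mask = padding_mask(y_true, vocab_len=3)
```

A real token's one-hot vector differs from the padding vector at index 0 and at its own index, so the count is 2 and the mask value is 1. The padding vector differs nowhere, so its mask value is 0. With smoothed or otherwise non-one-hot labels this count would no longer be 2, and the mask would need to be computed differently.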

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/48025004
