
Q: What is the difference between sparse_categorical_crossentropy and categorical_crossentropy?

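Stack Overflow user

Asked 2019-10-26 04:33:20

3 answers, 36.7K views, 0 followers, 53 votes

What is the difference between sparse_categorical_crossentropy and categorical_crossentropy? When should one loss be used rather than the other? For example, are these losses suitable for linear regression?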

python

tensorflow

machine-learning

keras

deep-learning

3 Answers

Stack Overflow user

Accepted answer

Posted 2019-10-26 05:39:45

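Simply put:

- categorical_crossentropy (cce) produces a one-hot array containing the probable match for each category,
- sparse_categorical_crossentropy (scce) produces a category index of the most likely matching category.

Consider a classification problem with 5 categories (or classes).

- In the case of cce, the one-hot target may be [0, 1, 0, 0, 0] and the model may predict [.2, .5, .1, .1, .1] (probably right).
- In the case of scce, the target index may be 1 and the model may predict: .5.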
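To make the 5-class example concrete, here is a minimal pure-Python sketch of the two loss formulas. The helper names `one_hot_crossentropy` and `sparse_crossentropy` are hypothetical and are not the Keras functions. Note that in Keras both losses receive the model's full probability vector as the prediction; only the target encoding differs, and the .5 above is simply the entry of that vector at the target index.

```python
import math

def one_hot_crossentropy(one_hot_target, probs):
    """Hypothetical helper: -sum(t_i * log(p_i)) for a one-hot target."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs) if t > 0)

def sparse_crossentropy(target_index, probs):
    """Hypothetical helper: -log(p[target_index]) for an integer target."""
    return -math.log(probs[target_index])

# Model output over 5 classes, taken from the example above.
probs = [0.2, 0.5, 0.1, 0.1, 0.1]

cce_loss = one_hot_crossentropy([0, 1, 0, 0, 0], probs)  # one-hot target
scce_loss = sparse_crossentropy(1, probs)                # integer target

print(round(cce_loss, 4), round(scce_loss, 4))  # 0.6931 0.6931
```

Both calls evaluate -log(0.5), so the two target encodings give the same loss value for this sample.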

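Now consider a classification problem with 3 classes.

- In the case of cce, the one-hot target might be [0, 0, 1] and the model may predict [.5, .1, .4] (probably inaccurate, given that it gives more probability to the first class).
- In the case of scce, the target index might be [0], and the model may predict

Many categorical models produce scce output because you save space, but you lose a lot of information (for example, in the second example, index 2 was also very close). I generally prefer cce output for model reliability.

There are a number of situations in which to use scce, including:

- when your classes are mutually exclusive, i.e. you don't care at all about other close-enough predictions,
- when the number of categories is so large that the prediction output becomes overwhelming.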

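67 votes

Stack Overflow user

Posted 2021-08-02 07:11:12

I was confused by this one too. Fortunately, the excellent Keras documentation came to the rescue. Both use the same loss function and ultimately do the same thing; the only difference is the representation of the true labels.

Categorical cross-entropy [Doc]:

Use this crossentropy loss function when there are two or more label classes. We expect labels to be provided in a one_hot representation.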

>>> import tensorflow as tf
>>> y_true = [[0, 1, 0], [0, 0, 1]]
>>> y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
>>> # Using 'auto'/'sum_over_batch_size' reduction type.  
>>> cce = tf.keras.losses.CategoricalCrossentropy()
>>> cce(y_true, y_pred).numpy()
1.177

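Sparse categorical cross-entropy [Doc]:

Use this crossentropy loss function when there are two or more label classes. We expect labels to be provided as integers.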

>>> y_true = [1, 2]
>>> y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
>>> # Using 'auto'/'sum_over_batch_size' reduction type.  
>>> scce = tf.keras.losses.SparseCategoricalCrossentropy()
>>> scce(y_true, y_pred).numpy()
1.177
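The value 1.177 in both documentation snippets can be checked by hand. The sketch below is plain Python rather than Keras itself. It assumes that the default reduction averages the per-sample losses over the batch, as the `sum_over_batch_size` comment suggests, and it skips the small-value clipping Keras applies to predictions, which does not affect these numbers.

```python
import math

y_pred = [[0.05, 0.95, 0.0], [0.1, 0.8, 0.1]]
y_true_one_hot = [[0, 1, 0], [0, 0, 1]]  # labels for CategoricalCrossentropy
y_true_int = [1, 2]                      # same labels for SparseCategoricalCrossentropy

# One-hot targets: -sum(t * log(p)) per sample. Entries with t == 0 are
# skipped, so log(0) is never evaluated.
cce_per_sample = [
    -sum(t * math.log(p) for t, p in zip(target, probs) if t)
    for target, probs in zip(y_true_one_hot, y_pred)
]

# Integer targets: -log of the probability stored at the target index.
scce_per_sample = [-math.log(probs[i]) for i, probs in zip(y_true_int, y_pred)]

# Average over the batch of two samples.
mean_cce = sum(cce_per_sample) / len(cce_per_sample)
mean_scce = sum(scce_per_sample) / len(scce_per_sample)

print(round(mean_cce, 3), round(mean_scce, 3))  # 1.177 1.177
```

The per-sample losses are -log(0.95) and -log(0.1), and their mean is about 1.177 for either label format.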

A good example of where sparse categorical cross-entropy applies is the Fashion MNIST dataset, whose labels are stored as integers.

import tensorflow as tf
from tensorflow import keras

fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()

print(y_train_full.shape) # (60000,)
print(y_train_full.dtype) # uint8

y_train_full[:10]
# array([9, 0, 0, 3, 0, 2, 7, 2, 5, 5], dtype=uint8)

18 votes

Stack Overflow user

Posted 2019-10-27 04:00:05

From the TensorFlow source code, sparse_categorical_crossentropy is defined as categorical crossentropy with integer targets:

def sparse_categorical_crossentropy(target, output, from_logits=False, axis=-1):
  """Categorical crossentropy with integer targets.
  Arguments:
      target: An integer tensor.
      output: A tensor resulting from a softmax
          (unless `from_logits` is True, in which
          case `output` is expected to be the logits).
      from_logits: Boolean, whether `output` is the
          result of a softmax, or is a tensor of logits.
      axis: Int specifying the channels axis. `axis=-1` corresponds to data
          format `channels_last', and `axis=1` corresponds to data format
          `channels_first`.
  Returns:
      Output tensor.
  Raises:
      ValueError: if `axis` is neither -1 nor one of the axes of `output`.
  """

From the TensorFlow source code, categorical_crossentropy is defined as the categorical crossentropy between an output tensor and a target tensor:

def categorical_crossentropy(target, output, from_logits=False, axis=-1):
  """Categorical crossentropy between an output tensor and a target tensor.
  Arguments:
      target: A tensor of the same shape as `output`.
      output: A tensor resulting from a softmax
          (unless `from_logits` is True, in which
          case `output` is expected to be the logits).
      from_logits: Boolean, whether `output` is the
          result of a softmax, or is a tensor of logits.
      axis: Int specifying the channels axis. `axis=-1` corresponds to data
          format `channels_last', and `axis=1` corresponds to data format
          `channels_first`.
  Returns:
      Output tensor.
  Raises:
      ValueError: if `axis` is neither -1 nor one of the axes of `output`.
  """

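Both docstrings mention `from_logits`. The following is a rough pure-Python sketch of what that flag distinguishes, not TensorFlow's implementation: with probabilities the loss is -log(p[target]), while with raw logits the softmax can be folded into the loss using the log-sum-exp form. All function names here are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_ce_from_probs(target_index, probs):
    """Integer-target cross-entropy when `probs` is already a softmax output."""
    return -math.log(probs[target_index])

def sparse_ce_from_logits(target_index, logits):
    """Integer-target cross-entropy computed directly from logits:
    log(sum(exp(z))) - z[target]."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]

logits = [2.0, 1.0, 0.1]  # illustrative raw scores for 3 classes
loss_a = sparse_ce_from_probs(0, softmax(logits))
loss_b = sparse_ce_from_logits(0, logits)

print(math.isclose(loss_a, loss_b))  # True
```

The two paths agree up to floating-point error. In practice the flag has to match what the model's last layer actually outputs, otherwise the softmax is applied twice or not at all.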
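Integer targets means that the target labels should be given as a list of integers holding the class indices. For example:

- For sparse_categorical_crossentropy, with class 1 and class 2 targets in a 5-class classification problem, the list should be [1, 2]. Basically, the targets must be in integer form in order to call sparse_categorical_crossentropy. It is called sparse because the target representation requires much less space than one-hot encoding. For example, a batch with b targets and k classes needs b * k space to be represented in one-hot form, whereas the same batch needs only b space in integer form.
- For categorical_crossentropy, with class 1 and class 2 targets in a 5-class classification problem, the list should be [[0,1,0,0,0], [0,0,1,0,0]]. Basically, the targets must be in one-hot form in order to call categorical_crossentropy.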
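The two target formats and the b versus b * k storage argument can be illustrated with a short pure-Python sketch. The helpers `to_one_hot` and `to_indices` are hypothetical names introduced for illustration only.

```python
def to_one_hot(indices, num_classes):
    """Expand integer class indices into one-hot rows."""
    return [[1 if c == i else 0 for c in range(num_classes)] for i in indices]

def to_indices(one_hot_rows):
    """Collapse one-hot rows back into integer class indices."""
    return [row.index(1) for row in one_hot_rows]

sparse_targets = [1, 2]  # class 1 and class 2, as in the example above
k = 5                    # number of classes
b = len(sparse_targets)  # batch size

one_hot_targets = to_one_hot(sparse_targets, num_classes=k)
print(one_hot_targets)   # [[0, 1, 0, 0, 0], [0, 0, 1, 0, 0]]

# Number of stored values in each representation.
print(len(sparse_targets))                        # 2  (b)
print(sum(len(row) for row in one_hot_targets))   # 10 (b * k)

# The conversion is lossless in both directions.
print(to_indices(one_hot_targets) == sparse_targets)  # True
```

In Keras itself, `keras.utils.to_categorical` performs the integer to one-hot conversion, so integer labels can be fed to categorical_crossentropy after conversion if that format is preferred.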

The representation of the targets is the only difference; the results should be the same, since both compute categorical crossentropy.
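A short derivation shows why the results coincide. The notation is assumed here for illustration: k classes, predicted probabilities p, true class index c, and y the one-hot encoding of c.

```latex
\begin{align*}
L_{\mathrm{cce}}(y, p) &= -\sum_{i=0}^{k-1} y_i \log p_i
  && \text{with } y_i = 1 \text{ if } i = c, \text{ else } 0 \\
                       &= -\log p_c \\
                       &= L_{\mathrm{scce}}(c, p)
\end{align*}
```

Every term with y_i = 0 vanishes, so the one-hot sum reduces to the single term that the integer-target form reads off directly.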

13 votes

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/58565394
