
Obtaining probabilities from logits - logits and labels differ in size

Stack Overflow user
Asked on 2017-06-20 13:07:05
1 answer · 274 views · 0 followers · Score: 1

I am trying to classify some object representations using TensorFlow. I use the same architecture as in the TensorFlow CIFAR-10 example, with the last layer defined as:

    with tf.variable_scope('sigmoid_linear') as scope:
        weights = _variable_with_weight_decay('weights', [192, num_classes],
                                              stddev=1 / 192.0, wd=0.0)
        biases = _variable_on_cpu('biases', [num_classes], initializer)
        sigmoid_linear = tf.add(tf.matmul(local4, weights), biases, name=scope.name)
        _activation_summary(sigmoid_linear)

    return sigmoid_linear

In my case, num_classes is 2, and the number of channels in the representations fed to the neural network is 8. Additionally, I am currently debugging with only 5 examples. The output of the last layer has shape [40, 2]. I expected the first dimension to come from 5 examples * 8 channels, and the second from the number of classes.

In order to compare the logits and the labels using tf.nn.sparse_softmax_cross_entropy_with_logits, they need to have compatible shapes. How should I interpret the current content of the logits in their current shape, and how can I reduce the first dimension of the logits so that it matches the number of examples rather than num_classes * something?
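For reference, sparse softmax cross-entropy pairs logits of shape [batch, num_classes] with integer labels of shape [batch], producing one loss per example. A minimal pure-Python sketch of that shape contract (an illustration, not TensorFlow's actual implementation):

```python
import math

def sparse_softmax_cross_entropy(logits, labels):
    """Sketch of the shape contract of
    tf.nn.sparse_softmax_cross_entropy_with_logits:
    logits is [batch, num_classes], labels is [batch]."""
    assert len(logits) == len(labels), "batch dimensions must match"
    losses = []
    for row, label in zip(logits, labels):
        # numerically stable log-sum-exp, then pick out the true class
        m = max(row)
        log_sum = m + math.log(sum(math.exp(x - m) for x in row))
        losses.append(log_sum - row[label])
    return losses

# Logits shaped [5, 2] pair with labels shaped [5]; a [40, 2] logits
# tensor would demand 40 labels, hence the mismatch in the question.
logits = [[2.0, 0.5]] * 5
labels = [0] * 5
print(len(sparse_softmax_cross_entropy(logits, labels)))  # 5 per-example losses
```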

Edit: The input to the inference function has shape [5, 101, 1008, 8]. The inference function is defined as:

    def inference(representations):
        """Build the model.

        Args:
          representations: STFT spectra returned from distorted_inputs() or inputs().
        Returns:
          Logits.
        """
        # conv1
        with tf.variable_scope('conv1') as scope:
            kernel = _variable_with_weight_decay('weights',
                                                 shape=[5, 5, nChannels, 64],
                                                 stddev=5e-2,
                                                 wd=0.0)
            conv = tf.nn.conv2d(representations, kernel, [1, 1, 1, 1], padding='SAME')
            biases = _variable_on_cpu('biases', [64], initializer)
            pre_activation = tf.nn.bias_add(conv, biases)
            conv1 = tf.nn.relu(pre_activation, name=scope.name)
            _activation_summary(conv1)

        # pool1
        pool1 = tf.nn.max_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                               padding='SAME', name='pool1')
        # norm1
        norm1 = tf.nn.lrn(pool1, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75,
                          name='norm1')

        # conv2
        with tf.variable_scope('conv2') as scope:
            kernel = _variable_with_weight_decay('weights',
                                                 shape=[5, 5, 64, 64],
                                                 stddev=5e-2,
                                                 wd=0.0)
            conv = tf.nn.conv2d(norm1, kernel, [1, 1, 1, 1], padding='SAME')
            biases = _variable_on_cpu('biases', [64], initializer)
            pre_activation = tf.nn.bias_add(conv, biases)
            conv2 = tf.nn.relu(pre_activation, name=scope.name)
            _activation_summary(conv2)

        # norm2
        norm2 = tf.nn.lrn(conv2, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75,
                          name='norm2')
        # pool2
        pool2 = tf.nn.max_pool(norm2, ksize=[1, 3, 3, 1],
                               strides=[1, 2, 2, 1], padding='SAME', name='pool2')

        # local3
        with tf.variable_scope('local3') as scope:
            # Move everything into depth so we can perform a single matrix multiply.
            reshape = tf.reshape(pool2, [batch_size, -1])
            dim = reshape.get_shape()[1].value
            weights = _variable_with_weight_decay('weights', shape=[dim, 384],
                                                  stddev=0.04, wd=0.004)
            biases = _variable_on_cpu('biases', [384], initializer)
            local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)
            _activation_summary(local3)

        # local4
        with tf.variable_scope('local4') as scope:
            weights = _variable_with_weight_decay('weights', shape=[384, 192],
                                                  stddev=0.04, wd=0.004)
            biases = _variable_on_cpu('biases', [192], initializer)
            local4 = tf.nn.relu(tf.matmul(local3, weights) + biases, name=scope.name)
            _activation_summary(local4)

        # sigmoid_linear
        with tf.variable_scope('sigmoid_linear') as scope:
            weights = _variable_with_weight_decay('weights', [192, num_classes],
                                                  stddev=1 / 192.0, wd=0.0)
            biases = _variable_on_cpu('biases', [num_classes], initializer)
            sigmoid_linear = tf.add(tf.matmul(local4, weights), biases, name=scope.name)
            _activation_summary(sigmoid_linear)

        return sigmoid_linear

1 Answer

Stack Overflow user

Answered on 2017-06-21 13:49:44

After some more debugging I found the problem. The posted layer code, which originally comes from the TensorFlow tutorial, works fine (of course it does). I printed all the shapes after each layer and found that the number 40 was not due to 5 examples * 8 channels, but because I had earlier set batch_size = 40, which was therefore also higher than the number of training examples. The mismatch began after the reshape in local layer 3. The question can now be closed.
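The arithmetic behind that mismatch can be sketched with toy numbers (the pool2 dimensions below are assumed purely for illustration): `tf.reshape(pool2, [batch_size, -1])` fixes the first dimension at batch_size and infers the second, so a mis-set batch_size silently redistributes the 5 examples across 40 rows whenever the total element count happens to divide evenly.

```python
# Why tf.reshape(pool2, [batch_size, -1]) yields 40 rows even though
# only 5 examples were fed. Toy pool2 dimensions are assumed here.
num_examples = 5
pooled_shape = (num_examples, 4, 16, 64)   # hypothetical pool2 shape
total = 1
for d in pooled_shape:
    total *= d                             # total elements across all examples

batch_size = 40                            # the mis-set value from the question
inferred_cols = total // batch_size        # the "-1" dimension reshape infers
print(batch_size, inferred_cols)           # 40 rows, not 5
```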

Score: 0
Original content provided by Stack Overflow; translation support by Tencent Cloud's translation engine.
Original link: https://stackoverflow.com/questions/44644244
