
Why does the learning rate not change?

Stack Overflow user
Asked on 2019-04-16 16:08:35
2 answers · 1.7K views · 0 followers · Score: 1

I am using the TensorFlow Object Detection API tutorial (https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/index.html) to train my own custom model. Following those instructions, I trained with the train.py script and a config file from the official GitHub repository. In the config file I saw that the learning rate is supposed to be adaptive, as these lines show:

```
train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
}
```

Then, while training, I watched it in TensorBoard, which showed that the learning rate was constant at every training step. Why is that? Could it be that TensorBoard only sees the initial value of the learning rate, while the optimizer computes the actual value on the fly?


2 Answers

Stack Overflow user

Accepted answer

Answered on 2019-04-16 16:37:31

In the API, the optimizer is built in object_detection/builders/optimizer_builder.py, which has a dedicated branch for rms_prop_optimizer. To build the optimizer's learning rate, that builder calls a function _create_learning_rate, which eventually uses learning_schedules under object_detection/utils. Here is how the learning rate is scheduled in your example:

```python
def exponential_decay_with_burnin(global_step,
                                  learning_rate_base,
                                  learning_rate_decay_steps,
                                  learning_rate_decay_factor,
                                  burnin_learning_rate=0.0,
                                  burnin_steps=0,
                                  min_learning_rate=0.0,
                                  staircase=True):
  """Exponential decay schedule with burn-in period.
  In this schedule, learning rate is fixed at burnin_learning_rate
  for a fixed period, before transitioning to a regular exponential
  decay schedule.
  Args:
    global_step: int tensor representing global step.
    learning_rate_base: base learning rate.
    learning_rate_decay_steps: steps to take between decaying the learning rate.
      Note that this includes the number of burn-in steps.
    learning_rate_decay_factor: multiplicative factor by which to decay
      learning rate.
    burnin_learning_rate: initial learning rate during burn-in period.  If
      0.0 (which is the default), then the burn-in learning rate is simply
      set to learning_rate_base.
    burnin_steps: number of steps to use burnin learning rate.
    min_learning_rate: the minimum learning rate.
    staircase: whether use staircase decay.
  Returns:
    a (scalar) float tensor representing learning rate
  """
  if burnin_learning_rate == 0:
    burnin_learning_rate = learning_rate_base
  post_burnin_learning_rate = tf.train.exponential_decay(
      learning_rate_base,
      global_step - burnin_steps,
      learning_rate_decay_steps,
      learning_rate_decay_factor,
      staircase=staircase)
  return tf.maximum(tf.where(
      tf.less(tf.cast(global_step, tf.int32), tf.constant(burnin_steps)),
      tf.constant(burnin_learning_rate),
      post_burnin_learning_rate), min_learning_rate, name='learning_rate')
```

Here is a plot of that learning-rate decay (figure omitted). Even after 100k steps, the decay is actually very small.
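To put a number on that, here is a minimal back-of-the-envelope sketch in plain Python (not the TF graph), using the constants from the question's config. It assumes staircase=False for smooth decay; note that the schedule above actually defaults to staircase=True, in which case the rate holds at exactly 0.004 until step 800720:

```python
# Minimal sketch: evaluate the exponential-decay formula with the values
# from the question's config (initial_learning_rate=0.004,
# decay_steps=800720, decay_factor=0.95).

def decayed_lr(step, base=0.004, decay_steps=800720, decay_factor=0.95):
    """Mirrors tf.train.exponential_decay with staircase=False."""
    return base * decay_factor ** (step / decay_steps)

for step in (0, 10_000, 100_000, 800_720):
    print(f"step {step:>7}: lr = {decayed_lr(step):.6f}")
# step       0: lr = 0.004000
# step   10000: lr = 0.003997
# step  100000: lr = 0.003974
# step  800720: lr = 0.003800
```

Over the first 100k steps the rate moves by well under 1%, which is effectively a flat line in TensorBoard.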

Score: 4

Stack Overflow user

Answered on 2019-04-16 16:17:00

From the documentation, I see that the formula for the decayed learning rate is:

```
decayed_learning_rate = learning_rate *
                        decay_rate ^ (global_step / decay_steps)
```

Here, the global_step needs to be supplied; the documentation says:

```
[...] requires a global_step value to compute the decayed learning rate.
You can just pass a TensorFlow variable that you increment at each training step.
```

So perhaps you just need to pass a global_step argument for the rate to actually decay?
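For illustration, here is a minimal, self-contained TF 1.x sketch of that wiring done by hand; the toy loss is a hypothetical stand-in (the real one comes from the detection model), and the schedule constants mirror the question's config:

```python
import tensorflow as tf

# Hypothetical toy loss standing in for the detection model's loss.
weights = tf.Variable([1.0, 2.0])
loss = tf.reduce_mean(tf.square(weights))

# The step counter that the schedule reads from.
global_step = tf.train.get_or_create_global_step()

# Same schedule as the question's config (smooth decay, no staircase).
learning_rate = tf.train.exponential_decay(
    learning_rate=0.004,
    global_step=global_step,
    decay_steps=800720,
    decay_rate=0.95)

# Log the schedule so TensorBoard plots the decayed value, not the base rate.
tf.summary.scalar('learning_rate', learning_rate)

optimizer = tf.train.RMSPropOptimizer(
    learning_rate, decay=0.9, momentum=0.9, epsilon=1.0)

# Passing global_step to minimize() is what advances the schedule;
# if the variable never increments, the decayed rate stays at its
# step-0 value forever.
train_op = optimizer.minimize(loss, global_step=global_step)
```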

Score: 3
The original content of this page was provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/55703416