Q: Combining the gradients of different "networks" in TensorFlow 2
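Stack Overflow user

Asked 2021-11-09 14:07:48

Answers: 1 · Views: 79 · Followers: 0 · Votes: 0

I am trying to combine several "networks" into one final loss function. I would like to know whether what I am doing is "legal"; so far I cannot get it to work. I am using TensorFlow Probability.

The main problem is: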

# Get gradients of the loss wrt the weights.
gradients = tape.gradient(loss, [m_phis.trainable_weights, m_mus.trainable_weights, m_sigmas.trainable_weights])

# Update the weights of our linear layer.
optimizer.apply_gradients(zip(gradients, [m_phis.trainable_weights, m_mus.trainable_weights, m_sigmas.trainable_weights]))

This gives me None gradients, and apply_gradients throws:

AttributeError: 'list' object has no attribute 'device'

Full code:

univariate_gmm = tfp.distributions.MixtureSameFamily(
        mixture_distribution=tfp.distributions.Categorical(probs=phis_true),
        components_distribution=tfp.distributions.Normal(loc=mus_true,scale=sigmas_true)
    )
x = univariate_gmm.sample(n_samples, seed=random_seed).numpy()
dataset = tf.data.Dataset.from_tensor_slices(x) 
dataset = dataset.shuffle(buffer_size=1024).batch(64)  

m_phis = keras.layers.Dense(2, activation=tf.nn.softmax)
m_mus = keras.layers.Dense(2)
m_sigmas = keras.layers.Dense(2, activation=tf.nn.softplus)

def neg_log_likelihood(y, phis, mus, sigmas):
    a = tfp.distributions.Normal(loc=mus[0],scale=sigmas[0]).prob(y)
    b = tfp.distributions.Normal(loc=mus[1],scale=sigmas[1]).prob(y)
    c = np.log(phis[0]*a + phis[1]*b)
    return tf.reduce_sum(-c, axis=-1)

# Instantiate a logistic loss function that expects integer targets.
loss_fn = neg_log_likelihood

# Instantiate an optimizer.
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3)

# Iterate over the batches of the dataset.
for step, y in enumerate(dataset):
    
    yy = np.expand_dims(y, axis=1)

    # Open a GradientTape.
    with tf.GradientTape() as tape:
        
        # Forward pass.
        phis = m_phis(yy)
        mus = m_mus(yy)
        sigmas = m_sigmas(yy)

        # Loss value for this batch.
        loss = loss_fn(yy, phis, mus, sigmas)

    # Get gradients of the loss wrt the weights.
    gradients = tape.gradient(loss, [m_phis.trainable_weights, m_mus.trainable_weights, m_sigmas.trainable_weights])

    # Update the weights of our linear layer.
    optimizer.apply_gradients(zip(gradients, [m_phis.trainable_weights, m_mus.trainable_weights, m_sigmas.trainable_weights]))

    # Logging.
    if step % 100 == 0:
        print("Step:", step, "Loss:", float(loss))

Tags: python, keras, tensorflow2.0

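1 Answer

Stack Overflow user

Accepted answer

Answered 2021-11-09 15:46:01

There are two separate problems to consider.

1. The gradients are None

This typically happens when non-TensorFlow operations are executed in the code watched by the GradientTape. Here, the culprit is the np.log call in your neg_log_likelihood function: NumPy converts the tensor to a plain array, so the tape can no longer trace the computation back to the weights. If you replace np.log with tf.math.log, the gradients should be computed. It is a good habit to avoid NumPy inside your "internal" TensorFlow components, since that prevents errors like this one. Most NumPy operations have a good TensorFlow substitute.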
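The effect can be reproduced in isolation. The following is a minimal sketch, assuming TensorFlow 2.x in eager mode; the variable `w` and the toy loss are made up for illustration and do not come from the question.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for a layer's weights.
w = tf.Variable([1.0, 2.0])

# NumPy op inside the tape: np.log returns a plain ndarray,
# so the loss is no longer connected to w.
with tf.GradientTape() as tape:
    loss_np = tf.reduce_sum(np.log(w * 2.0))
grad_np = tape.gradient(loss_np, w)

# Same computation with a TensorFlow op: the tape can trace it.
with tf.GradientTape() as tape:
    loss_tf = tf.reduce_sum(tf.math.log(w * 2.0))
grad_tf = tape.gradient(loss_tf, w)

print(grad_np)          # None
print(grad_tf.numpy())  # d/dw sum(log(2w)) = 1/w
```

Both losses have the same numerical value; only the second one carries gradient information.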

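2. apply_gradients for multiple trainables

This mainly has to do with the input that apply_gradients expects. Here, you have two options:

First option: call apply_gradients three times, each time with a different set of trainable weights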

optimizer.apply_gradients(zip(m_phis_gradients, m_phis.trainable_weights))
optimizer.apply_gradients(zip(m_mus_gradients, m_mus.trainable_weights))
optimizer.apply_gradients(zip(m_sigmas_gradients, m_sigmas.trainable_weights))

The second option is to create one list of tuples, as indicated in the TensorFlow documentation (quote: "grads_and_vars: List of (gradient, variable) pairs."). This means calling

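optimizer.apply_gradients(
    # Concatenate the per-layer pairs into one flat list of (gradient, variable) tuples.
    list(zip(m_phis_gradients, m_phis.trainable_weights))
    + list(zip(m_mus_gradients, m_mus.trainable_weights))
    + list(zip(m_sigmas_gradients, m_sigmas.trainable_weights))
)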
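To see where the AttributeError in the question comes from, it helps to look at what tape.gradient returns for nested sources. The sketch below uses three hypothetical scalar variables (`a`, `b`, `c`) in place of the layer weights and assumes TensorFlow 2.x in eager mode. It also shows tf.nest.flatten as one possible way to obtain flat (gradient, variable) pairs; that is an alternative not mentioned in the original answer.

```python
import tensorflow as tf

# Hypothetical scalar variables standing in for the weights of several layers.
a = tf.Variable(1.0)
b = tf.Variable(2.0)
c = tf.Variable(3.0)
nested_vars = [[a, b], [c]]

with tf.GradientTape() as tape:
    loss = a * b + c

# The result mirrors the nesting of the sources: [[dL/da, dL/db], [dL/dc]].
nested_grads = tape.gradient(loss, nested_vars)

# Zipping the nested structures pairs a list with a list,
# not a gradient with a variable, which apply_gradients cannot handle.
bad_pairs = list(zip(nested_grads, nested_vars))
print(type(bad_pairs[0][0]).__name__)  # list

# Flattening both structures restores one-to-one pairs.
flat_grads = tf.nest.flatten(nested_grads)
flat_vars = tf.nest.flatten(nested_vars)

optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
optimizer.apply_gradients(zip(flat_grads, flat_vars))

print([round(float(v.numpy()), 2) for v in flat_vars])  # [0.8, 1.9, 2.9]
```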

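Both options require you to split the gradients. You can do this either by computing all gradients at once and indexing them (gradients[0], ...), or simply by computing the gradients separately. Note that calling tape.gradient more than once may require persistent=True in your GradientTape.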

    # [...]
    # Open a GradientTape.
    with tf.GradientTape(persistent=True) as tape:
        # Forward pass.
        phis = m_phis(yy)
        mus = m_mus(yy)
        sigmas = m_sigmas(yy)

        # Loss value for this batch.
        loss = loss_fn(yy, phis, mus, sigmas)

    # Get gradients of the loss wrt the weights.
    m_phis_gradients = tape.gradient(loss, m_phis.trainable_weights)
    m_mus_gradients = tape.gradient(loss, m_mus.trainable_weights)
    m_sigmas_gradients = tape.gradient(loss, m_sigmas.trainable_weights)

    # Update the weights of our linear layer.
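    optimizer.apply_gradients(
        # Concatenate the per-layer pairs into one flat list of (gradient, variable) tuples.
        list(zip(m_phis_gradients, m_phis.trainable_weights))
        + list(zip(m_mus_gradients, m_mus.trainable_weights))
        + list(zip(m_sigmas_gradients, m_sigmas.trainable_weights))
    )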
   # [...]

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/69899618
