
tf.tape.gradient() returns None for some losses

Stack Overflow user
Asked on 2019-07-03 02:37:16
1 answer · 6.9K views · 0 followers · 6 votes

I'm trying to figure out why tf.GradientTape().gradient sometimes returns None, so I tested the three loss functions below (mmd0(), mmd1(), mmd2()). Although mmd0 and mmd1 have slightly different forms, both still yield gradients, but for mmd2 the gradient is None. I printed the losses from all three functions. Does anyone know why this happens?

import numpy as np
import tensorflow as tf

def mmd0(x, y): # x and y are lists of arbitrary length
  return x

def mmd1(x1, x2): # x1 and x2 are lists of arbitrary length
  dis = sum([x**2 for x in x1])/len(x1) - sum([x**2 for x in x2])/len(x2)
  return dis**2

def mmd2(x, y):
  dis = x-y
  return [tf.convert_to_tensor(elem) for elem in dis]

def get_MMD_norm(errors, sigma=0.1): 
  x2 = np.random.normal(0, sigma, len(errors))
  loss0 = mmd0(errors, x2)
  loss1 = mmd1(errors, x2)
  loss2 = mmd2(errors, x2)
  print("loss0:", loss0)
  print("loss1:", loss1)
  print("loss2:", loss2)
  return tf.cast(loss2, tf.float32)

def loss(model, x, y, sigma=0.1):
  y_ = model(x) # y_.shape is (batch_size, 3) for Iris dataset
  losses = []
  loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
  for i in range(y.shape[0]):
    loss = loss_object(y_true=y[i], y_pred=y_[i])
    losses.append(loss) 
  batch_loss = get_MMD_norm(losses)
  single_losses_list = [loss.numpy() for loss in losses]
  return tf.convert_to_tensor(batch_loss, dtype=np.float32), single_losses_list

def grad(model, inputs, targets, sigma=0.1):
  with tf.GradientTape() as tape:
    tape.watch(model.trainable_variables)
    batch_loss, single_losses = loss(model, inputs, targets, sigma=0.1)
  return tape.gradient(batch_loss, model.trainable_variables), batch_loss, single_losses 

grads, batch_loss, single_losses = grad(model, features, labels)
print("grads:", grads)
print("batch_loss:", batch_loss)
##########################################################
loss0: [<tf.Tensor: id=39621, shape=(), dtype=float32, numpy=2.1656876>, <tf.Tensor: id=39659, shape=(), dtype=float32, numpy=2.057112>, <tf.Tensor: id=39697, shape=(), dtype=float32, numpy=2.2769136>, <tf.Tensor: id=39735, shape=(), dtype=float32, numpy=2.0263004>, <tf.Tensor: id=39773, shape=(), dtype=float32, numpy=2.1568372>, <tf.Tensor: id=39811, shape=(), dtype=float32, numpy=0.7392154>, <tf.Tensor: id=39849, shape=(), dtype=float32, numpy=0.7742219>, <tf.Tensor: id=39887, shape=(), dtype=float32, numpy=2.2176154>, <tf.Tensor: id=39925, shape=(), dtype=float32, numpy=1.0187237>, <tf.Tensor: id=39963, shape=(), dtype=float32, numpy=2.160415>, <tf.Tensor: id=40001, shape=(), dtype=float32, numpy=0.80997854>, <tf.Tensor: id=40039, shape=(), dtype=float32, numpy=0.70803094>, <tf.Tensor: id=40077, shape=(), dtype=float32, numpy=0.8207226>, <tf.Tensor: id=40115, shape=(), dtype=float32, numpy=0.82957774>, <tf.Tensor: id=40153, shape=(), dtype=float32, numpy=0.88732547>, <tf.Tensor: id=40191, shape=(), dtype=float32, numpy=0.90633464>, <tf.Tensor: id=40229, shape=(), dtype=float32, numpy=0.7932346>, <tf.Tensor: id=40267, shape=(), dtype=float32, numpy=2.1767666>, <tf.Tensor: id=40305, shape=(), dtype=float32, numpy=0.80166155>, <tf.Tensor: id=40343, shape=(), dtype=float32, numpy=0.7831647>, <tf.Tensor: id=40381, shape=(), dtype=float32, numpy=0.77431095>, <tf.Tensor: id=40419, shape=(), dtype=float32, numpy=0.82067406>, <tf.Tensor: id=40457, shape=(), dtype=float32, numpy=0.74510425>, <tf.Tensor: id=40495, shape=(), dtype=float32, numpy=2.1666338>, <tf.Tensor: id=40533, shape=(), dtype=float32, numpy=0.7922478>, <tf.Tensor: id=40571, shape=(), dtype=float32, numpy=0.73235756>, <tf.Tensor: id=40609, shape=(), dtype=float32, numpy=2.1792874>, <tf.Tensor: id=40647, shape=(), dtype=float32, numpy=0.919183>, <tf.Tensor: id=40685, shape=(), dtype=float32, numpy=0.761979>, <tf.Tensor: id=40723, shape=(), dtype=float32, numpy=2.1664479>, <tf.Tensor: id=40761, shape=(), dtype=float32, numpy=0.77892226>, <tf.Tensor: id=40799, shape=(), dtype=float32, numpy=0.99058735>]
loss1: tf.Tensor(4.158007, shape=(), dtype=float32)
loss2: [<tf.Tensor: id=40935, shape=(), dtype=float64, numpy=2.325676997771268>, <tf.Tensor: id=40936, shape=(), dtype=float64, numpy=1.9988182000798667>, <tf.Tensor: id=40937, shape=(), dtype=float64, numpy=2.303379813455908>, <tf.Tensor: id=40938, shape=(), dtype=float64, numpy=2.0615775258879356>, <tf.Tensor: id=40939, shape=(), dtype=float64, numpy=2.2949723624257774>, <tf.Tensor: id=40940, shape=(), dtype=float64, numpy=0.7019287657319235>, <tf.Tensor: id=40941, shape=(), dtype=float64, numpy=0.8522054859739794>, <tf.Tensor: id=40942, shape=(), dtype=float64, numpy=2.0819949907118125>, <tf.Tensor: id=40943, shape=(), dtype=float64, numpy=1.065878291073558>, <tf.Tensor: id=40944, shape=(), dtype=float64, numpy=2.1225998300026805>, <tf.Tensor: id=40945, shape=(), dtype=float64, numpy=0.9485520218242218>, <tf.Tensor: id=40946, shape=(), dtype=float64, numpy=0.7221746903906889>, <tf.Tensor: id=40947, shape=(), dtype=float64, numpy=0.9985009994522388>, <tf.Tensor: id=40948, shape=(), dtype=float64, numpy=0.9143119687525019>, <tf.Tensor: id=40949, shape=(), dtype=float64, numpy=0.9230117922853999>, <tf.Tensor: id=40950, shape=(), dtype=float64, numpy=1.0220225043292934>, <tf.Tensor: id=40951, shape=(), dtype=float64, numpy=0.8735972169951878>, <tf.Tensor: id=40952, shape=(), dtype=float64, numpy=2.1279260795512753>, <tf.Tensor: id=40953, shape=(), dtype=float64, numpy=0.9597649765787801>, <tf.Tensor: id=40954, shape=(), dtype=float64, numpy=0.8338326272407959>, <tf.Tensor: id=40955, shape=(), dtype=float64, numpy=0.6674084331022461>, <tf.Tensor: id=40956, shape=(), dtype=float64, numpy=0.8679296826013285>, <tf.Tensor: id=40957, shape=(), dtype=float64, numpy=0.8174893483228802>, <tf.Tensor: id=40958, shape=(), dtype=float64, numpy=2.212290299049252>, <tf.Tensor: id=40959, shape=(), dtype=float64, numpy=0.7304098620074719>, <tf.Tensor: id=40960, shape=(), dtype=float64, numpy=0.8463413221121661>, <tf.Tensor: id=40961, shape=(), dtype=float64, numpy=2.3081013094190443>, <tf.Tensor: id=40962, shape=(), dtype=float64, numpy=1.0314178020997722>, <tf.Tensor: id=40963, shape=(), dtype=float64, numpy=0.774951045805575>, <tf.Tensor: id=40964, shape=(), dtype=float64, numpy=2.127838465488091>, <tf.Tensor: id=40965, shape=(), dtype=float64, numpy=0.909498425717612>, <tf.Tensor: id=40966, shape=(), dtype=float64, numpy=1.0217239989370837>]
grads: [None, None, None, None, None, None]
batch_loss: tf.Tensor(
[2.325677   1.9988182  2.3033798  2.0615776  2.2949724  0.7019288
 0.8522055  2.081995   1.0658783  2.1225998  0.948552   0.7221747
 0.998501   0.91431195 0.9230118  1.0220225  0.8735972  2.127926
 0.95976496 0.8338326  0.6674084  0.8679297  0.8174893  2.2122903
 0.73040986 0.8463413  2.3081014  1.0314178  0.77495104 2.1278384
 0.90949845 1.021724  ], shape=(32,), dtype=float32)

1 Answer

Stack Overflow user

Answered on 2019-08-17 12:24:20

Have you looked at this answer? I think I was having a similar problem, and I believe your issue may be related to mine. It has to do with the loss being computed through some step in the process where the tensor of interest gets "lost" between the start of the tape and the end. The answer referenced there points out that the original poster had a spot where a NumPy array was returned instead of a TensorFlow tensor, which kept the GradientTape from computing the gradient.
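
To make that mechanism concrete, here is a minimal sketch (my illustration, not part of the original answer; it assumes eager execution as in TF 2.x) of how round-tripping a value through NumPy severs the tape:

import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape(persistent=True) as tape:
    y = x * x                            # recorded on the tape
    z = tf.convert_to_tensor(y.numpy())  # .numpy() leaves the graph; z is a fresh constant

print(tape.gradient(y, x))  # tf.Tensor(6.0, shape=(), dtype=float32)
print(tape.gradient(z, x))  # None -- the tape never saw how z was built
del tape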

I could be wrong, since I'm by no means a TensorFlow expert, but this is something that kept coming up while I was looking for a solution to a similar problem.
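
For what it's worth, the break in the question's code appears to be in mmd2: x - y subtracts a NumPy array from a Python list of tensors, so NumPy converts the tensors to plain float64 values before tf.convert_to_tensor re-wraps them as constants with no history on the tape. A hedged rewrite (mmd2_tf is an illustrative name; it assumes x is a list of scalar loss tensors and y is a NumPy noise vector of the same length) that keeps everything in TensorFlow ops:

import tensorflow as tf

def mmd2_tf(x, y):
    # Stack the per-example losses into one tensor instead of letting
    # NumPy take over the subtraction, so the tape can trace x - y.
    x = tf.stack(x)                                # list of scalars -> shape (batch,)
    y = tf.cast(tf.convert_to_tensor(y), x.dtype)  # the noise sample is a constant input
    return x - y                                   # differentiable w.r.t. x

Passing this non-scalar result to tape.gradient is fine: TensorFlow sums a vector target, so the gradients are those of tf.reduce_sum(x - y). The loss.numpy() call used for single_losses_list is harmless, because that list is never differentiated.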

3 votes
Original question:

https://stackoverflow.com/questions/56858378
