
tf.tape.gradient() returns None for some losses

Stack Overflow user
Asked on 2019-07-03 02:37:16
1 answer · 6.9K views · 0 followers · 6 votes

I'm trying to figure out why tf.GradientTape().gradient sometimes returns None, so I tested the three loss functions below (mmd0(), mmd1(), mmd2()). Although mmd0 and mmd1 have slightly different forms, both still yield gradients, but for mmd2 the gradient is None. I printed the losses from all three functions. Does anyone know why this happens?

import numpy as np
import tensorflow as tf

def mmd0(x, y): # x and y are lists of arbitrary length
  return x

def mmd1(x1, x2): # x1 and x2 are lists of arbitrary length
  dis = sum([x**2 for x in x1])/len(x1) - sum([x**2 for x in x2])/len(x2)
  return dis**2

def mmd2(x, y):
  dis = x-y
  return [tf.convert_to_tensor(elem) for elem in dis]

def get_MMD_norm(errors, sigma=0.1): 
  x2 = np.random.normal(0, sigma, len(errors))
  loss0 = mmd0(errors, x2)
  loss1 = mmd1(errors, x2)
  loss2 = mmd2(errors, x2)
  print("loss0:", loss0)
  print("loss1:", loss1)
  print("loss2:", loss2)
  return tf.cast(loss2, tf.float32)

def loss(model, x, y, sigma=0.1):
  y_ = model(x) # y_.shape is (batch_size, 3) for Iris dataset
  losses = []
  loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
  for i in range(y.shape[0]):
    loss = loss_object(y_true=y[i], y_pred=y_[i])
    losses.append(loss) 
  batch_loss = get_MMD_norm(losses)
  single_losses_list = [loss.numpy() for loss in losses]
  return tf.convert_to_tensor(batch_loss, dtype=np.float32), single_losses_list

def grad(model, inputs, targets, sigma=0.1):
  with tf.GradientTape() as tape:
    tape.watch(model.trainable_variables)
    batch_loss, single_losses = loss(model, inputs, targets, sigma=0.1)
  return tape.gradient(batch_loss, model.trainable_variables), batch_loss, single_losses 

grads, batch_loss, single_losses = grad(model, features, labels)
print("grads:", grads)
print("batch_loss:", batch_loss)
##########################################################
loss0: [<tf.Tensor: id=39621, shape=(), dtype=float32, numpy=2.1656876>, <tf.Tensor: id=39659, shape=(), dtype=float32, numpy=2.057112>, <tf.Tensor: id=39697, shape=(), dtype=float32, numpy=2.2769136>, <tf.Tensor: id=39735, shape=(), dtype=float32, numpy=2.0263004>, <tf.Tensor: id=39773, shape=(), dtype=float32, numpy=2.1568372>, <tf.Tensor: id=39811, shape=(), dtype=float32, numpy=0.7392154>, <tf.Tensor: id=39849, shape=(), dtype=float32, numpy=0.7742219>, <tf.Tensor: id=39887, shape=(), dtype=float32, numpy=2.2176154>, <tf.Tensor: id=39925, shape=(), dtype=float32, numpy=1.0187237>, <tf.Tensor: id=39963, shape=(), dtype=float32, numpy=2.160415>, <tf.Tensor: id=40001, shape=(), dtype=float32, numpy=0.80997854>, <tf.Tensor: id=40039, shape=(), dtype=float32, numpy=0.70803094>, <tf.Tensor: id=40077, shape=(), dtype=float32, numpy=0.8207226>, <tf.Tensor: id=40115, shape=(), dtype=float32, numpy=0.82957774>, <tf.Tensor: id=40153, shape=(), dtype=float32, numpy=0.88732547>, <tf.Tensor: id=40191, shape=(), dtype=float32, numpy=0.90633464>, <tf.Tensor: id=40229, shape=(), dtype=float32, numpy=0.7932346>, <tf.Tensor: id=40267, shape=(), dtype=float32, numpy=2.1767666>, <tf.Tensor: id=40305, shape=(), dtype=float32, numpy=0.80166155>, <tf.Tensor: id=40343, shape=(), dtype=float32, numpy=0.7831647>, <tf.Tensor: id=40381, shape=(), dtype=float32, numpy=0.77431095>, <tf.Tensor: id=40419, shape=(), dtype=float32, numpy=0.82067406>, <tf.Tensor: id=40457, shape=(), dtype=float32, numpy=0.74510425>, <tf.Tensor: id=40495, shape=(), dtype=float32, numpy=2.1666338>, <tf.Tensor: id=40533, shape=(), dtype=float32, numpy=0.7922478>, <tf.Tensor: id=40571, shape=(), dtype=float32, numpy=0.73235756>, <tf.Tensor: id=40609, shape=(), dtype=float32, numpy=2.1792874>, <tf.Tensor: id=40647, shape=(), dtype=float32, numpy=0.919183>, <tf.Tensor: id=40685, shape=(), dtype=float32, numpy=0.761979>, <tf.Tensor: id=40723, shape=(), dtype=float32, numpy=2.1664479>, <tf.Tensor: id=40761, shape=(), dtype=float32, numpy=0.77892226>, <tf.Tensor: id=40799, shape=(), dtype=float32, numpy=0.99058735>]
loss1: tf.Tensor(4.158007, shape=(), dtype=float32)
loss2: [<tf.Tensor: id=40935, shape=(), dtype=float64, numpy=2.325676997771268>, <tf.Tensor: id=40936, shape=(), dtype=float64, numpy=1.9988182000798667>, <tf.Tensor: id=40937, shape=(), dtype=float64, numpy=2.303379813455908>, <tf.Tensor: id=40938, shape=(), dtype=float64, numpy=2.0615775258879356>, <tf.Tensor: id=40939, shape=(), dtype=float64, numpy=2.2949723624257774>, <tf.Tensor: id=40940, shape=(), dtype=float64, numpy=0.7019287657319235>, <tf.Tensor: id=40941, shape=(), dtype=float64, numpy=0.8522054859739794>, <tf.Tensor: id=40942, shape=(), dtype=float64, numpy=2.0819949907118125>, <tf.Tensor: id=40943, shape=(), dtype=float64, numpy=1.065878291073558>, <tf.Tensor: id=40944, shape=(), dtype=float64, numpy=2.1225998300026805>, <tf.Tensor: id=40945, shape=(), dtype=float64, numpy=0.9485520218242218>, <tf.Tensor: id=40946, shape=(), dtype=float64, numpy=0.7221746903906889>, <tf.Tensor: id=40947, shape=(), dtype=float64, numpy=0.9985009994522388>, <tf.Tensor: id=40948, shape=(), dtype=float64, numpy=0.9143119687525019>, <tf.Tensor: id=40949, shape=(), dtype=float64, numpy=0.9230117922853999>, <tf.Tensor: id=40950, shape=(), dtype=float64, numpy=1.0220225043292934>, <tf.Tensor: id=40951, shape=(), dtype=float64, numpy=0.8735972169951878>, <tf.Tensor: id=40952, shape=(), dtype=float64, numpy=2.1279260795512753>, <tf.Tensor: id=40953, shape=(), dtype=float64, numpy=0.9597649765787801>, <tf.Tensor: id=40954, shape=(), dtype=float64, numpy=0.8338326272407959>, <tf.Tensor: id=40955, shape=(), dtype=float64, numpy=0.6674084331022461>, <tf.Tensor: id=40956, shape=(), dtype=float64, numpy=0.8679296826013285>, <tf.Tensor: id=40957, shape=(), dtype=float64, numpy=0.8174893483228802>, <tf.Tensor: id=40958, shape=(), dtype=float64, numpy=2.212290299049252>, <tf.Tensor: id=40959, shape=(), dtype=float64, numpy=0.7304098620074719>, <tf.Tensor: id=40960, shape=(), dtype=float64, numpy=0.8463413221121661>, <tf.Tensor: id=40961, shape=(), dtype=float64, numpy=2.3081013094190443>, <tf.Tensor: id=40962, shape=(), dtype=float64, numpy=1.0314178020997722>, <tf.Tensor: id=40963, shape=(), dtype=float64, numpy=0.774951045805575>, <tf.Tensor: id=40964, shape=(), dtype=float64, numpy=2.127838465488091>, <tf.Tensor: id=40965, shape=(), dtype=float64, numpy=0.909498425717612>, <tf.Tensor: id=40966, shape=(), dtype=float64, numpy=1.0217239989370837>]
grads: [None, None, None, None, None, None]
batch_loss: tf.Tensor(
[2.325677   1.9988182  2.3033798  2.0615776  2.2949724  0.7019288
 0.8522055  2.081995   1.0658783  2.1225998  0.948552   0.7221747
 0.998501   0.91431195 0.9230118  1.0220225  0.8735972  2.127926
 0.95976496 0.8338326  0.6674084  0.8679297  0.8174893  2.2122903
 0.73040986 0.8463413  2.3081014  1.0314178  0.77495104 2.1278384
 0.90949845 1.021724  ], shape=(32,), dtype=float32)

1 Answer

Stack Overflow user

Answered on 2019-08-17 12:24:20

Have you looked at this answer? I think I was having a similar problem, and I believe your issue may be related to mine. It has to do with the loss being computed through some step in the process where the tensor of interest gets "lost" between the start of the tape and the end. The answer referenced there points out that the original poster had a spot where a NumPy array was returned instead of a TensorFlow tensor, which kept the GradientTape from computing the gradient.
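
To make that mechanism concrete, here is a minimal sketch (my illustration, not part of the original answer; it assumes eager execution as in TF 2.x) of how round-tripping a value through NumPy severs the tape:

import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape(persistent=True) as tape:
    y = x * x                            # recorded on the tape
    z = tf.convert_to_tensor(y.numpy())  # .numpy() leaves the graph; z is a fresh constant

print(tape.gradient(y, x))  # tf.Tensor(6.0, shape=(), dtype=float32)
print(tape.gradient(z, x))  # None -- the tape never saw how z was built
del tape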

I could be wrong, since I'm by no means a TensorFlow expert, but this is something that kept coming up while I was looking for a solution to a similar problem.
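
For what it's worth, the break in the question's code appears to be in mmd2: x - y subtracts a NumPy array from a Python list of tensors, so NumPy converts the tensors to plain float64 values before tf.convert_to_tensor re-wraps them as constants with no history on the tape. A hedged rewrite (mmd2_tf is an illustrative name; it assumes x is a list of scalar loss tensors and y is a NumPy noise vector of the same length) that keeps everything in TensorFlow ops:

import tensorflow as tf

def mmd2_tf(x, y):
    # Stack the per-example losses into one tensor instead of letting
    # NumPy take over the subtraction, so the tape can trace x - y.
    x = tf.stack(x)                                # list of scalars -> shape (batch,)
    y = tf.cast(tf.convert_to_tensor(y), x.dtype)  # the noise sample is a constant input
    return x - y                                   # differentiable w.r.t. x

Passing this non-scalar result to tape.gradient is fine: TensorFlow sums a vector target, so the gradients are those of tf.reduce_sum(x - y). The loss.numpy() call used for single_losses_list is harmless, because that list is never differentiated.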

3 votes
Original question:

https://stackoverflow.com/questions/56858378
