I followed this tutorial to implement the backpropagation algorithm. However, I am stuck on adding momentum to it.
Without momentum, here is the code for the weight-update method:
def update_weights(network, row, l_rate):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] += l_rate * neuron['delta']

Here is my implementation:
def updateWeights(network, row, l_rate, momentum=0.5):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                previous_weight = neuron['weights'][j]
                neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j] + momentum * previous_weight
            previous_weight = neuron['weights'][-1]
            neuron['weights'][-1] += l_rate * neuron['delta'] + momentum * previous_weight

This gives me a math overflow error, because over multiple epochs the weights become exponentially large. I believe my previous_weight logic for the update is wrong.
Posted on 2017-11-10 15:53:04
I'll give you a hint. In your implementation you multiply momentum by previous_weight, which is another parameter of the network in the same step. That obviously blows up quickly.

Instead, you should remember the entire update vector l_rate * neuron['delta'] * inputs[j] from the previous backpropagation step and add it in. It could look like this:

velocity[j] = l_rate * neuron['delta'] * inputs[j] + momentum * velocity[j]
neuron['weights'][j] += velocity[j]

... where velocity is an array with the same shape as network, defined in a wider scope than updateWeights, and initialized with zeros. See this post for details:
https://stackoverflow.com/questions/47211478
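To make the hint concrete, here is a minimal runnable sketch of the velocity-based update applied to the tutorial's network structure (a list of layers, each layer a list of neuron dicts with 'weights', 'delta', and 'output'). The helper init_velocity and the toy one-neuron network at the bottom are illustrative additions, not part of the original code:

```python
def init_velocity(network):
    # One zero per weight, including the bias weight at index -1,
    # so velocity mirrors the shape of the network's weights.
    return [[[0.0] * len(neuron['weights']) for neuron in layer]
            for layer in network]

def update_weights(network, row, l_rate, velocity, momentum=0.5):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for n, neuron in enumerate(network[i]):
            v = velocity[i][n]
            for j in range(len(inputs)):
                # Blend the fresh gradient step with the previous update.
                v[j] = l_rate * neuron['delta'] * inputs[j] + momentum * v[j]
                neuron['weights'][j] += v[j]
            # Bias weight: same rule, with an implicit input of 1.
            v[-1] = l_rate * neuron['delta'] + momentum * v[-1]
            neuron['weights'][-1] += v[-1]

# Toy usage with a hypothetical single-neuron network:
network = [[{'weights': [0.1, 0.2, 0.3], 'delta': 0.05, 'output': 0.0}]]
velocity = init_velocity(network)
update_weights(network, [1.0, 2.0, None], 0.1, velocity)
```

Because velocity persists across calls, repeated updates in the same direction accelerate, while the contribution of any single step decays geometrically by the momentum factor rather than compounding on the weights themselves.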