I am trying to build a 3-layer neural network: one input layer, one hidden layer, and one output layer. The input is represented by a (1, 785) NumPy array; I am using MNIST to classify digits from 0 to 9. My forward-propagation algorithm gets all the array dimensions right, but when I compute the derivatives of the network's weights and biases, the resulting arrays have different shapes from the originals, and the gradient-descent update of the weights and biases fails because, according to the NumPy documentation, broadcasting is not possible unless the corresponding dimensions are equal or one of them equals 1.
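The broadcasting rule the error message refers to can be reproduced in isolation. A minimal sketch with made-up shapes (not the network's actual arrays):

```python
import numpy as np

# Compatible trailing dimensions: equal, or one of them is 1.
a = np.zeros((10, 1))
b = np.zeros((10, 785))
c = a + b              # OK: dimensions 1 and 785 broadcast
print(c.shape)         # (10, 785)

# Incompatible: 16 vs 785, neither equal nor one of them 1 -> ValueError
d = np.zeros((10, 16))
try:
    d -= b
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)  # True
```

This is exactly what happens in an in-place update like `W -= learning_rate * dW` when `dW` does not have the same shape as `W`.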
Here is the computation of the derivatives of the weights and biases in backpropagation:
def backpropagation(self, x, y):
    predicted_value = self.forward_propagation(x)
    cost_value_derivative = self.loss_function(
        predicted_value.T, self.expected_value(y), derivative=True
    )
    print(f"{'-*-'*15} PREDICTION {'-*-'*15}")
    print(f"Predicted Value: {np.argmax(predicted_value)}")
    print(f"Actual Value: {y}")
    print(f"{'-*-'*15}{'-*-'*19}")

    derivative_W2 = (cost_value_derivative*self.sigmoid(
        self.output_layer_without_activity, derivative=True)
    ).dot(self.hidden_layer.T).T
    print(f"Derivative_W2: {derivative_W2.shape}, weights_hidden_layer_to_output_layer: {self.weights_hidden_layer_to_output_layer.shape}")
    assert derivative_W2.shape == self.weights_hidden_layer_to_output_layer.shape

    derivative_b2 = (cost_value_derivative*(self.sigmoid(
        self.output_layer_without_activity, derivative=True).T
    )).T
    print(f"Derivative_b2: {derivative_b2.shape}, bias_on_output_layer: {self.bias_on_output_layer.shape}")
    assert derivative_b2.shape == self.bias_on_output_layer.shape

    derivative_b1 = cost_value_derivative*self.sigmoid(
        self.output_layer_without_activity.T, derivative=True
    ).dot(self.weights_hidden_layer_to_output_layer.T).dot(
        self.sigmoid(self.hidden_layer_without_activity, derivative=True)
    )
    print(f"Derivative_b1: {derivative_b1.shape}, bias_on_hidden_layer: {self.bias_on_hidden_layer.shape}")
    assert derivative_b1.shape == self.bias_on_hidden_layer.shape

    derivative_W1 = cost_value_derivative*self.sigmoid(
        self.output_layer_without_activity.T, derivative=True
    ).dot(self.weights_hidden_layer_to_output_layer.T).dot(self.sigmoid(
        self.hidden_layer_without_activity, derivative=True)
    ).dot(x)
    print(f"Derivative_W1: {derivative_W1.shape}, weights_input_layer_to_hidden_layer: {self.weights_input_layer_to_hidden_layer.shape}")
    assert derivative_W1.shape == self.weights_input_layer_to_hidden_layer.shape

    return derivative_W2, derivative_b2, derivative_W1, derivative_b1

Here is my forward-propagation implementation:
def forward_propagation(self, x):
    self.hidden_layer_without_activity = self.weights_input_layer_to_hidden_layer.T.dot(x.T) + self.bias_on_hidden_layer
    self.hidden_layer = self.sigmoid(
        self.hidden_layer_without_activity
    )
    self.output_layer_without_activity = self.weights_hidden_layer_to_output_layer.T.dot(
        self.hidden_layer
    ) + self.bias_on_output_layer
    self.output_layer = self.sigmoid(
        self.output_layer_without_activity
    )
    return self.output_layer

Taking the weights_hidden_layer_to_output_layer variable as an example, the gradient-descent update for the weights and biases is weights_hidden_layer_to_output_layer -= learning_rate*derivative_W2, where derivative_W2 is the derivative of the loss function with respect to weights_hidden_layer_to_output_layer.
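The code above calls self.sigmoid(..., derivative=True) without showing its definition. A common sketch of such a helper (an assumption, since the original definition is not provided in the question) is:

```python
import numpy as np

def sigmoid(z, derivative=False):
    """Element-wise logistic function.
    With derivative=True, returns s'(z) = s(z) * (1 - s(z)) instead."""
    s = 1.0 / (1.0 + np.exp(-z))
    if derivative:
        return s * (1.0 - s)
    return s

print(sigmoid(0.0))                  # 0.5
print(sigmoid(0.0, derivative=True)) # 0.25
```

Note that a helper of this shape preserves the shape of its input, so it cannot itself cause the shape mismatch; the mismatch has to come from the order of the matrix products in the derivative expressions.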
Posted on 2022-07-06 10:56:47
Since you haven't provided the definitions of your functions, it is hard to say exactly where things go wrong. However, I usually use the following snippet to train a NN with one hidden layer and sigmoid activations everywhere. I hope it helps you debug your code.
for epoch in range(epochs):
    # forward propagation
    Z1 = np.dot(W1, X) + b1
    A1 = sigmoid(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)
    # backward propagation
    dZ2 = A2 - Y
    dW2 = 1/m * np.dot(dZ2, A1.T)
    db2 = 1/m * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = np.dot(W2.T, dZ2) * A1 * (1 - A1)  # sigmoid derivative s*(1-s)
    dW1 = 1/m * np.dot(dZ1, X.T)
    db1 = 1/m * np.sum(dZ1, axis=1, keepdims=True)
    # update parameters
    W1 = W1 - alpha * dW1
    b1 = b1 - alpha * db1
    W2 = W2 - alpha * dW2
    b2 = b2 - alpha * db2
print(f'W1:{W1} b1:{b1} W2:{W2} b2:{b2}')

https://stackoverflow.com/questions/72875645