
Wrong values of partial derivatives in a neural network in Python

Stack Overflow user
Asked on 2015-06-03 21:29:37
1 answer · 167 views · 0 followers · 0 votes

I am implementing a simple neural network classifier for the iris dataset. The network has 3 input nodes, one hidden layer with two nodes, and 3 output nodes. I have implemented everything, but the values of the partial derivatives are not being computed correctly. I have searched exhaustively for the cause but cannot find it. Here is my code for computing the partial derivatives.

Code language: python
def derivative_cost_function(self,X,Y,thetas):
    '''
        Computes the derivatives of the cost function w.r.t. the input parameters (thetas)
        for given input and labels.

        Input:
        ------
            X: can be either a single d X n-dimensional vector or d X n dimensional matrix of inputs
            thetas: must be a dk X 1-dimensional vector representing the parameters for k classes
            Y: Must be k X n-dimensional label vector
        Returns:
        ------
            partial_thetas: a dk X 1-dimensional vector of partial derivatives of the cost function w.r.t. the parameters.
    '''

    #forward pass
    a2, a3=self.forward_pass(X,thetas)

    #now back-propagate

    # unroll thetas
    l1theta, l2theta = self.unroll_thetas(thetas)


    nexamples=float(X.shape[1])

    # compute delta3, l2theta
    a3 = np.array(a3)
    a2 = np.array(a2)
    Y = np.array(Y)

    a3 = a3.T
    delta3 = (a3 * (1 - a3)) * (((a3 - Y)/((a3)*(1-a3)))) 
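    # note: the (a3 * (1 - a3)) factors cancel algebraically, so delta3 reduces to (a3 - Y)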
    l2Derivatives = np.dot(delta3, a2)
    #print "Layer 2 derivatives shape = ", l2Derivatives.shape
    #print "Layer 2 derivatives = ", l2Derivatives



    # compute delta2, l1 theta
    a2 = a2.T
    dotProduct = np.dot(l2theta.T,delta3)
    delta2 = dotProduct * (a2) * (1- a2)


    l1Derivatives = np.dot(delta2[1:], X.T)
    #print "Layer 1 derivatives shape = ", l1Derivatives.shape
    #print "Layer 1 derivatives = ", l1Derivatives


    #remember to exclude last element of delta2, representing the deltas of bias terms...
    # i.e. delta2=delta2[:-1]



    # roll thetas into a big vector
    thetas=(self.roll_thetas(l1Derivatives,l2Derivatives)).reshape(thetas.shape) # return the same shape as you received

    return thetas
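
Since the reported problem is that the analytic partial derivatives come out wrong, one way to localize the error (not part of the original post) is to compare them against a numerical, central-difference gradient. The sketch below is a minimal, self-contained helper; it assumes a scalar cost function callable on the flattened thetas vector, and the names `cost_function` and `nn` in the usage comment are placeholders rather than code from the question.

Code language: python
import numpy as np

def numerical_gradient(cost, thetas, eps=1e-4):
    """Central-difference estimate of the gradient of a scalar cost w.r.t. thetas."""
    thetas = np.asarray(thetas, dtype=float)
    grad = np.zeros_like(thetas)
    for i in range(thetas.size):
        step = np.zeros_like(thetas)
        step.flat[i] = eps
        grad.flat[i] = (cost(thetas + step) - cost(thetas - step)) / (2 * eps)
    return grad

# Usage sketch (hypothetical names):
# analytic = nn.derivative_cost_function(X, Y, thetas)
# numeric  = numerical_gradient(lambda t: nn.cost_function(X, Y, t), thetas)
# print(np.max(np.abs(analytic - numeric)))  # should be tiny (~1e-8) when the derivatives match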

1 Answer

Stack Overflow user

Accepted answer

Posted on 2015-06-03 21:47:19

Why not take a look at my implementation in network/blob/master/nn.py?

The derivative computation is actually right here:

Code language: python
def dCostFunction(self, theta, in_dim, hidden_dim, num_labels, X, y):
    #compute gradient
    t1, t2 = self.uncat(theta, in_dim, hidden_dim)

    a1, z2, a2, z3, a3 = self._forward(X, t1, t2) # p x s matrix

    # t1 = t1[1:, :] # remove bias term
    # t2 = t2[1:, :]
    sigma3 = -(y - a3) * self.dactivation(z3) # do not apply dsigmoid here? should I
    sigma2 = np.dot(t2, sigma3)
    term = np.ones((1, num_labels))
    sigma2 = sigma2 * np.concatenate((term, self.dactivation(z2)), axis=0)

    theta2_grad = np.dot(sigma3, a2.T)
    theta1_grad = np.dot(sigma2[1:, :], a1.T)

    theta1_grad = theta1_grad / num_labels
    theta2_grad = theta2_grad / num_labels

    return self.cat(theta1_grad.T, theta2_grad.T)
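
As a side note on the "should I apply dsigmoid here?" comment above: whether the activation derivative appears in the output-layer delta depends on the cost function. With a squared-error cost it stays in; with a cross-entropy cost over sigmoid outputs it cancels, which is exactly what the questioner's delta3 expression reduces to. A minimal sketch of the two variants (assumed notation: z3 pre-activations, a3 sigmoid outputs, y labels; not code from either post):

Code language: python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dsigmoid(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def output_delta(z3, a3, y, cost="cross_entropy"):
    """Output-layer delta for the two common cost choices."""
    if cost == "squared_error":
        # J = 0.5 * sum((a3 - y) ** 2): the sigmoid derivative stays in
        return (a3 - y) * dsigmoid(z3)
    # cross-entropy with sigmoid outputs: the dsigmoid factor cancels
    return a3 - y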

Hope it helps.

Votes: 2
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/30631108
