
Gradient descent in logistic regression is not reducing the loss

Asked by a Stack Overflow user on 2020-05-24 23:29:28
1 answer · 435 views · 0 followers · 0 votes

I am just getting started with machine learning and tried to implement logistic regression from scratch on the Kaggle Titanic dataset. The code I wrote follows what I learned online, and I am having trouble getting gradient descent to work. After computing the gradients for W and b, I apply the update in a function called logisitic_regression, where W = W - alpha*wgrad and b = b - alpha*bgrad, but for some reason the loss does not decrease and the parameters W and b are not updated. I cannot find the error in my code; can anyone help? See the functions below, and let me know if you need more information.

Code language: python
import numpy as np

#Implement the sigmoid activation function
def sigmoid(z):
    '''
    Input:
        z: scalar or array of dimension n
    Output:
        sgmd: scalar or array of dimension n
    '''
    sgmd = 1 / (1 + np.exp(-z))
    return sgmd



#Define the prediction function
def yPredLogistic(X, w, b=0):
    '''
    Input:
        X: nxd matrix
        w: d-dimensional vector
        b: scalar (optional; treated as 0 if not passed)
    Output:
        prob: n-dimensional vector of predicted probabilities
    '''
    prob = sigmoid(np.inner(X, w.T) + b)
    return prob


#Define the negative log-likelihood as the log loss
def log_loss(X, y, w, b=0):
    '''
    Input:
        X: nxd matrix
        y: n-dimensional vector with labels (+1 or -1)
        w: d-dimensional vector
        b: scalar bias term (optional)
    Output:
        nll: a scalar
    '''
    nll = -np.sum(np.log(sigmoid(y * (np.inner(w.T, X) + b))))
    return nll

#Define the gradient of the log loss
def gradient(X, y, w, b):
    '''
    Input:
        X: nxd matrix
        y: n-dimensional vector with labels +1 or -1
        w: d-dimensional vector
        b: scalar bias term
    Output:
        wgrad: d-dimensional vector with the gradient w.r.t. w
        bgrad: a scalar with the gradient w.r.t. b
    '''
    n, d = X.shape
    #Per-sample factor -y_i * sigmoid(-y_i * (x_i.w + b)), summed over samples via @ X
    wgrad = (-y * sigmoid(-y * (np.inner(w.T, X) + b))) @ X
    bgrad = np.sum(-y * sigmoid(-y * (np.inner(w.T, X) + b)))
    return wgrad, bgrad


#Implement the weight update of gradient descent
def logisitic_regression(X, y, max_iter, alpha):
    '''
    Input:
        X: nxd matrix
        y: n-dimensional vector with labels +1 or -1
        max_iter: max iterations
        alpha: learning rate (step size)
    Output:
        w: d-dimensional vector
        b: scalar bias term
        losses: list with the loss after each iteration
    '''
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    losses = []
    for step in range(max_iter):
        #Get the w and b gradients
        wgrad, bgrad = gradient(X, y, w, b)
        #Update w and b
        w = w - alpha * wgrad
        b = b - alpha * bgrad
        #Record the loss with the updated parameters
        losses.append(log_loss(X, y, w, b))
    return w, b, losses
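
A quick way to see whether these functions behave as intended is to run them on a small synthetic dataset with labels in {+1, -1}, the convention the docstrings assume, and check two things: that the analytic gradient matches a finite-difference estimate of log_loss, and that the recorded losses actually decrease. The sketch below does both; the sample size, random seed, learning rate, and iteration count are arbitrary choices for illustration, not values from the question.

Code language: python

import numpy as np

np.random.seed(0)

#Synthetic data with labels in {+1, -1} (arbitrary illustrative values)
n, d = 200, 3
X = np.random.randn(n, d)
true_w = np.array([1.5, -2.0, 0.5])
y = np.where(X @ true_w + 0.3 > 0, 1, -1)

#Finite-difference check of the analytic gradient at a random point
w0, b0, eps = np.random.randn(d), 0.1, 1e-6
wgrad, _ = gradient(X, y, w0, b0)
num_wgrad = np.array([
    (log_loss(X, y, w0 + eps * np.eye(d)[i], b0)
     - log_loss(X, y, w0 - eps * np.eye(d)[i], b0)) / (2 * eps)
    for i in range(d)
])
print("max gradient error:", np.max(np.abs(wgrad - num_wgrad)))

#Run gradient descent and confirm the loss goes down
w, b, losses = logisitic_regression(X, y, max_iter=100, alpha=0.01)
print("first loss:", losses[0], "last loss:", losses[-1])

If both checks pass on synthetic ±1 labels, the implementation is internally consistent, which points toward the data being fed in rather than the update rule itself.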

1 Answer

Stack Overflow user

Answered on 2020-05-27 08:38:06

I think the problem is that your code implements the +1/-1 label convention, while the Titanic dataset's labels are 0/1, not +/-1. You have to adapt the algorithm so that the derivatives and the log loss are computed correctly, because this is not the log-loss formula you would use with 0/1 labels.
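
One minimal way to act on this advice, without re-deriving anything, is to remap the 0/1 Survived column to ±1 before training, since with y = 2t - 1 the objective -sum(log(sigmoid(y*(Xw + b)))) is algebraically identical to the usual 0/1 cross-entropy -sum(t*log(p) + (1 - t)*log(1 - p)). The sketch below shows both options, reusing sigmoid from the question; the array t and the helper names log_loss_01 and gradient_01 are illustrative, not part of the original code.

Code language: python

import numpy as np

#Titanic-style labels: t in {0, 1} (made-up illustrative values)
t = np.array([0, 1, 1, 0, 1])

#Option 1: remap to the {+1, -1} convention the question's code expects
y = 2 * t - 1   # 0 -> -1, 1 -> +1

#Option 2: keep 0/1 labels and use the matching loss and gradient
def log_loss_01(X, t, w, b=0):
    p = sigmoid(X @ w + b)          # predicted P(t = 1 | x)
    return -np.sum(t * np.log(p) + (1 - t) * np.log(1 - p))

def gradient_01(X, t, w, b):
    err = sigmoid(X @ w + b) - t    # residual (p - t) of the cross-entropy
    return err @ X, np.sum(err)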

Votes: 0
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/61993653