I'm starting to learn machine learning and tried to implement logistic regression from scratch on the Kaggle Titanic dataset. The code is based on what I've learned online, and I'm stuck on the gradient descent step. After computing the gradients for w and b, I apply the updates in a function called logisitic_regression, where w = w - alpha*wgrad and b = b - alpha*bgrad, but for some reason the loss does not decrease and the w and b parameters do not update. I can't find the bug in my code; can anyone help? The functions are below. Let me know if you need more information.
#Implement sigmoid activation function
def sigmoid(z):
    '''
    Input:
        z: scalar or array of dimension n
    Output:
        sgmd: scalar or array of dimension n
    '''
    sgmd = 1 / (1 + np.exp(-z))
    return sgmd
#Define prediction function
def yPredLogistic(X, w, b=0):
    '''
    Input:
        X: nxd matrix
        w: d-dimensional vector
        b: scalar (optional; treated as 0 if not passed)
    Output:
        prob: n-dimensional vector
    '''
    prob = sigmoid(np.inner(X, w.T) + b)
    return prob
#Define negative log-likelihood as log loss
def log_loss(X, y, w, b=0):
    '''
    Input:
        X: nxd matrix
        y: n-dimensional vector with labels (+1 or -1)
        w: d-dimensional vector
    Output:
        nll: a scalar
    '''
    nll = -np.sum(np.log(sigmoid(y * (np.inner(w.T, X) + b))))
    return nll
#Define gradient
def gradient(X, y, w, b):
    '''
    Input:
        X: nxd matrix
        y: n-dimensional vector with labels +1 or -1
        w: d-dimensional vector
        b: scalar bias term
    Output:
        wgrad: d-dimensional vector with gradient
        bgrad: a scalar with gradient
    '''
    n, d = X.shape
    wgrad = -y * (sigmoid(-y * (np.inner(w.T, X) + b))) @ X
    bgrad = np.sum(-y * (sigmoid(-y * (np.inner(w.T, X) + b))))
    return wgrad, bgrad
#Implement weight update of gradient descent
def logisitic_regression(X, y, max_iter, alpha):
    '''
    Input:
        X: nxd matrix
        y: n-dimensional vector with labels +1 or -1
        max_iter: max iterations
        alpha: learning or step rate
    Output:
        w: d-dimensional vector
        b: scalar bias term
        losses: list of losses
    '''
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    losses = []
    for step in range(max_iter):
        #Get w gradient and b gradient
        wgrad, bgrad = gradient(X, y, w, b)
        #Update w and b
        w = w - alpha * wgrad
        b = b - alpha * bgrad
        #Record the loss
        losses.append(log_loss(X, y, w, b))
    return w, b, losses

Posted on 2020-05-27 08:38:06
I think the problem is that your code is written for +1/-1 output labels, while the Titanic dataset's labels are 0/1 rather than +/-1. You either need to remap the labels or change the algorithm so that the derivative and the log loss are computed correctly, because the log-loss formula you are using is not the one for 0/1 labels.
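A minimal sketch of that fix: keep the +1/-1 formulation from the question, but remap the 0/1 labels to -1/+1 before training. The code below is a compact, self-contained re-implementation of the question's functions, and the synthetic 0/1 labels are only illustrative stand-ins for the Titanic `Survived` column, not the actual data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(X, y, w, b=0.0):
    # Negative log-likelihood for labels y in {+1, -1}
    return -np.sum(np.log(sigmoid(y * (X @ w + b))))

def gradient(X, y, w, b):
    # d/dw of -sum(log sigmoid(y (Xw + b))) = sum(-y sigmoid(-y (Xw + b)) x)
    s = -y * sigmoid(-y * (X @ w + b))
    return s @ X, np.sum(s)

def logistic_regression(X, y, max_iter, alpha):
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    losses = []
    for _ in range(max_iter):
        wgrad, bgrad = gradient(X, y, w, b)
        w = w - alpha * wgrad
        b = b - alpha * bgrad
        losses.append(log_loss(X, y, w, b))
    return w, b, losses

# Synthetic 0/1 labels standing in for the Titanic 'Survived' column
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y01 = (X @ np.array([1.5, -2.0]) + 0.3 > 0).astype(float)

# The key step: remap {0, 1} -> {-1, +1} so the formulas above apply
y = 2.0 * y01 - 1.0
w, b, losses = logistic_regression(X, y, max_iter=200, alpha=0.001)
```

With the remapped labels the loss now decreases across iterations; feeding `y01` directly into these formulas would not, because the `y * (Xw + b)` margin trick only works when the labels are signs.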
https://stackoverflow.com/questions/61993653