Linear/Logistic/Softmax Regression: A Comparison

用户2183996

Last modified 2019-09-22 17:46:11

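Linear, Logistic, and Softmax Regression are common machine learning models. All three are generalized linear models and share many similarities, so this post compares them in detail. For the original post, see Linear/Logistic/Softmax Regression对比.

Overview

Linear Regression is a regression model, Logistic Regression is a binary classification model, and Softmax Regression is a multi-class classification model. All three are generalized linear models (GLMs): each is built on a linear combination of the inputs.

Softmax Regression can be viewed as the extension of Logistic Regression to multiple classes.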

Softmax Regression (synonyms: Multinomial Logistic, Maximum Entropy Classifier, or just Multi-class Logistic Regression) is a generalization of logistic regression that we can use for multi-class classification (under the assumption that the classes are mutually exclusive).
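The claim that Softmax Regression generalizes Logistic Regression can be checked numerically. The sketch below is illustrative only and assumes made-up weights and features (`theta_0`, `theta_1`, and `x` are hypothetical): with two classes, the softmax probability of class 1 should equal the sigmoid applied to the difference of the two class scores.

```python
import math

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    # Subtracting the max score improves numerical stability
    # and does not change the resulting probabilities.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical values, chosen only for illustration.
x = [0.5, -1.2, 2.0]
theta_0 = [0.1, 0.4, -0.3]   # weights for class 0
theta_1 = [-0.2, 0.3, 0.8]   # weights for class 1

# Two-class softmax probability of class 1.
p_softmax = softmax([dot(theta_0, x), dot(theta_1, x)])[1]

# Logistic regression with theta = theta_1 - theta_0.
theta_diff = [b - a for a, b in zip(theta_0, theta_1)]
p_logistic = sigmoid(dot(theta_diff, x))

print(p_softmax, p_logistic)
```

The two printed values should agree up to floating-point error, because $e^{s_1} / (e^{s_0} + e^{s_1}) = 1 / (1 + e^{-(s_1 - s_0)})$.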

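Notation

Sample: $(x^{(i)}, y^{(i)})$

Number of samples: $m$

Feature dimension: $n$

Linear Regression output: $y^{(i)} \in \mathbb{R}$

Logistic Regression class: $y^{(i)} \in \{0, 1\}$

Softmax Regression class: $y^{(i)} \in \{1, \dots, K\}$

Number of Softmax Regression classes: $K$

Loss function: $J(\theta)$

Indicator function: $1\{\cdot\}$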

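Model Parameters

Linear Regression: $\theta$, a vector of dimension $n$.

Logistic Regression: $\theta$, a vector of dimension $n$.

Softmax Regression: $\theta = [\theta_1, \dots, \theta_K]$, a matrix of dimension $n \times K$.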

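Model Outputs

Linear Regression outputs a score for the sample (a scalar): $h_\theta(x) = \theta^T x$.

Logistic Regression outputs the probability of the positive class (a scalar): $h_\theta(x) = \sigma(\theta^T x) = \dfrac{1}{1 + e^{-\theta^T x}}$.

Softmax Regression outputs the probabilities of the $K$ classes (a vector), whose $k$-th entry is $p(y = k \mid x; \theta) = \dfrac{\exp(\theta_k^T x)}{\sum_{j=1}^{K} \exp(\theta_j^T x)}$.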
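As a rough illustration of the three output types, the following minimal sketch computes a scalar score, a scalar probability, and a probability vector from the same input. The function names and the numeric values are assumptions made for this example, and the Softmax parameter matrix is stored as a list of $K$ per-class weight vectors.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def linear_output(theta, x):
    """Linear Regression: an unbounded real-valued score."""
    return dot(theta, x)

def logistic_output(theta, x):
    """Logistic Regression: probability of the positive class."""
    return 1.0 / (1.0 + math.exp(-dot(theta, x)))

def softmax_output(thetas, x):
    """Softmax Regression: one probability per class (thetas holds K weight vectors)."""
    scores = [dot(t, x) for t in thetas]
    top = max(scores)                      # shift for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical input and parameters.
x = [1.0, 2.0]
theta = [0.5, -0.25]
thetas = [[0.5, -0.25], [0.1, 0.3], [-0.4, 0.2]]   # K = 3 classes

score = linear_output(theta, x)      # scalar
prob = logistic_output(theta, x)     # scalar in (0, 1)
probs = softmax_output(thetas, x)    # list of K values summing to 1
print(score, prob, probs)
```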

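Loss Functions

Linear Regression is a regression problem, so its loss function is usually the squared error. Logistic and Softmax Regression are classification problems, so their loss function is usually the cross-entropy.

For classification, the model outputs a probability distribution over the classes for a sample $(x, y)$, which can be written uniformly as the conditional probability $p(y \mid x; \theta)$. The loss can either be written down directly as a cross-entropy or derived by maximum likelihood; the result is the same.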
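The equivalence between maximum likelihood and cross-entropy can be sketched in a few lines. The derivation below assumes the $m$ training samples are independent and that classes are indexed $k = 1, \dots, K$ (with $K = 2$ covering the binary case); these are standard assumptions, stated here because the text does not spell them out.

```latex
\begin{aligned}
L(\theta) &= \prod_{i=1}^{m} p\left(y^{(i)} \mid x^{(i)}; \theta\right) \\
-\log L(\theta) &= -\sum_{i=1}^{m} \log p\left(y^{(i)} \mid x^{(i)}; \theta\right) \\
&= -\sum_{i=1}^{m} \sum_{k=1}^{K} 1\{y^{(i)} = k\} \, \log p\left(y = k \mid x^{(i)}; \theta\right)
\end{aligned}
```

The last line is the cross-entropy between the one-hot true distribution $1\{y^{(i)} = k\}$ and the predicted distribution, summed over the samples, so minimizing the negative log-likelihood and minimizing the cross-entropy are the same problem.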

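Linear Regression. The loss function is the squared error

$$J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

Logistic Regression. The conditional probability can be written as

$$p(y \mid x; \theta) = h_\theta(x)^{y} \left( 1 - h_\theta(x) \right)^{1 - y}$$

Over all training samples, the loss function is

$$J(\theta) = -\sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left( 1 - y^{(i)} \right) \log \left( 1 - h_\theta(x^{(i)}) \right) \right]$$

Softmax Regression. The conditional probability can be written as

$$p(y \mid x; \theta) = \prod_{k=1}^{K} \left( \frac{\exp(\theta_k^T x)}{\sum_{j=1}^{K} \exp(\theta_j^T x)} \right)^{1\{y = k\}}$$

Over all training samples, the loss function is

$$J(\theta) = -\sum_{i=1}^{m} \sum_{k=1}^{K} 1\{y^{(i)} = k\} \log \frac{\exp(\theta_k^T x^{(i)})}{\sum_{j=1}^{K} \exp(\theta_j^T x^{(i)})}$$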

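Comparing the Logistic and Softmax Regression expressions, the two loss functions have exactly the same form: both are the cross-entropy loss. The cross-entropy of the true distribution $p$ and the predicted distribution $q$ is

$$H(p, q) = -\sum_{k} p_k \log q_k$$

For Logistic Regression, the true distribution is $(1, 0)$ or $(0, 1)$.

For Softmax Regression, the true distribution is a one-hot vector: $(1, 0, \dots, 0)$, $(0, 1, \dots, 0)$, or $(0, 0, \dots, 1)$.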
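To make the shared form concrete, here is a small sketch that evaluates a generic cross-entropy on one-hot targets and compares it with the binary Logistic Regression loss for a single sample. The predicted probabilities are made-up numbers, and the ordering of the two-class distribution as (P(y=0), P(y=1)) is an assumption of this example.

```python
import math

def cross_entropy(p_true, q_pred):
    """H(p, q) = -sum_k p_k * log(q_k); entries with p_k == 0 contribute nothing."""
    return -sum(p * math.log(q) for p, q in zip(p_true, q_pred) if p > 0)

def logistic_loss(y, h):
    """Per-sample Logistic Regression loss for label y in {0, 1} and prediction h = P(y=1)."""
    return -(y * math.log(h) + (1 - y) * math.log(1 - h))

h = 0.8                          # hypothetical predicted P(y=1)
q_binary = [1 - h, h]            # distribution ordered as (P(y=0), P(y=1))

ce_pos = cross_entropy([0, 1], q_binary)   # true label y = 1
ce_neg = cross_entropy([1, 0], q_binary)   # true label y = 0

# Three-class case: only the probability assigned to the true class matters.
q_multi = [0.2, 0.3, 0.5]        # hypothetical softmax output
ce_multi = cross_entropy([0, 0, 1], q_multi)

print(ce_pos, logistic_loss(1, h))
print(ce_neg, logistic_loss(0, h))
print(ce_multi, -math.log(0.5))
```

Each printed pair should match: with a one-hot target, the cross-entropy reduces to the negative log of the probability the model assigns to the true class.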

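Gradients

Linear, Logistic, and Softmax Regression are all generalized linear models, and their forms are extremely similar, including their gradients.

Linear Regression gradient

$$\nabla_\theta J(\theta) = \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$$

where $h_\theta(x) = \theta^T x$.

Logistic Regression gradient

$$\nabla_\theta J(\theta) = \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$$

where $h_\theta(x) = \sigma(\theta^T x)$.

Softmax Regression gradient

$$\nabla_{\theta_k} J(\theta) = \sum_{i=1}^{m} \left( p(y^{(i)} = k \mid x^{(i)}; \theta) - 1\{y^{(i)} = k\} \right) x^{(i)}$$

where the predicted probabilities are those given in the Model Outputs section. For ease of notation, the derivative is taken with respect to each $\theta_k$ separately.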
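One way to gain confidence in a "prediction minus target, times input" gradient is a finite-difference check. The sketch below does this for the Softmax Regression loss on a single sample; the data are hypothetical, classes are 0-based in the code (unlike the 1-based indexing in the text), and the helper names are invented for this example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    top = max(scores)                      # shift for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_loss(thetas, x, y):
    """Cross-entropy for one sample: -log p(y | x)."""
    return -math.log(softmax([dot(t, x) for t in thetas])[y])

def softmax_grad(thetas, x, y):
    """Analytic gradient: row k is (p_k - 1{y == k}) * x."""
    probs = softmax([dot(t, x) for t in thetas])
    return [[(p - (1.0 if k == y else 0.0)) * xj for xj in x]
            for k, p in enumerate(probs)]

def numeric_grad(thetas, x, y, eps=1e-6):
    """Central finite differences, one parameter at a time."""
    grads = []
    for k in range(len(thetas)):
        row = []
        for j in range(len(x)):
            plus = [list(t) for t in thetas]
            minus = [list(t) for t in thetas]
            plus[k][j] += eps
            minus[k][j] -= eps
            diff = softmax_loss(plus, x, y) - softmax_loss(minus, x, y)
            row.append(diff / (2 * eps))
        grads.append(row)
    return grads

# Hypothetical sample and parameters (K = 3 classes, n = 3 features).
x = [0.5, -1.2, 2.0]
thetas = [[0.1, 0.4, -0.3], [-0.2, 0.3, 0.8], [0.0, -0.5, 0.2]]
y = 2

analytic = softmax_grad(thetas, x, y)
numeric = numeric_grad(thetas, x, y)
max_gap = max(abs(a - n)
              for row_a, row_n in zip(analytic, numeric)
              for a, n in zip(row_a, row_n))
print(max_gap)
```

A small `max_gap` indicates the analytic expression agrees with the numerical derivative. The same check applies to the Linear and Logistic Regression gradients by swapping in their loss and prediction functions.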

The gradient form is very intuitive: the size of the update is proportional to the error term.

The magnitude of the update is proportional to the error term $(y^{(i)} - h_\theta(x^{(i)}))$; thus, for instance, if we are encountering a training example on which our prediction nearly matches the actual value of $y^{(i)}$, then we find that there is little need to change the parameters; in contrast, a larger change to the parameters will be made if our prediction $h_\theta(x^{(i)})$ has a large error (i.e., if it is very far from $y^{(i)}$).
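The proportionality between update size and error can be seen in a single-sample update for Linear Regression. This is a minimal sketch, assuming a plain gradient step with a hypothetical learning rate `lr` and toy values; it is not taken from the original post.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def single_sample_update(theta, x, y, lr=0.1):
    """One gradient step on a single sample: theta + lr * (y - h(x)) * x."""
    error = y - dot(theta, x)
    return [t + lr * error * xj for t, xj in zip(theta, x)]

def step_size(old, new):
    """Euclidean length of the parameter change."""
    return sum((a - b) ** 2 for a, b in zip(old, new)) ** 0.5

# Hypothetical parameters and input; the current prediction is 1.0.
theta = [1.0, -1.0]
x = [2.0, 1.0]

small_step = step_size(theta, single_sample_update(theta, x, y=1.1))  # error 0.1
large_step = step_size(theta, single_sample_update(theta, x, y=5.0))  # error 4.0
print(small_step, large_step)
```

Because the error for the second target is 40 times larger, the second step should also be about 40 times longer, while the step direction (along `x`) is the same in both cases.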

Originally published on the author's personal site/blog on 2019-04-07.
