Goodness of fit: the adjusted R-squared is $\hat{R}^2=1-\frac{RSS/(n-d-1)}{TSS/(n-1)}$, where:

* RSS is the residual sum of squares
* TSS is the total sum of squares
* n is the number of observations
* d is the number of features

ESS is the regression (explained) sum of squares. When the regression equation contains an intercept term, TSS = RSS + ESS; otherwise this decomposition does not hold.
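As a sketch, the adjusted R-squared above can be computed directly from its definition; the function name `adjusted_r2` is my own, and only numpy is assumed:

```python
import numpy as np

def adjusted_r2(y_true, y_pred, d):
    """Adjusted R^2 = 1 - (RSS/(n-d-1)) / (TSS/(n-1)),
    with n observations and d features."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = len(y_true)
    rss = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    tss = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1 - (rss / (n - d - 1)) / (tss / (n - 1))
```

Unlike plain R-squared, the two degrees-of-freedom corrections make this quantity decrease when a feature adds less explanatory power than chance would.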
While accuracy is the standard evaluation metric for classification, regression models are typically evaluated with MSE, RMSE, MAE, and R-Squared.

MAE (Mean Absolute Error) can be computed by hand:

mae_test = np.sum(np.absolute(y_predict - y_test)) / len(y_test)

sklearn provides these metrics directly:

mean_squared_error(y_test, y_predict)
mean_absolute_error(y_test, y_predict)
r2_score(y_test, y_predict)
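Putting the four metrics together on a small hand-picked example (the toy numbers are my own):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_test = np.array([3.0, -0.5, 2.0, 7.0])
y_predict = np.array([2.5, 0.0, 2.0, 8.0])

mse = mean_squared_error(y_test, y_predict)   # mean squared error
rmse = np.sqrt(mse)                           # root mean squared error
mae = mean_absolute_error(y_test, y_predict)  # mean absolute error
r2 = r2_score(y_test, y_predict)              # R-squared
```

RMSE is just the square root of MSE, which puts the error back on the same scale as the target variable.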
Take a quick look at this formula and it should feel familiar: it is exactly the loss function of linear regression! Right. When training a linear regression model, our goal is to minimize this loss function. So once the model is built, we can simply evaluate the same loss function on the test set and read off the loss. Simple, intuitive, direct!
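Concretely, "evaluating the loss function on the test set" just means computing the mean squared error on held-out data (toy numbers of my own):

```python
import numpy as np

y_test = np.array([1.0, 2.0, 3.0])
y_predict = np.array([2.0, 2.0, 4.0])

# The squared-error loss averaged over the test set is the test MSE.
test_mse = np.mean((y_predict - y_test) ** 2)
```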
print('Simple linear regression r-squared:', regressor.score(X_test, y_test))
print('Quadratic polynomial regression r-squared:', regressor_quadratic.score(X_test_quadratic, y_test))

Simple linear regression r-squared: 0.809726797707665
Quadratic polynomial regression r-squared: 0.8675443656345054  # larger coefficient of determination
With degree-3 fitting, the polynomial regression r-squared is 0.8356924156037133
With degree-4 fitting, the polynomial regression r-squared is 0.8095880795746723
With degree-9 fitting, the polynomial regression r-squared is -0.09435666704315328

r-squared scores: [0.57365322 0.73833251 0.69391029 0.67979254 0.73491949]
Average cross validation r-squared score: 0.6841216111623614
Test set r-squared score: 0.7716363798764403
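A self-contained sketch of the same experiment on synthetic data (the data and variable names are my own, not the original script's): test-set R-squared rises from linear to quadratic, while very high degrees tend to overfit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(60, 1))
y = (X.ravel() - 5) ** 2 + rng.normal(scale=2, size=60)  # quadratic ground truth + noise

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scores = {}
for degree in (1, 2, 9):  # linear, quadratic, and a deliberately high degree
    poly = PolynomialFeatures(degree=degree)
    model = LinearRegression().fit(poly.fit_transform(X_train), y_train)
    scores[degree] = model.score(poly.transform(X_test), y_test)  # test-set R^2
```

Because the true relationship here is a symmetric parabola, the degree-1 model scores near zero while the degree-2 model captures nearly all the signal.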
A. R-squared increases, Adjusted R-squared increases
B. R-squared increases, Adjusted R-squared decreases
C. R-squared decreases, Adjusted R-squared decreases
D. R-squared decreases, Adjusted R-squared increases

Answer: AB

Explanation: In linear regression, R-Squared measures how closely the fitted regression matches the true sample outputs. In general, the larger R-Squared is, the better the model fits. However, R-Squared only conveys a rough sense of accuracy: adding feature variables can never decrease it, so on its own it cannot precisely quantify how accurate the model is.
A. Both R-Squared and Adjusted R-Squared increase
B. R-Squared stays the same, Adjusted R-Squared increases
C. Both R-Squared and Adjusted R-Squared decrease
D. None of the above

Answer: D

Explanation: This question concerns the linear regression evaluation criteria R-Squared and Adjusted R-Squared. R-Squared is used to judge model quality: a value of 0 means the model fits very poorly, and a value of 1 means the model makes no errors. In general, the larger R-Squared is, the better the fit. However, R-Squared only conveys a rough sense of accuracy: adding feature variables can never decrease it, so it cannot precisely quantify accuracy. Looking at R-Squared alone, one cannot tell whether an added feature is meaningful.
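To illustrate "R-Squared alone cannot tell whether an added feature is meaningful", here is a sketch on synthetic data (variable names are my own) that adds a pure-noise feature: in-sample R-squared can only go up, while adjusted R-squared applies a penalty for the extra degree of freedom.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def adjusted_r2(r2, n, d):
    # Equivalent form of 1 - (RSS/(n-d-1)) / (TSS/(n-1))
    return 1 - (1 - r2) * (n - 1) / (n - d - 1)

rng = np.random.RandomState(42)
n = 50
x = rng.normal(size=(n, 1))
y = 3 * x.ravel() + rng.normal(size=n)

noise = rng.normal(size=(n, 1))   # an irrelevant extra feature
X1 = x
X2 = np.hstack([x, noise])

r2_1 = LinearRegression().fit(X1, y).score(X1, y)
r2_2 = LinearRegression().fit(X2, y).score(X2, y)  # never smaller than r2_1 in-sample
adj_1 = adjusted_r2(r2_1, n, 1)
adj_2 = adjusted_r2(r2_2, n, 2)
```

The adjusted value is always at or below the raw R-squared, and the gap widens as useless features accumulate.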
Multiple R-squared: 0.2143, Adjusted R-squared: 0.1619
F-statistic: 4.09 on 1 and 15 DF, p-value: 0.06134

Multiple R-squared: 0.1846, Adjusted R-squared: 0.1302
anova(model, model.null)  # shows p-value for model

Weight    1.6017    0.6176    2.593    0.0154 *
Multiple R-squared: 0.2055, Adjusted R-squared: 0.175
F-statistic: 6.726 on 1 and 26 DF, p-value: 0.0154
### Neither the r-squared nor the p-value agrees with what is reported
### in the Handbook.
library(car)
## Residual standard error: 2.1 on 187 degrees of freedom
## Multiple R-squared: 0.1002, Adjusted R-squared: 0.09543
## F-statistic: 20.83 on 1 and 187 DF, p-value: 9.055e-06

## Residual standard error: 0.6307 on 185 degrees of freedom
## Multiple R-squared: 0.0238, Adjusted R-squared: 0.01324
## F-statistic: 2.255 on 2 and 185 DF, p-value: 0.1077

## Multiple R-squared: 0.1325, Adjusted R-squared: 0.1279
## F-statistic: 28.57 on 1 and 187 DF, p-value: 2.615e-07

Plot the residuals.
Multiple R-squared: 0.968, Adjusted R-squared: 0.9674
F-statistic: 1473 on 3 and 146 DF, p-value: < 2.2e-16

Residual standard error: 0.3187 on 147 degrees of freedom
Multiple R-squared: 0.9942, Adjusted R-squared: 0.9941
F-statistic: 8426 on 3 and 147 DF, p-value: < 2.2e-16

Removing the intercept from the lm() model, we see:

Multiple R-squared: 0.9688, Adjusted R-squared: 0.968
F-statistic: 1127 on 4 and 145 DF, p-value: < 2.2e-16

(Note that without an intercept, TSS = RSS + ESS no longer holds, so this R-squared is computed differently and is not directly comparable to the values above.)

When selecting interaction terms, in principle one combines the explanatory variables, fits candidate models, and uses R-squared as a reference: an interaction term that increases R-squared and whose parameter estimate passes the significance test can be added to the regression model. This approach is practical only when the number of explanatory variables is small.
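The selection rule above can be sketched in Python (synthetic data with a genuine x1·x2 effect; names are my own): when the interaction is real, adding the interaction column raises R-squared markedly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1 + 2 * x1 - x2 + 1.5 * x1 * x2 + rng.normal(scale=0.5, size=n)  # true interaction

X_main = np.column_stack([x1, x2])            # main effects only
X_inter = np.column_stack([x1, x2, x1 * x2])  # main effects plus interaction term

r2_main = LinearRegression().fit(X_main, y).score(X_main, y)
r2_inter = LinearRegression().fit(X_inter, y).score(X_inter, y)
```

In practice the jump in R-squared should be checked together with the significance test of the interaction coefficient, as the text recommends.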
Residual standard error: 1.015 on 9998 degrees of freedom
Multiple R-squared: 0.9988, Adjusted R-squared: 0.9988
F-statistic: 8.006e+06 on 1 and 9998 DF, p-value: < 2.2e-16

Residual standard error: 1.121 on 72 degrees of freedom
Multiple R-squared: 0.1233, Adjusted R-squared: 0.1112
F-statistic: 10.13 on 1 and 72 DF, p-value: 0.002155

The R-squared value here is much lower.
Residual standard error: 1.097 on 27 degrees of freedom
Multiple R-squared: 0.6668, Adjusted R-squared: 0.6545
F-statistic: 54.04 on 1 and 27 DF, p-value: 6.592e-08

Residual standard error: 0.3235 on 27 degrees of freedom
Multiple R-squared: 0.9136, Adjusted R-squared: 0.9104
F-statistic: 285.6 on 1 and 27 DF, p-value: 6.934e-16
Dep. Variable: y    R-squared: 0.508    Method: Least Squares
Dep. Variable: y    R-squared: 0.523    Method: Least Squares
## Residual standard error: 0.9817 on 997 degrees of freedom
## Multiple R-squared: 0.7541, Adjusted R-squared: 0.7536
## F-statistic: 1528 on 2 and 997 DF, p-value: < 2.2e-16

## Residual standard error: 0.7534 on 997 degrees of freedom
## Multiple R-squared: 0.4636, Adjusted R-squared: 0.4626
## F-statistic: 430.9 on 2 and 997 DF, p-value: < 2.2e-16

## Multiple R-squared: 0.4158, Adjusted R-squared: 0.4146
## F-statistic: 354.8 on 2 and 997 DF, p-value: < 2.615e-07
#> Residual standard error: 6.506 on 18 degrees of freedom
#> Multiple R-squared: 0.5922, Adjusted R-squared: 0.5695
#> F-statistic: 26.14 on 1 and 18 DF, p-value: 7.282e-05

Linear regression with multiple predictors:

#> Residual standard error: 4.859 on 17 degrees of freedom
#> Multiple R-squared: 0.7852, Adjusted R-squared: 0.7599
#> F-statistic: 31.07 on 2 and 17 DF, p-value: 2.1e-06

Interaction effects:

#> Multiple R-squared: 0.8505, Adjusted R-squared: 0.8225
#> F-statistic: 30.34 on 3 and 16 DF, p-value: 7.759e-07
                   2.382311    0.0197
C                  0.358679    0.120148    2.985296    0.0038
@TREND(2003M03)    0.001129    0.000629    1.793788    0.0768

R-squared: 0.232264             Mean dependent var: 0.017248
Adjusted R-squared: 0.191857    S.D. dependent var: 0.092995

                   2.990637    0.0041
C                  0.796047    0.216183    3.682283    0.0005
@TREND(2003M03)    0.003431    0.001065    3.222824    0.0021

R-squared: 0.442577             Mean dependent var: 0.011301
Adjusted R-squared: 0.317638    S.D. dependent var: 0.096100