文章/答案/技术大牛

发布

社区首页 >问答首页 >如何从GLM输出中获取概率

问如何从GLM输出中获取概率
EN

Stack Overflow用户

提问于 2018-01-07 03:46:54

回答 2查看 10.4K关注 0票数 2

当我试图弄清楚如何从R中的glm输出中计算概率时，我非常困惑。我知道数据非常微不足道，但我真的很想知道如何从这样的输出中获得概率。我正在考虑尝试inv.logit()，但不知道应该在括号中放入什么变量。

数据来自入住率研究。我正在评估捕毛器方法与相机捕鼠器在检测3种物种(红松鼠、松鼠和入侵灰松鼠)方面的成功。我想看看是什么影响了对各种物种的检测(或不检测)。一种假设是，在该地点检测到另一个焦点物种将影响红松鼠的可检测性。鉴于松鼠是红松鼠的捕食者，而灰松鼠是竞争对手，这两个物种出现在一个地点可能会影响红松鼠的可探测性。

这会显示概率吗？inv.logit(-1.14 - 0.1322 * nonRS events)

 glm(formula = RS_sticky ~ NonRSevents_before1stRS, family = binomial(link = "logit"), data = data)
Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-0.7432  -0.7432  -0.7222  -0.3739   2.0361  
Coefficients:
                    Estimate Std. Error z value Pr(>|z|)  
(Intercept)              -1.1455     0.4677  -2.449   0.0143 *
NonRSevents_before1stRS  -0.1322     0.1658  -0.797   0.4255  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
   (Dispersion parameter for binomial family taken to be 1)
Null deviance: 34.575  on 33  degrees of freedom
   Residual deviance: 33.736  on 32  degrees of freedom
  (1 observation deleted due to missingness)
   AIC: 37.736
  Number of Fisher Scoring iterations: 5*

glm

probability

回答 2

Stack Overflow用户

发布于 2018-01-07 06:10:52

如果要预测预测变量的一组指定值的响应概率：

pframe <- data.frame(NonRSevents_before1stRS=4)
predict(fitted_model, newdata=pframe, type="response")

其中fitted_model是glm()拟合的结果，您将其存储在变量中。您可能不熟悉统计分析的R方法，即将拟合的模型存储为对象/变量，然后对其应用不同的方法(summary()、plot()、predict()、residuals()等)。

这显然只是一个虚构的例子:我不知道4对于预测来说是否是一个合理的值variable)

you可以指定更多不同的值来进行预测同时你有多个

，你必须为每个预测的每个NonRSevents_before1stRS指定一些值，例如data.frame(x=4:8,y=mean(orig_data$y), ...)

如果需要原始数据集中观测值的预测概率，只需使用predict(fitted_model, type="response")

您正确地认为inv.logit() (来自一堆不同的包，不知道您使用的是哪一个)或plogis() (来自基数R，本质上相同)将从logit或对数赔率尺度转换为概率尺度，因此

plogis(predict(fitted_model))

也是有效的(默认情况下，predict提供了对链接函数的预测，在本例中为logit/log-odds scale )。

票数 8

Stack Overflow用户

发布于 2018-01-07 04:01:52

logistic回归中的因变量是对数优势比。我们将说明如何使用MASS软件包中的航天飞机自动登陆器数据来解释系数。

加载数据后，我们将创建一个二进制依赖变量，其中：

1 = autolander used, 
0 = autolander not used.

我们还将创建一个用于航天飞机稳定性的二进制自变量：

1 = stable positioning
0 = unstable positioning.

然后，我们将使用family=binomial(link="logit")运行glm()。由于系数是对数赔率比，我们将对它们求幂，以将它们转换回赔率比。

library(MASS)
str(shuttle)
shuttle$stable <- 0
shuttle[shuttle$stability =="stab","stable"] <- 1
shuttle$auto <- 0
shuttle[shuttle$use =="auto","auto"] <- 1

fit <- glm(use ~ factor(stable),family=binomial(link = "logit"),data=shuttle) # specifies base as unstable

summary(fit)
exp(fit$coefficients)

对输出执行...and操作：

> fit <- glm(use ~ factor(stable),family=binomial(link = "logit"),data=shuttle) # specifies base as unstable
> 
> summary(fit)

Call:
glm(formula = use ~ factor(stable), family = binomial(link = "logit"), 
data = shuttle)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.1774  -1.0118  -0.9566   1.1774   1.4155  

Coefficients:
                  Estimate Std. Error z value Pr(>|z|)  
(Intercept)      4.747e-15  1.768e-01   0.000   1.0000  
factor(stable)1 -5.443e-01  2.547e-01  -2.137   0.0326 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 350.36  on 255  degrees of freedom
Residual deviance: 345.75  on 254  degrees of freedom
AIC: 349.75

Number of Fisher Scoring iterations: 4

> exp(fit$coefficients)
    (Intercept) factor(stable)1 
      1.0000000       0.5802469 
>

0的截距是不稳定的对数赔率，-.5443系数是稳定的对数赔率。在对系数进行指数运算后，我们观察到，在不稳定的穿梭机为1.0的情况下，自动破碎机的赔率是使用的，如果穿梭机是稳定的，则会乘以.58。这意味着如果穿梭机具有稳定的定位，则不太可能使用自动剥离器。

计算使用自动调用器的概率

我们可以通过两种方式做到这一点。首先，手动方法:对系数求幂，并使用以下方程式将赔率转换为概率。

p = odds / (1 + odds)

对于穿梭自动生成器数据，它的工作方式如下所示。

# convert intercept to probability
odds_i <- exp(fit$coefficients[1])
odds_i / (1 + odds_i)
# convert stable="stable" to probability
odds_p <- exp(fit$coefficients[1]) * exp(fit$coefficients[2])
odds_p / (1 + odds_p)

对输出执行...and操作：

> # convert intercept to probability
> odds_i <- exp(fit$coefficients[1])
> odds_i / (1 + odds_i)
(Intercept) 
        0.5 
> # convert stable="stable" to probability
> odds_p <- exp(fit$coefficients[1]) * exp(fit$coefficients[2])
> odds_p / (1 + odds_p)
(Intercept) 
  0.3671875 
>

当梭子不稳定时，使用自动剥离器的概率为0.5，当梭子稳定时，使用自动剥离器的概率降低到0.37。

第二种生成概率的方法是使用predict()函数。

# convert to probabilities with the predict() function
predict(fit,data.frame(stable="0"),type="response")
predict(fit,data.frame(stable="1"),type="response")

请注意，输出与手动计算的概率相匹配。

> # convert to probabilities with the predict() function
> predict(fit,data.frame(stable="0"),type="response")
  1 
0.5 
> predict(fit,data.frame(stable="1"),type="response")
        1 
0.3671875 
>

将此应用于OP数据

我们可以将这些步骤应用于OP的glm()输出，如下所示。

coefficients <- c(-1.1455,-0.1322)
exp(coefficients)
odds_i <- exp(coefficients[1])
odds_i / (1 + odds_i)
# convert nonRSEvents = 1 to probability
odds_p <- exp(coefficients[1]) * exp(coefficients[2])
odds_p / (1 + odds_p)
# simulate up to 10 nonRSEvents prior to RS
coef_df <- data.frame(nonRSEvents=0:10,
                  intercept=rep(-1.1455,11),
                  nonRSEventSlope=rep(-0.1322,11))
coef_df$nonRSEventValue <- coef_df$nonRSEventSlope * 
coef_df$nonRSEvents
coef_df$intercept_exp <- exp(coef_df$intercept)
coef_df$slope_exp <- exp(coef_df$nonRSEventValue)
coef_df$odds <- coef_df$intercept_exp * coef_df$slope_exp
coef_df$probability <- coef_df$odds / (1 + coef_df$odds)
# print the odds & probabilities by number of nonRSEvents
coef_df[,c(1,7:8)]

对最终输出执行...and命令。

> coef_df[,c(1,7:8)]
   nonRSEvents    odds probability
1            0 0.31806     0.24131
2            1 0.27868     0.21794
3            2 0.24417     0.19625
4            3 0.21393     0.17623
5            4 0.18744     0.15785
6            5 0.16423     0.14106
7            6 0.14389     0.12579
8            7 0.12607     0.11196
9            8 0.11046     0.09947
10           9 0.09678     0.08824
11          10 0.08480     0.07817
>

票数 5

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/48131237

复制

相似问题

问如何从GLM输出中获取概率
EN

回答 2

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何从GLM输出中获取概率EN

回答 2

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何从GLM输出中获取概率
EN