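I have a data frame (Df1) with two variables, gender and town. I want to calculate the odds ratio for gender (with female = 1), and I want to do this for each town, so that I end up with three odds ratios for Df1.
My actual dataset contains many more towns, so I am wondering whether there is a more general way to do this than manually entering the observed counts into epitools::oddsratio()?
Thanks!
Starting point (df):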
Df1 <- data.frame(gender=c("m","m","m","f","f","f","m","m","m","f","m","f","m","f","f","f","f","f","f","f"), town=c("ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ma","ma","ma","ma","ma","ma"))
Code so far:
library(epitools)
Df2 <- matrix(c(12,20,8,20),byrow=TRUE,ncol=2)
dimnames(Df2) <- list(Group=c("females","males"),MI=c("subtotal","total"))
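oddsratio(Df2)
Note: an odds ratio is literally the ratio between two odds.
Suppose 7 out of every 10 males are admitted: p = 0.7, q = 1 - 0.7 = 0.3
Suppose 3 out of every 10 females are admitted: p = 0.3, q = 1 - 0.3 = 0.7
Odds of admission for males: 0.7/0.3 = 2.333 (admitted / not admitted)
Odds of admission for females: 0.3/0.7 = 0.429
Odds ratio for admission: OR = 2.333/0.429 = 5.44
The odds of a male being admitted are 5.44 times those of a female.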
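The arithmetic in this note can be checked in a few lines of R. This is only a sketch of the worked example; the variable names are made up for illustration.

```r
# Admission probabilities from the worked example
p_male   <- 0.7
p_female <- 0.3

# Odds = p / q, where q = 1 - p
odds_male   <- p_male / (1 - p_male)
odds_female <- p_female / (1 - p_female)

# Odds ratio = ratio of the two odds
odds_ratio <- odds_male / odds_female
print(round(c(odds_male, odds_female, odds_ratio), 3))
```

Without intermediate rounding the ratio is 49/9, which is where the 5.44 comes from.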
Posted on 2017-10-04 07:15:27
Something like this?
library(tidyverse)
Df1 <- data.frame(gender=c("m","m","m","f","f","f","m","m","m","f","m","f","m","f","f","f","f","f","f","f"), town=c("ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ma","ma","ma","ma","ma","ma"))
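Df1 %>% group_by(town) %>% summarise(
  # share of each gender among the town's observations
  p_males = sum(gender == "m")/n(),
  p_females = sum(gender == "f")/n(),
  # odds of each gender within the town
  odds_males = p_males/p_females,
  odds_females = p_females/p_males,
  # ratio of the two within-town odds (equal to odds_males^2)
  odds_ratio = odds_males/odds_females)
https://stackoverflow.com/questions/46558335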
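Another possible reading of the question is to compare, for each town, the odds of being female in that town with the odds of being female in all other towns combined. Below is a minimal base R sketch under that assumption. `town_or` is a hypothetical helper written for this example, not a function from epitools.

```r
Df1 <- data.frame(
  gender = c("m","m","m","f","f","f","m","m","m","f","m","f","m","f","f","f","f","f","f","f"),
  town   = c("ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ny","la","ma","ma","ma","ma","ma","ma")
)

# Hypothetical helper: odds of female in town `t` divided by
# odds of female in every other town (assumed interpretation)
town_or <- function(df, t) {
  in_town <- df$town == t
  female  <- df$gender == "f"
  f_in  <- sum(female & in_town)     # females in the town
  m_in  <- sum(!female & in_town)    # males in the town
  f_out <- sum(female & !in_town)    # females elsewhere
  m_out <- sum(!female & !in_town)   # males elsewhere
  (f_in / m_in) / (f_out / m_out)
}

# Apply to every town without typing counts by hand
towns <- unique(as.character(Df1$town))
or_by_town <- sapply(towns, function(t) town_or(Df1, t))
print(or_by_town)
```

With the sample data, ma has no males, so its ratio comes out as Inf. A real analysis would probably need a continuity correction or an exact method for zero cells like that. The same four counts could also be arranged into a 2x2 matrix per town and passed to epitools::oddsratio() if confidence intervals are wanted.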