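I have a data frame with 3 trapping sessions in each season of the year, over 3 years (the real database has more than 100 seasons and 800 trapping sessions). For each trapping session I have 3 binomial variables ("Non_Breeder", "Potential_Breeder" and "Breeding").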
# example
Year <- c(rep(2000,12), rep(2001,12), rep(2002,12))
Season <- rep(1:4, each = 3, times = 3)
Trap_Session <- 1:36
Non_Breeder <- (rbinom(36, 1, prob=0.5))
Potential_Breeder <- (rbinom(36, 1, prob=0.8))
Breeding <- (rbinom(36, 1, prob=0.4))
Month <- sample(12, 36, replace = TRUE)
db <- cbind(Year, Season, Trap_Session, Non_Breeder, Potential_Breeder, Breeding)
db <- as.data.frame(db)

I want to compute the ratio "(Potential_Breeder + Breeding) / (Non_Breeder + Potential_Breeder + Breeding)" for each season, keeping the variables "Year", "Season" and "Ratio".
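I have tried using the function table, but I don't know how to loop over every season automatically while keeping the variables "Year", "Season" and "Ratio".
For example, if I have the following data: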
  Year Season Trap_Session Non_Breeder Potential_Breeder Breeding
1 2000      1            1           1                 1        0
2 2000      1            2           1                 1        0
3 2000      1            3           0                 1        0
4 2000      2            4           0                 1        1
5 2000      2            5           1                 1        1
6 2000      2            6           1                 1        1

I would like to get:
Year Season Ratio
2000 1 0.6 # (3/5)
2000 2 0.75 # (6/8)
# Explanation of the calculation
# 2000, Season 1
(3 Potential_Breeder + 0 Breeding) / (2 Non_Breeder + 3 Potential_Breeder + 0 Breeding) = 3/5
# 2000, Season 2
(3 Potential_Breeder + 3 Breeding) / (2 Non_Breeder + 3 Potential_Breeder + 3 Breeding) = 6/8

Does anyone know how to do this?
Posted on 2015-05-22 09:17:46
Try this:
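library(data.table)
# setDT() converts db to a data.table by reference; by = groups the rows
setDT(db)[, .(Ratio = sum(Potential_Breeder, Breeding) /
                sum(Non_Breeder, Potential_Breeder, Breeding)),
          by = .(Year, Season)]

This computes a variable called "Ratio" (named as you asked) from the existing data, grouped by Year and Season,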
and the same with dplyr:
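library(dplyr)
db %>%
  group_by(Year, Season) %>%
  summarise(Ratio = sum(Potential_Breeder, Breeding) /
              sum(Non_Breeder, Potential_Breeder, Breeding))

Both approaches give the following output for the db given in the OP (that data is drawn at random without a seed, so your values will differ):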
    Year Season     Ratio
 1: 2000      1 0.8000000
 2: 2000      2 0.5000000
 3: 2000      3 0.6000000
 4: 2000      4 0.8000000
 5: 2001      1 0.6666667
 6: 2001      2 0.8000000
 7: 2001      3 0.8000000
 8: 2001      4 0.6000000
 9: 2002      1 1.0000000
10: 2002      2 0.5000000
11: 2002      3 0.8571429
12: 2002      4 0.6666667

https://stackoverflow.com/questions/30392707
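
For what it's worth, the same grouped ratio can also be sketched in base R without any extra packages. The snippet below is only an illustration, checked against the six example rows from the question rather than the random db; the names ex, sums and result are made up for this sketch and do not come from the original post.

```r
# The six example rows from the question, typed in by hand.
ex <- data.frame(
  Year              = rep(2000, 6),
  Season            = rep(1:2, each = 3),
  Trap_Session      = 1:6,
  Non_Breeder       = c(1, 1, 0, 0, 1, 1),
  Potential_Breeder = c(1, 1, 1, 1, 1, 1),
  Breeding          = c(0, 0, 0, 1, 1, 1)
)

# Sum each 0/1 column within every Year x Season group.
sums <- aggregate(
  cbind(Non_Breeder, Potential_Breeder, Breeding) ~ Year + Season,
  data = ex, FUN = sum
)

# Ratio = (Potential_Breeder + Breeding) / (all three categories).
sums$Ratio <- with(sums, (Potential_Breeder + Breeding) /
                     (Non_Breeder + Potential_Breeder + Breeding))

# Keep only the columns asked for in the question.
result <- sums[, c("Year", "Season", "Ratio")]
print(result)
```

On these six rows the ratios come out as 3/5 and 6/8, matching the hand calculation in the question. One thing to watch: a group in which all three columns sum to zero would give 0/0, which R reports as NaN.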