I am trying to solve this problem in R, but I can't seem to find the right approach.
My data looks like this:
Carrier Station Month TYSeats LYSeats
AAL BSB 6 10560 10560
AAL BSB 7 10912 10912
AAL BSB 8 10560 9328
AAL BSB 9 9152 7392
AAL BSB 10 9328 9152
AAL BSB 11 8976 10384
AAL BSB 12 10208 10912
AAL CNF 6 12122 12644
AAL CNF 7 12958 13516
AAL CNF 8 10868 10138
AAL CNF 9 5434 5614
AAL CNF 10 5434 7630
AAL CNF 11 8987 9241
AAL CNF 12 12122 12958
I am using the following code:
aggregate((TYSeats-LYSeats)/LYSeats~Carrier+Station,data=df,FUN=mean)
The result I expect looks like this, i.e. (sum(TYSeats) - sum(LYSeats)) / sum(LYSeats):
1 AAL BSB 0.015385
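2 AAL CNF -0.053191
But I get this instead (it averages the per-month ratios rather than taking the ratio of the totals):
1 AA BSB 0.0270417328
2 AA CNF -0.0603483997
Is there a way to get what I need in a single simple line/command?
Thanks!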
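The gap between the two results comes from the order of operations: the original call takes the mean of the monthly ratios, while the expected numbers are the ratio of the group totals. Below is a minimal base R sketch that reproduces both, assuming the data frame is named `df` as in the question (it is rebuilt here from the posted rows so the snippet runs on its own). The names `mean_of_ratios` and `totals` are only illustrative.

```r
# Data copied from the question; `df` matches the name used in the question's code.
df <- data.frame(
  Carrier = "AAL",
  Station = rep(c("BSB", "CNF"), each = 7),
  Month   = rep(6:12, times = 2),
  TYSeats = c(10560, 10912, 10560, 9152, 9328, 8976, 10208,
              12122, 12958, 10868, 5434, 5434, 8987, 12122),
  LYSeats = c(10560, 10912, 9328, 7392, 9152, 10384, 10912,
              12644, 13516, 10138, 5614, 7630, 9241, 12958)
)

# Mean of the monthly ratios: this is what the original aggregate() call computes.
mean_of_ratios <- aggregate((TYSeats - LYSeats) / LYSeats ~ Carrier + Station,
                            data = df, FUN = mean)

# Ratio of the sums: total the seats per group first, then compute the change.
totals <- aggregate(cbind(TYSeats, LYSeats) ~ Carrier + Station,
                    data = df, FUN = sum)
totals$PercentChange <- (totals$TYSeats - totals$LYSeats) / totals$LYSeats

print(mean_of_ratios)
print(totals[, c("Carrier", "Station", "PercentChange")])
```

Summing inside `aggregate()` and dividing afterwards keeps everything in base R, at the cost of a second statement; the package-based answers do the same computation in one expression.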
Answered 2016-05-26 22:47:43
You can also use the ddply function from the plyr package:
library(plyr)
ddply(df, .(Carrier, Station), summarise,
      PercentChange = (sum(TYSeats) - sum(LYSeats))/sum(LYSeats))
  Carrier Station PercentChange
1 AAL BSB 0.01538462
2 AAL CNF -0.05319134
Answered 2016-05-27 02:22:16
We can use dplyr:
library(dplyr)
df %>%
  group_by(Carrier, Station) %>%
  summarise(PercentChange = (sum(TYSeats) - sum(LYSeats))/sum(LYSeats))
# Carrier Station PercentChange
# <chr> <chr> <dbl>
#1 AAL BSB 0.01538462
#2 AAL CNF -0.05319134
Answered 2016-05-26 22:50:35
A simple and fast data.table solution:
library(data.table)
setDT(df)
df[ , .(PercentChange = sum(TYSeats - LYSeats)/sum(LYSeats)), by = .(Carrier, Station)]
https://stackoverflow.com/questions/37472151