I'm relatively new to R and haven't been able to find an equivalent of a SQL left join online. Suppose I have the following data:
School| Year| Grade | #students | math_approached| math_metorexceeded
610534| 2016| Mathematics Grade 3| 57 | 5.3% | 94.7%
610534| 2016| Mathematics Grade 4| 60 | 8.3% | 91.7%
610534| 2016| Mathematics Grade 5| 59 | 6.8% | 93.2%
610534| 2015| Mathematics Grade 3| 57 | 5.3% | 94.7%
610534| 2015| Mathematics Grade 4| 60 | 8.3% | 91.7%
610534| 2015| Mathematics Grade 5| 59 | 6.8% | 93.2%
699999| 2015| Mathematics Grade 3| 51 | 5.3% | 94.7%
699999| 2015| Mathematics Grade 4| 61 | 8.3% | 91.7%
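699999| 2015| Mathematics Grade 5| 53 | 6.8% | 93.2%

For each school and grade, I'm trying to look up the previous year's math percentages (approached, and met or exceeded). In SQL, this would look like the following: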
select a.*, b.math_approached, b.math_metorexceeded
from mydata as a
left join mydata as b
on a.school = b.school
and a.grade = b.grade
and b.year = '2015'
and a.year = '2016'

In R, I have a data frame df that holds all of the data. Its columns are:
df$school
df$year
df$grade
df$students
df$math..approached
df$math..met.or.exceeded
Answered on 2018-04-25 00:04:06
One option is the sqldf package, which lets you run actual SQL queries against data frames in R. The code is straightforward:
library(sqldf)
query <- "select a.*, b.math_approached, b.math_metorexceeded
from df as a
left join df as b
on a.school = b.school
and a.grade = b.grade
and b.year = '2015'
and a.year = '2016'"
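result <- sqldf(query)

The only change I had to make to the original SQL query was replacing the SQL table name mydata with df, the name of the R data frame that holds the same information.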
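For comparison, a similar left join can be sketched in base R with merge() and all.x = TRUE, with no extra package. This is only a sketch under a few assumptions: the toy data frame below reuses the column names from the question with a handful of its rows, the names current and previous and the ".prev" suffix are made up for illustration, and the year column is assumed to be numeric. Unlike the SQL query, it keeps only the 2016 rows on the left side rather than every row of df.

```r
# Toy data using the column names from the question (a few rows only).
df <- data.frame(
  school = c(610534, 610534, 610534, 610534, 699999),
  year = c(2016, 2016, 2015, 2015, 2015),
  grade = c("Mathematics Grade 3", "Mathematics Grade 4",
            "Mathematics Grade 3", "Mathematics Grade 4",
            "Mathematics Grade 3"),
  students = c(57, 60, 57, 60, 51),
  math..approached = c("5.3%", "8.3%", "5.3%", "8.3%", "5.3%"),
  math..met.or.exceeded = c("94.7%", "91.7%", "94.7%", "91.7%", "94.7%"),
  stringsAsFactors = FALSE
)

# Left side: the current year's rows (hypothetical name `current`).
current <- df[df$year == 2016, ]

# Right side: last year's rows, keeping only the join keys and the
# two percentage columns (hypothetical name `previous`).
previous <- df[df$year == 2015,
               c("school", "grade",
                 "math..approached", "math..met.or.exceeded")]

# all.x = TRUE keeps every row of `current` even when `previous` has no
# match, which is what makes this a left join. The ".prev" suffix is an
# arbitrary label for the columns that come from the previous year.
result <- merge(current, previous,
                by = c("school", "grade"),
                all.x = TRUE,
                suffixes = c("", ".prev"))

print(result)
```

Rows in current that have no 2015 counterpart would get NA in the two ".prev" columns, mirroring the NULLs a SQL left join produces.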
https://stackoverflow.com/questions/50006132