I am trying to add the missing rows to the following data frame.
df = data.frame(DATE = as.Date(c("2016-05-31", "2016-08-31", "2016-10-31", "2016-07-31", "2016-08-31", "2016-10-31", "2016-12-31")),
                KONTR = c("122","122","122","553","553","102","102"),
                KAP = as.double(1:7),
                DIV = c("PI","PI","PI","OP","OP","PR","PR"))

This code works:
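library(dplyr)  # provides %>%, group_by(), do() and left_join()

# For each KONTR, build the full month-end sequence between its first and
# last DATE, then join the existing rows onto it so missing months appear as NA.
result = df %>%
  group_by(KONTR) %>%
  do(left_join(data.frame(KONTR = .$KONTR[1], DATE = seq(min(.$DATE)+1, max(.$DATE)+1, by="1 month")-1), .,
               by=c("KONTR", "DATE")))

But my actual data frame has 1.5 million rows, so this takes more than 15 minutes to finish. I tried using multidplyr in the code below, but I get an error and I don't know what is wrong.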
library(multidplyr)  # provides create_cluster() and partition()

cluster <- create_cluster(3)
by_kontr <- df %>% partition(KONTR, cluster=cluster)
result = by_kontr %>%
  group_by(KONTR) %>%
  do(left_join(data.frame(KONTR = .$KONTR[1], DATE = seq(min(.$DATE)+1, max(.$DATE)+1, by="1 month")-1), .,
               by=c("KONTR", "DATE")))
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
  3 nodes produced errors; first error: could not find function "left_join"

Posted on 2017-01-18 19:09:17
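I finally found the solution. The library has to be attached on each worker node as well, so I had to add the following line to my code:

cluster_eval(cluster, library(dplyr))

Posted on 2017-07-23 05:55:15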
Another option is to register the libraries you want to use on the cluster in advance:
multidplyr::cluster_library(cluster, "dplyr")
by_kontr %>%
  group_by(KONTR) %>%
  do(left_join(data.frame(KONTR = .$KONTR[1], DATE = seq(min(.$DATE)+1, max(.$DATE)+1, by="1 month")-1), ., by=c("KONTR", "DATE")))

Or write package::function inside the do call. That is, you write dplyr::left_join instead of left_join:
by_kontr %>%
  group_by(KONTR) %>%
  do(dplyr::left_join(data.frame(KONTR = .$KONTR[1], DATE = seq(min(.$DATE)+1, max(.$DATE)+1, by="1 month")-1), ., by=c("KONTR", "DATE")))

https://stackoverflow.com/questions/41713673
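Both fixes above come down to the same thing: each worker is a separate R session that starts with only the default packages attached, so a bare function name from another package cannot be found there. Here is a minimal sketch of that behaviour. It uses base R's parallel package rather than multidplyr, and tools::file_ext as a stand-in for dplyr::left_join. Both substitutions are assumptions made only so the example runs without extra packages.

```r
library(parallel)

# Start two worker processes; each is a fresh R session with only the
# default packages attached.
cl <- makeCluster(2)

# file_ext() lives in the 'tools' package, which ships with R but is not
# attached by default, so a bare call fails on the workers.
bare_call <- tryCatch(
  clusterCall(cl, function() file_ext("report.csv")),
  error = function(e) conditionMessage(e)
)

# Fix 1: qualify the call with its namespace (the package::function approach).
qualified <- clusterCall(cl, function() tools::file_ext("report.csv"))

# Fix 2: attach the package on every worker first (the approach taken by
# cluster_eval / cluster_library), after which the bare name resolves.
invisible(clusterEvalQ(cl, library(tools)))
attached <- clusterCall(cl, function() file_ext("report.csv"))

stopCluster(cl)
```

Here bare_call ends up holding the error message from the workers, while qualified and attached each hold one result per worker. Qualifying the call keeps the dependency visible at the call site. Attaching the package once per worker is less typing when the do call uses several functions from the same package.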