
Q: rlang: using ... with gather()

Stack Overflow user

Asked 2019-07-10 10:42:06

2 answers · 202 views · 0 followers · 4 votes

Suppose I want to compute the mean, min, and max for an arbitrary number of groups inside a custom function.

The toy data looks like this:

library(tidyverse)
df <- tibble(
  Gender = c("m", "f", "f", "m", "m", 
             "f", "f", "f", "m", "f"),
  IQ = rnorm(10, 100, 15),
  Other = runif(10),
  Test = rnorm(10),
  group2 = c("A", "A", "A", "A", "A",
             "B", "B", "B", "B", "B")
)

For two groups (Gender, group2), I can get the result with:

df %>% 
  gather(Variable, Value, -c(Gender, group2)) %>% 
  group_by(Gender, group2, Variable) %>% 
  summarise(mean = mean(Value), 
            min = min(Value), 
            max = max(Value))

This can be wrapped in a function using the new curly-curly operator ({{ }}) from rlang:

descriptive_by <- function(data, group1, group2) {
  data %>% 
    gather(Variable, Value, -c({{ group1 }}, {{ group2 }})) %>% 
    group_by({{ group1 }}, {{ group2 }}, Variable) %>% 
    summarise(mean = mean(Value), 
              min = min(Value), 
              max = max(Value))
}

Normally, I would assume that I could replace the named groups with ..., but it does not seem to work that way.

descriptive_by <- function(data, ...) {
  data %>% 
    gather(Variable, Value, -c(...)) %>% 
    group_by(..., Variable) %>% 
    summarise(mean = mean(Value), 
              min = min(Value), 
              max = max(Value))
}

Calling it returns the error:

Error in map_lgl(.x, .p, ...) : object 'Gender' not found

dplyr

rlang

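2 Answers

Stack Overflow user

Accepted answer

Posted 2019-07-10 11:40:14

Here is one possible solution, in which ... is passed directly to group_by, while gather only gathers the numeric columns (since I think it should never gather non-numeric columns, regardless of what is passed in ...).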

library(tidyverse)

set.seed(1)

## data
df <- tibble(
    Gender = c("m", "f", "f", "m", "m", 
        "f", "f", "f", "m", "f"),
    IQ = rnorm(10, 100, 15),
    Other = runif(10),
    Test = rnorm(10),
    group2 = c("A", "A", "A", "A", "A",
        "B", "B", "B", "B", "B")
)

## function
descriptive_by <- function(data, ...) {

  data %>% 
      gather(Variable, Value, names(select_if(., is.numeric))) %>% 
      group_by(..., Variable) %>% 
      summarise(mean = mean(Value), 
          min = min(Value), 
          max = max(Value))
}

descriptive_by(df, Gender, group2)
#> # A tibble: 12 x 6
#> # Groups:   Gender, group2 [4]
#>    Gender group2 Variable    mean      min     max
#>    <chr>  <chr>  <chr>      <dbl>    <dbl>   <dbl>
#>  1 f      A      IQ        95.1    87.5    103.   
#>  2 f      A      Other      0.432   0.212    0.652
#>  3 f      A      Test       0.464  -0.0162   0.944
#>  4 f      B      IQ       100.     87.7    111.   
#>  5 f      B      Other      0.281   0.0134   0.386
#>  6 f      B      Test       0.599   0.0746   0.919
#>  7 m      A      IQ       106.     90.6    124.   
#>  8 m      A      Other      0.442   0.126    0.935
#>  9 m      A      Test       0.457  -0.0449   0.821
#> 10 m      B      IQ       109.    109.     109.   
#> 11 m      B      Other      0.870   0.870    0.870
#> 12 m      B      Test      -1.99   -1.99    -1.99
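
For readers on newer package versions, the same idea (gather only the numeric columns, pass ... straight to group_by) can be sketched with pivot_longer(), which supersedes gather() in tidyr 1.0.0 and later. This is not part of the original answer: it assumes tidyr >= 1.0.0 and dplyr >= 1.0.0, and the function name descriptive_by_longer and the small df_small data set are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical variant of descriptive_by() (name invented for this sketch).
# Assumes tidyr >= 1.0.0 and dplyr >= 1.0.0.
descriptive_by_longer <- function(data, ...) {
  data %>%
    # Reshape only the numeric columns into Variable/Value pairs
    pivot_longer(cols = where(is.numeric),
                 names_to = "Variable",
                 values_to = "Value") %>%
    # The grouping columns are forwarded unchanged through ...
    group_by(..., Variable) %>%
    summarise(mean = mean(Value),
              min = min(Value),
              max = max(Value),
              .groups = "drop")
}

# Small deterministic example data (also invented for this sketch)
df_small <- tibble(
  Gender = c("m", "m", "f"),
  group2 = c("A", "A", "B"),
  IQ     = c(100, 110, 90),
  Test   = c(1, 3, 5)
)

out <- descriptive_by_longer(df_small, Gender, group2)
out
```

As in the answer above, this relies on the grouping columns being non-numeric; a numeric grouping column would be reshaped along with the measurement columns.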

2 votes

Stack Overflow user

Posted 2019-07-11 02:50:52

The tricky part is figuring out how to negate the NSE variables (xxx and -xxx). Here is an example of how I would handle it:

desc_by <- function(dat, ...) {

  drops <- lapply(enquos(...), function(d) call("-", d))

  dat %>% 
    gather(var, val, !!!drops) %>% 
    group_by(...) %>% 
    summarise_at(vars(val), list(min = min, mean = mean, max = max))  # funs() is deprecated since dplyr 0.8.0

}

desc_by(head(iris), Species, Petal.Width)

# A tibble: 2 x 5
# Groups:   Species [1]
  Species Petal.Width   min  mean   max
  <fct>         <dbl> <dbl> <dbl> <dbl>
1 setosa          0.2   1.3  3.18   5.1
2 setosa          0.4   1.7  3.67   5.4

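You still have to use enquos and !!! to apply - to each variable, but otherwise ... can be used as-is for grouping and so on. So you do not need the new "moustache"/curly-curly operator at all.

1 vote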

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/56968968
