
How do I create a data summary function?

Stack Overflow user
Asked 2019-11-01 01:16:20
1 answer · 929 views · 0 followers · 0 votes

I am trying to write a function that summarizes several vectors. The prompt is:

Write a function data_summary which takes three inputs:\
`dataset`: A data frame\
`vars`: A character vector whose elements are names of columns from dataset which the user wants summaries for\
`group.name`: A length one character vector which gives the name of the column from dataset which contains the factor which will be used as a grouping variable\
`var.names`: A character vector of the same length as vars which gives the names that the user would like used as the entries under “Variable” in the resulting output. This should be set equal to vars by default, so the default behavior is to use the column names from dataset.

The output of the function should be a data frame with the following structure:

Column names of the data frame will be:\
`Variable`\
`Missing`\
The `first` level of the factor group.name\
The `second` level of the factor group.name\
…\
The `kth` level of the factor group.name\
`p-value`

I have set up the function skeleton:

data_summary <- function(dataset,vars,group.name,var.names) {
}
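The prompt asks for `var.names` to default to `vars`, which the skeleton above does not yet do. In R, a default argument may refer to another argument, so one way to express this is sketched below. The helper `label_vars` is hypothetical and exists only to illustrate the default; it is not part of the assignment.

```r
# Hypothetical helper, for illustration only: var.names falls back to vars
label_vars <- function(vars, var.names = vars) {
  # The prompt requires both vectors to have the same length
  stopifnot(length(var.names) == length(vars))
  # Return the display names, named by the column they describe
  setNames(var.names, vars)
}

default_labels <- label_vars(c("age", "fare"))
custom_labels  <- label_vars(c("age", "fare"), c("Age", "Fare ($)"))
```

With no fourth argument the column names are reused as labels; otherwise the supplied labels are used.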

An example call is provided:

#data_summary<-function(dataset, vars,group.name, var.name){}

#example
#data_summary(titanic4, c("survived", "female", "age", "sibsp", "parch", "fare", "cabin"), "pclass")
#data_summary(titanic4, c("survived", "female", "age", "sibsp", "parch", "fare", "cabin"), "pclass", c("Survival rate", "% Female", "Age", "# siblings/spouses aboard", "# children/parents aboard", "Fare ($)", "Cabin"))

But apart from showing which arguments to pass to the function, it doesn't really help me.


1 Answer

Stack Overflow user

Answered 2019-11-01 03:31:35

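You can use the dplyr package for this. I don't know which functions you want to summarize your data frame with, so I used all the statistics that the base package's `summary` function returns.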
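For reference, these are the six statistics that base R's `summary()` reports for a numeric vector. A small illustration on a made-up vector (the values are arbitrary and not taken from any data in this thread):

```r
# Made-up numeric vector, for illustration only
x <- c(1, 2, 3, 4, 10)
s <- summary(x)

# The six statistics that summary() reports for a numeric vector
names(s)
# "Min."    "1st Qu." "Median"  "Mean"    "3rd Qu." "Max."
```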

My data:

> NewSKUMatrix
# A tibble: 268,918 x 4
   LagerID FilialID CSBID Price
     <int>    <int> <int> <dbl>
 1     233     2578  1005  38.3
 2     333     2543    NA  61.0
 3     334     2543    NA  15.0
 4     335     2543    NA  11.0
 5     337     2301    NA  71.0
 6     338     2031    NA  37.0
 7     338     2044    NA  35.0
 8     338     2054    NA  36.0
 9     338     2060    NA  37.0
10     338     2063    NA  36.0
# ... with 268,908 more rows

The function:

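library(dplyr)  # provides %>%, group_by_at(), summarise_at(), rename_at()

data_summary <- function(data,
                         variables,
                         values,
                         names = NULL) {
   # Default the display names to the grouping column names
   if (is.null(x = names)) {
      names <- variables
   }
   data %>%
      # Group by the columns named in `variables`
      group_by_at(.vars = variables) %>%
      # Compute the statistics reported by base summary() for each column in `values`
      summarise_at(
         .vars = values,
         .funs = list(
            Min. = min,
            `1st Qu.` = ~ quantile(x = ., probs = 0.25),
            Median = median,
            Mean = mean,
            `3rd Qu.` = ~ quantile(x = ., probs = 0.75),
            Max. = max
         )
      ) %>%
      # Rename the grouping columns to the requested display names
      rename_at(.vars = variables,
                .funs = ~ names)
}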

Output:

data_summary(NewSKUMatrix,
             c('LagerID'),
             c('Price'),
             c('SKU'))
# A tibble: 32,454 x 7
     SKU  Min. `1st Qu.` Median  Mean `3rd Qu.`  Max.
   <int> <dbl>     <dbl>  <dbl> <dbl>     <dbl> <dbl>
 1    17  39.0      39.0   39.0  39.0      39.0  39.0
 2    18 120.      120.   120.  121.      120.  140. 
 3    21 289.      289.   289.  289.      289.  289. 
 4    24  37.0      37.0   37.0  45.2      45.2  70.0
 5    25  14.0      14.0   14.0  14.0      14.0  14.0
 6    55  30.9      30.9   30.9  30.9      30.9  30.9
 7   117  26.9      26.9   26.9  26.9      26.9  26.9
 8   118  24.8      24.9   24.9  25.1      25.1  25.7
 9   119  24.8      24.8   24.9  25.1      25.3  25.7
10   158 104.      108.   108.  107.      108.  108. 
# ... with 32,444 more rows
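
The output above is one row per group with quantile columns, which differs from the `Variable` / `Missing` / one column per factor level / `p-value` layout described in the question. A minimal base R sketch of that layout might look like the following. It rests on several assumptions the prompt does not settle: every column in `vars` is numeric, the per-level entry is the group mean, and the p-value comes from a one-way ANOVA. The name `data_summary_sketch` and the use of the built-in `mtcars` data are for illustration only.

```r
# Sketch only. Assumptions: numeric columns, group means, one-way ANOVA p-value.
data_summary_sketch <- function(dataset, vars, group.name, var.names = vars) {
  stopifnot(length(var.names) == length(vars))
  group <- factor(dataset[[group.name]])

  rows <- lapply(seq_along(vars), function(i) {
    x <- dataset[[vars[i]]]
    # Mean of the variable within each level, in the order of levels(group)
    means <- tapply(x, group, mean, na.rm = TRUE)

    out_row <- data.frame(Variable = var.names[i],
                          Missing = sum(is.na(x)),
                          check.names = FALSE)
    # One column per level of the grouping factor
    out_row[levels(group)] <- as.list(as.vector(means))
    # Assumed test: classical one-way ANOVA (equal variances)
    out_row[["p-value"]] <- oneway.test(x ~ group, var.equal = TRUE)$p.value
    out_row
  })

  # Stack the one-row data frames into the final table
  do.call(rbind, rows)
}

# Usage on a built-in dataset, grouping by number of cylinders
res <- data_summary_sketch(mtcars, c("mpg", "hp"), "cyl",
                           c("MPG", "Horsepower"))
```

For categorical variables, or for a different summary statistic or test, the two commented assumptions are the lines that would need to change.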
Votes: 0
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/58648351
