Q: Forcing .name_repair to create names

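Stack Overflow user

Asked 2019-11-10 15:55:04

Answers: 1 | Views: 2.1K | Followers: 0 | Votes: 1

I have two data sets, df_single and df_multi. My code works well on df_multi, but when I apply the same code to df_single I run into a problem.

I run the following code: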

df_single %>% 
  as_tibble(., .name_repair = "universal") %>% 
  summarise_at(.vars = 8:ncol(.), .funs = c(mean = "mean", sd = "sd"))

This gives me the following:

# A tibble: 1 x 2
   mean    sd
  <dbl> <dbl>
1  42.4 0.380

The values are fine, but the column names are not in the format I want. If I run the following:

df_multi %>% 
  as_tibble(., .name_repair = "universal") %>% 
  summarise_at(.vars = 8:ncol(.), .funs = c(mean = "mean", sd = "sd"))

I get:

# A tibble: 1 x 8
  pza_del_carmen_… pza_de_espana_m… escuelas_aguirr… retiro_mean pza_del_carmen_… pza_de_espana_sd
             <dbl>            <dbl>            <dbl>       <dbl>            <dbl>            <dbl>
1             29.5             23.8             31.8        11.8             21.2             18.3
# … with 2 more variables: escuelas_aguirre_sd <dbl>, retiro_sd <dbl>

This is the format I want.

I expected the output for df_single to be:

# A tibble: 1 x 2
  tres_olivos_mean tres_olivos_sd
             <dbl>          <dbl>
1             42.4          0.380

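That is, with the column name carried through into the result. I suspect the "problem" comes from .name_repair =, since there are no conflicts among the column names in the df_single data. Looking at df_single: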

# A tibble: 6 x 8
  date         day month  year quarter semester weekday tres_olivos
  <date>     <int> <dbl> <dbl>   <int>    <int>   <dbl>       <dbl>
1 2010-01-01     1     1  2010       1        1       0        42.9
2 2010-01-02     2     1  2010       1        1       0        42.7
3 2010-01-03     3     1  2010       1        1       0        42.5
4 2010-01-04     4     1  2010       1        1       0        42.3
5 2010-01-05     5     1  2010       1        1       0        42.1
6 2010-01-06     6     1  2010       1        1       0        41.9

I would like the name tres_olivos to be taken from the column of interest. df_multi looks like:

# A tibble: 6 x 11
  date         day month  year quarter semester weekday pza_del_carmen pza_de_espana escuelas_aguirre retiro
  <date>     <int> <dbl> <dbl>   <int>    <int>   <dbl>          <dbl>         <dbl>            <dbl>  <dbl>
1 2010-01-01     1     1  2010       1        1       0              6             4               18      3
2 2010-01-02     2     1  2010       1        1       0             26            20               28      9
3 2010-01-03     3     1  2010       1        1       0             51            50               41     22
4 2010-01-04     4     1  2010       1        1       0             57            39               48     21
5 2010-01-05     5     1  2010       1        1       0             29            25               37     12
6 2010-01-06     6     1  2010       1        1       0              8             5               19      4

Data:

df_single <- structure(list(date = structure(c(14610, 14611, 14612, 14613, 
14614, 14615), class = "Date"), day = 1:6, month = c(1, 1, 1, 
1, 1, 1), year = c(2010, 2010, 2010, 2010, 2010, 2010), quarter = c(1L, 
1L, 1L, 1L, 1L, 1L), semester = c(1L, 1L, 1L, 1L, 1L, 1L), weekday = c(0, 
0, 0, 0, 0, 0), tres_olivos = c(42.8840939928959, 42.6809748158197, 
42.4778556387312, 42.2747364616426, 42.0716172845541, 41.8684981074656
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-6L))

df_multi <- structure(list(date = structure(c(14610, 14611, 14612, 14613, 
14614, 14615), class = "Date"), day = 1:6, month = c(1, 1, 1, 
1, 1, 1), year = c(2010, 2010, 2010, 2010, 2010, 2010), quarter = c(1L, 
1L, 1L, 1L, 1L, 1L), semester = c(1L, 1L, 1L, 1L, 1L, 1L), weekday = c(0, 
0, 0, 0, 0, 0), pza_del_carmen = c(6, 26, 51, 57, 29, 8), pza_de_espana = c(4, 
20, 50, 39, 25, 5), escuelas_aguirre = c(18, 28, 41, 48, 37, 
19), retiro = c(3, 9, 22, 21, 12, 4)), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -6L))

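Edit: from the documentation

The .name_repair argument of tibble() and as_tibble() refers to these levels. Alternatively, the user can pass their own name repair function. It should anticipate minimal names as input and should, likewise, return names that are at least minimal.

Passing my own name repair function could be interesting.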
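A rough sketch of what passing a custom repair function could look like. The helper my_repair and the toy inputs are made up for illustration and are not part of my real data. As far as I can tell, name repair only acts on the column names of the input, so it may not be what decides how the summarised columns are named.

```r
library(tibble)

# Hypothetical repair function: receives the (minimal) names and returns
# unique names by appending "_1", "_2", ... to duplicates.
my_repair <- function(nms) make.unique(nms, sep = "_")

# Toy input with a duplicated name (illustrative only)
dup <- list(station = c(1, 2), station = c(3, 4))
repaired <- as_tibble(dup, .name_repair = my_repair)
names(repaired)
# "station" "station_1"

# Names that are already unique and syntactic are left unchanged
ok <- tibble(date = Sys.Date(), tres_olivos = 42.9)
unchanged <- identical(names(as_tibble(ok, .name_repair = "universal")), names(ok))
unchanged
# TRUE
```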

Edit:

The data looks like this:

my_list <- list(list(structure(list(date = structure(c(14610, 14611, 14612, 
14613, 14614, 14615), class = "Date"), day = 1:6, month = c(1, 
1, 1, 1, 1, 1), year = c(2010, 2010, 2010, 2010, 2010, 2010), 
    quarter = c(1L, 1L, 1L, 1L, 1L, 1L), semester = c(1L, 1L, 
    1L, 1L, 1L, 1L), weekday = c(0, 0, 0, 0, 0, 0), pza_del_carmen = c(6, 
    26, 51, 57, 29, 8), pza_de_espana = c(4, 20, 50, 39, 25, 
    5), escuelas_aguirre = c(18, 28, 41, 48, 37, 19), retiro = c(3, 
    9, 22, 21, 12, 4)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L)), structure(list(date = structure(c(14611, 
14612, 14613, 14614, 14615, 14616), class = "Date"), day = 2:7, 
    month = c(1, 1, 1, 1, 1, 1), year = c(2010, 2010, 2010, 2010, 
    2010, 2010), quarter = c(1L, 1L, 1L, 1L, 1L, 1L), semester = c(1L, 
    1L, 1L, 1L, 1L, 1L), weekday = c(0, 0, 0, 0, 0, 0), pza_del_carmen = c(26, 
    51, 57, 29, 8, 22), pza_de_espana = c(20, 50, 39, 25, 5, 
    12), escuelas_aguirre = c(28, 41, 48, 37, 19, 26), retiro = c(9, 
    22, 21, 12, 4, 7)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L))), list(structure(list(date = structure(c(14610, 
14611, 14612, 14613, 14614, 14615), class = "Date"), day = 1:6, 
    month = c(1, 1, 1, 1, 1, 1), year = c(2010, 2010, 2010, 2010, 
    2010, 2010), quarter = c(1L, 1L, 1L, 1L, 1L, 1L), semester = c(1L, 
    1L, 1L, 1L, 1L, 1L), weekday = c(0, 0, 0, 0, 0, 0), tres_olivos = c(42.8840939928959, 
    42.6809748158197, 42.4778556387312, 42.2747364616426, 42.0716172845541, 
    41.8684981074656)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L)), structure(list(date = structure(c(14611, 
14612, 14613, 14614, 14615, 14616), class = "Date"), day = 2:7, 
    month = c(1, 1, 1, 1, 1, 1), year = c(2010, 2010, 2010, 2010, 
    2010, 2010), quarter = c(1L, 1L, 1L, 1L, 1L, 1L), semester = c(1L, 
    1L, 1L, 1L, 1L, 1L), weekday = c(0, 0, 0, 0, 0, 0), tres_olivos = c(42.6809748158197, 
    42.4778556387312, 42.2747364616426, 42.0716172845541, 41.8684981074656, 
    41.6653789303771)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L))))

I tried to replicate the original list as closely as possible, using:

mylist <- list(
  list(head(map(rolled_splits[[2]]$splits, ~ analysis(.x))[[1]]),
       head(map(rolled_splits[[2]]$splits, ~ analysis(.x))[[2]])),
  list(head(map(rolled_splits[[3]]$splits, ~ analysis(.x))[[1]]),
       head(map(rolled_splits[[3]]$splits, ~ analysis(.x))[[2]]))
)

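Answers: 1

Stack Overflow user

Accepted answer

Posted 2019-11-10 16:28:29

Here is a little trick we can use, because by default a single column gets only the function names as the names of the new columns; see ?summarise_at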

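library(dplyr)
library(purrr)  # for map(), used further down
# Only column 8 (tres_olivos) is summarised, so the results are named just
# "mean" and "sd"; prepend the column name by hand.
df_single %>% 
   summarise_at(.vars = 8:ncol(.), .funs = c(mean = "mean", sd = "sd")) %>% 
   rename_all(~paste0(names(df_single)[8], '_', .))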

# A tibble: 1 x 2
  tres_olivos_mean tres_olivos_sd
             <dbl>          <dbl>
1             42.4          0.380

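From the Naming section of ?summarise_at:

The names of the created columns are derived from the names of the input variables and the names of the functions.

If there is only one unnamed variable, the names of the functions are used to name the created columns.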
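A minimal sketch of that naming rule on toy data (the tibble toy and its columns are made up for illustration, not taken from the question):

```r
library(dplyr)

# Toy data (hypothetical, not the question's data)
toy <- tibble(id = 1:3, a = c(1, 2, 3), b = c(4, 5, 6))
fns <- c(mean = "mean", sd = "sd")

# One variable: only the function names are used
one <- toy %>% summarise_at(vars(a), fns)
names(one)
# "mean" "sd"

# Two variables: variable name and function name are combined
two <- toy %>% summarise_at(vars(a, b), fns)
names(two)
# "a_mean" "b_mean" "a_sd" "b_sd"
```

This is the same pattern as in the question: df_single has one value column and gets mean and sd, while df_multi has several and gets the combined names.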

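# For data frames with a single value column, also summarise column 7 (weekday) so the
# variable names are kept, then keep only the value column's mean and sd with select(2, 4).
map(my_list, ~map(.,~if(ncol(.)>8) .x %>% summarise_at(.vars = 8:ncol(.), .funs = c(mean = "mean", sd = "sd")) 
                     else .x %>% summarise_at(.vars = 7:ncol(.), .funs = c(mean = "mean", sd = "sd")) %>% select(2,4)))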

# A more robust solution is to depend on names rather than positions
summarise_fun <- function(df){
  # value columns: everything except the date-related columns
  nms <- setdiff(names(df), c("date", "day", "month", "year", "quarter", "semester", "weekday"))
  if(length(nms)>1){
    df %>% summarise_at(.vars = nms, .funs = c(mean = "mean", sd = "sd"))
  }else{
    # a single column yields just "mean" and "sd", so prepend its name
    df %>% summarise_at(.vars = nms, .funs = c(mean = "mean", sd = "sd")) %>% rename_all(~paste0(nms,'_',.))
  }
}

map(my_list, ~map(., summarise_fun))
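As a possible alternative, assuming dplyr 1.0.0 or later is available (it postdates this answer), across() with an explicit .names pattern should give the same naming for one column or many, so no special case is needed. The tibble df_small below is a shortened stand-in for df_single, not the real data.

```r
library(dplyr)

# Small stand-in for df_single (hypothetical, values rounded from the question)
df_small <- tibble(
  date = as.Date("2010-01-01") + 0:2,
  weekday = c(0, 0, 0),
  tres_olivos = c(42.9, 42.7, 42.5)
)

res <- df_small %>%
  summarise(across(
    !c(date, weekday),            # every column except the date-related ones
    list(mean = mean, sd = sd),   # named list of functions
    .names = "{.col}_{.fn}"       # always build <column>_<function>
  ))
names(res)
# "tres_olivos_mean" "tres_olivos_sd"
```

With several value columns the results come out grouped by input column rather than by function, so the column order differs from the summarise_at() output shown in the question.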

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/58790341
