
Aggregating numeric and character columns at the same time

Stack Overflow user
Asked 2020-10-15 23:57:06
Answers: 4 · Views: 47 · Followers: 0 · Votes: 2

I need to aggregate several variables of different classes in one step.

test <- data.frame(name = c("anna", "joe", "anna"),
                   party = c("red", "blue", "red"),
                   text = c("hey there", "we ate an apple", "i took a walk"), 
                   numberofwords = c(2, 4, 4), 
                   score1 = 1:3, 
                   score2 = 4:6)

It currently looks like this:

#   name    party      text           numberofwords score1 score2
#1  anna    red       hey there             2         1      4
#2  joe     blue    we ate an apple         4         2      5
#3  anna    red      i took a walk          4         3      6

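I would like to aggregate score1, score2, numberofwords and the text variable by name and party: the numeric columns should be summed and the text values pasted together.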

The desired result is:

#   name  party            text                  numberofwords score1 score2
#1  anna  red           hey there i took a walk       6           4      10
#2   joe  blue           we ate an apple              4           2      5

4 Answers

Stack Overflow user

Accepted answer

Posted 2020-10-16 00:20:12

With a recent version of dplyr (across() requires 1.0.0 or later):

library(dplyr)

test %>%
  group_by(name, party) %>%
  summarize(
    across(text, paste, collapse = " "),
    across(where(is.numeric), sum)
  )
# # A tibble: 2 x 6
#   name  party text                    numberofwords score1 score2
#   <chr> <chr> <chr>                           <dbl>  <int>  <int>
# 1 anna  red   hey there i took a walk             6      4     10
# 2 joe   blue  we ate an apple                     4      2      5   

Previous version, grouping by name only and keeping the first party value:

test %>%
  group_by(name) %>%
  summarize(
    across(party, first),
    across(text, paste, collapse = " "),
    across(where(is.numeric), sum)
  )
# # A tibble: 2 x 6
#   name  party text                    numberofwords score1 score2
#   <chr> <chr> <chr>                           <dbl>  <int>  <int>
# 1 anna  red   hey there i took a walk             6      4     10
# 2 joe   blue  we ate an apple                     4      2      5   
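
To my knowledge, dplyr 1.1.0 and later deprecate passing extra arguments such as `collapse = " "` through the `...` of across(), so the call above may emit a deprecation warning on newer installations. A minimal sketch of the same aggregation with an explicit anonymous function is shown here. The `stringsAsFactors = FALSE` and `.groups = "drop"` arguments and the name `result` are additions for illustration and are not part of the original answer.

```r
library(dplyr)

# Sample data from the question; stringsAsFactors = FALSE is an assumption
# so that text stays character on R versions before 4.0 as well.
test <- data.frame(
  name = c("anna", "joe", "anna"),
  party = c("red", "blue", "red"),
  text = c("hey there", "we ate an apple", "i took a walk"),
  numberofwords = c(2, 4, 4),
  score1 = 1:3,
  score2 = 4:6,
  stringsAsFactors = FALSE
)

result <- test %>%
  group_by(name, party) %>%
  summarise(
    # Wrap paste() in an anonymous function instead of passing
    # collapse through across()'s dots.
    across(text, function(x) paste(x, collapse = " ")),
    # Sum every remaining numeric column within each group.
    across(where(is.numeric), sum),
    # Return an ungrouped tibble (illustrative choice, not in the original).
    .groups = "drop"
  )

print(result)
```

Grouping columns are excluded from across() automatically, so `where(is.numeric)` only picks up numberofwords, score1 and score2.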
Votes: 3

Stack Overflow user

Posted 2020-10-16 00:16:27

With dplyr, we can summarise conditionally on the class of each column:

library(dplyr)

test %>% 
  mutate_at("text", as.character) %>% 
  group_by(name) %>% 
  summarise_all(list(~if(is.numeric(.)) sum(., na.rm = TRUE)  
                      else if(is.factor(.)) first(.) 
                      else paste(., collapse = " ")))

#> # A tibble: 2 x 6
#>   name  party text                    numberofwords score1 score2
#>   <fct> <fct> <chr>                           <dbl>  <int>  <int>
#> 1 anna  red   hey there i took a walk             6      4     10
#> 2 joe   blue  we ate an apple                     4      2      5
Votes: 2

Stack Overflow user

Posted 2020-10-16 00:27:01

In base R, we can do this with aggregate and merge:

out1 <- aggregate(cbind(numberofwords, score1, score2) ~ name + party, test, sum)
out2 <- aggregate(text ~ name + party, test, paste, collapse=' ')
merge(out1, out2)

-output

# name party numberofwords score1 score2                    text
#1 anna   red             6      4     10 hey there i took a walk
#2  joe  blue             4      2      5         we ate an apple
Votes: 2
Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/64375399
