
Q: Does using ungroup() on my dataframe break my function?

Stack Overflow user

Asked 2021-01-22 13:18:26

2 answers, 72 views, 0 followers, score 1

I have the following example dataframe:

countries = c("Australia", "Australia", "Chile", "Chile", "Brazil", "Brazil", "Brazil")
techs = c("AI", "AI", "AI", "Bio", "AI", "Bio", "computers")
value = c(404, 402, 2313, 424, 1424, 2141, 214)
year = c(2018, 2019,2018, 2018, 2018, 2018, 2018)

df = data.frame(countries, techs, value, year)

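I have a function that computes the total value of each technology for each country (essentially the sum over years for each technology and country):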

country_tech = function(data, tech, country){
  result =  data %>% 
    select(countries, techs, value) %>% 
    filter(countries == country) %>% 
    filter(techs == tech) %>% 
    summarise(Total = sum(value, na.rm = TRUE))
  
}

I created a new dataframe that groups by country and technology and drops the year, so that I can append new data to it:

df2 = select(df, countries, techs) %>%  group_by(countries, techs) %>% distinct()

Then I create a new column in that dataframe, using the function to total the value of each technology for each country:

df2 = df2 %>% mutate(value = country_tech(df, techs, countries ))

Everything works fine. However, because I did not ungroup when creating df2, I run into problems when spreading the data.

If I add an ungroup(), for example:

df2 = select(df, countries, techs) %>%  group_by(countries, techs) %>% distinct() %>% ungroup()

then my function no longer works and I get the following error:

Error: Problem with `mutate()` input `value`.
x Problem with `filter()` input `..1`.
x Input `..1` must be of size 4 or 1, not size 6.
i Input `..1` is `techs == tech`.
i Input `value` is `country_tech(df, techs, countries)`.

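Does anyone know where I went wrong?

2 Answers

Stack Overflow user

Accepted answer

Posted 2021-01-22 13:42:30

Further update: you have named the column twice, and that is what causes the problem.

Use it like this and it will work (do not give the column a name in mutate(), because your custom function already names it):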

df2 %>% group_by(techs, countries) %>% mutate(country_tech(df, techs, countries)) %>% ungroup() %>%
  spread(techs, Total)  # the unnamed result is unpacked into a column named Total
# A tibble: 3 x 4
  countries    AI   Bio computers
  <chr>     <dbl> <dbl>     <dbl>
1 Australia   806    NA        NA
2 Brazil     1424  2141       214
3 Chile      2313   424        NA
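
To see why dropping the name helps, here is a minimal sketch of how mutate() treats a data frame result, assuming dplyr 1.0 or later. The toy data and the names `toy`, `named` and `unnamed` are made up for illustration and are not part of the question. A named result is stored as one data-frame column, while an unnamed result is unpacked into its own columns.

```r
library(dplyr)

# Toy input, purely illustrative
toy <- tibble(id = 1:2)

# Named: the whole data frame becomes a single column called `value`,
# which is why the header prints as `value$Total`.
named <- toy %>% mutate(value = data.frame(Total = c(10, 20)))

# Unnamed: the data frame is unpacked, so `Total` is an ordinary column.
unnamed <- toy %>% mutate(data.frame(Total = c(10, 20)))

names(named)
names(unnamed)
```

In the question, country_tech() returns a one-column data frame from summarise(), so the same distinction applies to it.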

Update

Actually, the column name produced by the function approach is the problem. See whether doing it like this is enough.

#ungrouping as you desire
df2 = select(df, countries, techs) %>%  group_by(countries, techs) %>% distinct() %>% ungroup()

#mutating with custom function
df2 %>% group_by(techs, countries) %>% mutate(value = country_tech(df, techs, countries)) %>% ungroup()
# A tibble: 6 x 3
  countries techs     value$Total
  <chr>     <chr>           <dbl>
1 Australia AI                806
2 Chile     AI               2313
3 Chile     Bio               424
4 Brazil    AI               1424
5 Brazil    Bio              2141
6 Brazil    computers         214

Note the column names in the result above.
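
Another way to avoid the `value$Total` header, sketched here as an assumption rather than as the answerer's method, is to have the helper return a bare number with pull() and to call it once per row with rowwise(). The name `country_tech_num` is hypothetical; the data is copied from the question.

```r
library(dplyr)

df <- data.frame(
  countries = c("Australia", "Australia", "Chile", "Chile", "Brazil", "Brazil", "Brazil"),
  techs     = c("AI", "AI", "AI", "Bio", "AI", "Bio", "computers"),
  value     = c(404, 402, 2313, 424, 1424, 2141, 214),
  year      = c(2018, 2019, 2018, 2018, 2018, 2018, 2018)
)

# Hypothetical variant of country_tech() that returns a single number
# instead of a one-column data frame.
country_tech_num <- function(data, tech, country) {
  data %>%
    filter(countries == country, techs == tech) %>%
    summarise(Total = sum(value, na.rm = TRUE)) %>%
    pull(Total)
}

# One row per country/technology pair, with no grouping attached
df2 <- df %>% distinct(countries, techs)

# rowwise() makes `techs` and `countries` length-one inside mutate(),
# so the comparisons inside filter() never see mismatched lengths.
result <- df2 %>%
  rowwise() %>%
  mutate(value = country_tech_num(df, techs, countries)) %>%
  ungroup()

result
```

Because `value` is now a plain numeric column, spread() or pivot_wider() should produce column names without the `$Total` suffix.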

# using pivot_wider instead of spread
df2 %>% group_by(techs, countries) %>% mutate(value = country_tech(df, techs, countries)) %>% ungroup() %>%
  pivot_wider(names_from = techs, values_from = value)

# A tibble: 3 x 4
  countries AI$Total Bio$Total computers$Total
  <chr>        <dbl>     <dbl>           <dbl>
1 Australia      806        NA              NA
2 Chile         2313       424              NA
3 Brazil        1424      2141             214

Old answer: I wonder why you don't just use this to get your final output

df %>% group_by(countries, techs) %>% summarise(value_total = sum(value)) %>% ungroup()

# A tibble: 6 x 3
  countries techs     value_total
  <chr>     <chr>           <dbl>
1 Australia AI                806
2 Brazil    AI               1424
3 Brazil    Bio              2141
4 Brazil    computers         214
5 Chile     AI               2313
6 Chile     Bio               424

ungroup() is also redundant in this example.
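
If the wide country-by-technology table is the end goal, the grouped summary can feed pivot_wider() directly. This is a minimal sketch, assuming dplyr 1.0 or later for the `.groups` argument of summarise(); the name `wide` is illustrative and the data is copied from the question.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  countries = c("Australia", "Australia", "Chile", "Chile", "Brazil", "Brazil", "Brazil"),
  techs     = c("AI", "AI", "AI", "Bio", "AI", "Bio", "computers"),
  value     = c(404, 402, 2313, 424, 1424, 2141, 214),
  year      = c(2018, 2019, 2018, 2018, 2018, 2018, 2018)
)

wide <- df %>%
  group_by(countries, techs) %>%
  # .groups = "drop" returns an ungrouped result, so no separate ungroup() call
  summarise(value_total = sum(value, na.rm = TRUE), .groups = "drop") %>%
  # one column per technology; missing combinations become NA
  pivot_wider(names_from = techs, values_from = value_total)

wide
```

This skips the helper function and the intermediate df2 entirely, which may or may not suit the larger workflow the question has in mind.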

Edit: if you want to use your custom function, try the following

df2 = select(df, countries, techs) %>%  group_by(countries, techs) %>% slice_head()

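Score 3

Stack Overflow user

Posted 2021-01-22 14:45:04

Having a function that takes the original dataset and filters it for certain values on every call is inefficient. You should split the dataset by the variables you need and then apply a function to each piece. If you need to do "multiple things", I assume you want your function to return a data frame with several values (add them to the summarise call). You can do this on nested data.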

country_tech = function(data_subset){
  data_subset %>% 
    summarise(Total = sum(value, na.rm = TRUE))
}

df %>% 
  group_by(countries, techs) %>% 
  nest() %>% 
  mutate(data = map(data, country_tech)) %>% 
  unnest(data)

Output:

# A tibble: 6 x 3
# Groups:   countries, techs [9]
  countries techs     Total
  <fct>     <fct>     <dbl>
1 Australia AI          806
2 Chile     AI         2313
3 Chile     Bio         424
4 Brazil    AI         1424
5 Brazil    Bio        2141
6 Brazil    computers   214
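
The "multiple things" case mentioned above could look like the following sketch. The helper name `country_tech_multi` and the extra `Years` column are assumptions added for illustration; the data is copied from the question, and map() comes from purrr.

```r
library(dplyr)
library(tidyr)
library(purrr)

df <- data.frame(
  countries = c("Australia", "Australia", "Chile", "Chile", "Brazil", "Brazil", "Brazil"),
  techs     = c("AI", "AI", "AI", "Bio", "AI", "Bio", "computers"),
  value     = c(404, 402, 2313, 424, 1424, 2141, 214),
  year      = c(2018, 2019, 2018, 2018, 2018, 2018, 2018)
)

# Hypothetical helper: receives the rows of one country/technology pair
# and returns a one-row data frame with several summaries.
country_tech_multi <- function(data_subset) {
  data_subset %>%
    summarise(
      Total = sum(value, na.rm = TRUE),
      Years = n_distinct(year)
    )
}

res <- df %>%
  group_by(countries, techs) %>%
  nest() %>%                                     # one row per pair, rows kept in `data`
  mutate(data = map(data, country_tech_multi)) %>%
  unnest(data) %>%                               # expand the summaries into columns
  ungroup()

res
```

Each extra expression in summarise() becomes one more column after unnest(), so the helper can grow without changing the pipeline.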

Score 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/65845960
