
Aggregate (sum) across multiple columns by FIPS (county ID) over two time-period ranges

Stack Overflow user
Asked 2022-11-09 16:16:26
Answers: 1, Views: 14, Followers: 0, Votes: 1

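Given the data below, how can I sum Inflow, Outflow, NetMigration, InAGI, and OutAGI by FIPS across the two time periods 2011-2015 and 2016-2020? Some counties may have no data for a given fiscal year, but that does not matter, because the idea is to aggregate the data within each of the two periods. Of course, some NAs will be needed in the final data set. I am using FIPS because several counties share the same name. The Key column is therefore no longer needed, since it is just the concatenation of FIPS and Year.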

Desired output schema/columns:

FIPS   County          State  TotInflow  TotOutflow  TotNetMigration  TotInAGI  TotOutAGI  Time_Period
12001  Alachua County  FL                                                                  2011-2015
12001  Alachua County  FL                                                                  2016-2020
08001  Adams County    CO                                                                  2011-2015
08001  Adams County    CO                                                                  2016-2020

Sample data:

     df = structure(list(Key = c("080012020", "120012020", "120012018", 
"120012017", "080012017", "120012016", "120012015", "080012014", 
"120012013", "120012012", "080012012", "080012011", "080012016"
), County = c("Adams County", "Alachua County", "Alachua County", 
"Alachua County", "Adams County", "Alachua County", "Alachua County", 
"Adams County", "Alachua County", "Alachua County", "Adams County", 
"Adams County", "Adams County"), State = c("CO", "FL", "FL", 
"FL", "CO", "FL", "FL", "CO", "FL", "FL", "CO", "CO", "CO"), 
    FIPS = c("08001", "12001", "12001", "12001", "08001", "12001", 
    "12001", "08001", "12001", "12001", "08001", "08001", "08001"
    ), Inflow = c(38L, 261L, 321L, 339L, 58L, 288L, 254L, 46L, 
    413L, 433L, 30L, 42L, NA), InAGI = c(1817L, 6287L, 8423L, 
    8364L, 1865L, 14720L, 5224L, 1074L, 11774L, 10151L, 921L, 
    500L, NA), FiscalYear = c("2019- 2020", "2019- 2020", "2017 - 2018", 
    "2016 - 2017", "2016 - 2017", "2015 - 2016", "2014 - 2015", 
    "2013 - 2014", "2012 - 2013", "2011 - 2012", "2011 - 2012", 
    "2010 - 2011", "2015 - 2016"), Year = c(2020L, 2020L, 2018L, 
    2017L, 2017L, 2016L, 2015L, 2014L, 2013L, 2012L, 2012L, 2011L, 
    2016L), Outflow = c(54L, 447L, 444L, 558L, 44L, 436L, 334L, 
    49L, 466L, 495L, 39L, 31L, 51L), OutAGI = c(1879L, 13106L, 
    15409L, 16496L, 2408L, 12675L, 7448L, 733L, 10309L, 11677L, 
    847L, 605L, 1114L), NetMigration = c(-16L, -186L, -123L, 
    -219L, 14L, -148L, -80L, -3L, -53L, -62L, -9L, 11L, NA)), row.names = c(NA, 
-13L), class = "data.frame")
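
The remark that Key is redundant can be checked directly. The following is a minimal sketch, assuming a three-row subset copied by hand from the sample data; the names df_small and key_is_redundant are illustrative and not part of the question.

```r
# Three rows copied from the sample data; df_small is a hypothetical name.
# FIPS stays a character column so the leading zero in "08001" is preserved.
df_small <- data.frame(
  Key  = c("080012020", "120012020", "120012018"),
  FIPS = c("08001", "12001", "12001"),
  Year = c(2020L, 2020L, 2018L),
  stringsAsFactors = FALSE
)

# Key should equal FIPS and Year pasted together with no separator.
key_is_redundant <- all(df_small$Key == paste0(df_small$FIPS, df_small$Year))

# If that holds, the column can be dropped without losing information.
if (key_is_redundant) {
  df_small$Key <- NULL
}
```

The same comparison could be run on the full df before dropping Key there.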

1 Answer

Stack Overflow user

Accepted answer

Posted on 2022-11-09 16:33:40

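Group by FIPS, County, State, and a Time_Period column derived from Year (labelled according to which start/end year range Year falls in), then use across() over the column names of interest to get the sum of each column.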

library(dplyr)

df %>%
  # Group by the county identifiers plus a Time_Period label derived from Year
  group_by(FIPS, County, State,
           Time_Period = case_when(
             between(Year, 2011, 2015) ~ '2011-2015',
             between(Year, 2016, 2020) ~ '2016-2020')) %>%
  # Sum each measure within a group, ignoring missing values
  summarise(across(c(Inflow, InAGI, Outflow, OutAGI, NetMigration),
                   ~ sum(.x, na.rm = TRUE), .names = "Total{.col}"),
            .groups = "drop")

-output

# A tibble: 4 × 9
  FIPS  County         State Time_Period TotalInflow TotalInAGI TotalOutflow TotalOutAGI TotalNetMigration
  <chr> <chr>          <chr> <chr>             <int>      <int>        <int>       <int>             <int>
1 08001 Adams County   CO    2011-2015           118       2495          119        2185                -1
2 08001 Adams County   CO    2016-2020            96       3682          149        5401                -2
3 12001 Alachua County FL    2011-2015          1100      27149         1295       29434              -195
4 12001 Alachua County FL    2016-2020          1209      37794         1885       57686              -676
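
One caveat about the approach: sum(x, na.rm = TRUE) returns 0 when every value in a group is missing, whereas the question mentions wanting NAs in the final data set. The sketch below shows one possible variant, assuming that an all-missing group should be reported as NA. The helper sum_or_na and the toy data are hypothetical and not part of the original answer.

```r
library(dplyr)

# Hypothetical helper: like sum(x, na.rm = TRUE), but returns NA when
# every value in the group is missing instead of returning 0.
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_integer_ else sum(x, na.rm = TRUE)
}

# Toy data (made up for illustration): county 12001 has only a missing
# Inflow value in the 2016-2020 period.
toy <- tibble(
  FIPS   = c("08001", "08001", "08001", "12001"),
  Year   = c(2012L, 2016L, 2017L, 2016L),
  Inflow = c(30L, NA, 58L, NA)
)

result <- toy %>%
  group_by(FIPS,
           Time_Period = case_when(
             between(Year, 2011, 2015) ~ "2011-2015",
             between(Year, 2016, 2020) ~ "2016-2020")) %>%
  summarise(across(Inflow, sum_or_na, .names = "Total{.col}"),
            .groups = "drop")

print(result)
```

With this helper, a group that has at least one observed value is still summed with the missing values ignored, while a group with no observed values yields NA. Whether that is the desired behaviour depends on how the totals are used downstream.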
Votes: 1
Original page content provided by Stack Overflow; translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/74378108
