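I have data on worker wages. Some workers are paid monthly and others are paid weekly. I want to combine the data into a panel by worker and week (of each year). To do that, I need to expand the monthly rows.
The data look like this: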
library(tidyverse)
library(lubridate) # ymd() and week()
pay_data <- tibble(worker="Jim", start=ymd("2020-1-3"), end=ymd("2020-2-2"), rate=10, hours=50, wages=rate*hours) %>%
mutate(f_week=week(start), l_week=week(end))
# A tibble: 1 x 8
worker start end rate hours wages f_week l_week
<chr> <date> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Jim 2020-01-03 2020-02-02 10 50 500 1 5
Is there a way to use complete, fill, or any other dplyr function to get data that look like the following?
# A tibble: 5 x 5
worker week rate hours wage
<chr> <int> <dbl> <dbl> <dbl>
1 Jim 1 10 50 500
2 Jim 2 10 50 500
3 Jim 3 10 50 500
4 Jim 4 10 50 500
5 Jim 5 10 50 500(当然,我会把这些款项分成两部分,把它们都放在共同的单位中)。
谢谢!
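The follow-up step mentioned in the question, dividing each payment across the weeks it covers so that monthly and weekly records share a unit, might look like the sketch below. It assumes an even split across the covered weeks, since the question does not say how partial weeks should be treated. The column names `n_weeks`, `weekly_hours`, and `weekly_wages` are made up for illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(lubridate)

# Same example row as in the question
pay_data <- tibble(worker = "Jim", start = ymd("2020-1-3"), end = ymd("2020-2-2"),
                   rate = 10, hours = 50, wages = rate * hours) %>%
  mutate(f_week = week(start), l_week = week(end))

weekly <- pay_data %>%
  mutate(n_weeks = l_week - f_week + 1,          # weeks covered by this pay period
         week = map2(f_week, l_week, seq)) %>%   # list-column: one week vector per row
  unnest(week) %>%                               # one output row per covered week
  # Assumption: hours and wages are spread evenly over the covered weeks
  mutate(weekly_hours = hours / n_weeks,
         weekly_wages = wages / n_weeks) %>%
  select(worker, week, rate, weekly_hours, weekly_wages)

weekly
```

For Jim's row this yields five weekly rows with 10 hours and 100 in wages each, so the weekly wages still sum to the original 500.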
Posted on 2021-01-16 04:58:46
Another tidyverse way is:
library(tidyverse)
pay_data %>%
mutate(week = map2(f_week, l_week, seq)) %>%
unnest(week) %>%
select(worker, rate:wages, week)
# worker rate hours wages week
# <chr> <dbl> <dbl> <dbl> <int>
#1 Jim 10 50 500 1
#2 Jim 10 50 500 2
#3 Jim 10 50 500 3
#4 Jim 10 50 500 4
#5 Jim 10 50 500 5
Posted on 2021-01-15 23:46:02
A tidyverse approach using tidyr::separate_rows could look like this. To make the data more interesting, I added data for a second worker.
library(tidyverse)
tbl %>%
rowwise() %>%
mutate(weeks = paste(seq(f_week, l_week, by = 1), collapse = ", ")) %>%
ungroup() %>%
separate_rows(weeks) %>%
select(-ends_with("_week"), -start, -end)
#> # A tibble: 13 x 5
#> worker rate hours wages weeks
#> <chr> <int> <int> <int> <chr>
#> 1 Jim 10 50 500 1
#> 2 Jim 10 50 500 2
#> 3 Jim 10 50 500 3
#> 4 Jim 10 50 500 4
#> 5 Jim 10 50 500 5
#> 6 John 20 100 1000 1
#> 7 John 20 100 1000 2
#> 8 John 20 100 1000 3
#> 9 John 20 100 1000 4
#> 10 John 20 100 1000 5
#> 11 John 20 100 1000 6
#> 12 John 20 100 1000 7
#> 13 John 20 100 1000 8
Data
tbl <- read.table(text="worker start end rate hours wages f_week l_week
1 Jim 2020-01-03 2020-02-02 10 50 500 1 5\n
2 John 2020-01-03 2020-02-02 20 100 1000 1 8", header = TRUE)
tbl
#> worker start end rate hours wages f_week l_week
#> 1 Jim 2020-01-03 2020-02-02 10 50 500 1 5
#> 2 John 2020-01-03 2020-02-02 20 100 1000 1 8
Posted on 2021-01-15 23:37:59
Try this:
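#Code
# Repeat each row once per week it covers and index the weeks from f_week to l_week,
# so this also works with several rows or a first week other than 1
weeks <- Map(seq, pay_data$f_week, pay_data$l_week)
pay_data <- pay_data[rep(seq_len(nrow(pay_data)), lengths(weeks)),
c('worker','rate','hours','wages')]
pay_data$week <- unlist(weeks)
Output: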
# A tibble: 5 x 5
worker rate hours wages week
<chr> <dbl> <dbl> <dbl> <int>
1 Jim 10 50 500 1
2 Jim 10 50 500 2
3 Jim 10 50 500 3
4 Jim 10 50 500 4
5 Jim 10 50 500 5
https://stackoverflow.com/questions/65744881
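One caveat for the approaches above: `seq(f_week, l_week)` counts backwards when a pay period crosses into a new year (for example from week 51 to week 3), so such periods would get wrong week numbers. A possible workaround, sketched here rather than taken from any of the answers, is to expand each period over its calendar days and keep one row per year and week. The `Ann` row and the `date` and `year` columns are hypothetical additions for illustration.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Hypothetical example data: Ann's pay period crosses the year boundary
pay_data2 <- tibble(worker = c("Jim", "Ann"),
                    start  = ymd(c("2020-01-03", "2020-12-20")),
                    end    = ymd(c("2020-02-02", "2021-01-16")),
                    rate   = c(10, 12),
                    hours  = c(50, 40)) %>%
  mutate(wages = rate * hours)

panel <- pay_data2 %>%
  rowwise() %>%
  mutate(date = list(seq(start, end, by = "day"))) %>%  # every day in the pay period
  ungroup() %>%
  unnest(date) %>%
  mutate(year = year(date), week = week(date)) %>%
  # Keep one row per pay period and (year, week) combination
  distinct(worker, start, year, week, .keep_all = TRUE) %>%
  select(worker, year, week, rate, hours, wages)

panel
```

Expanding by day is more work than stepping by week, but it avoids skipping a short final week of the year, and the panel key becomes worker, year, and week rather than worker and week alone.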