I have a dataset with three columns. The third column contains date values mixed with some strings.
ID Col1 Value
123 Start.Date 2011-06-18
123 Stem A1
123 Stem_1 A6
123 Stem_2 NA
321 Start.Date 2014-08-05
321 Stem C1
321 Stem_1 C4
321 Stem_2 NA
677 Start.Date NA
677 Stem NA
677 Stem_1 NA
677 Stem_2 NA
How can I separate out the dates and store them in a different column, like this?
ID Col1 Value Start.Date
123 Stem A1 2011-06-18
123 Stem_1 A6 2011-06-18
123 Stem_2 NA 2011-06-18
321 Stem C1 2014-08-05
321 Stem_1 C4 2014-08-05
321 Stem_2 NA 2014-08-05
677 Stem NA NA
677 Stem_1 NA NA
677 Stem_2 NA NA
Thanks.
Posted on 2021-01-27 02:08:48
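Create a new column that takes its value from the Value column where Col1 == 'Start.Date' and is NA everywhere else. Then, for each ID, fill the NA values from the preceding date and drop the rows where Col1 is 'Start.Date'.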
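As a minimal illustration of the first step only, the sketch below applies the same replace() call to two toy vectors. The names col1, value, and start_date are made up here to stand in for one ID's rows and are not part of the original data. Every entry that is not on a 'Start.Date' row is blanked to NA before the conversion to Date.

```r
# Toy vectors standing in for the Col1 and Value columns of one ID (hypothetical names)
col1  <- c("Start.Date", "Stem", "Stem_1")
value <- c("2011-06-18", "A1", "A6")

# Blank out every value that is not on a 'Start.Date' row, then convert to Date
start_date <- as.Date(replace(value, col1 != "Start.Date", NA))
print(start_date)
```

Only the first element survives as a date. In the full pipeline, the group-wise fill() then carries it down to the remaining rows of the same ID.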
library(dplyr)
library(tidyr)
df %>%
  mutate(Start.Date = as.Date(replace(Value, Col1 != 'Start.Date', NA))) %>%
  group_by(ID) %>%
  fill(Start.Date) %>%
  ungroup() %>%
  filter(Col1 != 'Start.Date')
# ID Col1 Value Start.Date
# <int> <chr> <chr> <date>
#1 123 Stem A1 2011-06-18
#2 123 Stem_1 A6 2011-06-18
#3 123 Stem_2 NA 2011-06-18
#4 321 Stem C1 2014-08-05
#5 321 Stem_1 C4 2014-08-05
#6 321 Stem_2 NA 2014-08-05
#7 677 Stem NA NA
#8 677 Stem_1 NA NA
#9 677 Stem_2 NA NA
Data
df <- structure(list(ID = c(123L, 123L, 123L, 123L, 321L, 321L, 321L,
321L, 677L, 677L, 677L, 677L), Col1 = c("Start.Date", "Stem",
"Stem_1", "Stem_2", "Start.Date", "Stem", "Stem_1", "Stem_2",
"Start.Date", "Stem", "Stem_1", "Stem_2"), Value = c("2011-06-18",
"A1", "A6", NA, "2014-08-05", "C1", "C4", NA, NA, NA, NA, NA)),
class = "data.frame", row.names = c(NA, -12L))
Posted on 2021-01-27 02:13:41
An alternative solution based entirely on tidyr
library(tidyr)
df %>%
  pivot_wider(id_cols = ID, names_from = Col1, values_from = Value) %>%
  pivot_longer(c("Stem", "Stem_1", "Stem_2"), names_to = "Col1", values_to = "Value")
https://stackoverflow.com/questions/65911890