I have data that looks like this:
ID Math Chem HoursAvailable
1 Math NA 3:00-4:00
2 NA Chem 4:00-5:00
3 Math Chem 12:00-2:00
I am trying to merge the available hours into the subject columns of each row, so that the result looks like this:
ID Math Chem HoursAvailable
1 3:00-4:00 NA 3:00-4:00
2 NA 4:00-5:00 4:00-5:00
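3 12:00-2:00 12:00-2:00 12:00-2:00
I can't get the data to merge without overwriting all of the NA values. I also tried splitting HoursAvailable off into a separate data frame and then merging. I tried the tidyverse as well, but couldn't get it to work.
Answered 2017-10-03 22:55:07
Here is one way to do it using gather and spread from the tidyverse. Note that this is probably only useful if you need to merge HoursAvailable into many variables. Otherwise, something like @KevinArseneau's base R suggestion in the comments would be much simpler.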
library(tidyverse)
df <- read_table("ID Math Chem HoursAvailable
1 Math NA 3:00-4:00
2 NA Chem 4:00-5:00
3 Math Chem 12:00-2:00")
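df %>%
  # reshape to long form: one row per ID/subject pair
  gather(key, value, -c(ID, HoursAvailable)) %>%
  # keep NA where the subject is absent, otherwise use the available hours
  mutate(value = if_else(is.na(value), value, HoursAvailable)) %>%
  # reshape back to wide form: one column per subject
  spread(key, value) %>%
  select(ID, Math, Chem, HoursAvailable)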
#> # A tibble: 3 x 4
#> ID Math Chem HoursAvailable
#> * <int> <chr> <chr> <chr>
#> 1 1 3:00-4:00 <NA> 3:00-4:00
#> 2 2 <NA> 4:00-5:00 4:00-5:00
#> 3 3 12:00-2:00 12:00-2:00 12:00-2:00
Answered 2017-10-03 22:59:48
Base R:
cols <- c('Math', 'Chem')
mask <- !is.na(df[, cols])  # TRUE wherever a subject is present
df[, cols][mask] <- df[, c('HoursAvailable', 'HoursAvailable')][mask]
df
ID Math Chem HoursAvailable
1 1 3:00-4:00 <NA> 3:00-4:00
2 2 <NA> 4:00-5:00 4:00-5:00
3 3 12:00-2:00 12:00-2:00 12:00-2:00
Answered 2017-10-03 23:01:09
You can use dplyr::mutate and ifelse to get that data structure.
library(dplyr)
# example data
df1 <- structure(list(ID = 1:3, Math = c("Math", NA, "Math"),
Chem = c(NA, "Chem", "Chem"),
HoursAvailable = c("3:00-4:00", "4:00-5:00", "12:00-2:00")),
.Names = c("ID", "Math", "Chem", "HoursAvailable"),
class = "data.frame", row.names = c(NA, -3L))
df1 %>%
mutate(Math = ifelse(is.na(Math), NA, HoursAvailable),
Chem = ifelse(is.na(Chem), NA, HoursAvailable))
ID Math Chem HoursAvailable
1 1 3:00-4:00 <NA> 3:00-4:00
2 2 <NA> 4:00-5:00 4:00-5:00
3 3 12:00-2:00 12:00-2:00 12:00-2:00
However, I would go one step further and create a tidy data frame, with the subject in one column and the hours in another.
library(tidyr)
df1 %>%
mutate(Math = ifelse(is.na(Math), NA, HoursAvailable),
Chem = ifelse(is.na(Chem), NA, HoursAvailable)) %>%
select(-HoursAvailable) %>%
gather(subject, hours, -ID)
ID subject hours
1 1 Math 3:00-4:00
2 2 Math <NA>
3 3 Math 12:00-2:00
4 1 Chem <NA>
5 2 Chem 4:00-5:00
6 3 Chem 12:00-2:00
You can also add %>% na.omit() to the end to remove the rows that contain NA.
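For completeness, here is a minimal self-contained sketch of that last step: the same pipeline with na.omit() appended. The name tidy_hours is only an illustrative choice, and the example data is rebuilt with data.frame() so the block runs on its own. The commented-out lines show what I believe is an equivalent with the newer tidyr::pivot_longer (assuming tidyr 1.0.0 or later), which can drop the NA rows itself, though it returns the rows in a different order.

```r
library(dplyr)
library(tidyr)

# Example data, rebuilt here so the block is runnable on its own
df1 <- data.frame(
  ID = 1:3,
  Math = c("Math", NA, "Math"),
  Chem = c(NA, "Chem", "Chem"),
  HoursAvailable = c("3:00-4:00", "4:00-5:00", "12:00-2:00"),
  stringsAsFactors = FALSE
)

# tidy_hours is a hypothetical name used only for this sketch
tidy_hours <- df1 %>%
  # replace each subject marker with the hours, keeping NA where absent
  mutate(Math = ifelse(is.na(Math), NA, HoursAvailable),
         Chem = ifelse(is.na(Chem), NA, HoursAvailable)) %>%
  select(-HoursAvailable) %>%
  # long form: one row per ID/subject pair
  gather(subject, hours, -ID) %>%
  # drop the ID/subject pairs that have no hours
  na.omit()

tidy_hours

# Possible alternative (assumes tidyr >= 1.0.0), replacing gather + na.omit:
#   pivot_longer(c(Math, Chem), names_to = "subject",
#                values_to = "hours", values_drop_na = TRUE)
```

With the example data this leaves four rows, one for each subject that an ID is actually available for.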
https://stackoverflow.com/questions/46554450