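I am trying to fill one variable with values based on whether a different variable contains a certain character. I have tried the following code: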
site<- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4',
'FD 5', 'FD 6')
year<- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
diveLocation<- NA
df = data.frame(site, year, diveLocation)
df$diveLocation<-as.character(df$diveLocation)
df$diveLocation<- gsub("^C\\w+", "compliance", df$site)
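head(df)
This gives:
    site year   diveLocation
1    5.1 2011            5.1
2 CD 1.1 2013 compliance 1.1
3   FD 1 2010           FD 1
4   FD 2 2010           FD 2
5   FD 3 2010           FD 3
6   FD 4 2010           FD 4
The only positive is that "compliance" has been filled into "diveLocation". However, I only want the word "compliance" (i.e. without the "1.1" from the "site" value), and I don't want any of the other "site" values (e.g. 5.1) to appear in "diveLocation"; those rows should just be filled with NAs. Any suggestions would be much appreciated!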
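The output in the question follows from how gsub() behaves: it substitutes only the matched part of each string and returns non-matching strings unchanged. A minimal sketch on a shortened `site` vector (the names `replaced` and `labelled` are only illustrative) contrasts this with a grepl()-based test, which is one possible way to get NA for non-matching rows:

```r
site <- c("5.1", "CD 1.1", "FD 1")

# gsub() replaces only the matched text ("CD", since \\w+ stops at the space)
# and returns strings that do not match unchanged.
replaced <- gsub("^C\\w+", "compliance", site)

# grepl() returns TRUE/FALSE per element, so it can drive ifelse() directly;
# non-matching elements become NA instead of keeping the site value.
labelled <- ifelse(grepl("^C", site), "compliance", NA_character_)

print(replaced)
print(labelled)
```

Here `replaced` still carries "5.1", "compliance 1.1" and "FD 1", while `labelled` holds only "compliance" or NA.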
Posted on 2018-12-26 12:48:42
Use str_detect() (from the stringr package) together with case_when() (from dplyr); both are loaded by the tidyverse:
library(tidyverse)
site<- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4',
'FD 5', 'FD 6')
year<- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
diveLocation<- NA
df = data.frame(site, year, diveLocation) %>% as_tibble()
new_df <- df %>%
  mutate(diveLocation = case_when(
    str_detect(site, pattern = "C") ~ "compliance",
    str_detect(site, pattern = "F") ~ "farm",
    TRUE ~ NA_character_
  ))
new_df
Posted on 2018-12-26 12:39:49
This code should do the job for you.
site<- c('5.1', 'CD 1.1', 'FD 1', 'FD 2', 'FD 3', 'FD 4',
'FD 5', 'FD 6')
year<- c(2011, 2013, 2010, 2010, 2010, 2010, 2010, 2010)
diveLocation<- NA
df = data.frame(site, year, diveLocation)
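# Use a real missing value (NA_character_) rather than the string "NA"
df$diveLocation <- ifelse(substr(df$site, 1, 1) == "C", "compliance",
                   ifelse(substr(df$site, 1, 1) == "F", "Farm", NA_character_))
Posted on 2018-12-26 12:42:35
We can use grep to create a numeric index of the matching "site" values, then assign the value to the corresponding elements of "diveLocation":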
i1 <- grep("^CD", df$site)
df$diveLocation[i1] <- 'compliance'
df
#    site year diveLocation
#1    5.1 2011         <NA>
#2 CD 1.1 2013   compliance
#3   FD 1 2010         <NA>
#4   FD 2 2010         <NA>
#5   FD 3 2010         <NA>
#6   FD 4 2010         <NA>
#7   FD 5 2010         <NA>
#8   FD 6 2010         <NA>
i2 <- grep("^FD", df$site)
df$diveLocation[i2] <- 'Farm'
Or use data.table:
library(data.table)
setDT(df)[grep("^CD", site), diveLocation := 'compliance'][]
https://stackoverflow.com/questions/53927325