I have data similar to the example below. I want to create a vector containing all the strings in IIIF, split at the commas.
data=data.frame(IIIT=c("a", "b", "c", "d", "e", "f", "g"), IIIF=c("aze,hyt,fre", NA, "ade", "ijh, deg","oij,erf", "eft,kij", "efg,kijj,lerod,kjhyg"))
data
  IIIT                 IIIF
1    a          aze,hyt,fre
2    b                 <NA>
3    c                  ade
4    d             ijh, deg
5    e              oij,erf
6    f              eft,kij
7    g efg,kijj,lerod,kjhyg
out
[1] "aze" "hyt" "fre" NA "ade" "ijh" "deg" "oij" "erf" "eft" "kij" "efg" "kijj" "lerod" "kjhyg"
How can I do this?
Posted on 2021-12-17 20:26:41
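Base R has strsplit(), which returns a list in which each element is a character vector holding the individual words from the corresponding element of the original vector. You can then combine the results with unlist().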
> unlist(strsplit(data$IIIF, split = ","))
[1] "aze" "hyt" "fre" NA "ade" "ijh" " deg" "oij" "erf"
[10] "eft" "kij" "efg" "kijj" "lerod" "kjhyg"
Posted on 2021-12-17 20:32:54
We can try scan() like this:
> scan(text = data$IIIF, sep = ",", what = "character")
Read 15 items
[1] "aze" "hyt" "fre" NA "ade" "ijh" " deg" "oij" "erf"
[10] "eft" "kij" "efg" "kijj" "lerod" "kjhyg"
Posted on 2021-12-17 20:42:05
A tidyverse solution:
library(tidyverse)
data=data.frame(IIIT=c("a", "b", "c", "d", "e", "f", "g"), IIIF=c("aze,hyt,fre", NA, "ade", "ijh, deg","oij,erf", "eft,kij", "efg,kijj,lerod,kjhyg"))
data %>%
  separate_rows(IIIF, sep=",") %>%
  select(IIIF) %>% unlist %>% set_names(NULL)
#> [1] "aze" "hyt" "fre" NA "ade" "ijh" " deg" "oij" "erf"
#> [10] "eft" "kij" "efg" "kijj" "lerod" "kjhyg"
Edit
The solution above can be simplified, following @Adam's comment, for which I am thankful:
library(tidyverse)
data %>%
  separate_rows(IIIF, sep=",") %>%
  pull(IIIF)
https://stackoverflow.com/questions/70398512
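One detail in the outputs above: the element " deg" keeps its leading space, because row 4 is "ijh, deg", whereas the desired output in the question shows "deg". The following is a minimal sketch of three base R ways to drop that whitespace. The names out1, out2 and out3 are only illustrative, and the sketch assumes IIIF is a character column (the default for data.frame() since R 4.0; stringsAsFactors = FALSE is set explicitly here to be safe).

```r
# Same example data as in the question
data <- data.frame(
  IIIT = c("a", "b", "c", "d", "e", "f", "g"),
  IIIF = c("aze,hyt,fre", NA, "ade", "ijh, deg", "oij,erf", "eft,kij",
           "efg,kijj,lerod,kjhyg"),
  stringsAsFactors = FALSE
)

# Option 1: split on commas, then trim leading/trailing whitespace.
# trimws() leaves NA values as NA.
out1 <- trimws(unlist(strsplit(data$IIIF, split = ",")))

# Option 2: let the regular expression swallow whitespace around each comma.
out2 <- unlist(strsplit(data$IIIF, split = "\\s*,\\s*"))

# Option 3: scan() can strip whitespace from character fields when sep is set.
out3 <- scan(text = data$IIIF, sep = ",", what = "character",
             strip.white = TRUE, quiet = TRUE)

out1
# [1] "aze"   "hyt"   "fre"   NA      "ade"   "ijh"   "deg"   "oij"   "erf"
# [10] "eft"   "kij"   "efg"   "kijj"  "lerod" "kjhyg"
```

All three options should give the same vector. For the tidyverse variant, separate_rows() treats sep as a regular expression, so passing sep = ",\\s*" is likely to have the same effect, though that is not tested here.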