The raw data I import into R has 3 columns: (I) Name, (II) Statistic, (III) #Cells
The names given in the Name column are verbose, for example:
01Sep17 Trm diffn_Tube_001.fcs/Lymphocytes/Live/CD8a subset/integrin B7 subset
01Sep17 Trm diffn_Tube_003.fcs/Lymphocytes/Live/CD4 subset/CD103 subset
01Sep17 Trm diffn_Tube_004.fcs/Lymphocytes/Live/CD4 subset/CD73 subset
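(Table not shown, as there are several hundred rows.) To simplify this column, I would like to compare the names and remove the part that is shared between all samples (using a wildcard for Tube_*.*). For example, the 3 entries above should become: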
CD8a subset/integrin B7 subset
CD4 subset/CD103 subset
CD4 subset/CD73 subset
Any suggestions on how to do this? I do not want to store
01Sep17 Trm diffn_Tube_0*.*.fcs/Lymphocytes/Live
in a variable and use
as.data.frame(sapply(NameofDataFrame,gsub,pattern=VariableName,replacement=""))
because the exact names change between experiments.
Answered 2017-09-04 21:03:46
You can use gsub for this:
# Remove everything up to and including the last "/Live/"
gsub(".*/Live/", "", x)
[1] "CD8a subset/integrin B7 subset" "CD4 subset/CD103 subset"
[3] "CD4 subset/CD73 subset"
Sample data:
x <- c(
"01Sep17 Trm diffn_Tube_001.fcs/Lymphocytes/Live/CD8a subset/integrin B7 subset",
"01Sep17 Trm diffn_Tube_003.fcs/Lymphocytes/Live/CD4 subset/CD103 subset",
"01Sep17 Trm diffn_Tube_004.fcs/Lymphocytes/Live/CD4 subset/CD73 subset")
Answered 2017-09-04 21:36:17
Using stringr:
library(stringr)
x <- c(
"01Sep17 Trm diffn_Tube_001.fcs/Lymphocytes/Live/CD8a subset/integrin B7 subset",
"01Sep17 Trm diffn_Tube_003.fcs/Lymphocytes/Live/CD4 subset/CD103 subset",
"01Sep17 Trm diffn_Tube_004.fcs/Lymphocytes/Live/CD4 subset/CD73 subset")
str_match(x, '.*/Live/(.*)')[,2]
Source: https://stackoverflow.com/questions/46043824
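Both answers above hard-code "Live" as the last shared gate, while the question says the exact names change between experiments. As a rough sketch of one way to avoid naming the shared part at all, the names can be split on "/", the first component (the per-tube .fcs file name) dropped, and then every leading component that is identical across all rows removed. The helper name `strip_shared_prefix` and the data frame `df` are hypothetical, and the sketch assumes "/" is always the separator and the file name is always the first component.

```r
# Hypothetical helper (not from the original answers): drop the file name
# and the leading path components that every name has in common.
strip_shared_prefix <- function(x) {
  parts <- strsplit(x, "/", fixed = TRUE)
  # Assumption: the first component is the per-tube .fcs file name.
  parts <- lapply(parts, function(p) p[-1])
  # Count leading components identical in all names; stop one short of
  # the shortest name so that no result ends up empty.
  max_k <- min(lengths(parts)) - 1
  k <- 0
  while (k < max_k &&
         length(unique(vapply(parts, function(p) p[k + 1], character(1)))) == 1) {
    k <- k + 1
  }
  # Keep only the components after the shared prefix.
  vapply(parts,
         function(p) paste(p[seq_along(p) > k], collapse = "/"),
         character(1))
}

x <- c(
  "01Sep17 Trm diffn_Tube_001.fcs/Lymphocytes/Live/CD8a subset/integrin B7 subset",
  "01Sep17 Trm diffn_Tube_003.fcs/Lymphocytes/Live/CD4 subset/CD103 subset",
  "01Sep17 Trm diffn_Tube_004.fcs/Lymphocytes/Live/CD4 subset/CD73 subset")

result <- strip_shared_prefix(x)

# Applied to a single column of a (hypothetical) data frame:
df <- data.frame(Name = x, stringsAsFactors = FALSE)
df$Name <- strip_shared_prefix(df$Name)
```

For the three sample names, `result` contains the same strings as the gsub answer. One caveat: if every row in an experiment also shares a deeper gate (say, all rows are under "CD4 subset"), that component would be stripped as well, so this is only suitable when that behavior is acceptable.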