I have multiple files with the following structure:
(Genome1_Sample4A_protein_Genome1_Sample4A_132_2:0.0060449,(Genome1_Sample5A_protein_Genome1_Sample5A_30_12:1e-06,(Genome1_Sample1B_protein_Genome1_Sample1B_99_2:1e-06,Genome1_Sample6A_protein_Genome1_Sample6A_295_2:0.00366292)n2:0.00370314)n1:0.0060449)n0;
I want to remove everything from "_protein" up to (but not including) the next ":". The desired output is:
(Genome1_Sample4A:0.0060449,(Genome1_Sample5A:1e-06,(Genome1_Sample1B:1e-06,Genome1_Sample6A:0.00366292)n2:0.00370314)n1:0.0060449)n0;
I tried using sed and awk:
sed -i 's/_protein.*:/:/g' tree1.txt
sed -i 's/_protein.*_[[:digit:]]*:/:/g' tree1.txt
awk '{gsub(/\_protein*:/,":");}1' tree1.txt
But none of these commands gave me the desired output.
Posted on 2019-09-06 22:54:06
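".*" is greedy: it matches as much as possible, so the pattern runs from the first "_protein" all the way to the last ":" on the line. Use "[^:]*" instead, which stops at the next ":":
sed 's/_protein[^:]*:/:/g' tree1.txt
Output: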
(Genome1_Sample4A:0.0060449,(Genome1_Sample5A:1e-06,(Genome1_Sample1B:1e-06,Genome1_Sample6A:0.00366292)n2:0.00370314)n1:0.0060449)n0;
https://stackoverflow.com/questions/57824260
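To see the difference between the greedy and the negated-class pattern side by side, here is a minimal sketch. The shortened tree string and the variable names (tree, greedy, fixed) are made up for illustration and are not from the question.

```shell
# Shortened, hypothetical Newick string with the same label shape as the question.
tree='(A_protein_A_1_2:0.1,(B_protein_B_3_4:0.2,C_protein_C_5_6:0.3)n1:0.4)n0;'

# Greedy attempt: ".*" runs from the first "_protein" to the LAST ":" on the line,
# so most of the tree is swallowed by a single match.
greedy=$(printf '%s\n' "$tree" | sed 's/_protein.*:/:/g')

# Negated class: "[^:]*" cannot cross a ":", so each match ends at the nearest one.
fixed=$(printf '%s\n' "$tree" | sed 's/_protein[^:]*:/:/g')

printf 'greedy: %s\n' "$greedy"   # greedy: (A:0.4)n0;
printf 'fixed:  %s\n' "$fixed"    # fixed:  (A:0.1,(B:0.2,C:0.3)n1:0.4)n0;
```

Since the question mentions several files, the same expression can be applied in place with GNU sed by listing them all, for example sed -i 's/_protein[^:]*:/:/g' tree*.txt (the tree*.txt glob is an assumption about how the files are named). BSD/macOS sed needs an explicit backup suffix argument after -i, such as -i ''.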