I have a tab-delimited file that looks like this:
loci1 loci2 name1 name2
utr3p utr3p TERF1 ISCA2
utr3p intron LPP PAAF1
utr3p intron RPL37A RCC1
coding intron BAG2 RP11
intron intron KIF1B SNORA21
intron downstream GUSBP4 CTD
intron intron CLTC VMP1
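utr3p utr3p PCYT1A ZHX3
I want to concatenate the two columns name1 and name2 (joined by "__"), and the merged column should be written as a new column, "merged_names", in a new file. How can I do this with awk?
Expected output: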
loci1 loci2 name1 name2 merged_names
utr3p utr3p TERF1 ISCA2 TERF1__ISCA2
utr3p intron LPP PAAF1 LPP__PAAF1
utr3p intron RPL37A RCC1 RPL37A__RCC1
coding intron BAG2 RP11 BAG2__RP11
intron intron KIF1B SNORA21 KIF1B__SNORA21
intron downstream GUSBP4 CTD GUSBP4__CTD
intron intron CLTC VMP1 CLTC__VMP1
utr3p utr3p PCYT1A ZHX3 PCYT1A__ZHX3
Answered on 2016-08-08 19:19:05
You can use this awk command:
awk 'BEGIN{OFS=FS="\t"} NR==1{$(NF+1)="merged_names"} NR!=1{$(NF+1)=$(NF-1) "__" $NF}1' file
A shorter awk command:
awk 'BEGIN{OFS=FS="\t"} {$(NF+1)=(NR==1)? "merged_names" : $(NF-1)"__"$NF}1' file
Answered on 2016-08-08 19:13:33
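To make the `$(NF+1)` one-liners easier to follow, here is a minimal sketch that spells out the same assignment rule by rule and writes the result to a new file, as the question asks. The scratch directory and the file names `sample.tsv` and `merged.tsv` are placeholders chosen for illustration; they are not part of the original question.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
printf 'loci1\tloci2\tname1\tname2\n'  > "$workdir/sample.tsv"
printf 'utr3p\tutr3p\tTERF1\tISCA2\n' >> "$workdir/sample.tsv"
printf 'utr3p\tintron\tLPP\tPAAF1\n'  >> "$workdir/sample.tsv"

awk '
BEGIN   { FS = OFS = "\t" }              # split input and join output on tabs
NR == 1 { $(NF+1) = "merged_names" }     # header row: append the new column name
NR > 1  { $(NF+1) = $(NF-1) "__" $NF }   # data rows: join the last two fields
        { print }                        # print every (modified) record
' "$workdir/sample.tsv" > "$workdir/merged.tsv"

cat "$workdir/merged.tsv"
```

Assigning to `$(NF+1)` makes awk rebuild the record using `OFS`, which is why `OFS` is set to a tab along with `FS`.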
awk 'BEGIN{OFS="\t"; print "loci1","loci2","name1","name2","MERGED__NAMES"} NR>1{print $1,$2,$3,$4,$3 "__" $4}' infile
loci1 loci2 name1 name2 MERGED__NAMES
utr3p utr3p TERF1 ISCA2 TERF1__ISCA2
utr3p intron LPP PAAF1 LPP__PAAF1
utr3p intron RPL37A RCC1 RPL37A__RCC1
coding intron BAG2 RP11 BAG2__RP11
intron intron KIF1B SNORA21 KIF1B__SNORA21
intron downstream GUSBP4 CTD GUSBP4__CTD
intron intron CLTC VMP1 CLTC__VMP1
utr3p utr3p PCYT1A ZHX3 PCYT1A__ZHX3
Source: https://stackoverflow.com/questions/38827686
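The commands in this thread address name1 and name2 by position. If the column order is not guaranteed, one possible variant is to look the columns up by their header names instead. The following is only a sketch under that assumption: the array name `col`, the reordered sample data, and the file names are hypothetical and do not come from the original thread.

```shell
# Hypothetical sample with the columns in a different order than in the question.
tmp=$(mktemp -d)
printf 'name2\tloci1\tname1\tloci2\n'  > "$tmp/in.tsv"
printf 'ISCA2\tutr3p\tTERF1\tutr3p\n' >> "$tmp/in.tsv"
printf 'PAAF1\tutr3p\tLPP\tintron\n'  >> "$tmp/in.tsv"

awk '
BEGIN { FS = OFS = "\t" }
NR == 1 {
    # Map each header name to its column index.
    for (i = 1; i <= NF; i++) col[$i] = i
    print $0, "merged_names"
    next
}
{
    # Append name1 and name2 joined by "__", wherever those columns are.
    print $0, $(col["name1"]) "__" $(col["name2"])
}
' "$tmp/in.tsv" > "$tmp/out.tsv"

cat "$tmp/out.tsv"
```

This version assumes the header row contains both names exactly as spelled; it does not check for missing columns.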