我有60个不同长度和相同列名的文本文件。
例如:
cat Sample_145_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
19258 circRNA
612 ciRNA
cat Sample_146_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
17791 circRNA
729 ciRNA
cat Sample_147_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
22838 circRNA
686 ciRNA
cat Sample_148_Chimeric.out.junction.new.back_spliced_junction.bed.Circexplorer2.txt | gawk '{print $14}' | sort | uniq -c
19404 circRNA
475 ciRNA我希望为所有标识的circRNAs生成一个“主”表,其中readnumber作为每个示例的列,flankintron作为行名:

发布于 2019-03-26 20:43:40
如果所有文件中的所有列都以相同的顺序排列,那么只需将它们与>>连接起来:
for x in {1..60}; do
# These flags for tail just cut of the top line, which is your headers
tail -n 2 Sample_$x_blah.txt >> Sample_master.txt
# and the double carat makes the output append^
done 如果没有,那么您可以像上面那样用awk编写翻译,即
$ cat Sample_1.txt
col1,col2,col3,col4 #etc
$ cat Sample_2.txt
col4,col3,col2,col1
$ cat Sample_1.txt > Sample_Master.txt # no translation needed
$ awk '{print $4","$3","$2","$1 }' Sample_2.txt >> Sample_Master.txt 但是对于60个文件,这将比使用python的csv编写python脚本的工作量更大.
https://askubuntu.com/questions/1128946
复制相似问题