File1 has three columns. Column 1 holds the group, column 2 the name of a metabolic pathway belonging to that group, and column 3 a value for that pathway:
group1 pathway1 0.664
group1 pathway6 1
group1 pathway2 0.056
group2 pathway2 0.321
group2 pathway3 0.771

File2 contains the list of all pathways:
pathway1
pathway2
pathway3
pathway4
pathway5
pathway6

Output: how can I get a matrix table like the following?
         group1  group2
pathway1 0.664
pathway2 0.056   0.321
pathway3         0.771
pathway4
pathway5
pathway6 1

Posted on 2017-03-10 15:53:29
Using awk:
awk 'BEGIN { print "\tgroup1\tgroup2" }
# File1: store each value under the key pathway@column
FNR == NR { if ($1 == "group1") a[$2 "@" 1] = $3
            if ($1 == "group2") a[$2 "@" 2] = $3
            next }
# File2: print one row per pathway, columns in a fixed order
{ printf "%s", $1
  for (j = 1; j <= 2; j++)
      printf "\t%s", ((($1 "@" j) in a) ? a[$1 "@" j] : "")
  print "" }' File1 File2

Output:
         group1  group2
pathway1 0.664
pathway2 0.056   0.321
pathway3         0.771
pathway4
pathway5
pathway6 1

Posted on 2017-03-10 16:53:59
File1 has 3 columns; File2 is the list of all pathways.
perl -wMstrict -Mvars='*h' -lane '
    # Step: gather data and populate the hash
    if ( @ARGV ) {
        # Still reading File1 (File2 is pending in @ARGV)
        next if @F < 3;
        my( $group, $pathway, $value ) = @F;
        $h{ $group }{ $pathway } = $value;
    } else {
        # Reading File2: keep the pathways in file order
        push @h, $_;
    }
    END{
        my @data;
        # Step: prepare the tabular data
        for my $pathway ( @h ) {
            my @line = ( $pathway );
            for my $group ( sort keys %h ) {
                # // rather than || so that a value of 0 is kept
                push @line, $h{ $group }{ $pathway } // q{};
            }
            push @data, [join ",", @line];
        }
        # Dynamically build the tbl template
        print for(
            q{.TS},
            join( ",", qw/ allbox center /, q/tab(,);/ ),
            ( "c " x (1+keys %h) ),
            ( qw/l/, "n " x (-1+keys %h), qw/n./ ),
            join( ",", q{}, sort keys %h ),
            join( "\n", map { @$_ } @data ),
            q{.TE},
        );
    }
' File1 File2 \
| tbl - | nroff -Tascii -ms | grep '.'

Output:
+---------+--------+--------+
| | group1 | group2 |
+---------+--------+--------+
|pathway1 | 0.664 | |
+---------+--------+--------+
|pathway2 | 0.056 | 0.321 |
+---------+--------+--------+
|pathway3 | | 0.771 |
+---------+--------+--------+
|pathway4 | | |
+---------+--------+--------+
|pathway5 | | |
+---------+--------+--------+
|pathway6 | 1 | |
+---------+--------+--------+

https://stackoverflow.com/questions/42722882
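
The awk answer above names group1 and group2 explicitly, so it would need editing for any other set of groups. As a rough sketch only (not taken from either answer), the same two-file approach could collect the group names from File1 in order of first appearance and emit one tab-separated column per group. The function name pivot is hypothetical, and the sketch assumes whitespace-separated input and a non-empty File1.

```shell
#!/bin/sh
# pivot FILE1 FILE2 (hypothetical helper name, used here for illustration)
# FILE1: lines of "group pathway value"; FILE2: one pathway per line.
# Usage: pivot File1 File2 > matrix.tsv
pivot() {
    awk '
    BEGIN { OFS = "\t" }
    # First file: remember each group in order of first appearance
    # and store the value under the key (pathway, group).
    FNR == NR {
        if (NF < 3) next
        if (!($1 in seen)) { seen[$1] = 1; groups[++n] = $1 }
        val[$2, $1] = $3
        next
    }
    # Second file, first line: print the header row (empty corner cell).
    FNR == 1 {
        line = ""
        for (i = 1; i <= n; i++) line = line OFS groups[i]
        print line
    }
    # Second file: one output row per pathway; missing cells stay empty.
    NF {
        line = $1
        for (i = 1; i <= n; i++)
            line = line OFS ((($1, groups[i]) in val) ? val[$1, groups[i]] : "")
        print line
    }
    ' "$1" "$2"
}
```

With the sample files from the question this should produce the same matrix as the awk answer, with tabs between cells. Pathways that appear in File1 but not in File2 are silently dropped, since the rows are driven by File2.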