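I have two scripts that compare two files.
The first script matches on column $3 of file1:
> awk -v OFS="\t" 'NR==FNR{a[$3]=$4;next}{$2=$2 "\t"(a[$2]?a[$2]:"-")}1' file1 file2
The second script matches on column $2 of file1:
> awk -v OFS="\t" 'NR==FNR{a[$2]=$4;next}{$2=$2 "\t"(a[$2]?a[$2]:"-")}1' file1 file2
As you can see, the only difference is
NR==FNR{a[$3]=$4;next}
versus
NR==FNR{a[$2]=$4;next}
I want to combine them into a single script, something like
NR==FNR{a[$2 || $3]=$4}
Can you help me? I can share the files and more information if you like.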
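A quick check suggests why the `a[$2 || $3]` attempt cannot work as intended: in awk, `||` is the logical OR operator, so the subscript is reduced to 1 or 0 before the array is indexed. The sketch below feeds two sample rows from file1 through that expression; the shell variables `key` and `count` are names made up for this illustration.

```shell
# `$2 || $3` is a logical OR: awk reduces it to 1 (true) or 0 (false)
# before it is used as the array subscript.
key=$(printf 'chr1 11796320 11796321 MTHFR\n' | awk '{ print ($2 || $3) }')
echo "subscript used: $key"

# Both rows land in a[1], so the array holds a single entry
# and the second gene overwrites the first.
count=$(printf 'chr1 11796320 11796321 MTHFR\nchr1 169549810 169549811 F5\n' |
  awk '{ a[$2 || $3] = $4 } END { n = 0; for (k in a) n++; print n ":" a[1] }')
echo "entries and last value: $count"
```

Storing the value under both `$2` and `$3` as separate keys avoids this collapse.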
Input: File1
chr1 11796320 11796321 MTHFR
chr1 169549810 169549811 F5
chr1 173917077 173917078 SERPINC1
chr2 48962781 48962782 FSHR
chr4 121696961 121696962 ANXA5
chr4 121697010 121697011 ANXA5
chr4 121697036 121697037 ANXA5
chr4 121697055 121697056 ANXA5
chr11 46739504 46739505 F2
chr13 20189510 20189511 GJB2
chr13 20189546 20189547 GJB2
File2
chr1 11796321 G 0 WILD ADP=1026
chr1 169549811 C 0 WILD ADP=940
chr1 173917078 C 0 WILD ADP=501
chr2 48962782 C T HET ADP=1665
chr4 121696962 C T HET ADP=212
chr4 121697011 A 0 WILD ADP=184
chr4 121697037 T 0 WILD ADP=111
chr4 121697037 tccc 0 INDEL AINDEL
chr4 121697056 C 0 WILD ADP=112
chr11 46739505 G 0 WILD ADP=202
chr13 20189511 C 0 WILD ADP=326
chr13 20189546 AC A INDEL ADP=164
chr13 20189547 C 0 WILD ADP=3
Output:
chr1 11796321 MTHFR G 0 WILD ADP=1026
chr1 169549811 F5 C 0 WILD ADP=940
chr1 173917078 SERPINC1 C 0 WILD ADP=501
chr2 48962782 FSHR C T HET ADP=1665
chr4 121696962 ANXA5 C T HET ADP=212
chr4 121697011 ANXA5 A 0 WILD ADP=184
chr4 121697037 ANXA5 T 0 WILD ADP=111
chr4 121697037 ANXA5 tccc 0 INDEL AINDEL
chr4 121697056 ANXA5 C 0 WILD ADP=112
chr11 46739505 F2 G 0 WILD ADP=202
chr13 20189511 GJB2 C 0 WILD ADP=326
chr13 20189546 GJB2 AC A INDEL ADP=164
chr13 20189547 GJB2 C 0 WILD ADP=3

Posted on 2016-09-30 14:51:31
awk to the rescue!
$ awk 'NR==FNR{f2[$2]=f3[$3]=$4;next}
{k=$2; suf=((k in f2)?f2[k]:((k in f3)?f3[k]:"-"));
$2=k "\t" suf}1' file{1,2}
chr1 11796321 MTHFR G 0 WILD ADP=1026
chr1 169549811 F5 C 0 WILD ADP=940
chr1 173917078 SERPINC1 C 0 WILD ADP=501
chr2 48962782 FSHR C T HET ADP=1665
chr4 121696962 ANXA5 C T HET ADP=212
chr4 121697011 ANXA5 A 0 WILD ADP=184
chr4 121697037 ANXA5 T 0 WILD ADP=111
chr4 121697037 ANXA5 tccc 0 INDEL AINDEL
chr4 121697056 ANXA5 C 0 WILD ADP=112
chr11 46739505 F2 G 0 WILD ADP=202
chr13 20189511 GJB2 C 0 WILD ADP=326
chr13 20189546 GJB2 AC A INDEL ADP=164
chr13 20189547 GJB2 C 0 WILD ADP=3

Posted on 2016-09-30 13:13:56
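A side note on the one-liner with the `f2` and `f3` arrays above: it tests membership with `k in f2` instead of the `a[$2]?a[$2]:"-"` form used in the question. One reason to prefer that, shown in the small sketch below, is that merely referencing a missing element creates it as an empty entry, whereas `in` does not. The array `a`, the key `"x"`, and the shell variables `with_ref` and `with_in` are made up for the illustration.

```shell
# Truthiness test: merely referencing a["x"] creates an empty element.
with_ref=$(awk 'BEGIN { v = (a["x"] ? a["x"] : "-"); n = 0; for (k in a) n++; print v, n }')

# Membership test: `in` looks the key up without creating it.
with_in=$(awk 'BEGIN { v = (("x" in a) ? a["x"] : "-"); n = 0; for (k in a) n++; print v, n }')

echo "$with_ref"   # fallback value, followed by the number of elements now in a
echo "$with_in"
```

Both forms yield the `-` fallback here, but only the first leaves a stray element behind, which adds up when file2 has many unmatched positions.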
Another awk (edited to include $4), perhaps:
awk 'FNR==NR{A[$3]=$1 FS $3 FS $4;next} ($2 in A){print A[$2],$3,$4,$5,$6}' file1 file2
chr1 11796321 MTHFR G 0 WILD ADP=1026
chr1 169549811 F5 C 0 WILD ADP=940
chr1 173917078 SERPINC1 C 0 WILD ADP=501
chr2 48962782 FSHR C T HET ADP=1665
chr4 121696962 ANXA5 C T HET ADP=212
chr4 121697011 ANXA5 A 0 WILD ADP=184
chr4 121697037 ANXA5 T 0 WILD ADP=111
chr4 121697037 ANXA5 tccc 0 INDEL AINDEL
chr4 121697056 ANXA5 C 0 WILD ADP=112
chr11 46739505 F2 G 0 WILD ADP=202
chr13 20189511 GJB2 C 0 WILD ADP=326
chr13 20189547 GJB2 C 0 WILD ADP=3

https://stackoverflow.com/questions/39791378
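The last command above keys on `$3` only, which appears to be why the `chr13 20189546` row is missing from its output. A minimal sketch of a variant that stores the gene under both coordinates in a single array follows. It is not taken from either answer: `f1` and `f2` are hypothetical stand-ins for file1 and file2, the array name `gene` is arbitrary, and the `99999999` row is invented here purely to exercise the `-` fallback.

```shell
# Work in a throwaway directory; f1 and f2 are hypothetical stand-ins for file1 and file2.
tmp=$(mktemp -d)
printf '%s\n' 'chr13 20189510 20189511 GJB2' \
              'chr13 20189546 20189547 GJB2' > "$tmp/f1"
# The last row is invented for this sketch and matches nothing in f1.
printf '%s\n' 'chr13 20189511 C 0 WILD ADP=326' \
              'chr13 20189546 AC A INDEL ADP=164' \
              'chr13 99999999 C 0 WILD ADP=1' > "$tmp/f2"

# First file: remember the gene under both coordinates ($2 and $3).
# Second file: insert the gene, or "-" when unknown, after column 2.
result=$(awk 'NR==FNR { gene[$2] = gene[$3] = $4; next }
              { $2 = $2 " " (($2 in gene) ? gene[$2] : "-") } 1' "$tmp/f1" "$tmp/f2")
printf '%s\n' "$result"
rm -rf "$tmp"
```

With a single array, a position that occurs as `$2` on one line of file1 and as `$3` on another keeps whichever gene was stored last; the two-array version (`f2`, `f3`) instead prefers the `$2` match, so the choice depends on which behavior is wanted.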