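I think this is a very simple question, but I am trying to edit my data before importing it into R, and I would like to do it in the terminal so that it fits into my pipeline.
For each row in the dataset, if $4 > $5, I want to swap the two values and set $7 = "-".
I was thinking of a for loop. In R it would look something like this: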
for (i in seq_len(nrow(df))) {
  # Swap start and end when they are out of order, and mark the strand as "-"
  if (df[i, 4] > df[i, 5]) {
    tmp <- df[i, 4]
    df[i, 4] <- df[i, 5]
    df[i, 5] <- tmp
    df[i, 7] <- "-"
  }
}
So:
chr1 Cufflinks exon 1 100 . + .
chr1 Cufflinks exon 300 200 . + .
would become:
chr1 Cufflinks exon 1 100 . + .
chr1 Cufflinks exon 200 300 . - .
How can I do this in bash?
A sample of my data:
chr1 Cufflinks exon 11869 12227 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "1"; gene_name "DDX11L1"; oId "ENST00000456328.2"; nearest_ref "ENST00000456328.2"; class_code "="; tss_id "TSS1";
chr1 Cufflinks exon 12613 12721 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "2"; gene_name "DDX11L1"; oId "ENST00000456328.2"; nearest_ref "ENST00000456328.2"; class_code "="; tss_id "TSS1";
chr1 Cufflinks exon 13221 14409 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000001"; exon_number "3"; gene_name "DDX11L1"; oId "ENST00000456328.2"; nearest_ref "ENST00000456328.2"; class_code "="; tss_id "TSS1";
chr1 Cufflinks exon 11869 12057 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000005"; exon_number "1"; gene_name "DDX11L1"; oId "CUFF.12.5"; nearest_ref "ENST00000450305.2"; class_code "j"; tss_id "TSS1";
chr1 Cufflinks exon 12179 12227 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000005"; exon_number "2"; gene_name "DDX11L1"; oId "CUFF.12.5"; nearest_ref "ENST00000450305.2"; class_code "j"; tss_id "TSS1";
chr1 Cufflinks exon 12613 12721 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000005"; exon_number "3"; gene_name "DDX11L1"; oId "CUFF.12.5"; nearest_ref "ENST00000450305.2"; class_code "j"; tss_id "TSS1";
chr1 Cufflinks exon 13225 13655 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000005"; exon_number "4"; gene_name "DDX11L1"; oId "CUFF.12.5"; nearest_ref "ENST00000450305.2"; class_code "j"; tss_id "TSS1";
chr1 Cufflinks exon 13661 14412 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000005"; exon_number "5"; gene_name "DDX11L1"; oId "CUFF.12.5"; nearest_ref "ENST00000450305.2"; class_code "j"; tss_id "TSS1";
chr1 Cufflinks exon 11869 12057 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000004"; exon_number "1"; gene_name "DDX11L1"; oId "CUFF.12.4"; nearest_ref "ENST00000450305.2"; class_code "j"; tss_id "TSS1";
chr1 Cufflinks exon 12179 12227 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000004"; exon_number "2"; gene_name "DDX11L1"; oId "CUFF.12.4"; nearest_ref "ENST00000450305.2"; class_code "j"; tss_id "TSS1";
Posted on 2015-10-21 21:53:19
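Try:
awk '{if ($4 > $5) {t=$4; $4=$5; $5=t; $7="-"; print} else {print}}' data
However, assigning to a field makes awk rebuild the line with single spaces between fields, so the original whitespace between columns (tabs, for example) is not preserved. I am not sure whether that is a problem for you.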
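If the whitespace does matter, one possible variant is to tell awk to use a tab as both the input and output field separator. This is only a sketch and assumes the file is tab-delimited, as GTF files normally are; the sample in the question is shown with spaces, so check your own file first. The function name `fix_coords` and the output file name are hypothetical, chosen for illustration.

```shell
# Hypothetical helper: for each line where column 4 > column 5, swap the two
# columns and set column 7 to "-". Assumes tab-delimited input.
fix_coords() {
    awk 'BEGIN { FS = OFS = "\t" }   # split and rejoin fields on tabs
         # "+ 0" forces a numeric comparison of the two coordinates
         $4 + 0 > $5 + 0 { tmp = $4; $4 = $5; $5 = tmp; $7 = "-" }
         { print }' "$@"
}

# Example usage (reads the file "data", writes the corrected copy):
# fix_coords data > data.fixed
```

With the separator set to a tab, the attribute column (column 9), which contains spaces, is treated as a single field and passes through unchanged. With no file argument the function reads standard input, so it can sit in the middle of a pipeline.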
Posted on 2015-10-21 21:46:07
Use awk, like this:
awk '$4 > $5 {tmp=$4; $4=$5; $5=tmp; $7="-"} {print}' dataset.file
https://stackoverflow.com/questions/33269755