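In my output file, two columns that hold two floating-point numbers are stuck together and form a single column. An example is shown here. Is there a way to separate these two columns?
There should be 5 space-separated columns here, but the space between columns 3 and 4 is missing. Is there a way to correct this error with UNIX commands such as cut, awk, sed, or even a regular expression?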
3.77388 0.608871 -8216.342.42161 1.88655
4.39243 0.625 -8238.241.49211 0.889258
4.38903 0.608871 -7871.71.52994 0.883976
4.286 0.653226 -8287.322.3195 2.13736
4.29313 0.629032 -7954.651.59168 1.02046
The corrected version should look like this:
3.77388 0.608871 -8216.34 2.42161 1.88655
4.39243 0.625 -8238.24 1.49211 0.889258
4.38903 0.608871 -7871.7 1.52994 0.883976
4.286 0.653226 -8287.32 2.3195 2.13736
4.29313 0.629032 -7954.65 1.59168 1.02046
More information: column 4 is always less than 10, so it has only one digit to the left of the decimal point.
I tried using awk:
tail -n 5 output.dat | awk '{print $3}'
-8216.342.42161
-8238.241.49211
-7871.71.52994
-8287.322.3195
-7954.651.59168
Is there a way to split this column into two?
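To reproduce the symptom, here is a minimal sketch that assumes the five sample rows from the question. The helper name `count_fields` is hypothetical and not part of the original post. Counting fields with awk's built-in `NF` variable shows that each row is parsed as four columns instead of the intended five.

```shell
# Hypothetical helper: print the number of whitespace-separated fields per line.
count_fields() {
    awk '{ print NF }'
}

# Sample rows copied from the question (columns 3 and 4 are fused).
sample='3.77388 0.608871 -8216.342.42161 1.88655
4.39243 0.625 -8238.241.49211 0.889258
4.38903 0.608871 -7871.71.52994 0.883976
4.286 0.653226 -8287.322.3195 2.13736
4.29313 0.629032 -7954.651.59168 1.02046'

# Each line reports 4 fields rather than 5.
printf '%s\n' "$sample" | count_fields
```

A check like this can also be run again after applying a fix, to confirm that every row then has five fields.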
Posted on 2014-01-13 16:10:06
One solution is:
sed 's/\(\.[0-9]*\)\([0-9]\.\)/\1 \2/'
Posted on 2014-01-13 16:13:29
Using a Perl one-liner:
perl -pe 's/(\d+\.\d+)(\d\.\d+)/$1 $2/' < output.dat > fixed_output.dat
Posted on 2014-01-13 17:03:21
Your input file
$ cat file
3.77388 0.608871 -8216.342.42161 1.88655
4.39243 0.625 -8238.241.49211 0.889258
4.38903 0.608871 -7871.71.52994 0.883976
4.286 0.653226 -8287.322.3195 2.13736
4.29313 0.629032 -7954.651.59168 1.02046
Awk approach
awk '{
    n = index($3, ".")                          # position of the first dot in field 3
    x = substr($3, 1, n+3) ~ /\.$/ ? n+1 : n+2  # if character n+3 is a dot, column 3 has one decimal (cut at n+1); otherwise assume two (cut at n+2)
    $3 = substr($3, 1, x) OFS substr($3, x+1)   # insert a separator between the two fused numbers
    $0 = $0                                     # re-split the record so NF counts the new field
    $1 = $1                                     # rebuild the record so all fields are joined by OFS (a tab here)
}1' OFS='\t' file
Result
3.77388 0.608871 -8216.34 2.42161 1.88655
4.39243 0.625 -8238.24 1.49211 0.889258
4.38903 0.608871 -7871.7 1.52994 0.883976
4.286 0.653226 -8287.32 2.3195 2.13736
4.29313 0.629032 -7954.65 1.59168 1.02046
https://stackoverflow.com/questions/21095662
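The awk answer above assumes that column 3 has at most two decimal places. As a hedged alternative, the sketch below relies only on the rule stated in the question, namely that column 4 has exactly one digit before its decimal point. It further assumes that column 4 always contains a decimal point and that a fused row has exactly four fields. The function name `split_fused_field` is hypothetical.

```shell
# Hypothetical helper: split the fused field 3 one digit before its last dot.
split_fused_field() {
    awk '{
        # Only touch rows that are missing a column (assumption: fused rows have 4 fields).
        # match() finds the digit, the last dot, and any trailing digits at the end of field 3.
        if (NF == 4 && match($3, /[0-9]\.[0-9]*$/)) {
            # RSTART is the position of that digit, which is where column 4 begins.
            $3 = substr($3, 1, RSTART - 1) " " substr($3, RSTART)
        }
        print
    }'
}

# Example with one row from the question.
printf '%s\n' '4.38903 0.608871 -7871.71.52994 0.883976' | split_fused_field
# prints: 4.38903 0.608871 -7871.7 1.52994 0.883976
```

Rows that already have five fields pass through unchanged, so the function could be applied to a whole file, for example `split_fused_field < output.dat > fixed_output.dat`.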