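I'm trying to write an .awk script that takes a .csv file as input and prints it with the commas removed and the columns aligned. So far I have tried:
{ printf "%-10s %s\n", $1, $2, $3 ,$4 }
but this only outputs aligned data for the first two fields. It removes the comma separators fine, but the fourth column contains commas inside double quotes, and I wonder whether that will cause problems. Any guidance is much appreciated; I'm very new to awk.
Sample input looks like: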
Name,Last Name,Gender,Pet
Kit,Rattenberie,Male,"Crake, african black"
Cliff,Lakes,Male,"Red phalarope"
Tirrell,Stables,Male,"Rhea, greater"
Cherry,William,Female,"Crow, house"
The desired output would look something like:
Name     Last Name    Gender  Pet
Kit      Rattenberie  Male    "Crake, african black"
Cliff    Lakes        Male    "Red phalarope"
Tirrell  Stables      Male    "Rhea, greater"
Cherry   William      Female  "Crow, house"
for a .csv file with 10 rows. Thanks in advance.
Answered 2022-10-11 21:16:26
With GNU awk, you can use the following:
awk -v FPAT='"[^"]*"|[^,]+' '{
for (i=1; i<=NF; ++i) $i = sprintf("%-12s", $i)} 1' file
Name         Last Name    Gender       Pet
Kit          Rattenberie  Male         "Crake, african black"
Cliff        Lakes        Male         "Red phalarope"
Tirrell      Stables      Male         "Rhea, greater"
Cherry       William      Female       "Crow, house"
Or, if the widths are completely unpredictable, use this awk + column solution:
awk -v FPAT='"[^"]*"|[^,]+' -v OFS=';' '{$1=$1} 1' file |
column -s';' -t
Name     Last Name    Gender  Pet
Kit      Rattenberie  Male    "Crake, african black"
Cliff    Lakes        Male    "Red phalarope"
Tirrell  Stables      Male    "Rhea, greater"
Cherry   William      Female  "Crow, house"
If you want to put this in an awk script, use:
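cat col.awk
BEGIN {
    FPAT = "\"[^\"]*\"|[^,]+"
    OFS = ";"
}
{ $1 = $1 }    # assigning to a field rebuilds the record with OFS between fields
1              # always-true pattern: print the rebuilt record
Use it as:
awk -f col.awk file.csv | column -s';' -t
Answered 2022-10-11 21:41:24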
One idea is to use an awk script (per the OP's comment) and let awk determine the maximum width of each column:
$ cat script.awk
BEGIN { FPAT="\"[^\"]*\"|[^,]+" }                        # instead of parsing on field delimiter (via FS) ... parse on field format (via FPAT)
      { for (i=1;i<=NF;i++)
            w[i]= length($i) > w[i] ? length($i) : w[i]   # keep track of max width of each column
        lines[FNR]=$0                                     # save entire line
      }
END   { for (i=1;i<=FNR;i++) {                            # loop through each saved line
            n=patsplit(lines[i],a)                        # reparse based on FPAT, storing fields in array a[]
            for (j=1;j<n;j++)                             # loop through array entries ...
                printf "%-*s%s", w[j], a[j], OFS          # printing to stdout
            print a[n]                                    # print last field plus "\n"
        }
      }
Or use a multidimensional array to store the input, which eliminates the second parse of the input data (via patsplit()):
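$ cat script.awk
BEGIN { FPAT="\"[^\"]*\"|[^,]+" }
      { for (i=1;i<=NF;i++) {
            w[i]= length($i) > w[i] ? length($i) : w[i]   # keep track of max width of each column
            fields[FNR][i]=$i                             # save each field by line number and column
        }
      }
END   { for (i=1;i<=FNR;i++) {
            for (j=1;j<NF;j++)                            # NF still holds the field count of the last record
                printf "%-*s%s", w[j], fields[i][j], OFS
            print fields[i][NF]
        }
      }
Notes: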
requires GNU awk for FPAT and multidimensional array support
both scripts hold the entire input in memory (in the lines[] or fields[][] array)
Both versions of script.awk produce:
$ awk -f script.awk file
Name    Last Name   Gender Pet
Kit     Rattenberie Male   "Crake, african black"
Cliff   Lakes       Male   "Red phalarope"
Tirrell Stables     Male   "Rhea, greater"
Cherry  William     Female "Crow, house"
https://stackoverflow.com/questions/74034018