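I need to parse a field in an awk script to search for a special character and, if it is present, replace it with "," or "/"
The awk script converts a CSV file to a DAT file. The defined field separator is ;, but users sometimes send comments that contain the field separator. To handle this, we need to parse the comment field, $4, and if it contains a ;, replace it with a comma or a slash.
Here is the file: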
"PAT";"TARO";"GEO";"COMMENT"
"FRT";"1256";"USA";"THIS IS A COMMENT ; AFTER COMMENT"
Expected outcome:
PAT TARO GEO COMMENT
FRT 1256 USA THIS IS A COMMENT / AFTER COMMENT
The script so far:
BEGIN {
FS = ";" ;
OFS = " " ;
print "pat taro geo comment";
}
NR==1{
next
}
{
pat= $1;
taro = $2;
geo = $3 ;
comment = $4 ;
}
if $4 contains ";" then
replace with "/"
end if;
{
print "pat,taro,geo,comment";
}
How can I do this?
Thanks in advance.
Posted on 2020-10-29 21:11:09
If there are no newlines inside the fields, you could use, for example, GNU awk and its FPAT feature:
$ gawk '
BEGIN {
FPAT="([^;]*)|(\"[^\"]+\")"
}
{
print $4
}' file
Output:
"COMMENT"
"THIS IS A COMMENT ; AFTER COMMENT"
If you still want to replace the ; in the comment, add gsub(/;/,"/",$4) before the print.
Edit:
$ gawk '
BEGIN {
    FPAT="([^;]*)|(\"[^\"]+\")"  # FPAT;separates;semicolons;"and quotes"
    print "pat taro geo comment" # print header
}
{
    for(i=1;i<=NF;i++)           # loop all 4 fields
        gsub(/^"|"$/,"",$i)      # remove quotes
    gsub(/;/,"/",$4)             # change the ; in $4 to /
    pat = $1                     # no need for these vars, but since you wanted them
    taro = $2
    geo = $3
    comment = $4
    print pat,taro,geo,comment   # output new vars but you could as well:
    # print $1,$2,$3,$4          # use this too or
    # print $0                   # since record was rebuilt on gsub
}' file
Output:
pat taro geo comment
PAT TARO GEO COMMENT
FRT 1256 USA THIS IS A COMMENT / AFTER COMMENT
https://stackoverflow.com/questions/64589202
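FPAT is a GNU awk extension, so the commands above will not work with other awk implementations. If gawk is not available, one possible workaround in plain awk is sketched below. It rests on an assumption the question does not state: the comment is always the last column, and no other column contains a ;. Under that assumption, any field beyond the fourth is a piece of the split comment and can be glued back on with a /. The function name csv_to_dat is made up for this example.

```shell
# Hypothetical helper, not from the original answer.
# Assumes the comment is the LAST column and only it may contain ';'.
csv_to_dat() {
  awk '
  BEGIN { FS = ";"; OFS = " " }
  {
    comment = $4
    for (i = 5; i <= NF; i++)        # fields past the 4th are pieces of the comment
      comment = comment "/" $i       # rejoin them with / in place of the ;
    line = $1 OFS $2 OFS $3 OFS comment
    gsub(/"/, "", line)              # drop every double quote (also any inside the comment)
    print line
  }' "$@"
}

# Run it on the sample data from the question.
csv_to_dat <<'EOF'
"PAT";"TARO";"GEO";"COMMENT"
"FRT";"1256";"USA";"THIS IS A COMMENT ; AFTER COMMENT"
EOF
```

On the two sample lines this should print PAT TARO GEO COMMENT followed by FRT 1256 USA THIS IS A COMMENT / AFTER COMMENT. If a ; can also appear in an earlier column, this approach breaks and the FPAT version is the safer choice.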