Q: Merge 3 files in awk based on a common key

Stack Overflow user

Asked 2020-12-17 10:43:59

Answers: 3, Views: 89, Followers: 0, Votes: 3

Three files:

n.txt
id-3a,oc-ctrl-jr-0,ACTIVE,-,Running,cp=172.31.0.7
id-5e,oc-ctrl-jr-1,ACTIVE,-,Running,cp=172.31.0.6
id-5f,oc-ctrl-jr-2,ACTIVE,-,Running,cp=172.31.0.5
id-0,oc-comp-jr-0,ACTIVE,-,Running,cp=172.31.0.9
id-77,oc-comp-jr-1,ACTIVE,-,Running,cp=172.31.0.8

bm.txt
server-10,id-77,power on,active,False
server-2,id-5f,power on,active,False
server-32,id-3a,power on,active,False
server-11,id-5e,power on,active,False
server-25,id-0,power on,active,False

The third file contains a section for each line of bm.txt:

hosts.yaml
[..]
- arch: x86_64
  zone: foo
  cpu: 1
  disk: 10
  hw_model_type:
  - bar-8
  mac:
  - aa:aa:aa:aa:aa:aa:aa
  memory: 4096
  name: server-32
  desc: 'my host'
  ip_addr: 192.168.117.33
  info: false 
  type: bla
[..]

Desired output:

n name,b name,n power,n desc,n state,n cp,bm power,bm state,b error,ip
oc-ctrl-jr-0,server-32,ACTIVE,-,Running,cp=172.31.0.7,power on,active,False,192.168.117.33
oc-ctrl-jr-1,server-11,ACTIVE,-,Running,cp=172.31.0.6,power on,active,False,192.168.117.47
oc-ctrl-jr-2,server-2,ACTIVE,-,Running,cp=172.31.0.5,power on,active,False,192.168.117.87
oc-comp-jr-0,server-25,ACTIVE,-,Running,cp=172.31.0.9,power on,active,False,192.168.117.111
oc-comp-jr-1,server-10,ACTIVE,-,Running,cp=172.31.0.8,power on,active,False,192.168.117.3

I can join the first two files with this code, but the column order differs from what I need:

awk -F, 'BEGIN{print"N Name,N Power,N Desc,N State,N CP,BM Name,BM Power,BM State,BM Error"}
         NR==FNR{OFS=",";a[$2]=$1 OFS $3 OFS $4 OFS $5; next}
         $1 in a {print $2,$3,$4,$5,$6,a[$1]}' bm.txt n.txt

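But I don't know how to reorder the columns, or how to add the parsing of the third file to the main code. I can parse the third file separately like this: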

awk '$0~"name: server-32$"{getline;getline;print $NF}' hosts.yaml

I'd appreciate any help with getting the desired output, as well as any suggestions on how to improve the current code.

Thanks

awk

3 Answers

Stack Overflow user

Accepted answer

Posted 2020-12-17 13:15:46

With GNU awk, you could try the following, written and tested with the samples shown.

awk '
ARGIND==1{
  if($0~/server-[0-9]+/){
    foundServer=1
    serverName=$2
  }
  if(foundServer && $0 ~ /ip_addr:/){
    servername[serverName]=$2
    serverName=foundServer=""
  }
  next
}
ARGIND==2{
  if(setFS==""){ FS=OFS=",";setFS=1;$0=$0 }
  server[$2]=$1
  powerarr[$2]=$3 OFS $4 OFS $5
  next
}
ARGIND==3{
  if($1 in server){
    print $2 OFS server[$1] OFS $3 OFS $4 OFS $5 OFS $6 OFS powerarr[$1],(servername[server[$1]]!=""?servername[server[$1]]:"NA")
  }
}
' hosts.yaml bm.txt n.txt

Explanation: a detailed, commented version of the above.

awk '
## Start of the awk program.
ARGIND==1{
## First input file (hosts.yaml).
  if($0~/server-[0-9]+/){
## The line contains a server name such as server-32.
    foundServer=1
## Remember that a server name has been seen.
    serverName=$2
## Save the 2nd field, the server name from the YAML file.
  }
  if(foundServer && $0 ~ /ip_addr:/){
## A server name was seen earlier and this line holds its ip_addr.
    servername[serverName]=$2
## Store the IP (2nd field) in servername, indexed by server name.
    serverName=foundServer=""
## Reset serverName and foundServer.
  }
  next
## Skip the remaining rules for this line.
}
ARGIND==2{
## Second input file (bm.txt).
  if(setFS==""){ FS=OFS=",";setFS=1;$0=$0 }
## On the first line only, set FS and OFS to a comma and re-split the current line with $0=$0, because a new FS otherwise applies only from the next line.
  server[$2]=$1
## Map the id (2nd field) to the server name (1st field).
  powerarr[$2]=$3 OFS $4 OFS $5
## Store the power, state and error fields, indexed by id.
  next
## Skip the remaining rules for this line.
}
ARGIND==3{
## Third input file (n.txt).
  if($1 in server){
## The id in the 1st field is known from bm.txt.
    print $2 OFS server[$1] OFS $3 OFS $4 OFS $5 OFS $6 OFS powerarr[$1],(servername[server[$1]]!=""?servername[server[$1]]:"NA")
## Print the fields in the order the OP asked for, with NA when no IP was found.
  }
}
' hosts.yaml bm.txt n.txt ## Input file names.
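
A side note on setting FS inside a rule, as the ARGIND==2 block does: awk splits a record into fields when it reads it, so a newly assigned FS normally takes effect only from the next record. Assigning $0 to itself forces the current record to be re-split. A minimal sketch on made-up input (the function name demo_resplit is only for illustration):

```shell
# Hypothetical helper: shows that re-assigning $0 re-splits the current
# record with the FS that was just set.
demo_resplit() {
  printf 'a,b c\nd,e f\n' |
    awk 'NR==1 { FS = ","; $0 = $0 }   # re-split line 1 with the new FS
         { print $1 }'
}

demo_resplit
```

Without the `$0 = $0`, the first line would typically still be split on whitespace, so its first field would be `a,b` rather than `a`.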

Votes: 5

Stack Overflow user

Posted 2020-12-17 12:42:29

awk 'FILENAME=="n.txt" { # Process on the n.txt file
                split($0,arr,","); # Split data into array arr using , as the delimiter
                nam[arr[1]]=arr[2]; # Use entries in arr array to create separate arrays for each piece of data required.
                state[arr[1]]=arr[5];
                cp[arr[1]]=arr[6];
                state3[arr[1]]=arr[3];
                ndesc[arr[1]]=arr[4];
                next
               }
FILENAME=="bm.txt" { # Process only bm.txt.
                split($0,arr,",");
                serv[arr[2]]=arr[1]; # Create separate arrays as we did with the previous file
                power[arr[2]]=arr[3];
                state1[arr[2]]=arr[4];
                berr[arr[2]]=arr[5];
                next
               }
FILENAME=="hosts.yaml" && /name/ { # Search for name variable in yaml
                split($0,arr,":");
                gsub(" ","",arr[2]); # Remove spaces
                serva=arr[2]; # Hold server name in serva variable
                next
               }
FILENAME=="hosts.yaml" && /ip_addr/ { # Search for ip_addr in yaml
                split($0,arr,":");
                gsub(" ","",arr[2]);
                serv1[serva]=arr[2]; # Set up an array indexed with server name
                next
               }
END                {
                for (i in nam) { # Loop through nam array pulling out entries in other arrays
                   print nam[i]","serv[i]","state3[i]","ndesc[i]","state[i]","cp[i]","power[i]","state1[i]","berr[i]","serv1[serv[i]]
                }
               }' n.txt bm.txt hosts.yaml

Output:

oc-ctrl-jr-1,server-11,ACTIVE,-,Running,cp=172.31.0.6,power on,active,False,
oc-ctrl-jr-2,server-2,ACTIVE,-,Running,cp=172.31.0.5,power on,active,False,
oc-comp-jr-0,server-25,ACTIVE,-,Running,cp=172.31.0.9,power on,active,False,
oc-ctrl-jr-0,server-32,ACTIVE,-,Running,cp=172.31.0.7,power on,active,False,192.168.117.33
oc-comp-jr-1,server-10,ACTIVE,-,Running,cp=172.31.0.8,power on,active,False,
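
The lines above come out in a different order than n.txt because awk does not define the traversal order of `for (i in nam)`. If input order matters, one option is to record each key in a numerically indexed array as it is read and loop over that in END. A minimal sketch on the first two fields of three n.txt lines (the names order, n and join_in_order are illustrative, not taken from the answer):

```shell
# Hypothetical helper: keeps keys in arrival order instead of relying on for-in.
join_in_order() {
  printf 'id-3a,oc-ctrl-jr-0\nid-5e,oc-ctrl-jr-1\nid-0,oc-comp-jr-0\n' |
    awk -F, '
      { nam[$1] = $2; order[++n] = $1 }      # remember keys in arrival order
      END {
        for (k = 1; k <= n; k++)             # numeric loop, so order is fixed
          print order[k] "," nam[order[k]]
      }'
}

join_in_order
```

The same `order[++n]` bookkeeping could be added to the FILENAME=="n.txt" block, with the END loop iterating over `order` instead of `nam`.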

Votes: 2

Stack Overflow user

Posted 2020-12-19 00:30:33

How big are these files?

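If the total is under 3-4 GB, I'd suggest loading each file entirely into awk and taking advantage of associative arrays, which are basically DB indexes.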

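That is probably somewhat less effort than trying to keep up with this multi-way file join. Unless Unicode is a concern or gensub() is indispensable, consider running the exact code the others suggested in mawk 1.3.4 and mawk2-beta.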
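
One caveat when trying other awks: ARGIND is a gawk extension, so as far as I know the accepted answer will not run unchanged under mawk. A rough portable sketch is to count files with FNR==1 and to switch FS with a command-line assignment placed between the file names, both of which are standard awk features. The sample files below are trimmed to two hosts, the 192.168.117.47 address is taken from the desired output in the question, and the names fileno, ip, bm and merge_three are assumptions made for illustration.

```shell
# Build small sample inputs in a scratch directory (data trimmed from the question).
dir=$(mktemp -d)
cat > "$dir/hosts.yaml" <<'EOF'
- arch: x86_64
  name: server-32
  desc: 'my host'
  ip_addr: 192.168.117.33
- arch: x86_64
  name: server-11
  ip_addr: 192.168.117.47
EOF
printf '%s\n' 'server-32,id-3a,power on,active,False' \
              'server-11,id-5e,power on,active,False' > "$dir/bm.txt"
printf '%s\n' 'id-3a,oc-ctrl-jr-0,ACTIVE,-,Running,cp=172.31.0.7' \
              'id-5e,oc-ctrl-jr-1,ACTIVE,-,Running,cp=172.31.0.6' > "$dir/n.txt"

# Hypothetical helper: three-file join without ARGIND.
merge_three() {
  awk '
    FNR == 1 { fileno++ }                       # portable stand-in for ARGIND
    fileno == 1 {                               # hosts.yaml, whitespace-split
      if ($1 == "name:")    host = $2
      if ($1 == "ip_addr:") ip[host] = $2
      next
    }
    fileno == 2 {                               # bm.txt, comma-split
      bm[$2] = $1 OFS $3 OFS $4 OFS $5          # keyed by id
      next
    }
    ($1 in bm) {                                # n.txt, comma-split
      split(bm[$1], b, OFS)                     # b[1] is the server name
      addr = (b[1] in ip) ? ip[b[1]] : "NA"
      print $2, b[1], $3, $4, $5, $6, b[2], b[3], b[4], addr
    }
  ' "$dir/hosts.yaml" FS=, OFS=, "$dir/bm.txt" "$dir/n.txt"
}

merge_three
```

The FNR==1 counter miscounts if one of the input files is empty, so this is only a starting point. Because the separator is switched on the command line, the first line of bm.txt is already split on commas and no re-split is needed.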

It's quite surreal to see a dinosaur-era scripting language beat tr, cut, sed and grep at their own game in a good share of my daily use cases.

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/65339152
