Q: How to group log entries and count each subgroup in bash

Stack Overflow user

Asked 2020-10-19 10:30:39

Answers: 4, Views: 106, Followers: 0, Votes: 0

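I want to analyze a log file. It contains several operations, and each operation consists of a set of sub-operations. I want to extract the number of sub-operations, grouped by operation. This would be easy in SQL, but I am stuck in bash.

Here is a simplified version of the file: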

[21:30:21.538Z #a9a.012 DEBUG -            -   ] c.h.c.w.j.JobTrackingWorkerReporter: Reporting bulk completion: Partition: tenant-xla; Job: ingestion-4759-9-13-41; Tasks: [ingestion-4759-9-13-41.1.43, ingestion-4759-9-13-41.1.44, ingestion-4759-9-13-41.1.41]

otherlogs stuff ...

[21:31:21.538Z #a9a.012 DEBUG -            -   ] c.h.c.w.j.JobTrackingWorkerReporter: Reporting bulk completion: Partition: tenant-xla; Job: ingestion-4757-10-17-4; Tasks: [ingestion-4757-10-17-4.1.2, ingestion-4757-10-17-4.1.1, ingestion-4757-10-17-4.1.3, ingestion-4757-10-17-4.1.4]

otherlogs stuff ...

[21:31:21.690Z #a9a.012 DEBUG -            -   ] c.h.c.w.j.JobTrackingWorkerReporter: Reporting bulk completion: Partition: tenant-xla; Job: ingestion-4757-10-18-3; Tasks: [ingestion-4757-10-18-3.1.137, ingestion-4757-10-18-3.1.139, ingestion-4757-10-18-3.1.138, ingestion-4757-10-18-3.1.140, ingestion-4757-10-18-3.1.136, ingestion-4757-10-18-3.1.141]

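Each operation is the part of the ID before the first dot; the rest identifies the sub-operation.

I am looking for a result like the one below, which I could, for example, store in a file: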

operationName            suboperationCount
ingestion-4759-9-13-41         3
ingestion-4757-10-17-4         4
ingestion-4757-10-18-3         6

I have been trying combinations such as cat xlogs.txt | grep 'ingestion' | uniq | wc -w > fileresult.txt

but that only returns a single global total.

Thanks!

Tags: bash, awk, count, grep, grouping

4 Answers

Stack Overflow user

Accepted answer

Posted 2020-10-19 10:40:44

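Edit: after the OP's comment, it became clear that only the ingestion IDs inside Tasks should be counted. In that case you could try the following, which strictly assumes that each line of your Input_file contains only one Tasks string.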

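awk '
{
  sub(/.*Tasks/,"Tasks")                    ##Drop everything before the last "Tasks", including the Job field.
  while(match($0,/ingestion-[0-9-]+/)){     ##Loop over each ingestion ID left in the line.
    arr[substr($0,RSTART,RLENGTH)]++
    $0=substr($0,RSTART+RLENGTH)
  }
}
END{
  for(i in arr){
    print i,arr[i]                          ##Note: for(i in arr) visits the keys in unspecified order.
  }
}'  Input_file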

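Original answer: with awk, you could try the following, written and tested against the samples shown. Note that it also matches the ID in the Job field, so every log line contributes one extra count for its job.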

awk '
{
  while(match($0,/ingestion-[0-9-]+/)){
    arr[substr($0,RSTART,RLENGTH)]++
    $0=substr($0,RSTART+RLENGTH)
  }
}
END{
  for(i in arr){
    print i,arr[i]
  }
}' Input_file

Explanation: a detailed, commented version of the above.

awk '                                       ##Start the awk program.
{
  while(match($0,/ingestion-[0-9-]+/)){     ##Loop while the regex still matches somewhere in the current line.
    arr[substr($0,RSTART,RLENGTH)]++        ##Use the matched substring as the index of array arr and increment its count by 1.
    $0=substr($0,RSTART+RLENGTH)            ##Replace the current line with whatever follows the match.
  }
}
END{                                        ##Start the END block, which runs after all input is read.
  for(i in arr){                            ##Traverse all elements of arr.
    print i,arr[i]                          ##Print each index (the matched ID) and its count.
  }
}' Input_file                               ##Name of the input file.
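
To get the table layout asked for in the question, one option is to split the Tasks list itself instead of scanning for matches. The following is a minimal sketch and not part of the original answer: the function name count_tasks_per_job and the column width of 28 are assumptions made for illustration, and it assumes every relevant line contains a single "Tasks: [...]" list of comma-separated IDs.

```shell
# Hypothetical helper (name is an assumption): count the task IDs listed
# under "Tasks: [...]" per job and print them as a two-column table.
count_tasks_per_job() {
  awk '
    /Tasks: \[/ {
      line = $0
      sub(/.*Tasks: \[/, "", line)        # keep only the task list
      sub(/\].*/, "", line)               # drop the closing bracket and the rest
      n = split(line, tasks, /, */)       # one array element per task ID
      for (i = 1; i <= n; i++) {
        job = tasks[i]
        sub(/\.[0-9.]*$/, "", job)        # strip the sub-operation suffix, e.g. ".1.43"
        if (!(job in count)) order[++jobs] = job   # remember first-seen order
        count[job]++
      }
    }
    END {
      printf "%-28s %s\n", "operationName", "suboperationCount"
      for (j = 1; j <= jobs; j++)
        printf "%-28s %d\n", order[j], count[order[j]]
    }
  ' "$@"
}
```

For example, count_tasks_per_job xlogs.txt > fileresult.txt would write the table to a file. Jobs are listed in the order in which they first appear, which avoids the unspecified ordering of for(i in arr).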

Votes: 5

Stack Overflow user

Posted 2020-10-19 10:36:38

You can use the following grep + uniq command:

grep -Eo '\bingestion-[0-9-]+' file.log | uniq -c

  4 ingestion-4759-9-13-41
  5 ingestion-4757-10-17-4
  7 ingestion-4757-10-18-3
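
Two details of this pipeline are worth keeping in mind. The pattern also matches the ID in the Job field, which is why the counts above are 4, 5 and 7 rather than 3, 4 and 6. In addition, uniq -c only merges adjacent identical lines, so a job that shows up again later in the log would be listed twice. A possible variant is sketched below under the assumption that task IDs are always followed by a dot while the Job ID is not. The function name count_tasks_grep is hypothetical, and grep -o is a widely available extension rather than a POSIX option.

```shell
# Hypothetical helper (name is an assumption): count task IDs per job in one file.
count_tasks_grep() {
  # Match only IDs that are followed by a dot (task IDs, not the Job field),
  # strip that trailing dot, then sort so uniq -c sees identical IDs together.
  grep -Eo 'ingestion-[0-9-]+\.' "$1" | sed 's/\.$//' | sort | uniq -c
}
```

Called as count_tasks_grep file.log, this prints one line per job with the count in the first column, sorted by job name instead of by order of appearance.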

Votes: 1

Stack Overflow user

Posted 2020-10-19 10:42:44

$ grep -o 'ingestion[.0-9-]*\.' file | uniq -c
      3 ingestion-4759-9-13-41.1.
      4 ingestion-4757-10-17-4.1.
      6 ingestion-4757-10-18-3.1.
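
The output of uniq -c can be reshaped into the two-column layout from the question with a small awk filter. This is only a sketch: format_counts is a hypothetical name, the column width is arbitrary, and it assumes each input line looks like the uniq -c output above (a count, then an ID ending in a dot-separated numeric suffix).

```shell
# Hypothetical filter (name is an assumption): turn "count ID" lines from
# uniq -c into "operationName suboperationCount" rows with a header.
format_counts() {
  awk '
    BEGIN { printf "%-28s %s\n", "operationName", "suboperationCount" }
    {
      name = $2
      sub(/\.[0-9.]*$/, "", name)         # strip the trailing suffix, e.g. ".1."
      printf "%-28s %d\n", name, $1
    }
  '
}
```

It would be used at the end of the pipeline, for example: grep -o 'ingestion[.0-9-]*\.' file | uniq -c | format_counts > fileresult.txt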

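Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei engine for the IT domain.

Original link:

https://stackoverflow.com/questions/64425538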
