
How do I extract log entries between two times when the log has no dates?

Stack Overflow user
Asked 2022-05-03 20:22:06
3 answers, 54 views, 0 followers, 0 votes

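I'm trying to write an automated script that takes the latest log entry and collects every log entry from the two hours before it, regardless of whether a log entry exists at exactly that point in time. The problem I keep running into is that all the examples I've found have dates attached to the timestamps, and mine don't. A sample of the log output: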

13:26:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
13:26:28.713687 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9287:9522, ack 13044, win 420, length 235
13:26:28.713766 IP term-IdeaPad-Flex.46364 > unn-37-19-198-173.datapacket.com.https: Flags [.], ack 9522, win 24576, length 0
13:26:28.840650 IP term-IdeaPad-Flex.46364 > unn-37-19-198-173.datapacket.com.https: Flags [.], seq 14286:15624, ack 9522, win 24576, length 1338
13:26:28.848949 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9522:9599, ack 14286, win 420, length 77
13:26:28.849002 IP term-IdeaPad-Flex.46364 > unn-37-19-198-173.datapacket.com.https: Flags [P.], seq 15624:15674, ack 9599, win 24576, length 50
13:26:28.849023 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9599:9743, ack 14286, win 420, length 144
13:26:28.849031 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9743:10269, ack 14286, win 420, length 526

So the time is at the start of each line, with no date. date doesn't like that: it gives me a date: invalid date '+%s' response and outputs nothing. What I have so far is:

#!/bin/bash

truncate -s 0 twoHour.log

NEW=$(tail -n1 $1 | cut -d ":" -f1)
# echo $NEW
New=$(date -d "$NEW" +%s)
OLD=$(($NEW-2))
New=$(date -d "$OLD" +%s)
# echo $OLD
START=$(egrep "$NEW\:\d\d\:\d\d" $1 | tail | date -d +%s)
END=$(egrep "$OLD\:\d\d\:\d\d" $1 | head | date -d +%s)

while read line; do

    # Extract the date for each line.
    # First strip off everything up to the first "[".
    # Then remove everything after the first "]".
    # Finally, straighten up the format with the cleandate function
    date="${date%%.*}"
    date=$( cleandate "$date" )

    # If the date falls between d1 and d2, print it
    if [[ $date -ge $START && $date -le $END ]]; then
         echo "$line"
    fi

done

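NEW and OLD are the hours being extracted. START and END are the boundaries, and everything between them should be output line by line. $1 is the log file.

I've been modifying bash/awk scripts and searching for a ready-made script for a few hours now, and I still can't work out how to get this working.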

3 Answers

Stack Overflow user

Accepted answer

Answered 2022-05-03 23:00:29

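sed can print a range of lines delimited by two regular expressions (a start address and an end address).

/^11:.*$/,/^13:26:28.849031 .*$/p

The first address can be refined further by taking the minutes into account, which gives /^11:(2[6-9]|[3-5][0-9]).*$/,/^13:26:28.849031 .*$/p.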

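# This first version assumes the window does not cross midnight.
last_line=$(tail -n1 test.txt)
end_time=$(cut -d ' ' -f1 <<<"$last_line")   # e.g. 13:26:28.849031
end_hour="${end_time:0:2}"
min_msb="${end_time:3:1}"                    # tens digit of the minute
min_lsb="${end_time:4:1}"                    # units digit of the minute
# 10# forces base 10 so that 08 and 09 are not rejected as invalid octal;
# %02d keeps the leading zero so the result matches the HH field in the log.
start_hour=$(printf '%02d' $((10#$end_hour - 2)))

if [ "$min_msb" -lt 5 ];then
  min_next=$((min_msb+1))
else
  min_next=5
fi

sed -rn "/^$start_hour:($min_msb[$min_lsb-9]|[$min_next-5][0-9]).*$/,/^$end_time .*$/p" test.txt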
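
To make the construction of the first address easier to inspect on its own, here is a minimal sketch that only builds the regular expression and prints it. The function name `build_start_regex` is hypothetical (it does not appear in the answer), and the sketch assumes the window does not cross midnight.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answer): build the start-address regex
# for a window of `range` hours ending at `end_time` (HH:MM:SS...).
# Assumes the window does not cross midnight.
build_start_regex() {
  local end_time=$1 range=$2
  local hour=$((10#${end_time:0:2}))   # 10# forces base 10 (08 and 09 are not octal)
  local msb=${end_time:3:1}            # tens digit of the minute
  local lsb=${end_time:4:1}            # units digit of the minute
  local start_hour
  start_hour=$(printf '%02d' $((hour - range)))
  if [ "$msb" -lt 5 ]; then
    # rest of the current ten-minute block, then every later block
    printf '%s:(%s[%s-9]|[%s-5][0-9])\n' "$start_hour" "$msb" "$lsb" "$((msb + 1))"
  else
    # minutes 5x: only the rest of the last ten-minute block remains
    printf '%s:%s[%s-9]\n' "$start_hour" "$msb" "$lsb"
  fi
}

build_start_regex 13:26:28.849031 2   # prints 11:(2[6-9]|[3-5][0-9])
```

The printed expression can then be used as the first address of the sed range, with the end time as the second address.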

If the time range crosses midnight, for example:

22:57:46.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
23:26:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
...
00:36:28.849031 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9743:10269, ack 14286, win 420, length 526

then:

UPDATE: Fixed the regular expression for sed's first address for the case where the time range crosses midnight.

hour_range=2
last_line=$(tail -n1 test.txt)
#end_time=$(cut -d ' ' -f1 <<<"$last_line")
end_time="${last_line:0:8}"

start_time="$(date -d "$(date -d "$end_time" --iso=seconds) -$hour_range hour" '+%T')"
echo "Time range: $start_time - $end_time"

# 10# forces base 10: printf "%d" 08 fails with "invalid octal number"
end_hour=$((10#${end_time:0:2}))
min_msb="${end_time:3:1}"
min_lsb="${end_time:4:1}"
start_hour="${start_time:0:2}"   # keep the leading zero: it has to match the log text

if [ "$min_msb" -lt 5 ];then
  min_next=$(($min_msb+1))
else
  min_next=5
fi
# Crossed midnight
start_hour_expr="$start_hour:($min_msb[$min_lsb-9]|[$min_next-5][0-9])"
if [ "$start_hour" -gt "$end_hour" ];then
  # hour after start_hour, zero-padded, wrapping 23 -> 00
  start_hour_next=$(printf '%02d' $(( (10#$start_hour + 1) % 24 )))
  start_hour_expr="($start_hour_expr|$start_hour_next:[0-5][0-9])"
fi

echo "sed expression:"
echo -e "/^$start_hour_expr.*$/,/^$end_time.*$/p \n"

sed -rn "/^$start_hour_expr.*$/,/^$end_time.*$/p" test.txt
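
The midnight branch above needs "the hour after the start hour", wrapping from 23 back to 00. A minimal sketch of that one step in isolation follows; `next_hour` is a hypothetical name used only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the hour after HH, zero-padded, wrapping 23 -> 00.
next_hour() {
  # 10# forces base 10; without it bash rejects 08 and 09 as invalid octal
  printf '%02d\n' $(( (10#$1 + 1) % 24 ))
}

next_hour 22   # prints 23
next_hour 23   # prints 00
next_hour 08   # prints 09
```

Using modular arithmetic avoids special-casing 24, and the zero padding keeps the result usable directly inside the regular expression.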

Given:

21:32:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
21:57:46.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
22:10:46.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
23:07:46.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
23:26:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
00:26:28.849023 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9599:9743, ack 14286, win 420, length 144
00:36:28.849031 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9743:10269, ack 14286, win 420, length 526

it returns:

sed expression:
/^(22:(3[6-9]|[4-5][0-9])|23:[0-5][0-9]).*$/,/^00:36:28.*$/p 

23:07:46.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
23:26:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
00:26:28.849023 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9599:9743, ack 14286, win 420, length 144
00:36:28.849031 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9743:10269, ack 14286, win 420, length 526

Votes: 0

Stack Overflow user

Answered 2022-05-04 00:43:41

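Assumptions:

  • the offset (2 hours in the OP's example) is less than 24 hours
  • every line begins with a timestamp in the format HH:MM:SS
  • the log may span multiple days

Plan:

  • convert the offset (e.g., 2 hrs) to seconds; we'll call this offset_secs
  • get the time from the last line of the file; we'll call this last_time
  • convert that time to epoch seconds; we'll call this last_epoch
  • subtract offset_secs from last_epoch; we'll call this first_epoch
  • convert first_epoch back to an HH:MM:SS string; we'll call this first_time
  • to handle files whose timestamps span multiple midnights, save the lines of interest in an array, and reset the array whenever a line falls outside the time range (meaning everything saved so far belongs to an earlier day)
  • in awk's END block, print the array of lines to stdout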
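
The time arithmetic in the plan can also be sketched in plain bash. Note the swap: instead of epoch seconds via mktime()/strftime(), this sketch works in seconds since midnight and wraps modulo 86400, so it ignores daylight-saving changes. All function names here are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not from the answer.

# HH:MM:SS (anything after the seconds is ignored) -> seconds since midnight
hms_to_secs() {
  local h=${1:0:2} m=${1:3:2} s=${1:6:2}
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# seconds since midnight -> HH:MM:SS
secs_to_hms() {
  printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# start of the window: last_time minus offset_secs, wrapped into 0..86399
first_time_for() {
  local last_secs
  last_secs=$(hms_to_secs "$1")
  secs_to_hms $(( ( (last_secs - $2) % 86400 + 86400 ) % 86400 ))
}

first_time_for 11:51:50.896232 $((2*60*60))   # prints 09:51:50
first_time_for 02:51:50.896232 $((4*60*60))   # prints 22:51:50
```

When the result is later than last_time as a string, as in the second call, the window has crossed midnight.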

One idea using GNU awk:

$ cat log.awk
BEGIN { FS="." }                                # set input field delimiter to "."

# first line of input is last line of log file; grab time and calculate the offset/start time

NR==1 { last_time   = $1
        last_epoch  = mktime( strftime("%Y %m %d") " " gensub(/:/," ","g",last_time))
        first_epoch = last_epoch - offset_secs
        first_time  = strftime("%H:%M:%S", first_epoch)

        if (first_time > last_time)
           spans_midnight=1
        next
      }

# for the rest of the input lines determine if the time falls within the last "offset_secs"

      { curr_time = $1
        if ( (  spans_midnight && curr_time >= first_time) ||
             (  spans_midnight && curr_time <= last_time)  ||
             ( !spans_midnight && curr_time >= first_time && curr_time <= last_time) )
           lines[++cnt]=$0
        else {                                  # outside the time range so ...
           delete lines                         # delete anything saved up to this point and ...
           cnt=0                                # reset the array index
        }
      }
END   { for (i=1;i<=cnt;i++)                    # print the lines that occurred within the last "offset_secs"
            print lines[i]
      }

NOTE: see "GNU awk: Time Functions" for more details on the mktime() and strftime() functions.
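
The three-way condition in the awk script relies on the fact that zero-padded HH:MM:SS strings sort in the same order as the times they represent. A minimal bash sketch of the same membership test follows; `in_window` is a hypothetical name, and the sketch infers the midnight case from first > last rather than from a separate spans_midnight flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if curr lies in the window [first, last].
# Zero-padded HH:MM:SS strings compare like the times they represent,
# so plain string comparison is enough.
in_window() {
  local curr=$1 first=$2 last=$3
  if [[ $first > $last ]]; then
    # window crosses midnight: late evening OR early morning
    [[ ! $curr < $first || ! $curr > $last ]]
  else
    [[ ! $curr < $first && ! $curr > $last ]]
  fi
}

in_window 23:07:37 22:51:50 02:51:50 && echo keep     # prints keep
in_window 09:23:00 22:51:50 02:51:50 || echo ignore   # prints ignore
```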

Test #1: last 2 hours; window does not cross midnight; file crosses midnight

$ cat sample.log
22:22:00.896232 IP 104.16.42.63.https  ignore this line
06:22:00.896232 IP 104.16.42.63.https  ignore this line; crossed midnight
07:22:00.896232 IP 104.16.42.63.https  ignore this line
09:23:00.896232 IP 104.16.42.63.https  ignore this line
09:51:49.896232 IP 104.16.42.63.https  ignore this line
09:51:50.896232 IP 104.16.42.63.https  keep this line
10:24:37.896232 IP 104.16.42.63.https  keep this line
11:51:50.896232 IP 104.16.42.63.https  keep this line

$ offset_secs=$((2*60*60))                   # 2 hours

$ awk -v offset_secs="${offset_secs}" -f log.awk <(tail -1 sample.log) sample.log
09:51:50.896232 IP 104.16.42.63.https  keep this line
10:24:37.896232 IP 104.16.42.63.https  keep this line
11:51:50.896232 IP 104.16.42.63.https  keep this line

Test #2: last 4 hours; window crosses midnight; file crosses multiple midnights

$ cat sample.log
20:22:00.896232 IP 104.16.42.63.https  ignore this line
23:22:00.896232 IP 104.16.42.63.https  ignore this line
01:22:00.896232 IP 104.16.42.63.https  ignore this line; crossed midnight
23:22:00.896232 IP 104.16.42.63.https  ignore this line
01:22:00.896232 IP 104.16.42.63.https  ignore this line; crossed midnight
06:22:00.896232 IP 104.16.42.63.https  ignore this line
07:22:00.896232 IP 104.16.42.63.https  ignore this line
09:23:00.896232 IP 104.16.42.63.https  ignore this line
22:51:49.896232 IP 104.16.42.63.https  ignore this line
22:51:50.896232 IP 104.16.42.63.https  keep this line
23:07:37.896232 IP 104.16.42.63.https  keep this line
00:51:50.896232 IP 104.16.42.63.https  keep this line; crossed midnight
01:24:37.896232 IP 104.16.42.63.https  keep this line
02:51:50.896232 IP 104.16.42.63.https  keep this line

$ offset_secs=$((4*60*60))                   # 4 hours

$ awk -v offset_secs="${offset_secs}" -f log.awk <(tail -1 sample.log) sample.log
22:51:50.896232 IP 104.16.42.63.https  keep this line
23:07:37.896232 IP 104.16.42.63.https  keep this line
00:51:50.896232 IP 104.16.42.63.https  keep this line; crossed midnight
01:24:37.896232 IP 104.16.42.63.https  keep this line
02:51:50.896232 IP 104.16.42.63.https  keep this line

Votes: 1

Stack Overflow user

Answered 2022-05-04 14:20:33

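egrep's regular expression support is limited. You can use [0-9] or [[:digit:]], but not \d. If you want \d, you can use Perl-style regular expressions with grep -P.

You can also tell grep to output only the matching text with -o.

It's worth noting that egrep and grep -E are synonymous; I'd recommend using grep -E explicitly, but that's just my preference.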

  -E, --extended-regexp     PATTERN is an extended regular expression (ERE)
  -P, --perl-regexp         PATTERN is a Perl regular expression
  -o, --only-matching       show only the part of a line matching PATTERN

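As for tail and head, it looks like you want a single line from each: the last line and the first line, respectively. By default they output 10 lines. This can be controlled with -n 1.

The date command fails because it doesn't know where to read its input from: in date -d +%s, the -d option takes +%s as the date string, and the piped text is never read. You can pass -f - to indicate that the input file is STDIN (see "Piping string to GNU date for conversion - how to make it read from stdin?").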
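
A minimal sketch of reading the time from a pipe with -f - is shown here. It assumes GNU date from coreutils (BSD/macOS date gives -f a different meaning), and the function name `time_to_epoch` is hypothetical.

```shell
#!/usr/bin/env bash
# Assumes GNU date (coreutils); BSD/macOS date gives -f a different meaning.

# Hypothetical helper: read one HH:MM:SS time from stdin and print epoch seconds.
# A bare time is interpreted as that time on the current day.
time_to_epoch() {
  date -f - +%s
}

echo '13:26:28' | time_to_epoch     # epoch seconds for 13:26:28 today
echo '13:26:28' | date -f - +%T     # round trip, prints 13:26:28
```

Because the date part is filled in with today's date, the resulting epoch values are only comparable within a single day.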

With all of that, the following should get you on your way.

START=$(grep -Eo "$NEW:[0-9]{2}:[0-9]{2}\.[0-9]+" "$1" | tail -n 1 | date +%s -f -)
END=$(grep -Eo "$OLD:[0-9]{2}:[0-9]{2}\.[0-9]+" "$1" | head -n 1 | date +%s -f -)

Tip: when troubleshooting a bash script, running it with bash -x gives much better insight into what is happening.

[root@91192da89fc4 temp]# bash -x date-orig.sh log
+ truncate -s 0 twoHour.log
++ tail -n1 log
++ cut -d : -f1
+ NEW=13
++ date -d 13 +%s
+ New=1651582800
+ OLD=11
++ date -d 11 +%s
+ New=1651575600
++ egrep '13\:\d\d\:\d\d' log
++ tail
++ date -d +%s
date: invalid date '+%s'
+ START=
++ egrep '11\:\d\d\:\d\d' log
++ head
++ date -d +%s
date: invalid date '+%s'
+ END=
+ read line

Votes: 0

Page content originally provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/72105051
