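I'm trying to write an automated script that takes the latest log entry and collects every log entry from the two hours before it, whether or not any entries exist in that span. The problem I keep running into is that every example I've found has a date attached to each entry, and mine don't. A sample of the log output: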
13:26:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
13:26:28.713687 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9287:9522, ack 13044, win 420, length 235
13:26:28.713766 IP term-IdeaPad-Flex.46364 > unn-37-19-198-173.datapacket.com.https: Flags [.], ack 9522, win 24576, length 0
13:26:28.840650 IP term-IdeaPad-Flex.46364 > unn-37-19-198-173.datapacket.com.https: Flags [.], seq 14286:15624, ack 9522, win 24576, length 1338
13:26:28.848949 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9522:9599, ack 14286, win 420, length 77
13:26:28.849002 IP term-IdeaPad-Flex.46364 > unn-37-19-198-173.datapacket.com.https: Flags [P.], seq 15624:15674, ack 9599, win 24576, length 50
13:26:28.849023 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9599:9743, ack 14286, win 420, length 144
13:26:28.849031 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9743:10269, ack 14286, win 420, length 526
So my timestamps are at the start of each line, with no date. date doesn't like this: it gives me a date: invalid date ‘+%s’ response and outputs nothing. My current attempt is:
#!/bin/bash
truncate -s 0 twoHour.log
NEW=$(tail -n1 $1 | cut -d ":" -f1)
# echo $NEW
New=$(date -d "$NEW" +%s)
OLD=$(($NEW-2))
New=$(date -d "$OLD" +%s)
# echo $OLD
START=$(egrep "$NEW\:\d\d\:\d\d" $1 | tail | date -d +%s)
END=$(egrep "$OLD\:\d\d\:\d\d" $1 | head | date -d +%s)
while read line; do
# Extract the date for each line.
# First strip off everything up to the first "[".
# Then remove everything after the first "]".
# Finally, straighten up the format with the cleandate function
date="${date%%.*}"
date=$( cleandate "$date" )
# If the date falls between d1 and d2, print it
if [[ $date -ge $START && $date -le $END ]]; then
echo "$line"
fi
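done
NEW and OLD hold the extracted hours. START and END are the boundaries, and everything between them is output line by line. $1 is the log file.
I've been modifying bash/awk scripts and searching for ready-made scripts for several hours now, and I can't figure out how to make this work.
Answered 2022-05-03 23:00:29
sed can be used to extract a range of lines selected by regular expressions.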
/^11:.*$/,/^13:26:28.849031 .*$/p
The first address can be refined further by taking the minutes into account and writing the expression as /^11:(2[6-9]|[3-5][0-9]).*$/,/^13:26:28.849031 .*$/p.
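As a minimal sketch of the address-range form described above: the helper name print_range and the demo lines are hypothetical, made up for illustration, and the -E flag assumes a sed that supports extended regular expressions (GNU sed does).

```bash
#!/bin/bash
# print_range START_REGEX END_REGEX FILE  (hypothetical helper name)
# Prints every line from the first match of START_REGEX through the
# next match of END_REGEX, both inclusive, using a sed address range.
print_range() {
    local start_re=$1 end_re=$2 file=$3
    sed -n -E "/${start_re}/,/${end_re}/p" "$file"
}

# Made-up demo data, written to a temporary file.
demo=$(mktemp)
printf '%s\n' \
    '10:59:59.000001 before the window' \
    '11:26:00.000002 first line in the window' \
    '12:00:00.000003 middle of the window' \
    '13:26:28.849031 last line in the window' > "$demo"

# Start at the first line stamped 11:26 or later within hour 11,
# stop at the line carrying the exact end timestamp.
print_range '^11:(2[6-9]|[3-5][0-9])' '^13:26:28\.849031 ' "$demo"
rm -f "$demo"
```

Note that a sed range only opens when a line actually matches the first address, so if no entry falls in the starting hour at or after the chosen minute, nothing is printed.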
last_line=$(tail -n1 test.txt)
end_time=$(cut -d ' ' -f1 <<<"$last_line")
end_hour="${end_time:0:2}"
min_msb="${end_time:3:1}"
min_next=$(($min_msb+1))
min_lsb="${end_time:4:1}"
start_hour=$(($end_hour-2))
if [ "$min_msb" -lt 5 ];then
min_next=$(($min_msb+1))
else
min_next=5
fi
sed -rn "/^$start_hour:($min_msb[$min_lsb-9]|[$min_next-5][0-9]).*$/,/^$end_time .*$/p" test.txt
If the time span crosses the 24-hour boundary (midnight):
22:57:46.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
23:26:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
...
00:36:28.849031 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9743:10269, ack 14286, win 420, length 526
then:
UPDATE: Fixed the regular expression for sed's first address when the time range crosses midnight.
hour_range=2
last_line=$(tail -n1 test.txt)
#end_time=$(cut -d ' ' -f1 <<<"$last_line")
end_time="${last_line:0:8}"
start_time="$(date -d "$(date -d "$end_time" --iso=seconds) -$hour_range hour" '+%T')"
echo "Time range: $start_time - $end_time"
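# Keep the zero-padded substrings as they are: printf "%d" rejects 08 and 09
# as invalid octal, and the regex must match the padded hour as it appears in the log.
end_hour="${end_time:0:2}"
min_msb="${end_time:3:1}"
min_lsb="${end_time:4:1}"
start_hour="${start_time:0:2}"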
if [ "$min_msb" -lt 5 ];then
min_next=$(($min_msb+1))
else
min_next=5
fi
# Crossed midnight
start_hour_expr="$start_hour:($min_msb[$min_lsb-9]|[$min_next-5][0-9])"
if [ "$start_hour" -gt "$end_hour" ];then
start_hour_lsb_next=$((${start_hour:1:1} + 1))
start_hour_next="${start_hour:0:1}${start_hour_lsb_next}"
if [ "$start_hour_next" -eq 24 ]; then
start_hour_next="00"
fi
start_hour_expr="($start_hour_expr|$start_hour_next:[0-5][0-9])"
fi
echo "sed expression:"
echo -e "/^$start_hour_expr.*$/,/^$end_time.*$/p \n"
sed -rn "/^$start_hour_expr.*$/,/^$end_time.*$/p" test.txt
Given:
21:32:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
21:57:46.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
22:10:46.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
23:07:46.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
23:26:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
00:26:28.849023 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9599:9743, ack 14286, win 420, length 144
00:36:28.849031 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9743:10269, ack 14286, win 420, length 526
this returns:
sed expression:
/^(22:(3[6-9]|[4-5][0-9])|23:[0-5][0-9]).*$/,/^00:36:28.*$/p
23:07:46.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
23:26:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
00:26:28.849023 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9599:9743, ack 14286, win 420, length 144
00:36:28.849031 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9743:10269, ack 14286, win 420, length 526
Answered 2022-05-04 00:43:41
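Assumptions:
- every log line begins with a time (format HH:MM:SS)

Plan:
- convert the time span (e.g. 2 hrs) to seconds; we'll call this offset_secs
- take the time from the last line of the log; we'll call this last_time
- convert last_time to epoch seconds; we'll call this last_epoch
- subtract offset_secs from last_epoch; we'll call this first_epoch
- convert first_epoch back to an HH:MM:SS string; we'll call this first_time
- in awk, save the lines that fall within the time range in an array; during END processing, print the array of lines to stdout

One GNU awk idea: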
$ cat log.awk
BEGIN { FS="." } # set input field delimiter to "."
# first line of input is last line of log file; grab time and calculate the offset/start time
NR==1 { last_time = $1
last_epoch = mktime( strftime("%Y %m %d") " " gensub(/:/," ","g",last_time))
first_epoch = last_epoch - offset_secs
first_time = strftime("%H:%M:%S", first_epoch)
if (first_time > last_time)
spans_midnight=1
next
}
# for the rest of the input lines determine if the time falls within the last "offset_secs"
{ curr_time = $1
if ( ( spans_midnight && curr_time >= first_time) ||
( spans_midnight && curr_time <= last_time) ||
( !spans_midnight && curr_time >= first_time && curr_time <= last_time) )
lines[++cnt]=$0
else { # outside the time range so ...
delete lines # delete anything saved up to this point and ...
cnt=0 # reset the array index
}
}
END { for (i=1;i<=cnt;i++) # print the lines that occurred within the last "offset_secs"
print lines[i]
}
NOTE: for more details on the mktime() and strftime() functions, see GNU awk: Time Functions
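The time arithmetic in the plan can also be sketched in plain bash, as a way to see what the awk script computes. This is not the answer's method: it swaps mktime()/strftime() for seconds-since-midnight arithmetic with a wrap at 86400, and the function names to_secs, from_secs, first_time and in_window are hypothetical. It assumes the fractional seconds have already been stripped (for example with ${t%%.*}).

```bash
#!/bin/bash
# to_secs HH:MM:SS -> seconds since midnight.
# The 10# prefix forces base 10 so "08" and "09" are not read as octal.
to_secs() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# from_secs N -> zero-padded HH:MM:SS
from_secs() {
    printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# first_time LAST_TIME OFFSET_SECS -> start of the window, wrapped at midnight.
first_time() {
    local last
    last=$(to_secs "$1")
    from_secs $(( ( (last - $2) % 86400 + 86400 ) % 86400 ))
}

# in_window CURR FIRST LAST -> exit status 0 if CURR lies inside the window.
# Zero-padded HH:MM:SS strings sort the same way as the times they represent,
# so plain string comparison is enough (the awk script relies on this too).
in_window() {
    local curr=$1 first=$2 last=$3
    if [[ $first > $last ]]; then            # window spans midnight
        [[ ! $curr < $first || ! $curr > $last ]]
    else
        [[ ! $curr < $first && ! $curr > $last ]]
    fi
}

first_time 02:51:50 $(( 4 * 60 * 60 ))    # prints 22:51:50
first_time 11:51:50 $(( 2 * 60 * 60 ))    # prints 09:51:50
```

Like the awk version, in_window alone cannot tell one day from another; the awk script handles that by discarding its saved lines whenever it meets a line outside the window.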
Test #1: 2-hour window; window does not cross midnight; file crosses midnight
$ cat sample.log
22:22:00.896232 IP 104.16.42.63.https ignore this line
06:22:00.896232 IP 104.16.42.63.https ignore this line; crossed midnight
07:22:00.896232 IP 104.16.42.63.https ignore this line
09:23:00.896232 IP 104.16.42.63.https ignore this line
09:51:49.896232 IP 104.16.42.63.https ignore this line
09:51:50.896232 IP 104.16.42.63.https keep this line
10:24:37.896232 IP 104.16.42.63.https keep this line
11:51:50.896232 IP 104.16.42.63.https keep this line
$ offset_secs=$((2*60*60)) # 2 hours
$ awk -v offset_secs="${offset_secs}" -f log.awk <(tail -1 sample.log) sample.log
09:51:50.896232 IP 104.16.42.63.https keep this line
10:24:37.896232 IP 104.16.42.63.https keep this line
11:51:50.896232 IP 104.16.42.63.https keep this line
Test #2: 4-hour window; window crosses midnight; file crosses multiple midnights
$ cat sample.log
20:22:00.896232 IP 104.16.42.63.https ignore this line
23:22:00.896232 IP 104.16.42.63.https ignore this line
01:22:00.896232 IP 104.16.42.63.https ignore this line; crossed midnight
23:22:00.896232 IP 104.16.42.63.https ignore this line
01:22:00.896232 IP 104.16.42.63.https ignore this line; crossed midnight
06:22:00.896232 IP 104.16.42.63.https ignore this line
07:22:00.896232 IP 104.16.42.63.https ignore this line
09:23:00.896232 IP 104.16.42.63.https ignore this line
22:51:49.896232 IP 104.16.42.63.https ignore this line
22:51:50.896232 IP 104.16.42.63.https keep this line
23:07:37.896232 IP 104.16.42.63.https keep this line
00:51:50.896232 IP 104.16.42.63.https keep this line; crossed midnight
01:24:37.896232 IP 104.16.42.63.https keep this line
02:51:50.896232 IP 104.16.42.63.https keep this line
$ offset_secs=$((4*60*60)) # 4 hours
$ awk -v offset_secs="${offset_secs}" -f log.awk <(tail -1 sample.log) sample.log
22:51:50.896232 IP 104.16.42.63.https keep this line
23:07:37.896232 IP 104.16.42.63.https keep this line
00:51:50.896232 IP 104.16.42.63.https keep this line; crossed midnight
01:24:37.896232 IP 104.16.42.63.https keep this line
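02:51:50.896232 IP 104.16.42.63.https keep this line
Answered 2022-05-04 14:20:33
egrep's regular-expression support is limited. You can use [0-9] or [[:digit:]], but not \d. If you want \d, you can use Perl-style regex with grep -P.
You can also tell grep to output only the matching text with -o.
It's worth noting that egrep and grep -E are synonymous; I'd suggest using grep -E explicitly, but that's just my preference.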
-E, --extended-regexp PATTERN is an extended regular expression (ERE)
-P, --perl-regexp PATTERN is a Perl regular expression
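-o, --only-matching show only the part of a line matching PATTERN
For tail and head, it looks like you want a single line from each: the last line and the first line, respectively. By default they output 10 lines; this can be controlled with -n 1.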
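A small sketch of these points together: grep -E with [0-9] in place of \d, -o to keep only the match, and -n 1 to keep a single line. The function name first_and_last_time and the demo lines are hypothetical.

```bash
#!/bin/bash
# first_and_last_time FILE  (hypothetical helper name)
# Prints the first and then the last leading HH:MM:SS timestamp in FILE.
first_and_last_time() {
    local file=$1
    # -E: extended regex, so {2} works unescaped; [0-9] stands in for \d.
    # -o: print only the matched text rather than the whole line.
    grep -Eo '^[0-9]{2}:[0-9]{2}:[0-9]{2}' "$file" | head -n 1
    grep -Eo '^[0-9]{2}:[0-9]{2}:[0-9]{2}' "$file" | tail -n 1
}

# Made-up demo data, written to a temporary file.
demo=$(mktemp)
printf '%s\n' \
    '11:00:01.000001 IP first entry' \
    '12:15:30.000002 IP middle entry' \
    '13:26:28.849031 IP last entry' > "$demo"

first_and_last_time "$demo"
rm -f "$demo"
```

With the demo data this prints 11:00:01 followed by 13:26:28.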
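The date command fails because -d takes the next argument, +%s, as the date string, and date does not read from STDIN by default. You can specify -f - to indicate that the input file is STDIN (see: Piping a string to GNU date for conversion - how to make it read from stdin?)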
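To illustrate the -f - form, here is a minimal sketch that assumes GNU date; the wrapper name time_to_epoch is hypothetical. A bare HH:MM:SS is interpreted as that time on the current day, so the epoch value changes from day to day, while the round trip through +%T does not.

```bash
#!/bin/bash
# time_to_epoch  (hypothetical helper name)
# Reads one time string per line on stdin and prints epoch seconds for each.
# Assumes GNU date: -f - means "read the date strings from stdin".
time_to_epoch() {
    date -f - +%s
}

# Epoch seconds for 13:26:28 today (the value depends on the current date).
echo '13:26:28' | time_to_epoch

# Round trip back to a time of day, which does not depend on the date.
echo '13:26:28' | date -f - +%T    # prints 13:26:28
```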
With that, the following should get you on your way.
START=$(egrep -o "$NEW:[0-9]{2}:[0-9]{2}\.[0-9]+" $1 | tail -n 1 | date +%s -f -)
END=$(egrep -o "$OLD:[0-9]{2}:[0-9]{2}\.[0-9]+" $1 | head -n 1| date +%s -f -)
Tip: running a bash script with bash -x while troubleshooting gives a much better picture of what is happening.
[root@91192da89fc4 temp]# bash -x date-orig.sh log
+ truncate -s 0 twoHour.log
++ tail -n1 log
++ cut -d : -f1
+ NEW=13
++ date -d 13 +%s
+ New=1651582800
+ OLD=11
++ date -d 11 +%s
+ New=1651575600
++ egrep '13\:\d\d\:\d\d' log
++ tail
++ date -d +%s
date: invalid date '+%s'
+ START=
++ egrep '11\:\d\d\:\d\d' log
++ head
++ date -d +%s
date: invalid date '+%s'
+ END=
+ read line
https://stackoverflow.com/questions/72105051