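I need to write a shell script that picks up all the files (not directories) in the /exp/files directory. For each file, I want to know whether the last line of the file has been received. The last line of the file is a trailer record. The third field of the last line is the number of data records, e.g. 2315 (the total number of lines in the file minus 2, for the header and trailer). In my unix shell script, I want to confirm that the last line is a trailer record by checking for T, and check that the number of lines in the file equals (2315+2). If both checks succeed, I want to move the file to another directory, /exp/ready.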
tail -1 test.csv
T,Test.csv,2315,80045.96
Also, in the input file, the fields of the trailer record are sometimes enclosed in double quotes (all, some, or none of them):
"T","Test.csv","2315","80045.96"
"T", Test.csv, 2212,"80045.96"
T,Test.csv,2315,80045.96
Answered 2010-02-22 17:02:20
You can test whether the trailer line is present with:
tail -1 ${filename} | egrep '^T,|^"T",' >/dev/null 2>&1
rc=$?
At this point, $rc will be 0 if the line starts with T, or "T", (assuming that is enough to identify a trailer record).
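The trailer test above can be wrapped in a small function so that its exit status is used directly instead of going through $rc. This is only a sketch: the function name `has_trailer` is hypothetical, and `grep -Eq` stands in for `egrep` with output redirection.

```shell
# has_trailer FILE
# Succeeds (exit status 0) when the last line of FILE starts with T, or "T",
# The function name is an illustrative assumption, not part of the original answer.
has_trailer() {
    tail -n 1 "$1" | grep -Eq '^(T|"T"),'
}
```

It can then be used as `if has_trailer "$filename"; then ...; fi`.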
Once you have established that, you can get the actual line count with:
lc=$(cat ${filename} | wc -l)
You can get the expected line count with:
elc=$(tail -1 ${filename} | awk -F, '{sub(/^"/,"",$3);print 2+$3}')
and compare the two.
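The comparison of the two counts might be sketched as a function like the one below. The name `count_matches` is hypothetical. It assumes the file ends with a newline (otherwise `wc -l` reports one line fewer), and it is meant to run after the trailer check has already passed.

```shell
# count_matches FILE
# Succeeds when the number of lines in FILE equals trailer field 3 plus 2
# (one header line and one trailer line). Hypothetical helper name.
# The body runs in a subshell so its variables do not leak to the caller.
count_matches() (
    actual=$(wc -l < "$1")
    # Strip any double quotes from field 3 of the last line, then add 2.
    expected=$(tail -n 1 "$1" | awk -F, '{ gsub(/"/, "", $3); print $3 + 2 }')
    # Guard against an empty file, where no expected value is produced.
    [ -n "$expected" ] && [ "$actual" -eq "$expected" ]
)
```

For a three-line file whose trailer says 1, both counts are 3 and the function succeeds; if the trailer says 9, the expected count is 11 and it fails.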
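So, putting all of that together, this would be a good start. It prints the file itself (my test files are num[1-9].tst) followed by a message saying whether the file is okay or why it is not.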
#!/bin/bash
cd /exp/files
for fspec in *.tst ; do
    if [[ -f ${fspec} ]] ; then
        cat ${fspec} | sed 's/^/ /'
        tail -1 ${fspec} | egrep '^T,|^"T",' >/dev/null 2>&1
        rc=$?
        if [[ ${rc} -eq 0 ]] ; then
            lc=$(cat ${fspec} | wc -l)
            elc=$(tail -1 ${fspec} | awk -F, '{sub(/^"/,"",$3);print 2+$3}')
            if [[ ${lc} -eq ${elc} ]] ; then
                echo '***' File ${fspec} is done and dusted.
            else
                echo '***' File ${fspec} line count mismatch: ${lc}/${elc}.
            fi
        else
            echo '***' File ${fspec} has no valid trailer.
        fi
    else
        ls -ald ${fspec} | sed 's/^/ /'
        echo '***' File ${fspec} is not a regular file.
    fi
done
Sample run, showing the test files I used:
H,Test.csv,other rubbish goes here
this file does not have a trailer
*** File num1.tst has no valid trailer.
H,Test.csv,other rubbish goes here
this file does have a trailer with all quotes and correct count
"T","Test.csv","1","80045.96"
*** File num2.tst is done and dusted.
H,Test.csv,other rubbish goes here
this file does have a trailer with all quotes but bad count
"T","Test.csv","9","80045.96"
*** File num3.tst line count mismatch: 3/11.
H,Test.csv,other rubbish goes here
this file does have a trailer with all quotes except T, and correct count
T,"Test.csv","1","80045.96"
*** File num4.tst is done and dusted.
H,Test.csv,other rubbish goes here
this file does have a trailer with no quotes on T or count and correct count
T,"Test.csv",1,"80045.96"
*** File num5.tst is done and dusted.
H,Test.csv,other rubbish goes here
this file does have a traier with quotes on T only, and correct count
"T",Test.csv,1,80045.96
*** File num6.tst is done and dusted.
drwxr-xr-x+ 2 pax None 0 Feb 23 09:55 num7.tst
*** File num7.tst is not a regular file.
H,Test.csv,other rubbish goes here
this file does have a trailer with all quotes except the bad count
"T","Test.csv",8,"80045.96"
*** File num8.tst line count mismatch: 3/10.
H,Test.csv,other rubbish goes here
this file does have a trailer with no quotes and a bad count
T,Test.csv,7,80045.96
*** File num9.tst line count mismatch: 3/9.
Answered 2010-02-22 16:37:21
If you want to move the files once they have been written and closed, you should consider using a tool such as inotify, incron, FAM, or gamin.
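As a rough illustration of the inotify route, the sketch below assumes Linux with `inotifywait` from the inotify-tools package installed. The function names `is_complete`, `move_if_complete`, and `watch_and_move` are hypothetical, and the validation is a compact stand-in for the trailer and line-count checks discussed in this thread.

```shell
# is_complete FILE: succeed when the last line is a T trailer and the
# total line count equals trailer field 3 plus 2. Hypothetical helper.
is_complete() {
    awk -F, '{ f1 = $1; f3 = $3 }
        END { gsub(/"/, "", f1); gsub(/"/, "", f3)
              exit !(NR > 0 && f1 == "T" && NR == f3 + 2) }' "$1"
}

# move_if_complete FILE DEST: move a regular, complete file into DEST.
move_if_complete() {
    [ -f "$1" ] && is_complete "$1" && mv -- "$1" "$2"
}

# watch_and_move SRC DEST: runs until interrupted, handling each file in
# SRC when its writer closes it. Assumes inotifywait is available.
watch_and_move() {
    inotifywait -m -q -e close_write --format '%w%f' "$1" |
    while IFS= read -r path; do
        move_if_complete "$path" "$2"
    done
}
```

A call such as `watch_and_move /exp/files /exp/ready` would then react to files as they are closed rather than polling the directory.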
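Answered 2010-02-22 18:46:29
This code does all of the logic in a single call to awk, which makes it very efficient. It also does not hard-code the example value of 2315; it uses the value contained in the trailer line, as I believe that was your intent.
Remember to remove the echo once you are satisfied with the results.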
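#!/bin/bash
for file in /exp/files/*; do
    if [[ -f "$file" ]]; then
        # Save each line's fields so END sees the last line; strip quotes from the saved fields before comparing.
        if nawk -F, '{v0=$0;v1=$1;v3=$3}END{gsub(/"/,"",v1);gsub(/"/,"",v3);exit !(v1 == "T" && NR == v3+2)}' "$file"; then
            echo mv "$file" /exp/ready
        fi
    fi
done
Update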
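I had to add {v0=$0;v1=$1;v3=$3} because the SunOS implementation of awk does not support accessing field variables ($0, $1, $2, etc.) inside END{}. So if you want to work with them inside END{}, you have to save them to user-defined variables first. See the last row of the first table in this awk feature comparison link.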
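The save-then-use pattern can be seen in isolation in the sketch below. The function name `last_count` is hypothetical. The main rule overwrites a user variable on every line, so by the time END runs it holds the value from the last line, without relying on $3 still being set there.

```shell
# last_count FILE: print the record count from field 3 of the last line,
# with any double quotes removed. Hypothetical helper name.
last_count() {
    awk -F, '
        { f3 = $3 }    # overwritten on every line, so END sees the last one
        END { gsub(/"/, "", f3); print f3 + 0 }
    ' "$1"
}
```

Adding 0 forces a numeric result, so a field with a leading space is printed as a plain number.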
https://stackoverflow.com/questions/2309673