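The bash below is almost complete. The only part I am struggling with is this: when the string The bam file is corrupted and has been removed, please check log for reason. is written to process.log, the corresponding .bam ($f) should be deleted. I added:
echo "The bam file is corrupted and has been removed, please check log for reason."
[[ -f "$f" ]] && rm -f "$f"
in an attempt to do this, but the wrong file seems to be removed. In process.log, NA19240.bam is the file flagged with the search string, yet it is not deleted. Instead, the last .bam in process.log (NS12911) is removed, even though the search string is not there for it. I cannot work this out and need some expert help. I apologize for the lengthy post, I just wanted to include all the details. Thank you :).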
bash
logfile=/home/cmccabe/Desktop/NGS/API/5-4-2016/process.log
for f in /home/cmccabe/Desktop/NGS/API/5-4-2016/*.bam ; do
echo "Start bam validation creation: $(date) - File: $f"
bname=`basename $f`
pref=${bname%%.bam}
bam validate --in $f --verbose 2> /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/${pref}_validation.txt
echo "End bam validation creation: $(date) - File: $f"
done >> "$logfile"
for file in /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/*.txt ; do
echo "Start verifying $(date) - File: $file"
bname=`basename $file`
if $(grep -iq "(SUCCESS)" "${file}"); then
echo "The verification of the bam file has completed successfully."
else
echo "The bam file is corrupted and has been removed, please check log for reason."
[[ -f "$f" ]] && rm -f "$f"
echo "End of bam file verification: $(date) - File: ${file}"
fi
done >> "$logfile"
process.log
Start bam validation creation: Fri May 6 13:20:48 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA12878.bam
End bam validation creation: Fri May 6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA12878.bam
Start bam validation creation: Fri May 6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA19240.bam
End bam validation creation: Fri May 6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA19240.bam
Start bam validation creation: Fri May 6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NS12911.bam
End bam validation creation: Fri May 6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NS12911.bam
Start verifying Fri May 6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA12878_validation.txt
The verification of the bam file has completed successfully.
End of bam file verification: Fri May 6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA12878_validation.txt
Start verifying Fri May 6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA19240_validation.txt
The bam file is corrupted and has been removed, please check log for reason.
End of bam file verification: Fri May 6 13:28:05 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA19240_validation.txt
Start verifying Fri May 6 13:28:05 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NS12911_validation.txt
The verification of the bam file has completed successfully.
End of bam file verification: Fri May 6 13:28:05 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NS12911_validation.txt
Posted on 2016-05-06 19:53:42
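It is a bit hard for me to replicate your environment exactly, so I had to make some assumptions about your setup and your constraints. I think the process could be simplified or made more efficient in a number of ways, but my main focus was on getting the script to work rather than on introducing many unnecessary changes.
That said, I did rearrange the processing so that each ${pref}_validation.txt is verified immediately after it is created.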
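The reason for the rearrangement: in the original two-loop layout, $f still holds whatever the first loop visited last, so the rm in the second loop always targets that last .bam. If the two loops had to stay separate, one option would be to derive the .bam path from each validation file name instead. The following is only a sketch under assumptions: it runs in a throwaway mktemp directory, the file contents are placeholders rather than real bam validate output, and the names workdir, last_f, removed, and remaining are made up for the illustration.

```shell
#!/bin/bash
# Scratch directory mimicking the layout from the question (hypothetical).
workdir="$(mktemp -d)"
mkdir "$workdir/bam_validation"
touch "$workdir/NA12878.bam" "$workdir/NA19240.bam" "$workdir/NS12911.bam"

# First loop: create one validation file per .bam.
for f in "$workdir"/*.bam; do
    bname="$(basename "$f")"
    pref="${bname%.bam}"
    # Placeholder for the validation step: only NA19240 lacks "(SUCCESS)".
    if [[ "$pref" == "NA19240" ]]; then
        echo "placeholder failure text" > "$workdir/bam_validation/${pref}_validation.txt"
    else
        echo "(SUCCESS)" > "$workdir/bam_validation/${pref}_validation.txt"
    fi
done

# After the loop, f still points at the LAST file the loop visited.
last_f="$(basename "$f")"
echo "f after the first loop: $last_f"

# Second loop: map each validation file back to its own .bam.
removed=""
for file in "$workdir"/bam_validation/*_validation.txt; do
    vname="$(basename "$file")"
    bam="$workdir/${vname%_validation.txt}.bam"
    if ! grep -iq "(SUCCESS)" "$file"; then
        [[ -f "$bam" ]] && rm -f "$bam" && removed="$(basename "$bam")"
    fi
done
echo "removed: $removed"

# List what is left, then clean up the scratch directory.
remaining=""
for b in "$workdir"/*.bam; do
    remaining+="$(basename "$b") "
done
echo "remaining: $remaining"
rm -rf "$workdir"
```

Here the stale $f names NS12911.bam, while the file actually removed is the one whose validation text lacks "(SUCCESS)".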
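Could you try the following and tell me what the result is (note: the script has been updated; the first time around I went too fast and copied the wrong version):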
#!/bin/bash
logfile="/home/cmccabe/Desktop/NGS/API/5-4-2016/process.log"
for f in /home/cmccabe/Desktop/NGS/API/5-4-2016/*.bam ; do
echo "Start bam validation creation: $(date) - File: $f"
bname="$(basename "$f")"
pref="${bname%%.bam}"
bam validate --in "$f" --verbose 2> "/home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/${pref}_validation.txt"
echo "End bam validation creation: $(date) - File: $f"
file="/home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/${pref}_validation.txt"
echo "Start verifying $(date) - File: $file"
if grep -iq "(SUCCESS)" "${file}"; then
echo "The verification of the bam file has completed successfully."
else
if [[ -f "$f" ]]; then
rm -f "$f"
echo "The bam file is corrupted and has been removed, please check log for reason."
fi
fi
echo "End of bam file verification: $(date) - File: ${file}"
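done >> "$logfile"
I hope that combining the two steps in a single for loop does not conflict with any of your process requirements. I find it helpful because it allows a more streamlined code flow, and the log file should now read as follows: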
Start bam validation creation: Fri May 6 13:20:48 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA12878.bam
End bam validation creation: Fri May 6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA12878.bam
Start verifying Fri May 6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA12878_validation.txt
The verification of the bam file has completed successfully.
End of bam file verification: Fri May 6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA12878_validation.txt
Start bam validation creation: Fri May 6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA19240.bam
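...
Highly modified version
I took a shot at a highly streamlined and more resilient version of the script. I would be interested if you could check this one as well: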
#!/bin/bash
# basepath allows you to quickly move the script by updating this path
basepath="/home/cmccabe/Desktop/NGS/API/5-4-2016"
# give the logfile a name
logfile="${basepath}/process.log"
# for each .bam file in basepath do
for f in "${basepath}"/*.bam ; do
# validate the file with the bam command
# capture the stdout, stderr and return code via some crazy bash fu
eval "$({ cmd_err=$({ cmd_out=$( \
bam validate --in "$f" --verbose \
); cmd_rtn=$?; } 2>&1; declare -p cmd_out cmd_rtn >&2); declare -p cmd_err; } 2>&1)"
# check the return code for positive completion
if [ "${cmd_rtn}" -eq "0" ]; then
printf -- "%s - bam validation completed for: %s\n" "$(date)" "${f}"
# check for string "(SUCCESS)" in bam command standard output
if grep -iq "(SUCCESS)" <<< "${cmd_out}"; then
printf -- "%s - Verification of the bam file has completed successfully.\n" "$(date)"
else
# verify the bam file exists and can be deleted
if [[ -f "$f" ]] && rm -f "$f" ; then
printf -- "%s - The bam file is corrupted and has been removed, please check log for reason.\n" "$(date)"
else
printf -- "%s - WARNING: The bam file is corrupted but the file could not be deleted.\n" "$(date)"
fi
fi
else
# The bam validate command above did not complete with a
# satisfactory result. This should not really ever happen unless
# the bam command does not exist or some serious error occurred
# when executing the bam command.
# Consider additional actions besides logging the outcome
printf -- "%s - WARNING: bam validation failed for file: %s - [%s]\n" "$(date)" "${f}" "${cmd_err}"
fi
done >> "$logfile"
https://stackoverflow.com/questions/37079438
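The eval line in the script above is dense, so here is the same capture pattern in isolation. This is a sketch only: demo_cmd is a hypothetical stand-in for the real command, made up so the example runs anywhere. The pattern stores stdout in cmd_out, stderr in cmd_err, and the exit status in cmd_rtn, so any later check has to test exactly those names.

```shell
#!/bin/bash
# Hypothetical stand-in command: writes to stdout and stderr, returns 3.
demo_cmd() {
    echo "to stdout"
    echo "to stderr" >&2
    return 3
}

# Inner $(...) captures stdout into cmd_out; the group's 2>&1 routes the
# command's stderr into the middle $(...), which becomes cmd_err.
# declare -p prints the variables as re-evaluable assignments, and eval
# brings them back into the current shell.
eval "$({ cmd_err=$({ cmd_out=$(demo_cmd); cmd_rtn=$?; } 2>&1; declare -p cmd_out cmd_rtn >&2); declare -p cmd_err; } 2>&1)"

printf 'stdout=%s\nstderr=%s\nrc=%s\n' "$cmd_out" "$cmd_err" "$cmd_rtn"
```

One design consequence worth keeping in mind: because the assignments arrive through declare, running this inside a function makes the three variables local to that function.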