Delete a file in a directory if a search string is found in another file

Stack Overflow user
Asked 2016-05-06 18:48:33
Answers: 1 · Views: 50 · Followers: 0 · Votes: 0

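The bash script below is almost complete. The one part I am struggling with is this: if the string "The bam file is corrupted and has been removed, please check log for reason." is found in process.log, the corresponding .bam ($f) should be removed by the script. I added the following: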

echo "The bam file is corrupted and has been removed, please check log for reason."
             [[ -f "$f" ]] && rm -f "$f"

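That was my attempt at this, but it appears to remove the last .bam instead. In process.log, NA19240.bam is the file reported with the search string, yet it does not get removed. Instead NS12911.bam, the last .bam in process.log, is removed, even though the search string is not reported for it. I cannot seem to fix this and need some expert help. I apologize for the lengthy post; I just wanted to include all the details. Thank you :).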

bash

logfile=/home/cmccabe/Desktop/NGS/API/5-4-2016/process.log
for f in /home/cmccabe/Desktop/NGS/API/5-4-2016/*.bam ; do
 echo "Start bam validation creation: $(date) - File: $f"
 bname=`basename $f`
 pref=${bname%%.bam}
 bam validate --in $f --verbose 2> /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/${pref}_validation.txt
 echo "End bam validation creation: $(date) - File: $f"
done >> "$logfile"
for file in /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/*.txt ; do
 echo "Start verifying $(date) - File: $file"
 bname=`basename $file`
 if $(grep -iq "(SUCCESS)" "${file}"); then
    echo "The verification of the bam file has completed sucessfully."
else
    echo "The bam file is corrupted and has been removed, please check log for reason."
             [[ -f "$f" ]] && rm -f "$f"
    echo "End of bam file verification: $(date) - File: ${file}"
fi
done >> "$logfile"

process.log

 Start bam validation creation: Fri May  6 13:20:48 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA12878.bam
 End bam validation creation: Fri May  6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA12878.bam
 Start bam validation creation: Fri May  6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA19240.bam
 End bam validation creation: Fri May  6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA19240.bam
 Start bam validation creation: Fri May  6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NS12911.bam
 End bam validation creation: Fri May  6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NS12911.bam
 Start verifying Fri May  6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA12878_validation.txt
 The verification of the bam file has completed successfully.
 End of bam file verification: Fri May  6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA12878_validation.txt
 Start verifying Fri May  6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA19240_validation.txt
 The bam file is corrupted and has been removed, please check log for reason.
 End of bam file verification: Fri May  6 13:28:05 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA19240_validation.txt
 Start verifying Fri May  6 13:28:05 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NS12911_validation.txt
 The verification of the bam file has completed successfully.
 End of bam file verification: Fri May  6 13:28:05 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NS12911_validation.txt

1 Answer

Stack Overflow user

Accepted answer

Answered 2016-05-06 19:53:42

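It is a little difficult for me to replicate your environment completely, so I had to make some assumptions about your setup and your constraints. I think the process could be simplified or made more efficient in a number of ways, but I focused on getting the script to work without introducing too many unnecessary changes.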

That said, I did rearrange the processing so that each ${pref}_validation.txt is verified immediately after it is created.
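The behaviour described in the question is consistent with how bash treats loop variables: once the first for loop ends, $f keeps the value from its final iteration, so anything in the second loop that refers to $f points at the last .bam file. A minimal sketch of that effect, using the file names from the log purely as illustrative strings (no files are touched):

```shell
#!/bin/bash
# File names are taken from the question's log and used only as strings here.
for f in NA12878.bam NA19240.bam NS12911.bam; do
    : # first pass: this is where each "$f" would be validated
done

# The loop has finished, but f still holds the value of its last iteration.
last_seen="$f"

for file in NA12878_validation.txt NA19240_validation.txt NS12911_validation.txt; do
    # Nothing in this loop updates f, so it never corresponds to $file.
    echo "checking $file, but \$f is still $f"
done
```

Verifying each file inside the same iteration that validates it keeps $f and the validation file in step.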

Can you try the following (NOTE: updated script; I was going too fast the first time and copied the wrong version) and let me know what the result is:

#!/bin/bash

logfile="/home/cmccabe/Desktop/NGS/API/5-4-2016/process.log"

for f in /home/cmccabe/Desktop/NGS/API/5-4-2016/*.bam ; do
    echo "Start bam validation creation: $(date) - File: $f"
    bname="$(basename "$f")"
    pref="${bname%%.bam}"
    bam validate --in "$f" --verbose 2> "/home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/${pref}_validation.txt"
    echo "End bam validation creation: $(date) - File: $f"

    file="/home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/${pref}_validation.txt"

    echo "Start verifying $(date) - File: $file"

    if grep -iq "(SUCCESS)" "${file}"; then
        echo "The verification of the bam file has completed successfully."
    else
        if [[ -f "$f" ]]; then
            rm -f "$f"
            echo "The bam file is corrupted and has been removed, please check log for reason."
        fi
    fi

    echo "End of bam file verification: $(date) - File: ${file}"

done >> "$logfile"

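Hopefully combining the two steps in one for loop does not conflict with any of your process requirements. I find it helpful because it allows a more streamlined code flow, and the log file should now read as follows: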

Start bam validation creation: Fri May  6 13:20:48 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA12878.bam
End bam validation creation: Fri May  6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA12878.bam
Start verifying Fri May  6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA12878_validation.txt
The verification of the bam file has completed successfully.
End of bam file verification: Fri May  6 13:28:03 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/bam_validation/NA12878_validation.txt
Start bam validation creation: Fri May  6 13:24:15 CDT 2016 - File: /home/cmccabe/Desktop/NGS/API/5-4-2016/NA19240.bam
...
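If the two loops need to stay separate, another option is to derive the .bam path from each validation file name instead of relying on $f. This is a minimal sketch and not the approach taken in the script above; it assumes the directory layout from the question (validation files in a bam_validation subdirectory, named NAME_validation.txt), and bam_for_validation and remove_if_failed are hypothetical helper names:

```shell
#!/bin/bash
# bam_for_validation (hypothetical helper): maps
#   .../bam_validation/NAME_validation.txt  ->  .../NAME.bam
# assuming the directory layout used in the question.
bam_for_validation() {
    local file="$1"
    local dir name
    dir="$(dirname "$(dirname "$file")")"       # parent of bam_validation/
    name="$(basename "$file" _validation.txt)"  # strip the _validation.txt suffix
    printf '%s/%s.bam\n' "$dir" "$name"
}

# remove_if_failed (hypothetical helper): deletes the matching .bam
# when "(SUCCESS)" does not appear in the validation file.
remove_if_failed() {
    local file="$1"
    local bam
    bam="$(bam_for_validation "$file")"
    if grep -iq "(SUCCESS)" "$file"; then
        echo "The verification of the bam file has completed successfully."
    elif [ -f "$bam" ] && rm -f "$bam"; then
        echo "The bam file is corrupted and has been removed, please check log for reason."
    fi
}
```

Inside the second loop of the original script, the body of the if/else could then be replaced by a single call such as remove_if_failed "$file".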

Highly modified version

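I tried my hand at a highly streamlined and more resilient version of the script. I would be interested to hear how it works if you can try this one as well: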

#!/bin/bash

# basepath allows you to quickly move the script by updating this path
basepath="/home/cmccabe/Desktop/NGS/API/5-4-2016"

# give the logfile a name
logfile="${basepath}/process.log"

# for each .bam file in basepath do
for f in "${basepath}"/*.bam ; do

    # validate the file with the bam command
    # capture the stdout, stderr and return code via some crazy bash fu
    eval "$({ cmd_err=$({ cmd_out=$( \
        bam validate --in "$f" --verbose \
      ); cmd_rtn=$?; } 2>&1; declare -p cmd_out cmd_rtn >&2); declare -p cmd_err; } 2>&1)"

    # check the return code for positive completion
    if [ "${cmd_rtn}" -eq "0" ]; then
        printf -- "%s - bam validation completed for: %s\n" "$(date)" "${f}"

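        # check for string "(SUCCESS)" in bam command standard output
        # (the earlier scripts read this text from stderr via 2>; if bam
        # writes the report there, test "${cmd_err}" instead)
        if grep -iq "(SUCCESS)" <<< "${cmd_out}"; then
            printf -- "%s - Verification of the bam file has completed successfully.\n" "$(date)"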
        else
            # verify the bam file exists and can be deleted
            if [[ -f "$f" ]] && rm -f "$f" ; then
                printf -- "%s - The bam file is corrupted and has been removed, please check log for reason.\n" "$(date)"
            else
                printf -- "%s - WARNING: The bam file is corrupted but the file could not be deleted.\n" "$(date)"
            fi
        fi
    else
        # The bam validate command above did not complete with a
        # satisfactory result. This should not really ever happen unless
        # the bam command does not exist or some serious error occurred
        # when executing the bam command.
        # Consider additional actions beyond logging the outcome
        printf -- "%s - WARNING: bam validation failed for file: %s - [%s]\n" "$(date)" "${f}" "${cmd_err}"
    fi

done >> "$logfile"
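The eval line in the script above is dense, so here is a small sketch of the same capture pattern in isolation. The inner command substitution collects stdout into cmd_out, the enclosing one collects stderr into cmd_err, and declare -p passes all three variables out of the subshells as text that eval then executes. fake_validate is a hypothetical stand-in for the bam command, used only so the example runs anywhere bash is available:

```shell
#!/bin/bash
# fake_validate (hypothetical): stands in for `bam validate`.
# It writes one line to stdout, one line to stderr, and returns status 3.
fake_validate() {
    echo "report line"
    echo "error line" >&2
    return 3
}

# Same capture pattern as in the script above, applied to the stand-in.
eval "$({ cmd_err=$({ cmd_out=$( \
    fake_validate \
  ); cmd_rtn=$?; } 2>&1; declare -p cmd_out cmd_rtn >&2); declare -p cmd_err; } 2>&1)"

# All three values are now ordinary variables in the current shell.
printf 'stdout=%s stderr=%s status=%s\n' "$cmd_out" "$cmd_err" "$cmd_rtn"
```

Note that the status is stored in cmd_rtn, so that is the name any later test has to use.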
Votes: 1
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/37079438
