I have been racking my brain over something that is probably easy for anyone with more Perl experience than I have.
I have the following code.
use strict;
use warnings;
my @lines = do {
    open my $in_fh, '<', 'input.txt' or die qq{Unable to open "input.txt" for input: $!};
    <$in_fh>;
};
chomp @lines;
my $re = join '|', @lines;
my @files = grep /^(?:$re)/, glob '*.bam';
$_ = "INPUT=$_" for @files;
foreach my $file (@files) {
    foreach my $line (@lines) {
        if ($file =~ m/$line/) {
            my $command = "picard MergeSamFiles $file OUTPUT=$line" . "-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE";
            system($command);
            my $command2 = "picard MarkDuplicates $line OUTPUT=$line-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE";
            system($command2);
            unlink "$line-tmp-herc2.bam";
            unlink "$line-tmp-herc2.bai";
            unlink "tmp";
        }
    }
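}

In input.txt I have the sample names, which are used to check whether each sample is present in the directory. In this example I use only two samples.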
HG00096
HG00117

So, with the code above, I get something like this:
picard MergeSamFiles INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.exome.20120522.bam_herc2_data.bam OUTPUT=HG00096-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00096 OUTPUT=HG00096-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam_herc2_phase1.bam OUTPUT=HG00096-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00096 OUTPUT=HG00096-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam_herc2_data.bam OUTPUT=HG00096-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00096 OUTPUT=HG00096-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00096.mapped.illumina.mosaik.GBR.exome.20110411.bam_herc2_phase1.bam OUTPUT=HG00096-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00096 OUTPUT=HG00096-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.exome.20120522.bam_herc2_data.bam OUTPUT=HG00117-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00117 OUTPUT=HG00117-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam_herc2_phase1.bam OUTPUT=HG00117-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00117 OUTPUT=HG00117-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam_herc2_data.bam OUTPUT=HG00117-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00117 OUTPUT=HG00117-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00117.mapped.illumina.mosaik.GBR.exome.20110411.bam_herc2_phase1.bam OUTPUT=HG00117-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00117 OUTPUT=HG00117-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE

What I actually want is something like this:
picard MergeSamFiles INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.exome.20120522.bam_herc2_data.bam INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam_herc2_phase1.bam INPUT=HG00096.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam_herc2_data.bam INPUT=HG00096.mapped.illumina.mosaik.GBR.exome.20110411.bam_herc2_phase1.bam OUTPUT=HG00096-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
picard MarkDuplicates HG00096-tmp-herc2.bam OUTPUT=HG00096-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE
picard MergeSamFiles INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.exome.20120522.bam_herc2_data.bam INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.low_coverage.20101123.bam_herc2_phase1.bam INPUT=HG00117.mapped.ILLUMINA.bwa.GBR.low_coverage.20120522.bam_herc2_data.bam INPUT=HG00117.mapped.illumina.mosaik.GBR.exome.20110411.bam_herc2_phase1.bam OUTPUT=HG00117-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE
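picard MarkDuplicates HG00117-tmp-herc2.bam OUTPUT=HG00117-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE

So all the INPUT arguments for a sample should be grouped into a single call, so that the system call for $command merges the files and produces the OUTPUT file that $command2 then uses as its source.
I know the problem is in how I have set up the foreach loops, but I cannot work out how to iterate correctly, and I am stuck.
I hope you can help me solve this.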
Answered 2015-04-06 23:01:13
In the first command, you append the suffix to the output file name:
my $command = "picard MergeSamFiles $file OUTPUT=$line" . "-tmp-herc2.bam MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE";
# here ___^_______________^

Just do the same for the input of the second command:
my $command2 = "picard MarkDuplicates ${line}-tmp-herc2.bam OUTPUT=$line-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE";
# here ___^____________^

Source: https://stackoverflow.com/questions/29471973
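
The change above fixes the input of the second command, but the question also asks for all INPUT arguments of a sample to appear in one MergeSamFiles call. One way to get there is to loop over the samples instead of the files, and collect every matching file before building the command. The following is only a minimal sketch and is not part of the original answer: `build_commands` is a hypothetical helper, the file names in the example are shortened placeholders, and the command strings simply mirror the desired output shown in the question. It returns the strings instead of calling `system`, so the result can be inspected first.

```perl
use strict;
use warnings;

# Hypothetical helper: for each sample that has at least one matching file,
# return the MergeSamFiles command followed by the MarkDuplicates command.
# Nothing is executed here; the caller decides whether to run the strings.
sub build_commands {
    my ($samples, $files) = @_;
    my @commands;
    for my $sample (@$samples) {
        # Collect every file whose name starts with this sample name.
        # \Q...\E keeps any regex metacharacters in the name literal.
        my @inputs = map { "INPUT=$_" } grep { /^\Q$sample\E/ } @$files;
        next unless @inputs;    # skip samples with no files in the directory

        my $tmp = "$sample-tmp-herc2.bam";
        push @commands,
            "picard MergeSamFiles @inputs OUTPUT=$tmp MERGE_SEQUENCE_DICTIONARIES=TRUE CREATE_INDEX=TRUE",
            "picard MarkDuplicates $tmp OUTPUT=$sample-herc2.bam METRICS_FILE=tmp REMOVE_DUPLICATES=TRUE CREATE_INDEX=TRUE";
    }
    return @commands;
}

# Example with placeholder file names (assumed, not from the question).
my @samples = ('HG00096', 'HG00117');
my @files   = ('HG00096.a.bam', 'HG00096.b.bam', 'HG00117.a.bam');
print "$_\n" for build_commands(\@samples, \@files);
```

In the original script this would presumably be called as `build_commands(\@lines, [glob '*.bam'])`, with `system` run on each returned string and the `unlink` calls kept after the second command of each sample.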