我在snakemake管道中得到以下错误:
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 16
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 long_read_assembly
1
[Wed Jan 15 11:35:18 2020]
rule long_read_assembly:
input: long_reads/F19FTSEUHT1027.PSU4_ISF1A_long.fastq.gz
output: canu-outputs/F19FTSEUHT1027.PSU4_ISF1A.subreads.contigs.fasta
jobid: 0
wildcards: sample=F19FTSEUHT1027.PSU4_ISF1A
/usr/bin/bash: canu: command not found
[Wed Jan 15 11:35:18 2020]
Error in rule long_read_assembly:
jobid: 0
output: canu-outputs/F19FTSEUHT1027.PSU4_ISF1A.subreads.contigs.fasta
shell:
canu -p F19FTSEUHT1027.PSU4_ISF1A -d canu-outputs genomeSize=8m -pacbio-raw long_reads/F19FTSEUHT1027.PSU4_ISF1A_long.fastq.gz
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message我假设这意味着找不到canu命令。但是Canu包确实存在于conda环境中:
(hybrid_assembly) [lamma@fe1 Assembly]$ conda list | grep canu
canu 1.9 he1b5a44_0 biocondasnakefile如下所示:
workdir: config["path_to_files"]
wildcard_constraints:
separator = config["separator"],
sample = '|' .join(config["samples"]),
rule all:
input:
expand("assembly-stats/{sample}_stats.txt", sample = config["samples"])
rule short_reads_QC:
input:
f"short_reads/{{sample}}_short{config['separator']}*.fq.gz"
output:
"fastQC-reports/{sample}.html"
conda:
"/home/lamma/env-export/hybrid_assembly.yaml"
shell:
"""
mkdir fastqc-reports
fastqc -o fastqc-reports {input}
"""
rule quallity_trimming:
input:
forward = f"short_reads/{{sample}}_short{config['separator']}1.fq.gz",
reverse = f"short_reads/{{sample}}_short{config['separator']}2.fq.gz",
output:
forward = "cleaned_short-reads/{sample}_short_1-clean.fastq",
reverse = "cleaned_short-reads/{sample}_short_2-clean.fastq"
conda:
"/home/lamma/env-export/hybrid_assembly.yaml"
shell:
"bbduk.sh -Xmx1g in1={input.forward} in2={input.reverse} out1={output.forward} out2={output.reverse} qtrim=rl trimq=10"
rule long_read_assembly:
input:
"long_reads/{sample}_long.fastq.gz"
output:
"canu-outputs/{sample}.subreads.contigs.fasta"
conda:
"/home/lamma/env-export/hybrid_assembly.yaml"
shell:
"canu -p {wildcards.sample} -d canu-outputs genomeSize=8m -pacbio-raw {input}"
rule short_read_alignment:
input:
short_read_fwd = "cleaned_short-reads/{sample}_short_1-clean.fastq",
short_read_rvs = "cleaned_short-reads/{sample}_short_2-clean.fastq",
reference = "canu-outputs/{sample}.subreads.contigs.fasta"
output:
"bwa-output/{sample}_short.bam"
conda:
"/home/lamma/env-export/hybrid_assembly.yaml"
shell:
"bwa mem {input.reference} {input.short_read_fwd} {input.short_read_rvs} | samtools view -S -b > {output}"
rule indexing_and_sorting:
input:
"bwa-output/{sample}_short.bam"
output:
"bwa-output/{sample}_short_sorted.bam"
conda:
"/home/lamma/env-export/hybrid_assembly.yaml"
shell:
"samtools sort {input} > {output}"
rule polishing:
input:
bam_files = "bwa-output/{sample}_short_sorted.bam",
long_assembly = "canu-outputs/{sample}.subreads.contigs.fasta"
output:
"pilon-output/{sample}-improved.fasta"
conda:
"/home/lamma/env-export/hybrid_assembly.yaml"
shell:
"pilon --genome {input.long_assembly} --frags {input.bam_files} --output {output} --outdir pilon-output"
rule assembly_stats:
input:
"pilon-output/{sample}-improved.fasta"
output:
"assembly-stats/{sample}_stats.txt"
conda:
"/home/lamma/env-export/hybrid_assembly.yaml"
shell:
"stats.sh in={input} gc=assembly-stats/{wildcards.sample}/{wildcards.sample}_gc.csv gchist=assembly-stats/{wildcards.sample}/{wildcards.sample}_gchist.csv shist=assembly-stats/{wildcards.sample}/{wildcards.sample}_shist.csv > assembly-stats/{wildcards.sample}/{wildcards.sample}_stats.txt"据我所知,调用canu的规则的语法是正确的,所以我不确定是什么导致了这个错误。
编辑:添加snakemake命令
snakemake --latency-wait 60 --rerun-incomplete --keep-going --jobs 99 --cluster-status 'python /home/lamma/faststorage/scripts/slurm-status.py' --cluster 'sbatch -t {cluster.time} --mem={cluster.mem} --cpus-per-task={cluster.c} --error={cluster.error} --job-name={cluster.name} --output={cluster.output} --wait --parsable' --cluster-config bacterial-hybrid-assembly-config.json --configfile yaml-config-files/test_experiment3.yaml --snakefile bacterial-hybrid-assembly.smk发布于 2020-01-15 21:24:26
在运行snakemake工作流时,如果要在特定于规则的conda环境中运行某些规则,则命令行调用的形式应为
snakemake [... various options ...] --use-conda [--conda-prefix <some-directory>]如果您不告诉snakemake使用conda,则规则中的所有conda: <some_path>条目都将被忽略,并且规则将在当前激活的任何环境中运行。
--conda-prefix <dir>是可选的,但它会告诉snakemake在哪里可以找到安装的环境(如果你不指定它,.snakemake文件夹中就会安装一个conda环境,这意味着.snakemake文件夹可能会变得非常大,并且多个项目的.snakemake文件夹可能包含很多重复的conda内容)
https://stackoverflow.com/questions/59749945
复制相似问题