I have several genome index files, each set with the suffixes .1.ht2l to .8.ht2l:
bob.1.ht2l
bob.2.ht2l
bob.3.ht2l
bob.4.ht2l
bob.5.ht2l
bob.6.ht2l
bob.7.ht2l
bob.8.ht2l
steve.1.ht2l ... steve.8.ht2l and so on,
and several RNA-seq samples, such as
flower_kevin_1.fastq.gz
flower_kevin_2.fastq.gz
flower_daniel_1.fastq.gz
flower_daniel_2.fastq.gz
and so on, also from different tissues. I want to align all of the RNA-seq reads against the genomes.

Update:
workdir: "/path/to/aligned"

(HISAT2_INDEX_PREFIX,) = glob_wildcards("/path/to/index/{prefix}.1.ht2l")
(SAMPLES,) = glob_wildcards("/path/to/{sample}_1.fastq.gz")
print(HISAT2_INDEX_PREFIX)
print(SAMPLES)

rule all:
    input:
        expand("{prefix}.{sample}.bam", zip, prefix=HISAT2_INDEX_PREFIX, sample=SAMPLES)

rule hisat2:
    input:
        hisat2_index=expand("%s.{ix}.ht2l" % "/path/to/index/{prefix}", ix=range(1, 9)),
        fastq1="/path/to/{sample}_1.fastq.gz",
        fastq2="/path/to/{sample}_2.fastq.gz"
    output:
        bam = "{prefix}.{sample}.bam",
        txt = "{prefix}.{sample}.txt",
    log: "{prefix}.{sample}.snakemake_log.txt"
    threads: 5
    shell:
        "/Tools/hisat2-2.1.0/hisat2 -p {threads} -x {HISAT2_INDEX_PREFIX}"
        " -1 {input.fastq1} -2 {input.fastq2} --summary-file {output.txt} |"
        "/Tools/samtools-1.9/samtools sort -@ {threads} -o {output.bam}"

I get a missing input files error. I am sure the mistake is somewhere in how the suffixes are handled, but I don't know how to fix it. Any suggestions are much appreciated.

Update: all of the genome files bob.1.ht2l ... bob.8.ht2l, steve.1.ht2l ... steve.8.ht2l are called at once. Why?
Posted on 2019-12-25 03:06:58
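Let's compare these two lines:

(HISAT2_INDEX_PREFIX,)=glob_wildcards("/path/to/index/{prefix}.1.ht2l")
hisat2_index=expand("%s.{ix}.ht2l" % HISAT2_INDEX_PREFIX, ix=range(1, 9)),

The first line explicitly specifies /path/to/index/, while the second omits this path. Keep the two consistent and that should solve your problem.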
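On the update: inside the shell string, {HISAT2_INDEX_PREFIX} refers to the module-level list returned by glob_wildcards, so every prefix is substituted at once instead of the single prefix that belongs to the current job. The per-job value would have to come from the prefix wildcard (in a Snakefile, {wildcards.prefix}), combined with the same /path/to/index/ directory. Note also that expand(..., zip, ...) pairs the two lists element by element; if the goal is every sample against every genome, the default expand behaviour (all combinations) is probably what is wanted. The following is a plain-Python sketch of that logic, not Snakemake code; the helper names index_files and hisat2_x_argument are hypothetical, and the two lists are stand-ins for what glob_wildcards would find given the files named in the question.

```python
from itertools import product

INDEX_DIR = "/path/to/index"                 # directory used in the question
PREFIXES = ["bob", "steve"]                  # stand-in for the index prefix glob
SAMPLES = ["flower_kevin", "flower_daniel"]  # stand-in for the sample glob


def index_files(prefix):
    """Return the eight .ht2l files that belong to ONE index prefix."""
    return [f"{INDEX_DIR}/{prefix}.{ix}.ht2l" for ix in range(1, 9)]


def hisat2_x_argument(prefix):
    """Value for `hisat2 -x`: directory plus a single prefix, no suffix."""
    return f"{INDEX_DIR}/{prefix}"


# zip pairs element by element: bob with flower_kevin, steve with flower_daniel.
zipped = [f"{p}.{s}.bam" for p, s in zip(PREFIXES, SAMPLES)]

# product yields every genome/sample combination.
crossed = [f"{p}.{s}.bam" for p, s in product(PREFIXES, SAMPLES)]

if __name__ == "__main__":
    print(index_files("bob"))
    print(hisat2_x_argument("steve"))
    print(zipped)
    print(crossed)
```

With two genomes and two samples, the zipped list has two targets while the crossed list has four, which is the difference between "one sample per genome" and "all samples against all genomes".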
https://stackoverflow.com/questions/59460188