Is Nextflow really this inconsistent, or am I doing something wrong with nf-core/rnaseq?

Stack Overflow user
Asked 2021-09-10 17:56:20
2 answers, 1.6K views, 0 followers, score 1

Let me preface this by saying that I am very new to Nextflow. I apologize if I have left out anything that is key to debugging; please let me know.

====================================

Case 1: I tried running the following command:

nextflow run nf-core/rnaseq --aligner hisat2 -profile test,docker

But I ended up with the following error:

-[nf-core/rnaseq] Pipeline completed with errors-
WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.
Error executing process > 'NFCORE_RNASEQ:RNASEQ:MULTIQC_CUSTOM_BIOTYPE (RAP1_UNINDUCED_REP2)'

Caused by:
  Process `NFCORE_RNASEQ:RNASEQ:MULTIQC_CUSTOM_BIOTYPE (RAP1_UNINDUCED_REP2)` terminated with an error exit status (1)

Command executed:

  cut -f 1,7 RAP1_UNINDUCED_REP2.featureCounts.txt | tail -n +3 | cat biotypes_header.txt - >> RAP1_UNINDUCED_REP2.biotype_counts_mqc.tsv
  mqc_features_stat.py RAP1_UNINDUCED_REP2.biotype_counts_mqc.tsv -s RAP1_UNINDUCED_REP2 -f rRNA -o RAP1_UNINDUCED_REP2.biotype_counts_rrna_mqc.tsv

Command exit status:
  1

Command output:
  (empty)

Command error:
  cut: RAP1_UNINDUCED_REP2.featureCounts.txt: No such file or directory
  cat: can't open 'biotypes_header.txt': No such file or directory

Work dir:
  /mnt/c/Users/mkozubov/Desktop/nextflow_tutorial/work/e7/5df55125d9662b3c6ee83cdeea9ea9

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

I went to the work directory it pointed me to and ran bash .command.run as advertised, and it ran fine! Why did it error out in the first place?

==========================================

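Case 2:

I thought my problem was Docker, so I also tried Singularity. I ran it as follows, which resulted in two failures and one success. Here are the commands and errors: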

  1. nextflow run nf-core/rnaseq --aligner hisat2 -profile test,singularity
Caused by:
  Failed to pull singularity image
  command: singularity pull  --name depot.galaxyproject.org-singularity-qualimap-2.2.2d--1.img.pulling.1631227989457 https://depot.galaxyproject.org/singularity/qualimap:2.2.2d--1 > /dev/null
  status : 255
  message:
    INFO:    Downloading network image
    INFO:    Cleaning up incomplete download: /home/mkozubov/.singularity/cache/net/tmp_601246724
    FATAL:   unexpected EOF
  2. nextflow run nf-core/rnaseq --aligner hisat2 -profile test,singularity -resume
Caused by:
  Failed to pull singularity image
  command: singularity pull  --name depot.galaxyproject.org-singularity-bioconductor-dupradar-1.18.0--r40_1.img.pulling.1631228803940 https://depot.galaxyproject.org/singularity/bioconductor-dupradar:1.18.0--r40_1 > /dev/null
  status : 255
  message:
    INFO:    Downloading network image
    INFO:    Cleaning up incomplete download: /home/mkozubov/.singularity/cache/net/tmp_504979312
    FATAL:   unexpected EOF
  3. nextflow run nf-core/rnaseq --aligner hisat2 -profile test,singularity -resume ecstatic_minsky
WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.
Completed at: 09-Sep-2021 16:40:43
Duration    : 26m 44s
CPU hours   : 1.5 (29.8% cached)
Succeeded   : 116
Cached      : 64

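I realize my second resume probably did nothing, but why did my first run need to be resumed at all? Why couldn't Singularity pull the images it needed the first time? I am a noob and really don't know where to start debugging a problem like this, so any help is greatly appreciated.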

===========================================================

Config file:

/*
========================================================================================
    nf-core/rnaseq Nextflow config file
========================================================================================
    Default config options for all compute environments
----------------------------------------------------------------------------------------
*/

// Global default params, used in configs
params {

    // Input options
    input                      = null

    // References
    genome                     = null
    transcript_fasta           = null
    additional_fasta           = null
    splicesites                = null
    gtf_extra_attributes       = 'gene_name'
    gtf_group_features         = 'gene_id'
    featurecounts_feature_type = 'exon'
    featurecounts_group_type   = 'gene_biotype'
    gencode                    = false
    save_reference             = false

    // UMI handling
    with_umi                   = false
    umitools_extract_method    = 'string'
    umitools_bc_pattern        = null
    save_umi_intermeds         = false

    // Trimming
    clip_r1                    = null
    clip_r2                    = null
    three_prime_clip_r1        = null
    three_prime_clip_r2        = null
    trim_nextseq               = null
    save_trimmed               = false
    skip_trimming              = false

    // Ribosomal RNA removal
    remove_ribo_rna            = false
    save_non_ribo_reads        = false
    ribo_database_manifest     = "${projectDir}/assets/rrna-db-defaults.txt"

    // Alignment
    aligner                    = 'star_salmon'
    pseudo_aligner             = null
    seq_center                 = null
    bam_csi_index              = false
    star_ignore_sjdbgtf        = false
    salmon_quant_libtype       = null
    hisat2_build_memory        = '200.GB'  // Amount of memory required to build HISAT2 index with splice sites
    stringtie_ignore_gtf       = false
    min_mapped_reads           = 5
    save_merged_fastq          = false
    save_unaligned             = false
    save_align_intermeds       = false
    skip_markduplicates        = false
    skip_alignment             = false

    // QC
    skip_qc                    = false
    skip_bigwig                = false
    skip_stringtie             = false
    skip_fastqc                = false
    skip_preseq                = false
    skip_dupradar              = false
    skip_qualimap              = false
    skip_rseqc                 = false
    skip_biotype_qc            = false
    skip_deseq2_qc             = false
    skip_multiqc               = false
    deseq2_vst                 = false
    rseqc_modules              = 'bam_stat,inner_distance,infer_experiment,junction_annotation,junction_saturation,read_distribution,read_duplication'

    // Boilerplate options
    outdir                     = './results'
    publish_dir_mode           = 'copy'
    multiqc_config             = null
    multiqc_title              = null
    email                      = null
    email_on_fail              = null
    max_multiqc_email_size     = '25.MB'
    plaintext_email            = false
    monochrome_logs            = false
    help                       = false
    igenomes_base              = 's3://ngi-igenomes/igenomes'
    tracedir                   = "${params.outdir}/pipeline_info"
    igenomes_ignore            = false
    validate_params            = true
    show_hidden_params         = false
    schema_ignore_params       = 'genomes,modules'
    enable_conda               = false
    singularity_pull_docker_container = false

    // Config options
    custom_config_version      = 'master'
    custom_config_base         = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}"
    hostnames                  = [:]
    config_profile_description = null
    config_profile_contact     = null
    config_profile_url         = null
    config_profile_name        = null

    // Max resource options
    // Defaults only, expecting to be overwritten
    max_memory                 = '128.GB'
    max_cpus                   = 16
    max_time                   = '240.h'
}

// Load base.config by default for all pipelines
includeConfig 'conf/base.config'

// Load modules.config for DSL2 module specific options
includeConfig 'conf/modules.config'

// Load nf-core custom profiles from different Institutions
try {
    includeConfig "${params.custom_config_base}/nfcore_custom.config"
} catch (Exception e) {
    System.err.println("WARNING: Could not load nf-core/config profiles: ${params.custom_config_base}/nfcore_custom.config")
}

// Load nf-core/rnaseq custom config
try {
    includeConfig "${params.custom_config_base}/pipeline/rnaseq.config"
} catch (Exception e) {
    System.err.println("WARNING: Could not load nf-core/config/rnaseq profiles: ${params.custom_config_base}/pipeline/rnaseq.config")
}

// Load igenomes.config if required
if (!params.igenomes_ignore) {
    includeConfig 'conf/igenomes.config'
} else {
    params.genomes = [:]
}

profiles {
    debug { process.beforeScript = 'echo $HOSTNAME' }
    conda {
        params.enable_conda    = true
        docker.enabled         = false
        singularity.enabled    = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
    }
    docker {
        docker.enabled         = true
        docker.userEmulation   = true
        singularity.enabled    = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
    }
    singularity {
        singularity.enabled    = true
        singularity.autoMounts = true
        docker.enabled         = false
        podman.enabled         = false
        shifter.enabled        = false
        charliecloud.enabled   = false
    }
    podman {
        podman.enabled         = true
        docker.enabled         = false
        singularity.enabled    = false
        shifter.enabled        = false
        charliecloud.enabled   = false
    }
    shifter {
        shifter.enabled        = true
        docker.enabled         = false
        singularity.enabled    = false
        podman.enabled         = false
        charliecloud.enabled   = false
    }
    charliecloud {
        charliecloud.enabled   = true
        docker.enabled         = false
        singularity.enabled    = false
        podman.enabled         = false
        shifter.enabled        = false
    }
    test      { includeConfig 'conf/test.config'      }
    test_full { includeConfig 'conf/test_full.config' }
}

// Export these variables to prevent local Python/R libraries from conflicting with those in the container
env {
    PYTHONNOUSERSITE = 1
    R_PROFILE_USER   = "/.Rprofile"
    R_ENVIRON_USER   = "/.Renviron"
}

def trace_timestamp = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')
timeline {
    enabled = true
    file    = "${params.tracedir}/execution_timeline_${trace_timestamp}.html"
}
report {
    enabled = true
    file    = "${params.tracedir}/execution_report_${trace_timestamp}.html"
}
trace {
    enabled = true
    file    = "${params.tracedir}/execution_trace_${trace_timestamp}.txt"
}
dag {
    enabled = true
    file    = "${params.tracedir}/pipeline_dag_${trace_timestamp}.svg"
}

manifest {
    name            = 'nf-core/rnaseq'
    author          = 'Phil Ewels, Rickard Hammarén'
    homePage        = 'https://github.com/nf-core/rnaseq'
    description     = 'Nextflow RNA-Seq analysis pipeline, part of the nf-core community.'
    mainScript      = 'main.nf'
    nextflowVersion = '!>=21.04.0'
    version         = '3.3'
}

// Function to ensure that resource requirements don't go beyond
// a maximum limit
def check_max(obj, type) {
    if (type == 'memory') {
        try {
            if (obj.compareTo(params.max_memory as nextflow.util.MemoryUnit) == 1)
                return params.max_memory as nextflow.util.MemoryUnit
            else
                return obj
        } catch (all) {
            println "   ### ERROR ###   Max memory '${params.max_memory}' is not valid! Using default value: $obj"
            return obj
        }
    } else if (type == 'time') {
        try {
            if (obj.compareTo(params.max_time as nextflow.util.Duration) == 1)
                return params.max_time as nextflow.util.Duration
            else
                return obj
        } catch (all) {
            println "   ### ERROR ###   Max time '${params.max_time}' is not valid! Using default value: $obj"
            return obj
        }
    } else if (type == 'cpus') {
        try {
            return Math.min( obj, params.max_cpus as int )
        } catch (all) {
            println "   ### ERROR ###   Max cpus '${params.max_cpus}' is not valid! Using default value: $obj"
            return obj
        }
    }
}

=================================

I built the environment with conda, and it looks as follows.

I got this with conda env export:

name: nf-core
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=1_gnu
  - alsa-lib=1.2.3=h516909a_0
  - appdirs=1.4.4=pyh9f0ad1d_0
  - attrs=21.2.0=pyhd8ed1ab_0
  - backports=1.0=py_2
  - backports.functools_lru_cache=1.6.4=pyhd8ed1ab_0
  - brotlipy=0.7.0=py37h5e8e339_1001
  - bzip2=1.0.8=h7f98852_4
  - c-ares=1.17.2=h7f98852_0
  - ca-certificates=2021.5.30=ha878542_0
  - cairo=1.16.0=h6cf1ce9_1008
  - cattrs=1.8.0=pyhd8ed1ab_0
  - certifi=2021.5.30=py37h89c1867_0
  - cffi=1.14.6=py37hc58025e_0
  - chardet=4.0.0=py37h89c1867_1
  - charset-normalizer=2.0.0=pyhd8ed1ab_0
  - click=8.0.1=py37h89c1867_0
  - cni=0.8.0=hc0beb16_0
  - cni-plugins=0.9.1=ha8f183a_0
  - colorama=0.4.4=pyh9f0ad1d_0
  - commonmark=0.9.1=py_0
  - coreutils=8.25=1
  - cryptography=3.4.7=py37h5d9358c_0
  - curl=7.78.0=hea6ffbf_0
  - expat=2.4.1=h9c3ff4c_0
  - fontconfig=2.13.1=hba837de_1005
  - freetype=2.10.4=h0708190_1
  - future=0.18.2=py37h89c1867_3
  - gettext=0.19.8.1=h0b5b191_1005
  - giflib=5.2.1=h36c2ea0_2
  - git=2.33.0=pl5321hc30692c_0
  - gitdb=4.0.7=pyhd8ed1ab_0
  - gitpython=3.1.18=pyhd8ed1ab_0
  - graphite2=1.3.13=h58526e2_1001
  - harfbuzz=2.9.1=h83ec7ef_0
  - icu=68.1=h58526e2_0
  - idna=3.1=pyhd3deb0d_0
  - importlib-metadata=4.8.1=py37h89c1867_0
  - importlib_metadata=4.8.1=hd8ed1ab_0
  - itsdangerous=2.0.1=pyhd8ed1ab_0
  - jbig=2.1=h7f98852_2003
  - jinja2=3.0.1=pyhd8ed1ab_0
  - jpeg=9d=h36c2ea0_0
  - jq=1.6=h36c2ea0_1000
  - jsonschema=3.2.0=py37hc8dfbb8_1
  - krb5=1.19.2=hcc1bbae_0
  - lcms2=2.12=hddcbb42_0
  - ld_impl_linux-64=2.36.1=hea4e1c9_2
  - lerc=2.2.1=h9c3ff4c_0
  - libarchive=3.5.2=hccf745f_0
  - libcurl=7.78.0=h2574ce0_0
  - libdeflate=1.7=h7f98852_5
  - libedit=3.1.20191231=he28a2e2_2
  - libev=4.33=h516909a_1
  - libffi=3.3=h58526e2_2
  - libgcc=7.2.0=h69d50b8_2
  - libgcc-ng=11.1.0=hc902ee8_8
  - libglib=2.68.4=h3e27bee_0
  - libgomp=11.1.0=hc902ee8_8
  - libiconv=1.16=h516909a_0
  - libnghttp2=1.43.0=h812cca2_0
  - libpng=1.6.37=h21135ba_2
  - libseccomp=2.4.4=h36c2ea0_0
  - libssh2=1.10.0=ha56f1ee_0
  - libstdcxx-ng=11.1.0=h56837e0_8
  - libtiff=4.3.0=hf544144_1
  - libuuid=2.32.1=h7f98852_1000
  - libwebp-base=1.2.1=h7f98852_0
  - libxcb=1.13=h7f98852_1003
  - libxml2=2.9.12=h72842e0_0
  - lz4-c=1.9.3=h9c3ff4c_1
  - lzo=2.10=h516909a_1000
  - markupsafe=2.0.1=py37h5e8e339_0
  - ncurses=6.2=h58526e2_4
  - nextflow=21.04.0=h4a94de4_0
  - nf-core=2.1=pyh5e36f6f_0
  - oniguruma=6.9.7.1=h7f98852_0
  - openjdk=11.0.9.1=h5cc2fde_1
  - openssl=1.1.1l=h7f98852_0
  - packaging=21.0=pyhd8ed1ab_0
  - pcre=8.45=h9c3ff4c_0
  - pcre2=10.37=h032f7d1_0
  - perl=5.32.1=0_h7f98852_perl5
  - pip=21.2.4=pyhd8ed1ab_0
  - pixman=0.40.0=h36c2ea0_0
  - prompt-toolkit=3.0.20=pyha770c72_0
  - prompt_toolkit=3.0.20=hd8ed1ab_0
  - pthread-stubs=0.4=h36c2ea0_1001
  - pycparser=2.20=pyh9f0ad1d_2
  - pygments=2.10.0=pyhd8ed1ab_0
  - pyopenssl=20.0.1=pyhd8ed1ab_0
  - pyparsing=2.4.7=pyh9f0ad1d_0
  - pyrsistent=0.17.3=py37h5e8e339_2
  - pysocks=1.7.1=py37h89c1867_3
  - python=3.7.10=hffdb5ce_100_cpython
  - python_abi=3.7=2_cp37m
  - pyyaml=5.4.1=py37h5e8e339_1
  - questionary=1.10.0=pyhd8ed1ab_0
  - readline=8.1=h46c0cb4_0
  - requests=2.26.0=pyhd8ed1ab_0
  - requests-cache=0.8.0=pyhd8ed1ab_0
  - rich=10.9.0=py37h89c1867_0
  - setuptools=58.0.4=py37h89c1867_0
  - singularity=3.7.1=hca90b9e_0
  - six=1.16.0=pyh6c4a22f_0
  - smmap=3.0.5=pyh44b312d_0
  - sqlite=3.36.0=h9cd32fc_1
  - squashfs-tools=4.4=h6b73730_2
  - tabulate=0.8.9=pyhd8ed1ab_0
  - tk=8.6.11=h27826a3_1
  - typing_extensions=3.10.0.0=pyha770c72_0
  - url-normalize=1.4.3=pyhd8ed1ab_0
  - urllib3=1.26.6=pyhd8ed1ab_0
  - wcwidth=0.2.5=pyh9f0ad1d_2
  - wheel=0.37.0=pyhd8ed1ab_1
  - xorg-fixesproto=5.0=h7f98852_1002
  - xorg-inputproto=2.3.2=h7f98852_1002
  - xorg-kbproto=1.0.7=h7f98852_1002
  - xorg-libice=1.0.10=h7f98852_0
  - xorg-libsm=1.2.3=hd9c2040_1000
  - xorg-libx11=1.7.2=h7f98852_0
  - xorg-libxau=1.0.9=h7f98852_0
  - xorg-libxdmcp=1.1.3=h7f98852_0
  - xorg-libxext=1.3.4=h7f98852_1
  - xorg-libxfixes=5.0.3=h7f98852_1004
  - xorg-libxi=1.7.10=h7f98852_0
  - xorg-libxrender=0.9.10=h7f98852_1003
  - xorg-libxtst=1.2.3=h7f98852_1002
  - xorg-recordproto=1.14.2=h7f98852_1002
  - xorg-renderproto=0.11.1=h7f98852_1002
  - xorg-xextproto=7.3.0=h7f98852_1002
  - xorg-xproto=7.0.31=h7f98852_1007
  - xz=5.2.5=h516909a_1
  - yaml=0.2.5=h516909a_0
  - zipp=3.5.0=pyhd8ed1ab_0
  - zlib=1.2.11=h516909a_1010
  - zstd=1.5.0=ha95c52a_0

2 Answers

Stack Overflow user

Answered 2021-09-11 14:12:54

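Jobs sometimes fail for all sorts of reasons, and Nextflow pipelines can handle these errors in different ways, for better or worse. The nf-core/rnaseq pipeline (version 3.3) uses the following errorStrategy: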

    errorStrategy = { task.exitStatus in [143,137,104,134,139] ? 'retry' : 'finish' }
    maxRetries    = 1
    maxErrors     = '-1'

https://github.com/nf-core/rnaseq/blob/3.3/conf/base.config#L17-L19

Note that the value of maxRetries is only applied when the 'retry' error strategy is used.
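To make the effect of that closure concrete, here is a small shell sketch that mirrors its decision for a given exit status. The function name error_strategy_for_exit is hypothetical and only illustrates the logic; it is not part of Nextflow or the pipeline.

```shell
#!/usr/bin/env bash
# Illustrative only: mirrors the decision made by the errorStrategy closure
# quoted above. The function name is hypothetical, not a Nextflow command.
error_strategy_for_exit() {
  case "$1" in
    143|137|104|134|139) echo retry ;;   # statuses listed in base.config
    *)                   echo finish ;;  # everything else ends the run
  esac
}

error_strategy_for_exit 1     # exit status reported in Case 1
error_strategy_for_exit 137   # 128 + 9, i.e. the task was killed by SIGKILL
```

With the default configuration, the exit status 1 from Case 1 maps to 'finish', so that task is not retried on its own.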

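In Case 1, you get "No such file or directory" errors because the input files had not been staged by the time the script commands tried to run. Re-running the .command.run script (as you did) first attempts to stage the input files and only then runs the script commands in .command.sh. You should be able to simply -resume the workflow without any manual intervention, and the failed job will be retried automatically.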
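If you want to check what actually ended up in a task's work directory, something like the following can help. It is only a sketch: inspect_task_dir is a hypothetical helper name, and it assumes the default staging mode, where inputs show up as symlinks in the task directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the usual Nextflow task files exist
# in a work directory, then list symlinks (staged inputs, by default).
inspect_task_dir() {
  local dir="$1" f
  for f in .command.sh .command.run .command.err .command.out .exitcode; do
    if [ -e "$dir/$f" ]; then
      echo "found   $f"
    else
      echo "missing $f"
    fi
  done
  # Staged inputs are symlinked into the task directory unless configured otherwise.
  find "$dir" -maxdepth 1 -type l
}

# Usage (path taken from the error message in the question):
#   inspect_task_dir /mnt/c/Users/mkozubov/Desktop/nextflow_tutorial/work/e7/5df55125d9662b3c6ee83cdeea9ea9
```

If no symlinks for RAP1_UNINDUCED_REP2.featureCounts.txt or biotypes_header.txt are listed, that would be consistent with the inputs not having been staged when the task first ran.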

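The two failures in Case 2 look like network errors while pulling two (different) Singularity images. This could be the result of a weak network connection.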
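On a flaky connection, one possible workaround is to pull the images by hand with a small retry wrapper before starting the pipeline. The sketch below is my own assumption rather than anything the pipeline provides: retry and RETRY_DELAY are hypothetical names, and the target file name in the usage comment is inferred from the --name argument in your error message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to N times, pausing between attempts.
# RETRY_DELAY (seconds) is an assumed knob for this sketch; it defaults to 5.
retry() {
  local max="$1" attempt=1
  shift
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    sleep "${RETRY_DELAY:-5}"
    attempt=$((attempt + 1))
  done
}

# Usage (not executed here; image URL copied from the error in the question):
#   retry 3 singularity pull --name depot.galaxyproject.org-singularity-qualimap-2.2.2d--1.img \
#     https://depot.galaxyproject.org/singularity/qualimap:2.2.2d--1
```

For Nextflow to pick up an image pulled this way, it would have to sit in the directory Nextflow uses for Singularity images under the name it expects, so treat the file name above as something to verify against your own logs.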

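I wouldn't worry too much about errors like these; they are not uncommon. That said, I think the first one could be handled better, simply by setting errorStrategy to 'retry' in your nextflow.config to override the default behavior. I have actually found that a dynamic retry with backoff (as shown below) also works well. With regard to the network problems, if you plan to run the pipeline over and over again, it might be worth setting a cacheDir for Singularity to avoid repeated pulls.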

process {

  errorStrategy = {
    sleep( Math.pow( 2, task.attempt ) * 150 as long )
    return 'retry'
  }
  maxRetries = 3
}

singularity {

  cacheDir = '/path/to/containers'
}
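
To get a feel for the delays the backoff closure above produces, the arithmetic can be reproduced in the shell. This is only a worked example: backoff_ms is a hypothetical name, and it relies on Groovy's sleep taking its argument in milliseconds.

```shell
#!/usr/bin/env bash
# Worked example of the expression 2^attempt * 150 from the closure above.
# backoff_ms is a hypothetical helper; the result is in milliseconds.
backoff_ms() {
  echo $(( (1 << $1) * 150 ))   # 1 << n equals 2^n
}

for attempt in 1 2 3; do
  echo "attempt $attempt failed: wait $(backoff_ms "$attempt") ms before retrying"
done
```

With maxRetries = 3 this gives waits of 300, 600 and 1200 ms, so a larger multiplier may be worth trying if the failures come from something slow to recover, such as a network share.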
Score 1

Stack Overflow user

Answered 2021-09-16 20:17:44

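I forgot to mention that I am on a Windows 10 PC with WSL2 configured. I had run into a strange issue where VMMEM was grabbing all of my memory and not releasing it. After fiddling with Nextflow and searching forums for the cause of the problems and errors, I realized that I am a huge noob: I had set up my .wslconfig file to limit my subsystem to only 2GB of memory, but the default nf-core/rnaseq pipeline requests 6GB.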

This command fixed all of my problems:

nextflow run nf-core/rnaseq -profile test,singularity --aligner hisat2 --max_memory 1.5GB

I hope I have correctly identified the underlying problem, but nf-core/rnaseq now works for me :)

Edit: They even mention here that the default resources may not be appropriate: https://nf-co.re/rnaseq/usage#resource-requests
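
For anyone hitting the same thing, a quick way to see how much memory the Linux side can actually use is to read /proc/meminfo before launching the pipeline. The sketch below is an assumption-laden helper: has_enough_memory is a hypothetical name, and the 6 GB threshold is simply the figure mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if MemTotal in a meminfo-style file is at
# least the requested number of GB. Defaults to /proc/meminfo (Linux/WSL2).
has_enough_memory() {
  local need_gb="$1" meminfo="${2:-/proc/meminfo}" total_kb
  total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
  [ "$total_kb" -ge $(( need_gb * 1024 * 1024 )) ]
}

if has_enough_memory 6; then
  echo "At least 6 GB is visible to this environment"
else
  echo "Less than 6 GB is visible: lower --max_memory or raise the WSL2 limit"
fi
```

MemTotal is usually a little below the configured limit, so a machine capped at exactly 6 GB would still fail this check; the threshold is meant as a rough guide only.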

Score 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/69136233
