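I have two rules that can produce the same output depending on the wildcard values, which raises an AmbiguousRuleException.
I read the documentation on handling ambiguous rules at http://snakemake.readthedocs.io/en/latest/snakefiles/rules.html?highlight=ruleorder#handling-ambiguous-rules.
Using ruleorder looks like the way to solve this. However, the input of my rule preprocess_zheng17 depends on the output of the rule simulate_data. So if I use ruleorder: simulate_data > preprocess_zheng17, the rule preprocess_zheng17 never runs.
What I want is to run simulate_data first and then run preprocess_zheng17 for every wildcard pair. What would be good workflow design practice for this? The rules are below.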
    rule preprocess_zheng17:
        input:
            loom_file = SIMULATED_DATA_OUTPUT+'/{sample}_sim_loc{loc}.loom'
        params:
            transpose = False
        output:
            SIMULATED_DATA_OUTPUT+'/{sample}_sim_loc{loc}_zheng17.loom'
        script:
            "scripts/preprocess_zheng17.py"

    rule simulate_data:
        input:
            sample_loom = HDF5_OUTPUT+'/{sample}.loom'
        params:
            group_prob = config['splat_simulate']['group_prob'],
            dropout_present = config['splat_simulate']['dropout_present']
        output:
            SIMULATED_DATA_OUTPUT+'/{sample}_sim_loc{loc}.loom'
        script:
            "scripts/data_simulation.R"

Thanks in advance.
Posted on 2018-02-02 12:46:27
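Your problem does not come from the design. It arises because the outputs of the two rules cannot be told apart given the wildcards they use. SIMULATED_DATA_OUTPUT+'/{sample}_sim_loc{loc}_zheng17.loom' and SIMULATED_DATA_OUTPUT+'/{sample}_sim_loc{loc}.loom'
start and end with the same pattern, so Snakemake cannot tell whether _zheng17 is part of the wildcard {loc}.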
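The overlap can be reproduced outside Snakemake with plain regular expressions. The following is only a rough sketch, not Snakemake's actual matching code: it assumes each wildcard behaves like the regex `.+` (Snakemake's documented default), drops the SIMULATED_DATA_OUTPUT prefix for brevity, and uses a made-up sample name `pbmc`. The helper names `pattern_to_regex` and `matching_rules` are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Convert a Snakemake-style pattern into a regex with named groups.

    Every {name} becomes (?P<name>.+), i.e. "any non-empty string",
    which is the assumed default wildcard behaviour.
    """
    # re.split with a capturing group alternates literal text and wildcard names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)        # literal text
        else:
            regex += f"(?P<{part}>.+)"      # wildcard
    return re.compile(regex)


def matching_rules(target, rule_outputs):
    """Return {rule name: wildcard values} for every rule whose output matches."""
    matches = {}
    for rule, pattern in rule_outputs.items():
        m = pattern_to_regex(pattern).fullmatch(target)
        if m:
            matches[rule] = m.groupdict()
    return matches


# Output patterns as written in the question (directory prefix omitted).
ORIGINAL_OUTPUTS = {
    "simulate_data": "{sample}_sim_loc{loc}.loom",
    "preprocess_zheng17": "{sample}_sim_loc{loc}_zheng17.loom",
}

# Variant where the zheng17 tag sits before "_loc" instead of after {loc}.
RENAMED_OUTPUTS = {
    "simulate_data": "{sample}_sim_loc{loc}.loom",
    "preprocess_zheng17": "{sample}_sim_zheng17_loc{loc}.loom",
}

if __name__ == "__main__":
    # Both rules claim this file; simulate_data reads loc as "1_zheng17".
    print(matching_rules("pbmc_sim_loc1_zheng17.loom", ORIGINAL_OUTPUTS))
    # With the tag moved, only preprocess_zheng17 matches.
    print(matching_rules("pbmc_sim_zheng17_loc1.loom", RENAMED_OUTPUTS))
```

With the original patterns, simulate_data matches the `_zheng17` file by absorbing the suffix into `{loc}`, which is also why giving simulate_data priority through ruleorder keeps preprocess_zheng17 from ever running.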
You can either use what bli described in his comment or slightly change the output of either rule. For example:
    rule preprocess_zheng17:
        input:
            loom_file = SIMULATED_DATA_OUTPUT+'/{sample}_sim_loc{loc}.loom'
        params:
            transpose = False
        output:
            # "zheng17" now comes before "_loc", so this pattern no longer
            # overlaps with the output of simulate_data
            SIMULATED_DATA_OUTPUT+'/{sample}_sim_zheng17_loc{loc}.loom'
        script:
            "scripts/preprocess_zheng17.py"

    rule simulate_data:
        input:
            sample_loom = HDF5_OUTPUT+'/{sample}.loom'
        params:
            group_prob = config['splat_simulate']['group_prob'],
            dropout_present = config['splat_simulate']['dropout_present']
        output:
            SIMULATED_DATA_OUTPUT+'/{sample}_sim_loc{loc}.loom'
        script:
            "scripts/data_simulation.R"

https://stackoverflow.com/questions/48568166