I have the Snakemake rule shown below. The wildcard contains special characters, so I use re.sub to escape them (see the answer here). The output file is data/extract_AAV(1).csv.

import re

rule get_data:
    input:
    output: "data/extract_{re.sub(r'([()])', r'\\\1', filename)}.csv"
    shell: "python get_data.py --filename {re.sub(r'([()])', r'\\\1', wildcards.filename)}"

However, I get the following error:

module 're' has no attribute 'sub(r'('
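
The error is consistent with how Python-style format fields behave: the text between the braces is read as a name followed by attribute or index lookups, not as an expression to evaluate. The sketch below uses plain str.format as a stand-in; that Snakemake's own formatting behaves the same way is an assumption, and the names filename, escaped, and path are illustrative only.

```python
import re

filename = "AAV(1)"

# A format field is parsed as a name plus attribute/index lookups, so
# "re.sub(x)" is looked up as an attribute literally named "sub(x)".
try:
    "extract_{re.sub(x)}.csv".format(re=re)
    error = None
except AttributeError as exc:
    error = str(exc)
print(error)

# Evaluating the expression in Python first and then formatting the
# result avoids the attribute lookup entirely.
escaped = re.sub(r"([()])", r"\\\1", filename)
path = "extract_{name}.csv".format(name=escaped)
print(path)
```

The second half shows the general way out: compute the value in ordinary Python code and pass only the finished string into the format field.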
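Running the same re.sub call in plain Python works fine:

import re
filename = 'extract_AAV(1).csv'
print(re.sub(r'([()])', r'\\\1', filename))
# prints: extract_AAV\(1\).csv

Here is a reproducible example of the error that occurs when a wildcard with special characters is passed from Snakemake to a Python script.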
Snakefile:

rule get_data:
    output: "extract_{sample}.csv"
    shell: "python run.py --fn {wildcards.sample}"

run.py:

import argparse

import pandas as pd

parser = argparse.ArgumentParser()
parser.add_argument('--fn', type=str)
args = parser.parse_args()

df = pd.DataFrame({'a': [1,2,3]})
df.to_csv("extract_"+args.fn+'.csv')

Command execution, attempt 1:
$ snakemake extract_AAV(1).csv --cores 1
bash: syntax error near unexpected token `('

Command execution, attempt 2:
$ snakemake extract_AAV\(1\).csv --cores 1
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 get_data
1
[Wed Apr 29 11:31:34 2020]
rule get_data:
output: extract_AAV(1).csv
jobid: 0
wildcards: sample=AAV(1)
/bin/bash: -c: line 0: syntax error near unexpected token `('
/bin/bash: -c: line 0: `set -euo pipefail; python run.py --fn AAV(1)'
[Wed Apr 29 11:31:34 2020]
Error in rule get_data:
jobid: 0
output: extract_AAV(1).csv
shell:
python run.py --fn AAV(1)
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
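Complete log: /home/XXXXX/.snakemake/log/2020-04-29T113134.773987.snakemake.log

Answered 2020-04-29 17:14:30

In your case there is no need to escape the characters in the wildcard. Besides, a wildcard is only an identifier; expressions are not allowed inside it. The script below shows how to produce the file data/extract_AAV(1).csv: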
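rule all:
    input: "data/extract_AAV(1).csv"

rule get_data:
    output: "data/extract_AAV({index}).csv"
    # The path is quoted because bash rejects unquoted parentheses.
    shell: "touch '{output}'"

In my experience, problems can arise when a filename contains whitespace or quotes. For example, if the filename contains whitespace, it has to be put in quotes in the CLI command: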
rule all:
    input: "data/extract_AAV (1).csv"

rule get_data:
    output: "data/extract_AAV ({index}).csv"
    shell: "touch \"{output}\""

Answered 2020-04-29 14:15:38

Either I can't reproduce the problem or you are making things more complicated than they need to be. This works for me:
import re

samples = ['AAV(1)', 'AAV(2)']

rule all:
    input:
        expand('data/extract_{sample}.csv', sample=samples),

wildcard_constraints:
    sample='|'.join([re.escape(x) for x in samples]),

rule one:
    output:
        'data/extract_{sample}.csv'
    shell:
        r"""
        touch '{output}'
        """

(The wildcard_constraints bit is not necessary here, but I tend to use it.)

If this doesn't help, can you post a reproducible example?
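
The answers in this thread lean on two standard-library ideas: escaping the sample names before using them as a regular expression, and quoting the path before it reaches the shell. The sketch below illustrates both outside Snakemake. The variable names are illustrative, and shlex.split is used only as an approximation of shell word splitting; it does not reproduce bash's syntax error on bare parentheses.

```python
import re
import shlex

samples = ["AAV(1)", "AAV(2)"]

# 1. Regex side: escape each name so parentheses match literally,
#    as in the wildcard_constraints line above.
constraint = "|".join(re.escape(s) for s in samples)
escaped_ok = re.fullmatch(constraint, "AAV(1)") is not None
# Without escaping, "(1)" is a capture group: the pattern matches
# "AAV1" and not "AAV(1)".
raw_matches_literal = re.fullmatch("AAV(1)", "AAV(1)") is not None
raw_matches_other = re.fullmatch("AAV(1)", "AAV1") is not None
print(constraint, escaped_ok, raw_matches_literal, raw_matches_other)

# 2. Shell side: quote the path so it stays one argument.
path = "data/extract_AAV (1).csv"
unquoted_args = shlex.split("touch " + path)
quoted_args = shlex.split("touch " + shlex.quote(path))
print(unquoted_args)
print(quoted_args)
```

The unquoted command splits the path at the space into two arguments, while the quoted one keeps it whole, which is the effect the quotes around {output} have in the rules above.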
https://stackoverflow.com/questions/61479321