How to define slurm output file name using wildcards for grouped rules

Question

I am launching snakemake jobs on a slurm cluster. My problem is that I cannot use slurm's --output and --error options to give a custom name to the files of each launched job.

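For example, I have two rules in a group named "filtmap" (so that both rules are launched in the same job instance). I tried to set up the cluster configuration as described in the documentation.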

Here is config_cluster.json

{
    "__default__":
    {
        "account": "mytilus",
        "time": "10-00:00",
        "nodes": 1,
        "ntasks": 1,
        "partition": "long",
        "mem": 100,
        "output": "logs/cluster/{rule}.{wildcards}.out",
        "error": "logs/cluster/{rule}.{wildcards}.err"
    }
}

and I launch snakemake with these options

snakemake --use-singularity \
--jobs 40 --cluster-config config_cluster.json \
-s Snakefile \
--cluster "sbatch -A {cluster.account} -p {cluster.partition} \
--output {cluster.output} --error {cluster.error} \
-t {cluster.time} \
--nodes {cluster.nodes} \
--ntasks {cluster.ntasks} --mem {cluster.mem}G \
-D /shared/projects/mytilus/Preprocessing \
--cpus-per-task {threads}"

However, whichever wildcards I try, this always returns an error of the following type

WorkflowError:
NameError with group job efafdfe8-225c-594b-a71a-d0d58516876c: The name 'rule' is unknown in this context. Please make sure that you defined that variable. Also note that braces not used for variable access have to be escaped by repeating them, i.e. {{print }}

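If I remove every use of the --output and --error flags from both the call and the config file, snakemake runs fine. But I would really like to have custom-named output and error files.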

Edit: After a few tests, the problem does not seem to occur when each rule runs on its own, without a group definition. So my question becomes "How do I set a job name for each group using common wildcards between included rules?"

Answer 1

According to a developer's comment on this error:

Yes, things like {rule} and {wildcards} are not supported in combination with groups
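To see which entries of a cluster config would be affected, here is a minimal sketch in Python. It is not Snakemake's own code: the helper names `placeholders` and `group_unsafe` are hypothetical, and the set of unsupported names is taken only from the developer's statement above. It also shows one possible workaround, assuming you can live with files named after the Slurm job ID: `%j` is sbatch's own filename pattern, expanded by Slurm rather than by Snakemake, so it contains no brace placeholders at all.

```python
import string


def placeholders(template):
    """Return the top-level names used as {name} fields in a format template."""
    names = set()
    for _, field, _, _ in string.Formatter().parse(template):
        if field:
            # "cluster.output" or "wildcards[sample]" -> "cluster" / "wildcards"
            names.add(field.replace("[", ".").split(".")[0])
    return names


# Names the developer says are not supported in combination with groups.
UNSUPPORTED_IN_GROUPS = {"rule", "wildcards"}


def group_unsafe(template):
    """Names in the template that would not resolve for a group job."""
    return placeholders(template) & UNSUPPORTED_IN_GROUPS


# The template from config_cluster.json in the question.
original = "logs/cluster/{rule}.{wildcards}.out"

# Hypothetical alternative: %j is replaced by sbatch with the job ID,
# so Snakemake has nothing to expand here.
slurm_only = "logs/cluster/%j.out"

print(sorted(group_unsafe(original)))    # ['rule', 'wildcards']
print(sorted(group_unsafe(slurm_only)))  # []
```

With the second template the log files are identified by Slurm job ID instead of by rule name, so this only partly covers the naming asked for in the edit to the question.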


cluster-computing

snakemake