Snakemake using multiple inputs

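I have started writing a pipeline for my bioinformatics project, and I am using Snakemake as the workflow manager. I worked through all the tutorials on the official website and read some of the documentation.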

I want to run a shell command like this: fastp -i input-1 -I input-2 -o output-1 -O output-2

My code in the Snakefile:

SAMPLES = ['1', '2', '3', '4']
rule fastp:
    input:
        reads1=expand("sample{sample}.R1.fq.gz", sample=SAMPLES),
        reads2=expand("sample{sample}.R2.fq.gz", sample=SAMPLES)
    output:
        reads1out=expand("sample{sample}.R1.fq.gz.out", sample=SAMPLES),
        reads2out=expand("sample{sample}.R2.fq.gz.out", sample=SAMPLES)
    shell:
        "fastp -i {input.reads1} -I {input.reads2} -o {output.reads1out} -O {output.reads2out}"

But the program runs this single command:

fastp -i sample1.R1.fq.gz sample2.R1.fq.gz sample3.R1.fq.gz sample4.R1.fq.gz -I sample1.R2.fq.gz sample2.R2.fq.gz sample3.R2.fq.gz sample4.R2.fq.gz -o sample1.R1.fq.gz.out sample2.R1.fq.gz.out sample3.R1.fq.gz.out sample4.R1.fq.gz.out -O sample1.R2.fq.gz.out sample2.R2.fq.gz.out sample3.R2.fq.gz.out sample4.R2.fq.gz.out

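How can I write the rule so that a separate shell command is executed for each sample? I tried adding for i in SAMPLES: after rule fastp:, but it did not work, and I don't know what to try next. Sorry if this question is too basic, but I'm a Python newbie.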

Thanks.

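You need to define the target output files with rule all, and use the {sample} wildcard in rule fastp instead of expand(), so that Snakemake creates one fastp job per sample.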

SAMPLES = ['1', '2', '3', '4']
rule all:
    input:
        expand("sample{sample}.R{read_no}.fq.gz.out", sample=SAMPLES, read_no=['1', '2'])

rule fastp:
    input:
        reads1="sample{sample}.R1.fq.gz",
        reads2="sample{sample}.R2.fq.gz"
    output:
        reads1out="sample{sample}.R1.fq.gz.out",
        reads2out="sample{sample}.R2.fq.gz.out"
    shell:
        "fastp -i {input.reads1} -I {input.reads2} -o {output.reads1out} -O {output.reads2out}"

This is the output of the command snakemake -np with rule all: added, as JeeYem wrote:

λ fastp/testdata master ✗ snakemake -np

rule fastp:
    input: sample1.R1.fq.gz, sample2.R1.fq.gz, sample3.R1.fq.gz, sample4.R1.fq.gz, sample1.R2.fq.gz, sample2.R2.fq.gz, sample3.R2.fq.gz, sample4.R2.fq.gz
    output: sample1.R1.fq.gz.out, sample2.R1.fq.gz.out, sample3.R1.fq.gz.out, sample4.R1.fq.gz.out, sample1.R2.fq.gz.out, sample2.R2.fq.gz.out, sample3.R2.fq.gz.out, sample4.R2.fq.gz.out
    jobid: 1

fastp -i sample1.R1.fq.gz sample2.R1.fq.gz sample3.R1.fq.gz sample4.R1.fq.gz -I sample1.R2.fq.gz sample2.R2.fq.gz sample3.R2.fq.gz sample4.R2.fq.gz -o sample1.R1.fq.gz.out sample2.R1.fq.gz.out sample3.R1.fq.gz.out sample4.R1.fq.gz.out -O sample1.R2.fq.gz.out sample2.R2.fq.gz.out sample3.R2.fq.gz.out sample4.R2.fq.gz.out

localrule all:
    input: sample1.R1.fq.gz.out, sample1.R2.fq.gz.out, sample2.R1.fq.gz.out, sample2.R2.fq.gz.out, sample3.R1.fq.gz.out, sample3.R2.fq.gz.out, sample4.R1.fq.gz.out, sample4.R2.fq.gz.out
    jobid: 0

Job counts:
    count   jobs
    1       all
    1       fastp
    2
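
The dry run above still schedules a single fastp job that receives all eight files. That is what one would expect if rule fastp still used expand() in its input and output rather than the {sample} wildcard. The following plain-Python sketch contrasts the two cases. It is not Snakemake code: the helper render_shell is made up for illustration, and it only imitates the fact that a list-valued placeholder is joined with spaces when the shell command is formatted.

```python
SAMPLES = ["1", "2", "3", "4"]
TEMPLATE = "fastp -i {reads1} -I {reads2} -o {reads1out} -O {reads2out}"


def render_shell(template, **files):
    """Join list values with spaces, then fill the template (hypothetical helper)."""
    flat = {
        name: " ".join(value) if isinstance(value, list) else value
        for name, value in files.items()
    }
    return template.format(**flat)


# expand() inside the rule: every placeholder holds a list of four files,
# so a single job receives everything at once.
single_job = render_shell(
    TEMPLATE,
    reads1=[f"sample{s}.R1.fq.gz" for s in SAMPLES],
    reads2=[f"sample{s}.R2.fq.gz" for s in SAMPLES],
    reads1out=[f"sample{s}.R1.fq.gz.out" for s in SAMPLES],
    reads2out=[f"sample{s}.R2.fq.gz.out" for s in SAMPLES],
)

# {sample} wildcard inside the rule: every placeholder holds one file,
# so there is one job (and one command) per sample.
per_sample_jobs = [
    render_shell(
        TEMPLATE,
        reads1=f"sample{s}.R1.fq.gz",
        reads2=f"sample{s}.R2.fq.gz",
        reads1out=f"sample{s}.R1.fq.gz.out",
        reads2out=f"sample{s}.R2.fq.gz.out",
    )
    for s in SAMPLES
]

print(single_job)
print(len(per_sample_jobs), "per-sample commands")
```

With the wildcard version of rule fastp, the dry run would be expected to list four fastp jobs plus the all job, instead of the single fastp job shown above.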