Snakemake with multiple inputs
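I'm starting to write a pipeline for my bioinformatics project, and I'm using Snakemake as the workflow engine. I've worked through all the tutorials on the official website and some of the documentation.
I want to run a shell command like this:
fastp -i input-1 -I input-2 -o output-1 -O output-2
My code in the Snakefile: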
SAMPLES = ['1', '2', '3', '4']

rule fastp:
    input:
        reads1=expand("sample{sample}.R1.fq.gz", sample=SAMPLES),
        reads2=expand("sample{sample}.R2.fq.gz", sample=SAMPLES)
    output:
        reads1out=expand("sample{sample}.R1.fq.gz.out", sample=SAMPLES),
        reads2out=expand("sample{sample}.R2.fq.gz.out", sample=SAMPLES)
    shell:
        "fastp -i {input.reads1} -I {input.reads2} -o {output.reads1out} -O {output.reads2out}"
But the program runs this single command:
fastp -i sample1.R1.fq.gz sample2.R1.fq.gz sample3.R1.fq.gz sample4.R1.fq.gz -I sample1.R2.fq.gz sample2.R2.fq.gz sample3.R2.fq.gz sample4.R2.fq.gz -o sample1.R1.fq.gz.out sample2.R1.fq.gz.out sample3.R1.fq.gz.out sample4.R1.fq.gz.out -O sample1.R2.fq.gz.out sample2.R2.fq.gz.out sample3.R2.fq.gz.out sample4.R2.fq.gz.out
How can I write the rule so that a separate shell command is executed for each sample? I tried putting for i in SAMPLES: after rule fastp:, but it didn't work, and I don't know what else to try. Sorry if this is too basic; I'm a Python beginner.
Thanks.
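You need to define the target output files in a rule all, and leave {sample} as a wildcard in rule fastp so that Snakemake runs the rule once per sample.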
SAMPLES = ['1', '2', '3', '4']

rule all:
    input:
        expand("sample{sample}.R{read_no}.fq.gz.out", sample=SAMPLES, read_no=['1', '2'])

rule fastp:
    input:
        reads1="sample{sample}.R1.fq.gz",
        reads2="sample{sample}.R2.fq.gz"
    output:
        reads1out="sample{sample}.R1.fq.gz.out",
        reads2out="sample{sample}.R2.fq.gz.out"
    shell:
        "fastp -i {input.reads1} -I {input.reads2} -o {output.reads1out} -O {output.reads2out}"
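To see why an expand()-based rule and a wildcard-based rule behave differently, here is a minimal Python sketch using only the standard library. expand_like is a hypothetical stand-in that mimics what Snakemake's expand() returns (a list with every combination filled in), and fastp_command is an illustrative helper; neither is part of Snakemake or fastp.

```python
from itertools import product

SAMPLES = ["1", "2", "3", "4"]


def expand_like(pattern, **wildcards):
    """Hypothetical stand-in for expand(): fill the pattern with every
    combination of the given values and return the results as a list."""
    names = list(wildcards)
    combos = product(*(wildcards[name] for name in names))
    return [pattern.format(**dict(zip(names, combo))) for combo in combos]


# A rule whose input is built with expand() receives the whole list at once.
# A list placed into a shell string is rendered space-separated, which yields
# one long command covering every sample.
reads1 = expand_like("sample{sample}.R1.fq.gz", sample=SAMPLES)
one_big_command = "fastp -i " + " ".join(reads1)


def fastp_command(sample):
    """Illustrative helper: the command a wildcard rule would run for ONE sample."""
    template = (
        "fastp -i sample{s}.R1.fq.gz -I sample{s}.R2.fq.gz "
        "-o sample{s}.R1.fq.gz.out -O sample{s}.R2.fq.gz.out"
    )
    return template.format(s=sample)


# With a wildcard in the rule, each sample gets its own job and its own command.
per_sample_commands = [fastp_command(s) for s in SAMPLES]

print(one_big_command)
for command in per_sample_commands:
    print(command)
```

The design point is that expand() belongs in the rule that lists the final targets (rule all), while the rule that does the work keeps the bare {sample} pattern.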
This is the output of snakemake -np with rule all added, as JeeYem suggested:
λ fastp/testdata master ✗ snakemake -np
rule fastp:
    input: sample1.R1.fq.gz, sample2.R1.fq.gz, sample3.R1.fq.gz, sample4.R1.fq.gz, sample1.R2.fq.gz, sample2.R2.fq.gz, sample3.R2.fq.gz, sample4.R2.fq.gz
    output: sample1.R1.fq.gz.out, sample2.R1.fq.gz.out, sample3.R1.fq.gz.out, sample4.R1.fq.gz.out, sample1.R2.fq.gz.out, sample2.R2.fq.gz.out, sample3.R2.fq.gz.out, sample4.R2.fq.gz.out
    jobid: 1

fastp -i sample1.R1.fq.gz sample2.R1.fq.gz sample3.R1.fq.gz sample4.R1.fq.gz -I sample1.R2.fq.gz sample2.R2.fq.gz sample3.R2.fq.gz sample4.R2.fq.gz -o sample1.R1.fq.gz.out sample2.R1.fq.gz.out sample3.R1.fq.gz.out sample4.R1.fq.gz.out -O sample1.R2.fq.gz.out sample2.R2.fq.gz.out sample3.R2.fq.gz.out sample4.R2.fq.gz.out

localrule all:
    input: sample1.R1.fq.gz.out, sample1.R2.fq.gz.out, sample2.R1.fq.gz.out, sample2.R2.fq.gz.out, sample3.R1.fq.gz.out, sample3.R2.fq.gz.out, sample4.R1.fq.gz.out, sample4.R2.fq.gz.out
    jobid: 0

Job counts:
    count   jobs
    1       all
    1       fastp
    2
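The dry run above schedules one fastp job that receives all eight files, which is the behaviour to expect when the rule's input and output are still built with expand(). With plain wildcard patterns, Snakemake instead matches each requested target against the rule's output patterns and creates one job per distinct wildcard value. The Python sketch below imitates that matching with regular expressions. pattern_to_regex and infer_jobs are hypothetical helpers written for illustration, not Snakemake functions, and the real resolver does considerably more.

```python
import re

# Output patterns of the wildcard version of rule fastp.
OUTPUT_PATTERNS = [
    "sample{sample}.R1.fq.gz.out",
    "sample{sample}.R2.fq.gz.out",
]

# Targets requested by rule all: every sample, both reads.
TARGETS = [f"sample{s}.R{r}.fq.gz.out" for s in "1234" for r in "12"]


def pattern_to_regex(pattern):
    """Turn 'sample{sample}.R1.fq.gz.out' into a regex with one named group
    per wildcard. Assumes each wildcard name appears only once per pattern."""
    # re.split with a capturing group alternates: literal, name, literal, ...
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            regex += re.escape(part)       # literal text, matched exactly
        else:
            regex += f"(?P<{part}>.+)"     # wildcard, matches any non-empty text
    return re.compile(regex)


def infer_jobs(targets, patterns):
    """Return the distinct 'sample' values needed to produce the targets.
    One value corresponds to one job, since a job writes both output files."""
    samples = set()
    for target in targets:
        for pattern in patterns:
            match = pattern_to_regex(pattern).fullmatch(target)
            if match:
                samples.add(match.group("sample"))
                break
    return sorted(samples)


fastp_jobs = infer_jobs(TARGETS, OUTPUT_PATTERNS)
print(f"{len(fastp_jobs)} fastp jobs, one per sample: {fastp_jobs}")
```

Under these assumptions, eight target files collapse into four jobs, so a dry run of the wildcard version would be expected to list four fastp jobs plus the single all job.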