Commit d7b9c15

Merge pull request #152 from assemblerflow/recipes
Added recipes system
2 parents ab3d5fa + 5afcd32 commit d7b9c15

24 files changed

+866
-51
lines changed

changelog.md

Lines changed: 7 additions & 0 deletions
@@ -2,6 +2,13 @@
 
 ## Changes in upcoming release (`dev` branch)
 
+### New features
+
+- Added new `recipe` system to flowcraft along with 6 starting recipes.
+Recipes are pre-made and curated pipelines that address specific questions.
+To build a pipeline from a recipe, use the `-r <recipe_name>` option. To list available
+recipes, the `--recipe-list` and `--recipe-list-short` options were added.
+
 ### Components changes
 
 - Added new `disableRR` param in the `spades` component that disables repeat

docs/dev/create_recipe.rst

Lines changed: 116 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,116 @@
1+
Recipe creation guidelines
2+
===========================
3+
4+
Recipes are pre-made pipeline strings that may be associated with specific
5+
parameters and directives and are used to rapidly build a certain type of
6+
pipeline.
7+
8+
Instead of building a pipeline like::
9+
10+
-t "integrity_coverage fastqc_trimmomatic fastqc spades pilon"
11+
12+
The user simply can specific a recipe with that pipeline::
13+
14+
-r assembly
15+
16+
Recipe creation
17+
---------------
18+
19+
The creation of new recipes is a very simple and straightforward process.
20+
You need to create a new file in the ``flowcraft/generator/recipes`` folder
21+
with any name and create a basic class with three attributes::
22+
23+
try:
24+
from generator.recipe import Recipe
25+
except ImportError:
26+
from flowcraft.generator.recipe import Recipe
27+
28+
29+
class Innuca(Recipe):
30+
31+
def __init__(self):
32+
super().__init__()
33+
34+
# Recipe name
35+
self.name = "innuca"
36+
37+
# Recipe pipeline
38+
self.pipeline_str = <pipeline string>
39+
40+
# Recipe parameters and directives
41+
self.directives = { <directives> }
42+
43+
And that's it! Now there is a new recipe available with the ``innuca`` name and
44+
we can build this pipeline using the option ``-r innuca``.
45+
46+
Name
47+
^^^^
48+
49+
This is the name of the recipe, which is used to make a match with the recipe
50+
name provided by the user via the ``-r`` option.
51+
52+
Pipeline_str
53+
^^^^^^^^^^^^
54+
55+
The pipeline string as if provided via the ``-t`` option.
56+
57+
Directives
58+
^^^^^^^^^^
59+
60+
A dictionary containing the parameters and directives for each process in the
61+
pipeline string. **Setting this attribute is optional and components
62+
that are not specified here will assume their default values**. In general, each
63+
element in this dictionary should have the following format::
64+
65+
self.directives = {
66+
"component_name": {
67+
"params": {
68+
"paramA": "value"
69+
},
70+
"directives": {
71+
"directiveA": "value"
72+
}
73+
}
74+
}
75+
76+
This will set the provided parameters and directives to the component, but it is
77+
possible to provide only one.
78+
79+
A more concrete example of a real component and directives follows::
80+
81+
self.pipeline_str = "integrity_coverage fastqc"
82+
83+
# Set parameters and directives only for integrity_coverage
84+
# and leave fastqc with the defaults
85+
self.directives = {
86+
"integrity_coverage": {
87+
"params": {
88+
"minCoverage": 20
89+
},
90+
"directives": {
91+
"memory": "1GB"
92+
}
93+
}
94+
}
95+
96+
Duplicate components
97+
~~~~~~~~~~~~~~~~~~~~
98+
99+
In some cases, the same component may be present multiple times in the pipeline
100+
string of a recipe. In these cases, directives can be assigned to each individual
101+
component by adding a ``#<id>`` suffix to the component::
102+
103+
self.pipeline_str = "integrity_coverage ( trimmomatic spades#1 | spades#2)"
104+
105+
self.directives = {
106+
"spades#1": {
107+
"directives": {
108+
"memory": "10GB"
109+
}
110+
},
111+
"spades#2": {
112+
"directives": {
113+
"version": "3.7.0"
114+
}
115+
}
116+
}
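As a rough illustration of the guidelines in `docs/dev/create_recipe.rst`, the sketch below puts the three attributes together in one runnable file. It makes several assumptions: the `Recipe` class here is a minimal stand-in (the real base class lives in `flowcraft.generator.recipe` and may do more), and the `QcAssembly` recipe and the `settings_for` helper are hypothetical names introduced only for this example.

```python
class Recipe:
    """Stand-in for flowcraft.generator.recipe.Recipe (assumed minimal shape)."""

    def __init__(self):
        self.name = None
        self.pipeline_str = None
        self.directives = {}


class QcAssembly(Recipe):
    """Hypothetical recipe: one QC step, then two differently tuned assemblies."""

    def __init__(self):
        super().__init__()

        # Recipe name, matched against the value given to -r
        self.name = "qc_assembly"

        # Pipeline string, written as it would be passed to -t
        self.pipeline_str = "integrity_coverage ( trimmomatic spades#1 | spades#2)"

        # Only some components are configured; the rest keep their defaults
        self.directives = {
            "integrity_coverage": {"params": {"minCoverage": 20}},
            "spades#1": {"directives": {"memory": "10GB"}},
            "spades#2": {"directives": {"version": "3.7.0"}},
        }


def settings_for(recipe, component):
    """Return (params, directives) for a component; empty dicts mean defaults."""
    entry = recipe.directives.get(component, {})
    return entry.get("params", {}), entry.get("directives", {})


recipe = QcAssembly()
print(settings_for(recipe, "integrity_coverage"))
print(settings_for(recipe, "spades#2"))
print(settings_for(recipe, "trimmomatic"))
```

The helper only shows how the `params` and `directives` keys can be read independently, which is why a component may define one, both, or neither.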

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ A NextFlow pipeline assembler for genomics.
 dev/general_orientation
 dev/create_process
 dev/create_template
+dev/create_recipe
 dev/containers
 dev/process_dotfiles
 dev/pipeline_reporting

docs/user/pipeline_reports.rst

Lines changed: 2 additions & 0 deletions
@@ -9,6 +9,7 @@ Pipeline reports
 .. include:: reports/fastqc.rst
 .. include:: reports/fastqc_trimmomatic.rst
 .. include:: reports/integrity_coverage.rst
+.. include:: reports/mash_dist.rst
 .. include:: reports/mlst.rst
 .. include:: reports/patho_typing.rst
 .. include:: reports/pilon.rst
@@ -18,6 +19,7 @@ Pipeline reports
 .. include:: reports/process_spades.rst
 .. include:: reports/process_viral_assembly.rst
 .. include:: reports/seq_typing.rst
+.. include:: reports/sistr.rst
 .. include:: reports/trimmomatic.rst
 .. include:: reports/true_coverage.rst
 

docs/user/reports/mash_dist.rst

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
mash_dist
2+
---------
3+
4+
Table data
5+
^^^^^^^^^^
6+
7+
Plasmids table:
8+
- **Mash Dist**: Number of plasmid hits
9+
10+
.. image:: ../resources/reports/mash_dist_table.png
11+
:align: center
12+
13+
Plot data
14+
^^^^^^^^^
15+
16+
- **Sliding window Plasmid annotation**: Provides annotation of plasmid
17+
hits along the genome assembly. This report component is only available
18+
when the ``mash_dist`` component is used.
19+
20+
.. image:: ../resources/reports/sliding_window_mash_dist.png

docs/user/reports/sistr.rst

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
sistr
2+
-----
3+
4+
Table data
5+
^^^^^^^^^^
6+
7+
Typing table:
8+
- **sistr**: The sequence typing result.
9+
10+
.. image:: ../resources/reports/typing_table.png
11+
:align: center

flowcraft/bin/parse_fasta.py

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
#!/usr/bin/env python3
2+
3+
4+
import argparse
5+
from itertools import groupby
6+
import os
7+
8+
9+
def replace_char(text):
10+
for ch in ['/', '`', '*', '{', '}', '[', ']', '(', ')', '#', '+', '-', '.', '!', '$', ':']:
11+
text = text.replace(ch, "_")
12+
return text
13+
14+
def getSequence(ref, fasta):
15+
16+
entry = (x[1] for x in groupby(fasta, lambda line: line[0] == ">"))
17+
18+
for header in entry:
19+
headerStr = header.__next__()[1:].strip()
20+
seq = "".join(s.strip() for s in entry.__next__())
21+
22+
if ref == headerStr.replace('>',''):
23+
filename = os.path.join(os.getcwd(), ref.replace('/','_').split('|')[0])
24+
fasta_header = replace_char(headerStr)
25+
output_file = open(filename + '.fa', "w")
26+
output_file.write(">" + fasta_header + "\n" + seq.upper() + "\n")
27+
output_file.close()
28+
29+
def main():
30+
31+
parser = argparse.ArgumentParser(prog='parse_fasta.py', description="Parse FASTA files for a specific header", formatter_class=argparse.ArgumentDefaultsHelpFormatter)
32+
parser.add_argument('--version', help='Version information', action='version', version=str('%(prog)s v0.1'))
33+
34+
parser_required = parser.add_argument_group('Required options')
35+
parser_required.add_argument('-t', type=str, metavar='header of sequence to be retrieved',
36+
help='Uncompressed fastq file containing mate 1 reads', required=True)
37+
parser_required.add_argument('-f', type=argparse.FileType('r'), metavar='/path/to/input/file.fasta',
38+
help='Fasta with the sequences', required=True)
39+
40+
args = parser.parse_args()
41+
42+
getSequence(args.t, args.f)
43+
44+
45+
46+
if __name__ == "__main__":
47+
main()
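The record-splitting idea used by `getSequence` in `flowcraft/bin/parse_fasta.py` can be sketched in isolation. The snippet below is not the committed code: `iter_fasta` is a hypothetical helper name, the input is an in-memory example, and it assumes every header line is followed by at least one sequence line (two consecutive headers would be merged into one group by `groupby`).

```python
from io import StringIO
from itertools import groupby


def iter_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle."""
    # groupby alternates between runs of header lines and runs of sequence lines
    groups = (group for _, group in groupby(handle, lambda line: line.startswith(">")))
    for header_group in groups:
        # Drop the leading ">" and surrounding whitespace from the header
        header = next(header_group)[1:].strip()
        # The next group holds the sequence lines that belong to this header
        sequence = "".join(line.strip() for line in next(groups))
        yield header, sequence


example = StringIO(">seq1 plasmid\nacgt\nacgt\n>seq2\nttga\n")
records = dict(iter_fasta(example))
print(records)
```

Consuming the header line before advancing to the next group matters here, because `groupby` invalidates a group once the underlying iterator moves on.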

flowcraft/flowcraft.py

Lines changed: 27 additions & 10 deletions
@@ -16,7 +16,7 @@
     from generator.engine import NextflowGenerator, process_map
     from generator.inspect import NextflowInspector
     from generator.report import FlowcraftReport
-    from generator.recipe import brew_recipe, available_recipes
+    from generator.recipe import brew_innuendo, brew_recipe, list_recipes
     from generator.pipeline_parser import parse_pipeline, SanityError
     from generator.process_details import proc_collector, colored_print
     import generator.error_handling as eh
@@ -25,7 +25,8 @@
     from flowcraft.generator.engine import NextflowGenerator, process_map
     from flowcraft.generator.inspect import NextflowInspector
     from flowcraft.generator.report import FlowcraftReport
-    from flowcraft.generator.recipe import brew_recipe, available_recipes
+    from flowcraft.generator.recipe import brew_innuendo, \
+        brew_recipe, list_recipes
     from flowcraft.generator.pipeline_parser import parse_pipeline, \
         SanityError
     from flowcraft.generator.process_details import proc_collector, \
@@ -77,13 +78,22 @@ def get_args(args=None):
         const=True, help="Check only the validity of the pipeline "
                          "string and exit.")
     group_lists.add_argument(
-        "-L", "--detailed-list", action="store_const", dest="detailed_list",
+        "-L", "--component-list", action="store_const", dest="detailed_list",
         const=True, help="Print a detailed description for all the "
-                         "currently available processes")
+                         "currently available processes.")
     group_lists.add_argument(
-        "-l", "--short-list", action="store_const", dest="short_list",
+        "-l", "--component-list-short", action="store_const", dest="short_list",
         const=True, help="Print a short list of the currently "
-                         "available processes")
+                         "available processes.")
+    group_lists.add_argument(
+        "--recipe-list", dest="recipe_list", action="store_const", const=True,
+        help="Print a short list of the currently available recipes."
+    )
+    group_lists.add_argument(
+        "--recipe-list-short", dest="recipe_list_short", action="store_const",
+        const=True, help="Print a condensed list of the currently available "
+                         "recipes"
+    )
     build_parser.add_argument(
         "-cr", "--check-recipe", dest="check_recipe",
         action="store_const", const=True,
@@ -277,6 +287,12 @@ def build(args):
     if args.export_params or args.export_directives:
         logger.setLevel(logging.ERROR)
 
+    if args.recipe_list_short:
+        list_recipes()
+
+    if args.recipe_list:
+        list_recipes(full=True)
+
     welcome = [
         "========= F L O W C R A F T =========",
         "Build mode\n"
@@ -293,13 +309,14 @@ def build(args):
     # appropriate recipe
     if args.recipe:
         if args.recipe == "innuendo":
-            pipeline_string = brew_recipe(args, available_recipes)
+            pipeline_string = brew_innuendo(args)
         else:
-            pipeline_string = available_recipes[args.recipe]
+            # pipeline_string = available_recipes[args.recipe]
+            pipeline_string = brew_recipe(args.recipe)
         if args.tasks:
             logger.warning(colored_print(
-                "-t parameter will be ignored for recipe: {}\n"
-                .format(args.recipe), "yellow_bold")
+                "-t parameter will be ignored for recipe: {}\n".format(
+                    args.recipe), "yellow_bold")
             )
 
     if args.check_recipe:
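The two listing flags added in `flowcraft/flowcraft.py` rely on `argparse`'s `store_const` action, which leaves the destination as `None` when the flag is absent. The sketch below shows that behaviour on its own; `build_list_parser` and `requested_listing` are hypothetical names, and the real `build` function calls `list_recipes()` rather than returning a label.

```python
import argparse


def build_list_parser():
    """Create a parser with only the two recipe listing flags."""
    parser = argparse.ArgumentParser(prog="example")
    group = parser.add_argument_group("list options")
    group.add_argument("--recipe-list", dest="recipe_list",
                       action="store_const", const=True)
    group.add_argument("--recipe-list-short", dest="recipe_list_short",
                       action="store_const", const=True)
    return parser


def requested_listing(args):
    """Return 'short', 'full', or None depending on the parsed flags."""
    # Absent store_const flags default to None, which is falsy
    if args.recipe_list_short:
        return "short"
    if args.recipe_list:
        return "full"
    return None


parser = build_list_parser()
print(requested_listing(parser.parse_args(["--recipe-list"])))
print(requested_listing(parser.parse_args(["--recipe-list-short"])))
print(requested_listing(parser.parse_args([])))
```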
