Commit 38a0fbf

Merge pull request #169 from assemblerflow/dev
Version 1.4.0 release
2 parents 2173530 + 8c7ccde commit 38a0fbf

73 files changed (+2161, -286 lines changed)

changelog.md

Lines changed: 49 additions & 0 deletions
@@ -2,6 +2,54 @@
 
 ## Changes in upcoming release (`dev` branch)
 
+## 1.4.0
+
+### New features
+
+- Added new `recipe` system to flowcraft along with 6 starting recipes.
+Recipes are pre-made and curated pipelines that address specific questions.
+To build a pipeline from a recipe, the `-r <recipe_name>` option can be used. To list available
+recipes, the `--recipe-list` and `--recipe-list-short` options were added.
+- Added `-ft` or `--fetch-tags`, which retrieves all DockerHub
+container tags.
+- Added function to collect all the components from the components classes,
+replacing the current process_map dictionary implementation. Now, it will be
+generated from the engine rather than hardcoded into the dict.
+
+### Components changes
+
+- Added new `disableRR` param in the `spades` component that disables repeat
+resolution.
+- The `abyss` and `spades` components emit GFA in a secondary channel.
+- The new `bandage` component can accept either FASTA from a primary channel
+or GFA from a secondary channel.
+- Updated skesa to version 2.3.0.
+- Updated mash based components for the latest version - 1.6.0-1.
+
+### New components
+
+- Added component `abyss`.
+- Added component `bandage`.
+- Added component `unicycler`.
+- Added component `prokka`.
+- Added component `bcalm`.
+- Added component `diamond`.
+
+### Minor/Other changes
+
+- Added removal of duplicate IDs from `reads_download` component input.
+- Added seed parameter to `downsample_fastq` component.
+- Added bacmet database to `abricate` component.
+- Added default docker option to avoid docker permission errors.
+- Changed the default URL generated by inspect and report commands.
+- Added directives to `-L` parameter of build module.
+
+### Bug fixes
+
+- Fixed forks with same source process name.
+- Fixed `inspect` issue when tasks took more than a day in duration.
+- Added hardware address to `inspect` and `report` hash.
+
 ## 1.3.1
 
 ### Features
@@ -16,6 +64,7 @@ which is particularly useful in very large workflows.
 `mapping_patlas`.
 
 ### New components
+
 - Added component `fast_ani`.
 
 ### Minor/Other changes

docs/dev/create_process.rst

Lines changed: 11 additions & 15 deletions
@@ -14,10 +14,6 @@ The addition of a new process to FlowCraft requires three main steps:
    information about the process (e.g., expected input/output, secondary inputs,
    etc.).
 
-#. `Add to available processes`_: Add the :class:`~flowcraft.generator.process` class to the
-   dictionary of available process in
-   :attr:`flowcraft.generator.engine.process_map`.
-
 .. _create-process:
 
 Create process template
@@ -211,20 +207,20 @@ Depending on the process, other attributes may be required:
 - `Directives`_: Default information for RAM/CPU/Container directives
   and more.
 
-Add to available processes
+Add to available components
 ::::::::::::::::::::::::::
 
-The final step is to add your new process to the list of available processes.
-This list is defined in :attr:`flowcraft.generator.engine.process_map`
-module, which is a dictionary
-mapping the process template name to the corresponding template class::
-
-    process_map = {
-        <other_process>
-        "my_process_template": process.MyProcess
-    }
+Contrary to the previous implementation (version <= 1.3.1), the available components
+are now retrieved automatically by FlowCraft and there is no need to add the
+process to any dictionary (previous ``process_map``). In order for the component
+to be accessible to ``flowcraft build``, the process template name in
+``snake_case`` must match the process class in ``CamelCase``. For instance,
+if the process template is named ``my_process.nf``, the process class must
+be ``MyProcess``; FlowCraft will then be able to automatically add it to the
+list of available components.
 
-Note that the template string does not include the ``.nf`` extension.
+.. note::
+    Note that the template string does not include the ``.nf`` extension.
 
 Process attributes
 ------------------
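
The naming rule described in the ``create_process.rst`` change can be illustrated with a short sketch. The helpers ``template_to_class_name`` and ``find_component`` are hypothetical names used only for illustration and are not part of FlowCraft; the sketch assumes the match is made by converting the ``snake_case`` template name to ``CamelCase``, which may differ from how the engine actually collects components.

```python
def template_to_class_name(template_name):
    """Convert a snake_case template name to the expected CamelCase class name."""
    # The template string does not include the ".nf" extension, so drop it if given.
    if template_name.endswith(".nf"):
        template_name = template_name[:-3]
    # "my_process" -> ["my", "process"] -> "MyProcess"
    return "".join(part.capitalize() for part in template_name.split("_") if part)


def find_component(template_name, classes):
    """Return the class whose name matches the template name, or None."""
    wanted = template_to_class_name(template_name)
    for cls in classes:
        if cls.__name__ == wanted:
            return cls
    return None


class MyProcess:
    """Placeholder standing in for a real process class."""
```

With this sketch, ``find_component("my_process", [MyProcess])`` returns ``MyProcess``, while a template name with no matching class returns ``None``.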

docs/dev/create_recipe.rst

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
+Recipe creation guidelines
+===========================
+
+Recipes are pre-made pipeline strings that may be associated with specific
+parameters and directives and are used to rapidly build a certain type of
+pipeline.
+
+Instead of building a pipeline like::
+
+    -t "integrity_coverage fastqc_trimmomatic fastqc spades pilon"
+
+The user can simply specify a recipe with that pipeline::
+
+    -r assembly
+
+Recipe creation
+---------------
+
+The creation of new recipes is a very simple and straightforward process.
+You need to create a new file in the ``flowcraft/generator/recipes`` folder
+with any name and create a basic class with three attributes::
+
+    try:
+        from generator.recipe import Recipe
+    except ImportError:
+        from flowcraft.generator.recipe import Recipe
+
+
+    class Innuca(Recipe):
+
+        def __init__(self):
+            super().__init__()
+
+            # Recipe name
+            self.name = "innuca"
+
+            # Recipe pipeline
+            self.pipeline_str = <pipeline string>
+
+            # Recipe parameters and directives
+            self.directives = { <directives> }
+
+And that's it! Now there is a new recipe available with the ``innuca`` name and
+we can build this pipeline using the option ``-r innuca``.
+
+Name
+^^^^
+
+This is the name of the recipe, which is used to make a match with the recipe
+name provided by the user via the ``-r`` option.
+
+Pipeline_str
+^^^^^^^^^^^^
+
+The pipeline string as if provided via the ``-t`` option.
+
+Directives
+^^^^^^^^^^
+
+A dictionary containing the parameters and directives for each process in the
+pipeline string. **Setting this attribute is optional and components
+that are not specified here will assume their default values**. In general, each
+element in this dictionary should have the following format::
+
+    self.directives = {
+        "component_name": {
+            "params": {
+                "paramA": "value"
+            },
+            "directives": {
+                "directiveA": "value"
+            }
+        }
+    }
+
+This will set the provided parameters and directives to the component, but it is
+possible to provide only one of the two.
+
+A more concrete example of a real component and directives follows::
+
+    self.pipeline_str = "integrity_coverage fastqc"
+
+    # Set parameters and directives only for integrity_coverage
+    # and leave fastqc with the defaults
+    self.directives = {
+        "integrity_coverage": {
+            "params": {
+                "minCoverage": 20
+            },
+            "directives": {
+                "memory": "1GB"
+            }
+        }
+    }
+
+Duplicate components
+~~~~~~~~~~~~~~~~~~~~
+
+In some cases, the same component may be present multiple times in the pipeline
+string of a recipe. In these cases, directives can be assigned to each individual
+component by adding a ``#<id>`` suffix to the component::
+
+    self.pipeline_str = "integrity_coverage ( trimmomatic spades#1 | spades#2)"
+
+    self.directives = {
+        "spades#1": {
+            "directives": {
+                "memory": "10GB"
+            }
+        },
+        "spades#2": {
+            "directives": {
+                "version": "3.7.0"
+            }
+        }
+    }
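
The pieces of ``create_recipe.rst`` can be put together in one sketch of a complete recipe. The recipe name ``qc_assembly``, the class ``QcAssembly``, and the helper ``unknown_directive_keys`` are hypothetical and exist only for illustration. The stand-in ``Recipe`` class is an assumption so that the sketch runs without FlowCraft installed; the real base class may define more than these three attributes.

```python
import re

try:
    from flowcraft.generator.recipe import Recipe
except ImportError:
    # Stand-in base class (assumption) so the sketch runs without FlowCraft.
    class Recipe:
        def __init__(self):
            self.name = None
            self.pipeline_str = None
            self.directives = {}


class QcAssembly(Recipe):
    """Hypothetical recipe combining the examples from the guidelines."""

    def __init__(self):
        super().__init__()

        # Recipe name, matched against the value given to -r
        self.name = "qc_assembly"

        # Pipeline string, as if provided via -t
        self.pipeline_str = "integrity_coverage ( trimmomatic spades#1 | spades#2)"

        # Optional parameters and directives; unlisted components keep defaults
        self.directives = {
            "integrity_coverage": {"params": {"minCoverage": 20}},
            "spades#1": {"directives": {"memory": "10GB"}},
            "spades#2": {"directives": {"version": "3.7.0"}},
        }


def unknown_directive_keys(recipe):
    """Return directive keys that do not name a component in the pipeline string.

    This check is not part of FlowCraft; it is a simple sanity test that
    treats every run of letters, digits, "_" and "#" as a component name.
    """
    components = set(re.findall(r"[A-Za-z0-9_#]+", recipe.pipeline_str))
    return sorted(set(recipe.directives) - components)
```

A check such as ``unknown_directive_keys(QcAssembly())`` returning an empty list is a quick way to catch a mistyped ``#<id>`` suffix before building the pipeline.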

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ A NextFlow pipeline assembler for genomics.
    dev/general_orientation
    dev/create_process
    dev/create_template
+   dev/create_recipe
    dev/containers
    dev/process_dotfiles
    dev/pipeline_reporting

docs/user/basic_usage.rst

Lines changed: 1 addition & 1 deletion
@@ -393,7 +393,7 @@ Local visualization
 :::::::::::::::::::
 
 The FlowCraft report JSON file can also be visualized locally by drag and dropping
-it into the FlowCraft web application page, currently hosted at http://192.92.149.169/reports
+it into the FlowCraft web application page, currently hosted at http://www.flowcraft.live/reports
 
 Offline visualization
 :::::::::::::::::::::

docs/user/components/diamond.rst

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+diamond
+=======
+
+Purpose
+-------
+
+This component performs ``blastx`` or ``blastp`` with diamond. The database
+used by diamond can be provided from the local disk or generated in the process.
+This component uses the same output type as abricate with the same blast output
+fields.
+
+.. note::
+    Software page: https://github.com/bbuchfink/diamond
+
+
+Input/Output type
+-----------------
+
+- Input type: ``Fasta``
+- Output type: None
+
+.. note::
+    The default input parameter for fasta data is ``--fasta``.
+
+Parameters
+----------
+
+- ``pathToDb``: Provide full path for the diamond database. If none is provided,
+  the component will try to fetch it from the previous process. Default: None
+
+- ``fastaToDb``: Provide the full path for the fasta to construct a diamond
+  database. Default: None
+
+- ``blastType``: Defines the type of blast that diamond will do. Can either be
+  blastx or blastp. Default: blastx
+
+Published results
+-----------------
+
+- ``results/annotation/diamond*``: Stores the results of the diamond search
+  for each sample and for each specified database.
+
+Published reports
+-----------------
+
+None.
+
+Default directives
+------------------
+
+- ``diamond``:
+    - ``container``: flowcraft/diamond
+    - ``version``: 0.9.22-1
+    - ``memory``: { 4.GB * task.attempt }
+    - ``cpus``: 2

docs/user/components/downsample_fastq.rst

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ Parameters
 - ``genomeSize``: Genome size estimate for the samples. It is used to
   estimate the coverage.
 - ``depth``: The target depth to which the reads should be subsampled.
+- ``seed``: The seed number for seqtk. By default it is 100.
 
 Published results
 -----------------
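
How the three ``downsample_fastq`` parameters could interact is sketched below. This is not the component's actual template: the function names are hypothetical, ``genome_size`` is assumed to be given in bases (the component may use another unit), and the command line relies only on the ``seqtk sample -s<seed> <in.fq> <fraction>`` form.

```python
def subsample_fraction(total_bases, genome_size, depth):
    """Fraction of reads to keep so that the estimated coverage drops to `depth`."""
    if total_bases <= 0 or genome_size <= 0:
        raise ValueError("total_bases and genome_size must be positive")
    # Estimated coverage of the full read set
    coverage = total_bases / genome_size
    # Never ask for more reads than are available
    return min(1.0, depth / coverage)


def seqtk_command(fastq, fraction, seed=100):
    """Build a `seqtk sample` command line; the default seed mirrors the docs."""
    return ["seqtk", "sample", "-s{}".format(seed), fastq, "{:.4f}".format(fraction)]
```

Using the same seed for both files of a paired-end sample keeps the read pairs in sync, which is one reason a fixed, configurable seed is useful.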

docs/user/components/mash_dist.rst

Lines changed: 4 additions & 1 deletion
@@ -44,7 +44,10 @@ Parameters
 
 - ``refFile``: Specifies the reference file to be provided to mash. It can either
   be a fasta or a .msh reference sketch generated by mash.
-  Default: '/ngstools/data/patlas.msh'.
+  Default: '/ngstools/data/patlas.msh'. If the component ``mash_sketch_fasta``
+  is executed before this component, this parameter will be ignored and instead
+  the secondary link between the two processes will be used to feed this
+  component with the reference sketch.
 
 
 Published results
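
The precedence rule added to ``refFile`` can be summarized in a few lines. The function ``resolve_reference`` and its arguments are hypothetical; in FlowCraft the choice is made through the secondary link between the two processes, not by a Python function.

```python
DEFAULT_REF_FILE = "/ngstools/data/patlas.msh"


def resolve_reference(ref_file=DEFAULT_REF_FILE, upstream_sketch=None):
    """Pick the reference for mash: an upstream sketch overrides refFile."""
    if upstream_sketch is not None:
        # A sketch produced by a preceding sketch component takes priority
        return upstream_sketch
    return ref_file
```

In other words, ``refFile`` only matters when no sketch component runs before ``mash_dist``.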

docs/user/components/mash_screen.rst

Lines changed: 7 additions & 4 deletions
@@ -1,11 +1,11 @@
 mash_screen
-==============
+===========
 
 Purpose
 -------
 
 This component performes mash screen to find plasmids
-contained in high throughoput sequencing data, using as inputs read files
+contained in high throughput sequencing data, using as inputs read files
 (FastQ files). Then, the resulting file can
 be imported into `pATLAS <http://www.patlas.site/>`_.
 This component searches for containment of a given sequence in read sequencing
@@ -38,8 +38,11 @@ Parameters
   reference sequence. Default: 0.9.
 
 - ``refFile``: "Specifies the reference file to be provided to mash. It can
-  either be a fasta or a .msh reference sketch generated by mash.
-  Default: '/ngstools/data/patlas.msh'.
+  either be a fastq or a .msh reference sketch generated by mash.
+  Default: '/ngstools/data/patlas.msh'. If the component ``mash_sketch_fastq``
+  is executed before this component, this parameter will be ignored and instead
+  the secondary link between the two processes will be used to feed this
+  component with the reference sketch.
 
 
 Published results

docs/user/components/mash_sketch_fasta.rst

Lines changed: 1 addition & 1 deletion
@@ -44,4 +44,4 @@ Default directives
 
 - ``mashSketchFasta``:
     - ``container``: flowcraft/mash-patlas
-    - ``version``: 1.4.1
+    - ``version``: 1.6.0-1

docs/user/components/mash_sketch_fastq.rst

Lines changed: 4 additions & 2 deletions
@@ -4,7 +4,9 @@ mash_sketch_fastq
 Purpose
 -------
 
-This component performs mash sketch for fastq input files.
+This component performs mash sketch for fastq input files. These sketches can
+be used by ``mash_dist`` and ``mash_screen`` components to fetch the
+reference file for mash.
 
 .. note::
     - MASH documentation can be found `here <https://mash.readthedocs.io/en/latest/>`_.
@@ -52,4 +54,4 @@ Default directives
 
 - ``mashSketchFastq``:
     - ``container``: flowcraft/mash-patlas
-    - ``version``: 1.4.1
+    - ``version``: 1.6.0-1
