-**nfcore/ampliseq** is a bioinformatics analysis pipeline used for amplicon sequencing data, supporting 16S, ITS and 18S data. Supported is paired-end Illumina or single-end Illumina, PacBio and IonTorrent data.
+**nfcore/ampliseq** is a bioinformatics analysis pipeline used for amplicon sequencing, supporting denoising of any amplicon and, currently, taxonomic assignment of 16S, ITS and 18S amplicons. Paired-end Illumina data and single-end Illumina, PacBio and IonTorrent data are supported. By default, 16S rRNA gene amplicons sequenced paired-end with Illumina are analysed.

 The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with Docker containers, making installation trivial and results highly reproducible.
@@ -41,6 +41,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool
 4. Start running your own analysis!

 ```bash
+#16S rRNA gene amplicon analysis of Illumina paired-end data
 * [Read count report](#Read-count-report) - Report of read counts during various steps of the pipeline
 * [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution
@@ -76,7 +77,7 @@ DADA2 computes an error model on the sequencing reads (forward and reverse indep
 DADA2 reduces sequence errors and dereplicates sequences by quality filtering, denoising, read pair merging (for paired end Illumina reads only) and PCR chimera removal.

-Additionally, DADA2 taxonomically classifies the ASVs using pre-trained databases.
+Additionally, DADA2 taxonomically classifies the ASVs using a choice of supplied databases (specified with `--dada_ref_taxonomy`).

 **Output files:**
@@ -88,7 +89,7 @@ Additionally, DADA2 taxonomically classifies the ASVs using pre-trained database
 * `DADA2_stats.tsv`: Tracking read numbers through DADA2 processing steps, for each sample.
 * `DADA2_table.rds`: DADA2 ASV table as R object.
 * `DADA2_tables.tsv`: DADA2 ASV table.
-* `dada2/args/`: Directory containing all parameters for DADA2 steps.
+* `dada2/args/`: Directory containing files with all parameters for DADA2 steps.
 * `dada2/log/`: Directory containing log files for DADA2 steps.
 * `dada2/QC/`
     * `*.err.convergence.txt`: Convergence values for DADA2's dada command; these should decrease over several orders of magnitude and approach 0.
@@ -111,11 +112,11 @@ Optionally, the ITS region can be extracted from each ASV sequence using ITSx, a
 **Quantitative Insights Into Microbial Ecology 2** ([QIIME2](https://qiime2.org/)) is a next-generation microbiome bioinformatics platform and the successor of the widely used [QIIME1](https://www.nature.com/articles/nmeth.f.303).

-ASV sequencesand counts as produced before with DADA2 are imported into QIIME2 and further analysed. First, ASVs are taxonomically classified, than filtered (`--exclude_taxa`, `--min_frequency`, `--min_samples`), and abundance tables exported. Following, diversity indices are calculated and testing for differential abundant features between sample groups is performed.
+ASV sequences, counts, and taxonomic classification as produced before with DADA2 are imported into QIIME2 and further analysed. Optionally, ASVs can also be taxonomically classified with QIIME2 against a database chosen with `--qiime_ref_taxonomy` (but DADA2 taxonomic classification takes precedence). Next, ASVs are filtered (`--exclude_taxa`, `--min_frequency`, `--min_samples`), and abundance tables are exported. Afterwards, diversity indices are calculated and tests for differentially abundant features between sample groups are performed.

 #### Taxonomic classification

-ASV abundance and sequences inferred in DADA2 are informative but routinely taxonomic classifications such as family or genus annotation is desireable.
+Taxonomic classifications from QIIME2 are typically similar to those from DADA2, and both options are available. When classification is performed with both DADA2 and QIIME2, the DADA2 classification takes precedence over the QIIME2 classification in all downstream analyses.

 **Output files:**
@@ -160,7 +161,7 @@ Absolute abundance tables produced by the previous steps contain count data, but
 * `rel-table-6.tsv`: Tab-separated relative abundance table at genus level.
 * `rel-table-7.tsv`: Tab-separated relative abundance table at species level.
 * `rel-table-ASV.tsv`: Tab-separated relative abundance table for all ASVs.
-* `qiime2_ASV_table.tsv`: Tab-separated table for all ASVs with taxonomic classification, sequence and relative abundance.
+* `qiime2_ASV_table.tsv`: Tab-separated table for all ASVs with taxonomic classification, sequence and relative abundance. *NOTE: This file is based on QIIME2 taxonomic classifications, contrary to all other files that are based on DADA2 classification, if available.*

 #### Barplot
@@ -180,24 +181,31 @@ Produces rarefaction plots for several alpha diversity indices, and is primarily
 * `qiime2/alpha-rarefaction/`
     * `index.html`: Interactive alpha rarefaction curve for taxa abundance per sample that can be viewed in your web browser.

-#### Alpha diversity indices
+#### Diversity analysis

-Alpha diversity measures the species diversity within samples. Diversity calculations are based on sub-sampled data rarefied to the minimum read count of all samples. This step calculates alpha diversity using various methods and performs pairwise comparisons of groups of samples. It is based on a phylogenetic tree of all ASV sequences.
+Diversity measures summarize important sample features (alpha diversity) or differences between samples (beta diversity). To do so, sample data is first rarefied to the minimum number of counts per sample. Also, a phylogenetic tree of all ASVs is computed to provide phylogenetic information.

 **Output files:**

+* `qiime2/diversity/`
+    * `Use the sampling depth of * for rarefaction.txt`: File that reports the rarefaction depth in the file name and file content.
 * `qiime2/phylogenetic_tree/`
     * `tree.nwk`: Phylogenetic tree in newick format.
     * `rooted-tree.qza`: Phylogenetic tree in QIIME2 format.
-* `qiime2/diversity/`
-    * `*.txt`: File that describes the rarefaction depth (file name and file contant).
+
+##### Alpha diversity indices
+
+Alpha diversity measures the species diversity within samples. Diversity calculations are based on sub-sampled data rarefied to the minimum read count of all samples. This step calculates alpha diversity using various methods and performs pairwise comparisons of groups of samples. It is based on a phylogenetic tree of all ASV sequences.
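
The rarefaction depth mentioned in this hunk is the smallest per-sample read total. As a rough illustration only (the pipeline derives this value inside its QIIME2 steps), the sketch below computes it for a hypothetical tab-separated count table that has a single header row, feature IDs in the first column, and one column of counts per sample. The function name and the table layout are assumptions, not taken from the pipeline.

```shell
# Hypothetical helper, not part of the pipeline: print the smallest
# per-sample column total of a tab-separated count table.
# Assumed layout: line 1 is a header, column 1 holds feature IDs,
# every other column holds the integer counts of one sample.
min_sample_depth() {
    awk -F'\t' '
        NR == 1 { ncols = NF; next }                  # remember column count, skip header
        { for (i = 2; i <= NF; i++) total[i] += $i }  # accumulate per-sample totals
        END {
            min = total[2]
            for (i = 3; i <= ncols; i++)
                if (total[i] < min) min = total[i]
            print min
        }
    ' "$1"
}
```

Calling `min_sample_depth counts.tsv` on such a table prints the depth to which every sample would be sub-sampled; samples are compared on equal footing only because all of them are reduced to this common total.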
 * `shannon_vector/index.html`: Shannon’s diversity index (quantitative).

-#### Beta diversity indices
+##### Beta diversity indices

 Beta diversity measures the species community differences between samples. Diversity calculations are based on sub-sampled data rarefied to the minimum read count of all samples. This step calculates beta diversity distances using various methods and performs pairwise comparisons of groups of samples. Additionally, principal coordinates analysis (PCoA) plots are produced that can be visualized with [Emperor](https://biocore.github.io/emperor/build/html/index.html) in your default browser without the need for installation. These calculations are based on a phylogenetic tree of all ASV sequences.
@@ -210,27 +218,22 @@ Beta diversity measures the species community differences between samples. Diver
 **Output files:**

-* `qiime2/phylogenetic_tree/`
-    * `tree.nwk`: Phylogenetic tree in newick format.
-    * `rooted-tree.qza`: Phylogenetic tree in QIIME2 format.
-* `qiime2/diversity/`
-    * `*.txt`: File that describes the rarefaction depth (file name and file contant).
 * treatment: depends on your metadata sheet or what metadata categories you have specified

 #### ANCOM

 Analysis of Composition of Microbiomes ([ANCOM](https://www.ncbi.nlm.nih.gov/pubmed/26028277)) is applied to identify features that are differentially abundant across sample groups. A key assumption made by ANCOM is that few taxa (less than about 25%) will be differentially abundant between groups; otherwise the method will be inaccurate.

-ANCOM is applied to each suitable or specified metadata column for 6 taxonomic levels.
+ANCOM is applied to each suitable or specified metadata column for 5 taxonomic levels (2-6).
-if [ \$(grep -v '^#' -c ancom/lvl${taxlevel}-${table}.feature-table.tsv) -lt 2 ]; then
+if [ \$(grep -v '^#' -c ${table.baseName}-level-${taxlevel}.feature-table.tsv) -lt 2 ]; then
 echo ${taxlevel} > ancom/\"WARNING Summing your data at taxonomic level ${taxlevel} produced less than two rows (taxa), ANCOM can't proceed -- did you specify a bad reference taxonomy?\".txt
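
The check in this hunk counts the lines of the feature table that do not start with `#` and writes a warning file when fewer than two remain. A minimal standalone sketch of the same test follows, outside of Nextflow (so `$` is not escaped). The function name and its file argument are hypothetical stand-ins for the pipeline's variables.

```shell
# Hypothetical standalone version of the row-count check; the function name
# and its argument are illustrative and do not appear in the pipeline.
has_enough_taxa() {
    # grep -v '^#' drops comment/header lines, -c prints how many lines remain.
    # grep exits non-zero when the count is 0, hence the "|| true".
    rows=$(grep -v '^#' -c "$1" || true)
    [ "$rows" -ge 2 ]
}
```

A caller could then write `has_enough_taxa table.tsv || echo "fewer than two taxa, skipping ANCOM"`, which mirrors the warning branch of the original `if` statement.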