Commit aa27156

Merge pull request #287 from d4straub/dev
Update docs & prevent tempory files of ancom to be published
2 parents 7f4e9cf + 51166b3 commit aa27156

7 files changed, 37 additions and 33 deletions

README.md

Lines changed: 3 additions & 2 deletions
@@ -1,6 +1,6 @@
 # ![nf-core/ampliseq](docs/images/nf-core-ampliseq_logo.png)
 
-**16S rRNA amplicon sequencing analysis workflow using QIIME2**.
+**Amplicon sequencing analysis workflow using DADA2 and QIIME2**.
 
 [![DOI](https://zenodo.org/badge/150448201.svg)](https://zenodo.org/badge/latestdoi/150448201)
 [![Cite Publication](https://img.shields.io/badge/Cite%20Us!-Cite%20Publication-important)](https://doi.org/10.3389/fmicb.2020.550420)
@@ -20,7 +20,7 @@
 
 ## Introduction
 
-**nfcore/ampliseq** is a bioinformatics analysis pipeline used for amplicon sequencing data, supporting 16S, ITS and 18S data. Supported is paired-end Illumina or single-end Illumina, PacBio and IonTorrent data.
+**nfcore/ampliseq** is a bioinformatics analysis pipeline used for amplicon sequencing, supporting denoising of any amplicon and, currently, taxonomic assignment of 16S, ITS and 18S amplicons. Supported is paired-end Illumina or single-end Illumina, PacBio and IonTorrent data. Default is the analysis of 16S rRNA gene amplicons sequenced paired-end with Illumina.
 
 The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with docker containers making installation trivial and results highly reproducible.
 
@@ -41,6 +41,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool
 4. Start running your own analysis!
 
 ```bash
+#16S rRNA gene amplicon analysis of Illumina paired-end data
 nextflow run nf-core/ampliseq -profile <docker/singularity/podman/shifter/charliecloud/conda/institute> --input "data" --FW_primer "GTGYCAGCMGCCGCGGTAA" --RV_primer "GGACTACNVGGGTWTCTAAT" --metadata "data/Metadata.tsv"
 ```
 

assets/email_template.html

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 <meta http-equiv="X-UA-Compatible" content="IE=edge">
 <meta name="viewport" content="width=device-width, initial-scale=1">
 
-<meta name="description" content="nf-core/ampliseq: 16S rRNA amplicon sequencing analysis workflow using QIIME2">
+<meta name="description" content="nf-core/ampliseq: Amplicon sequencing analysis workflow using DADA2 and QIIME2">
 <title>nf-core/ampliseq Pipeline Report</title>
 </head>
 <body>

docs/output.md

Lines changed: 25 additions & 22 deletions
@@ -24,8 +24,9 @@ and processes data using the following steps:
 * [Relative abundance tables](#relative-abundance-tables) - Exported relative abundance tables
 * [Barplot](#barplot) - Interactive barplot
 * [Alpha diversity rarefaction curves](#alpha-diversity-rarefaction-curves) - Rarefaction curves for quality control
-* [Alpha diversity indices](#alpha-diversity-indices) - Diversity within samples
-* [Beta diversity indices](#beta-diversity-indices) - Diversity between samples (e.g. PCoA plots)
+* [Diversity analysis](#diversity-analysis) - High level overview with different diversity indices
+* [Alpha diversity indices](#alpha-diversity-indices) - Diversity within samples
+* [Beta diversity indices](#beta-diversity-indices) - Diversity between samples (e.g. PCoA plots)
 * [ANCOM](#ancom) - Differential abundance analysis
 * [Read count report](#Read-count-report) - Report of read counts during various steps of the pipeline
 * [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution
@@ -76,7 +77,7 @@ DADA2 computes an error model on the sequencing reads (forward and reverse indep
 
 DADA2 reduces sequence errors and dereplicates sequences by quality filtering, denoising, read pair merging (for paired end Illumina reads only) and PCR chimera removal.
 
-Additionally, DADA2 taxonomically classifies the ASVs using pre-trained databases.
+Additionally, DADA2 taxonomically classifies the ASVs using a choice of supplied databases (specified with `--dada_ref_taxonomy`).
 
 **Output files:**
 
@@ -88,7 +89,7 @@ Additionally, DADA2 taxonomically classifies the ASVs using pre-trained database
 * `DADA2_stats.tsv`: Tracking read numbers through DADA2 processing steps, for each sample.
 * `DADA2_table.rds`: DADA2 ASV table as R object.
 * `DADA2_tables.tsv`: DADA2 ASV table.
-* `dada2/args/`: Directory containing all parameters for DADA2 steps.
+* `dada2/args/`: Directory containing files with all parameters for DADA2 steps.
 * `dada2/log/`: Directory containing log files for DADA2 steps.
 * `dada2/QC/`
 * `*.err.convergence.txt`: Convergence values for DADA2's dada command, should reduce over several magnitudes and approaching 0.
@@ -111,11 +112,11 @@ Optionally, the ITS region can be extracted from each ASV sequence using ITSx, a
 
 **Quantitative Insights Into Microbial Ecology 2** ([QIIME2](https://qiime2.org/)) is a next-generation microbiome bioinformatics platform and the successor of the widely used [QIIME1](https://www.nature.com/articles/nmeth.f.303).
 
-ASV sequences and counts as produced before with DADA2 are imported into QIIME2 and further analysed. First, ASVs are taxonomically classified, than filtered (`--exclude_taxa`, `--min_frequency`, `--min_samples`), and abundance tables exported. Following, diversity indices are calculated and testing for differential abundant features between sample groups is performed.
+ASV sequences, counts, and taxonomic classification as produced before with DADA2 are imported into QIIME2 and further analysed. Optionally, ASVs can be taxonomically classified also with QIIME2 against a database chosen with `--qiime_ref_taxonomy` (but DADA2 taxonomic classification takes precedence). Next, ASVs are filtered (`--exclude_taxa`, `--min_frequency`, `--min_samples`), and abundance tables are exported. Following, diversity indices are calculated and testing for differential abundant features between sample groups is performed.
 
 #### Taxonomic classification
 
-ASV abundance and sequences inferred in DADA2 are informative but routinely taxonomic classifications such as family or genus annotation is desireable.
+Taxonomic classification with QIIME2 is typically similar to DADA2 classifications. However, both options are available. When taxonomic classification with DADA2 and QIIME2 is performed, DADA2 classification takes precedence over QIIME2 classifications for all downstream analysis.
 
 **Output files:**
 
@@ -160,7 +161,7 @@ Absolute abundance tables produced by the previous steps contain count data, but
 * `rel-table-6.tsv`: Tab-separated relative abundance table at genus level.
 * `rel-table-7.tsv`: Tab-separated relative abundance table at species level.
 * `rel-table-ASV.tsv`: Tab-separated relative abundance table for all ASVs.
-* `qiime2_ASV_table.tsv`: Tab-separated table for all ASVs with taxonomic classification, sequence and relative abundance.
+* `qiime2_ASV_table.tsv`: Tab-separated table for all ASVs with taxonomic classification, sequence and relative abundance. *NOTE: This file is based on QIIME2 taxonomic classifications, contrary to all other files that are based on DADA2 classification, if available.*
 
 #### Barplot
 
@@ -180,24 +181,31 @@ Produces rarefaction plots for several alpha diversity indices, and is primarily
 * `qiime2/alpha-rarefaction/`
 * `index.html`: Interactive alphararefaction curve for taxa abundance per sample that can be viewed in your web browser.
 
-#### Alpha diversity indices
+#### Diversity analysis
 
-Alpha diversity measures the species diversity within samples. Diversity calculations are based on sub-sampled data rarefied to the minimum read count of all samples. This step calculates alpha diversity using various methods and performs pairwise comparisons of groups of samples. It is based on a phylogenetic tree of all ASV sequences.
+Diversity measures summarize important sample features (alpha diversity) or differences between samples (beta diversity). To do so, sample data is first rarefied to the minimum number of counts per sample. Also, a phylogenetic tree of all ASVs is computed to provide phylogenetic information.
 
 **Output files:**
 
+* `qiime2/diversity/`
+* `Use the sampling depth of * for rarefaction.txt`: File that reports the rarefaction depth in the file name and file content.
 * `qiime2/phylogenetic_tree/`
 * `tree.nwk`: Phylogenetic tree in newick format.
 * `rooted-tree.qza`: Phylogenetic tree in QIIME2 format.
-* `qiime2/diversity/`
-* `*.txt`: File that describes the rarefaction depth (file name and file contant).
+
+##### Alpha diversity indices
+
+Alpha diversity measures the species diversity within samples. Diversity calculations are based on sub-sampled data rarefied to the minimum read count of all samples. This step calculates alpha diversity using various methods and performs pairwise comparisons of groups of samples. It is based on a phylogenetic tree of all ASV sequences.
+
+**Output files:**
+
 * `qiime2/diversity/alpha_diversity/`
 * `evenness_vector/index.html`: Pielou’s Evenness.
 * `faith_pd_vector/index.html`: Faith’s Phylogenetic Diversity (qualitiative, phylogenetic).
 * `observed_otus_vector/index.html`: Observed OTUs (qualitative).
 * `shannon_vector/index.html`: Shannon’s diversity index (quantitative).
 
-#### Beta diversity indices
+##### Beta diversity indices
 
 Beta diversity measures the species community differences between samples. Diversity calculations are based on sub-sampled data rarefied to the minimum read count of all samples. This step calculates beta diversity distances using various methods and performs pairwise comparisons of groups of samples. Additionally principle coordinates analysis (PCoA) plots are produced that can be visualized with [Emperor](https://biocore.github.io/emperor/build/html/index.html) in your default browser without the need for installation. This calculations are based on a phylogenetic tree of all ASV sequences.
 
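The new "Diversity analysis" text says that sample data is rarefied to the minimum number of counts per sample. As a rough illustration of how that depth follows from a count table, here is a minimal sketch. It is not the pipeline's code (the pipeline delegates this to QIIME2), and it only computes the depth, not the random subsampling itself. The function name `min_sample_depth`, the toy table, and the assumed layout (tab-separated, `#`-prefixed header lines, feature ID in column 1, one column per sample) are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the smallest per-sample read total of a
# tab-separated count table (assumed layout: '#' header lines, feature ID
# in column 1, one count column per sample).
min_sample_depth() {
    awk -F'\t' '
        /^#/ { next }                                   # skip header lines
        {
            for (i = 2; i <= NF; i++) total[i] += $i    # sum counts per sample column
            if (NF > ncol) ncol = NF
        }
        END {
            min = total[2]
            for (i = 3; i <= ncol; i++) if (total[i] < min) min = total[i]
            print min
        }
    ' "$1"
}

# Toy table made up for this example: sample totals are 160, 55 and 135.
table=$(mktemp)
printf '#OTU ID\tsampleA\tsampleB\tsampleC\nASV1\t120\t30\t75\nASV2\t40\t25\t60\n' > "$table"
echo "Minimum counts per sample: $(min_sample_depth "$table")"
rm -f "$table"
```

For the toy table this reports 55, the total of `sampleB`, which is the value that would appear in the `Use the sampling depth of * for rarefaction.txt` file name if the same rule were applied.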

@@ -210,27 +218,22 @@ Beta diversity measures the species community differences between samples. Diver
 
 **Output files:**
 
-* `qiime2/phylogenetic_tree/`
-* `tree.nwk`: Phylogenetic tree in newick format.
-* `rooted-tree.qza`: Phylogenetic tree in QIIME2 format.
-* `qiime2/diversity/`
-* `*.txt`: File that describes the rarefaction depth (file name and file contant).
 * `qiime2/diversity/beta_diversity/`
-* `<method>_distance_matrix-<treatment>/index.html`
-* `<method>_pcoa_results-PCoA/index.html`
+* `<method>_distance_matrix-<treatment>/index.html`: Box plots and significance analysis (PERMANOVA).
+* `<method>_pcoa_results-PCoA/index.html`: Interactive PCoA plot.
 * method: bray_curtis, jaccard, unweighted_unifrac, weighted_unifrac
 * treatment: depends on your metadata sheet or what metadata categories you have specified
 
 #### ANCOM
 
 Analysis of Composition of Microbiomes ([ANCOM](https://www.ncbi.nlm.nih.gov/pubmed/26028277)) is applied to identify features that are differentially abundant across sample groups. A key assumption made by ANCOM is that few taxa (less than about 25%) will be differentially abundant between groups otherwise the method will be inaccurate.
 
-ANCOM is applied to each suitable or specified metadata column for 6 taxonomic levels.
+ANCOM is applied to each suitable or specified metadata column for 5 taxonomic levels (2-6).
 
 **Output files:**
 
 * `qiime2/ancom/`
-* `Category-<treatment>-<taxonomic level>/index.html`
+* `Category-<treatment>-<taxonomic level>/index.html`: Statistical results and interactive Volcano plot.
 * treatment: depends on your metadata sheet or what metadata categories you have specified
 * taxonomic level: level-2 (phylum), level-3 (class), level-4 (order), level-5 (family), level-6 (genus), ASV
 
@@ -240,7 +243,7 @@ This report includes information on how many reads per sample passed each pipeli
 
 **Output files:**
 
-* `overall_summary.tsv`
+* `overall_summary.tsv`: Tab-separated file with count summary.
 
 ## Pipeline information
 

docs/usage.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ results # Finished results (configurable, see below)
 # Other nextflow hidden files, eg. history of pipeline runs and old logs.
 ```
 
-See the [nf-core/ampliseq website documentation](https://nf-co.re/ampliseq/usage#usage) for more information about pipeline specific parameters.
+See the [nf-core/ampliseq website documentation](https://nf-co.re/ampliseq/parameters) for more information about pipeline specific parameters.
 
 ### Updating the pipeline
 

modules/local/qiime2_ancom_tax.nf

Lines changed: 2 additions & 2 deletions
@@ -37,9 +37,9 @@ process QIIME2_ANCOM_TAX {
 
 # Extract summarised table and output a file with the number of taxa
 qiime tools export --input-path lvl${taxlevel}-${table} --output-path exported/
-biom convert -i exported/feature-table.biom -o ancom/lvl${taxlevel}-${table}.feature-table.tsv --to-tsv
+biom convert -i exported/feature-table.biom -o ${table.baseName}-level-${taxlevel}.feature-table.tsv --to-tsv
 
-if [ \$(grep -v '^#' -c ancom/lvl${taxlevel}-${table}.feature-table.tsv) -lt 2 ]; then
+if [ \$(grep -v '^#' -c ${table.baseName}-level-${taxlevel}.feature-table.tsv) -lt 2 ]; then
 echo ${taxlevel} > ancom/\"WARNING Summing your data at taxonomic level ${taxlevel} produced less than two rows (taxa), ANCOM can't proceed -- did you specify a bad reference taxonomy?\".txt
 else
 qiime composition add-pseudocount \
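
The row-count guard in this hunk can be tried outside Nextflow. The following is a minimal sketch under stated assumptions: the demo file is hand-made to resemble a TSV with `#`-prefixed header lines, and `count_taxa_rows` is a hypothetical helper name, not part of the pipeline. In the Nextflow script block the `\$` leaves the command substitution to bash instead of Nextflow; in plain bash it is written `$( ... )`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the rows of a TSV that are not '#'-prefixed
# header lines, i.e. the taxa rows the guard in the hunk looks at.
count_taxa_rows() {
    # grep -c prints 0 but exits non-zero when nothing matches, so the
    # exit status is deliberately ignored here.
    grep -v -c '^#' "$1" || true
}

# Hand-made demo table with two header lines and a single taxon row.
demo=$(mktemp)
printf '# Constructed from biom file\n#OTU ID\tsample1\tsample2\nBacteria;Firmicutes\t10\t4\n' > "$demo"

if [ "$(count_taxa_rows "$demo")" -lt 2 ]; then
    echo "fewer than two taxa rows: ANCOM would be skipped"
else
    echo "two or more taxa rows: ANCOM would run"
fi
rm -f "$demo"
```

With only one taxon row the first branch is taken, which corresponds to the case where the process writes its WARNING file. The change in this commit leaves the check itself untouched and only moves the intermediate TSV out of `ancom/`, so that the temporary file is not published alongside the results.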

nextflow.config

Lines changed: 2 additions & 2 deletions
@@ -78,7 +78,7 @@ params {
 singularity_pull_docker_container = false
 validate_params = true
 show_hidden_params = false
-schema_ignore_params = 'dada_ref_databases,qiime_ref_databases,modules'
+schema_ignore_params = 'dada_ref_databases,qiime_ref_databases,modules,igenomes_base'
 
 // Defaults only, expecting to be overwritten
 max_memory = 128.GB
@@ -199,7 +199,7 @@ manifest {
 name = 'nf-core/ampliseq'
 author = 'Daniel Straub, Alexander Peltzer'
 homePage = 'https://github.com/nf-core/ampliseq'
-description = '16S rRNA amplicon sequencing analysis workflow using QIIME2'
+description = 'Amplicon sequencing analysis workflow using DADA2 and QIIME2'
 mainScript = 'main.nf'
 nextflowVersion = '!>=21.04.0'
 version = '2.0.0dev'
