Commit 3115488

Merge pull request #599 from d4straub/pr2-species-assignment
PR2 exact species assignment now without taxa ending with sp.
2 parents c675e10 + 808185b commit 3115488

2 files changed: +3 -1 lines changed
CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -21,6 +21,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - [#563](https://github.com/nf-core/ampliseq/pull/563) - Renamed DADA2 taxonomic classification files to include the chosen reference taxonomy abbreviation.
 - [#567](https://github.com/nf-core/ampliseq/pull/567) - Renamed `--dada_tax_agglom_min` and `--qiime_tax_agglom_min` to `--tax_agglom_min` and `--dada_tax_agglom_max` and `--qiime_tax_agglom_max` to `--tax_agglom_max`
 - [#598](https://github.com/nf-core/ampliseq/pull/598) - Updated Workflow figure with SINTAX and phylogenetic placement
+- [#599](https://github.com/nf-core/ampliseq/pull/599) - For exact species assignment (DADA2's addSpecies) PR2 taxonomy database (e.g. `--dada_ref_taxonomy pr2`) now excludes any taxa that end with " sp.".

 ### `Fixed`

bin/taxref_reformat_pr2.sh

Lines changed: 2 additions & 1 deletion
@@ -7,4 +7,5 @@ gunzip -c *dada2.fasta.gz > assignTaxonomy.fna

 # For addSpecies(), the UTAX file is downloaded and reformated to only contain the id and species.
 # The second two sed calls are to replace "_" with space only in the species name and not the last part of the id (overdoing it a bit, as I don't the id actually matters as long as it's unique).
-gunzip -c *UTAX.fasta.gz | sed '/^>/s/>\([^;]*\);.*,s:\(.*\)/>\1 \2/' | sed 's/_/ /g' | sed 's/ \([A-Z]\) /_\1 /' > addSpecies.fna
+# The awk part removes any entries (sequence name and sequence) that have a sequence name ending with " sp."
+gunzip -c *UTAX.fasta.gz | sed '/^>/s/>\([^;]*\);.*,s:\(.*\)/>\1 \2/' | sed 's/_/ /g' | sed 's/ \([A-Z]\) /_\1 /' | awk '!/ sp.\n/' RS=">" ORS=">" > addSpecies.fna
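
To see what the changed line does, here is a minimal sketch that runs the same `sed` and `awk` stages on a few made-up records. The function name `reformat_utax`, the accession ids, the taxonomy strings and the sequences are all hypothetical and only approximate the UTAX header layout. The real script reads from `*UTAX.fasta.gz` and writes to `addSpecies.fna`.

```shell
# Hypothetical wrapper around the stages from bin/taxref_reformat_pr2.sh;
# it reads FASTA from stdin instead of the gzipped UTAX file.
reformat_utax() {
    # 1st sed: on header lines, keep the id (text before the first ";")
    #          and the species (text after the last ",s:").
    # 2nd sed: turn every "_" into a space.
    # 3rd sed: restore the "_" before the single capital letter ending the id.
    # awk:     split records on ">" and drop any record whose first line
    #          (the header) ends with " sp." ("." matches any character here).
    sed '/^>/s/>\([^;]*\);.*,s:\(.*\)/>\1 \2/' \
        | sed 's/_/ /g' \
        | sed 's/ \([A-Z]\) /_\1 /' \
        | awk '!/ sp.\n/' RS=">" ORS=">"
}

# Toy input: three made-up records, the second one is only resolved to " sp."
printf '%s\n' \
    '>AB000001.1.100_U;tax=k:Eukaryota,g:Genus,s:Genus_species' 'ACGT' \
    '>AB000002.1.100_U;tax=k:Eukaryota,g:Genus,s:Genus_sp.' 'TTGA' \
    '>AB000003.1.100_U;tax=k:Eukaryota,g:Other,s:Other_name' 'GGCC' \
    | reformat_utax
```

With this toy input, the `AB000002` record (header and sequence) should be absent from the output, while the other two headers are reduced to id plus species name.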
