Hi,
I am new to this software. I have Illumina sequencing data for the V4 region of 16S rDNA. I trimmed and assembled the sequences using DADA2.
I wanted to run hmmufotu on DADA2's ESVs to compare the taxonomic classifications from the two tools and to get a phylogenetic tree for my ESVs.
Here is the beginning of the input:
Otu1
AGCAGTGGGGAATAT[...]CAAACAGGATTAGATACCCTGGTA
Otu2
AGCAGTGGGGAATAT[...]GGATTAGATACCCTGGTA
Otu3
AGCAGTGGGGAATAT[...]AGGATTAGATACCCTGGTA
After running hmmufotu and hmmufotu-sum on the file using the recommended GreenGenes (v13.8) species-level (97% OTU) reference with the GTR DNA model, I got a very strange alignment in which almost all bases of most of my ESVs (they are named Otus here only for compatibility with other software) are replaced by gaps ('-').
Example:
5360 DBName=Archive/GTR/gg_97_otus_GTR;Taxonomy="k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Lachnospiraceae;g__[Ruminococcus];s__gnavus";AnnoDist=0.64307999999999976;ReadCount=71;SampleHits=1
--------------[...]----------------------------------------------
--------------------------TACCAGGGCTACACACGTGCT----
---[...]-----
([...] marks where I shortened the sequences for the purpose of this message.)
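To put a number on how much of each ESV survives in the alignment, the count of non-gap characters in an aligned row can be compared with the length of the input sequence. The following is only a minimal sketch: the function name `retained_fraction` is hypothetical, the sequences in the usage line are made-up toy strings rather than my data, and it assumes every alphabetic character in the aligned row is an aligned base.

```python
def retained_fraction(original: str, aligned: str) -> float:
    """Fraction of the input bases that still appear in the aligned row.

    Assumption: any alphabetic character counts as a base; gap symbols
    such as '-' or '.' and whitespace are ignored.
    """
    n_in = sum(c.isalpha() for c in original)
    if n_in == 0:
        raise ValueError("original sequence contains no bases")
    n_out = sum(c.isalpha() for c in aligned)
    return n_out / n_in


# Toy example (made-up strings): 4 of the 8 input bases remain.
toy_fraction = retained_fraction("ACGTACGT", "--AC--GT--")
```

A value far below 1 for most ESVs would confirm that bases are being dropped, rather than the row simply being padded with gaps outside the amplified region.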
The classification is also very strange: there is only 7% agreement at the phylum level with the classification obtained from the SILVA database using the RDP classifier.
Perhaps I am using it wrong? I know this software is meant to be used on raw reads, but I thought it would be useful to compare its classification resolution with that of the RDP classifier.
Thank you in advance!
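For clarity, a phylum-level agreement of this kind can be computed along the following lines. This is a minimal sketch under stated assumptions: the names `phylum_of` and `phylum_agreement` are hypothetical, lineages are semicolon-separated strings, GreenGenes-style labels carry a `p__` prefix, and unprefixed lineages are assumed to be ordered domain;phylum;... The lineages in the usage example are made-up toy values, not my results. Note that the two databases may name the same phylum differently, which this simple string comparison does not account for.

```python
from typing import Dict, Optional


def phylum_of(lineage: str) -> Optional[str]:
    """Extract the phylum name from a semicolon-separated lineage."""
    ranks = [r.strip() for r in lineage.split(";") if r.strip()]
    # GreenGenes-style labels carry a rank prefix such as "p__".
    for rank in ranks:
        if rank.startswith("p__"):
            return rank[3:] or None
    # Assumption: unprefixed lineages are ordered domain;phylum;...
    if len(ranks) > 1 and not any("__" in r for r in ranks):
        return ranks[1]
    return None


def phylum_agreement(a: Dict[str, str], b: Dict[str, str]) -> float:
    """Share of sequences with the same phylum in both classifications.

    Only sequence IDs present in both inputs, with a phylum assigned
    in both, are compared.
    """
    compared = []
    for seq_id in a:
        if seq_id in b:
            pa, pb = phylum_of(a[seq_id]), phylum_of(b[seq_id])
            if pa and pb:
                compared.append(pa.lower() == pb.lower())
    return sum(compared) / len(compared) if compared else 0.0


# Toy example (made-up lineages): one match out of two.
toy_hmmufotu = {
    "Otu1": "k__Bacteria;p__Firmicutes;c__Clostridia",
    "Otu2": "k__Bacteria;p__Proteobacteria",
}
toy_rdp = {
    "Otu1": "Bacteria;Firmicutes;Clostridia",
    "Otu2": "Bacteria;Bacteroidetes",
}
toy_agreement = phylum_agreement(toy_hmmufotu, toy_rdp)
```

Stripping the rank prefix before comparing matters here: comparing `p__Firmicutes` with `Firmicutes` as raw strings would count every pair as a disagreement.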