Open
Description
Some taxa have duplicated sequences in their FASTA files, so samtools fails with an error like this:
[W::sam_hdr_create] Duplicated sequence "QJKH01000049.1" in file "/tmp/panphlan_wrzoniqg.sam"
[E::sam_hrecs_update_hashes] Duplicate entry "QJKH01000001.1" in sam header
samtools view: failed to add PG line to the header
[W::hts_set_opt] Cannot change block size for this format
samtools sort: failed to read header from "-"
samtools index: "panphlan/output/Dielma_fastidiosa/map_results/SRR14117082_Dielma_fastidiosa_out.bam" is in a format that cannot be usefully indexed
[E] Samtools index encountered some error.
I fixed this by removing the duplicated sequences with seqkit rmdup -n.
I had this issue with the genomes of Cutibacterium_acnes, Roseburia_intestinalis, Olsenella_uli, Acinetobacter_ursingii, Actinomyces_naeslundii, Dialister_pneumosintes, Peptoniphilus_lacrimalis, and Dielma_fastidiosa.
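
For anyone without seqkit at hand, the de-duplication step can be roughly sketched in Python. This is not what seqkit does internally, just a minimal illustration. It assumes the clash is on the sequence ID (the first whitespace-delimited token of the header, e.g. QJKH01000049.1), which is the name shown in the samtools messages above, and it keeps the first record for each ID. The function name `dedup_fasta` and the demo records are made up for this example.

```python
import io


def dedup_fasta(src, dst):
    """Copy FASTA records from src to dst, keeping the first record per sequence ID.

    Assumption: the ID is the first whitespace-delimited token after '>'.
    Returns the list of IDs whose repeated records were dropped.
    """
    seen = set()
    dropped = []
    keep = False  # lines before the first header are skipped
    for line in src:
        if line.startswith(">"):
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            keep = seq_id not in seen
            if keep:
                seen.add(seq_id)
            else:
                dropped.append(seq_id)
        if keep:
            dst.write(line)
    return dropped


if __name__ == "__main__":
    # Hypothetical records; the second "contig1" entry is the duplicate.
    demo = io.StringIO(
        ">contig1 first copy\nACGT\n"
        ">contig2\nGGCC\n"
        ">contig1 second copy\nTTTT\n"
    )
    out = io.StringIO()
    print("dropped:", dedup_fasta(demo, out))
    print(out.getvalue(), end="")
```

With real files, the same function can be called on two handles opened with `open(...)`, one for reading the original FASTA and one for writing the cleaned copy. It is worth checking the returned list before overwriting anything, since two records sharing an ID may still differ in sequence.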