clarification on length.tsv #51

@sherlyn99

length.tsv should be a tab-delimited file with two columns: the first is the genome ID (matching the genome IDs that appear in the SAM files) and the second is the genome length. The file must not contain a header line.

Here is an example of length.tsv:

G000005825	4249288
G000006175	1936387
G000006605	2476842
G000006725	2731790
G000006745	4033484
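To check a file against the format described above, a minimal sketch along these lines may help. The function name `read_lengths` is hypothetical and not part of the repo; it assumes lengths are non-negative integers and that each genome ID appears only once.

```python
import csv


def read_lengths(path):
    """Parse a headerless, tab-delimited file of genome ID and genome length.

    Hypothetical helper for illustration; raises ValueError on malformed lines.
    """
    lengths = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        for line_no, row in enumerate(reader, start=1):
            if not row:
                continue  # tolerate blank lines
            if len(row) != 2:
                raise ValueError(
                    f"line {line_no}: expected 2 columns, got {len(row)}"
                )
            genome_id, length = row
            # A non-numeric second column usually means a header line slipped in.
            if not length.isdigit():
                raise ValueError(
                    f"line {line_no}: length {length!r} is not an integer"
                )
            if genome_id in lengths:
                raise ValueError(f"line {line_no}: duplicate ID {genome_id!r}")
            lengths[genome_id] = int(length)
    return lengths
```

Calling `read_lengths("length.tsv")` returns a dict mapping genome ID to length, which can then be compared against the reference names in the SAM files.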

If you are using wol2 as the database, you can find the length file at <wol2-folder>/genomes/length.map.

If you are using another database, locate all.fna, the FASTA file containing all genomes, and compute the genome lengths with seqkit or an equivalent tool, e.g.:

seqkit fx2tab --length --name --only-id foo.fasta

ref: https://www.biostars.org/p/118954/
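If seqkit is not available, the same table can be produced with a few lines of plain Python. This is only a sketch under stated assumptions: each FASTA record is treated as one genome, the ID is taken to be the header text up to the first whitespace, and the function names `fasta_lengths` and `write_length_tsv` are hypothetical.

```python
def fasta_lengths(fasta_path):
    """Return {sequence ID: sequence length} for every record in a FASTA file.

    Assumption: the ID is the header text up to the first whitespace.
    """
    lengths = {}
    current = None
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                fields = line[1:].split()
                current = fields[0] if fields else ""
                lengths.setdefault(current, 0)
            elif current is not None:
                # Sequence lines may be wrapped, so accumulate across lines.
                lengths[current] += len(line)
    return lengths


def write_length_tsv(lengths, out_path):
    """Write ID<TAB>length lines with no header line."""
    with open(out_path, "w") as out:
        for seq_id, length in lengths.items():
            out.write(f"{seq_id}\t{length}\n")
```

For example, `write_length_tsv(fasta_lengths("all.fna"), "length.tsv")` writes one line per record. If a genome is split across several records, the per-record lengths would need to be summed per genome first.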

I will update the README in the repo to reflect this.
