manage long N intergenic regions #12

@jwollbrett

Some reference intergenic regions consist entirely of N bases.
For instance, in human, more than 80 reference intergenic regions are a run of 20,000 N.
As we provide reference intergenic sequences on our FTP for the BgeeCall package, we should remove these sequences.
We should maybe also remove all long runs of N inside intergenic regions.
One solution could be to remove every run of N that is at least as long as the default k-mer size of kallisto (31 bp).
One initial 20,000 bp reference intergenic region could then be split into more than one reference intergenic region.
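
The splitting idea could be sketched as follows. This is only an illustration in Python, not code from BgeeCall or the Bgee pipeline: the function name `split_on_n_runs` and the constant `KMER_SIZE` are hypothetical, and the sketch assumes each region is available as a plain string of bases.

```python
import re

# Default k-mer size of kallisto, as stated in this issue.
KMER_SIZE = 31


def split_on_n_runs(sequence, min_run=KMER_SIZE):
    """Split a sequence at every run of at least `min_run` N bases.

    Runs shorter than `min_run` are left in place. Empty pieces are
    dropped, so a region made only of N yields an empty list.
    """
    # Hypothetical helper: matches runs of N (upper or lower case)
    # that are min_run bases long or longer.
    long_n_run = re.compile(r"[Nn]{%d,}" % min_run)
    return [piece for piece in long_n_run.split(sequence) if piece]
```

With this sketch, a region that contains only N disappears, a region interrupted by one long run of N becomes two regions, and runs shorter than 31 bp are kept untouched. How the resulting sub-regions should be named, and whether very short pieces should be kept, is left open here.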
