Scripts Generic
Here you will find basic descriptions of the intent of the generic scripts. Please see the help/manual entries for full details, as these will be the most current source of information.
Generates read and mapping statistics for a BAM file. BAMs with read-group headers are handled appropriately: the information is split by read group into individual rows of the resulting *.bas file (the bas concept comes from the vr-pipe project).
The intent is to capture as many of the commonly requested statistics in a single pass of a BAM file. If you have ideas for new statistics please create an issue in the tracker.
Data not linked to a read-group is dumped into a single bin. Unless you know the file contains only one lane of sequencing, you should consider these statistics suspect.
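The per-read-group binning described above can be sketched as follows. This is a minimal illustration over SAM text lines, not the script's actual implementation: the function names, the `unassigned` bin label and the two counters are assumptions made for the example, and the real *.bas file carries many more columns.

```python
from collections import defaultdict

UNASSIGNED = "unassigned"  # hypothetical label for reads without an RG tag


def read_group_of(fields):
    """Return the RG:Z tag value from the optional SAM fields, if present."""
    for tag in fields[11:]:  # optional tags start at column 12
        if tag.startswith("RG:Z:"):
            return tag[5:]
    return UNASSIGNED


def stats_by_read_group(sam_lines):
    """Single pass over SAM records, producing one row of counts per read group."""
    rows = defaultdict(lambda: {"reads": 0, "mapped": 0})
    for line in sam_lines:
        if line.startswith("@"):  # header lines carry no read data
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        row = rows[read_group_of(fields)]
        row["reads"] += 1
        if not flag & 0x4:  # 0x4 is the "segment unmapped" flag bit
            row["mapped"] += 1
    return dict(rows)
```

Reads without an `RG:Z` tag all land in the single `unassigned` row, which is why that row cannot be trusted when the file mixes several lanes.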
Compares two BAM files at the record level. Stable elements of the header are checked for a match (SQ entries and their order); potentially unstable header items are skipped (PG lines, for example, may contain different file paths). Each read is compared for mapping and flag information. You can optionally skip reads with poor MAPQ values, as these can be volatile.
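A rough sketch of this comparison, working on SAM text rather than BAM, might look like the block below. The function names and the `min_mapq` parameter are hypothetical. It also assumes both files list their records in the same order, and that a pair is skipped when either read falls below the threshold; the real script may decide these points differently.

```python
def parse_sam(lines):
    """Split SAM text into @SQ header lines and tab-split alignment records."""
    sq, records = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("@SQ"):
            sq.append(line)
        elif not line.startswith("@"):  # other headers (e.g. @PG) are ignored
            records.append(line.split("\t"))
    return sq, records


def sams_match(lines_a, lines_b, min_mapq=0):
    """Return True when SQ headers and per-read mapping fields agree."""
    sq_a, recs_a = parse_sam(lines_a)
    sq_b, recs_b = parse_sam(lines_b)
    if sq_a != sq_b:  # list equality checks both content and order
        return False
    if len(recs_a) != len(recs_b):
        return False
    for a, b in zip(recs_a, recs_b):
        # Assumption: skip the pair if either MAPQ is below the threshold.
        if int(a[4]) < min_mapq or int(b[4]) < min_mapq:
            continue
        # Compare QNAME, FLAG, RNAME, POS, MAPQ and CIGAR.
        if a[:6] != b[:6]:
            return False
    return True
```

With `min_mapq=0` every read is compared; raising it ignores low-confidence placements that can legitimately differ between runs.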
Wraps any command and provides a report on memory and CPU time. Captures the maximum memory totalled across all threads of a process at a single time point, i.e. if 3 threads are each using 3GB at the same time then this will report 9GB. Many comparable tools fail to take this into account.
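The difference between the two ways of measuring can be shown with a small sketch. The sample layout (a list of time points, each mapping a thread id to bytes in use) and both function names are assumptions made for illustration; how the wrapper actually gathers its samples is not covered here.

```python
GB = 1024 ** 3


def peak_total_memory(samples):
    """Largest total across all threads at any single time point."""
    return max((sum(point.values()) for point in samples), default=0)


def peak_single_thread(samples):
    """Largest value seen for any one thread, the figure naive tools report."""
    return max((max(point.values(), default=0) for point in samples), default=0)


# Three threads each holding 3GB at the second time point.
samples = [
    {1: 1 * GB},
    {1: 3 * GB, 2: 3 * GB, 3: 3 * GB},
    {1: 2 * GB},
]
```

For these samples `peak_total_memory` gives 9GB, while `peak_single_thread` gives only 3GB.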
Please be aware that all commands run under this wrapper are also prefixed with:
numactl --interleave=all
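As a minimal sketch of what this prefixing amounts to, the helper below builds the argument list that would actually be executed. The function name and the example command are hypothetical, and the wrapper's own code may assemble the command differently.

```python
import shlex

# The prefix stated above, split into separate arguments.
NUMACTL_PREFIX = ["numactl", "--interleave=all"]


def with_numactl(command):
    """Return the argument list for `command` with the numactl prefix applied."""
    return NUMACTL_PREFIX + shlex.split(command)


# Hypothetical wrapped command, used only to show the result.
argv = with_numactl("sort -k1,1 input.txt")
```

Here `argv` is `["numactl", "--interleave=all", "sort", "-k1,1", "input.txt"]`, so the wrapped command needs `numactl` to be available on the host.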