Description
I did some reading about exome data analysis with the tools that we use in Selma, and apparently they would produce suboptimal results on exome data according to this discussion: https://gatkforums.broadinstitute.org/gatk/discussion/6894/gatk-best-practices-for-exome-targeted-capture-small-region
The important points are the following quotes:
- you should not use BQSR on [exome data]
- You are probably better off doing hard filtering for a small target region [instead of using VQSR]
This discussion also has good information about why BQSR is not advised for datasets with fewer than 100 million bases.
We discussed running hap.py on exome (interval file) data analyses, but based on these points this may not be a good use of our time, given that we shouldn't run the BQSR, VQSR and ApplyVQSR tools on small datasets.
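
If we ever want Selma to make this decision automatically, a rough sketch could look like the following. This is only an illustration, not something Selma does today: the function name `plan_steps`, the step labels, and the way `total_bases` gets measured are all hypothetical. The 100 million base threshold is the one quoted in the forum discussion for BQSR; reusing the same cutoff to decide between VQSR and hard filtering is my assumption.

```python
# Threshold quoted in the forum discussion for BQSR.
# Reusing it for the VQSR decision is an assumption, not a GATK rule.
MIN_BASES = 100_000_000


def plan_steps(total_bases, min_bases=MIN_BASES):
    """Return hypothetical step labels to run for a dataset of this size.

    total_bases: number of bases in the dataset (how this is measured,
    e.g. per read group or per sample, is left open here).
    """
    if total_bases < min_bases:
        # Small dataset (e.g. exome / targeted capture):
        # skip BQSR and replace VQSR + ApplyVQSR with hard filtering.
        return ["call_variants", "hard_filtering"]
    # Large dataset: keep the recalibration steps.
    return ["BQSR", "call_variants", "VQSR", "ApplyVQSR"]


if __name__ == "__main__":
    print(plan_steps(50_000_000))
    print(plan_steps(3_000_000_000))
```

The step labels are placeholders; they would need to be mapped to whatever the actual workflow steps are called.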