
Potential case studies (i.e., existing results where we can do the same thing more easily using the container)

Tim Triche, Jr. edited this page Mar 26, 2015 · 22 revisions

MethylMix clustering: http://genomebiology.com/2015/16/1/17 (this is also an excuse for me to resurrect SciDB as a SummarizedExperiment backend; see related page)

FEM validation: http://bioinformatics.oxfordjournals.org/content/30/16/2360.short (in my own experience, further integration of mutational/fusion covariates into FEM works VERY well)

TCGA AML p53 deletion/mutation integration/verification using 450k methylation clustering: http://www.nejm.org/doi/full/10.1056/NEJMoa1301689
(pretty sure we dropped this into the supplement... maybe I had better check, or just dig out the code)

Obviously, with a decent 450k/CNV + RNAseq/SNV calling pipeline, something like LAML becomes much more straightforward. It does, however, continue to illustrate the value of strictly enforcing sample identifiers during preprocessing and of verifying suspicious mappings (or mismappings) as part of initial QC. Because SNPs are used as a "barcode" for 450k data, they provide a way to label-verify, at the very least, CNV and DNA methylation; when SNPs are called in WGS and/or RNAseq, those calls can also be used for positive matching. I really have no idea if anyone else does this at present, but IMHO everyone should. The approach is also relatively easy to extend to WGBS and ATACseq label mapping (given a set of SNP calls).
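As a rough illustration of the label-verification idea (not the code referred to on this page), here is a minimal Python sketch. It assumes SNP-probe beta values from the 450k data can be discretized into 0/1/2 genotype calls, and that calls for the same SNPs, in the same order, are available from a second assay such as WGS or RNAseq. The function names, the 0.2/0.8 beta cutoffs, and the 0.9 concordance threshold are all hypothetical choices for the example, not values taken from any of the studies above.

```python
def call_genotypes(betas, low=0.2, high=0.8):
    """Discretize SNP-probe beta values into 0/1/2 calls.

    The cutoffs are illustrative assumptions; missing values (None) stay None.
    """
    return [None if b is None else 0 if b < low else 2 if b > high else 1
            for b in betas]


def concordance(calls_a, calls_b):
    """Fraction of SNPs with identical calls, ignoring SNPs missing in either."""
    pairs = [(a, b) for a, b in zip(calls_a, calls_b)
             if a is not None and b is not None]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def verify_labels(assay_a, assay_b, min_concordance=0.9):
    """For each label in assay_a, find the best-matching label in assay_b.

    Both arguments map sample label -> list of genotype calls over the same
    SNPs. Returns label -> (best_label, best_score, status), where status is
    "ok", "possible swap", or "unmatched".
    """
    report = {}
    for label, calls in assay_a.items():
        best_label, best_score = None, -1.0
        for other, other_calls in assay_b.items():
            score = concordance(calls, other_calls)
            if score > best_score:
                best_label, best_score = other, score
        if best_score < min_concordance:
            status = "unmatched"
        elif best_label == label:
            status = "ok"
        else:
            status = "possible swap"
        report[label] = (best_label, best_score, status)
    return report


# Toy data: S2 and S3 are deliberately swapped in the second assay.
methylation = {
    "S1": call_genotypes([0.05, 0.50, 0.95, 0.48, 0.02]),
    "S2": call_genotypes([0.93, 0.91, 0.04, 0.52, 0.97]),
    "S3": call_genotypes([0.51, 0.03, 0.49, 0.96, 0.50]),
}
sequencing = {
    "S1": [0, 1, 2, 1, 0],
    "S2": [1, 0, 1, 2, 1],
    "S3": [2, 2, 0, 1, 2],
}

for label, result in verify_labels(methylation, sequencing).items():
    print(label, result)
```

On the toy data, S1 matches itself while S2 and S3 each match the other's label best, which is the kind of mismapping worth flagging during initial QC. A real pipeline would need far more SNPs than five, plus some thought about noisy or low-coverage calls, before trusting a threshold like this.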

Will update with code as I assemble it --tjt, 3/26/2015
