Fast Parallel UniFrac
A common approach in microbial ecology studies involving many samples is to calculate the "UniFrac" distance between all pairs of samples, and then perform various analyses on the resulting distance matrix.
The phyloseq package includes a native R implementation of both the original UniFrac algorithm and the better, faster, cleaner "Fast UniFrac" algorithm. Both approaches arrive at the same result. There are also two distinct types of UniFrac calculation:
Weighted UniFrac - which takes into account differences in the abundance of species between samples, but takes longer to calculate; and
Unweighted UniFrac - which only considers the presence/absence of species in each pair of samples.
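To make the distinction concrete, here is a minimal base-R sketch on a toy tree. It is not phyloseq's implementation: the tree, the tip names `t1` to `t4`, the abundance vectors, and the helper functions `unweighted_unifrac` and `weighted_unifrac` are all hypothetical, and the weighted version shown is the raw (non-normalized) form. The idea illustrated is that the unweighted distance is the fraction of branch length leading to taxa found in only one of the two samples, while the weighted distance sums branch lengths scaled by the difference in relative abundance below each branch.

```r
# Hypothetical toy tree with four tips, described edge by edge.
# Each edge has a branch length and the set of tips descending from it.
edges <- list(
  list(length = 1.0, tips = c("t1")),
  list(length = 1.0, tips = c("t2")),
  list(length = 0.5, tips = c("t1", "t2")),
  list(length = 1.5, tips = c("t3")),
  list(length = 1.5, tips = c("t4")),
  list(length = 0.5, tips = c("t3", "t4"))
)

# Unweighted: branch length unique to one sample / branch length observed in either.
unweighted_unifrac <- function(edges, a, b) {
  unique_len <- 0
  total_len <- 0
  for (e in edges) {
    in_a <- sum(a[e$tips]) > 0
    in_b <- sum(b[e$tips]) > 0
    if (in_a || in_b) total_len <- total_len + e$length
    if (xor(in_a, in_b)) unique_len <- unique_len + e$length
  }
  unique_len / total_len
}

# Weighted (raw, non-normalized): branch length times the absolute
# difference in relative abundance below that branch, summed over edges.
weighted_unifrac <- function(edges, a, b) {
  pa <- a / sum(a)
  pb <- b / sum(b)
  total <- 0
  for (e in edges) {
    total <- total + e$length * abs(sum(pa[e$tips]) - sum(pb[e$tips]))
  }
  total
}

# Two made-up samples containing the same taxa in very different proportions.
a <- c(t1 = 9, t2 = 1, t3 = 0, t4 = 0)
b <- c(t1 = 1, t2 = 9, t3 = 0, t4 = 0)

uw <- unweighted_unifrac(edges, a, b)  # 0: identical presence/absence
w  <- weighted_unifrac(edges, a, b)    # greater than 0: abundances differ
```

In this made-up case the unweighted distance treats the two samples as identical, while the weighted distance separates them, which is the kind of difference in insight the two measures can give.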
Both can be useful, and they offer slightly different insights. Both weighted and unweighted UniFrac are included, and all UniFrac calculations have the option of running "in parallel" for faster results on computers that have multiple cores/processors available.
All of this is accessed through a single function call:
UniFrac(physeq, weighted=FALSE, normalized=TRUE, parallel=FALSE, fast=TRUE)
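As a usage sketch, the call might look like the following. It assumes the `esophagus` example dataset that ships with phyloseq, and it assumes that the `doParallel` package is installed to register a parallel backend before `parallel=TRUE` is used; the core count of 2 is an arbitrary choice for illustration.

```r
library(phyloseq)
library(doParallel)

# Small example dataset bundled with phyloseq (includes a phylogenetic tree).
data(esophagus)

# Unweighted UniFrac, run serially (the defaults shown in the signature above).
d_unweighted <- UniFrac(esophagus, weighted = FALSE)

# Register a parallel backend (assumption: doParallel, 2 cores),
# then compute weighted, normalized UniFrac in parallel.
registerDoParallel(cores = 2)
d_weighted <- UniFrac(esophagus, weighted = TRUE, normalized = TRUE, parallel = TRUE)

# Both results are distance matrices over the samples.
d_unweighted
d_weighted
```

The resulting distance objects can be passed on to downstream analyses such as ordination or clustering.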