by Alex Washburne & Jamie Morton

Jamie and I have written a popular introduction to the isometric log-ratio transform, intended to help non-mathematicians understand what it is, why we use it, and how different methods use the same tool in different ways. The full write-up is available here, but the sections below give you the big picture:

Several recent papers we’ve been involved in have used the isometric log-ratio (ILR) transform to analyze microbiome datasets. The papers and their software packages include a phylogenetic transform (PhILR), a phylogenetic version of factor analysis (phylofactor), and balance trees for hierarchical clustering (gneiss). In this post, we demystify the ILR transform to help readers disentangle a literature that uses the same transform in different ways to perform different analyses.

The elevator speech is that the ILR transform is a reasonable and convenient way of measuring the difference between two groups of species. The three methods above all use the ILR transform to measure differences; where they differ is in which groups of species they compare.
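To make "the difference between two groups of species" concrete, here is a minimal sketch of a single ILR coordinate (a balance) in its standard textbook form: the log of the ratio of the two groups' geometric means, scaled by a factor that depends on the group sizes. The function names, the OTU labels and the counts are ours, made up for illustration; this is not the code of PhILR, phylofactor or gneiss, and it assumes all abundances are strictly positive (zeros need to be replaced before taking logs).

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def ilr_balance(abundances, group_r, group_s):
    """One ILR coordinate contrasting group_r against group_s.

    abundances: dict mapping a (hypothetical) OTU name to a positive abundance.
    group_r, group_s: disjoint lists of OTU names.
    """
    r, s = len(group_r), len(group_s)
    gm_r = geometric_mean([abundances[name] for name in group_r])
    gm_s = geometric_mean([abundances[name] for name in group_s])
    # sqrt(r*s / (r+s)) is the usual normalizing constant for a balance.
    return math.sqrt(r * s / (r + s)) * math.log(gm_r / gm_s)


# Illustrative counts only, not real data.
sample = {"otu1": 40, "otu2": 10, "otu3": 5, "otu4": 5}
b = ilr_balance(sample, ["otu1", "otu2"], ["otu3", "otu4"])

# Rescaling the whole sample (e.g. a deeper sequencing run) leaves the balance unchanged.
rescaled = {name: 10 * count for name, count in sample.items()}
b_rescaled = ilr_balance(rescaled, ["otu1", "otu2"], ["otu3", "otu4"])
```

Because only ratios enter the calculation, the coordinate depends on relative abundances rather than on the total number of reads, which is part of what makes it convenient for sequence-count data.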

<all the details about the ILR>

… PhILR, phylofactor and gneiss differ in which two groups they differentiate and in how the resulting ILR coordinates are interpreted. PhILR and phylofactor both require a phylogenetic tree as scaffolding for the coordinates. PhILR differentiates sister clades, so there is only one PhILR transform for a given tree, the tree can have no polytomies, and each coordinate corresponds to the difference between two sister clades, weighted by the branch length separating them. Phylofactor differentiates clades along edges in the tree according to which edge is “coolest”, so there are many phylofactorizations for a given tree, depending on the data and on how you define “cool”; its coordinates are interpreted as inferences on edges along which an important, functional ecological trait may have arisen. Gneiss differentiates groups of OTUs in more general hierarchical clustering schemes to investigate partitions that cannot be explained by phylogeny (for a fluent user, the machinery in gneiss could be used to perform phylogenetically-informed hierarchical clustering; stay tuned).
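As a rough illustration of "which two groups", the sketch below enumerates the two kinds of partition on a toy tree: one split per internal node (sister clades, the PhILR-style choice) and one candidate split per edge (the tips below the edge versus all other tips, the kind of split phylofactor chooses among). The nested-tuple tree, the tip names and the function names are assumptions made for this example. Branch-length weighting and the data-driven choice of the "coolest" edge are deliberately left out, and none of this is the packages' actual API.

```python
def tips(node):
    """Tip names under a node of a nested-tuple binary tree."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return tips(left) + tips(right)


def sister_clade_splits(node):
    """One split per internal node: left child clade vs right child clade."""
    if isinstance(node, str):
        return []
    left, right = node
    here = [(tips(left), tips(right))]
    return here + sister_clade_splits(left) + sister_clade_splits(right)


def edge_splits(tree):
    """One candidate split per edge: tips below the edge vs every other tip."""
    all_tips = tips(tree)
    splits = []

    def visit(node):
        if isinstance(node, str):
            return
        for child in node:
            below = tips(child)
            rest = [t for t in all_tips if t not in below]
            splits.append((below, rest))
            visit(child)

    visit(tree)
    return splits


# Hypothetical four-tip tree: (((A, B), C), D)
toy_tree = ((("A", "B"), "C"), "D")
sister = sister_clade_splits(toy_tree)  # fixed by the tree alone
edges = edge_splits(toy_tree)           # a menu of candidates to choose from
```

On this toy tree the sister-clade list has exactly one entry per internal node, which is why a fully resolved tree yields a single transform. The edge list is longer (the two edges at the root describe the same partition with the groups swapped), and which of its entries end up as coordinates depends on the data and the chosen objective.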

The full write-up contains some whiskey-laced mathematical fun that can help you understand isometric log-ratios enough to build your own ILR transform. Check it out here!


The tough part about being a mathematical biologist is that the mathematicians think you've gone soft and the biologists think you've gone crazy. Math's long marriage with physics motivated the development of calculus, differential equations and more. Now, big datasets in biology are allowing us to watch populations evolve and compete with other populations over resources. Consequently, biology is motivating the development of novel mathematical tools to understand evolutionary dynamics, competition, and how to analyze big data "in light of evolution". I received a B.S. in math & biology from the University of New Mexico and a PhD in quantitative & computational biology from Princeton University, where I studied with Simon Levin. Now, I'm wrapping up a post-doc with Diana Nemergut at Duke University and starting a new post-doc with Raina Plowright at Montana State University. I'm on team science, so feel free to write me with questions or for help; we're all on the same team in our effort to unravel the mysteries of nature.
