The Code of Survival: Machine Learning Deciphers How Life Thrives in Extreme Conditions
Learning objectives
- Describe macroevolutionary and epidemiological patterns of biodiversity using clade size.
- Derive theoretical expectations for clade size under birth–death and coalescent diversification models.
- Use clade size to explain parameter non-identifiability in birth–death models and to distinguish between density-dependent and time-dependent diversification.
- Formulate and interpret predictions for epidemiological cluster size in growing versus declining epidemics.
- Explain the relationship between clade size and the site-frequency spectrum.
Speaker Bio
Infectious pathogens have important consequences for human and wildlife populations alike. Research in the MacPherson lab uses mathematical and statistical tools to address questions at the intersection of evolution, ecology, and epidemiology. In particular, our research focuses on quantifying how biological diversity, from the population to the phylogenetic scale, impacts and is impacted by infectious disease.
Many phylogenetic questions, whether about the major taxonomic groups or about viral phylogenies, are at their core questions about why some clades are so large while others are so small. Within the macroevolutionary context, decades of effort have been devoted to identifying why some clades are speciose and others depauperate, invoking explanations such as geography, species traits, and density- or time-dependent diversification. Within the epidemiological context, variation in the size of clades or epidemic clusters is often attributed to factors such as superspreading or the emergence of a variant of concern. Despite the field's implicit focus on clade size, there exist few theoretical expectations for clade size beyond Yule’s “Hollow Curve”, derived a century ago. Here I present some recent results on the size of clades in phylogenetic trees arising from the birth–death and coalescent diversification models. Within the macroevolutionary context, we show that clade size can provide insights into the limitations of current phylogenetic methods, including model and parameter non-identifiability, and can point to potential paths forward. Within the epidemiological context, we show that the size of clades is inextricably linked to epidemiological dynamics, with declining epidemics counterintuitively resulting in relatively large epidemiological clusters. Together, these results indicate that clade size should play a more central role in phylogenetic inference.
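As a rough illustration of what a theoretical expectation for clade size can look like, the sketch below considers only the simplest special case of a birth–death model: a pure-birth (Yule) process with no extinction. In that case, a clade started by one lineage has a geometrically distributed size after time t, with success probability exp(-birth_rate * t). This is a textbook property, not one of the results presented in the talk, and the function names and parameter values are assumptions chosen for illustration.

```python
import math
import random


def yule_clade_size_pmf(n, birth_rate, t):
    """Probability that a pure-birth clade started by one lineage has n lineages at time t."""
    p = math.exp(-birth_rate * t)  # geometric "success" probability
    return p * (1.0 - p) ** (n - 1)


def simulate_yule_clade_size(birth_rate, t, rng):
    """Simulate one clade: each lineage splits independently at rate birth_rate."""
    n, elapsed = 1, 0.0
    while True:
        # With n lineages, the waiting time to the next split is Exponential(n * birth_rate).
        elapsed += rng.expovariate(n * birth_rate)
        if elapsed > t:
            return n
        n += 1


# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(1)
birth_rate, t = 1.0, 2.0

sizes = [simulate_yule_clade_size(birth_rate, t, rng) for _ in range(20000)]
mean_size = sum(sizes) / len(sizes)
expected_mean = math.exp(birth_rate * t)  # mean of the geometric distribution

print(f"simulated mean clade size: {mean_size:.2f}")
print(f"expected mean clade size:  {expected_mean:.2f}")
```

Even in this minimal case the distribution is strongly skewed: the most likely clade size is 1, while the mean is much larger, so a few large clades coexist with many small ones.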