Abstract
Among people living with human immunodeficiency virus type 1 (HIV-1), the long-term persistence of a population of cells carrying transcriptionally silent integrated viral DNA (provirus) remains the primary barrier to developing an effective cure. Ongoing proliferation of infected cells is generally considered to be the driving force behind the persistence of this latent HIV-1 reservoir. The contribution of this mechanism (clonal expansion) is supported by the observation that proviral sequences sampled from the reservoir are often identical. This outcome is quantified as the ‘clonality’ of the sample population, e.g., the fraction of proviral sequences observed more than once. However, clonality as a quantitative measure is inconsistently defined and its statistical properties are not well understood. In this Reflections article, we use mathematical and phylogenetic frameworks to formally examine the inherent problems of using clonality to characterize the dynamics and proviral composition of the reservoir. We describe how clonality is not adequate for this task, owing to the inherent complexity of how infected cells are ‘labeled’ by proviral sequences (the outcome of a sampling process from the evolutionary history of active viral replication before treatment), as well as variation in cell birth and death rates among lineages and over time. Lastly, we outline potential directions in statistical and phylogenetic research to address these issues.