Time-calibrated phylogenies of extant species (referred to here as "extant timetrees") are widely used for estimating diversification dynamics.
However, there has been considerable debate surrounding the reliability of these inferences and, to date, this critical question remains unresolved.
Here we clarify the precise information that can be extracted from extant timetrees under the generalized birth–death model, which underlies most existing methods of estimation.
We prove that, for any diversification scenario, there exists an infinite number of alternative diversification scenarios that are equally likely to have generated any given extant timetree.
These "congruent" scenarios cannot possibly be distinguished using extant timetrees alone, even in the presence of infinite data.
Importantly, congruent diversification scenarios can exhibit markedly different and yet similarly plausible dynamics, which suggests that many previous studies may have over-interpreted phylogenetic evidence.
We introduce identifiable and easily interpretable variables that contain all available information about past diversification dynamics, and demonstrate that these can be estimated from extant timetrees.
We suggest that measuring and modelling these identifiable variables offers a more robust way to study historical diversification dynamics.
Our findings also make it clear that palaeontological data will continue to be crucial for answering some macroevolutionary questions.
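The idea of congruent scenarios can be illustrated with a small numerical sketch. This is not the authors' code (theirs is in R), and the function names are hypothetical. It assumes complete species sampling, a first scenario with constant speciation rate `lam` and constant extinction rate `mu` (with `lam > mu`), and the standard birth–death result that the expected lineages-through-time curve of an extant timetree is governed by the "pulled speciation rate" `lam * (1 - E)`, where `E(age)` is the probability that a lineage alive at that age leaves no sampled descendants. The second scenario has no extinction at all; its speciation rate is chosen to solve `d lam*/d age = lam* (r - lam*)` with the same present-day value, where `r = lam - mu`.

```python
import math


def pulled_speciation_rate_constant(lam, mu, age):
    """Pulled speciation rate lam * (1 - E(age)) for constant rates.

    Assumes complete sampling and lam > mu. E(age) is the probability that
    a lineage alive at `age` (time before present) has no extant descendants.
    """
    r = lam - mu
    decay = math.exp(-r * age)
    extinct_prob = mu * (1.0 - decay) / (lam - mu * decay)
    return lam * (1.0 - extinct_prob)


def congruent_rate_without_extinction(lam0, r, age):
    """Speciation rate of an alternative scenario with zero extinction.

    Closed-form (logistic) solution of d lam*/d age = lam* (r - lam*),
    with lam*(0) = lam0. With zero extinction and complete sampling,
    E = 0, so this is also that scenario's pulled speciation rate.
    """
    return r * lam0 / (lam0 + (r - lam0) * math.exp(-r * age))


# Illustrative parameter values (assumptions, not taken from the paper).
lam, mu = 1.0, 0.5
for age in (0.0, 1.0, 5.0, 20.0):
    a = pulled_speciation_rate_constant(lam, mu, age)
    b = congruent_rate_without_extinction(lam, lam - mu, age)
    print(f"age={age:5.1f}  scenario 1: {a:.6f}  scenario 2: {b:.6f}")
```

Under these assumptions the two columns agree at every age, even though one scenario has substantial extinction and the other has none, which is the sense in which such scenarios cannot be told apart from an extant timetree alone.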
Commentary by evolutionary biologist Mark Pagel
Evolutionary trees can’t reveal speciation and extinction rates. Nature News and Views
Data and code overview
Example R code demonstrating the main analyses described in the paper can be downloaded below.
The code covers the analyses of the Cetacea tree (Steeman et al. 2009), the seed plant tree (Smith et al. 2018) and the fossil-based origination and extinction rates of marine invertebrate genera (Alroy 2008).
It also demonstrates some of the simulations described in the paper's Supplement. The simulated trees analyzed in the paper are provided below as well.
Please read the license agreement included in the code prior to using it.
If you use any of the real datasets provided below (Cetacea tree, seed plant tree, fossil-based rates), please cite their respective publications.
|Code demonstrating the analyses in the paper (includes required real input data).|
|Simulated timetrees used in the paper (in Newick format).|
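
For readers who want to inspect Newick-formatted timetrees outside R (where `read.tree` from the `ape` package is a common choice), the following is a minimal Python sketch, not part of the provided code. The function names and the example tree string are hypothetical. It handles only plain Newick (no quoted labels or bracketed comments), uses recursion and so may not suit very deep trees, and checks whether all tips are equidistant from the root, as expected for a timetree of extant species.

```python
import re

_LABEL = re.compile(r"[^,():]*")
_NUMBER = re.compile(r"[-+0-9.eE]+")


def parse_newick(text):
    """Parse a plain Newick string into nested dicts.

    Each node is {'name': str, 'length': float, 'children': list}.
    Quoted labels and [comments] are not supported in this sketch.
    """
    s = re.sub(r"\s+", "", text).rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if pos < len(s) and s[pos] == "(":
            pos += 1  # consume '('
            children.append(parse_node())
            while pos < len(s) and s[pos] == ",":
                pos += 1  # consume ','
                children.append(parse_node())
            if pos >= len(s) or s[pos] != ")":
                raise ValueError("unbalanced parentheses in Newick string")
            pos += 1  # consume ')'
        label = _LABEL.match(s, pos).group(0)
        pos += len(label)
        length = 0.0  # a missing branch length is treated as zero
        if pos < len(s) and s[pos] == ":":
            number = _NUMBER.match(s, pos + 1)
            if number is None:
                raise ValueError("missing branch length after ':'")
            length = float(number.group(0))
            pos = number.end()
        return {"name": label, "length": length, "children": children}

    return parse_node()


def tip_depths(node, depth=0.0):
    """Return {tip name: distance from the root} for all tips below node."""
    if not node["children"]:
        return {node["name"]: depth}
    depths = {}
    for child in node["children"]:
        depths.update(tip_depths(child, depth + child["length"]))
    return depths


def is_ultrametric(tree, rel_tol=1e-6):
    """True if all tips lie at (nearly) the same distance from the root."""
    depths = list(tip_depths(tree).values())
    return max(depths) - min(depths) <= rel_tol * max(depths)


# Made-up example tree, not one of the provided files.
tree = parse_newick("((A:1,B:1):2,(C:1.5,D:1.5):1.5);")
print(len(tip_depths(tree)), "tips; ultrametric:", is_ultrametric(tree))
```

The tolerance in `is_ultrametric` is an arbitrary choice; branch lengths written to text files are rounded, so an exact equality test would be too strict.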