Zero extinction rate estimates
This data and code set is provided as supplementary material to the following manuscript:
Louca, S., Pennell, M.W. (2021). Why extinction estimates from extant phylogenies are so often zero. in review

Time-calibrated phylogenies of extant species ("extant timetrees") are widely used to estimate historical speciation and extinction rates by fitting stochastic birth-death models. These approaches have long been controversial as many phylogenetic studies report zero extinction in many taxa, contradicting the high extinction rates seen in the fossil record and the fact that the majority of species ever to have existed are now extinct. To date, the causes of this discrepancy remain unresolved. Here we provide a novel explanation for these "zero-inflated" extinction rate estimates, based on the recent discovery that there exist many alternative "congruent" diversification scenarios that cannot be distinguished based solely on extant timetrees. Due to such congruencies, estimation methods tend to converge to some scenario congruent to (i.e., statistically indistinguishable from) the true diversification scenario, but not necessarily to the true diversification scenario itself. This congruent scenario may exhibit negative extinction rates, a biologically meaningless but mathematically feasible situation, in which case estimators will tend to stick to the boundary estimate of zero extinction. Based on this explanation, we make multiple testable predictions, which we confirm using analyses of simulated trees and 121 empirical trees. In contrast to other proposed mechanisms for erroneous extinction rate estimates, our proposed mechanism specifically explains the zero-inflation of previous extinction rate estimates in the absence of detectable model violations, even for large trees. Not only do our results likely resolve a long-standing mystery in phylogenetics, they demonstrate that model congruencies can have severe consequences in practice.

Data and code overview
R code performing the main analyses described in the paper can be downloaded below. The code performs the following major tasks in sequence:
  • Fitting ELC birth-death models to timetrees simulated under time-dependent speciation and extinction rates, while either constraining the extinction rate to non-negative values (BDELC) or allowing for negative values (BDELCNeg).
  • Fitting BDELC and BDELCNeg models to a collection of empirical timetrees.
See the cited manuscript for detailed definitions and interpretations. The code has been tested with R v4.0.2 on macOS 10.13.6. It requires the R package castor v1.6.7 and will not work with older versions. For ease of reproducibility, all required inputs (empirical timetrees and metadata) are included with the code.

Please read the license agreement included with the code before using it. If you use any of the empirical timetrees provided below, please cite their respective publications. Citation information for each timetree can be found in the included file tree_descriptions.tsv.

Complete R code (includes required input trees).
Conceptual illustration of how a restriction to non-negative extinction rate estimates can lead to a zero-inflated distribution of estimates.
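
The idea behind this illustration can be sketched numerically with a toy model. The Python snippet below is not part of the provided R code and does not fit any birth-death model. It assumes a single parameter whose unconstrained maximum-likelihood estimate is a Gaussian sample mean scattered around a target value, which stands in for the extinction rate of whichever congruent scenario the estimator converges to. Under the constraint "estimate >= 0", the maximum-likelihood estimate of a Gaussian mean is max(0, sample mean), so every replicate whose unconstrained estimate is negative lands exactly on zero. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_estimates(target_rate, noise_sd, n_obs, n_replicates, seed=1):
    """Return (unconstrained, constrained) estimates for a toy one-parameter model.

    Each replicate draws n_obs noisy observations around target_rate.
    The unconstrained estimate is the sample mean; the estimate constrained
    to be non-negative is max(0, sample mean).
    """
    rng = random.Random(seed)
    unconstrained = []
    for _ in range(n_replicates):
        sample = [rng.gauss(target_rate, noise_sd) for _ in range(n_obs)]
        unconstrained.append(sum(sample) / len(sample))
    constrained = [max(0.0, estimate) for estimate in unconstrained]
    return unconstrained, constrained


def fraction_zero(estimates):
    """Fraction of estimates that are exactly zero."""
    return sum(1 for e in estimates if e == 0.0) / len(estimates)


def fraction_negative(estimates):
    """Fraction of estimates that are strictly negative."""
    return sum(1 for e in estimates if e < 0.0) / len(estimates)


if __name__ == "__main__":
    # Hypothetical setting: the estimator is drawn toward a slightly
    # negative target value, as for a congruent scenario with negative
    # extinction rate.
    free, bounded = simulate_estimates(
        target_rate=-0.05, noise_sd=0.5, n_obs=25, n_replicates=2000
    )
    print("negative when unconstrained:", fraction_negative(free))
    print("exactly zero when constrained:", fraction_zero(bounded))
```

In this toy setting the spike at zero in the constrained estimates has the same mass as the negative tail of the unconstrained estimates, which is the qualitative pattern the illustration describes.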

Distribution of present-day extinction rate estimates from simulated timetrees, when either constraining rates to be non-negative (left column) or allowing for negative extinction rates (right column).

(A,B) Present-day extinction rate estimates for simulated timetrees (vertical axis) compared to the true present-day extinction rate (horizontal axis), while either constraining rates to be non-negative (A) or allowing for negative rates (B). (C) Comparison of estimated present-day extinction rates when constrained to be non-negative (vertical axis) versus unconstrained (horizontal axis).

Top row: Speciation and extinction rates of an ELC birth-death model fitted to a simulated timetree, either constraining extinction rates to be non-negative (left column), or allowing extinction rates to be negative (middle column). The right column shows a model congruent to the one in (B), with speciation and extinction rates close to the truth. Bottom row: Lineages-through-time (LTT) curve of the tree compared to the deterministic LTT of the fitted or congruent models.
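
To make the notion of congruent models with a shared deterministic LTT more concrete, here is a minimal Python sketch. It is not the ELC model shown in the figure and not the provided R code. It assumes a constant-rate birth-death model with speciation rate lam, extinction rate mu and sampling fraction rho, and uses the standard closed-form expression for the probability that a lineage leaves no sampled descendants. It then builds a second, pure-birth model with complete sampling whose age-dependent speciation rate equals the "pulled" speciation rate of the first model, and checks numerically that both yield the same deterministic LTT. All function names and parameter values are hypothetical.

```python
import math


def dltt_constant_rates(age, lam, mu, rho, m0):
    """Deterministic LTT at a given age (time before present).

    Assumes constant rates with lam != mu; m0 is the number of sampled
    lineages at present (age 0).
    """
    r = lam - mu
    decay = math.exp(-r * age)
    return m0 * r * decay / (rho * lam + (lam * (1.0 - rho) - mu) * decay)


def pulled_speciation_rate(age, lam, mu, rho):
    """lam * (1 - E(age)), where E(age) is the probability that a lineage
    alive at the given age has no sampled descendants at present."""
    r = lam - mu
    decay = math.exp(-r * age)
    return lam * rho * r / (rho * lam + (lam * (1.0 - rho) - mu) * decay)


def dltt_pure_birth(ages, speciation_rate, m0):
    """Deterministic LTT of a fully sampled pure-birth model.

    Integrates dM/d(age) = -speciation_rate(age) * M with the trapezoid
    rule on the logarithm of M, starting from M = m0 at ages[0] = 0.
    """
    log_m = math.log(m0)
    values = [m0]
    for a0, a1 in zip(ages, ages[1:]):
        log_m -= 0.5 * (speciation_rate(a0) + speciation_rate(a1)) * (a1 - a0)
        values.append(math.exp(log_m))
    return values


def max_relative_difference(lam, mu, rho, m0, max_age=10.0, steps=2000):
    """Largest relative gap between the two deterministic LTT curves."""
    ages = [max_age * i / steps for i in range(steps + 1)]
    with_extinction = [dltt_constant_rates(a, lam, mu, rho, m0) for a in ages]
    pure_birth = dltt_pure_birth(
        ages, lambda a: pulled_speciation_rate(a, lam, mu, rho), m0
    )
    return max(abs(x - y) / x for x, y in zip(with_extinction, pure_birth))


if __name__ == "__main__":
    # Hypothetical rates: extinction at half the speciation rate.
    print(max_relative_difference(lam=1.0, mu=0.5, rho=0.8, m0=100.0))
```

The two models have very different extinction rates (mu versus zero), yet their deterministic LTT curves agree up to numerical integration error. This is the sense in which an extant timetree alone cannot tell them apart.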

Louca lab. Department of Biology, University of Oregon, Eugene, USA
© 2021 Stilianos Louca all rights reserved