Software

A multivariate phylogenomic subsampling protocol, implemented as a standalone R script that runs locally within minutes. The code estimates and exports a set of 15 gene properties from populations of alignments and gene trees. A subset of these properties is then used to infer a compound axis, termed phylogenetic usefulness, that identifies loci with high levels of phylogenetic signal and low evidence of potential sources of systematic bias. The matrix is sorted along this axis, and a predefined subset is output for downstream use in phylogenetic inference or time calibration.
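The sort-and-subset step can be illustrated with a minimal Python sketch. It assumes the compound axis can be approximated by the leading principal component of the standardized gene properties, which is an assumption made here for illustration and not a description of the R script. The property columns (a signal proxy and a bias proxy), the locus names, the values, and the function names are all hypothetical.

```python
import math

def standardize(rows):
    """Z-score each column (one gene property) across loci."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def first_axis(z, iterations=500):
    """Leading principal axis, found by power iteration on the covariance matrix."""
    n, p = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Asymmetric start so the vector is unlikely to be orthogonal to the target axis.
    v = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def rank_loci(names, rows, orient_by=0, keep=2):
    """Score loci along the axis, sort them, and return the top `keep` loci."""
    z = standardize(rows)
    axis = first_axis(z)
    # The sign of a principal axis is arbitrary: orient it so that the
    # property in column `orient_by` (the signal proxy) loads positively.
    if axis[orient_by] < 0:
        axis = [-a for a in axis]
    scores = {nm: sum(a * x for a, x in zip(axis, r)) for nm, r in zip(names, z)}
    ranked = sorted(names, key=lambda nm: scores[nm], reverse=True)
    return ranked[:keep], scores

# Hypothetical input: columns are [signal proxy, bias proxy].
names = ["locusA", "locusB", "locusC", "locusD"]
rows = [[0.9, 0.1], [0.7, 0.3], [0.4, 0.6], [0.2, 0.9]]
selected, scores = rank_loci(names, rows, keep=2)
print(selected)
```

Orienting the axis by a signal-related property matters because the direction of a principal component is arbitrary; without that step the sort could just as easily return the least useful loci.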

See publication in Molecular Biology and Evolution

Few tools exist to summarize the results of multiple time-calibrated analyses and explore the sensitivity of node age estimates to methodological decisions, such as the type of clock implemented or the set of fossil constraints applied. This R package (in development) extracts node ages from Bayesian time-calibrated trees and plots the posterior distributions of the most sensitive nodes. It also uses chronospaces (multidimensional representations of node ages) to assess the relative impact of methodological choices on the overall results. As such, it provides new statistical tools to summarize, visualize, and explore estimates of divergence times.
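To give a sense of what "most sensitive nodes" can mean, the sketch below ranks nodes by how far their mean posterior age moves between analytical settings. This is a deliberately simple stand-in written in Python, not the package's own method or API; the setting labels, the node ages, and the function names are hypothetical.

```python
from statistics import mean

def node_sensitivity(ages_by_setting):
    """Spread of mean node ages across settings.

    ages_by_setting maps a setting label to a list of posterior samples,
    each sample being a list of node ages in a fixed node order.
    Returns one value per node: max minus min of the per-setting means.
    """
    per_setting_means = [
        [mean(node_ages) for node_ages in zip(*samples)]
        for samples in ages_by_setting.values()
    ]
    return [max(col) - min(col) for col in zip(*per_setting_means)]

def most_sensitive_nodes(ages_by_setting, top=2):
    """Indices of the `top` nodes whose mean age shifts most between settings."""
    spread = node_sensitivity(ages_by_setting)
    order = sorted(range(len(spread)), key=lambda i: spread[i], reverse=True)
    return order[:top]

# Hypothetical posterior samples (ages in Ma) for three nodes under two clock models.
ages = {
    "clock_A": [[100, 50, 20], [102, 52, 21], [98, 48, 19]],
    "clock_B": [[120, 51, 20], [118, 49, 22], [122, 50, 21]],
}
spread = node_sensitivity(ages)
top_nodes = most_sensitive_nodes(ages, top=2)
print(spread, top_nodes)
```

A summary like this treats each node independently; the chronospace idea described above instead considers all node ages jointly as coordinates in a multidimensional space.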

Multi-optima Ornstein-Uhlenbeck (OUM) models are commonly used to depict the evolution of continuous traits across adaptive landscapes. Few methods can automatically detect the number and location of regime shifts, and those that do are either unable to work with trees containing fossil terminals (even though fossils have been shown to be crucial for model accuracy) or implement only a subset of OUM models. This R package extends current methods so that OUM models with variable rates and/or selection pressures can be fit to non-ultrametric trees, and provides new ways of exploring the temporal aspect of macroevolutionary innovation.
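For readers unfamiliar with these models, the following Python sketch simulates a single lineage evolving under an Ornstein-Uhlenbeck process, dX = alpha * (theta - X) dt + sigma dW, with a regime shift that changes the optimum (theta), the strength of selection (alpha), and the rate (sigma). It only illustrates the process itself using an Euler-Maruyama approximation; it does not detect shifts and is unrelated to the package's code. All parameter values and the function name are hypothetical.

```python
import math
import random

def simulate_ou(x0, regimes, dt=0.01, seed=0):
    """Euler-Maruyama simulation of an OU process with successive regimes.

    regimes is a list of (duration, theta, alpha, sigma) tuples applied in
    order, so every parameter is allowed to change at a regime shift.
    Returns the full trait trajectory, starting with x0.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for duration, theta, alpha, sigma in regimes:
        steps = int(round(duration / dt))
        for _ in range(steps):
            pull = alpha * (theta - x) * dt                     # attraction to the optimum
            noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)  # stochastic change
            x += pull + noise
            path.append(x)
    return path

# Hypothetical history: 10 time units around optimum 0, then a shift to
# optimum 5 with stronger selection and a higher rate.
stochastic = simulate_ou(0.0, [(10, 0.0, 1.0, 0.5), (10, 5.0, 2.0, 1.0)])

# With sigma = 0 the trait moves deterministically toward each optimum.
deterministic = simulate_ou(0.0, [(10, 0.0, 1.0, 0.0), (10, 5.0, 2.0, 0.0)])
print(round(deterministic[-1], 6))
```

Setting sigma to zero is a convenient sanity check: the trajectory should stay at the first optimum and then converge on the second one after the shift.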

See publication in Evolution

Taxonomic databases consolidate information that helps organize biodiversity into a hierarchy of nested clades. For marine organisms, WoRMS (World Register of Marine Species) has grown into the most comprehensive authority on taxonomic names. This script implements a web scraping approach that downloads the taxonomic hierarchy for a clade of interest with minimal user input. Having this information stored locally, in a form that can interact with other databases, can expedite research in numerous ways. For example, it can be used to compare the diversity of different lineages, fix synonyms, paint subtrees of a phylogeny, subsample alignments to one representative per major lineage, and classify taxa ahead of implementing phylogenetic comparative methods, to name just a few.
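One of the uses listed above, keeping a single representative per major lineage, can be sketched in Python once a hierarchy is stored locally. The nested layout, the key names (`rank`, `name`, `children`), the placeholder taxon names, and the function names below are all hypothetical; the script's actual output format is not described here.

```python
def flatten(node, lineage=None):
    """Walk a nested hierarchy and yield (terminal name, {rank: name}) pairs."""
    lineage = dict(lineage or {})
    lineage[node["rank"]] = node["name"]
    children = node.get("children", [])
    if not children:
        yield node["name"], lineage
    for child in children:
        yield from flatten(child, lineage)

def one_per_rank(table, rank):
    """Keep the first terminal encountered for each taxon at the given rank."""
    representatives = {}
    for terminal, lineage in table:
        representatives.setdefault(lineage.get(rank), terminal)
    return representatives

# Hypothetical hierarchy with placeholder names.
hierarchy = {
    "rank": "Class", "name": "Class_X", "children": [
        {"rank": "Order", "name": "Order_A", "children": [
            {"rank": "Species", "name": "sp_a1"},
            {"rank": "Species", "name": "sp_a2"},
        ]},
        {"rank": "Order", "name": "Order_B", "children": [
            {"rank": "Species", "name": "sp_b1"},
        ]},
    ],
}

table = list(flatten(hierarchy))
representatives = one_per_rank(table, "Order")
print(representatives)
```

The flat table of terminals and their lineages is the more reusable product: the same structure could drive diversity counts per lineage or the coloring of subtrees.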

Handling morphological datasets for phylogenetic inference is tricky, as characters are often adaptive, inter-dependent, and difficult to model. Nonetheless, they remain the only way of tackling the affinities of fossil taxa, allowing us to build total-evidence topologies that result in more accurate macroevolutionary inferences. TREvoSim is a simulation framework that simultaneously outputs character datasets and the phylogenetic trees along which they evolved. It does so without using any of the models employed for inference (neither birth-death nor Markov processes) and incorporates adaptive evolution, so the downstream analysis of its datasets involves a level of model misspecification, as is expected of empirical ones. It provides a unique perspective on the limits of morphological phylogenetics, and a benchmark to evaluate the accuracy of inference and comparative approaches.

See release of version 2.0 in Proceedings B