Papers by Olivier Marvens Francois

Annual Review of Ecology, Evolution, and Systematics, 2012
There is a growing interest in identifying ecological factors that influence adaptive genetic diversity patterns in both model and nonmodel species. The emergence of large genomic and environmental data sets, as well as the increasing sophistication of population genetics methods, provides an opportunity to characterize these patterns in relation to the environment. Landscape genetics has emerged as a flexible analytical framework that connects patterns of adaptive genetic variation to environmental heterogeneity in a spatially explicit context. Recent growth in this field has led to the development of numerous spatial statistical methods, prompting a discussion of the current benefits and limitations of these approaches. Here we provide a review of the design of landscape genetics studies, the different statistical tools, some important case studies, and perspectives on how future advances in this field are likely to shed light on important processes in evolution and ecology.

Frontiers in Genetics, 2012
In many species, spatial genetic variation displays patterns of "isolation-by-distance." Characterized by locally correlated allele frequencies, these patterns are known to create periodic shapes in geographic maps of principal components which confound signatures of specific migration events and influence interpretations of principal component analyses (PCA). In this study, we introduced models combining probabilistic PCA and kriging models to infer population genetic structure from genetic data while correcting for effects generated by spatial autocorrelation. The corresponding algorithms are based on singular value decomposition and low rank approximation of the genotypic data. As their complexity is close to that of PCA, these algorithms scale with the dimensions of the data. To illustrate the utility of these new models, we simulated isolation-by-distance patterns and broad-scale geographic variation using spatial coalescent models. Our methods remove the horseshoe patterns usually observed in PC maps and simplify interpretations of spatial genetic variation. We demonstrate our approach by analyzing single nucleotide polymorphism data from the Human Genome Diversity Panel, and provide comparisons with other recently introduced methods.

Statistics and Computing, 2009
Approximate Bayesian inference on the basis of summary statistics is well suited to complex problems for which the likelihood is either mathematically or computationally intractable. However, the methods that use rejection suffer from the curse of dimensionality when the number of summary statistics is increased. Here we propose a machine-learning approach to the estimation of the posterior density by introducing two innovations. The new method fits a nonlinear conditional heteroscedastic regression of the parameter on the summary statistics, and then adaptively improves estimation using importance sampling. The new algorithm is compared to the state-of-the-art approximate Bayesian methods, and achieves considerable reduction of the computational burden in two examples of inference in statistical genetics and in a queueing model.
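
For orientation, the rejection scheme that this abstract takes as its baseline can be sketched in a few lines. This is a minimal illustration of plain rejection ABC on a toy problem, not the regression-based method the paper proposes. The model (a normal mean with known unit variance), the uniform prior on [-5, 5], the tolerance, and every function name are assumptions chosen for the example.

```python
import random
import statistics


def sample_prior(rng):
    """Draw a candidate mean from an assumed Uniform(-5, 5) prior."""
    return rng.uniform(-5.0, 5.0)


def simulate_normal(theta, n, rng):
    """Simulate n observations from Normal(theta, 1), the assumed toy model."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_rejection(observed, prior_sampler, simulator, summary,
                  tolerance, n_draws, rng):
    """Plain rejection ABC with a single scalar summary statistic.

    A prior draw is kept when the summary of its simulated data set
    lies within `tolerance` of the summary of the observed data.
    """
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        s_sim = summary(simulator(theta, len(observed), rng))
        if abs(s_sim - s_obs) <= tolerance:
            accepted.append(theta)
    return accepted


def run_example(seed=1):
    """Run the toy example and return the accepted parameter values."""
    rng = random.Random(seed)
    observed = [2.1, 1.9, 2.4, 1.7, 2.2, 2.0, 1.8, 2.3]  # made-up data
    return abc_rejection(observed, sample_prior, simulate_normal,
                         statistics.fmean, tolerance=0.1,
                         n_draws=20000, rng=rng)


if __name__ == "__main__":
    posterior_sample = run_example()
    print(len(posterior_sample), statistics.fmean(posterior_sample))
```

With one summary statistic the acceptance rate stays workable. Adding more summaries at a fixed tolerance shrinks the acceptance region, which is the dimensionality problem the abstract refers to.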

Molecular Ecology Resources, 2010
This article reviews recent developments in Bayesian algorithms that explicitly include geographical information in the inference of population structure. Current models substantially differ in their prior distributions and background assumptions, falling into two broad categories: models with or without admixture. To aid users of this new generation of spatially explicit programs, we clarify the assumptions underlying the models, and we test these models in situations where their assumptions are not met. We show that models without admixture are not robust to the inclusion of admixed individuals in the sample, thus providing an incorrect assessment of population genetic structure in many cases. In contrast, admixture models are robust to an absence of admixture in the sample. We also give statistical and conceptual reasons why data should be explored using spatially explicit models that include admixture.

Molecular Ecology, 2012
Species range shifts in response to climate and land use change are commonly forecasted with species distribution models based on species occurrence or abundance data. Although appealing, these models ignore the genetic structure of species, and the fact that different populations might respond in different ways because of adaptation to their environment. Here, we introduced ancestry distribution models, that is, statistical models of the spatial distribution of ancestry proportions, for forecasting intra‐specific changes based on genetic admixture instead of species occurrence data. Using multi‐locus genotypes and extensive geographic coverage of distribution data across the European Alps, we applied this approach to 20 alpine plant species considering a global increase in temperature from 0.25 to 4 °C. We forecasted the magnitudes of displacement of contact zones between plant populations potentially adapted to warmer environments and other populations. While a global trend of mov...

Methods in Ecology and Evolution, 2012
Summary
1. Many recent statistical applications involve inference under complex models, where it is computationally prohibitive to calculate likelihoods but possible to simulate data. Approximate Bayesian computation (ABC) is devoted to these complex models because it bypasses the evaluation of the likelihood function by comparing observed and simulated data.
2. We introduce the R package ‘abc’ that implements several ABC algorithms for performing parameter estimation and model selection. In particular, the recently developed nonlinear heteroscedastic regression methods for ABC are implemented. The ‘abc’ package also includes a cross‐validation tool for measuring the accuracy of ABC estimates and to calculate the misclassification probabilities when performing model selection. The main functions are accompanied by appropriate summary and plotting tools.
3. R is already widely used in bioinformatics and several fields of biology. The R package ‘abc’ will make the ABC algorithms availabl...

Journal of Mathematical Biology, 2009
Many social animals live in stable groups, and it has been argued that kinship plays a major role in their group formation process. In this study we present the mathematical analysis of a recent model which uses kinship as a main factor to explain observed group patterns in a finite sample of individuals. We describe the average number of groups and the probability distribution of group sizes predicted by this model. Our method is based on the study of recursive equations underlying these quantities. We obtain asymptotic equivalents for probability distributions and moments as the sample size increases, and we exhibit power-law behaviours. Computer simulations are also utilized to measure the extent to which the asymptotic approximation can be applied with confidence.

PLoS ONE, Jan 31, 2011
The mainland of the Americas is home to a remarkable diversity of languages, and the relationships between genes and languages have attracted considerable attention in the past. Here we investigate to what extent geography and languages can predict the genetic structure of Native American populations.

In this article, we present several families of Bayesian hierarchical models dedicated to the analysis of population genetic structure from multi-locus genotypes. Bayesian analysis of genetic structure solves unsupervised classification problems from categorical data. One specific feature of population genetics models is that an individual's genome may originate from several genetic groups because of admixture. The originality of the models presented lies in the use of a hierarchical Bayesian framework that makes it possible to include, through a hidden regression layer, spatial and environmental covariates to model admixture. In addition, we present several model selection criteria for choosing the number of genetic groups as well as the set of spatial and environmental covariates. A first application of these models concerns the detection of the genetic structure of human populations and the relationships between genetic structure and linguistic classifications for Native American populations. A second application concerns the estimation of the structure of plant species and model forecasts under different climate change scenarios.

The Annals of Applied Probability, 2006
For two decades, the Colless index has been the most frequently used statistic for assessing the balance of phylogenetic trees. In this article, this statistic is studied under the Yule and uniform models of phylogenetic trees. The main tool of analysis is a coupling argument with another well-known index called the Sackin statistic. Asymptotics for the mean, variance and covariance of these two statistics are obtained, as well as their limiting joint distribution for large phylogenies. Under the Yule model, the limiting distribution arises as a solution of a functional fixed point equation. Under the uniform model, the limiting distribution is the Airy distribution. The cornerstone of this study is the fact that the probabilistic models for phylogenetic trees are strongly related to the random permutation and the Catalan models for binary search trees.
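
As a concrete reference for the two statistics named in this abstract, the sketch below computes them on small rooted binary trees using their standard definitions: the Colless index sums, over internal nodes, the absolute difference between the leaf counts of the two subtrees, and the Sackin index sums the leaf depths (equivalently, the number of leaves below each internal node). The nested-tuple tree encoding and the function name are assumptions made for illustration, and nothing here reproduces the paper's asymptotic analysis.

```python
def balance_indices(tree):
    """Return (n_leaves, colless, sackin) for a rooted binary tree.

    Assumed encoding: a leaf is any non-tuple value (e.g. a label string),
    and an internal node is a 2-tuple (left_subtree, right_subtree).
    """
    if not isinstance(tree, tuple):
        return 1, 0, 0  # a single leaf contributes nothing to either index
    left, right = tree
    n_left, colless_left, sackin_left = balance_indices(left)
    n_right, colless_right, sackin_right = balance_indices(right)
    n = n_left + n_right
    # Colless: add the imbalance |n_left - n_right| at this node.
    colless = colless_left + colless_right + abs(n_left - n_right)
    # Sackin: each leaf below this node sits one level deeper, so add n.
    sackin = sackin_left + sackin_right + n
    return n, colless, sackin


if __name__ == "__main__":
    caterpillar = ((("a", "b"), "c"), "d")  # maximally unbalanced, 4 leaves
    balanced = (("a", "b"), ("c", "d"))     # perfectly balanced, 4 leaves
    print(balance_indices(caterpillar))  # (4, 3, 9)
    print(balance_indices(balanced))     # (4, 0, 8)
```

The caterpillar tree maximizes both indices for a given number of leaves, while the fully balanced tree gives a Colless index of zero.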
In interphase, microtubules form a more or less dynamic network of fibers, usually originating at the centrosome. They play a role in intracellular movement and positioning of organelles (mitochondria, Golgi apparatus, cytoplasmic vesicles). When the cell enters mitosis, the interphase network disappears and microtubules start to assemble the mitotic spindle, the function of which is to segregate the chromosomes.