Papers by Sebastian Ahnert

Nature Communications, 2015
Proteins assemble into complexes with diverse quaternary structures. Although most heteromeric complexes of known structure have even stoichiometry, a significant minority have uneven stoichiometry, that is, differing numbers of each subunit type. To adopt this uneven stoichiometry, sequence-identical subunits must be asymmetric with respect to each other, forming different interactions within the complex. Here we first investigate the occurrence of uneven stoichiometry, demonstrating that it is common in vitro and is likely to be common in vivo. Next, we elucidate the structural determinants of uneven stoichiometry, identifying six different mechanisms by which it can be achieved. Finally, we study the frequency of uneven stoichiometry across evolution, observing a significant enrichment in bacteria compared with eukaryotes. We show that this arises due to a general increased tendency for bacterial proteins to self-assemble and form homomeric interactions, even within the context of a heteromeric complex.

PloS one, 2014
The provision of healthcare in rural African communities is a highly complex and largely unsolved problem. Two main difficulties are the identification of individuals that are most likely affected by disease and the prediction of responses to health interventions. Social networks have been shown to capture health outcomes in a variety of contexts. Yet, it is an open question as to what extent social network analysis can identify and distinguish among households that are most likely to report poor health and those most likely to respond to positive behavioural influences. We use data from seven highly remote, post-conflict villages in Liberia and compare two prominent network measures: in-degree and betweenness. We define in-degree as the frequency with which members from one household are named by another household as a friend. Betweenness is defined as the proportion of shortest friendship paths between any two households in a network that traverses a particular household. We find t...
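The two measures defined in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the household labels and friendship nominations are hypothetical, each directed edge (a, b) is assumed to mean "household a names household b as a friend", and betweenness is reported as an unnormalised sum over ordered household pairs rather than as a normalised proportion.

```python
from collections import deque


def in_degree(edges):
    """Count how often each household is named as a friend."""
    counts = {}
    for src, dst in edges:
        counts.setdefault(src, 0)
        counts[dst] = counts.get(dst, 0) + 1
    return counts


def _bfs(adj, source):
    """Return shortest-path distances and shortest-path counts from source."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                paths[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                paths[w] += paths[u]
    return dist, paths


def betweenness(edges):
    """Sum, over ordered pairs (s, t), the fraction of shortest s-t paths via v."""
    nodes = sorted({n for edge in edges for n in edge})
    adj = {}
    for src, dst in set(edges):  # ignore repeated nominations here
        adj.setdefault(src, []).append(dst)
    info = {n: _bfs(adj, n) for n in nodes}
    score = {n: 0.0 for n in nodes}
    for s in nodes:
        dist_s, paths_s = info[s]
        for t in nodes:
            if t == s or t not in dist_s:
                continue
            for v in nodes:
                if v in (s, t) or v not in dist_s:
                    continue
                dist_v, paths_v = info[v]
                # v lies on a shortest s-t path only if the distances add up.
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    score[v] += paths_s[v] * paths_v[t] / paths_s[t]
    return score


# Hypothetical nominations between four households (not data from the study).
nominations = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]
print(in_degree(nominations))
print(betweenness(nominations))
```

In this toy network household C is named most often and also sits on the only shortest paths from A and B to D, so it scores highest on both measures; in general the two measures need not agree.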
The electronic version of this article is the complete one and can be found online at
The cultural diversity of culinary practice, as illustrated by the variety of regional cuisines, raises the question of whether there are any general patterns that determine the ingredient combinations used in food today or principles that transcend individual tastes and recipes. We introduce a flavor network that captures the flavor compounds shared by culinary ingredients. Western cuisines show a tendency to use ingredient pairs that share many flavor compounds, supporting the so-called food pairing hypothesis. By contrast, East Asian cuisines tend to avoid compound-sharing ingredients. Given the increasing availability of information on food preparation, our data-driven investigation opens new avenues towards a systematic understanding of culinary practice.

PLoS ONE, 2014
Competition is ubiquitous in many complex biological, social, and technological systems, playing an integral role in the evolutionary dynamics of the systems. It is often useful to determine the dominance hierarchy or the rankings of the components of the system that compete for survival and success based on the outcomes of the competitions between them. Here we propose a ranking method based on the random walk on the network representing the competitors as nodes and competitions as directed edges with asymmetric weights. We use the edge weights and node degrees to define the gradient on each edge that guides the random walker towards the weaker (or the stronger) node, which enables us to interpret the steady-state occupancy as the measure of the node's weakness (or strength) that is free of unwarranted degree-induced bias. We apply our method to two real-world competition networks and explore the issues of ranking stabilization and prediction accuracy, finding that our method outperforms other methods including the baseline win-loss differential method in sparse networks.

PLoS ONE, 2008
While genome-wide gene expression data are generated at an increasing rate, the repertoire of approaches for pattern discovery in these data is still limited. Identifying subtle patterns of interest in large amounts of data (tens of thousands of profiles) associated with a certain level of noise remains a challenge. A microarray time series was recently generated to study the transcriptional program of the mouse segmentation clock, a biological oscillator associated with the periodic formation of the segments of the body axis. A method related to Fourier analysis, the Lomb-Scargle periodogram, was used to detect periodic profiles in the dataset, leading to the identification of a novel set of cyclic genes associated with the segmentation clock. Here, we applied four distinct mathematical methods to the same microarray time series dataset to identify significant patterns in gene expression profiles. These methods are called Phase consistency, Address reduction, Cyclohedron test and Stable persistence, and are based on different conceptual frameworks that are either hypothesis- or data-driven. Some of the methods, unlike Fourier transforms, are not dependent on the assumption of periodicity of the pattern of interest. Remarkably, these methods blindly identified the expression profiles of known cyclic genes as the most significant patterns in the dataset. Many candidate genes predicted by more than one approach appeared to be true positive cyclic genes and will be of particular interest for future research. In addition, these methods predicted novel candidate cyclic genes that were consistent with previous biological knowledge and experimental validation in mouse embryos.
Our results demonstrate the utility of these novel pattern detection strategies, notably for detection of periodic profiles, and suggest that combining several distinct mathematical approaches to analyze microarray datasets is a valuable strategy for identifying genes that exhibit novel, interesting transcriptional patterns.

Physical Review E, 2011
We investigate the evolutionary dynamics of an idealised model for the robust self-assembly of two-dimensional structures called polyominoes. The model includes rules that encode interactions between sets of square tiles that drive the self-assembly process. The relationship between the model's rule set and its resulting self-assembled structure can be viewed as a genotype-phenotype map and incorporated into a genetic algorithm. The rule sets evolve under selection for specified target structures. The corresponding, complex fitness landscape generates rich evolutionary dynamics as a function of parameters such as the population size, search space size, mutation rate, and method of recombination. Furthermore, these systems are simple enough that in some cases the associated model genome space can be completely characterised, shedding light on how the evolutionary dynamics depends on the detailed structure of the fitness landscape. Finally, we apply the model to study the emergence of the preference for dihedral over cyclic symmetry observed for homomeric protein tetramers.

Journal of Theoretical Biology, 2008
Despite tremendous advances in the field of genomics, the amount and function of the large non-coding part of the genome in higher organisms remains poorly understood. Here we report an observation, made for 37 fully sequenced eukaryotic genomes, which indicates that eukaryotes require a certain minimum amount of non-coding DNA (ncDNA). This minimum increases quadratically with the amount of DNA located in exons. Based on a simple model of the growth of regulatory networks, we derive a theoretical prediction of the required quantity of ncDNA and find it to be in excellent agreement with the data. The amount of additional ncDNA (in basepairs) which eukaryotes require obeys $N_{DEF} = \frac{1}{2} \frac{N_C}{N_P} (N_C - N_P)$, where $N_C$ is the amount of exonic DNA, and $N_P$ is a constant of about 10 Mb. This value $N_{DEF}$ corresponds to a few percent of the genome in Homo sapiens and other mammals, and up to half the genome in simpler eukaryotes. Thus our findings confirm that eukaryotic life depends on a substantial fraction of ncDNA and also make a prediction of the size of this fraction, which matches the data closely.
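As a rough illustration of the formula quoted in this abstract, the following Python sketch evaluates the required amount of non-coding DNA for a few values of exonic DNA. The function name and the input values are hypothetical, and the constant is set to exactly 10 Mb as an assumption, since the abstract gives it only as "about 10Mb".

```python
def minimum_noncoding_dna(n_c, n_p=10e6):
    """Evaluate N_DEF = 1/2 * (N_C / N_P) * (N_C - N_P), in base pairs.

    n_c: amount of exonic DNA in base pairs.
    n_p: constant from the abstract, assumed here to be exactly 10 Mb.
    """
    return 0.5 * (n_c / n_p) * (n_c - n_p)


# Hypothetical exonic DNA amounts, chosen for illustration only.
for n_c in (10e6, 20e6, 40e6):
    n_def = minimum_noncoding_dna(n_c)
    print(f"N_C = {n_c / 1e6:.0f} Mb -> N_DEF = {n_def / 1e6:.0f} Mb")
```

Doubling the exonic DNA from 20 Mb to 40 Mb raises the result from 10 Mb to 60 Mb, which reflects the quadratic growth the abstract describes. The example only uses values with N_C at or above N_P, where the expression is non-negative.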
Journal of Physics A: Mathematical and Theoretical, 2008
In recent work we presented a new approach to the analysis of weighted networks, by providing a straightforward generalization of any network measure defined on unweighted networks. This approach is based on the translation of a weighted network into an ensemble of edges, and is particularly suited to the analysis of fully connected weighted networks. Here we apply our method to several such networks including distance matrices, and show that the clustering coefficient, constructed by using the ensemble approach, provides meaningful insights into the systems studied. In the particular case of two datasets from microarray experiments the clustering coefficient identifies a number of biologically significant genes, outperforming existing identification approaches.
Entropy, 2013
Various statistical-mechanics approaches to complex networks have been proposed to describe expected topological properties in terms of ensemble averages. Here we extend this formalism by introducing the fundamental concept of graph temperature, controlling the degree of topological optimization of a network. We recover the temperature-dependent version of various important models as particular cases of our approach, and show examples where, remarkably, the onset of a percolation transition, a scale-free degree distribution, correlations and clustering can be understood as natural properties of an optimized (low-temperature) topology. We then apply our formalism to real weighted networks and we compute their temperature, finding that various techniques used to extract information from complex networks are again particular cases of our approach.
Developmental Biology, 2011