Phylogenetic trees are a starting point for the study of further evolutionary and ecological questions. We show that for avian evolutionary relationships, improved taxon sampling, longer sequences and additional data sets are giving stability to the prediction of the grouping of pelecaniforms and ciconiiforms, thus allowing inferences to be made about long-term niche occupancy. Here we report the phylogeny of the pelecaniform birds and their water-carnivore allies using complete mitochondrial genomes, and show that the basic groupings agree with nuclear sequence phylogenies, even though many short branches are not yet fully resolved. In detail, we show that the Pelecaniformes (minus the tropicbird) and the Ciconiiformes (storks, herons and ibises) form a natural group within a seabird water-carnivore clade. We find pelicans are the closest relatives of the shoebill (in a clade with the hammerkop), and we confirm that tropicbirds are not pelecaniforms. In general, the group appears to be an adaptive radiation into an 'aquatic carnivore' niche that it has occupied for 60-70 million years. From an ecological and life history perspective, the combined pelecaniform-ciconiiform group is more informative than focusing on differences in morphology. These findings allow a start to integrating molecular evolution and macroecology.
We report three developments towards resolving the challenge of the apparent basal polytomy of neoavian birds. First, we describe improved conditional down-weighting techniques to reduce noise relative to signal for deeper divergences and find increased agreement between datasets. Second, we present formulae for calculating the probabilities of finding predefined groupings in the optimal tree. Finally, we report a significant increase in data: nine new mitochondrial genomes (the dollarbird, New Zealand kingfisher, great potoo, Australian owlet-nightjar, white-tailed trogon, barn owl, a roadrunner (a ground cuckoo), New Zealand long-tailed cuckoo and the peach-faced lovebird), which together provide data for each of the six main groups of Neoaves proposed by Cracraft in 2001. We use his six main groups of modern birds as priors for evaluation of results. These include passerines; cuckoos; parrots; and three other groups termed 'WoodKing' (woodpeckers/rollers/kingfishers), 'SCA' (owls/potoos/owlet-nightjars/hummingbirds/swifts) and 'Conglomerati'. In general the support is highly significant, with just two exceptions: the owls move from the 'SCA' group to the raptors, particularly accipitrids (buzzards/eagles) and the osprey, and the shorebirds may be an independent group from the rest of the 'Conglomerati'. Molecular dating with mitochondrial genomes supports a major diversification of at least 12 neoavian lineages in the Late Cretaceous. Our results form a basis for further testing with both nuclear coding sequences and rare genomic changes.
Simulations were used to study the performance of several character-based and distance-based phylogenetic methods in obtaining the correct tree from pseudo-randomly generated input data. The study included all the topologies of unrooted binary trees with 4 to 10 pendant vertices (taxa) inclusive. The length of the character sequences used ranged exponentially from 10 to 10^5 characters. The methods studied include Closest Tree, Compatibility, Li's method, Maximum Parsimony, Neighbor-joining, Neighborliness, and UPGMA. We also provide a modification to Li's method (SimpLi) which is consistent with additive data. We give estimations of the sequence lengths required for given confidence in the output of these methods under the assumptions of molecular evolution used in this study. A notation for characterizing all tree topologies is described. We show that when the number of taxa, the maximum path length, and the minimum edge length are held constant, there is little but significant dependence of the performance of the methods on the tree topology. We show that those methods that are consistent with the model used perform similarly, whereas the inconsistent methods, UPGMA and Li's method, perform very poorly.
Cold Spring Harbor Symposia on Quantitative Biology, 1987
... These three developments made it possible to show that minimal trees from five ... It has been known for some time (Felsenstein 1978), however, that some evolutionary models, even with ... Likelihood methods have a theoretical advantage in handling these difficult cases, but they ...
The evolutionary origin of the pinnipeds (seals, sea lions, and walruses) is still uncertain. Most authors support a hypothesis of a monophyletic origin of the pinnipeds from a caniform carnivore. A minority view suggests a diphyletic origin with true seals being related to the mustelids (otters and ferrets). The phylogenetic relationships of the walrus to other pinniped and carnivore families are also still particularly problematic. Here we examined the relative support for mono- and diphyletic hypotheses using DNA sequence data from the mitochondrial small subunit (12S) rRNA and cytochrome b genes. We first analyzed a small group of taxa representing the three pinniped families (Phocidae, Otariidae, and Odobenidae) and caniform carnivore families thought to be related to them. We inferred phylogenetic reconstructions from DNA sequence data using standard parsimony and neighbor-joining algorithms for phylogenetic inference as well as a new method called spectral analysis (Hendy and Penny) in which phylogenetic information is displayed independently of any selected tree. We identified and compensated for potential sources of error known to lead to selection of incorrect phylogenetic trees. These include sampling error, unequal evolutionary rates on lineages, unequal nucleotide composition among lineages, unequal rates of change at different sites, and inappropriate tree selection criteria. To correct for these errors, we performed additional transformations of the observed substitution patterns in the sequence data, applied more stringent structural constraints to the analyses, and included several additional taxa to help resolve long, unbranched lineages in the tree. We find that there is strong support for a monophyletic origin of the pinnipeds from within the caniform carnivores, close to the bear/raccoon/panda radiation. Evidence for a diphyletic origin was very weak and can be partially attributed to unequal nucleotide compositions among the taxa analyzed. Subsequently, there is slightly more evidence for grouping the walrus with the eared seals than with the true seals. A more conservative interpretation, however, is that the walrus is an early, but not the first, independent divergence from the common pinniped ancestor.
The phylogenetic branching order of the green algal groups that gave rise to land plants remains uncertain despite its fundamental importance to understanding plant evolution. Previous studies have demonstrated that land plants evolved from streptophyte algae, but different lineages of streptophytes have been suggested to be the sister group of land plants. To better understand the evolutionary history of land plants and to determine the potential effects of "long-branch attraction" in phylogenetic reconstruction, we analyzed a chloroplast genome data set including three new chloroplast genomes from streptophyte algae: Coleochaete orbicularis (Coleochaetales), Nitella hookeri (Charales), and Spirogyra communis (Zygnematales). We further applied a site pattern sorting method together with site- and time-heterogeneous models to investigate the branching order among streptophytes and land plants. Our chloroplast phylogenomic analyses support previous hypotheses based on nuclear data in placing Zygnematales alone, or a clade consisting of Coleochaetales plus Zygnematales, as the closest living relatives of land plants.
We report the chloroplast genomes of a tree fern (Dicksonia squarrosa) and a "fern ally" (Tmesipteris elongata), and show that the phylogeny of early land plants is basically as expected, and the estimates of divergence time are largely unaffected after removing the fastest evolving sites. The tree fern shows a major reduction in the rate of evolution, and there has been a major slowdown in the rate of mutation in both families of tree ferns. We suggest that this is related to a generation time effect; if there is a long time period between generations, then this is probably incompatible with a high mutation rate because otherwise nearly every propagule would probably have several lethal mutations. This effect will be especially strong in organisms that have large numbers of cell divisions between generations. This shows the necessity of going beyond phylogeny and integrating its study with other properties of organisms.
The questions that we are considering here are about ecological and microevolutionary processes rather than taxonomy; thus, we use the term 'dinosaur' in its usual (non-avian) meaning.
The aim is to use DNA sequence data to test between vicariance and long-range dispersal (by floating seed-pods) explanations for the origin and range of the Edwardsia species of Sophora (Sophoreae: Papilionoideae: Leguminosae).
Passerines are the largest avian order, and the 6000 species comprise more than half of all extant bird species. This successful radiation probably had its origin in the Australasian region, but dating this origin has been difficult due to a scarce fossil record and poor biogeographic assumptions. Many of New Zealand's endemic passerines fall within the deeper branches of the passerine radiation, and a well-resolved phylogeny for the modern New Zealand element in the deeper branches of the oscine lineage will help us understand both oscine and passerine biogeography. To this end we present complete mitochondrial genomes representing all families of New Zealand passerines in a phylogenetic framework of over 100 passerine species. Dating analyses of this robust phylogeny suggest Passeriformes originated in the early Paleocene, with the major lineages of oscines 'escaping' from Australasia about 30 Ma, and radiating throughout the world during the Oligocene. This independently derived conclusion is consistent with the passerine fossil record.
Charles Darwin has had more impact on biological sciences, and society generally, than any other 19th-century biologist. Yet his modus operandi as a scientist is poorly known by evolutionists, and often seriously misinterpreted.
Land plants are a natural group, and the charophyte algae, which comprise six morphologically diverged groups, are the lineages closest to land plants. The conjugating green algae (Zygnematales) are now suggested to be the extant sister group to land plants, providing a new understanding of character evolution and early multicellular innovations in land plants. We review recent molecular phylogenetic work on the origin of land plants and discuss some future directions in phylogenomic analyses.
We report a new transformation, the LogDet, that is consistent for sequences with differing nucleotide composition and that have arisen under simple but asymmetric stochastic models of evolution. This transformation is required because existing methods tend to group sequences on the basis of their nucleotide composition, irrespective of their evolutionary history. This effect of differing nucleotide frequencies is illustrated by using a tree-selection criterion on a simple distance measure defined solely on the basis of base composition, independent of the actual sequences. The new LogDet transformation uses determinants of the observed divergence matrices and works because multiplication of determinants (real numbers) is commutative, whereas multiplication of matrices is not, except in special symmetric cases. The use of determinants thus allows more general models of evolution with asymmetric rates of nucleotide change. The transformation is illustrated on a theoretical data set (...
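The determinant-based distance described in the LogDet abstract can be sketched in a few lines. This is a minimal illustration only, assuming the unnormalised form d_xy = -ln det(F_xy), where F_xy is the 4x4 "divergence matrix" of joint base proportions for two aligned sequences; the abstract does not give the exact formula, normalised variants exist, and the function names below are hypothetical.

```python
import math

BASES = "ACGT"

def divergence_matrix(seq_x, seq_y):
    """Return the 4x4 matrix F, where F[i][j] is the proportion of
    aligned sites with base i in seq_x and base j in seq_y."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    index = {base: i for i, base in enumerate(BASES)}
    counts = [[0] * 4 for _ in range(4)]
    used = 0
    for a, b in zip(seq_x.upper(), seq_y.upper()):
        if a in index and b in index:  # skip gaps and ambiguity codes
            counts[index[a]][index[b]] += 1
            used += 1
    if used == 0:
        raise ValueError("no comparable sites")
    return [[c / used for c in row] for row in counts]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def logdet_distance(seq_x, seq_y):
    """Unnormalised LogDet-style distance (an assumed form): -ln det(F)."""
    det = determinant(divergence_matrix(seq_x, seq_y))
    if det <= 0.0:
        raise ValueError("determinant is not positive; distance undefined")
    return -math.log(det)
```

Because det(F) equals det of the transpose of F, the value is the same whichever sequence is listed first. Note that in this unnormalised form two identical sequences with equal base frequencies give ln 256 rather than zero, which is one reason normalised variants are used in practice.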
Early in the history of DNA, thymine replaced uracil, thus solving a short-term problem for storing genetic information--mutation of cytosine to uracil through deamination. Any engineer would have replaced cytosine, but evolution is a tinkerer, not an engineer. By keeping cytosine and replacing uracil the problem was never eliminated, returning once again with the advent of DNA methylation.
Opinion is strongly divided on whether life arose on earth under hot or cold conditions, the hot-start and cold-start scenarios, respectively. The origin of life close to deep thermal vents appears as the majority opinion among biologists, but there is considerable biochemical evidence that high temperatures are incompatible with an RNA world. To be functional, RNA has to fold into a three-dimensional structure. We report both theoretical and experimental results on RNA folding and show that (as expected) hot conditions strongly reduce RNA folding. The theoretical results come from energy-minimization calculations of the average extent of folding of RNA, mainly from 0-90 degrees C, for both random sequences and tRNA sequences. The experimental results are from circular-dichroism measurements of tRNA over a similar range of temperatures. The quantitative agreement between calculations and experiment is remarkable, even to the shape of the curves indicating the cooperative nature of R...
One of the most useful features of molecular phylogenetic analyses is the potential for estimating dates of divergence of evolutionary lineages from the DNA of extant species. But lineage-specific variation in rate of molecular evolution complicates molecular dating, because a calibration rate estimated from one lineage may not be an accurate representation of the rate in other lineages. Many molecular dating studies use a "clock test" to identify and exclude sequences that vary in rate between lineages. However, these clock tests should not be relied upon without a critical examination of their effectiveness at removing rate-variable sequences from any given data set, particularly with regard to the sequence length and number of variable sites. As an illustration of this problem we present a power test of a frequently employed triplet relative rates test. We conclude that (1) relative rates tests are unlikely to detect moderate levels of lineage-specific rate variation (w...
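To make the idea of a triplet relative rates test concrete, here is a minimal sketch. The abstract does not say which test was evaluated, so this uses Tajima's relative rate test as a stand-in: for two ingroup sequences and an outgroup, it counts sites where only one ingroup sequence differs from the other two and compares the two counts with a chi-square statistic on one degree of freedom. The function name is hypothetical.

```python
def tajima_relative_rate(seq1, seq2, outgroup):
    """Return (m1, m2, chi2) for a Tajima-style relative rate test.

    m1 counts sites where seq1 alone differs (seq2 == outgroup);
    m2 counts sites where seq2 alone differs (seq1 == outgroup).
    Under equal rates m1 and m2 are expected to be equal, and
    chi2 = (m1 - m2)^2 / (m1 + m2) is compared with a chi-square
    distribution on 1 degree of freedom (3.84 at the 5% level).
    """
    if not (len(seq1) == len(seq2) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    m1 = m2 = 0
    for a, b, o in zip(seq1.upper(), seq2.upper(), outgroup.upper()):
        if not {a, b, o} <= valid:  # skip gaps and ambiguity codes
            continue
        if a != b and b == o:
            m1 += 1
        elif a != b and a == o:
            m2 += 1
    if m1 + m2 == 0:
        return m1, m2, 0.0  # no informative sites, nothing to test
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    return m1, m2, chi2
```

For example, with outgroup "AAAAAAAAAA", seq1 "CCCCCCAAAA" and seq2 "AAAAAAAAGG", the counts are m1 = 6 and m2 = 2, giving chi2 = 2.0. That is below the 3.84 threshold, so a threefold difference in lineage-specific changes goes undetected on so few variable sites, which is the kind of low power the abstract warns about.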
New quantitative methods are applied to the 135 human mitochondrial sequences from the Vigilant et al. data set. General problems in analyzing large numbers of short sequences are discussed, and an improved strategy is suggested. A key feature is to focus not on individual trees but on the general "landscape" of trees. Over 1,000 searches were made from random starting trees with only one tree (a local optimum) being retained each time, thereby ensuring optima were found independently. A new tree comparison metric was developed that is unaffected by rearrangements of trees around many very short internal edges. Use of this metric showed that downweighting hypervariable sites revealed more evolutionary structure than studies that weighted all sites equally. Our results are consistent with convergence toward a global optimum. Crucial features are that the best optima show very strong regional differentiation, a common group of 49 African sequences is found in all the best op...
In order to make use of the high throughput available with next-generation sequencing technology, we developed a pipeline for sequencing and de novo assembly of multiple mitochondrial genomes without the costs of indexing. We first used simulations to explore the ability of existing sequence assembly algorithms to separate and assemble sequences from different sources. Once optimised, the same methods were successfully applied to reads from a single lane of an Illumina Genome Analyzer flow cell containing a mixture of PCR products from six different mitochondrial genomes. More recently, we applied a modified version of the same pipeline to four more mixtures, this time using total genomic DNA, and successfully assembled 17 mitochondrial genomes.
Papers by David Penny