2005, Manuscript distributed under the terms of the GNU Free Documentation License. 31 pp.
1. Introduction
2. Implicit enumeration for nt terminals (nt >= 2)
3. Find optimal binary trees using branch and bound, for nt terminals (nt >= 2)
4. Build a tree by stepwise addition (n terminals, n >= 3)
5. Branch swapping
   5.a. Introduction
   5.b. A tree search strategy using SPR rearrangements of given trees
   5.c. A tree search strategy using TBR rearrangements of given trees
6. Ratcheting
7. Tree drifting
8. Tree fusing
9. Static approximation
10. An integrated approach
11. Some quick comments on time complexity
References
SIAM Journal on Discrete Mathematics, 2016
International Journal of Applied Pattern Recognition, 2016
While phylogenetic trees are widely used in bioinformatics, one major problem is that different dendrograms may be constructed depending on several factors. Although numerous quantitative measures for comparing two phylogenetic trees have been proposed, visual comparison is often necessary. Displaying a pair of alternative phylogenetic trees together, by finding a suitable order of taxa at the leaf level, was considered earlier as a way to give better visual insight into how similar two dendrograms are. This approach raises the problem of branch crossing. Here, a couple of efficient methods are presented for counting the number of branch crossings in the trees for a given taxa order. Using the number of branch crossings as a fitness function, genetic algorithms are used to find a taxa order such that two alternative phylogenetic trees can be shown with a semi-minimal number of branch crossings. A couple of methods for encoding a taxa order into, and decoding it from, a chromosome to which genetic operators can be applied are also given.
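The abstract does not give its crossing-count methods, so the following is only an illustrative proxy, not the authors' algorithm. It assumes trees are written as nested Python tuples of taxon names and treats a clade as "crossed" by every outside taxon that falls between its leaves in the shared taxa order. The names `clusters`, `crossing_proxy` and `crossing_fitness` are hypothetical.

```python
def clusters(tree):
    """Return the leaf set below every internal node of a nested-tuple tree."""
    found = []

    def walk(node):
        if not isinstance(node, tuple):
            return frozenset([node])          # a leaf is just its own name
        below = frozenset().union(*(walk(child) for child in node))
        found.append(below)
        return below

    walk(tree)
    return found


def crossing_proxy(tree, order):
    """Count (clade, outside taxon) pairs where the taxon sits inside the clade's span.

    Assumption: this stands in for a true branch-crossing count. It is zero
    exactly when every clade occupies a contiguous block of the taxa order.
    """
    position = {taxon: i for i, taxon in enumerate(order)}
    total = 0
    for cluster in clusters(tree):
        places = [position[taxon] for taxon in cluster]
        span = max(places) - min(places) + 1
        total += span - len(cluster)          # foreign taxa inside the span
    return total


def crossing_fitness(tree_a, tree_b, order):
    """Fitness of one shared taxa order for two trees (lower is better)."""
    return crossing_proxy(tree_a, order) + crossing_proxy(tree_b, order)


if __name__ == "__main__":
    tree_a = (("A", "B"), ("C", "D"))
    tree_b = (("A", "C"), ("B", "D"))
    print(crossing_fitness(tree_a, tree_b, ["A", "B", "C", "D"]))
```

A genetic algorithm of the kind described would call a function like `crossing_fitness` on each candidate taxa order and keep the orders with the lowest values.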
Zeitschrift für Naturforschung C, 1979
An algorithm for constructing phylogenetic trees is analyzed.
Sequence Data and Phylogenetic Trees
Molecular Phylogeny
Understanding evolutionary relationships between different organisms is a fundamental aspect of modern-day biology. Tree structures are generally used to depict these relationships. In the days of Charles Darwin, rough tree sketches were based on fossil records, morphology and geographical distribution [1]. This is no longer the case. With the advent of sequencing technologies [2] and the realization that both DNA and amino acid sequences could be used to accurately determine the relationship between different organisms [3], a plethora of tree-producing algorithms has emerged [4], along with a branch of science referred to as molecular phylogeny. Molecular phylogeny is the science of estimating evolutionary histories using DNA and amino acid sequences. The first step in producing an evolutionary history is the identification of homologous sequences. These are sequences that share a common ancestry [5]. There are different types of homology, which include orthology and paralogy. Orthologous sequences share similarities because they originated from a common ancestor. Paralogous sequences, on the other hand, share similarities due to gene duplication events within an individual species. To infer the evolutionary history between different organisms, orthologous sequences are required. These can be aligned, after which trees representing the evolutionary relationships between the sequences can be inferred. To improve the accuracy of the evolutionary relationships within the tree, models of sequence evolution are incorporated. Once a tree has been created, there are many programs available for viewing and analysing the tree topology. In this chapter a few of the many aspects of molecular phylogeny will be discussed.
Global Alignments
Orthologous HIV sequences can be obtained from the Los Alamos HIV sequence database using the search interface provided at http://www.hiv.lanl.gov.
Before a tree can be inferred, the sequences must be aligned. In 1970 Needleman and Wunsch published an algorithm for performing a global pairwise alignment of two sequences [6]. The algorithm matches together as many characters as possible between two input sequences regardless of their lengths. It uses a process referred to as dynamic programming and is guaranteed to find the alignment with the highest score under the chosen scoring scheme. The score between two sequences provides information about their evolutionary relationship to each other. When more than two sequences are present, the scores between all combinations of sequence pairs form the starting point for producing a multiple alignment, which is built up progressively. The best-known programs implementing this approach are the Clustal series of programs [7-10] and the more recent Muscle [11].
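The dynamic-programming idea behind global pairwise alignment can be sketched as follows. This is a minimal illustration rather than the implementation used by any of the programs named above. The scoring values (`match=1`, `mismatch=-1`, `gap=-1`) and the linear gap penalty are assumptions chosen for simplicity.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap                 # a[:i] against nothing but gaps
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),   # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,              # gap in b
                score[i][j - 1] + gap,              # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Filling the table takes time proportional to the product of the two sequence lengths, which is why pairwise scores are usually computed once and then reused when a multiple alignment is assembled.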
Biodata Mining, 2011
Background: In a typical "left-to-right" phylogenetic tree, the vertical order of taxa is meaningless, as only the branch path between them reflects their degree of similarity. To make unresolved trees more informative, here we propose an innovative Evolutionary Algorithm (EA) method to search for the best graphical representation of unresolved trees, in order to give a biological meaning to the vertical order of taxa.
Methods: Starting from a West Nile virus phylogenetic tree, in a (1 + 1)-EA we evolved it by randomly rotating the internal nodes and selecting the tree with better fitness every generation. The fitness is a sum of genetic distances between the considered taxon and the r (radius) next taxa. After having set the radius to the value giving the best performance, we evolved the trees with (λ + μ)-EAs to study the influence of population on the algorithm.
Results: The (1 + 1)-EA consistently outperformed a random search, and better results were obtained by setting the radius to 8. The (λ + μ)-EAs performed as well as the (1 + 1)-EA, except for the larger population (1000 + 1000).
Conclusions: The trees after the evolution showed an improvement both in fitness (which is based on a genetic distance matrix, so that taxa placed close together are actually genetically close) and in biological interpretation. Samples collected in the same state or year moved closer to each other, making the tree easier to interpret. Biological relationships between samples are also easier to observe.
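A (1 + 1)-EA of the kind described can be sketched as below, under several assumptions that are not taken from the paper. Trees are nested Python tuples, a "rotation" reverses the child order of one randomly chosen internal node, lower fitness is treated as better, and ties are accepted. The names `leaves`, `neighbour_fitness`, `rotate_random_node` and `one_plus_one_ea` are hypothetical.

```python
import itertools
import random


def leaves(tree):
    """Return the taxa in top-to-bottom drawing order."""
    if not isinstance(tree, tuple):
        return [tree]
    order = []
    for child in tree:
        order.extend(leaves(child))
    return order


def neighbour_fitness(tree, dist, radius):
    """Sum the distances between each taxon and the next `radius` taxa in the order."""
    order = leaves(tree)
    total = 0.0
    for i, taxon in enumerate(order):
        for other in order[i + 1:i + 1 + radius]:
            total += dist[frozenset((taxon, other))]
    return total


def rotate_random_node(tree, rng):
    """Return a copy of the tree with one random internal node's children reversed."""
    def count(node):
        return 0 if not isinstance(node, tuple) else 1 + sum(count(c) for c in node)

    target = rng.randrange(count(tree))
    numbering = itertools.count()             # pre-order index of internal nodes

    def rebuild(node):
        if not isinstance(node, tuple):
            return node
        index = next(numbering)
        children = tuple(rebuild(child) for child in node)
        return children[::-1] if index == target else children

    return rebuild(tree)


def one_plus_one_ea(tree, dist, radius, generations, seed=0):
    """Keep one parent; replace it whenever a mutated child is at least as good."""
    rng = random.Random(seed)
    best, best_fit = tree, neighbour_fitness(tree, dist, radius)
    for _ in range(generations):
        child = rotate_random_node(best, rng)
        child_fit = neighbour_fitness(child, dist, radius)
        if child_fit <= best_fit:
            best, best_fit = child, child_fit
    return best, best_fit
```

Because a rotation only reorders children, the tree topology and the set of taxa are unchanged; only the vertical order in which the taxa are drawn is optimised.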
2015
A central theme in phylogenetics is the reconstruction and analysis of evolutionary trees from a given set of data. To determine the optimal search methods for the reconstruction of trees, it is crucial to understand the size and structure of neighbourhoods of trees under tree rearrangement operations. The diameter and size of the immediate neighbourhood of a tree have been well studied; however, little is known about the number of trees at distance two, three or (more generally) k from a given tree. In this thesis we explore previous results on the size of these neighbourhoods under common tree rearrangement operations (NNI, SPR and TBR). We obtain new results concerning the number of trees at distance k from a given tree under the Robinson-Foulds (RF) metric and the Nearest Neighbour Interchange (NNI) operation, and the number of trees at distance two from a given tree under the Subtree Prune and Regraft (SPR) operation. We also obtain an exact count for the number of pairs of binar...
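For readers unfamiliar with the Robinson-Foulds metric mentioned above, the sketch below computes it as the number of non-trivial splits present in exactly one of two trees. It assumes both trees are nested Python tuples over the same set of string taxon names and that the tuple's root is an arbitrary rooting of an unrooted tree. The function names are hypothetical.

```python
def nontrivial_splits(tree):
    """Return the non-trivial splits of a tree, each stored as one canonical side."""
    found = set()

    def walk(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        found.add(below)
        return below

    everything = walk(tree)
    reference = min(everything)               # fixes which side represents a split
    result = set()
    for cluster in found:
        # Always keep the side that does NOT contain the reference taxon.
        side = everything - cluster if reference in cluster else cluster
        # A split is non-trivial when both of its sides have at least two taxa.
        if 1 < len(side) < len(everything) - 1:
            result.add(side)
    return result


def robinson_foulds(tree_a, tree_b):
    """Count the splits found in exactly one of the two trees."""
    return len(nontrivial_splits(tree_a) ^ nontrivial_splits(tree_b))


if __name__ == "__main__":
    t1 = (("A", "B"), ("C", "D"))
    t2 = (("A", "C"), ("B", "D"))
    t3 = ("A", ("B", ("C", "D")))             # same unrooted tree as t1
    print(robinson_foulds(t1, t2), robinson_foulds(t1, t3))
```

Storing each split by the side that excludes a fixed reference taxon makes the result independent of where the tuple happens to be rooted.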
Mathematical Biosciences, 1982
Two practical branch and bound algorithms for determining minimal and near-minimal phylogenetic trees from protein sequence data are presented. A mathematical description and analysis of phylogenetic trees introduces these algorithms. A comment on efficiency and fine tuning completes the paper. An example is cited where computer time was reduced from an estimated 55 days for a total search to just under 5 minutes.
Algorithms for Molecular Biology
Background: The supertree problem, i.e., the task of finding a common refinement of a set of rooted trees, is an important topic in mathematical phylogenetics. The special case of a common leaf set L is known to be solvable in linear time. Existing approaches refine one input tree using information from the others and then test whether the results are isomorphic.
Results: An O(k|L|) algorithm for constructing the common refinement T of k input trees with a common leaf set L is proposed that explicitly computes the parent function of T in a bottom-up approach.
Conclusion: The proposed algorithm is simpler to implement than other asymptotically optimal algorithms for the problem and outperforms the alternatives in empirical comparisons.
Availability: An implementation in Python is freely available at https://github.com/david-schaller/tralda.
Proceedings of The National Academy of Sciences, 1979
ArXiv, 2021
Applied Soft Computing, 2014
Systematic Biology, 2001
Journal of Molecular Evolution, 1979
International Journal of Bioscience, Biochemistry and Bioinformatics, 2014
Intelligent Systems in Molecular Biology, 2001
ieeexplore.ieee.org, 2007
Lecture Notes in Computer Science, 2012
Lecture Notes in Computer Science, 2001
Journal of Mathematical Biology, 2012
Molecular Phylogenetics and Evolution, 2001
Proceedings. IEEE Computer Society Bioinformatics Conference
BMC Bioinformatics, 2011