2005, Journal of Combinatorial Theory, Series A
The study of combinatorics on words, or finite sequences of symbols from a finite alphabet, finds applications in several areas of biology, computer science, mathematics, and physics. Molecular biology, in particular, has stimulated considerable interest in the study of combinatorics on partial words, which are sequences that may have a number of "do not know" symbols, also called "holes". This paper is devoted to a fundamental result on periods of words, the Critical Factorization Theorem, which states that the period of a word is always locally detectable in at least one position of the word, resulting in a corresponding critical factorization. Here, we describe precisely the class of partial words w with one hole for which the weak period is locally detectable in at least one position of w. Our proof provides an algorithm which computes a critical factorization when one exists. A World Wide Web server interface at http://www.uncg.edu/mat/cft/ has been established for automated use of the program. We thank Ajay Chriscoe for very valuable comments and suggestions, and for implementing Algorithm 2 and creating a World Wide Web site for this research. We also thank the referee of a preliminary version of this paper for his/her very valuable comments and suggestions. Sequences that may have a number of "do not know" symbols are referred to as partial words and appear, for instance, when genes or proteins are compared. Another area of current interest for the study of the combinatorics on partial words is data communication where some information may be missing, lost, or unknown. While a word can be described by a total function, a partial word can be described by a partial function.
Theoretical Computer Science, 2004
The study of the combinatorial properties of strings of symbols from a finite alphabet (also referred to as words) is profoundly connected to numerous fields such as biology, computer science, mathematics, and physics. Research in combinatorics on words goes back roughly a century. There is a renewed interest in combinatorics on words as a result of emerging new application areas such as molecular biology. Partial words were recently introduced in this context. The motivation behind the notion of a partial word is the comparison of genes (or proteins). Alignment of two genes (or two proteins) can be viewed as a construction of partial words that are said to be compatible. While a word can be described by a total function, a partial word can be described by a partial function. More precisely, a partial word of length n over a finite alphabet A is a partial function from {1, ..., n} into A. Elements of {1, ..., n} without an image are called holes. A word is just a partial word without holes. The notion of period of a word is central in combinatorics on words. In the case of partial words, there are two notions: one is that of period, the other is that of local period. This paper extends to partial words with one hole the well-known result of Guibas and Odlyzko, which states that for every word u, there exists a word v of the same length as u over the alphabet {0, 1} such that the set of all periods of u coincides with the set of all periods of v. Our result states that for every partial word u with one hole, there exists a partial word v of the same length as u with at most one hole over the alphabet {0, 1} such that the set of all periods of u coincides with the set of all periods of v and the set of all local periods of u coincides with the set of all local periods of v.
To prove our result, we use the technique of Halava, Harju and Ilie which they used ...
* This material is based upon work supported by the National Science Foundation under Grants CCR-9700228 and CCR-0207673. A Research Assignment from the University of North Carolina at Greensboro is gratefully acknowledged. I thank Phuongchi Thi Le for very valuable comments and suggestions. She received a research assistantship from the University of North Carolina at Greensboro to work with me on this project.
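The description of a partial word as a partial function can be made concrete with a small sketch. It assumes a Python dictionary as the partial function, with positions numbered 1 to n as in the abstract; the names `holes`, `is_full_word`, and `render`, and the use of `?` to display a hole, are illustrative choices and not notation from the paper.

```python
from typing import Dict, Set

# Assumption: a partial word of length n is stored as a dict that maps
# some of the positions 1..n to letters; positions with no entry are holes.
PartialWord = Dict[int, str]


def holes(u: PartialWord, n: int) -> Set[int]:
    """Positions in {1, ..., n} that have no image under u."""
    return {i for i in range(1, n + 1) if i not in u}


def is_full_word(u: PartialWord, n: int) -> bool:
    """A word is a partial word without holes."""
    return not holes(u, n)


def render(u: PartialWord, n: int, hole_symbol: str = "?") -> str:
    """Write the partial word as a string, showing holes as hole_symbol."""
    return "".join(u.get(i, hole_symbol) for i in range(1, n + 1))


u = {1: "a", 2: "b", 4: "a"}   # length 4, position 3 is a hole
print(holes(u, 4))             # {3}
print(render(u, 4))            # ab?a
print(is_full_word(u, 4))      # False
```

Storing only the defined positions keeps the "partial function" reading literal; a string with a reserved hole character would work equally well for most computations.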
Computers & Mathematics with Applications, 2004
A partial word of length n over a finite alphabet A is a partial map from {0, … , n-1} into A. Elements of {0, … , n-1} without image are called holes (a word is just a partial word without holes). A fundamental periodicity result on words due to Fine and Wilf [1] intuitively determines how far two periodic events have to match in order to guarantee a common period. This result was extended to partial words with one hole by Berstel and Boasson [2] and to partial words with two or three holes by Blanchet-Sadri and Hegstrom [3]. In this paper, we give an extension to partial words with an arbitrary number of holes.
Computers & Mathematics with Applications, 2003
The study of the combinatorial properties of strings of symbols from a finite alphabet, also referred to as words, is profoundly connected to numerous fields such as biology, computer science, ...
Theoretical Computer Science, 2002
A word of length n over a finite alphabet A is a map from {0, ..., n − 1} into A. A partial word of length n over A is a partial map from {0, ..., n − 1} into A. In the latter case, elements of {0, ..., n − 1} without image are called holes (a word is just a partial word without holes). In this paper, we extend a fundamental periodicity result on words due to Fine and Wilf to partial words with two or three holes. This study was initiated by Berstel and Boasson for partial words with one hole. Partial words are motivated by molecular biology.
Fine and Wilf's well-known theorem states that any word having periods p, q and length at least p + q − gcd(p, q) also has gcd(p, q), the greatest common divisor of p and q, as a period. Moreover, the length p + q − gcd(p, q) is critical since counterexamples can be provided for shorter words. This result has since been extended to partial words, or finite sequences that may contain a number of "do not know" symbols or "holes." More precisely, any partial word u with H holes having weak periods p, q and length at least the so-denoted l_H(p, q) also has strong period gcd(p, q) provided u is not (H, (p, q))-special. This extension was done for one hole by Berstel and Boasson (where the class of (1, (p, q))-special partial words is empty), for two or three holes by Blanchet-Sadri and Hegstrom, and for an arbitrary number of holes by Blanchet-Sadri. In this paper, we further extend these results, allowing an arbitrary number of weak periods. In addition to speciality, the concepts of intractable period sets and interference between periods play a role.
* This material is based upon work supported by the National Science Foundation under Grant No. DMS-0452020. We thank the referees of a preliminary version of this paper for their very valuable comments and suggestions.
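The abstract relies on weak and strong periods without defining them. The sketch below assumes the usual definitions from the partial-words literature: p is a weak period of u if u(i) = u(i + p) whenever both positions are defined, and a strong period if u(i) = u(j) whenever both are defined and i ≡ j (mod p). The `?` hole marker and the function names are illustrative assumptions.

```python
HOLE = "?"  # assumption: '?' stands in for a "do not know" symbol


def has_weak_period(u: str, p: int) -> bool:
    """Positions i and i + p agree whenever neither of them is a hole."""
    return all(
        u[i] == u[i + p] or HOLE in (u[i], u[i + p])
        for i in range(len(u) - p)
    )


def has_strong_period(u: str, p: int) -> bool:
    """All defined positions in the same residue class mod p carry one letter."""
    for r in range(min(p, len(u))):
        letters = {c for c in u[r::p] if c != HOLE}
        if len(letters) > 1:
            return False
    return True


# A hole can "absorb" a mismatch locally, so a weak period need not be strong.
print(has_weak_period("a?b", 1))    # True: a~?, ?~b
print(has_strong_period("a?b", 1))  # False: a and b share residue class 0
print(has_strong_period("ab?bab", 2))  # True
```

Every strong period is also a weak period, and for words without holes the two notions coincide with the ordinary notion of period.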
Information and Computation, 2008
The concept of periodicity has played over the years a central role in the development of combinatorics on words and has been a highly valuable tool for the design and analysis of algorithms. Fine and Wilf's famous periodicity result, which is one of the most used and known results on words, has extensions to partial words, or sequences that may have a number of "do not know" symbols. These extensions fall into two categories: the ones that relate to strong periodicity and the ones that relate to weak periodicity. In this paper, we obtain consequences by generalizing, in particular, the combinatorial property that "for any word u over {a, b}, ua or ub is primitive," which proves in some sense that there exist very many primitive partial words.
The concept of periodicity has played over the years a central role in the development of combinatorics on words and has been a highly valuable tool for the design and analysis of algorithms. There are many fundamental periodicity results on words. Among them is the famous result of Fine and Wilf which intuitively determines how far two periodic events have to match in order to guarantee a common period. This result states that for positive integers p and q, if the word u has periods p and q and the length of u is not less than p + q − gcd(p, q), then u also has period gcd(p, q). Fine and Wilf's result, which is one of the most used and known results on words, has extensions to partial words, or sequences that may have a number of "do not know" symbols. These extensions fall into two categories: the ones that relate to strong periodicity and the ones that relate to weak periodicity. In this paper, we study some consequences of these results.
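The Fine and Wilf statement for full words can be checked by brute force on small cases. The following is a minimal sketch, restricted to a binary alphabet by assumption; the function names are illustrative and the exhaustive search is only practical for short lengths.

```python
from itertools import product
from math import gcd


def has_period(word: str, p: int) -> bool:
    """True if word[i] == word[i + p] for every i with i + p < len(word)."""
    return all(word[i] == word[i + p] for i in range(len(word) - p))


def fine_wilf_bound(p: int, q: int) -> int:
    """The length p + q - gcd(p, q) from the theorem."""
    return p + q - gcd(p, q)


def theorem_holds_at_length(p: int, q: int, n: int, alphabet: str = "ab") -> bool:
    """Check every word of length n: periods p and q must force period gcd(p, q)."""
    d = gcd(p, q)
    for letters in product(alphabet, repeat=n):
        w = "".join(letters)
        if has_period(w, p) and has_period(w, q) and not has_period(w, d):
            return False
    return True


print(fine_wilf_bound(3, 5))             # 7
print(theorem_holds_at_length(3, 5, 7))  # True
print(theorem_holds_at_length(3, 5, 6))  # False: "abaaba" has periods 3 and 5, not 1
```

The length-6 counterexample "abaaba" shows why the bound cannot be lowered for p = 3 and q = 5.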
2012
We propose an algorithm that, given as input a full word w of length n and positive integers p and d, outputs, if any exists, a maximal p-periodic partial word contained in w with the property that no two holes are within distance d (so-called d-valid). Our algorithm runs in O(nd) time and is used for the study of repetition-freeness of partial words.
Lecture Notes in Computer Science, 2009
We propose an algorithm that, given as input a full word w of length n and positive integers p and d, outputs (if any exists) a maximal p-periodic partial word contained in w with the property that no two holes are within distance d. Our algorithm runs in O(nd) time and is used for the study of freeness of partial words. Furthermore, we construct an infinite word over a five-letter alphabet that is overlap-free even after the insertion of an arbitrary number of holes, answering affirmatively a conjecture from Blanchet-Sadri, Mercaş, and Scott.
Information and Computation, 2008
Acta Informatica, 2001
We introduce the notion of periodic-like word. It is a word whose longest repeated prefix is not right special. Some different characterizations of this concept are given. In particular, we show that a word w is periodic-like if and only if it has a period not larger than |w| − R_w, where R_w is the least non-negative integer such that any prefix of w of length ≥ R_w is not right special. We derive that if a word w has two periods p, q ≤ |w| − R_w, then also the greatest common divisor of p and q is a period of w. This result is, in fact, an improvement of the theorem of Fine and Wilf. We also prove that the minimal period of a word w is equal to the sum of the minimal periods of its components in a suitable canonical decomposition in periodic-like subwords. Moreover, we characterize periodic-like words having the same set of proper boxes, in terms of the important notion of root-conjugacy. Finally, some new uniqueness conditions for words, related to the maximal box theorem, are given.
2012
Partial words, or sequences over a finite alphabet that may have do-not-know symbols or holes, have recently been the subject of much investigation. Several interesting combinatorial properties have been studied, such as the periodic behavior and the counting of distinct squares in partial words. In this paper, we extend the three-squares lemma on words to partial words with one hole.
Lecture Notes in Computer Science, 2008
By the famous theorem of Morse and Hedlund, a word is ultimately periodic if and only if it has bounded subword complexity, i.e., for sufficiently large n, the number of factors of length n is constant. In this paper we consider relational periods and relationally periodic sequences, where the relation is a similarity relation on words induced by a compatibility relation on letters. We investigate what would be a suitable definition for a relational subword complexity function such that it would imply a Morse and Hedlund-like theorem for relationally periodic words. We consider strong and weak relational periods and two candidates for subword complexity functions.
2008
Combinatorics on words, or sequences or strings of symbols over a finite alphabet, is a rather new field although the first papers were published at the beginning of the 20th century [120, 121]. The interest in the study of combinatorics on words has been increasing since it finds applications in various research areas of mathematics, computer science, and biology where the data can be easily represented as words over some alphabet. Such areas may be concerned with algorithms on strings [38, 48, 50, 51, 52, 69, 72, 84, 102, 118], semigroups, automata and languages [2, 45, 55, 75, 82, 92, 93], molecular genetics [78], or codes [5, 73, 79]. Motivated by molecular biology of nucleic acids, Berstel and Boasson introduced in 1999 the notion of partial words which are sequences that may contain a number of “do not know” symbols or “holes” [4]. DNA molecules are the carriers of the genetic information in almost all organisms. Let us look into the structure of such a molecule. A single stra...
RAIRO - Theoretical Informatics and Applications, 2009
A well-known result of Fraenkel and Simpson states that the number of distinct squares in a full word of length n is bounded by 2n, since at each position there are at most two distinct squares whose last occurrence starts there. In this paper, we investigate squares in partial words with one hole, or sequences over a finite alphabet that have a "do not know" symbol or "hole." A square in a partial word over a given alphabet has the form uv where u is compatible with v, and consequently, such a square is compatible with a number of full words over the alphabet that are squares. Recently, it was shown that for partial words with one hole, there may be more than two squares that have their last occurrence starting at the same position. Here, we prove that if such is the case, then the length of the shortest square is at most half the length of the third shortest square. As a result, we show that the number of distinct full squares compatible with factors of a partial word with one hole of length n is bounded by 7n/2.
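The quantity bounded in this abstract, the number of distinct full squares compatible with factors of a partial word, can be computed directly for small inputs. This is a brute-force sketch, not the paper's method; it assumes `?` marks a hole and that two partial words of equal length are compatible when they agree at every position where both are defined. All function names are illustrative.

```python
from itertools import product
from typing import Set

HOLE = "?"  # assumption: '?' stands in for a hole


def compatible(u: str, v: str) -> bool:
    """Equal length, and equal letters wherever both positions are defined."""
    return len(u) == len(v) and all(
        a == b or HOLE in (a, b) for a, b in zip(u, v)
    )


def full_squares_of_factor(w: str, alphabet: str) -> Set[str]:
    """All full words xx compatible with w = uv, where u is compatible with v."""
    half, rem = divmod(len(w), 2)
    if rem or half == 0 or not compatible(w[:half], w[half:]):
        return set()
    choices = []
    for a, b in zip(w[:half], w[half:]):
        if a != HOLE:
            choices.append(a)         # letter fixed by the first half
        elif b != HOLE:
            choices.append(b)         # letter fixed by the second half
        else:
            choices.append(alphabet)  # both holes: any letter fits
    return {"".join(x) * 2 for x in product(*choices)}


def distinct_full_squares(u: str, alphabet: str) -> Set[str]:
    """Union over all factors of u of the full squares compatible with them."""
    found: Set[str] = set()
    for start in range(len(u)):
        for end in range(start + 2, len(u) + 1, 2):
            found |= full_squares_of_factor(u[start:end], alphabet)
    return found


print(sorted(distinct_full_squares("ab?b", "ab")))  # ['abab', 'bb']
```

In the example, the factor "ab?b" is itself a square because "ab" is compatible with "?b", and it contributes the full square "abab".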
RAIRO - Theoretical Informatics and Applications, 2013
Recently, Constantinescu and Ilie proved a variant of the well-known periodicity theorem of Fine and Wilf in the case of two relatively prime abelian periods and conjectured a result for the case of two non-relatively prime abelian periods. In this paper, we answer some open problems they suggested. We show that their conjecture is false but we give bounds, that depend on the two abelian periods, such that the conjecture is true for all words having length at least those bounds and show that some of them are optimal. We also extend their study to the context of partial words, giving optimal lengths and describing an algorithm for constructing optimal words.
Information and Computation/information and Control, 2011
One of the particularities of information encoded as DNA strands is that a string u contains basically the same information as its Watson-Crick complement, denoted here as θ(u). Thus, any expression consisting of repetitions of u and θ(u) can be considered in some sense periodic. In this paper, we give a generalization of Lyndon and Schützenberger's classical result about equations of the form u^l = v^n w^m, to cases where both sides involve repetitions of words as well as their complements. Our main results show that, for such extended equations, if l ≥ 5, n, m ≥ 3, then all three words involved can be expressed in terms of a common word t and its complement θ(t). Moreover, if l ≥ 5, then n = m = 3 is an optimal bound. These results are established based on a complete characterization of all possible overlaps between two expressions that involve only some word u and its complement θ(u), which is also obtained in this paper.
Indagationes Mathematicae, 2003
Let w = w_1 ... w_n be a word of maximal length n, and with a maximal number of distinct letters for this length, such that w has periods p_1, ..., p_r but not period gcd(p_1, ..., p_r). We provide a fast algorithm to compute n and w. We show that w is uniquely determined apart from isomorphism and that it is a palindrome. Furthermore we give lower and upper bounds for n as explicit functions of p_1, ..., p_r. For r = 2 the exact value of n is due to Fine and Wilf. In case the number of distinct letters in the extremal word equals r, a formula for n had been given by Castelli, Mignosi and Restivo in case r = 3 and by Justin if r > 3.
Theoretical Computer Science, 1999
We extend the theorem of Fine and Wilf to words having three periods. We then define the set 3-PER of words of maximal length for which such result does not apply. We prove that the set 3-PER and the sequences of complexity 2n + 1, introduced by Arnoux and Rauzy to generalize Sturmian words, have the same set of factors.
Proceedings of the 13th …, 2010
Recently, Constantinescu and Ilie proved a variant of the well-known periodicity theorem of Fine and Wilf in the case of two relatively prime abelian periods, and conjectured a result for the case of two non-relatively prime abelian periods. More precisely, they proved that any full word having two coprime abelian periods p, q and length at least 2pq − 1 has also gcd(p, q) = 1 as a period. In this paper, we answer some open problems they suggested by proving that the length 2pq − 1 is optimal and by answering affirmatively their conjecture. We also extend their study in the context of partial words, giving optimal lengths and describing an algorithm for constructing optimal words.