2019, Automata and Computability
Deterministic finite automata are one of the simplest and most practical models of computation studied in automata theory. Their conceptual extension, the non-deterministic finite automaton, also has plenty of applications. In this article, we study these models through the lens of succinct data structures, where our ultimate goal is to encode these mathematical objects using an information-theoretically optimal number of bits while supporting queries on them efficiently. Towards this goal, we first design a succinct data structure for representing any deterministic finite automaton D having n states over a σ-letter alphabet Σ using (σ − 1)n log n + O(n log σ) bits of space, which can determine, given an input string x over Σ, whether D accepts x in optimal time, proportional to the length of x, using a constant number of words of working space. When the input deterministic finite automaton is acyclic, we can improve the above space bound significantly to (σ − 1)(n − 1) log n + 3n + O(log² σ) + o(n) bits, without compromising the running time for string acceptance checking. Finally, we exhibit our succinct data structure for representing a non-deterministic finite automaton N having n states over a σ-letter alphabet Σ using σn² + n bits of space, such that, given an input string x, we can decide whether N accepts x in polynomial time.
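For concreteness, the acceptance query described above can be pictured against a plain, non-succinct baseline: the sketch below, with hypothetical names such as `delta` and `dfa_accepts`, simply walks a transition table once per input character, which is the behaviour the paper supports on top of its compressed (σ − 1)n log n + O(n log σ)-bit encoding (not reproduced here).

```python
# Minimal (non-succinct) DFA acceptance check: a baseline sketch, not the
# paper's bit-level encoding. The query walks the transition table once,
# so it runs in time proportional to len(x) with O(1) extra working space.

def dfa_accepts(delta, start, accepting, x):
    """delta: dict mapping (state, symbol) -> state; states are hashable,
    symbols are characters of the input alphabet."""
    state = start
    for symbol in x:
        state = delta.get((state, symbol))
        if state is None:          # missing transition: reject
            return False
    return state in accepting

# Example: a 2-state DFA over {a, b} accepting strings with an even number of b's.
delta = {(0, 'a'): 0, (0, 'b'): 1, (1, 'a'): 1, (1, 'b'): 0}
print(dfa_accepts(delta, start=0, accepting={0}, x="abba"))  # True
print(dfa_accepts(delta, start=0, accepting={0}, x="ab"))    # False
```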
2021
Deterministic finite automata are one of the simplest and most practical models of computation studied in automata theory. Their conceptual extension, the non-deterministic finite automaton, also has plenty of applications. In this article, we study these models through the lens of succinct data structures, where our ultimate goal is to encode these mathematical objects using an information-theoretically optimal number of bits while supporting queries on them efficiently. Towards this goal, we first design a succinct data structure for representing any deterministic finite automaton D having n states over a σ-letter alphabet Σ using (σ − 1)n log n + O(n log σ) bits of space, which can determine, given an input string x over Σ, whether D accepts x in O(|x| log σ) time, using a constant number of words of working space. When the input deterministic finite automaton is acyclic, not only can we improve the above space bound significantly to (σ − 1)(n − 1) log n + 3n + O(log² σ) + o(n) bits, but we also obtain optimal query time for string acceptance checking: using our succinct representation, we can check whether a given input string x is accepted by the acyclic deterministic finite automaton in time proportional to the length of x. We also exhibit a succinct data structure for representing a non-deterministic finite automaton N having n states over a σ-letter alphabet Σ using σn² + n bits of space, such that, given an input string x, we can decide whether N accepts x in O(n²|x|) time. Finally, we also provide time- and space-efficient algorithms for performing several standard operations, such as union, intersection and complement, on the languages accepted by deterministic finite automata.
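The final sentence mentions union, intersection and complement of DFA languages; the paper performs these on the succinct encodings, which is not reproduced here. As a reminder of what those operations compute, here is a textbook product-construction sketch on plain transition dictionaries (all names, e.g. `product_dfa`, are illustrative); complement is simply the same DFA with its accepting set replaced by its set complement.

```python
# Textbook product construction on plain (non-succinct) DFAs: a sketch of the
# operations named in the abstract, not of the paper's space-efficient algorithms.
from itertools import product

def product_dfa(d1, d2, alphabet, mode="intersection"):
    """d1, d2: (delta, start, accepting) triples whose transition dicts are
    total over their state sets and the given alphabet. Returns a product DFA
    whose states are pairs of original states."""
    (t1, s1, f1), (t2, s2, f2) = d1, d2
    states1 = {q for (q, _) in t1} | set(t1.values())
    states2 = {q for (q, _) in t2} | set(t2.values())
    delta, accepting = {}, set()
    for q1, q2 in product(states1, states2):
        for a in alphabet:
            delta[((q1, q2), a)] = (t1[(q1, a)], t2[(q2, a)])
        in1, in2 = q1 in f1, q2 in f2
        if (in1 and in2) if mode == "intersection" else (in1 or in2):
            accepting.add((q1, q2))
    return delta, (s1, s2), accepting
```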
Information and Computation, 2011
Finite automata are probably best known for being equivalent to right-linear context-free grammars and, thus, for capturing the lowest level of the Chomsky hierarchy, the family of regular languages. Over the last half century, a vast literature has documented the importance of deterministic, nondeterministic, and alternating finite automata as an enormously valuable concept. In the present paper, we tour a fragment of this literature. Mostly, we discuss developments on problems related to finite automata such as, for example, (i) simulation of and by several types of finite automata, (ii) standard automata problems such as fixed and general membership, emptiness, universality, equivalence, and related problems, and (iii) minimization and approximation. We thus come across descriptional and computational complexity issues of finite automata. We do not prove these results; we merely draw attention to the big picture and some of the main ideas involved.
Lecture Notes in Computer Science, 2013
Austinat et al. [2] proved that every language L that is (m, n)-recognizable by a deterministic frequency automaton with m > n/2 can be recognized by a deterministic finite automaton as well. First, the size of deterministic frequency automata and of deterministic finite automata recognizing the same language is compared. Then approximations of a language are considered, where a language L′ is called an approximation of a language L if L′ differs from L in only a finite number of strings. We prove that if a deterministic frequency automaton has k states and (m, n)-recognizes a language L, where m > n/2, then there is a language L′ approximating L such that L′ can be recognized by a deterministic finite automaton with no more than k states. Austinat et al. [2] also proved that every language L over a single-letter alphabet that is (1, n)-recognizable by a deterministic frequency automaton can be recognized by a deterministic finite automaton. For languages over a single-letter alphabet we show that if a deterministic frequency automaton has k states and (1, n)-recognizes a language L, then there is a language L′ approximating L such that L′ can be recognized by a deterministic finite automaton with no more than k states. However, there are approximations for which our bound is much higher, namely k!.
Developments in Language Theory
The paper investigates the effect of basic language-theoretic operations on the number of states in two-way deterministic finite automata (2DFAs). If m and n are the number of states in the 2DFAs recognizing the arguments of the following operations, then their result requires the following number of states: at least m + n − o(m + n) and at most 4m + n + const for union; at least m + n − o(m + n) and at most m + n + 1 for intersection; at least (m/n) + 2^Ω(n) log m and at most 2m^(m+1) · 2^(n^(n+1)) for concatenation; at least (1/n) · 2^(n/2 − 1) and at most 2^O(n^(n+1)) for Kleene star, square and projections; between n + 1 and n + 2 for reversal; exactly 2n for inverse homomorphisms. All results are obtained by first establishing high lower bounds on the number of states in any 1DFAs recognizing these languages, and then using these bounds to reason about the size of any equivalent 2DFAs.
Theoretical Computer Science, 2016
The state complexity of a deterministic finite-state automaton (DFA) is the number of states in its minimal equivalent DFA. We study the state complexity of random n-state DFAs over a k-symbol alphabet, drawn uniformly from the set [n]^([n]×[k]) × 2^[n] of all such automata. We show that, with high probability, the state complexity is α_k n + O(√n log n) for a certain explicit constant α_k. (By symmetry, we may always take the state q = 1 to be the starting state.)
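As a concrete reading of the model above, a uniformly random DFA is a uniformly random transition table from [n]^([n]×[k]) paired with a uniformly random accepting set from 2^[n], with a fixed start state; its state complexity is the size of its minimal equivalent DFA. The sketch below (illustrative names, not the paper's analysis) samples such an automaton and measures its state complexity by restricting to reachable states and running a simple Moore-style partition refinement.

```python
import random

def random_dfa(n, k):
    """Uniform DFA on states 0..n-1 over symbols 0..k-1, start state 0."""
    delta = [[random.randrange(n) for _ in range(k)] for _ in range(n)]
    accepting = {q for q in range(n) if random.random() < 0.5}
    return delta, accepting

def state_complexity(delta, accepting, k):
    """Size of the minimal DFA: keep reachable states, then Moore refinement."""
    n = len(delta)
    # states reachable from the start state 0
    reach, stack = {0}, [0]
    while stack:
        q = stack.pop()
        for a in range(k):
            r = delta[q][a]
            if r not in reach:
                reach.add(r)
                stack.append(r)
    # refine the accept/reject partition until it stabilizes
    block = {q: (q in accepting) for q in reach}
    while True:
        signature = {q: (block[q], tuple(block[delta[q][a]] for a in range(k)))
                     for q in reach}
        ids = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
        new_block = {q: ids[signature[q]] for q in reach}
        if new_block == block:
            return len(set(block.values()))
        block = new_block

delta, accepting = random_dfa(n=1000, k=2)
print(state_complexity(delta, accepting, k=2))  # per the abstract: close to alpha_2 * 1000
```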
1991
We show how to turn a regular expression into an O(s)-space representation of McNaughton and Yamada's NFA, where s is the number of NFA states. The standard adjacency-list representation of McNaughton and Yamada's NFA takes up s + s² space in the worst case. The adjacency-list representation of the NFA produced by Thompson takes up between 2r and 5r space, where r ≥ s in general and can be arbitrarily larger than s. Given any set T of NFA states, our representation can be used to compute the set N of states one transition away from the states in T in optimal time O(|T| + |N|). McNaughton and Yamada's NFA requires Θ(|T|·|N|) time in the worst case. Using Thompson's NFA, the equivalent calculation requires Θ(r) time in the worst case. An implementation of our NFA representation confirms that it takes up an order of magnitude less space than McNaughton and Yamada's machine. An implementation that produces a DFA from our NFA representation by subset construction shows linear and quadratic speedups over subset construction starting from both Thompson's and McNaughton and Yamada's NFAs. It also shows that the DFA produced from our NFA is as much as one order of magnitude smaller than DFAs constructed from the two other NFAs.
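The central query in this abstract is: given a set T of NFA states and an input symbol, compute the set N of states one transition away from T. The sketch below shows that query on a plain adjacency-list NFA with hypothetical names; its cost is proportional to the number of transitions leaving T, which for a dense NFA such as McNaughton and Yamada's can be as large as |T|·|N|, whereas the paper's compressed representation answers the same query in O(|T| + |N|).

```python
# One-step "move" on an NFA given as adjacency lists: a sketch of the query the
# abstract is about, not of the paper's compressed representation.

def move(adj, T, symbol):
    """adj: dict state -> dict symbol -> iterable of successor states."""
    N = set()
    for q in T:
        N.update(adj.get(q, {}).get(symbol, ()))
    return N

# Tiny example NFA over {a, b} with hypothetical state names 0..2.
adj = {0: {'a': {0, 1}}, 1: {'b': {2}}, 2: {}}
print(move(adj, {0, 1}, 'a'))  # {0, 1}
```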
Theoretical Computer Science, 2004
We investigate the relationship between regular languages and syntactic monoid size. In particular, we consider the transformation monoids of n-state (minimal) deterministic finite automata. We show tight upper bounds on the syntactic monoid size, proving that an n-state deterministic finite automaton with a singleton input alphabet (an input alphabet with at least three letters, respectively) induces a syntactic monoid of linear (n^n, respectively) size. In the case of a two-letter input alphabet, we can show a lower bound of n^n − o(n^n), attained for suitable natural numbers k and ℓ close to n/2, for the size of the syntactic monoid of a language accepted by an n-state deterministic finite automaton. This induces a family of deterministic finite automata such that the ratio between the size of the induced syntactic monoid and n^n tends to 1 as n goes to infinity.
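The transformation (transition) monoid mentioned here is generated by the action of the input letters on the state set and, for the minimal DFA of a language, coincides with its syntactic monoid. The sketch below (illustrative names) computes that monoid by closing the letter actions under composition; the closure can reach up to n^n transformations, which is exactly the upper bound discussed in the abstract.

```python
from collections import deque

def transition_monoid(delta, n, alphabet):
    """delta: total dict (state, symbol) -> state on states 0..n-1.
    Returns the set of state transformations (as tuples) generated by the
    letters, i.e. the transition monoid of the DFA, including the identity."""
    identity = tuple(range(n))
    generators = {a: tuple(delta[(q, a)] for q in range(n)) for a in alphabet}
    monoid, queue = {identity}, deque([identity])
    while queue:
        f = queue.popleft()
        for g in generators.values():
            h = tuple(g[f[q]] for q in range(n))   # apply f, then the letter g
            if h not in monoid:
                monoid.add(h)
                queue.append(h)
    return monoid

# Two-state DFA over {a, b}: 'a' swaps the states, 'b' maps both to state 0.
delta = {(0, 'a'): 1, (1, 'a'): 0, (0, 'b'): 0, (1, 'b'): 0}
print(len(transition_monoid(delta, 2, 'ab')))  # 4 transformations
```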
SIAM Journal on Computing, 2000
We study the problem of constructing the deterministic equivalent of a nondeterministic weighted finite-state automaton (WFA). Determinization of WFAs has important applications in automatic speech recognition (ASR). We provide the first polynomial-time algorithm to test for the twins property, which determines whether a WFA admits a deterministic equivalent. We also give upper bounds on the size of the deterministic equivalent.
Information and Computation, 2007
The class of growing context-sensitive languages (GCSL) was proposed as a naturally defined subclass of the context-sensitive languages whose membership problem is solvable in polynomial time [9]. GCSL and its deterministic counterpart, the Church-Rosser languages (CRL), complement the Chomsky hierarchy in a natural way [18], as the classes filling the gap between CFL and CSL. Interestingly, they possess characterizations by a natural machine model, length-reducing two-pushdown automata (lrTPDA). We present a lower bound technique applicable to lrTPDA. Using this method, we prove the conjecture that the set of palindromes is not in CRL [19]. This implies that CFL ∩ coCFL as well as UCFL ∩ coUCFL are not included in CRL, where UCFL denotes the class of unambiguous context-free languages (which solves an open problem from [1]). Another consequence of our result is that CRL is a strict subset of GCSL ∩ coGCSL.
2020
We revisit the complexity of procedures on symbolic finite automata (SFAs), such as intersection and emptiness, and analyze them according to the measures we find suitable for symbolic automata: the number of states (n), the maximal number of transitions exiting a state (m), and the size of the most complex transition predicate (l). We pay attention to special forms of SFAs: normalized SFAs and neat SFAs, as well as SFAs over a monotonic effective Boolean algebra.
Computing Research Repository, 2009
We give a unique string representation, up to isomorphism, for initially connected deterministic finite automata (ICDFAs) with n states over an alphabet of k symbols. We show how to generate all these strings for each n and k, and how their enumeration provides an alternative way to obtain the exact number of ICDFAs.
In this paper, regular expression matching is implemented using finite-state Aleshin-type automata, including non-deterministic finite-state Aleshin-type automata (NAAs) and deterministic finite-state Aleshin-type automata (DAAs). The storage space of an automaton is jointly determined by the number of states and the transitions between states. A key issue is that the size of the automaton obtained from a regular expression is large, where size is defined as the number of states and transition arcs between states. The size of an automaton is crucial for the efficiency of pattern-matching algorithms based on regular expressions, since size directly affects both time and space efficiency. NAAs and DAAs have their own advantages and disadvantages in regular expression matching. Keywords: deterministic finite-state Aleshin-type automata, non-deterministic finite-state Aleshin-type automata (NAAs), partial derivative automata.
I. INTRODUCTION
NFAs can provide an exponentially more succinct description than DFAs, but equivalence, inclusion, and universality are computationally hard for NFAs, while many of these problems can be solved in polynomial time for DFAs. The processing complexity for each character in the input is O(1) in a DFA, but O(n²) for an NFA if all n states are active at the same time. The key feature of a DFA is that there is only one active state at any time; but converting an NFA into a DFA may generate O(2^n) states. The size of a DFA obtained from a regular expression can increase exponentially; the DFA of a regular expression with thousands of patterns yields tens of thousands of states, which means memory consumption of thousands of megabytes. Another problem is that a minimal NFA is hard to compute [6]. How to combine the matching efficiency of a DFA with the storage efficiency of an NFA is a long-pursued goal in the field of regular expression matching. The regular expression is an important notation for specifying patterns. Owing to its expressive power and flexibility in describing useful patterns [1], regular expression matching technology based on finite automata is widely used in networks and information processing, including applications for network real-time processing, protocol analysis, intrusion detection systems, intrusion prevention systems, deep packet inspection systems, and virus detection systems such as Snort [2], Linux L7-filter [3], and Bro [4]. Regular expressions are replacing explicit string patterns as the method of choice for describing patterns. However, with the increasing scale and number of regular expressions in a practical system, it is challenging to achieve good performance for pattern matching based on regular expressions. For example, the number of signatures in Snort has grown from 3,166 in 2003 to 15,047 in 2009, and the pattern matching routines in Snort account for up to 70% of the total execution time, with 80% of the instructions executed on real traces [5].
II. REGULAR EXPRESSION
The term alphabet denotes any finite set of symbols. A string over an alphabet is a finite sequence of symbols drawn from that alphabet, with the term word often used as a synonym for the term string. Let Σ be an alphabet and Σ* be the set of all words over Σ, i.e., Σ* denotes the set of all finite strings of symbols in Σ. If Σ is an alphabet, then any subset of Σ* is a language over Σ. The length of a word w ∈ Σ*, usually written as |w|, is the number of occurrences of symbols in w, with ε denoting the empty word, whose length is 0. ∅ is the empty set, a ∈ Σ is an input symbol, and r and s are regular expressions. A regular expression describes a set of strings without enumerating them explicitly. A regular expression over Σ can be defined recursively as follows: (1) ∅ and ε are regular expressions, denoting ∅ and {ε}, respectively. (2) If a is a symbol in Σ, then a is a regular expression that denotes {a}. (3) Suppose r and s are regular expressions denoting the languages L(r) and L(s). Then (r) + (s), (r)·(s), (r)*, and (r) are also regular expressions, denoting L(r) ∪ L(s), L(r)L(s), (L(r))*, and L(r), respectively. (4) All regular expressions can be obtained by applying rules (1), (2), and (3) a finite number of times.
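The introduction above contrasts DFA matching (one active state, O(1) work per character) with NFA matching (a set of active states) and notes that determinization may blow up to exponentially many states. A compact subset-construction sketch makes that trade-off concrete; ε-transitions are omitted for brevity and all names are illustrative.

```python
# Subset construction (no epsilon transitions, for brevity): builds a DFA whose
# states are sets of NFA states, which is why the result can have up to 2^n states.
from collections import deque

def determinize(nfa_delta, start, accepting, alphabet):
    """nfa_delta: dict (state, symbol) -> set of successor states."""
    start_set = frozenset({start})
    dfa_delta, dfa_accepting = {}, set()
    seen, queue = {start_set}, deque([start_set])
    while queue:
        S = queue.popleft()
        if S & accepting:
            dfa_accepting.add(S)
        for a in alphabet:
            T = frozenset(q2 for q in S for q2 in nfa_delta.get((q, a), ()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    return dfa_delta, start_set, dfa_accepting

# NFA for strings over {a, b} ending in "ab".
nfa = {(0, 'a'): {0, 1}, (0, 'b'): {0}, (1, 'b'): {2}}
delta, q0, acc = determinize(nfa, start=0, accepting={2}, alphabet='ab')
print(len({S for (S, _) in delta}))  # 3 subset-states here; up to 2^3 in general
```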
Developments in Language Theory, 2016
A nondeterministic finite automaton is unambiguous if it has at most one accepting computation on every input string. We investigate the complexity of basic regular operations on languages represented by unambiguous finite automata. We get tight upper bounds for intersection (mn), left and right quotients (2^n − 1), positive closure (3/4 · 2^n − 1), star (3/4 · 2^n), shuffle (2^(mn) − 1), and concatenation (3/4 · 2^(m+n) − 1). To prove tightness, we use a binary alphabet for intersection and left and right quotients, a ternary alphabet for star and positive closure, a five-letter alphabet for shuffle, and a seven-letter alphabet for concatenation. We also get some partial results for union and complementation.
Lecture Notes in Computer Science, 2016
We survey recent results on the descriptional complexity of self-verifying finite automata. In particular, we discuss the cost of simulation of self-verifying finite automata by deterministic finite automata, and the complexity of basic regular operations on languages represented by self-verifying finite automata.
2013
We define a new measure of complexity for finite strings using nondeterministic finite automata, called nondeterministic automatic complexity and denoted A_N(x). In this paper we prove some basic results for A_N(x), give upper and lower bounds, estimate it for some specific strings, begin to classify types of strings with small complexities, and provide A_N(x) for |x| ≤ 8.
International Journal on Software Tools for Technology Transfer (STTT), 1999
We consider the problem of storing a set S ⊆ Σ^k as a deterministic finite automaton (DFA). We show that inserting a new string σ ∈ Σ^k into, or deleting a string from, the set S represented as a minimized DFA can be done in expected time O(k|Σ|), while preserving the minimality of the DFA. We then discuss an application of this work to reduce the memory requirements of a model checker based on explicit state enumeration.
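For orientation, the sketch below stores a set S ⊆ Σ^k in a trie, which is an acyclic DFA for S without the suffix sharing that minimization provides; insertion, deletion and membership each touch k nodes. Maintaining minimality under such updates in expected O(k|Σ|) time, as the abstract describes, requires additional node-merging machinery that is not shown here; the class and method names are illustrative.

```python
# A trie over strings: an acyclic DFA for S, but without the minimization
# (suffix sharing) that the paper maintains under insertions and deletions.

class TrieDFA:
    def __init__(self):
        self.root = {}

    def insert(self, w):                 # touches |w| nodes
        node = self.root
        for c in w:
            node = node.setdefault(c, {})
        node['$'] = True                 # end-of-string marker

    def contains(self, w):
        node = self.root
        for c in w:
            if c not in node:
                return False
            node = node[c]
        return '$' in node

    def delete(self, w):                 # unmarks only; no node reclamation here
        node = self.root
        for c in w:
            if c not in node:
                return
            node = node[c]
        node.pop('$', None)

s = TrieDFA()
s.insert("abba"); s.insert("abab")
print(s.contains("abba"), s.contains("baba"))  # True False
s.delete("abba")
print(s.contains("abba"))                      # False
```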
Computer Science - Theory and …, 2008
Lecture Notes in Computer Science, 2013
We study the computational power of P automata, variants of symport/antiport P systems which characterize string languages by applying a mapping to the sequence of multisets entering the system during computations. We consider the case when the input mapping is defined in such a way that it maps a multiset to the set of strings consisting of all permutations of its elements. We show that the computational power of this type of P automata is strictly less than that of so-called restricted logarithmic-space Turing machines, and we also exhibit a strict infinite hierarchy within the accepted language class based on the number of membranes present in the system.
Information and Computation, 2001
The study of the computational power of randomized computations is one of the central tasks of complexity theory. The main goal of this paper is to compare the power of Las Vegas computation with that of deterministic and nondeterministic computation, respectively. We investigate the power of Las Vegas computation for the complexity measures of one-way communication, ordered binary decision diagrams, and finite automata. (i) For the one-way communication complexity of two-party protocols we show that Las Vegas communication can save at most one half of the deterministic one-way communication complexity. We also present a language for which this gap is tight. (ii) The result (i) is applied to show an at most polynomial gap between determinism and Las Vegas for ordered binary decision diagrams. (iii) For the size (i.e., the number of states) of finite automata we show that the size of Las Vegas finite automata recognizing a language L is at least the square root of the size of the minimal deterministic finite automaton recognizing L. Using a specific language we verify the optimality of this lower bound.