2016, Theoretical Computer Science
The state complexity of a deterministic finite-state automaton (DFA) is the number of states in its minimal equivalent DFA. We study the state complexity of random n-state DFAs over a k-symbol alphabet, drawn uniformly from the set $[n]^{[n]\times[k]} \times 2^{[n]}$ of all such automata. We show that, with high probability, the state complexity is $\alpha_k n + O(\sqrt{n}\log n)$ for a certain explicit constant $\alpha_k$. (Footnote 1: By symmetry, we may always take the state q = 1 to be the starting state.)
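The quantity studied in this abstract can be estimated empirically. The following is a minimal sketch, not the authors' method: it assumes a DFA is stored as a transition table plus a list of accepting flags, uses state 0 as the start state, and minimizes with Moore's partition refinement. The names `random_dfa` and `state_complexity` are illustrative only.

```python
import random

def random_dfa(n, k, rng):
    """Draw a transition table in [n]^([n] x [k]) and an accepting set in 2^[n] uniformly."""
    delta = [[rng.randrange(n) for _ in range(k)] for _ in range(n)]
    accepting = [rng.random() < 0.5 for _ in range(n)]
    return delta, accepting

def state_complexity(delta, accepting, start=0):
    """Number of states of the minimal DFA equivalent to (delta, accepting, start)."""
    k = len(delta[0])
    # Step 1: keep only the states reachable from the start state.
    reachable, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for a in range(k):
            r = delta[q][a]
            if r not in reachable:
                reachable.add(r)
                stack.append(r)
    # Step 2: Moore's algorithm. Start from the accepting/rejecting split and
    # refine by successor blocks until the number of blocks stops growing.
    block = {q: int(accepting[q]) for q in reachable}
    while True:
        ids = {}
        new_block = {}
        for q in reachable:
            signature = (block[q],) + tuple(block[delta[q][a]] for a in range(k))
            new_block[q] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = new_block

# Example: ratio of minimal size to n for one random 1000-state binary DFA.
rng = random.Random(0)
delta, accepting = random_dfa(1000, 2, rng)
print(state_complexity(delta, accepting) / 1000)
```

Averaging the printed ratio over many draws gives a rough numerical estimate of the constant for k = 2, under the assumptions stated above.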
Lecture Notes in Computer Science, 2013
Austinat et al. [2] proved that every language L that is (m, n)-recognizable by a deterministic frequency automaton such that m > n/2 can be recognized by a deterministic finite automaton as well. First, the size of deterministic frequency automata and of deterministic finite automata recognizing the same language is compared. Then approximations of a language are considered, where a language L′ is called an approximation of a language L if L′ differs from L in only a finite number of strings. We prove that if a deterministic frequency automaton has k states and (m, n)-recognizes a language L, where m > n/2, then there is a language L′ approximating L such that L′ can be recognized by a deterministic finite automaton with no more than k states. Austinat et al. [2] also proved that every language L over a single-letter alphabet that is (1, n)-recognizable by a deterministic frequency automaton can be recognized by a deterministic finite automaton. For languages over a single-letter alphabet we show that if a deterministic frequency automaton has k states and (1, n)-recognizes a language L, then there is a language L′ approximating L such that L′ can be recognized by a deterministic finite automaton with no more than k states. However, there are approximations such that our bound is much higher, i.e., k!.
Information and Computation, 2011
Finite automata are probably best known for being equivalent to right-linear context-free grammars and, thus, for capturing the lowest level of the Chomsky-hierarchy, the family of regular languages. Over the last half century, a vast literature documenting the importance of deterministic, nondeterministic, and alternating finite automata as an enormously valuable concept has been developed. In the present paper, we tour a fragment of this literature. Mostly, we discuss developments relevant to finite automata related problems like, for example, (i) simulation of and by several types of finite automata, (ii) standard automata problems such as fixed and general membership, emptiness, universality, equivalence, and related problems, and (iii) minimization and approximation. We thus come across descriptional and computational complexity issues of finite automata. We do not prove these results but we merely draw attention to the big picture and some of the main ideas involved.
Developments in Language Theory
The paper investigates the effect of basic language-theoretic operations on the number of states in two-way deterministic finite automata (2DFAs). If m and n are the numbers of states in the 2DFAs recognizing the arguments of the following operations, then their result requires the following number of states: at least $m + n - o(m + n)$ and at most $4m + n + \mathrm{const}$ for union; at least $m + n - o(m + n)$ and at most $m + n + 1$ for intersection; at least $\Omega(m/n) + 2^{\Omega(n)}/\log m$ and at most $2m^{m+1} \cdot 2^{n^{n+1}}$ for concatenation; at least $\frac{1}{n}\,2^{n/2 - 1}$ and at most $2^{O(n^{n+1})}$ for Kleene star, square and projections; between $n + 1$ and $n + 2$ for reversal; exactly $2n$ for inverse homomorphisms. All results are obtained by first establishing high lower bounds on the number of states in any 1DFAs recognizing these languages, and then using these bounds to reason about the size of any equivalent 2DFAs.
Theoretical Computer Science, 2015
Recently, Dassow et al. connected partial words and regular languages. Partial words are sequences in which some positions may be undefined, represented with a "hole" symbol ⋄. If we restrict what the symbol ⋄ can represent, we can use partial words to compress the representation of regular languages. Doing so allows the creation of so-called ⋄-DFAs, smaller than the DFAs recognizing the original language L, which recognize the compressed language. However, the ⋄-DFAs may be larger than the NFAs recognizing L. In this paper, we investigate a question of Dassow et al. as to how these sizes are related.
Lecture Notes in Computer Science, 1988
A counting finite-state automaton is a nondeterministic finite-state automaton which, on an input over its input alphabet, (magically) writes in binary the number of accepting computations on the input. We examine the complexity of computing the counting function of an NFA, and the complexity of recognizing its range as a set of binary strings. We also consider the
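The counting function described in this abstract can be illustrated with a small dynamic program. This is a sketch under assumptions, not the paper's construction: the NFA is assumed to be ε-free and given as a dictionary from (state, symbol) pairs to lists of successor states, and `count_accepting_runs` is a hypothetical name.

```python
def count_accepting_runs(transitions, start, accepting, word):
    """Count the accepting computations of an epsilon-free NFA on `word`.

    transitions: dict mapping (state, symbol) -> list of successor states.
    """
    # counts[q] = number of distinct computations on the prefix read so far that end in q.
    counts = {start: 1}
    for symbol in word:
        next_counts = {}
        for state, c in counts.items():
            for succ in transitions.get((state, symbol), []):
                next_counts[succ] = next_counts.get(succ, 0) + c
        counts = next_counts
    return sum(c for state, c in counts.items() if state in accepting)

# Two states, each with two a-successors: the number of runs doubles per symbol.
nfa = {(0, "a"): [0, 1], (1, "a"): [0, 1]}
runs = count_accepting_runs(nfa, 0, {0, 1}, "aaa")
print(format(runs, "b"))  # the count written in binary, as the automaton would output it
```

The loop touches each transition once per input symbol, so the count is obtained in time polynomial in the NFA size and the input length, even though the number of computations itself can be exponential.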
Theoretical Computer Science, 2005
This document gives a generalization on the alphabet size of the method that is described in Nicaud's thesis for randomly generating complete DFAs. First, we recall some properties of m-ary trees and we give a bijection between the set of m-ary trees and the set K (m,n) of generalized tuples. We show that this bijection can be built on any total prefix order on . Then we give the relations that exist between the elements of K (m,n) and complete DFAs built on an alphabet of size greater than 2. We give algorithms that allow us to randomly generate accessible complete DFAs. Finally, we provide experimental results that show that most of the accessible complete DFAs built on an alphabet of size greater than 2 are minimal.
2013
We define a new measure of complexity for finite strings using nondeterministic finite automata, called nondeterministic automatic complexity and denoted $A_N(x)$. In this paper we prove some basic results for $A_N(x)$, give upper and lower bounds, estimate it for some specific strings, begin to classify types of strings with small complexities, and provide $A_N(x)$ for $|x| \le 8$.
Corr, 2009
We give a unique string representation, up to isomorphism, for initially connected deterministic finite automata (ICDFAs) with n states over an alphabet of k symbols. We show how to generate all these strings for each n and k, and how their enumeration provides an alternative way to obtain the exact number of ICDFAs. * Work partially funded by Fundação para a Ciência e Tecnologia (FCT) and Program POSI. † This paper was presented at the 7th Workshop on Descriptional Complexity of Formal Systems, DCFS'05.
1991
We show how to turn a regular expression into an $O(s)$ space representation of McNaughton and Yamada's NFA, where s is the number of NFA states. The standard adjacency list representation of McNaughton and Yamada's NFA takes up $s + s^2$ space in the worst case. The adjacency list representation of the NFA produced by Thompson takes up between $2r$ and $5r$ space, where $r \ge s$ in general, and can be arbitrarily larger than s. Given any set T of NFA states, our representation can be used to compute the set N of states one transition away from the states in T in optimal time $O(|T| + |N|)$. McNaughton and Yamada's NFA requires $\Theta(|T| \cdot |N|)$ time in the worst case. Using Thompson's NFA, the equivalent calculation requires $\Theta(r)$ time in the worst case. An implementation of our NFA representation confirms that it takes up an order of magnitude less space than McNaughton and Yamada's machine. An implementation to produce a DFA from our NFA representation by subset construction shows linear and quadratic speedups over subset construction starting from both Thompson's and McNaughton and Yamada's NFA's. It also shows that the DFA produced from our NFA is as much as one order of magnitude smaller than DFA's constructed from the two other NFA's.
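For orientation, the basic step this abstract refers to (computing the set N of states one transition away from a set T) and the subset construction built on it can be sketched with plain adjacency lists. This is the standard baseline, not the compressed representation proposed in the paper; the NFA is assumed to be ε-free, and the names `step` and `subset_construction` are illustrative.

```python
from collections import deque

def step(adjacency, T, symbol):
    """States one `symbol`-transition away from the state set T (plain adjacency lists)."""
    return frozenset(r for q in T for r in adjacency.get((q, symbol), ()))

def subset_construction(adjacency, start, accepting, alphabet):
    """Build the reachable part of the subset DFA of an epsilon-free NFA."""
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        T = queue.popleft()
        for a in alphabet:
            N = step(adjacency, T, a)
            dfa_delta[(T, a)] = N
            if N not in seen:
                seen.add(N)
                queue.append(N)
    # A subset is accepting if it contains at least one accepting NFA state.
    dfa_accepting = {S for S in seen if S & accepting}
    return seen, dfa_delta, dfa_accepting

# NFA for words over {a, b} whose second-to-last symbol is a.
nfa = {(0, "a"): [0, 1], (0, "b"): [0], (1, "a"): [2], (1, "b"): [2]}
states, delta, final = subset_construction(nfa, 0, {2}, "ab")
print(len(states), len(final))
```

In this baseline, `step` scans every outgoing edge of every state in T, which is the cost the compressed representation is designed to avoid.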
Lecture Notes in Computer Science, 2016
We investigate the shuffle operation on regular languages represented by complete deterministic finite automata. We prove that $f(m, n) = 2^{mn-1} + 2^{(m-1)(n-1)}(2^{m-1} - 1)(2^{n-1} - 1)$ is an upper bound on the state complexity of the shuffle of two regular languages having state complexities m and n, respectively. We also state partial results about the tightness of this bound. We show that there exist witness languages meeting the bound if $2 \le m \le 5$ and $n \ge 2$, and also if $m = n = 6$. Moreover, we prove that in the subset automaton of the NFA accepting the shuffle, all $2^{mn}$ states can be distinguishable, and an alphabet of size three suffices for that. It follows that the bound can be met if all $f(m, n)$ states are reachable. We know that an alphabet of size at least mn is required provided that $m, n \ge 2$. The question of reachability, and hence also of the tightness of the bound $f(m, n)$ in general, remains open.
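To get a feel for the size of the bound stated in this abstract, it can be evaluated directly. The sketch below only transcribes the formula for f(m, n) and compares it with the $2^{mn}$ subsets of the product state set; the function name `shuffle_upper_bound` is an assumption made for illustration.

```python
def shuffle_upper_bound(m, n):
    """f(m, n) = 2^(mn-1) + 2^((m-1)(n-1)) * (2^(m-1) - 1) * (2^(n-1) - 1)."""
    return 2 ** (m * n - 1) + 2 ** ((m - 1) * (n - 1)) * (2 ** (m - 1) - 1) * (2 ** (n - 1) - 1)

# Compare the bound with the total number of subsets, 2^(mn), for small m and n.
for m in range(2, 5):
    for n in range(2, 5):
        print(m, n, shuffle_upper_bound(m, n), 2 ** (m * n))
```

For m = n = 2 the formula gives 8 + 2 = 10 out of 16 subsets; in general f(m, n) stays strictly below $2^{mn}$.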
New Generation Computing, 1994
It is known that the class of deterministic finite automata is polynomial time learnable by using membership and equivalence queries. We investigate the query complexity of learning deterministic finite automata, i.e., the number of membership and equivalence queries made during the process of learning. We extend a known lower bound on membership queries to the case of randomized learning algorithms, and prove lower bounds on the number of alternations between membership and equivalence queries. We also show that a trade-off exists, allowing us to reduce the number of equivalence queries at the price of increasing the number of membership queries.
We try to compare the complexity of deterministic, nondeterministic, probabilistic and ultrametric finite automata for the same language. We do not claim to have final upper and lower bounds. Rather these results can be considered as experiments to find advantages of one type of automata versus another type.
Lecture Notes in Computer Science, 2008
We prove that, for the uniform distribution over all sets X of m non-empty words (m being a fixed integer) whose sum of lengths is n, $D_X$, one of the usual deterministic automata recognizing $X^*$, has on average $O(n)$ states and that the average state complexity of $X^*$ is $\Theta(n)$. We also show that the average time complexity of the computation of the automaton $D_X$ is $O(n \log n)$, when the alphabet is of size at least three.
In this paper, regular expression matching is implemented using finite-state Aleshin type automata, including non-deterministic finite-state Aleshin type automata (NAAs) and deterministic finite-state Aleshin type automata (DAAs). The storage space of an automaton is jointly determined by the number of states and the transitions between states. A key issue is that the size of the automaton obtained from a regular expression is large, where size is defined as the number of states and transition arcs between states. The size of an automaton is crucial for the efficiency of algorithms for pattern matching based on regular expressions: it directly affects both time and space efficiency. NAAs and DAAs have their own advantages and disadvantages in regular expression matching.
Keywords: deterministic finite state Aleshin type automata, non-deterministic finite state Aleshin type automata (NAAs) and partial derivative automata
I. INTRODUCTION
NFAs can provide an exponentially more succinct description than DFAs, but equivalence, inclusion, and universality are computationally hard for NFAs, while many of these problems can be solved in polynomial time for DFAs. The processing complexity for each character in the input is $O(1)$ in a DFA, but $O(n^2)$ for an NFA if all n states are active at the same time. The key feature of a DFA is that there is only one active state at any time, but converting an NFA into a DFA may generate $O(\Sigma^n)$ states. The size of a DFA obtained from a regular expression can increase exponentially; the DFA of a regular expression with thousands of patterns yields tens of thousands of states, which means memory consumption of thousands of megabytes. Another problem is that a minimal NFA is hard to compute [6]. How to combine the matching efficiency of a DFA with the storage efficiency of an NFA is a long-pursued goal in the field of regular expression matching.
The regular expression is an important notation for specifying patterns. Owing to its expressive power and flexibility in describing useful patterns [1], regular expression matching technology based on finite automata is widely used in networks and information processing, including applications for network real-time processing, protocol analysis, intrusion detection systems, intrusion prevention systems, deep packet inspection systems, and virus detection systems like Snort [2], Linux L7-filter [3], and Bro [4]. Regular expressions are replacing explicit string patterns as the method of choice for describing patterns. However, with the increasing scale and number of regular expressions in a practical system, it is challenging to achieve good performance for pattern matching based on regular expressions. For example, the number of signatures in Snort has grown from 3166 in 2003 to 15,047 in 2009, and the pattern matching routines in Snort account for up to 70% of the total execution time with 80% of the instructions executed on real traces [5].
II. REGULAR EXPRESSION
The term alphabet denotes any finite set of symbols. A string over an alphabet is a finite sequence of symbols drawn from that alphabet, with the term word often used as a synonym for the term string. Let Σ be an alphabet and Σ* be the set of all words over Σ, i.e., Σ* denotes the set of all finite strings of symbols in Σ. If Σ is an alphabet, then any subset of Σ* is a language over Σ. The length of a word w ∈ Σ*, usually written as |w|, is the number of occurrences of symbols in w, with ε denoting the empty word whose length is 0. ∅ is the empty set, a ∈ Σ is an input symbol, and r and s are regular expressions. A regular expression describes a set of strings without enumerating them explicitly. A regular expression over Σ is defined recursively as follows:
(1) ∅ and ε are regular expressions, denoting ∅ and {ε}, respectively.
(2) If a is a symbol in Σ, then a is a regular expression that denotes {a}.
(3) Suppose r and s are regular expressions denoting the languages L(r) and L(s). Then (r) + (s), (r) · (s), r*, and (r) are also regular expressions denoting L(r) ∪ L(s), L(r)L(s), (L(r))*, and L(r), respectively.
(4) All regular expressions can be obtained by applying rules (1), (2), and (3) a finite number of times.
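The recursive definition in rules (1) to (4) maps naturally onto a small syntax tree. The sketch below is one possible encoding, not the Aleshin type automata construction of the paper: the class names are hypothetical, and membership is tested with Brzozowski derivatives, a technique chosen here only because it follows the recursive rules directly.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty:          # rule (1): the expression denoting the empty set
    pass

@dataclass(frozen=True)
class Eps:            # rule (1): the expression denoting {ε}
    pass

@dataclass(frozen=True)
class Sym:            # rule (2): a single symbol a
    a: str

@dataclass(frozen=True)
class Alt:            # rule (3): (r) + (s)
    r: object
    s: object

@dataclass(frozen=True)
class Cat:            # rule (3): (r) · (s)
    r: object
    s: object

@dataclass(frozen=True)
class Star:           # rule (3): r*
    r: object

def nullable(r):
    """True if the language of r contains the empty word."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Sym)):
        return False
    if isinstance(r, Alt):
        return nullable(r.r) or nullable(r.s)
    return nullable(r.r) and nullable(r.s)  # Cat

def derive(r, c):
    """Expression for the words w such that c followed by w is in L(r)."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Sym):
        return Eps() if r.a == c else Empty()
    if isinstance(r, Alt):
        return Alt(derive(r.r, c), derive(r.s, c))
    if isinstance(r, Star):
        return Cat(derive(r.r, c), r)
    left = Cat(derive(r.r, c), r.s)  # Cat
    return Alt(left, derive(r.s, c)) if nullable(r.r) else left

def matches(r, w):
    """Membership test: derive by each symbol of w, then check for the empty word."""
    for c in w:
        r = derive(r, c)
    return nullable(r)

# (a + b)* · a : words over {a, b} that end in a.
ends_in_a = Cat(Star(Alt(Sym("a"), Sym("b"))), Sym("a"))
print(matches(ends_in_a, "bba"), matches(ends_in_a, "ab"))
```

No simplification of intermediate expressions is performed, so the derived expressions grow with the input; this keeps the sketch short but makes it unsuitable for long inputs.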
Journal of Computer and System Sciences, 1981
It is known that for every restricted regular expression of length n there exists a nondeterministic finite automaton with n + 1 states, giving rise to the upper bound of $2^n + 1$ on the number of states of the corresponding reduced automaton. In this note we show that this bound can be attained for all $n \ge 2$, i.e., the upper bound $2^n + 1$ is optimal. An observation is then made about the synthesis problem for nondeterministic finite automata.
2020
We revisit the complexity of procedures on SFAs (such as intersection, emptiness, etc.) and analyze them according to the measures we find suitable for symbolic automata: the number of states (n), the maximal number of transitions exiting a state (m) and the size of the most complex transition predicate (l). We pay attention to the special forms of SFAs: normalized SFAs and neat SFAs, as well as to SFAs over a monotonic effective Boolean algebra.
Bulletin of The European Association for Theoretical Computer Science, 2015
Because of their succinctness and clear syntax, regular expressions are the common choice to represent regular languages. Deterministic finite automata are an excellent representation for testing equivalence, containment or membership, as these problems are easily solved for this model. However, minimal deterministic finite automata can be exponentially larger than the associated regular expression, while the corresponding nondeterministic finite automata can be linearly larger. The worst case of both the complexity of the conversion algorithms and the size of the resulting automata is well studied. However, for practical purposes, estimates for the average case can provide much more useful information. In this paper we review recent results on the average size of automata resulting from several constructions and suggest several directions of research. Most results were obtained within the framework of analytic combinatorics.
We determine the asymptotic proportion of minimal automata, within n-state accessible deterministic complete automata over a k-letter alphabet, with the uniform distribution over the possible transition structures, and a binomial distribution over terminal states, with arbitrary parameter b. It turns out that a fraction ~ 1-C(k,b) n^{-k+2} of automata is minimal, with C(k,b) a function, explicitly determined, involving the solution of a transcendental equation.
Theoretical Computer Science, 2005
We investigate the state complexity of some operations on binary regular languages. In particular, we consider the concatenation of languages represented by deterministic finite automata, and the reversal and complementation of languages represented by nondeterministic finite automata. We prove that the upper bounds on the state complexity of these operations, which were known to be tight for larger alphabets, are tight also for binary alphabets.