———
Instuctors|Manvaley
FORMAL UANGUAGES
enn) AUTOMATA
Uhh 2
PEMERIEINZ
Uatresiyy of Calitnatinet DartsWorld Headquarters
Jones & Bartlett Jones & Bartlett Jones & Bartlett
Learning Learning Canada Learning International
40 Tall Pine Drive 6339 Ormindale Way ‘Barb House, Barb Mews
Sudbury, MA 01776 Mississauga, Ontario London W6 7PA,
978-443-5000 LSV 192 United Kingdom
info@ jblearning.com Canada
‘www jblearning.com
Jones & Bartlett Learning books and products are available through most
bookstores and online booksellers. To contact Jones & Bartlett Learning directly,
call 800-832-0034, fax 978-443-8000, or visit our website, www.jbleaming.com.
Substantial discounts on bulk quantities of Jones & Bartlett Learning
publications are available to corporations, professional associations, and other
qualified organizations. For details and specific discount information, contact
the special sales department at Jones & Bartlett Learning via the above contact
information or send an email to specialsales @jbleamning.com.
Copyright © 2012 by Jones & Ba
jet Learning, LLC
All rights reserved, No part of the material protected by this copyright may be
reproduced or utilized in any form, electronic or mechanical, including
photocopying, recording, or by any information storage and retrieval system,
‘without written permission from the copyright owner.
Production Credits
Publisher: Cathleen Sether
Senior Acquisitions Editor: Timothy Anderson
Senior Editorial Assistant: Stephanie Sguigna
Production Director: Amy Rose
Senior Marketing Manager: Andrea DeFronzo
‘Composition: Northeast Compositors, Inc.
tle Page Design: Kristin E. Parker
6048
181413 1211 10987654321“4070 PREF LinaIM” — 2011/1/28 — 13:30 — page iii — #1
Preface
my book An Introduction to Formal Languages and Automata, Fifth
Edition. Since this text was organized on the principle of learning
by problem solving, much of my advice relates to the exercises at
the end of each section
It is my contention that this abstract and often difficult subject matter
can be made interesting and enjoyable to the average undergraduate stu-
dent, if mathematical formalism is downplayed and problem solving is made
the focus. This means that students learn the material and strengthen their
mathematical skills primarily by doing problems. Now this may seem rather
cobvioss; all textbooks contain exercises that are routinely assigned to vest
and improve the students’ understanding, but what I have in mind goes a
little deeper. I consider exercises not just a supplement to the lectures, but
that to a large extent, the lectures should be a preparation for the exercises
‘This implics that one needs to emphasize those issues that will help the stu-
Gent to solve challenging problems, with the basic idess presented as simply
an ho aim of this manual is to provide assistance to instructors using“46070 PREF LinzIM” — 2011/1/28 — 13:30 — page iv — #2
iv Prseace
fas possible with many illustrative examples. Lengthy proofs, unnecessary
detail, or excessive mathematical rigor have no place in this approach. This,
is not to say that comect arguments ave ivelevant, but rather that they
should be made in connection with specific, concrete examples. Therefore,
homework has to be tightly integrated into the lectures and each exercise
should have a specific pedagogical purpose, Assignments need to be com-
posed as carefully and thoughtfully as the lectures. This is a dificult task,
but in my experience, the success of a course depends critically on how well,
this is done,
‘There are several types of exercises, each with a particular purpose and
flavor, Some of them are straightforward drill exercises. Any student with
1 basic understanding should be able to handle them. They are not always
very interesting, but they test the student’s grasp of the material, uncover
possible misunderstanding, and give everyone the satisfaction of being able
to do something
A second type of exercise in the manual, I call “fll-in-the-details.” These
are usually omitted parts of proois or examples whose broad outlines are
sketched in the text, Most of them are not overly difficult since all the
non-obvious points have beon spelled out. For mathematically well-tyained
students these exercises tend to be simple, but for those not in this cat
egory (0.g., many computer science undergraduates) they may be a little
more difficult and are likely to be unpopular. They are useful primarily in
sharpening mathernatical reasoning and formalizing skills.
‘The prevalent and most satisfying type of exercise involves both an
understanding of the material and an ability to cary it a step further
"These exercises aze a Little like puzzles whose solution involves inventiveness,
ranging from the fairly easy to the very challenging. Some of the more
difficult ones require tricks that are not easy to discover, so an occasional
hhint may be im order. I have identified some of the harder problems with a
star, but this classification is highly subjective and may not be shared by
others. The best way to judge the difficulty of any problem is to look at
the discussion of the solution
Finally, there are some exercises that take the student beyond the scope
of this course, to do some additional reading or implement a method on the
computer, These are normally quite time consuraing and are suitable only
for extra-credit assignments. These exercises are identified by a double star.
For the actual solutions, Ihave done what I think is most helpful, When,
a problem has a simple and concise answer, I give it. But there are many
ceases where the solution is lengthy and uninformative. I often omit the
details on these, because I think it is easier to make up one’s own answor
than to check someone else's. In difficult problems I outline a possible
approach, giving varving degrees of detail that I sce necessary for following,“46070 PREF LinzIM” — 2011/1/28 — 13:30 — page v —
Prerace v
the argument. There are also some quite general and open-ended problems
where no particular answer can be given. In these instances, I simply tell
you why I think that such an exercise is useful
tor Lin,46070 PREF LinaIM” — 2011/1/28 — 13:30 — page vi — #446070 TOCK Linz" — 2011/1/28 — 18:31 — page vii — #1
eK
Introduction to the Theory of Computation
1.1 Mathematical Preliminaries and Notation
12 Three Basic Concepts
13 Some Applications
Finite Automata
2.1 Deterministic Finite Accepters
2.2 Nondeterministie Finite Accepters
23 Equivalence of Deterministic and Nondeterministic Finite
‘Aceepters .
24 Reduction of the Number of States in Finite Automata. .
Regular Languages and Grammars
3.1 Regular Expressions... .
32 Connection Between Regular Expressions and Regular
Languages .
3.3. Regular Grammars
Properties of Regular Languages
4.1 Closure Properties of Regular Languages
4.2, Blementary Questions about Regular Languages
43 Identifying Nonregular Languages
un
ul
uM
16
a7
7
a10
sy
“46070°TOCK'LinzIM" — 2011/1/28 — 13:31 — page viii — #2
Convenrs
Context-Free Languages
5.1 Context-Free Grammars
5.2 Parsing and Ambiguity
5.3 Context-Free Grammars and Programming Languages
Simplification of Context-Free Grammars and
Normal Forms
6.1 Methods for Transforming Grammars
6.2 Two Important Normal Forms
63 A Membership Algorithm for Context-Free Grammars
Pushdown Automata
7.1 Nondeterministic Pushdown Automata,
7.2 Pushdown Automata and Context-Froo Languages
7.3 Deterministic Pushdown Automata and Deterministic
Context-Free Languages
7A Gratamars for Deterministic Context-Free Languages
Properties of Context-Free Languages
81 Two Pumping Lemmas
8.2. Closure Properties and Decision Algorithms for Context-
Free Languages
Turing Machines
9.1 The Standard Turing Machine
9.2 Combining Turing Machines for Complicated Tasks
93 Taring’s Thesis
Other Models of Turing Machines
10.1 Minor Variations on the Turiug Machine Theme
10.2 Turing Machines with More Complex Storage
103 Nondeterministic Turing Machines
104 A Universal Turing Machine
10.5 Linear Bounded Antomata
‘A Hierarchy of Formal Languages and Automata
111 Recursive and Recursively Enumerable Languages
112 Unrestricted Grammaars
113 Context-Sensitive Grammars and Languages
114 The Chorasky Hierarchy
25,
25
28
29
29
30
32
33
33,
33
- 36
38
39
40
40
43
45
45
ar
4a
a7
47
48
50
» 50
50
51
51
- 53
54
55“46070°TOCX'LinaIM” — 2011/1/28 — 18:31 — page ix — #3
Covvenrs
12 Limits of Algorithmic Computation
Ww
wa
1s
124
125
‘Some Problems That Cannot Be Solved by Turing.
‘Machines
Undecidable Problems for Recussively Enumerable
Languages
‘The Post Correspondence Principle
Undecidable Problems for Context-Free Languages
A Question of Biliciency
13 Other Models of Computation
131
132
a3
Recursive Functions
Post Systems
Rewsiting Systems
14 An Overview of Computational Complexity
Ma
ua
us
Ma
145
M6
jency of Computation
‘Turing Machine Models and Complexity
Language Families and Complexity Classes
Some NP Problems
Polynomial-Time Reduction
NP-Completeness and an Open Question
55,
87
58
59
59
59
59
61
62
63
63
63
64
64'46070°TOCX LinaIM” — 2011/1/28 — 13:31 — page x — #4“46070. XXXX LineIM” — 2011/1/14 — 15:44 — page 1 — #1
Chapter 1
Introduction to the Theory of Computation
1.1 Mathematical Preliminaries and Notation
‘The material in this section is a prerequisite for the course. The exercises are
all of the type done in a discrete mathematics course. If students are com-
ortable with this material, one or two of these can be assigned as refresher
or as warm-up exercises. If students are struggling with this material, extra
time will be needed to remedy the situation. Working out some of these
‘ecercises in class and reading the solved exercises should help,
1 to 14: These exercises ate all fairly simple, involving arguments with
sets. Most of them ean be solved by direct deduction or simple
induction, Exercise 8 establishes a result that is needed later
15 to 21: Material on order of magnitude is needed in later chapters
22 to 24: Some routine exercises to remind students of the terminology
of graphs.
25 to 28: Hxercises in induction. Most students will have seen something
close to this in their discrete math course and should know
that induction is the way to go. Exercise 28 combines order of
‘magnitude notation with induction, but the exercise may be
hard for some students.
29 to 31: Simple examples of using proof by contradiction.
132: (a) and (¢) are true and ean be proved by contradiction. (b) is
false, with the expression in Exercise 30 a counterexample.
83 and 34: Classic examples that should be known to most students.
‘86: This is oasicr than it looks. If nis not a snultiplo of threo, then
it must be that either n = 3m +1 or n = 3m-+2 In the first
case, n +2 = 3m 4 3, in the second n+ 4— 3m +6.“46070. XXXX LineIM” — 2011/1/14 — 15:44 — page 2 — #2
2 Carrer 2
1.2 Three Basic Concepts
In this section we introduce the basic concepts on which the rest of the
ook is based. One could leave the section out, since everything recurs
later. But I think it is important to put this up front, to provide a context
for more specific development later. Also, it gives students an immediate
introduction to the kinds of problems they will face later. There are some
reasonably dificult and challenging exercisos here.
1 to &: Like Exaraple 1.8, these are all obvious properties of strings and.
‘ve simply ask to make the obvious rigorous. All can be done with
induction and are useful for practicing such arguments in a simple
setting. The results are needed again and again, so it is useful to
do some of these exercises.
4: AA simple drill exercise that introduces the idea of parsing (without
specifically using the term) and shows that breaking a structure
into its constituent parts requires some thought
5: A good exercise for working with language complements and sot
notation.
6 LUT= 3"
7: Am exercise in understanding aotation, There ate of course no such
languages, since L* and (Z)* both contain A
8 to 10: These aro not difficult, but require careful reasoning, sometimes
involving several steps. The exercises are good tests of the under
standing of concatenation, reversal, and star-closare of languages.
11: To get the grammars should be easy, but giving convincing argu
monts may prove to be a little harder. In fact, expect students to
ask, “What do you mean by convincing argument?” and you will
need to set standards appropriate to your class at this point. It is
important that the istue of how much rigor and detail you expect
is settled early.
12: It is easy to sve that the answer is {(ab)” :n > 0)
18: Points out that grammar does not have to derive anything, that
is it derives 2“46070. XXXX LineIM” — 2011/1/14 — 15:44 — page 3 — #3
Cuarrer 1 3
14: A mixed bag of exercises. (a), (bj, and (c) are easy; so is (d) but it does
give trouble to students who don’t sev that the language is actually
{a#36 :m > 0}. Parts (c} to (h) let the students discover how to
combine grammars, ©, if S; derives Ly and S, derives L2, then S >
S18) combined with the grammars for Ly and La derives LiL. This
‘anticipates important results for context-free grammars, Part (i) cannot
be done this way, but note that Ly —Za= Li 0 La = 2.
15: (a) is simple, the others get progressively harder. The answers are not
too dificult if students are comfortable working with the mod fame-
tion. For example, the solution to (c) can be seen if we notice that
Jw|mod 3 |w| mod 2 means that Ju] # Gn or Jul # On-+1. A gram
mar then is
S$ acaaaas|A
A aalana|acaalaacae
16: ‘This simple exercise introduces a language that we encounter in many
subsequent examples.
17: In spite of the similarity of this grammar to that of Example 1.13,
its verbal deseription is not easy. Obviously, fw © L, then nw)
rns(w)+1. But strings in the language must have the form awyb or bua,
with wi €L
18: A set of fairly hard problems, which can be solved by the tick of eount-
ing described in Example 1.12. c) is pethaps the mest difficult.
(b) $+ aS |S]a5,
where $i, derives the language in Example 1.13.
(6) $+ aSbSa aaSb| bSea|SS| \
(4) Split into cases ng (w) = ns (w) 4 1 and ng (w)
ne (w) 1
19; While conceptually not much mote difficult than Exercise 18, it goos
‘a step further as now we need to be able to generate any number of
c's anywhere in the string. One way to do this is to introduce a new
variable that can generate c’s anywhere, say
Co e0\d
and then replace terminals a by C'aC and b by CBC in the productions
of Exercise 18.
20: A fillin-the-details exercise for those who stress making arguments com-
plete and precise“46070. XXXX Line” — 2011/1/14 — 15:44 — page 4 — #4
4 Carrer 2
21 to 28: These examples illustrate the briefly introduced idea of the equiv:
alence of grammars. It is important that the students realize
early that any given language can have many grammars, The
two grammars in Exercise 21 are not equivalent, although some
vill claim that both generate {a"6"}, forgetting about the empty
string, The exercise points out that the empty string is not “noth-
ing” In Exerciso 22, note that $ > SSS can be replaced by
S + SS + SSS, so the grammars are equivalent. A counterex-
ample for 23 is aa.
1.8 Some Applications
‘This section is optional; its puxpose is to hint at applications and relate the
material to something in the student’s previous experience. It also intro-
duces finite automata in an informal way. Exercises 1 to 6 are suitable for
those with a background and interest in programming languages. Exercises
8 to 14 deal with some fundamental hardware issues familiar to most com-
puter science students from a course in computer organization. Generally
those exercises aro not hard,
7: A more prosaic way of stating the problem is: no M can follow the first
D, no D can follow the first C, etc, The resulting automaton is a little
large, but easy in principle.
8: This introduces an important idea, namely how an automaton can re
member things. For example, fan a, is read, it will have to be reproduced
later, so the automaton has to remember. ‘This can be done by labeling
the state with the appropriate information. Part of the automaton will
‘then look like
9: A simple extension of the idea in Exercise 8.“46070. XXXX LineIM” — 2011/1/14 — 15:44 — page 8 — #5
Cuaprer 25:
10; The automaton must complement each bit, add one to the lower
torder bit, and propagate a carry. Students will probably need to
try a fow examples by hand before discovering an answer such as
the one below,
wo ow
uw
™%
LA: A fairly simple solved exercise,
12 to 14: These are similar in difficulty to Exercise 10. They all require
that the automaton remember some of the previously encoun-
tered bits.
15: This is simple as long as the higher order bits are given first.
Think about how the problem could be solved if the lower order
bits are seen first
Chapter 2
Finite Automata
2.1 Deterministic Fi
ite Accepters
1: A drill exercise to see if students can follow the workings of a dfs,
2: Some of these languages are the same as Exercise 11, Section
1.2, so the student can see the parallels between grammars and
automata solutions, Since this is a very fundamental isste, this is
28 good introductory problem. All the exercises are relatively easy
if mnemonic labeling is used,
Sand 4: These two exercises let the student discover closure of regular
languages under complementation. This is discussed later in the
treatment of closure, so this gives a preview of things to come. It
also shows that the dfa for T can be constructed by comapleracnting
the state set of the dfa for L. The only difficulty in the exercises
is that they require formal arguments, so it is a good exercise for
practicing mathematical reasoning,
5 and 6; Easy exercises'46070°XXXX'LinzIM” — 2011/1/14 — 15:44 — page 6 — #6
6 Cuaprer 2
7: Similar to Exercise 15 in Section 1.2. The answers all involve simple
modular counting by an automaton, Once students grasp this principle,
all parts are easy,
8: May be quite hard until the student gets into the habit of using mnemonic
labels. If each state is labeled with the appropriate number of a’s, the
solution for part (a) below follows dixectly, Solutions to the other two
parts are comparable in difficulty
9: Continues the theme of Exercise 8, but is on the whole a little eas
After this exercise, the students should be convineed of the usefulness
fof mnemonic labeling. Note that (a) and (b) are not the same problem.
10: This is a difficult problem for most students. Many will try to find some
kind of repeated pattern. The trick is to label the st
(mod 5) of the partial bit string and find the rule for taking care of the
next bit by
tes with the value
(2n +1) mod 5 = (2n mod § 1 1)mod 5
leading to the solution shown below.
Don't expect everyone to discover this, so a hint may be appropriate
(or let them try a bit first before giving dizections)“46070. XXXX LineIM” — 2011/1/14 — 15:44 — page 7 — #7
Caaprer 27
11 to 15: All fairly easy exercises, reinforcing the idea that a language is
regular if we ean find a dfa for it
16: A simple problem, pointing out an application in programming
Tanguages.
17 and 18: These exercises look easier than they are. They introruce the
important technique of a general construction, Given any dia
for L, how do we construct from it a dla for L— (A)? The
temptation is to say that if Ae Z, then q) must be in the
final state set F, 90 just construct a new automaton with final
state sot F — {go}. But this is not comect, since there may be
‘a nonempty string w € L, such that &* (go,w) = go. To get
around this difficulty, we ceate a new iitial state pp and now
transitions
8.2) = 45
for all original
$ (0.2) = 4
‘This is intuitively reasonable, but it has to be spelled out in
detail, 50 a fortsal argument wil still be hard, However, as it
is one of the sizuplest cases of justifying a construction, asking
for a proof that the construction works is « good introductiem
to this sort of thing
Note that these exercises become much easier after nfa's have
been introduced
19: A good exercise in inductive reasoning as well as in handling
concise, butt not very transparent, mathematical notation.
20 and 21; These involve gencralization of the idea introduced in Example
2.6 and should not prove too difficult.
22: Generalizes the above idea in Bxercises 20 and 21 a little more
and points to closure properties to come. Gets the student to
‘think ahead about issues that will be treated in more detail
later, Not too hard for students with the ability to generalize,
although the formal proof may be challenging,
23: An exercise for reasoning with transition graphs. The answer
is intuitively easy to seo, but may be troublesome if you are
asking for a formal proof.“46070. XXXX Line” — 2011/1/14 — 15:44 — page 8 — #8
8 Caarrer 2
24: Constructions of this type are very fundamental in subsequent diseus
sions, but at this point the student has no experience with them, so
this will be hard. But if the student can discover the idea behind the
solution, later material will make much more sense. Perhaps a hint such
as “if va € L, then 6° (va) € F, But v € truncate(L), 30 that 6° (v)
must be a final state for the new automaton” is worthwhile, This will
probably give the construction away, but any kind of formal proof will
still be hard for most.
25: A simple exercise that has many different solutions,
2.2 Nondeterministic Finite Accepters
1: A “éll-in-the-dotails" exercise, of which there are a number through-
cout the text, Such exercises tend to be unexciting to many sti
dents and you may not want to assign 2 lot of them, An occasional
cone, though, is appropriate. Having to reason about fine points
gives the student a better understanding of the result. It will also
test tine students’ ability to go from an intuitive understanding of
‘8 prool to a precise sequence of logical steps.
2: An exercise foreshadowing the
such as
-O-O-0-0-OLD
is not entirely trivial for students at this point.
dfa/nfa equivalence. An answer
3: This exercise brings out the connection between complementing
the final state set and the complement of a language.
4 to 6: Routine drill exercises in understanding and following transition
graphs. Also reinforces the point that 5° is a set.
7 and 8: These require solutions with a bound on the number of states.
Without such bounds the exercises would be trivial, but even with
them they are not too hard. ‘The main virtue of this set is that it
gets the student to play around with various options. A solution
to 7 is easy. Exercise § is solved.
9: The answer is pretty obvious, but how can one defend such a
conclusion? A question without a very tidy answer, but it gets
the student to think.“46070. XXXX LineIM” — 2011/1/14 — 15:44 — page 9 — #9
Cuapren 29
10: The answer to part (b) is yes, since L= {b"a! mn > 0,k 2 0},
11: An easy exercise.
12: A routine, but worthwhile exercise, since some students get confused
tracing through an nfa,
18: Clearly, the language accepted is {a : n> 1). Assuming 2 = {a}, the
complement consists of the empty string only.
14: An easy exercise that requires a simple modification of Figure 28
A}
16: Can be a little hard, mainly because of the unusual nature of the ques
tion. It is easy to come up with an incorrect result. For the solution,
see the solved exercises.
1s: L
17: Again the obvious answer is no, but this is not so easy to defend. One
way to argue is that for a dfa to accept {a}, its intial state must be
4 final state. Removing any edge will not change this, so the resulting
automaton still accepts 2.
18: A worthwhile exercise about a generalization of an nfa. Students some
times ask why in the definition of a finite automaton we have only one
initial state, but may have a number of final states. This kind of exer
cise shows the somewhat arbitrary nature of the definition and points
cout that the restriction is inconsoquential
19: No, if go € F, introduce py as in the exercise above.
20: Exercise in reasoning with tzansition graphs. Makes sense intuitively,
Dut don't expect everyone to produce an airtight argument.
21; This introduces a useful concept, an incomplete dfa (which some authors
ise asthe actual definition of a fa). Using incomplete dia's can simplify
‘many problems, so the exercise is worthwhile.
2.3 Equivalence of Deterministic and Nondeterministic
Finite Accepters
1: Straightforward, drill exercise, For a simple answer, note that the
accepted language is {a” : n 2 1}
and 8: These are easy drill exercises,
4: A “fil-in-the-details” exercise to supply some material omitted in
the text.10
10:
un
aa
1s:
“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 10 — #10
CHAPTER 2
‘Yos, it is true. A formal argument is not hard. By delinition, w © Lif
and only if 8 (go,w) VF # &. Consequently, if 5" (go,e) MF ~ @, then
wel
: Not true, although some students will find this counterintuitive. This
exercise makes a good contrast with Exercise 5 above and Exercise 4,
Section 2.1. If the students understand this, they are on their way to
understanding the difficult idea of nondeterminism.
+ Create a new final state and connect it to the old ones by \-transitions,
‘This does not work with dfa’s, as explained in the solution,
: Does not follow from Exexcise 7 since A-transitions ate forbidden. The
answer requires some thinking. A solution is provided. This is a specific
case of the general construction in the next exercise,
‘This is a troublesome construction, Start with dia, add a single final
state with Mtransitions, then remove the A-transitions as sketched be-
Introduce a single initial state, and connect it to previous ones via
transitions. Then convert back to a dfa and note that the construction,
of Theorera 2.2 retains the single initial state.
[An instructive and easy exercise, establishing a result needed on occa
sion. Without this exercise some students may not realize that all finite
languages are regular.
Another exercise foreshadowing closure results. The construction is
‘easy: reverse final and initial states and all arrows. Then use the com
clusion of Exercise 18, Section 2.2.
Once you see that the language is {0":n > 1) U{0"1:n > 0}, the
problem is trivial“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 11 — #11
Cuarrer $11
14: A difficult problem with a solution in the solved exercises
15: Not obvious, but should be manageable after one or two similar prob-
Jems. Find the set of states Q for which there is a path of length two
fom the initial state. Introduce 2 new initial state @) and A-transitions
ftom qj to the states in Qo.
2.4 Reduction of the Number of States in Fin
1: This is an easy drill exercise.
2: Easy to construct dfa's; the rest is routine.
3: ‘This is an important point that may have escaped notice: in minimize
tion, the original and final automata must be dfa's. So things will work
as stated only if reduce produces a dfa. This is the case because each +
can occur in only one label of 1,
Another easy drill,
Every walk from an initial to a final state must have length at least n.
‘Thus all simple paths have length at least n, and consequently there
must be at least m +1 vertices.
6: Not obvious, but the solution is given.
T: A test of students’ ability to work with equivalence relations,
8: This exercise asks the student to fill in some of the details that were
omitted in the text
Again, a test of understanding equivalences involving a short proof by
contradiction,
10: While the uniqueness question is not addressed in the text, itis worth
‘mentioning. But discovering a proof is probably beyond the capability
fof most students, so the best you can do is to point to the Literature
(cg, Denning, Dennis and Qualitz 1978) for help.
Chapter 3
Regular Languages and Grammars
3.1 Regular Expressions
‘The material in this section lends itself very well for problem-solving exer
cises. The ones given here range from simple to moderately difficult. Tt is
easy to make up similar problems, but one needs to be careful. Some very
innocent looking questions can be quite difficult. It appears to be harder for
most students to work with regular expressions than with finite automata,2
10:
as
rt
As:
1a:
“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 12 — #12
CHAPTER 3
A routine drill exercise
Yes:
Since 1" ineludes A, this is also true.
aaa’ (b6)*
y, if you split this into (n and m hoth even) or (n and m both edd),
“An answer is
aa)" (65)" = aaa)” 666)"
(2) and {b) are easy, since the solution can be written down immediately,
eg.
r= (A+ataatana)(A + b+ bb-+ bbb)
for (b). To solve (¢) and (d), split the problem into several eases, such
asn<4,m <3, n> 4,m > 3, ete, We must also include strings in
which @ can follow b.
+: Uncovers some notational subtleties that may have escaped attention.
Particularly, the concatenation of a language with the empty set is
often misunderstood (of course, itis not really a very important point)
Answers: (2°)" = {A}, az = 2
All strings of the form wibw2, where w1 and w2 are composed of an
even number of a's, or w; and w3 consist of an odd number of a's.
: Exercise to soe how to get L* by reversing the regular expression. Leads
to the more general question in Exercise 21
Split into three cases: n > 1, m>3;n>2,m>2;andn>3,m>1.
Easy problem, with answer
Abb (a+ 8)"
A little hard to see, The part of getting an odd number of a's or an even
amumber of 6's is casy. But we also must include strings where an @ cam
follow a &, An answer is (a +b)" ba(a-+8)" + aaa)" b° + a” (88) = 2,
‘but many quite different expressions are possible
Enumerate all strings of length two to get something like aa (a +5)" act
ab(a +8)" ab +
A problem in unraveling notation. The simple answer is (a + 6)*
xamples 3.5 and 8.6 give a hint. A short answer is (1 + 01)" 00(1 + 10)*,
Dut students are likely to come up with many dilferent solutions“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 13 — #13.
Cuarren $13
16: A sequence of simple to harder problems. (a) is easy, (b) a little harder
since it should be broken into parts—no a's, exactly one a, ete. (2) is
quite hard since it involves putting several ideas together, for example
runs of a's of length 3n, separated by substrings with no runs of a's. To
get started, look at
r =r; (baaabr;baaat)" r,
where ry generates all strings that have no runs of a's. This gives a
subset of what we want, but several other parts have to be considered,
17: These again increase in difficulty, but on the whole are a little harder
than Exercise 14, (f) can be rather tedious and frustrating, Fiast look
at
= (r100r3)*
where ry is a regular expression for strings not containing 10. Unfortu.
nately, this does not cover the many special cases, such as 1", and it
takes some effort to sort this all out.
18: (a) is not hard, (b) perhaps alittle harder with answer b* +(8°ab" al" abt)"
{6) can be solved along these ines, but is longer since it must be split
Into separate cases mq (w) mod 5 = 1,2,3,4
: Once 18 is solved, this involves putting c’s into arbitrary positions.
20: Not too important, since these identities will not be used much. They
are easy to see but quite hard to justify precisely, so if you assign any,
say what level of detail you expect, I accept some arguments like
G7) = tn trim te)
= any number of r; concatenated = rf
Not very precise, but good enough to see that the student understands
the principle
21: A general question on how to get the reverse of a language through
the reverse of its regular expression. It is not hard to sce that the only.
thing we need is to replace ab---ed by de--ba recursively, but the
justification requires a somewhat lengthy argument.
22: Formal arguments to provide some omitted details.
ets the student to think about the meaning of closure. For a more
challenging exercise, omit the disclaimers about 2 and 2:“46070 XXXX Lit
= 2011/1/14 — 15:44 — page 4 — #14
14 Carrer 3
24 and 25: These are simple exercises, but interesting as they point to some
advanced applications in pattern recognition, Students may find
it interesting to realize that formal languages cau be used for
thing other than describing strings. There is lots in the
Iiterature on chain-codes.
26: Basy, but introduces the concepts of the next section,
at
An exercise with a somewhat different flavor, with a solution
provided
28: Similar to Exercise 27, with answer
r= 1410411 41004101 + 1104111
1000 + 1001 + 1010 + 11110 (0 + 1)
+ 104104 DO+ HOF) +1041)"
3.2 Connection Between Regular Expressions and Regular
Languages
1: Routine application of a given construction,
2: Hard if not approached correctly (what is the complement of L?), but
easy if you look at it the right way. Here the answer can be found by
complementing an appropriate dfa. This works, since the most natural
construction is a dia (or at least an incomplete one). A partial answer
‘with undefined transit
ns to an accepting trap state.“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 15 — #15,
10:
11 and 12:
As:
1a
15:
16:
Cuarrer $15
2 Routine drill exercise
: Here the student can take the tedious route of regular expres:
sion to nfa to dia, but of course they can be done from first
principles more easily. ‘Traps those who blindly follow the algo-
rithmic constructions
: The nfa’s are trivial and the nfa-to-dfa constructions are rou-
tine. Solutions from first principles may be a little cleaner but
require more thought.
2 Good contrast to Exercise 17(f), Section 3.1, Most students will
find it easier to get the finite automaton first, then the regular
expression from it
+: Find an nfa for the language, convert to a dfa, then minimize.
A little tedious, but straightforward,
Routine application of the construction leading up to Theorem
32,
‘The regular expression can be obtained by inspection,
Part (a) can be done by inspection. Perts (b) and (c) can be
done hy the given construction, but we must first create an nfa
‘with a single final state, distinet from the initial state, This is a
reutinder that the construction in Theorem 3.2 requires an nfa
of a special form.
Drill exercises to make sure the students can follow the cou-
struction,
Easy if an nfa is constructed first, very hard without this step,
‘This exercise will give trouble to those who try to find regular
expressions directly. It also demonstrates that finite autoraata
are easier to work with than regular expressions.
It will be hard for students to make the argument precise, but
it serves to justify an important step in the construction
Easy, mainly to koop connection with applications in mind.
Quite hard, but the given solution should help students in the
ext exercise“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 16 — #16
16 Caapren 3
17: Create a duplicate dfa for L, then follow the pattern suggested by the
diagram below.
18: This shows why the simaple automata in Figure 3.1 were chosen in their
particular way. ‘The construction works even for the unusual situations
in this exercise
3.3 Regular Grammars
‘The exercises in this section are mostly routine, since the constructions
relating finite automata to regular grammars are quite automatic.
Land 2: Routine dill exercises
8: ‘The language is Z (abba (oba)” 5). From this the leftclimear gram
mar can be found without much trouble
4: A solved exercise
: Ifyou follow the suggestion in Theorern 3.5, you will get the gram-
mar qo + 010|A; q1 —* gol; 42 —» golqiO|gal. This suggests a direct
approach: reverse the arrows in the graph and write the come
sponding rules in lot-linear forza, Notico that the grammar has
several useless productions, Of this, more later
6: It is trivial to get a grammar for L (aab*ab). Then we need the
closure of this language, which requires only a minor modification.
‘An answer is $+ aaAlA, A bA|C,C + ab\ abs.
T: Split into parts (no a's, one a, ete.), then get a grammar for vach,
‘This is an exercise in combining grammars to get the union of
languages.
8: Theoretical, but the solution is in the solved problems section
9: Seo the answer to Exercise 5 above.“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 17 — #17
Cuarren 417
10; Basy to do from frst principles, using the same kind of argue
iment as in Exercise 5
11 and 12: These exercises look harder than they are, Split into two parts:
tg (w) and ng (w) are both even, and ng (w) and ny (we) are
both odd,
18: Constructing a dfa for these languages should by now be easy
for most students. After this apply Theorem 3.4
14: Introduces the important idea of a normal form, about which
‘we say more later. The technique here is also important : we
introduce new variables, for exemple
An aaB
becomes
As ac
Cab
and so on. It is satisfying forthe student to discover this.
15: If there is no such rule, no sentence can be derived,
16: A straightforward, applications-oriented exercise
17: Thisis a worthwhile exercise whose solution is outlined. Perhaps
students can be asked to make the given outline more precise,
Chapter 4
Properties of Regular Languages
4.1 Closure Properties of Regular Languages
Most of the closure results come either from some construction (such as
the one in Theorem 4.1) oF from some set identity (as in Example 4.)
‘The exercises in this section expand on these observations. Some of the
constructions are difficult,
1: Prove the result by induction on the number of moves of the nfa.
2: This is a routine exercise that tests the understanding of the algorithia
involved in a constructive proof. It is worthwhile because it gets students
used to constructions involving Cartesian products of states.
‘&: Follow the construction for the intersection. The final states of the new
automaton are (qi, 9), with q € F and 9; ¢ F.“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 18 — #18
18 Caapren 4
4: Straightforward, fillin-the-details,
5: Good example of simple induction on the number of languages.
6: We use
$18 Sp = (SUS) = (81.8)
and known closure properties. A constructive proof (along the lines of
intersection m Theorem 4.1) can be made, and it may be worthwhile
to extend the exercise by asking for such a proof
1: Follows from
not (I, La) = Tro Ta
cor(Uayta) = TUTE
8: Gets the student thinking about komomorphista, (a) and (c) are true,
bbut (b) is false, as shown by the example Ly = L (a*) , La = 1 (6*) h(a) =
a,h(b) =a.
10: Basy problera, with answer Ly/La = L(a")
11: A counterexample may not be obvious. Here is one: Ly = {a"6"
n> 0}, Ly = (8 :m > 0}. Then Ly La/Lz = {a"b" :n > 0,m > 0}.
12: A somewhat unusual and challenging problem. ‘The answer is given,
18: The construction can be done cither via nfa's o regular expressions.
Neither approach should be hard to discover. For example, if r is a
regular expression for and s is a regular expression for ©, then rss is,
a regular expression for Ly
14: An easy problem, Since L is regular, so is L®. Then by closure under
concatenation, so is the language under consideration
15: While for someone who thoroughly understands the construction of
Theorem 4.4 this problem is in principle not too hard, don’t assume
that this will be easy for your students. The reasoning requires some
ability to adapt a dificult argument for a new purpose. We can change
the right-quotient argument as follows. We first find all g, that cam
hoe reached from go with sorae string from La (og, by making cach
4,8 final state and intersecting the L). All such reachable states axe
thon made tho initial states of an nfa for L,L2. This gives an nfa with
multiple initial states, but according to Exercise 13 in Section 2.2 this
is ok.“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 19 — #19
Cuarren 419)
16: Many students havs dilliculty with Uhis type of problem, so the given
solution could be instructive.
Exercises 17 to 25 all involve constructions of some difficulty. Several of these
have been encountered peripherally in previous sections, but if this is the
first time students see this type of problem, expect some confusion. However,
once a student has mastered one or two, the rest are nat so bad. In addition
to seeing the construction, there is also the difficulty of proving that the
construction works as claimed, This is mostly a matter of using induction
on the length of the strings involved, but there are lots of details that will
give many students trouble. Although these are not easy problems, I suggest
you assign at least one or two of them. Examining the given solution of 18
should help,
17: Find the set of all states 4* (q,.) € F for some w. Create a nevr initial
slate and add a Mtransition from it to all elements of Q.
18: You may want to assign this problem just go that students will read the
siven answer
19; Extend the construction in Exercise 14, Section 2.3, The idea is the
same here, but there are a few snore details.
20: Suppose the graph for L looks like
5
O—®
We replace this with
OO O-—O
If qy has several outgoing edges, we create a sub-automaton for each,
siving us an nfa with multiple initial states20
2
22:
“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 20 — #20
CHAPTER 4
‘The idea here is similar to Exercise 20. We replace each part of the
graph of the forma
-©+O- —O4
by.
but for this to work, the final states of the original automaton cannot
ave any outgoing edges. This requirement can be met by introducing,
new final states and suitable \transitions, Also, the construction must
involve separate parts for each pair (0,4) for which 5{go,@) = 1
and 8(Qn, 0m) = 2s
A vory difieult construction. We ean transfer from the automaton for
‘Ly to that of La any time by the addition of suitable \-transitions, but
the resulting automaton has to remember where to zeturn to. This can
be done by creating a sub-automaton of one for each state of the other;
specifically, ifthe automaton for Z; has state set Q and that of La has
state set P, the automaton for shuffte(Za, La) will have a state set of
size 2|Q||P| arranged somewhat as in the picture below.
Subscript i meins
that amen te
Bae,“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 21 — #21
Cuarrer 421
23: Find all states that can be reached frou the initial state in five steps.
‘Then define a new initial state and add A-tyansitions to them,
24: For each q € Q of the automaton for L, construct two nfa’s. The first
fone has q, as a final state, the second one has q, as its initial state. If
the languages represented by the first automaton and the reverse of the
language corresponding to the second have any common element, then
4 becomes a final state for the automaton for leftside (L)
25: ‘Take the transition graph of a dia for J: and delete all edges going out
of any final vertex. Note that this works only if we start with a dfa!
simple view of closure via grammars. The ideas are intuitively easy 10
see, but the arguments should be done carefully.
4.2 Elementary Questions about Regular Languages
‘Most of the exercises in this section can he done constructively by explicitly
exhibiting an automaton answering the given questions. But as demon-
strated with several cxamples in previous sections, sct operations together
with known algorithms are often a quicker way to the solution. If the sta
ents realize this, they will find the exercises much easier.
1: Simple, since Ly — Ls is regular and we have a membership algorithm
for regular languages.
2: I Ly © Lp, thon Ly U Ly = Ly. Since Ly U Ly is regular, and we have an
algorithm for set equality, we also have an algorithm for set inclusion,
8: Construct a dia. Then \ € Lif and only if g © F.
4: Sinco Ly [Lz is regular and we have an algorithm for set equality, we have
an algorithm for this problem.
5: Use the dfa for L, interchange initial and Sina states, and reverse edges
to get an nfa M for L¥. Then check if L(M) = L (i)
6; Similar to Exercise 5, except that we need to check if L (M)L (7) = 2
7: Construct the regular Language Li Lz (e.g, by concatenating their regular
expressions), then use the equality algorithm for regular languages.
8: Very simple, since L* is regular,“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 22 — #22
22 Carre 4
9: This is a litle harder than some of the above exercises. Take a dia for L
‘Then for each pair (4,43) such that 8° (go, ) ~ ge and 8° (43,0) — a,
construct an automaton Mj, with q, a8 initial and q, a8 final state
Thea determine if @ © L(A)
10: By Bxercise 22, Section 4.1, there exists a construction that gives M,
such that L(M,) = shuffte(L, L). Using the algorithm for equality of
regular languages, we can then determine if L(M,) = E.
11: Similar to Bxereise 10. Construct M, such that tail (Z) = L (Me)
12: A good exercise that involves a simple, but not obvious trick. If there
are no even length strings, then
1 ((aa+ab+ba+0)") EL = 2.
Some students will come to grief trying to argue from transition grapbs.
13: Look at the transition graph for the dfa. If there is a eycle, then || > 5:
If not, check the lengths of all possible paths.
14: Similar to Exercise 12. Check if L ((aa + ab + ba-+ BA") OL is infinite
15: This isa very simple problem since we have an algorithm to test equality
of two regular languages and is regular
4.3 Identifying Nonregular Languages
‘The correct application of the pumping lemma is difficult for many students
In spite of repeated admonitions, there aze always some who will base their
arguments on a specific value of m or a decomposition of their own choosing,
I dou't know how to overcome this except to give lots of exercises.
1 and 2: The proofs of these are nearly identical to the proof of Theorem
4.8; wo just have to look at the states traversed during the reading
of any part of the string. These are theoretical, but worthwhile
exercises; they make the student think about what is involved in
the proof of the pumping lemma. The results are useful in Exercise
20 below.
3: A simple problem: pick {a5} as starting string, Also, here L =
i“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 23 — #23
Cuarren 423
4: ‘This set is gencrally easy and involves little more than a routine appli-
cation of the pumping lemma, similar to Examples 4.7 and 4.8. (2) is
hard if approached directly, A simpler solution is by contradiction; if L
is assumed regular, so is Z, But we already know that, in this particular
case, T is not regular. I think that all parts of this exercise should be
one before proceeding to harder problezas.
this sot is harder than Bxercise 4, since some algebra is roquired. For
example, in (a), pick string a™!, where MM > m is a prime. Assume that
the string to be pumped is of length k, so that by pumping é times we
set astring of length M++(i — 1) k. By picking i — A/+1, we get a string
‘whose length is not a prime number. Part (b) follows from (a), since
the language here is the complement of that in (a), Parts (c) and (a)
roquire similar algchraic manipalations. (c) is similar to (a), but (f) and
(g) are traps for the unwary, since both languages are essentially {a"}.
There are always a few students who manage to apply the pumping
Jemma to “prove” that these languages are not regular.
6: The language in (a) is regular, but (b) is not. Applying the pumping
lemma to (h) requires some cate.
T: An easy application of the pumping lemma,
A hard problem, One way to go about itis to note that the complement
Of Lis closely related to {2% Ua"“15" Ua™20"}, then using closure
under complementation
9: ‘The problem is easy if you notice that nq(w) = na(w) + ne(w
10: Port (a) is very hard. We have to pick a very special string a+ to
start with, Suppose the middle string has length &; then the pumped
string is @M1+0-DK. If we now pick M = ml, then é can be chosen
so that M2614 (¢—1)k = (M4 1)2. The argument through closure
is easy since we have Example 4.11
Li; Start with w = a!) Then wy = a’"'-# and ma! — b> (m= 1)!
12: Very easy.
15: Here it is hard to use the pumping lemma divectly; instead note that ZA
L (arb) = {ad mn =k 1). We can easily pump out this language
1d: False, Take Ly = {a%":n >on} and Ly = {a"W™ sn
O}.
A theoretical exercise, filling @ gap left in @ previous argument. The
proof can be done by induction on the number of vertices.
A very difficult exercise. The answer is no. As counterexample, take the
Tanguages
L
{oan a} U (ow: sl i, then z €
{vjuu = |o| = ¢} ond therefore in L,.Ifn 1},
25: Rectangles are described by w"r"d"t". Apply the pumping lemma to
show that the language is not regular
26: A problem for studexts who misunderstand what the pumping lemma
is all about.
Chapter 5
Context-Free Languages
5.1 Context-Free Grammars
"The material in this section is important, but not hard to understand. Most
of the exercises explore the concepts in a fairly divect way. Nevertheless, an
occasional problem can be difficult. Many of the exercises are reminiscent
of (or extensions to) the grammar problems in Section 1.2
1: Straightforward reasoning: first, § = abil. By induction it follows
easily that B 3 (bhaa)” B (ba). Finally, B > bbAa > bba, s0
that > ab (Dban)” boa ba)”
Zand &: Simple drill exercise.
4: The solution is given.
5: No, by an easy application of the pumping lemma for regular
languages.
6; Fill-in-the-details via an induction on the number of steps in the
derivation,26
10:
un
2
“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 26 — #26
CHAPTER 5:
‘Most of the exercises in this set should not prove too dilficult for stu
dents who have previously encountered such questions in Section 12.
‘The problems increase somewhat in difficulty from (a) to (g). Several
of the exercises simplify when split into subsets (eg., n # m can be
divided into n m). Answers
(a) $+ aSb|A]B,A— Alalaa|aca, B+ bB|2,
(0) S + aS8b|A]B,A + aA, B+ BC,E 910)
(6) SaaSb|4| B.A aAla, B= bB)p.
(a) $ > asbb |asbbo| a
(c) First split into two parts (i) na(se) > nol), and (i) nq(w) <
rn(w). Hor pact (j), generate an equal number of a's and b's, then
(8) Modify Example 6.4 to get $ + aSb|SS'S;, where Sy cam derive
(x) Quite difficult, but easier if you understand the solution of 18(c),
Section 1.2. Modify that grammar so that the count ends up at
+1
A set similar in difficulty to Exercise 7. Again, they are much easier if
split intelligently, For example, in (a) we let Ly — {a6 cF sm = m}
{arch mm < k}. Then we use $ — $}|S2, where $} derives
drives Ln. The other parts are similar in nature.
: Conceptually not hard, but the answer is long: for each a, generate two
noma symbols, eg., 5» aSbblaSbel... ete
Easy, once you see that head (L) = L (a"b*),
An easy exercise, involving two familiar ideas. An answer:
S + aSb|S1, 8, aSya|bSy)
Introduce new vatiables and rewrite rules so that all terminals are
fon the left side of the productions. For example, S — aSb becomes
‘S$ aSB, Bb. Then add productions A) for all variables that
can derive a terminal string, This anticipates some of the grammar
‘manipulations of Chapter 6 and is easier after that material has been
covered. At this stage, it can be quite difficult,“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 27 — #27
1a
La
16:
ar
1s
19)
Cuarren 527
‘This exercise anticipates the closure results of Chapter 8. The problems
ae not too difficult, for example, the solution to part (b) can be con-
structed by S— S)...:, where Sy derives L. The hardest part of the
exercise is to find a grammar for Z because this first requites a char~
acterization of Z. We can do so by splitting the problem into various
cases, such as {a"6") with n > m, ete
Another simple exercise, anticipating closure under union,
For an answer, uso $ > S
So aS2a [6526] Sy
| with S) deriving (a+ 5)(a-46) and
Difficult, but the solution is in the book.
‘A little easier than solving Exercise 16,
‘A difficult, but instructive exercise, Two alternatives have to be consid~
ered: (a) fv:) — wa, im Which ease the grammar must gencrate at least
‘one different symbol in the same relative position, and (b) [wi # seo
A solution can be found starting with
Sr aSalpSd|M
M — abblpralp|R
where E, L, and R derive strings with Juz] — lua), fw] > heal, and
Jal < fea), vespectively
Routine drill to give the derivation tree, but an intuitive characteriza.
tion of the underlying language is quite hard to make. One way is to
describe the language as
L= (w: 2nq(w) = ns (w),w = aubyb, with we Ly € L)
An easily solved problem,
Basy, but there are many answers.
2a:
25:
2 This exer
2S [S}US)| ssa.
se is not hard, but it is worthwhile since it introduces an
important idea. You can view this as an example of a metalanguage,
that is, a language about languages
Same idea as in Exercise 23.
For a leftmost derivation, traverse tree in preorder, that is, process
root, process left subtree, process right subtree, recursively, expanding,
each variable in turn. The fact that we can do this shows that a leftmost
derivation is always possible. For a rightmost derivation, usv a traversal
defined by: process root, process right subtree, process left subtree,“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 28 — #28
28 CHarrer 5
26: Expands on discussion in Example 5.3. For the case m > m, we ean use
the linear productions $+ aS|A,A— aAb|, with a similar set for
the case n < m.
27: A fairly simple exercise involving some counting of leaves in a tree. The
major purpose is to get students to think about derivation trees, To
got the answer, ostablish a relation between the height of the derivation,
tree and the maximum mumber of leaves. For a tree of height A, the
maximum number is h*. The other extreme is when there is only one
node (Le., there is only one vatiable in v). In this case, each level except
the last has & — 1 leaves, $0 fu — 1) — h(k 1)
5.2 Parsing and Ambiguity
1 to 8: While finding a grammar for those languages is trivial, the require.
ment that they be s-grammars makes the exercises a little more
challenging
4: Clearly, there is no choice in a leftinost derivation.
5: For each variable A on the let, there are at most [| possible right
sides so that [P| <|V| 7
6: A simple, solved problem,
7: The language is L (aa"b), so the answer is trivial
8: A xoutine drill exercise
9: Construct a dia and from it a regular grammar. Since the dfa never
hhas a choice, there will never be a choice in the productions. In fact,
the resulting grammar is ax s-grammar.
10; Straightforward modification of Example 5.12. Follow the approach,
used for arithmetic expressions
1: Yes, for example $ — aS;\ab, S; — 5. Contrast this with Exer-
cise 9 to show the difference between ambiguity in ® grammar and
ambiguity of a language.
12: The grammar S > aSa/bSbj.\ is not an s-grammar, so we cam
not immediately claim that it is unambiguous, But, by comparing
the sentential form with the string to be parsed, we see that there
Js never a choice in what production to apply, so the grammar is,
unambiguous,“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 29 — #29
CHarren 6 29
13: Consider w = abab, which has two leftmost derivations
S aSbS = abS = abab
and
S aSbS > aSb > abat,
Mand 15: Simple variations on the current theme.
16: ‘The grammar has all s grammar productions, except for B > A
and B — A, but we can always tell which to use
17: The first part is a somewhat tedious drill. For the second part,
note that A —+ bBb produces two terminals, and B— A none
So every two productions produce two terminals, and we have
‘at most [w| rounds.
18: The derivation of any nonempty string must start with 6 >
Ab, and continue with A — aAb until enough terminals are
generated.
19; Consider leftmost productions. Since the variable to be ex.
panded occurs on the left side of only one production, there
is never a choice.
20: AA solved exercise
5.3 Context-Free Grammars and Programming Languages
‘A sot of rather simple and not very exciting exercises. ‘The main purpose of
this type of exercise ig to remind students of potential applications. Assign
fone or two of these if you fecl that such a reminder is necessary.
Chapter 6
Simplification of Context-Free Grammars and
Normal Forms
‘Many of the exercises in this chapter are conceptually easy, but tend to be
lengthy. This reflects the material of this chapter, whose main difficulties are
technical. The exercises are useful to reinforce the constructions. Without
them, the students may be left with some misunderstandings.“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 30 — #30
30 CHAPTER 6
6.1 Methods for Transforming Grammars
6
Tto 9:
10 and 11
a2:
as:
1a:
415:
16:
ar
1s:
19:
‘This involves filling in some missing steps, using the given rea
soning in reverse
A reminder of derivation trees,
2 Simple, just substitute for B.
: It is not elear how to interpret the algorithm if A= B.
A routine application of the algorithm in Theorem 6.2. The
grammar generates the empty set.
Straightforward, but shows that ordor matters
Routing applications ofthe given theorems
Involve elementary arguments to complete some missing detail
Since $ is a nullable variable, we get
Ss ab asd] SS
which derives the original language without A. This leads into
a subsequent exercise, where the particular observation is made
general
Another example of what happens in A-removal when A € L(G).
‘This generalizes the previous two exercises. To prove it, show
that every nonempty string in L(G) can still be derived.
A solved exercise
We add A— y: ye|---. But none of the y: are empty, so that
no Aeproductions can be added.
Here we add no productions at all,
Addresses an issue that is pretty obvious to students, but which
some find difficult to justify rigorously. An argument, which is
at least plausible, can be made from the derivation tree. Since
the tree does not embody any order, the order of replacement
of variables cannot matter.
‘This rather trivial substitution can be justified by a straight.
forward argument. Show that every string derivable by the first
grammar is also derivable by the second, and vice versa,“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 31 — #31
Cuarrer 6 31
20: An important point: the modified process does not work correctly
Counterexample:
Ss abc|AB
Asac,
B will be recognized as useless, but not A.
Exercises 21 and 22 are intimidating in appearance, but really quite simple,
‘once the notation is understood. I elude them, because students sometimes
ask: “What is the simplest grammar for this language?” so it is useful 10
point out that “simplest” kas to be defined before such a question becomes
meaningful.
21: Looks harder than it is. Obviously, in removing useless variables we
introduce no new productions; therefore, the complexity decreases. For
the others, though, one can find simple examples where the complexity
is increased
22: Also simple, for example $ + ad, A + a does not have minimal
complexity. Serves to point out that just removing useless variables
does not simplify the grammar as far as possible
23: A difficult problem because the result is hard to see intuitively. The
proof can be done by showing that crucial sentential forms generated
by one grammar ean also he generated by the other one. The important
step in this is to show that if
AF Aap tjzi > Yetinjts
then with the modified grammar we can make the derivation
AS WZ 3 yen 25%
‘This is actually an important result, dealing with the removal of certain
left-recursive productions from the grammar and is needed if one were
to discuss the general algorithm for converting a grammar into Greibach,
normal form
24: A routine application illustrating the result of Exercise 23,
25: The arguments are similar to those in Exercise 23,“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 82 — #32
CHAPTER 6
6.2 Two Important Normal Forms
1: Straightforward even though a bit theoretical. By now the stu-
dents should have no trouble with this soxt of thing.
2 and S: Routine drills, involving an easy application of Theorem 6.7.
4 and 5: To apply the method deseribed in Theorem 6.7, we need to re-
move A-productions first. The vest is easy, but a little lengthy.
6: Am exercise that looks harder than it is. It can be solved with,
clementary arguments, Enumerate the productions generated in
each step. In the first, we introduce V, for all a © T and |T| new
productions V, — a. In the second, right sides of length & are
split into & — 1 separate rules. The stated result follows easily.
7: A trivial exercise, just to get students to work with the widely
used concept of a dependency graph.
8: A simple solved exercise leading to a normal forma for linear gram-
9 Another normal form, which can be obtained from Chomsky nor
mal form. Productions A» BC’ are permitted. For A> a, cre-
ate new variables Vi, Va and productions A aVsVa, Vi > 2,
Vy A.
10 to 13: These are relatively simple problems whose answers can be found
from first principles. They serve to make it plausible that conver-
sion to Gxeibach normal form is always possible.
4: No, since the result is a yegular grammar.
15: This solved problem is uot too hard, but its generalization in the
next exercise can be quite difficult.
16: This exercise shows an important extension of the material in
the text. It is for most a very difficult exercise, since if involves
2 good grasp of the use of substitutions, Start from Greibach
normal form, then make substitutions to reduce the number of
variables in the productions. We illustrate this step with
A> aBCD.
First introduce a new variable Vj and productions
Asay“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 33 — #33)
Cuarren 7 33
and
Vi > BoD.
‘The second production is not in correct form so we continue, introducing,
Vz and,
Vio BVs
and
Vz > CD.
After a while, all productions will be either in correct form or of the form
Vi > Boy
But the rules with B om the left ate either
Bob
Bow,
‘The final substitution then gives the right form. Complete arguments
can be found in some advanced treatments, for example, in Harrison
1078, This exercise may he suitable for an extraccredit assignment for
the better students
6.3 A Membership Algorithm for Context-Free Grammars
‘The material in this section is optional. ‘The CYK algorithm is difficult
to grasp and not particularly important for understanding the concepts in
this text. Stil, itis a good example of dynamic programming, and students
can benefit from studying it. The exercises in this section can be used to
complement what might be a very short discussion in class. Exercises 1 and.
2 are drill, xequiring no more than an understanding of the notation by
which the algorithm is described. Exercises 3 and 4 involve an extension
and actual implementation, For this, the student will have to have a good
understanding of the construetion.
Chapter 7
Pushdown Automata
7.1 Nondeterministic Pushdown Automata
ere we have a collection of problems of varying difficulty. Particularly in-
structive are the ones that force the student to think nondeterministically.“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 44 — #34
34 Carrer 7
‘This is really the first place where nondeterminism becomes an important is.
sue. For finite automata, a deterministic solution can always be constructed,
so nondeterminism sometimes appears as a technical trick. Here the situa-
tion is different. In problems such as Exercise 4(f), students often start by
‘trying to find a deterministic solution and will arrive at the nondeterminis-
tic approach only after a few failures, But once they discover the solution,
they begin to understand something about nondeterminism. An even more
convincing problem is Exercise 10 in Section 8.1
1: No need for state gy; substitute qo wherever q, occurs. The major diff-
culty is to argue convincingly that this works.
2: This problem is not too hard, but requires a little bit of analysis of the
rnondeterministic nature of the pda in Example 7.5. For this reason, it
is a helpful exercise. An argument goes somewhat like this: the switch
to state q; is done nondeterministically, and can be made any time. But
if it is not made in the middle of the string, the emergence of the stack
start symbol will not coincide with the end of the input. The only way
‘the stack can be cleared is to make the transition in the middle of the
Input string.
8: Because the languages involved are all regular, the npda’s can be com
structed as nfa’s with an inactive stack. One of these problems may be
useful in illustrating this point,
4: A sot of exercises in programming a pda. Most students will have some
‘experience in programming with stacks, so this is not too far removed
from their experience and consequently tends to be easy. Those problems,
such as (f) and (), that illustrate nondeterminism may be a little harder.
Part (a) is easy: put two tokens on stack for each a, remove one for each
8, Parts (b) and (c) are also easy. Part (@) is a little harder. Put a
token on stack for each a. Fach 6 will consume one until the stack start
symbol appears. At that time, switch state, so now b puts on tokens to
be consumed by e. In (e), use internal states to count three a’s. In ()
nondetermainise is the key; an @ can put one, two, or three tokens on the
stack. Parts (g), (h), and (i) are easy, following the approach suggested
in Example 7.3, Part (j) is similar, but uses nondeterminism,
2 Students who blindly follow the lead suggested by Example 7.2 will make
‘a number of subtle errors, but the approach is still useful. The best way
to proceed is to split the mpda initially into cases n> m and n < m.
‘The first can be dealt with by putting the pda into a final state with
8 (g2,4,1) = {(ay.A)}. In the second case, the stack start symabol will
appear before all the b's are read, and one must be careful to check the
rest of the string (otherwise, something like abba might be accepted).“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 35 — #35
Cuarrer 735
6: At Gist glance this looks like a simple problem: put win th
stack, then go into a inal trap state whenever a mismatch with
up is detected, But there may be trouble ifthe two substrings are
of mequal length, for exazeple, if w2 = wf, Taking eare of this
takes a good bit of thought. A dificult, but highly recommended
problem. A solution is below
7: This adds another twist to Exercise 6, since the part belonging
to L(a*) must not be put on stack. We can use nondeterminism
to docide where this part ends.
8: A lengthy, but not difficult problem, The reason for the lengthi-
ness is that internal states must he used to recognize substrings
such as ab as a unit. This substring puts a single symbol on the
stack, which is then consumed by the substring ba.
9 and 10: These are simple exercises that require an analysis of a given
pda,
LA: A simple tracing exercise
12: Any string will be accepted since there are no dead configura
tions,
18: Adds ) to the orginal language
14: Not an easy problem, but the solution is provided,
15: No change, Since the original machine accepts w, we have
(0,272) F (a2, .0) > (90,29)“46070 XXXX LinaIM” — 2011/1/14 — 15:44 — page 36 — #36
360 CHAPTER 7
But in the modified mackine
(40,22) F (42,90) = (40, 9,0) + (49,2)
16: Not a very dificult construction, but worthwhile. Tho tric is to 16
rember the extra symbols using internal states, for example
# (45.058) = {(a5,0e)}
is replaced by
5 (qi,4,b) = {(gje,de)}
8 edd) = {aye}
‘This result is newded in subsequent discussions
17: In many books this is treated as a major issue, We left it out of our
main discussion, but the result is useful. The construction is not hard
to discover. Whenever the pda goos into a final state, it can be put into
a stack-clearing mode, Conversely, if M sees an empty stack, it can be
ut into a final state. It’s a good exercise to have studonts make the
argument precise. This result is also needed later.
7.2 Pushdown Automata and Context-Free Languages
‘There are some difficult constructions in this section, particularly in Tho
orem 7.2. A few exercises will make the results plausible, which is all one
really needs.
1: Straightforward drill exercise
2: A relatively easy proof that just formalizes some obvious observa
tions,
8: The long way is to follow the construction in Theorem 7.1. The
short way is to notice that
1G) = {a"! 281 sm 2 0}
4 to 6: ‘The grammars should be converted to Greibach normal form, after
which the exercises are direct applications of the construction of
Theorem 7.1
7: This exercise requires that the student be able to put together two
observations made in the text. In Theorem 7.1 we constructed a
three-state pda for any context-free grammar. From Theorem 7.2
wwe get a context-free grammar for every npda, Putting the two
together proves the claim