Lecture Notes in AI
Artificial Intelligence
Dr V N Krishnachandran
Author’s name:
DR. V. N. KRISHNACHANDRAN
Published by author
Publisher’s address: Dr. V. N. Krishnachandran, Vidya Academy of Science
& Technology, Thalakkottukara, Thrissur -
680501, Kerala
First Edition
The book was typeset by the author using the LaTeX document preparation system.
Copyright © Author
Licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0)
License. You may not use this file except in compliance with the License. You may
obtain a copy of the License at
https://creativecommons.org/licenses/by/4.0/.
Price: Rs 0.00.
ISBN: · · · · · · · · ·
Preface
The book is written primarily for the students pursuing the MCA programme of the APJ
Abdul Kalam Technological University. The Curriculum for the programme offers a
course on artificial intelligence as an elective course in the Second Semester with code
and name “20MCA188 Artificial Intelligence”. The selection of topics in the book was
guided by the contents of the syllabus for the course and also by the contents in the books
cited in the syllabus. The book will also be useful to faculty members who teach the
course. The users of these notes are earnestly requested to bring to the notice of the
author any errors they find so that a corrected and revised edition can be published at the
earliest.
Contents
Preface i
Syllabus ix
5 Game playing 84
5.1 Games considered in AI . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.2 Game tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.3 Solving a game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.4 Formulation of a game as a search problem . . . . . . . . . . . . . . . . 88
5.5 The minimax value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.6 The minimax algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.6.1 The algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.6.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.7 Minimax with alpha-beta pruning . . . . . . . . . . . . . . . . . . . . . 93
5.7.1 Basic idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.7.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.7.3 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.7.4 Additional example . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.8 Sample questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.8.1 Definition..............................................................................................112
6.8.2 Frame structure.....................................................................................113
6.8.3 Examples..............................................................................................113
6.8.4 Frame languages...................................................................................115
6.9 Conceptual dependency.....................................................................................116
6.9.1 Simple examples of representations of sentences in CD theory...........116
6.9.2 The building blocks of conceptual dependency...................................118
6.9.3 Rules.....................................................................................................121
6.9.4 Advantages and disadvantages of the CD theory.................................123
6.10 Sample questions...............................................................................................123
9 Learning 172
9.1 What is learning?...............................................................................................172
9.2 Forms of learning based on feedback................................................................172
9.3 Forms of learning based on different ways of acquiring knowledge................173
9.3.1 Rote learning.........................................................................................173
9.3.2 Learning by taking advice....................................................................173
9.3.3 Learning in problem solving................................................................174
9.3.4 Inductive learning.................................................................................175
9.3.5 Explanation based learning...................................................................176
9.3.6 Discovery..............................................................................................176
9.3.7 Analogy................................................................................................177
9.4 Neural net learning............................................................................................178
9.4.1 Neurons and perceptrons......................................................................178
9.4.2 Neural networks....................................................................................179
9.4.3 Learning in neural networks.................................................................179
9.4.4 Applications..........................................................................................179
9.5 Genetic algorithms............................................................................................180
9.5.1 Outline..................................................................................................180
9.6 Sample questions...............................................................................................181
Index 210
Syllabus
20MCA188 Artificial Intelligence
Category: Elective    L: 3    T: 1    P: 0    Credit: 4
1. Preamble
This course introduces the techniques of Artificial Intelligence and analyses various meth-
ods of solving problems using it. The concept of expert system architecture and fuzzy
operations are introduced. This course serves as a prerequisite for many advanced
courses in Data Science areas.
2. Prerequisite
Mathematical foundations for computing, advanced data structures
3. Course outcomes
After the completion of the course the student will be able to:
5. Assessment pattern
6. Marks distribution
Attendance : 8 marks
Continuous Assessment Test (2 numbers) : 20 marks
Assignment/Quiz/Course project : 12 marks
2. List the problem formulations & production characteristics. (K1 & K2)
3. Solve various problems such as the 8-puzzle, cryptarithmetic, etc. (K3)
2. List and explain the knowledge representation methods in AI. (K1 & K2)
3. Describe the resolution graph in predicate and propositional logic. (K1 & K2)
1. Differentiate between Goal stack and Hierarchical planning in AI. (K1 & K2)
3. List out and explain various tools and languages in AI. (K1 & K2)
  S E N D
+ M O R E
M O N E Y
(10 × 3 = 30 marks)
Part B
11. Consider the water jug problem. You are given two jugs, a 4-gallon one and a
3-gallon one. Neither has any measuring markers on it. There is a pump that can
be used to fill the jugs with water. How can you get exactly 2 gallons of water
into the 4-gallon jug? State the production rules for the water jug problem. (6)
OR
OR
S A B C D
6 2 3 0 2
15. List and explain the knowledge representation methods in AI. (6)
OR
16. Explain how the alpha-beta algorithm works in pruning of branches, with an example. (6)
17. Explain the algorithm to convert WFF to clause with an example. (6)
OR
18. Explain Neural net and Genetic learning methods in AI. (6)
19. Illustrate architecture of an expert system and mention its features. (6)
OR
20. Solve the following using various fuzzy set operations: (6)
(5×6=30 Marks)
9. Syllabus
Module 1
In this introductory chapter, we shall see several definitions of artificial intelligence (the
terminology is commonly abbreviated as AI), shall have a look at the various stages in
the historical development of the discipline of artificial intelligence and shall have a
glimpse of the many fields in which this discipline has been successfully applied.
1
1.3. APPLICATIONS OF ARTIFICIAL INTELLIGENCE 2
6. The period when artificial intelligence began to adopt the scientific method (1987
- present)
This was the period when artificial intelligence came firmly under the scientific
method. To be accepted, hypotheses must be subjected to rigorous empirical
experiments, and the results must be analysed statistically for their significance.
It is now possible to replicate experiments by using shared repositories of test
data and code.
7. The period when very large data sets began to be available (2001 - present)
Throughout the 60-year history of computer science, the emphasis has been on the
algorithm as the main subject of study. For many problems, it makes more sense to
worry about the data and be less concerned about what algorithm to apply. This is
true because of the increasing availability of very large data sources: for example,
trillions of words of English and billions of images from the Web; or billions of
base pairs of genomic sequences.
1. Robotic vehicles
A self-driving car, also known as an autonomous vehicle (AV), driverless car, or
robo-car, is a vehicle that is capable of sensing its environment and moving safely
with little or no human input. Self-driving cars combine a variety of sensors to
perceive their surroundings. Advanced control systems interpret sensory
information to identify appropriate navigation paths, as well as obstacles and
relevant signage.
2. Speech recognition
Speech recognition is concerned with the recognition and translation of spoken lan-
guage into text by computers. It is also known as automatic speech recognition,
computer speech recognition or speech to text. Some speech recognition systems
require “training” where an individual speaker reads text or isolated vocabulary
into the system. The system analyses the person’s specific voice and uses it to fine-
tune the recognition of that person’s speech, resulting in increased accuracy.
3. Autonomous planning and scheduling
NASA’s Remote Agent program was the first on-board autonomous planning
program to control the scheduling of operations for a spacecraft. The Remote Agent
generated plans from high-level goals specified from the ground and monitored the
execution of those plans - detecting, diagnosing, and recovering from problems as
they occurred. The successor program MAPGEN planned the daily operations for
NASA’s Mars Exploration Rovers, and MEXAR2 did mission planning for the
European Space Agency’s Mars Express mission in 2008.
4. Game playing
IBM’s DEEP BLUE became the first computer program to defeat the world
champion in a chess match when it defeated Garry Kasparov in an exhibition match
in 1997.
5. Spam fighting
Each day, learning algorithms classify over a billion messages as spam, saving the
recipient from having to waste time deleting them.
6. Logistics planning
During the Persian Gulf crisis of 1991, U.S. forces deployed a Dynamic Analysis
and Replanning Tool (DART) to do automated logistics planning and scheduling
for transportation.
7. Robotics
The iRobot Corporation has sold over two million Roomba robotic vacuum cleaners
for home use. The company also deploys the more rugged PackBot to Iraq and
Afghanistan, where it is used to handle hazardous materials, clear explosives, and
identify the location of snipers.
8. Machine translation
There are computer programs that can automatically translate from, say, Arabic to
English. The program uses a statistical model built from examples of Arabic-to-English
translations and from examples of English text totalling two trillion words.
1. Agriculture
In agriculture, new artificial intelligence advancements show improvements in
crop yields and support research and development on growing crops. New
artificial intelligence methods can now predict the time it takes for a crop such as
a tomato to become ripe and ready for picking, thus increasing the efficiency of farming.
2. Cybersecurity
The more advanced of these solutions use artificial intelligence and natural
language processing (NLP) methods to automatically sort the data in networks into
high-risk and low-risk information. This enables security teams to focus on the
attacks that have the potential to do real harm to the organization, and not become
victims of attacks such as denial-of-service (DoS), malware and others.
3. Finance
Banks use artificial intelligence systems today to organize operations, maintain
bookkeeping, invest in stocks, and manage properties. Artificial intelligence can
react to changes overnight or when business is not taking place.
4. Healthcare
The primary aim of health-related artificial intelligence applications is to analyse
relationships between prevention or treatment techniques and patient outcomes.
Artificial intelligence programs are applied to practices such as diagnosis
processes, treatment protocol development, drug development, personalized
medicine, and patient monitoring and care. Artificial intelligence algorithms can
also be used to analyse large amounts of data through electronic health records for
disease prevention and diagnosis.
2. Describe some of the well known applications of artificial intelligence (at least five
applications).
Chapter 2
In this chapter we consider the formal definition of the concept of a “problem” and illustrate
the concept with several well known examples. We also discuss different characteristics
of a problem. The concept of a “production system” is also introduced in this chapter.
We begin the chapter with a very brief discussion of a “rational agent” because the
chapter is concerned with the behaviour of a special kind of rational agent known as a
“problem-solving agent”.
An agent is anything that can be viewed as perceiving (becoming aware of) its
environment through sensors and acting upon that environment through actuators
(or effectors) (see Fig 2.1).
Examples
• A human agent has eyes, ears, and other organs for sensors and hands, legs, vocal
tract, and so on for actuators.
An agent’s behaviour is described by the agent function that maps any given input
sequence to an action. The agent function for an artificial agent is implemented by an
agent program.
“A rational agent is an agent which, for each possible input sequence sensed
by the sensors, selects an action that maximises its expected performance
measure.”
Throughout the rest of these notes, by an “agent” we shall always mean a “rational
agent”. Artificial intelligence is sometimes defined as a study of rational agents. A
rational agent could be anything which makes decisions such as a person, firm, machine,
or software.
This chapter describes the behaviour of a special kind of rational agent called a
problem-solving agent. Our discussion of problem solving begins with precise
definitions of problems and their solutions and gives several examples to illustrate
these definitions.
Problem statement
“The problem consists of a 2 × 2 board with three numbered tiles and a blank space
(see Figure 2.3). A tile adjacent to the blank space can slide into the space. The objective
is to reach the goal position shown in Figure 2.4 by sliding the tiles one at a time from
the start position shown in Figure 2.3.”
[Figures 2.3 and 2.4 show the start and goal positions of the board. Figure 2.5 shows
all 24 possible states of the board, labelled s1, s2, . . . , s24.]
Actions
To move from one state to another state we apply some action. In this problem an action
can be thought of as moving the blank tile. Hence there are four possible actions that can
be applied to any state: “move blank tile up”, “move blank tile down”, “move blank tile
left” and “move blank tile right”. We may abbreviate these actions as “Up”, “Down”,
“Left” and “Right”. Note that not all actions are applicable to all states.
The set of actions applicable to a state s is denoted by ACTIONS(s). Thus, from
Figure 2.5, we can see that
Results of actions
The result of applying an action a to a state s is denoted by RESULT(s, a). For example,
we have
RESULT(s1, Up) = s6
RESULT(s19, Down) = s8
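As a concrete sketch, the two functions below implement ACTIONS and RESULT for the 2 × 2 board. The encoding of a state as a 4-tuple (row-major order, 0 for the blank) is ours, chosen for illustration; the labels s1, …, s24 of the figure are not reproduced here.

```python
# ACTIONS and RESULT for the 2 x 2 puzzle.
# A state is a tuple of 4 cells in row-major order, with 0 for the blank.

MOVES = {"Up": -2, "Down": 2, "Left": -1, "Right": 1}

def actions(state):
    """Return the actions applicable in `state` (moves of the blank tile)."""
    i = state.index(0)            # position of the blank: 0..3
    row, col = divmod(i, 2)
    legal = []
    if row > 0: legal.append("Up")
    if row < 1: legal.append("Down")
    if col > 0: legal.append("Left")
    if col < 1: legal.append("Right")
    return legal

def result(state, action):
    """Return the state obtained by applying `action` to `state`."""
    i = state.index(0)
    j = i + MOVES[action]
    cells = list(state)
    cells[i], cells[j] = cells[j], cells[i]   # slide the tile into the blank
    return tuple(cells)

start = (1, 2, 3, 0)              # blank in the bottom-right corner
print(actions(start))             # ['Up', 'Left']
print(result(start, "Up"))        # (1, 0, 3, 2)
```

Note that, as the text says, not all actions are applicable to all states: only two of the four moves are legal on a 2 × 2 board.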
State space
The initial state, the actions applicable to the initial states, the resulting states, the actions
applicable to the resulting states and the resulting successors, and so on, all together
constitute the state space for the problem. The state space can be represented in the form of a
tree¹. Figure 2.6 shows part of the state space assuming that the start state is as shown in
Figure 2.3.
[Figure 2.6: part of the state-space tree for the puzzle, with branches labelled by the
actions Left, Up, Right and Down.]
¹For a precise definition of a tree, see Section 3.1.
Solution
A solution is an action sequence that leads from the initial state to a goal state. It can be
easily verified that the following sequence of actions is a solution to the problem that we
are talking about:
Up → Left → Down → Right
The resulting states are shown in Figure 2.7.
Optimal solutions
We may assign a cost for executing each action in a problem. The costs depend on the
problem. In the present problem, let us assign a cost of 1 unit for each action. Then, in the
solution specified in Figure 2.7, the total cost of arriving at the goal state is 4 units. In
other words, the cost of the solution is 4 units. There may be solutions with lower costs.
The solution with the least cost is referred to as the optimal solution.
The reader may verify that the solution given in Figure 2.7 is indeed an optimal
solution for the problem under discussion.
2.2.2 Definition
A problem is defined formally by the following five components:
1. Initial state
The initial state that the agent starts in.
2. Actions
A description of the possible actions available to the agent. Given a particular state
s, ACTIONS(s) returns the set of actions that can be executed in s. We say that
each of these actions is applicable in s.
3. Transition model
A description of what each action does. The formal name for this is the transition
model, specified by a function RESULT(s, a) that returns the state that results from
doing action a in state s. We also use the term successor to refer to any state
reachable from a given state by a single action.
The initial state, the actions, and the transition model all taken together implicitly
define the state space of the problem – the set of all states reachable from the
initial state by any sequence of actions. The state space forms a directed network or
graph in which the nodes are states and the links between nodes are actions. A
path in the state space is a sequence of states connected by a sequence of actions.
4. Goal test
The goal test, which determines whether a given state is a goal state. Sometimes
there is an explicit set of possible goal states, and the test simply checks whether
the given state is one of them.
5. Path cost
A path cost function that assigns a numeric cost to each path. The problem-solving
agent chooses a cost function that reflects its own performance measure. We
assume that the cost of a path can be described as the sum of the costs of the
individual actions along the path.
A solution to a problem is an action sequence that leads from the initial state to a goal
state. The optimal solution is the solution having the lowest path cost among all solutions.
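The five components can be collected into a small Python class. This skeleton is an illustration of the structure, not code from the book; the method names mirror ACTIONS, RESULT, the goal test and the path cost of the text.

```python
# A sketch of the five-component problem definition.

class Problem:
    def __init__(self, initial, goals, step_cost=1):
        self.initial = initial          # 1. initial state
        self.goals = goals              # explicit set of goal states
        self.step_cost = step_cost      # cost of each individual action

    def actions(self, s):               # 2. actions applicable in s
        raise NotImplementedError

    def result(self, s, a):             # 3. transition model RESULT(s, a)
        raise NotImplementedError

    def goal_test(self, s):             # 4. goal test
        return s in self.goals

    def path_cost(self, path_actions):  # 5. sum of the individual action costs
        return self.step_cost * len(path_actions)
```

A concrete problem (such as the 8-puzzle below) is then obtained by supplying `actions` and `result`.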
Standard formulation
• States: The state is determined by both the agent location and the dirt locations.
The agent is in one of two locations, each of which might or might not contain dirt.
Thus, there are 2 × 2² = 8 possible world states.
• Actions: In this simple environment, each state has just three actions: Left, Right,
and Suck.
• Transition model: The actions have their expected effects, except that moving
Left in the leftmost square, moving Right in the rightmost square, and Sucking in a
clean square have no effect. The complete state space is shown in Figure 2.9.
• Goal test: This checks whether all the squares are clean.
• Path cost: Each step costs 1, so the path cost is the number of steps in the path.
Figure 2.9: The state space for the vacuum world. Links denote actions: L = Left, R =
Right, S = Suck.
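The vacuum-world transition model above is small enough to write out in full. In the sketch below a state is a triple (location, dirt in A, dirt in B); the encoding is ours, chosen to match the 2 × 2² = 8 states counted in the text.

```python
# The vacuum-world transition model.
# A state is (location, dirt_in_A, dirt_in_B) with location "A" (left) or "B" (right).

def result(state, action):
    loc, dirt_a, dirt_b = state
    if action == "Left":                   # always ends in A; no effect if already there
        return ("A", dirt_a, dirt_b)
    if action == "Right":                  # always ends in B; no effect if already there
        return ("B", dirt_a, dirt_b)
    if action == "Suck":                   # sucking in a clean square has no effect
        if loc == "A":
            return (loc, False, dirt_b)
        return (loc, dirt_a, False)
    raise ValueError(action)

def goal_test(state):
    return not state[1] and not state[2]   # all squares clean

s = ("A", True, True)
s = result(result(result(s, "Suck"), "Right"), "Suck")
print(s, goal_test(s))                     # ('B', False, False) True
```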
Description
The 8-puzzle consists of a 3 × 3 board with eight numbered tiles and a blank space. A tile
adjacent to the blank space can slide into the space. The objective is to reach a specified
goal state such as the one shown in Figure 2.10 by sliding the tiles one at a time, given
some initial or start state as in Figure 2.10
Standard formulation
• States: A state description specifies the location of each of the eight tiles and the
blank in one of the nine squares. (The total number of states is 9! = 362880.)
• Actions: Actions may be defined as movements of the blank space Left, Right, Up,
or Down. Different subsets of these are possible depending on where the blank is.
• Transition model: Given a state and action, this returns the resulting state; for
example, if we apply Left to the start state in Figure 2.10, the resulting state has
the 5 and the blank cell switched.
• Goal test: This checks whether the state matches the goal configuration shown in
Figure 2.10.
• Path cost: Each step costs 1, so the path cost is the number of steps in the path.
Description
Three missionaries and three cannibals2 are on one side of a river, along with a boat that can
hold one or two people. Find a way to get everyone to the other side without ever leaving
a group of missionaries in one place outnumbered by the cannibals in that place. (See
Figure 2.11.)
Standard formulation
Ignore all irrelevant parts of the problem (e.g., weather conditions, possible presence of
crocodiles in the river, etc.).
legal states. For example, the triple (2, 3, 1) is not allowed because it represents a
state in which the cannibals outnumber the missionaries on one of the banks.
Figure 2.12: The state (0, 1, 0) of the missionaries and cannibals problem
• Actions: Take two missionaries, two cannibals, or one of each across in the boat.
We have to take care to avoid illegal states.
• Transition model: Given a state and action, this returns the resulting state; for
example, if we apply the action “Take 1 cannibal” to the state (0, 2, 0) the
resulting state is (0, 3, 1).
Solution
Figure 2.13 shows the possible legal states reachable from the initial state applying only
legal actions. In the figure, the notation “−2c” indicates the operation of taking 2 cannibals
from the bank where the boat is currently located to the other bank, and the notation “−2m”
indicates taking two missionaries. From the figure we can easily get the solution to the
problem as follows:
(3, 3, 1) −2c→ (3, 1, 0) −1c→ (3, 2, 1) −2c→ (3, 0, 0) −1c→ (3, 1, 1) −2m→ (1, 1, 0) −1m,1c→
(2, 2, 1) −2m→ (0, 2, 0) −1c→ (0, 3, 1) −2c→ (0, 1, 0) −1c→ (0, 2, 1) −2c→ (0, 0, 0)
Figure 2.13: The state space of the missionaries and cannibals problem (with triangles
representing missionaries and circles representing cannibals).
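The legal-state space of Figure 2.13 is small enough to search mechanically. The sketch below runs breadth-first search over states (missionaries, cannibals, boat) on the start bank, using the same encoding as the text; breadth-first search is one possible strategy here, not the book's prescribed method.

```python
from collections import deque

def legal(m, c):
    # on each bank, cannibals must not outnumber missionaries
    # (unless there are no missionaries on that bank)
    return (m == 0 or m >= c) and (3 - m == 0 or 3 - m >= 3 - c)

def successors(state):
    m, c, b = state
    for dm, dc in [(2, 0), (0, 2), (1, 1), (1, 0), (0, 1)]:
        if b == 1:                       # boat on the start bank: people leave
            nm, nc = m - dm, c - dc
        else:                            # boat on the far bank: people return
            nm, nc = m + dm, c + dc
        if 0 <= nm <= 3 and 0 <= nc <= 3 and legal(nm, nc):
            yield (nm, nc, 1 - b)

def solve(start=(3, 3, 1), goal=(0, 0, 0)):
    frontier, parent = deque([start]), {start: None}
    while frontier:
        s = frontier.popleft()
        if s == goal:                    # reconstruct the path of states
            path = []
            while s is not None:
                path.append(s); s = parent[s]
            return path[::-1]
        for t in successors(s):
            if t not in parent:
                parent[t] = s
                frontier.append(t)

path = solve()
print(len(path) - 1)                     # 11 crossings, as in the solution above
```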
2.3. EXAMPLE PROBLEMS 15
Description
A cryptarithmetic puzzle is a mathematical exercise where the digits of some numbers are
represented by letters. Each letter represents a unique digit. It will be assumed that the
leading digits of numbers are not zero. The problem is to find the digits such that a given
mathematical equation is satisfied.
Illustrative example
Find the digits to be assigned to the letters such that the following equation is true (the
digits assigned to T and F should not be 0):
  T W O
+ T W O
F O U R
Solution
With some trial and error we can get the following solution:
F = 1, O = 4, R = 8, T = 7, U = 6, W = 3.
  7 3 4
+ 7 3 4
1 4 6 8
Another solution is F = 1, O = 6, R = 2, T = 8, U = 9, W = 4:
  8 4 6
+ 8 4 6
1 6 9 2
Problem formulation
Using the idea of states the problem can be formulated as follows:
• States: Let there be n different letters where n ≤ 10. A problem state is an ordered
n-tuple of digits (d1, d2, . . . , dn) representing the digits to be assigned to the letters.
• Initial state: The initial state can be considered as the ordered n-tuple all of whose
elements are 0’s.
• Transition model: Given a state and action, this returns the resulting state.
• Goal test: Check whether the state (d1, d2, . . . , dn) satisfies the constraints
of the problem.
Solution
There are algorithms for solving cryptarithmetic puzzles. Implementing these
algorithms by hand computation is difficult. The method of trial and error, keeping in
mind the constraints to be satisfied, may be applied to obtain solutions. We illustrate
the method by solving some simple cryptarithmetic problems.
Example problem 1
Solve the following cryptarithmetic puzzle:
T O
+ G O
O U T
• Since O is a non-zero carry digit arising when the two digits T and G are added, we must
have O = 1.
• From the units column, O + O = T, so T = 2 (with no carry, since 1 + 1 < 10).
• From the tens column, T + G = 10 · O + U, i.e. 2 + G = 10 + U, so G = U + 8. Since
U = 1 is already taken by O, we get U = 0 and G = 8.
Thus T = 2, O = 1, G = 8, U = 0.
2 1
+ 8 1
1 0 2
Example problem 2
Solve the following cryptarithmetic puzzle:
S E N D
+ M O R E
M O N E Y
Solution
1. We number the columns from right to left as follows:
Col: 5 4 3 2 1
       S E N D
     + M O R E
     M O N E Y
2. From column 5, we must have M = 1 since it is the only carry possible from the
sum of two single-digit numbers.
Col: 5 4 3 2 1
       S E N D
     + 1 O R E
     1 O N E Y
3. In column 4, S + 1 plus a possible carry from column 3 must produce a carry into
column 5, so S + 1 + carry = O + 10. Since S ≤ 9, O can only be 0 or 1; the digit 1
is already taken by M, so O = 0.
Col: 5 4 3 2 1
       S E N D
     + 1 0 R E
     1 0 N E Y
4. A carry out of column 3 would force N = 0 = O, which is impossible; so there is
no carry out of column 3, and column 4 gives S + 1 = 10, that is, S = 9.
Col: 5 4 3 2 1
       9 E N D
     + 1 0 R E
     1 0 N E Y
5. In column 3, N = E + (carry from column 2); since N ≠ E, the carry is 1 and
N = E + 1. Column 2 then gives
N + R + carry = 10 + E,  i.e.  E + 1 + R + carry = 10 + E,
so R + carry = 9. Since 9 is already taken by S, we must have R = 8 with a carry
of 1 coming from column 1.
Col: 5 4 3 2 1
       9 E N D
     + 1 0 8 E
     1 0 N E Y
6. The carry from column 1 means D + E = 10 + Y. Since 0 and 1 are taken, Y ≥ 2
and hence D + E ≥ 12. With N = E + 1 and the remaining digits 2, 3, 4, 5, 6, 7,
the only consistent choice is E = 5, N = 6, D = 7:
Col: 5 4 3 2 1
       9 5 6 D
     + 1 0 8 5
     1 0 6 5 Y
7. Then D + E = 7 + 5 = 12 = 10 + Y gives Y = 2:
Col: 5 4 3 2 1
       9 5 6 7
     + 1 0 8 5
     1 0 6 5 2
8. Hence D = 7, E = 5, M = 1, N = 6, O = 0, R = 8, S = 9, Y = 2, that is,
9567 + 1085 = 10652.
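The hand derivation above can be checked by brute force: try every assignment of distinct digits to the eight letters. The sketch below fixes M = 1 (forced by column 5, as shown in step 2) to shrink the search; the solver itself is our illustration, not an algorithm from the book.

```python
from itertools import permutations

# Brute-force check of SEND + MORE = MONEY.
def solve():
    others = "SENDORY"                 # M = 1 is forced by the carry into column 5
    for digits in permutations([d for d in range(10) if d != 1], len(others)):
        a = dict(zip(others, digits))
        a["M"] = 1
        if a["S"] == 0:                # leading digit must be non-zero
            continue
        def num(word):
            return int("".join(str(a[ch]) for ch in word))
        if num("SEND") + num("MORE") == num("MONEY"):
            return a

print(sorted(solve().items()))
# confirms D = 7, E = 5, M = 1, N = 6, O = 0, R = 8, S = 9, Y = 2
```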
Description
There is a table on which some uniform blocks (cubes) are placed. Some blocks may or may
not be stacked on other blocks. We have a robot arm to pick up or put down the blocks.
The robot arm can move only one block at a time, and no other block should be stacked
on top of the block which is to be moved by the robot arm. The problem is to change the
configuration of the blocks from a given initial state to a given goal state (see, for
example, Figure 2.14).
Problem formulation
The problem can be formulated as follows:
• Actions:
UNSTACK(A,B) : pick up clear block A from block B
and hold it in the arm
STACK(A,B) : place block A held in the arm onto clear
block B
PICKUP(A) : lift clear block A with the empty arm
PUTDOWN(A) : place block A held in the arm onto a
free space on the table.
• Goal state: A certain configuration of the blocks.
Solution
The blocks world problem always has a trivial solution: all blocks not already correctly
positioned for the goal state are set off onto the table (one at a time with the mechanical
arm), and then reassembled in the proper order on top of any blocks already correctly
positioned.
To illustrate the idea consider the blocks world problem specified in Figure 2.14. A
solution consists of the following 12 moves in that order:
1. UNSTACK(2, 3)
2. PUTDOWN(2)
3. UNSTACK(4, 5)
4. PUTDOWN(4)
5. UNSTACK(5, 6)
6. STACK(5, 4)
7. PICKUP(2)
8. STACK(2, 5)
9. PICKUP(3)
10. STACK(3, 1)
11. PICKUP(6)
12. STACK(6, 3)
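The 12 moves can be replayed on a simple stack representation. The initial configuration used below (2 on 3; 4 on 5 on 6; 1 alone on the table) is inferred from the moves themselves; the actual Figure 2.14 may draw it differently.

```python
# Replaying the 12 blocks-world moves on a list-of-stacks representation.
state = [[3, 2], [6, 5, 4], [1]]    # each inner list is a stack, bottom first
arm = []                            # the robot arm holds at most one block

def find(block):
    """Return the stack whose (clear) top is `block`."""
    return next(s for s in state if s and s[-1] == block)

def unstack(a, b):
    s = find(a)
    assert len(s) >= 2 and s[-2] == b   # a must be directly on b
    arm.append(s.pop())

def putdown(a):
    assert arm == [a]
    state.append([arm.pop()])           # start a new stack on the table

def pickup(a):
    s = find(a)
    assert len(s) == 1                  # a must be alone on the table
    arm.append(s.pop())
    state.remove(s)

def stack(a, b):
    assert arm == [a]
    find(b).append(arm.pop())           # put a on the clear block b

moves = [(unstack, 2, 3), (putdown, 2), (unstack, 4, 5), (putdown, 4),
         (unstack, 5, 6), (stack, 5, 4), (pickup, 2), (stack, 2, 5),
         (pickup, 3), (stack, 3, 1), (pickup, 6), (stack, 6, 3)]
for op, *args in moves:
    op(*args)
print(sorted(state))                    # [[1, 3, 6], [4, 5, 2]]
```

The final stacks (5 on 4 with 2 on top; 3 on 1 with 6 on top) are then the goal configuration reached by the 12 moves.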
Description
A monkey is in a room. A bunch of bananas is hanging from the ceiling. The monkey
cannot reach the bananas directly. There is a box in the room. The ceiling is just the right
height so that a monkey standing on the box can get hold of the bananas. The monkey
knows how to move around and carry things around. How can the monkey get the
bananas? What is the best sequence of actions for the monkey? (See Figure 2.15.)
Remark
The purpose of the problem is to raise the question: Are monkeys intelligent? Both
humans and monkeys have the ability to use mental maps to remember things like where
to go to find shelter, or how to avoid danger. They can also remember where to go to
gather food and water, as well as how to communicate with each other. Monkeys have the
ability not only to remember how to hunt and gather but to learn new things, as is the
case with the monkey and the bananas: despite the fact that the monkey may never
have been in an identical
situation, with the same artifacts at hand, a monkey is capable of concluding that it needs
to make a ladder, position it below the bananas, and climb up to reach for them.
Problem formulation
Let the initial location of the monkey be A, that of the bananas be B and that of the box
be C. Initially, the monkey and the box have height “low”; but when the monkey climbs
on the box, it will have height “high”.
• States: Each state is represented by the locations and heights of the monkey, the
bananas and the box.
• Initial state:
Location(monkey, A), Location(bananas, B), Location(box, C),
Position(monkey, low), Position(bananas, high), Position(box, low)
• Transition model: Given a state and action, this returns the resulting state.
Solution
Step 1. GO(A, C)
Step 2. PUSH(box, C, B)
Step 3. CLIMBUP(box, B)
Step 4. GRASP(bananas)
Step 5. CLIMBDOWN(box, B)
Step 6. GO(B, A)
Problem formulation
Using the idea of states the problem can be formulated as follows:
• States: Each city in the map represents a state.
• Initial state: The initial state is the city of Arad.
• Actions: Travel to a connected nearby city.
• Transition model: Given a state and action, this returns the resulting state.
• Goal test: The city of Bucharest is reached.
• Path cost: The total distance traveled in reaching the goal state.
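The formulation can be sketched as a weighted graph. The road distances below follow the usual Romania map of Russell and Norvig (only a handful of roads are included, purely for illustration).

```python
# A fragment of the Romania road map as a weighted graph.
roads = {
    ("Arad", "Sibiu"): 140, ("Arad", "Zerind"): 75,
    ("Sibiu", "Fagaras"): 99, ("Sibiu", "Rimnicu Vilcea"): 80,
    ("Rimnicu Vilcea", "Pitesti"): 97, ("Pitesti", "Bucharest"): 101,
    ("Fagaras", "Bucharest"): 211,
}

def actions(city):
    """Cities reachable from `city` by a single road."""
    return sorted({b for a, b in roads if a == city} |
                  {a for a, b in roads if b == city})

def path_cost(path):
    """Total distance along a sequence of cities."""
    return sum(roads.get((a, b), roads.get((b, a)))
               for a, b in zip(path, path[1:]))

print(path_cost(["Arad", "Sibiu", "Fagaras", "Bucharest"]))                      # 450
print(path_cost(["Arad", "Sibiu", "Rimnicu Vilcea", "Pitesti", "Bucharest"]))    # 418
```

The second route is longer in links but shorter in distance, which is why the path cost, not the number of steps, drives the choice of solution here.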
Problem formulation
• States: Let x denote the number of liters of water in the 4-liter jug and y the
number of liters of water in the 3-liter jug. Now x = 0, 1, 2, 3, or 4 and y = 0, 1,
2, or 3. The ordered pair (x, y) represents a state.
• Actions: Each action is represented in the form “(x, y) → (u, v)” where (x, y)
represents the state before the application of the action and (u, v) represents the
state after the application of the action. The possible actions are given in Table 2.1.
The table also specifies the conditions under which the various actions can be
applied.
• Transition model: The production system given in Table 2.1 also specifies the
tran- sition model.
Solution
One solution to the water jug problem is given below:
(0, 0) −Rule 2→ (0, 3) −Rule 7→ (3, 0) −Rule 2→ (3, 3) −Rule 5→ (4, 2) −Rule 3→ (0, 2) −Rule 7→ (2, 0).
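The production system can also be searched mechanically. The sketch below covers the usual operations (fill, empty, pour); the rule numbering in Table 2.1 may differ, and breadth-first search is our choice of control strategy, not the book's.

```python
from collections import deque

CAP_X, CAP_Y = 4, 3     # capacities of the two jugs

def successors(state):
    x, y = state
    moves = [
        (CAP_X, y),                                        # fill the 4-gallon jug
        (x, CAP_Y),                                        # fill the 3-gallon jug
        (0, y),                                            # empty the 4-gallon jug
        (x, 0),                                            # empty the 3-gallon jug
        (min(x + y, CAP_X), y - (min(x + y, CAP_X) - x)),  # pour 3 -> 4
        (x - (min(x + y, CAP_Y) - y), min(x + y, CAP_Y)),  # pour 4 -> 3
    ]
    return [m for m in moves if m != state]

def solve(start=(0, 0), target=2):
    frontier, parent = deque([start]), {start: None}
    while frontier:
        s = frontier.popleft()
        if s[0] == target:               # 2 gallons in the 4-gallon jug
            path = []
            while s is not None:
                path.append(s); s = parent[s]
            return path[::-1]
        for t in successors(s):
            if t not in parent:
                parent[t] = s
                frontier.append(t)

print(solve())       # a shortest solution: 6 rule applications
```

Breadth-first search returns a shortest solution of 6 rule applications; it may follow a different route than the one shown above, since the problem has more than one 6-step solution.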
4. There are obvious good solutions without comparisons to all other possible solutions.
We show that this problem is decomposable into smaller subproblems (see Figure 2.18).
Since

∫(x² + sin²x) dx = ∫x² dx + ∫sin²x dx,

the problem can be decomposed into two simpler subproblems.

• Problem (i): Evaluate ∫x² dx.
• Problem (ii): Evaluate ∫sin²x dx.

[Figure 2.18: the decomposition tree for ∫(x² + sin²x) dx, with subproblems
∫x² dx (solved by x³/3) and ∫sin²x dx; the latter is rewritten as ∫(1/2)(1 − cos 2x) dx
and split into ∫(1/2) dx and −∫(1/2) cos 2x dx, solved by x/2 and −(1/4) sin 2x.]

Since

∫sin²x dx = ∫(1/2)(1 − cos 2x) dx = ∫(1/2) dx − ∫(1/2) cos 2x dx,

Problem (ii) can be divided into two subproblems:

• Problem (iii): Evaluate ∫(1/2) dx.
• Problem (iv): Evaluate −∫(1/2) cos 2x dx.

These subproblems can be easily solved:

∫(1/2) dx = x/2,   −∫(1/2) cos 2x dx = −(1/4) sin 2x.

Combining the solutions of the subproblems we get a solution of the given problem.
Example 2
Consider a simple blocks world problem specified by Figure 2.19.
PUTDOWN(C)
PICKUP(B)
STACK(B,A)
STACK(C,B)
Since the steps are interdependent, the problem cannot be decomposed into two
independent problems.
Example 2
Consider the 8-puzzle. The solution involves a sequence of moves. In the process of
finding a solution, after some moves, we may realize that a certain previous move has to
be reversed. The previous move can be undone by backtracking. In this example, the
solution steps can be undone.
Example 3
Consider the problem of playing chess. If a chess playing programme makes a wrong
move, the rules of the game do not permit undoing the move. In this example, the
solution steps cannot be undone or ignored.
Example 2
In a game like bridge, this is not possible because a player does not know where the
various cards are and what the other players will do on their turns. In this problem, the
universe is unpredictable.
2.5.4 There are obvious good solutions without comparison to all other possible solutions.
Example 1
Consider a mathematical problem. In general, there may be many methods for solving the
problem. Any method is a good method without comparison to other methods provided it
solves the problem. In general, any “any-path problem” is an example of a problem
having obvious good solutions without comparison to other possible solutions.
Example 2
A “best-path problem” is a problem having no obvious good solutions. The travelling
salesman problem is an example of a best-path problem. It can be formulated as follows:
“Given a list of cities and the distances between each pair of cities, what is the
shortest possible route that visits each city exactly once and returns to the origin
city?”
Example 2
In the cryptarithmetic puzzle, the solution to the problem is a state of the problem,
namely, the state representing the assignment of digits to the letters in the puzzle.
Example 2
Consider the problem of scanning daily newspapers to decide which are supporting and
which are opposing a political party in an upcoming election. It is obvious that a great
deal of knowledge is required to solve this problem.
2.6 Production systems
2.6.1 Definition
A production system (also called production rule system) consists of the following:
1. Set of rules
This is the most important component of a production system. Briefly, a production
rule is a rule of the form “C → A” which may be interpreted as “given condition C,
take action A”. Thus, each rule in the set of rules consists of a left side (a pattern)
that determines the applicability of the rule and a right side that describes the
operation to be performed if the rule is applied.
2. Knowledge databases
The databases contain whatever information is appropriate for the particular task.
Some parts of the database may be permanent.
The database includes a working memory whose elements may pertain only to the
solution of the current problem. Working memory contains a description of the
current state of the world in the problem-solving process. The description is
matched against the conditions of the production rules. When a condition matches,
its action is performed. Actions are designed to alter the contents of working
memory.
3. Control strategy
The control strategy includes a specification of the order in which the rules are to
be applied and a conflict resolution strategy.
• Backward chaining
In backward chaining, the execution of rules starts from the goals and
conclusions and works backwards to see if there is any data from which
the goals and conclusions can be derived (see Figure 2.21).
(b) Conflict resolution strategy
A system with a large number of rules and facts may result in many rules
being true for the same fact; these rules are said to be in conflict. The
conflict resolution strategy is a way of resolving the conflicts that manages
the execution order of these conflicting rules. Some of the commonly used
conflict resolution strategies are listed below:
• Choose arbitrarily.
• Choose the lowest numbered rule.
• Choose the rule containing the largest number of conditions.
• Choose the least recently used rule.
• Choose a rule where the condition is a new fact.
• Choose the rule with the highest priority (weight).
4. Interpreter
The interpreter repeats the following operations: all rules whose conditions are
satisfied are found (rule matching), one of them is selected (conflict resolution),
and its action is called (rule firing).
The interpreter (also referred to as the rule applier) can be viewed as a “select-execute
loop”, in which one rule applicable to the current state of the database is chosen,
and then executed. Its action results in a modified database, and the select phase
begins again. This cycle is also referred to as a “recognize-act cycle”.
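The recognize-act cycle can be written down in a few lines of code. The following Python sketch is purely illustrative (the rule set and the facts "wet", "dry" and "warm" are invented for this example, not taken from the text); it uses the simple "lowest numbered rule" conflict resolution strategy listed above.

```python
# A minimal production system interpreter: the recognize-act cycle.
# Rules are (condition, action) pairs over a working memory of facts.

def run(rules, working_memory, goal, max_cycles=100):
    """Repeat: match rules, resolve conflicts, fire one rule."""
    for _ in range(max_cycles):
        if goal(working_memory):
            return working_memory
        # Rule matching: collect all rules whose condition is satisfied.
        conflict_set = [r for r in rules if r[0](working_memory)]
        if not conflict_set:
            return None                        # no applicable rule: halt
        # Conflict resolution: choose the lowest numbered rule.
        condition, action = conflict_set[0]
        working_memory = action(working_memory)    # rule firing
    return None

# Toy example: turn the fact "wet" into "dry", then add "warm".
rules = [
    (lambda wm: "wet" in wm, lambda wm: (wm - {"wet"}) | {"dry"}),
    (lambda wm: "dry" in wm, lambda wm: wm | {"warm"}),
]
result = run(rules, {"wet"}, goal=lambda wm: "warm" in wm)
print(sorted(result))   # ['dry', 'warm']
```

Note how the three components show up in the loop: the rule set is `rules`, the working memory is the set of facts, and the control strategy is the one-line conflict resolution.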
Problem statement
Using the production system model, determine whether a path exists from square 1 to
square 2 in the 3 × 3 knight’s tour problem (see Figure 2.22).
1 2 3
4 5 6
7 8 9
Solution
Step 1. Problem formulation as a production system
(i) Knowledge database
Since there is no information other than that contained in the statement of the
problem, the database contains only the working memory. The working memory
contains the following two facts:
Table 2.2: The set of production rules for the 3 × 3 knight’s tour problem
(iv) Interpreter
The interpreter is the select-execute loop.
Table 2.3: Sequence of the applications of the production rules in the 3 × 3 knight’s tour
problem
Step 3. Conclusion
There exists a path from the initial state “knight on square 1” to the goal state “knight on
square 2”.
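The conclusion can be checked mechanically. In the sketch below (an illustration, not the text's production system: the move table encodes the legal knight moves on the 3 × 3 board that the rules of Table 2.2 represent, and a breadth-first control strategy is assumed), a select-execute loop fires move rules until square 2 is reached.

```python
from collections import deque

# Legal knight moves on the 3x3 board (square 5 has no moves) --
# these pairs play the role of the production rules in Table 2.2.
MOVES = {1: [6, 8], 2: [7, 9], 3: [4, 8], 4: [3, 9],
         5: [], 6: [1, 7], 7: [2, 6], 8: [1, 3], 9: [2, 4]}

def find_path(start, goal):
    """Select-execute loop: fire move rules until the goal square is reached."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in MOVES[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

print(find_path(1, 2))   # [1, 6, 7, 2]
```

The returned sequence 1 → 6 → 7 → 2 is one path witnessing the conclusion; note that square 5 is unreachable by any knight move on this board.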
Problem statement
We have two jugs of capacity 4 liters and 3 liters, and a tap with an endless supply of
water. The objective is to obtain exactly 2 liters of water in the 4-liter jug in the
minimum number of steps.
Solution
Step 1. Problem formulation as a production system
As in section 2.3.8, we represent a state of the problem by an ordered pair (x, y) of
integers where x denotes the number of liters of water in the 4-liter jug and y the number
of liters of water in the 3-liter jug.
1. Knowledge database
Since there is no information other than that contained in the statement of the
problem, the database contains only the working memory. The working memory
contains
2. Production rules
The set of production rules is given in Table 2.4. This is identical with Table 2.1
except for the differences in the column titles.
4. Interpreter
The interpreter is the select-execute loop.
Table 2.5: Sequence of the applications of the production rules to reach the goal state in
the water jug problem
Let us examine the working of the conflict resolution strategy in this example.
Consider Step No. 6 in Table 2.5. The current state is (3, 3). If we examine the rules, it
can be seen that this state satisfies the conditions of the Rules 1, 3, 4, 5, and 6 and hence
the conflict set is {Rule 1, Rule 3, Rule 4, Rule 5, Rule 6}. The conflict resolution
strategy we have adopted is “select and fire the first rule that does not lead to a repeated
state”. If we select Rule 1 and fire it, the resulting state is (4, 3) and we have obtained this
state in Step No. 3. Similarly if we select Rule 3 and fire it the resulting state is (0, 3)
which we have already obtained in Step No. 4. Proceeding like this, we see that the
first rule in the conflict set that does not lead to a repeated state is Rule 5, and so
we select it and fire it to obtain the non-repeated state (4, 2).
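This conflict resolution strategy can be sketched in code. The rule numbering below is reconstructed from the worked example (Table 2.1 itself is not reproduced in this section, so treat the exact conditions as assumptions); the control loop, however, is exactly "fire the first rule that does not lead to a repeated state".

```python
# Water jug production rules; x = litres in the 4-litre jug, y = in the 3-litre jug.
RULES = [
    (1, lambda x, y: x < 4,                lambda x, y: (4, y)),            # fill 4L jug
    (2, lambda x, y: y < 3,                lambda x, y: (x, 3)),            # fill 3L jug
    (3, lambda x, y: x > 0,                lambda x, y: (0, y)),            # empty 4L jug
    (4, lambda x, y: y > 0,                lambda x, y: (x, 0)),            # empty 3L jug
    (5, lambda x, y: x + y >= 4 and y > 0, lambda x, y: (4, y - (4 - x))),  # pour 3L -> 4L until full
    (6, lambda x, y: x + y >= 3 and x > 0, lambda x, y: (x - (3 - y), 3)),  # pour 4L -> 3L until full
    (7, lambda x, y: x + y <= 4 and y > 0, lambda x, y: (x + y, 0)),        # pour all of 3L into 4L
    (8, lambda x, y: x + y <= 3 and x > 0, lambda x, y: (0, x + y)),        # pour all of 4L into 3L
]

def solve(start=(0, 0), goal_x=2):
    """Fire the first applicable rule that does not lead to a repeated state."""
    state, seen, trace = start, {start}, [start]
    while state[0] != goal_x:
        for _, condition, action in RULES:
            if not condition(*state):
                continue
            new_state = action(*state)
            if new_state not in seen:        # conflict resolution
                state = new_state
                seen.add(new_state)
                trace.append(new_state)
                break
        else:
            return None   # every applicable rule repeats a state
    return trace

print(solve())
# [(0, 0), (4, 0), (4, 3), (0, 3), (3, 0), (3, 3), (4, 2), (0, 2), (2, 0)]
```

The trace reproduces the behaviour discussed above: state (4, 3) appears at the third step, (0, 3) at the fourth, and at (3, 3) the first non-repeating rule is Rule 5, giving (4, 2).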
3. Conclusion
The operations specified in Table 2.5 specify a method of getting 2 liters of water in the
4-liter jug. However, this need not necessarily be the optimum solution: there may be a
solution with fewer operations.
Remarks
1. It can be shown that, depending on how the operators are chosen, the 8-puzzle and
the blocks world problem are partially commutative systems.
2. Production systems that are not partially commutative are useful in domains in which
irreversible changes occur. For example, the process to produce a desired chemical
compound may involve many irreversible steps.
3. Commutative production systems are useful for solving ignorable problems like the
problem of proving a theorem in mathematics.
Table 2.6 summarises the characteristics of some of the well known production systems.
                              Monotonic            Nonmonotonic
Partially commutative         Theorem proving      Robot navigation
Not partially commutative     Chemical synthesis   Bridge
2. The system is highly modular because individual rules can be added, removed or
modified independently.
4. The system uses pattern-directed control which is more flexible than algorithmic
control.
Disadvantages
The following are some of the disadvantages.
1. There is a lot of inefficiency in production systems. For example, there may be
situations where multiple rules get activated during execution and each of these
rules may trigger exhaustive searches.
2. It is very difficult to analyze the flow of control within a production system.
3. There is an absence of learning: a rule-based production system does not store the
results of solved problems for future use.
4. The rules in the production system should not have any type of conflict operations.
When a new rule is added to the database it should ensure that it does not have any
conflict with any existing rule.
Which of the above rules will be put into a conflict set by the system if the working
memory contains two facts: green, blinking? Apply some conflict resolution
strategy to fire a rule.
A B C
(b) E A T
+ T H A T
A P P L E
(c) F O R T Y
T E N
+ T E N
S I X T Y
(d) C R O S S
+ R O A D S
D A N G E R
Search algorithms are one of the most important areas of artificial intelligence. Search is an
important component of problem solving in artificial intelligence and, more generally, in
computer science, engineering and operations research. Combinatorial optimization,
decision analysis, game playing, learning, planning, pattern recognition, robotics and
theorem proving are some of the areas in which search algorithms play a key role.
In this chapter we consider a certain class of search algorithms called blind search
algorithms.
3.1 Preliminaries
In describing the various strategies discussed in this and the next chapter, we make
extensive use of the mathematical notions of graphs and trees. In this short section we
quickly review the necessary concepts.
3.1.1 Graphs
A graph G is an ordered pair (V, E) where the elements of V are called vertices or nodes
and the elements of E are called edges or sides. Each element of E can be thought of as an
arc joining two vertices in V. Sometimes the arcs may have specific directions, in which
case the graph is called a directed graph. There may be multiple edges joining two given
vertices; if there is more than one edge joining two vertices, the graph is called a
multigraph. There may also be loops in graphs, a loop being an edge joining a vertex
with itself. In the sequel, unless otherwise explicitly stated, by “graph” we shall always
mean an undirected graph without loops and without multiple edges.
Graphs can be pictorially represented as in Figure 3.1. The set of vertices of the graph
in the figure is V = {v1, v2, v3, v4, v5} and the set of edges is E = {e1, e2, e3, e4, e5, e6,
e7}.
3.1.2 Trees
A graph is said to be connected if there is a path along the edges joining any two vertices in
the graph. A path which returns to the starting vertex is called a cycle. A tree is a
connected graph without cycles.
Figure 3.1: Pictorial representation of a graph with vertices v1, . . . , v5 and edges e1, . . . , e7
Figure 3.2: Examples of graphs and trees
Figure 3.3: A rooted tree
Examples
Rooted trees
In search strategies, we generally consider rooted trees where one vertex is designated as
the root and all edges are directed away from the root. In pictorial representations of
rooted trees, the root vertex is positioned near the top of the page and the remaining
vertices are positioned below the root vertex at different levels (see, for example, Figure
3.3).
slide into the space. The objective is to reach the goal shown in Figure 2.4 by sliding the
tiles one at a time from start as in Figure 2.4.”
We have also seen that the solution to the problem is the following sequence of actions:
This feature, that the solution is a sequence of actions, is true in general. Assuming
that the environment has certain properties1, the solution to any problem (of the type
considered in AI) is a fixed sequence of actions.
Example
Consider a person intending to travel from Arad to Bucharest in Romania (see Figure
3.4). Partial search trees for finding a route from Arad to Bucharest are shown in Figure
3.5.
1 The environment must be fully observable, discrete, known and deterministic.
Figure 3.5: Partial search trees for finding a route from Arad to Bucharest
3.3.1 Example
Figure 3.4 shows a map of Romania.
• Let us assume that we are currently at Arad and we want to get to Bucharest.
• To construct a search tree we start by designating the initial state as the root node.
In the present problem, the initial state is the city of Arad and so we designate Arad
as the root node (Figure 3.5 (a)).
• There are three possible actions that we can take while we are at Arad, namely, go
to Sibiu, go to Timisoara and go to Zerind. These actions define three branches of
the search tree emanating from the root node and ending at the nodes named as
Sibiu, Timisoara and Zerind. These three branches form level 1 of the search tree
(Figure 3.5 (b)). A blind search will have no preference as to which node it should
explore first.
• Suppose we arbitrarily move to Sibiu. Then the possible actions are: go to Arad,
go to Rimnicu Vilcea, go to Fagaras and go to Oradea. These actions produce four
branches from the node Sibiu (Figure 3.5 (c)).
Remarks
The search tree shown in Figure 3.5(c) includes the path from Arad to Sibiu and back to
Arad again! We say that Arad is a repeated state in the search tree, generated in this case
by a loopy path. Considering such loopy paths means that the complete search tree for
Romania is infinite, because there is no limit to how often one can traverse a loop. On the
other hand, the map has only 20 states. However, there are mechanisms which ensure that
such loopy paths are not generated while creating a search tree.
Remark
Figure 3.6 shows the order in which the nodes in a tree are explored while applying the
breadth-first search algorithm.
3.4.1 Algorithm
The algorithm returns the goal state or failure.
Figure 3.6: The order in which the nodes are explored while applying the BFS algorithm:
0→1→2→3→4→5→6
1. Create a variable called NODE-LIST and set it to the initial state.
2. Until a goal state is found or NODE-LIST is empty, do:
(a) Remove the first element from NODE-LIST and call it E. If NODE-LIST was
empty, quit and return failure.
(b) For each way that each rule can match the state described in E do:
i. Apply the rule to generate a new state.
ii. If the new state is a goal state, quit and return this state.
iii. Otherwise, add the new state to the end of NODE-LIST.
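The NODE-LIST bookkeeping translates almost line by line into code. In the sketch below, the search tree of Figure 3.7 is assumed to be the one recoverable from the trace in Table 3.1 (node 1 generates 2, 3, 4; node 2 generates 5, 6; node 4 generates 7, 8); the function and variable names are chosen for illustration.

```python
from collections import deque

def breadth_first_search(children, initial, is_goal):
    """BFS with NODE-LIST kept as a FIFO queue, as in the algorithm above."""
    node_list = deque([initial])
    while node_list:
        e = node_list.popleft()                  # step (a): remove the first element
        for new_state in children.get(e, []):    # step (b): generate new states
            if is_goal(new_state):               # step (b) ii
                return new_state
            node_list.append(new_state)          # step (b) iii: add to the end
    return None                                  # NODE-LIST empty: failure

# The tree of Figure 3.7, as read off from Table 3.1:
tree = {1: [2, 3, 4], 2: [5, 6], 4: [7, 8]}
print(breadth_first_search(tree, 1, lambda n: n == 8))   # 8
```

Following the algorithm as stated, the goal test is applied to generated states, so the nodes are visited in exactly the order 1 → 2 → 3 → 4 → 5 → 6 → 7 → 8 of the example.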
3.4.2 Examples
Example 1
Let the search tree be as in Figure 3.7, the initial and goal states being 1 and 8 respectively.
Apply the breadth-first search algorithm to find the goal state.
Solution
The details of the various steps in finding a solution to the problem are depicted in Table
3.1. It can be seen that the nodes are visited in the following order:
1 → 2 → 3 → 4 → 5 → 6 → 7 → 8.
Step no.   NODE-LIST     Node E   Node generated from E   Comment
1          1             -        -                       Initial state
2          -             1        -                       -
3          -             1        2                       -
4          2             1        -                       -
5          2             1        3                       -
6          2, 3          1        -                       -
7          2, 3          1        4                       -
8          2, 3, 4       1        -                       -
9          3, 4          2        -                       -
10         3, 4          2        5                       -
11         3, 4, 5       2        -                       -
12         3, 4, 5       2        6                       -
13         3, 4, 5, 6    2        -                       -
14         4, 5, 6       3        -                       No node generated from E
15         5, 6          4        -                       -
16         5, 6          4        7                       -
17         5, 6, 7       4        -                       -
18         5, 6, 7       4        8                       Goal state. Return 8 and quit.
Table 3.1: Table showing the details of the various steps in the solution to the problem
specified in Figure 3.7
Example 2
Consider the water jug problem discussed in Section 2.6.3 and the associated production
rules given in Table 2.1. We now construct the search tree using the breadth-first search
algorithm.
Stage 1. Construct a tree with the initial state (0, 0) as its root.
(0, 0)
Figure 3.8: Stage 1 in the construction of the breadth-first search tree of the water
jug problem
Stage 2. Construct all the children of the root by applying each of the applicable rules to
the initial state. Then we get a tree as in Figure 3.9.
(0, 0)
(4, 0) (0, 3)
Figure 3.9: Stage 2 in the construction of the breadth-first search tree of the water
jug problem
Stage 3. Now for each leaf node in Figure 3.9, generate all its successors by applying all
the rules that are appropriate. The tree at this point is as shown in Figure 3.10.
(0, 0)
(4, 0) (0, 3)
Figure 3.10: Stage 3 in the construction of the breadth-first search tree of the
water jug problem
Stage 4. The process is continued until we get a node representing the goal state, namely,
(2, 0).
1. If the initial state is a goal state, quit and return success.
2. Otherwise, do the following until success or failure is signaled:
(a) Generate a successor E of the initial state. If there are no more successors,
signal failure.
(b) Call depth-first search with E as the initial state.
(c) If success is returned, signal success. Otherwise continue in this loop.
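A recursive rendering of this procedure is sketched below. The tree used in the example is a guess at the structure of Figure 3.12, inferred from the stated visiting order 1 → 2 → 3 → 4 → 5 → 6 → 7, so treat it as an assumption.

```python
def depth_first_search(children, state, is_goal, path=None):
    """Recursive DFS following the algorithm above: try each successor in
    turn and recurse; signal failure when no successor leads to the goal."""
    if path is None:
        path = [state]
    if is_goal(state):
        return path                               # success
    for successor in children.get(state, []):     # generate successors E
        result = depth_first_search(children, successor, is_goal,
                                    path + [successor])
        if result is not None:
            return result                         # success is returned upward
    return None                                   # no more successors: failure

# An assumed structure for the tree of Figure 3.12:
tree = {1: [2, 5], 2: [3, 4], 5: [6], 6: [7, 8]}
print(depth_first_search(tree, 1, lambda n: n == 7))   # [1, 5, 6, 7]
```

Note the distinction between the visiting order (1, 2, 3, 4, 5, 6, 7, with the subtree under 2 explored and abandoned) and the returned value, which is the path from the root to the goal.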
Remark
Figure 3.11 shows the order in which the nodes in a tree are explored while applying the
depth-first search algorithm.
3.5.2 Examples
Example 1
Let the search tree be as in Figure 3.12, the initial and goal states being 1 and 7
respectively. Apply the depth-first search algorithm to find the goal state.
Figure 3.11: The order in which the nodes are explored while applying the DFS algorithm:
0→1→3→4→2→5→6
Figure 3.12: The search tree (level 1: nodes 2, 5; level 2: nodes 3, 4, 6; level 3: nodes 7, 8)
Solution
We denote by DFS(x) an invocation of the depth-first search algorithm with x as the
initial vertex. The various steps in the implementation of the algorithm are shown in
Figure 3.13. Note that the various nodes are visited in the following order:
1→2→3→4→5→6→7
Call DFS(1)
1. Node 1 is not a goal state.
2. Do until success or failure is signaled.
(a) Choose 2 as successor of node 1: E = 2.
(b) Call DFS(2).
1. Node 2 is not a goal state.
2. Do until success or failure is signaled.
(a) Choose 3 as successor of 2: E = 3
(b) Call DFS(3).
1. Node 3 is not a goal state.
2. Do until success or failure is signaled.
(a) Node 3 has no successor. Signal failure.
DFS(3) ends.
(a) Choose 4 as successor of 2: E = 4
(b) Call DFS(4).
1. Node 4 is not a goal state.
Figure 3.13: Finding a path from node 1 to node 7 in Figure 3.12 using the DFS algorithm
Example 2
Consider the water jug problem discussed in Section 2.6.3 and the associated production
rules given in Table 2.1. We now construct the search tree using the depth-first search
algorithm.
Stage 1. Construct a tree with the initial state (0, 0) as its root. The initial state is not a
goal state.
(0, 0)
Figure 3.14: Stage 1 in the construction of the depth-first search tree of the water
jug problem
Stage 2. We generate a successor E for (0, 0). We choose the successor obtained by the
application of Rule 1 in the production rules. Then we get a tree as in Figure 3.15.
(0, 0)
(4, 0)
Figure 3.15: Stage 2 in the construction of the depth-first search tree of the water
jug problem
Stage 3. Now for the leaf node in Figure 3.15, generate a successor by applying an
applicable rule. The tree at this point is as shown in Figure 3.16.
3.6 Breadth-first search vs. depth-first search
(0, 0)
(4, 0)
(4, 3)
Figure 3.16: Stage 3 in the construction of the depth-first search tree of the water
jug problem
Stage 4. The process is continued until we get a node representing the goal state, namely,
(2, 0).
1. Breadth-first: requires more memory because all of the tree that has so far been
generated must be stored. Depth-first: requires less memory because only the nodes
in the current path are stored.
2. Breadth-first: all parts of the search tree must be examined at level n before any
node on level n + 1 can be examined. Depth-first: may find a solution without
examining much of the search space; it stops when one solution is found.
3. Breadth-first: will not get trapped exploring a blind alley. Depth-first: may follow
an unfruitful path for a long time, perhaps for ever.
4. Breadth-first: guaranteed to find a solution if there exists one; if there are multiple
solutions then a minimal solution (that is, a solution that takes a minimum number
of steps) will be found. Depth-first: may find a long path to a solution in one part
of the tree when a shorter path exists in some other unexplored part of the tree.
5. Breadth-first search uses a queue data structure. Depth-first search uses a stack
data structure.
6. Breadth-first: more suitable for searching vertices which are closer to the given
source. Depth-first: more suitable when there are solutions away from the source.
3.7 Depth-first iterative deepening (DFID) search
3.7.1 Algorithm
1. Set SEARCH-DEPTH = 1.
2. Conduct a depth-first search to a depth of SEARCH-DEPTH. If a solution is found,
then return it.
3. Otherwise, increment SEARCH-DEPTH by 1 and go to step 2.
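The three steps above can be sketched as follows (the small example tree is invented for illustration; `depth_limited_search` is simply a depth-first search cut off at SEARCH-DEPTH):

```python
def depth_limited_search(children, state, is_goal, limit):
    """Depth-first search cut off at the given depth."""
    if is_goal(state):
        return [state]
    if limit == 0:
        return None
    for successor in children.get(state, []):
        result = depth_limited_search(children, successor, is_goal, limit - 1)
        if result is not None:
            return [state] + result
    return None

def iterative_deepening(children, initial, is_goal, max_depth=50):
    """DFID: run DFS with SEARCH-DEPTH = 1, 2, 3, ... until a solution appears."""
    for search_depth in range(1, max_depth + 1):
        result = depth_limited_search(children, initial, is_goal, search_depth)
        if result is not None:
            return result
    return None

# Invented example tree: goal node 6 lies at depth 2.
tree = {1: [2, 3], 2: [4, 5], 3: [6, 7]}
print(iterative_deepening(tree, 1, lambda n: n == 6))   # [1, 3, 6]
```

The shallow levels are re-explored on every iteration, which is why DFID is slower than a single depth-first search only by a constant factor while keeping the memory profile of DFS.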
Figure 3.18: The order in which the nodes are explored in iterative deepening
4. The DFID is slower than the depth-first search only by a constant factor.
5. DFID is the optimal algorithm in terms of space and time for uninformed search.

3.8 Blind search in graphs
3.8.3 Example
Table 3.3: Frontier set using queue data structure, and explored set
Table 3.4: Frontier set using queue data structure, and explored set
If we use the stack data structure (LIFO) to store the frontier set, then one of the
orders in which the nodes are explored is shown in Table 3.5.
Table 3.5: Frontier set using stack data structure, and explored set (the nodes are
explored in the order A, E, D, C, B)
6. Will the breadth-first search method always find the optimal solution? Why?
Illustrate with an example.
9. Let the search tree be as in Figure 3.21. If the BFS algorithm is used, what is the
order in which the various nodes are explored starting from node 0?
12. If the graph in Figure 3.23 is explored using the breadth-first algorithm starting
from node Q, what is the order in which the nodes will be explored? What will be
the order if it is explored using the depth-first algorithm?
M N O
R Q P
13. If the graph in Figure 3.24 is explored using the breadth-first algorithm starting
from node 0, what is the order in which the nodes will be explored? What will be
the order if it is explored using the depth-first algorithm?
2. Briefly describe the breadth-first and the depth-first search for trees.
Figure 4.1: A node A with three branches to nodes B, C and D; the estimates shown
alongside the nodes are 58 (A), 47 (B), 34 (C) and 41 (D)
For example, let us assume that we are at node A in the search tree of some search
problem (see Figure 4.1). Let us also assume that there are three different branches, say
1 The word heuristics comes from the Greek word heuriskein which means “to find, to discover”.
AB, AC and AD, along which we can continue the search for the goal node. We have to
decide which one of these nodes is to be selected. From the circumstances of the problem
let it be possible to obtain an estimate of the cheapest path from the nodes to the goal
node. Let these estimates be as shown alongside each of the nodes in the figure. For
example, the estimate of the cheapest cost from node C to the goal node is 34.
One way of choosing the next node is to choose the node for which the estimated cost
of the cheapest path to the goal is least. As per this criterion, we choose to proceed
along the branch AC. (Of course, there are other ways in which the estimates can be
used to choose the next branch.) These estimated costs define a heuristic function for
the problem. The methods of constructing such functions from the circumstances of a
problem are illustrated in Section 4.1.4.
4.1.1 Definition
The heuristic function for a search problem is a function h(n) defined as follows:
Remark
Unless otherwise specified, it will be assumed that a heuristic function h(n) has the
following properties:
4.1.3 Remarks
1. There are no limitations on the function h(n). Any function of our choice is
acceptable. As an extreme case, the function h(n) defined by

h(n) = c for all n,

where c is some constant, can also be taken as a heuristic function. The heuristic
function defined by

h(n) = 0 for all n

is called the trivial heuristic function.
2. The heuristic function for a search problem depends on the problem in the sense
that the definition of the function would make use of information that is specific
to the circumstances of the problem. Because of this, there is no one method for
constructing heuristic functions that is applicable to all problems.
4.1.4 Examples
In the following subsections we present several examples of heuristic functions. In each
of these examples, the heuristic function is defined using knowledge about the
circumstances of the particular problem. Also, these functions are solutions of problems
obtained by relaxing some of the constraints in the problem. Moreover, it can be seen
that all the presented heuristic functions are admissible.
Table 4.1 shows the values of h(n). Note that h(n) ≥ 0 for all n and h(Bucharest) = 0.
Now h∗(n) is the length of the shortest road-path from n to Bucharest. Since the
air distance never exceeds the road distance, we have

h(n) ≤ h∗(n) for all n.

Thus, h(n) as defined above is an admissible heuristic function for the problem.
crossings; if k = 4 we also need only 3 river crossings to take all people to the other side
of the river. If k = 5 or k = 6, we need 5 river crossings to take all people to the other
side of the river. See below for details of the crossings in the case when k = 6.
h(x, y, z) =
  0              if x + y = 0
  x + y          if x + y is odd
  x + y − 1      if x + y > 0 and x + y is even
Note that x + y is the number of people on the initial side of the river.
Note that h(n) ≥ 0 for all n and h(Goal state) = 0. Also, it is obvious that this is an
admissible heuristic function for the problem.
Another admissible heuristic function for the problem is given below:

h(n) = ½ (x + y).
We can get different admissible heuristic functions for the 8-puzzle problem by
solving simpler problems obtained by relaxing or removing some of these restrictions.
We define two different heuristic functions h1(n) and h2(n) for the 8-puzzle problem.
h1(n) is obtained by solving the problem without all the three restrictions given above.
h2(n) is obtained by solving the problem obtained by relaxing the first restriction to
“a tile may be moved to any nearby slot” and deleting the third restriction.
Example
Start state:    Goal state:
5   8           1 2 3
4 2 1           4 5 6
7 3 6           7 8
Figure 4.3: In the 8 × 8 grid, the Manhattan distance between the cells (A) and (B)
is 7, between (C) and (D) is 3 and between (E) and (F) is 3.
h2(n) = Sum of the Manhattan distances of every numbered tile to its goal
position.
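For the example above, h2 can be computed as follows. This is an illustrative sketch: the encoding of a board as a 9-tuple read row by row, with 0 marking the blank, is a choice made for this example.

```python
def manhattan_h(state, goal):
    """h2(n): sum of the Manhattan distances of every numbered tile
    from its position in `state` to its position in `goal`.
    Boards are 9-tuples read row by row; 0 is the blank."""
    def pos(board, tile):
        i = board.index(tile)
        return divmod(i, 3)          # (row, column)
    total = 0
    for tile in range(1, 9):         # the blank is not counted
        r1, c1 = pos(state, tile)
        r2, c2 = pos(goal, tile)
        total += abs(r1 - r2) + abs(c1 - c2)
    return total

# The start and goal states of the example above (0 marks the blank):
start = (5, 8, 0,
         4, 2, 1,
         7, 3, 6)
goal  = (1, 2, 3,
         4, 5, 6,
         7, 8, 0)
print(manhattan_h(start, goal))   # 12
```

For instance, tile 1 sits at row 1, column 2 but belongs at row 0, column 0, contributing 1 + 2 = 3 to the total.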
Example
h(n) = Add one point for every block that is resting on the thing it is
supposed to be resting on. Subtract one point for every block
that is sitting on the wrong thing.
This is an example of a heuristic function not satisfying the conditions h(n) ≥ 0 and
h(Goal node) = 0.
Example
To illustrate the idea, consider a blocks world problem with the initial and goal states as
in Figure 4.4. In the initial state, only block numbered “1” is resting on the thing it is
supposed to be resting on. All other blocks are sitting on the wrong things. Hence we
have:
Since in the goal state every block is sitting on the right thing, we have
h(Goal state) = 1 + 1 + 1 + 1 + 1 + 1 = 6 ≠ 0.
h(x, y) = |x − 2| + |y − 2|.
This function satisfies the condition h(n) ≥ 0 for all n. It does not satisfy the condition
h(Goal state) = 0, because as a goal state is specified by a pair of the form (2, y), we have
Algorithm
1. Create two empty lists: OPEN and CLOSED.
2. Start from the initial node (say N) and put it in the ordered OPEN list.
3. Repeat the following steps:
(a) If OPEN list is empty, then exit the loop returning FALSE.
(b) Select the first/top node (say N) in the OPEN list and move it to the CLOSED
list. Also record the information of the parent node of N.
(c) If N is a goal node, then move the node to the CLOSED list and exit the loop
returning TRUE. The solution can be found by backtracking the path.
(d) If N is not the goal node, generate the immediate successors of N, that is, the
immediate next nodes linked to N and add all those to the OPEN list.
(e) Reorder the nodes in the OPEN list in ascending order according to the values
of the evaluation function f (n) at the nodes.
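The algorithm can be sketched with a priority queue playing the role of the ordered OPEN list; for greedy best first search the evaluation function is f(n) = h(n) alone. The edge list below is an assumption (Figure 4.5 is not reproduced here), chosen to be consistent with the heuristic table of Example 1.

```python
import heapq

def greedy_best_first(graph, h, start, goal):
    """Greedy best-first search: the OPEN list is ordered by h(n) alone."""
    open_list = [(h[start], start, [start])]
    closed = set()
    while open_list:
        _, node, path = heapq.heappop(open_list)   # first/top node in OPEN
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)                           # move node to CLOSED
        for successor in graph.get(node, []):      # generate immediate successors
            if successor not in closed:
                heapq.heappush(open_list,
                               (h[successor], successor, path + [successor]))
    return None

# A hypothetical edge list consistent with Example 1's heuristic table:
graph = {'S': ['A', 'B'], 'A': ['B', 'D'], 'B': ['C'], 'D': ['C']}
h = {'S': 6, 'A': 2, 'B': 3, 'C': 0, 'D': 2}
print(greedy_best_first(graph, h, 'S', 'C'))   # ['S', 'A', 'D', 'C']
```

Under these assumed edges the search reproduces the path S → A → D → C found in Example 1.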
4.3.2 Examples
Example 1
Use the greedy best first search algorithm to find a path from S to C in the graph shown in
Figure 4.5.
A B
D C
Node n S A B C D
h(n) 6 2 3 0 2
Solution
Table 4.2 shows the various stages in the execution of the algorithm with details of the
current node, children of the current node, the contents of the OPEN list and the contents
of the CLOSED list.
Table 4.2: Various stages in the execution of the greedy best first search algorithm for the
graph in Figure 4.5
From the list of nodes in the CLOSED list at the stage when the goal state has been
reached we see that the path from S to C is
“C ← D ← A ← S”,
or equivalently,
“S → A → D → C”.
Example 2
Using the greedy best first search algorithm, find an optimal path from A to F in the
search graph shown in Figure 4.6. In the figure, the numbers written alongside the nodes
are the values of the heuristic function and the numbers written alongside the edges are
the costs associated with the edges.
Figure 4.6: The search graph, with the heuristic values written alongside the nodes and
the costs written alongside the edges
Solution
Table 4.3 shows the various stages in the execution of the algorithm with details of the
current node, children of the current node, the contents of the OPEN list and the contents
of the CLOSED list.
Table 4.3: Various stages in the execution of the greedy best first search algorithm for the
search tree in Figure 4.6
From the list of nodes in the CLOSED list at the stage when the goal state has been
reached we see that the path from A to F is “F ← G ← E ← A”, or equivalently, “A → E →
G → F ”.
Example 3
Using the greedy best first search algorithm, find the optimal path from Arad to Bucharest
in Figure 4.7 using the heuristic function defined by Table 4.1.
Solution
The greedy best first search algorithm yields the following path from Arad to Bucharest:
Arad → Sibiu → Fagaras → Bucharest
Remarks
It may be noted that the above solution is not the optimal path from Arad to Bucharest.
The cost of the above path is

140 + 99 + 211 = 450,

whereas there is a path with lower cost, namely,

Arad → Sibiu → Rimnicu Vilcea → Pitesti → Bucharest,

whose total cost is only

140 + 80 + 97 + 101 = 418.
4.4 A∗ best first search
4.4.1 Examples
The following examples illustrate the basic idea of the A∗ algorithm.
Example 1
Use the A∗ best first search algorithm to find a path from S to C in the graph shown in
Figure 4.8.
Figure 4.8: The graph for Example 1, with the costs written alongside the edges
Node n S A B C D
h(n) 6 2 3 0 2
Solution
Step 1. We start with S.
Nodes A and B are immediate successors of S. We have
Example 2
Consider the graph shown in Figure 4.9. The numbers written on edges represent the
distance between the nodes. The numbers written on nodes represent the heuristic value.
Find the most cost-effective path to reach from start state A to final state J using the
A* algorithm.
Solution
Step 1. We start with node A.
Node B and Node F can be reached from node A.
A* Algorithm calculates f (B) and f (F ).
f (B) = 6 + 8 = 14
f (F ) = 3 + 6 = 9
f (G) = (3 + 1) + 5 = 9
f (H) = (3 + 7) + 3 = 13
f (I) = (3 + 1 + 3) + 1 = 8
It decides to go to node I.
Path: A → F → G → I
Step 4. Node E, Node H and Node J can be reached from node I.
A* Algorithm calculates f (E), f (H) and f (J).
f (E) = (3 + 1 + 3 + 5) + 3 = 15
f (H) = (3 + 1 + 3 + 2) + 3 = 12
f (J) = (3 + 1 + 3 + 3) + 0 = 10
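The calculations above can be reproduced with a short A* sketch. The edges and heuristic values below are read off from the worked f-values (only the edges appearing in the steps are included, and the value h(A) = 10 is an assumption that does not affect the result).

```python
import heapq

def a_star(graph, h, start, goal):
    """A*: order the OPEN list by f(n) = g(n) + h(n)."""
    open_list = [(h[start], 0, start, [start])]    # (f, g, node, path)
    closed = set()
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path, g
        if node in closed:
            continue
        closed.add(node)
        for successor, cost in graph.get(node, []):
            if successor not in closed:
                heapq.heappush(open_list,
                               (g + cost + h[successor], g + cost,
                                successor, path + [successor]))
    return None

# Edges and heuristic values read off from the f-calculations of Example 2:
graph = {'A': [('B', 6), ('F', 3)],
         'F': [('G', 1), ('H', 7)],
         'G': [('I', 3)],
         'I': [('E', 5), ('H', 2), ('J', 3)]}
h = {'A': 10, 'B': 8, 'F': 6, 'G': 5, 'H': 3, 'I': 1, 'E': 3, 'J': 0}
print(a_star(graph, h, 'A', 'J'))   # (['A', 'F', 'G', 'I', 'J'], 10)
```

The sketch returns the path A → F → G → I → J with total cost 10, matching f(J) = 10 in the final step of the worked example.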
this method may not be the global optimum; it will only be a local optimum. In this
sense, it is a local search method. The difference between a local and a global maximum
is illustrated in Figure 4.10. In the figure, the x-axis denotes the nodes in the state space
and the y-axis denotes the values of the objective function corresponding to particular
states.
1. Evaluate the initial state. If it is also a goal state, then return it and quit. Otherwise
continue with the initial state as the current state.
2. Loop until a solution is found or until there are no new operators left to be applied
in the current state:
(a) Select an operator that has not yet been applied to the current state and apply
it to produce a new state.
(b) Evaluate the new state.
i. If it is a goal state then return it and quit.
ii. If it is not a goal state but if it is better than the current state then make it
the current state.
iii. If it is not better than the current state then continue in the loop.
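The loop above may be sketched in Python as follows. The successor and evaluation functions are problem-specific; the toy maximisation problem at the end is an illustrative assumption, not part of the algorithm.

```python
def simple_hill_climbing(initial, successors, value, is_goal):
    """Simple hill climbing: move to the FIRST successor that improves
    on the current state (not necessarily the best one)."""
    current = initial
    if is_goal(current):
        return current
    while True:
        moved = False
        for s in successors(current):
            if is_goal(s):
                return s
            if value(s) > value(current):   # first improvement wins
                current = s
                moved = True
                break
        if not moved:                        # no operator improves the state
            return current                   # possibly only a local maximum

# Illustrative toy problem: climb f(x) = -(x - 7)^2 over the integers.
f = lambda x: -(x - 7) ** 2
result = simple_hill_climbing(
    initial=0,
    successors=lambda x: [x - 1, x + 1],
    value=f,
    is_goal=lambda x: f(x) == 0,
)
print(result)  # 7
```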
4.6.2 Examples
Example 1
Apply the hill climbing algorithm to solve the blocks world problem shown in Figure 4.12:
Solution
To use the hill climbing algorithm we need an evaluation function or a heuristic function.
We consider the following evaluation function:
h(n) = Add one point for every block that is resting on the thing it is
supposed to be resting on. Subtract one point for every block
that is sitting on the wrong thing.
4.6. SIMPLE HILL CLIMBING 75
We call the initial state “State 0” and the goal state “Goal”. Then, considering the blocks A, B,
C, D, E, F, G, H in that order, we have
h(State 0) = −1 − 1 + 1 + 1 + 1 + 1 + 1 + 1 = 4
h(Goal) = +1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 = 8
There is only one move from State 0, namely, to move block A to the table. This
produces the State 1 with a score of 6:
h(State 1) = 1 − 1 + 1 + 1 + 1 + 1 + 1 + 1 = 6.
There are three possible moves from State 1 as shown in Figure 4.13. We denote these three
states State 2(a), State 2(b) and State 2(c). We also have
h(State 2(a)) = −1 − 1 + 1 + 1 + 1 + 1 + 1 + 1 = 4
h(State 2(b)) = +1 − 1 + 1 + 1 + 1 + 1 + 1 − 1 = 4
h(State 2(c)) = +1 − 1 + 1 + 1 + 1 + 1 + 1 − 1 = 4
Hill climbing will halt because all these states have lower scores than the current state.
The process has reached a local maximum that is not a global maximum. We have
reached such a situation because of the particular choice of the heuristic function. A
different choice of heuristic function may not produce such a situation.
Example 2
Given the 8-puzzle shown in Figure 4.14, use the hill-climbing algorithm with the
Manhattan distance heuristic to find a path to the goal state.
Solution
By definition, the Manhattan distance heuristic is the sum of the Manhattan distances of
tiles from their goal positions. In Figure 4.14, only the tiles 5, 6 and 8 are misplaced and
their distances from the goal positions are respectively 2, 2 and 1. Therefore,
h(Initial state) = 2 + 2 + 1 = 5.
4.7. STEEPEST-ASCENT HILL CLIMBING 76
Figure 4.14: The initial state (left) and the goal state (right) of the 8-puzzle:

1 2 3      1 2 3
4 8 _      4 5 6
7 6 5      7 8 _
Figure 4.15 shows the various states reached by applying the various operations specified
in the hill-climbing algorithm. The numbers written alongside the grids are the values of
the heuristic function.
Note that the problem is a minimisation problem. The value of the heuristic function
at the goal state is 0, which is the minimum value of the function. Note further that, once
we find an operation which produces a better value than the value at the current state, we
proceed to that state without considering whether there is any other move which produces
a state having a still better value.
Figure 4.15: The search path obtained by applying the hill-climbing algorithm to an 8-
puzzle
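The Manhattan distance heuristic used in this example can be computed mechanically. In the following illustrative sketch, the states are 3 × 3 tuples of tuples and 0 is assumed to denote the blank.

```python
def manhattan(state, goal):
    """Sum of the Manhattan distances of the tiles (blank excluded)
    from their goal positions."""
    # Position of each tile in the given state.
    pos = {state[r][c]: (r, c) for r in range(3) for c in range(3)}
    return sum(
        abs(pos[goal[r][c]][0] - r) + abs(pos[goal[r][c]][1] - c)
        for r in range(3) for c in range(3)
        if goal[r][c] != 0          # do not count the blank
    )

initial = ((1, 2, 3), (4, 8, 0), (7, 6, 5))   # Figure 4.14, 0 = blank
goal    = ((1, 2, 3), (4, 5, 6), (7, 8, 0))
print(manhattan(initial, goal))   # 5  (tiles 5, 6, 8 contribute 2 + 2 + 1)
```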
4.7.1 Algorithm
1. Evaluate the initial state. If it is also a goal state, then return it and quit. Otherwise
continue with the initial state as the current state.
2. Loop until a solution is found or until a complete iteration produces no change to
current state:
(a) Let SUCC be a state such that any possible successor of the current state
will be better than SUCC.
(b) For each operator that applies to the current state do:
i. Apply the operator and generate a new state.
ii. Evaluate the new state. If it is a goal state, then return it and quit. If
not, compare it to SUCC. If it is better then set SUCC to this state. If it
is not better, leave SUCC alone.
(c) If SUCC is better than current state, then set current state to SUCC.
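The algorithm above differs from simple hill climbing only in that all successors of the current state are examined before a move is made. A Python sketch follows; the toy maximisation problem at the end is an illustrative assumption.

```python
def steepest_ascent(initial, successors, value, is_goal):
    """Steepest-ascent hill climbing: examine ALL successors, keep the
    best one (SUCC), and move to it only if it beats the current state."""
    current = initial
    if is_goal(current):
        return current
    while True:
        succ = None                      # worse than any possible successor
        for s in successors(current):
            if is_goal(s):
                return s
            if succ is None or value(s) > value(succ):
                succ = s                 # best successor seen so far
        if succ is not None and value(succ) > value(current):
            current = succ
        else:
            return current               # local maximum, plateau or ridge

# Illustrative toy problem: climb f(x) = -(x - 7)^2 over the integers.
f = lambda x: -(x - 7) ** 2
result = steepest_ascent(0, lambda x: [x - 1, x + 1], f, lambda x: f(x) == 0)
print(result)  # 7
```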
Disadvantages
Both basic hill climbing and steepest-ascent hill climbing may fail to produce a
solution: the algorithm may terminate without finding a goal state by getting into a
state from which no better state can be generated. This will happen if the search has
reached a local maximum, a ridge or a plateau.
• Local optima (maxima or minima): A local maximum is a peak that is higher
than each of its neighboring states, but lower than the global maximum (see Figure
4.10). A solution to this problem is to backtrack to an earlier solution and try going
in a different direction.
• Ridges: A ridge is a sequence of local maxima. Ridges are very difficult to
navigate for a hill-climbing algorithm (see Figure 4.17). This problem may be
overcome by moving in several directions at once.
4.9. SIMULATED ANNEALING 78
• Plateaus: A plateau is an area of the state space where the evaluation function is
flat. It can be a flat local maximum, from which no uphill path exists, or a shoulder,
from which progress is possible (see Figure 4.16). One solution to handle this
situation is to make a big jump in some direction to get to a new part of the search
space.
Figure 4.17: A ridge: The ridge is the curved line at the top joining the local maxima
4.9.1 Annealing
The name of the algorithm comes from annealing in metallurgy. Annealing is a heat
treatment that alters the physical and sometimes chemical properties of a material to
increase its ductility and reduce its hardness, making it more workable. It involves
heating a material above a certain temperature, maintaining a suitable temperature for an
appropriate amount of time and then cooling.
4.9.2 Algorithm
Notations and conventions
The following notations and conventions are used in the algorithm.
• A variable BEST-SO-FAR which takes a problem state as a value.
• A certain function of time denoted by T (t) or simply T . This function is called the
annealing schedule.
• As per standard usage in the context of the simulated annealing algorithm, the
heuristic function will be called an objective function.
• It will be assumed that the problem is a minimisation problem.
Algorithm
1. Evaluate the initial state. If it is also a goal state, then return it and quit. Otherwise,
continue with the initial state as the current state.
2. Initialise BEST-SO-FAR to the current state.
(a) Select an operator that has not yet been applied to the current state and apply
it to produce a new state.
(b) Evaluate the new state. Compute
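The algorithm text breaks off after “Compute”. In the standard formulation, one computes ∆E, the value of the new state minus the value of the current state, and accepts a worse move with probability e^(−∆E/T). The following Python sketch is based on that standard formulation; the geometric annealing schedule T ← 0.99T and the toy objective function are illustrative assumptions.

```python
import math
import random

def simulated_annealing(initial, neighbour, energy, t0=10.0, alpha=0.99, steps=5000):
    """Minimise `energy`. A worse move (d_e > 0) is accepted with
    probability exp(-d_e / T); T decays by the schedule T <- alpha * T."""
    current, t = initial, t0
    best = current                       # BEST-SO-FAR
    for _ in range(steps):
        new = neighbour(current)
        d_e = energy(new) - energy(current)
        if d_e <= 0 or random.random() < math.exp(-d_e / t):
            current = new
        if energy(current) < energy(best):
            best = current               # update BEST-SO-FAR
        t *= alpha                       # annealing schedule T(t)
    return best

random.seed(0)
# Toy objective with its global minimum at x = 3.
e = lambda x: (x - 3) ** 2
best = simulated_annealing(0.0, lambda x: x + random.uniform(-1, 1), e)
print(round(best, 2))
```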
2. Give a heuristic function for the problem of path search in maps. Is it admissible?
Why?
3. Give an admissible and nontrivial heuristic for the missionaries and cannibals problem.
5. Define the Hamming distance and Manhattan distance heuristic functions for the 8-puzzle.
6. Given the 8-puzzle specified in Figure 4.19, what are the values of the Hamming
distance and Manhattan distance heuristic functions at the initial state?
8 7 _      1 2 3
1 3 2      4 5 6
6 5 4      7 8 _
6. What are the advantages and disadvantages of the hill climbing algorithm?
7. Consider a blocks world with 4 blocks A, B, C and D with the start and goal states
given in Figure 4.21. Solve the problem using the hill climbing algorithm and a
suitable heuristic function. Show the intermediate decisions and states.
A D
D C
C B
B A
Start Goal
8. Using the greedy best first search algorithm, find an optimal path from S to G in
the search graph shown in Figure 4.22. Show in a tabular form the various stages in
the execution of the algorithm with details of the current node, children of the
current node, the contents of the OPEN list and the contents of the CLOSED list.
4.10. SAMPLE QUESTIONS 82
Node (n) A B C D E F H I S G
h(n) 12 4 7 3 8 2 4 9 13 0
9. Solve the previous problem using the A∗ best first search algorithm.
10. Consider the graph shown in Figure 4.23. In the graph, “S” is the start state and
“G1, G2, G3” are three goal states. When traversing the graph, one can move only
in the direction indicated by the arrows. The numbers on the edges are the step costs
of traversing the edges, and the numbers in the nodes are the estimated costs to the
nearest goal state. Apply each of the algorithms below to find a solution path from
the start state to a goal state and to calculate the total cost to the goal.
search also.
Game playing
• Turn-taking: A turn-taking game is one where the players play one at a time in
alternating turns.
• Two-player
• Zero-sum: A zero-sum game is defined as one where the total payoff to all
players is the same for every instance of the game.
Example 1: Chess
The best known example of a game considered in AI is chess. The reader is assumed to
be familiar with how the game of chess is played. The initial position is shown in Figure
5.1.
(The terminology of zero-sum games comes from mathematical game theory, a branch of applied mathematics extensively used in economics.)
5.2. GAME TREE 85
In chess, the outcome is a win, loss, or draw, with values +1, 0, or 1/2. It is a
zero-sum game because every game has a total payoff of either 0 + 1, 1 + 0 or 1/2 + 1/2.
Example 2: Tic-tac-toe
Tic-tac-toe or noughts and crosses is a paper-and-pencil game for two players, X and
O, who take turns marking the spaces in a 3 × 3 grid. The player who succeeds in placing
three of his/her marks in a diagonal, horizontal, or vertical row is the winner (see Figure
5.2).
Example
Consider the game of tic-tac-toe and let the players be named “MAX” and “MIN”. MAX
plays by placing an X and MIN plays by placing an O in a cell in the grid.
In the figure the initial state is represented by a blank grid at the top. From the initial
state, MAX has nine possible moves. These are indicated in the figure by the nine
children of the initial state. Play alternates between MAX’s placing an X and MIN’s
placing an O. This is continued until we reach leaf nodes corresponding to terminal states
such that
one player has three in a row or all the squares are filled. The number on each leaf node
indicates the utility value of the terminal state from the point of view of MAX; high
values are assumed to be good for MAX and bad for MIN (which is how the players get
their names).
5.3. SOLVING A GAME 86
Example 1
Solve a zero-sum two-player game represented by the game tree shown in Figure 5.4.
Solution
The steps in arriving at a solution are described below.
2. There are three possible moves for MAX now: a1, a2, a3.
(a) The next move is MIN’s. There are three possible moves for MIN
from B, namely, b1, b2, b3. The respective losses are 3, 12, 8.
(b) Being rational, MIN tries to minimize his loss and so chooses b1 with a
possible loss of 3. This results in a gain of 3 for MAX.
(a) The next move is MIN’s. There are three possible moves for MIN
from C, namely, c1, c2, c3. The respective losses are 2, 4, 6.
(b) Being rational, MIN tries to minimize his loss and so chooses c1 with a
possible loss of 2. This results in a gain of 2 for MAX.
(a) The next move is MIN’s. There are three possible moves for MIN
from D, namely, d1, d2, d3. The respective losses are 14, 5, 2.
(b) Being rational, MIN tries to minimize his loss and so chooses d3 with a
possible loss of 2. This results in a gain of 2 for MAX.
6. Now, if MAX chooses a1 the minimum gain is 3, if MAX chooses a2 the minimum
gain is 2, and if MAX chooses a3 the minimum gain is 2.
“Initially, MAX chooses a1 and then MIN chooses b1. This guarantees a minimum
gain of 3 for MAX and a maximum loss of 3 for MIN.”
Example 2
Solve the game represented by the game tree in Figure 5.5. Assume that the maximising
player plays first.
5.4. FORMULATION OF A GAME AS A SEARCH PROBLEM 88
Solution
We have:
“Initially, MAX chooses the move AC and then MIN chooses CI. This guarantees
a minimum gain of −2 for MAX and a maximum loss of −2 for MIN.”
• S0: The initial state, which specifies how the game is set up at the start.
• RESULT(s, a): The transition model, which defines the result of a move a in state s.
A zero-sum game is defined as one where the total payoff to all players is the same for every
instance of the game.
The initial state, ACTIONS function, and RESULT function define the game tree for
the game: a tree where the nodes are game states and the edges are moves.
Example
S0 = A
PLAYER(B) = MIN
ACTIONS(B) = {b1, b2, b3}
RESULT(C, c2) = I
TERMINAL-TEST(B) = False
TERMINAL-TEST(K) = True
UTILITY(E, MIN) = 3
MINIMAX(s) =
    UTILITY(s)                                        if TERMINAL-TEST(s)
    max_{a ∈ ACTIONS(s)} MINIMAX(RESULT(s, a))        if PLAYER(s) = MAX
    min_{a ∈ ACTIONS(s)} MINIMAX(RESULT(s, a))        if PLAYER(s) = MIN
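The three-case definition of MINIMAX(s) translates directly into a recursive function. In the following Python sketch, the game tree is the one used in the worked example below (MAX node A; MIN nodes B, C, D); the leaf names E to M are assumptions consistent with UTILITY(E, MIN) = 3 above.

```python
def minimax(state, tree, utility, player):
    """MINIMAX(s): utility at terminal states, otherwise the max over
    successors for MAX and the min over successors for MIN."""
    if state not in tree:                       # TERMINAL-TEST(s)
        return utility[state]
    next_player = "MIN" if player == "MAX" else "MAX"
    values = [minimax(s, tree, utility, next_player) for s in tree[state]]
    return max(values) if player == "MAX" else min(values)

# Game tree from the worked example: A is a MAX node, B/C/D are MIN nodes.
tree = {"A": ["B", "C", "D"],
        "B": ["E", "F", "G"], "C": ["H", "I", "J"], "D": ["K", "L", "M"]}
utility = {"E": 3, "F": 12, "G": 8, "H": 2, "I": 4, "J": 6,
           "K": 14, "L": 5, "M": 2}

print(minimax("A", tree, utility, "MAX"))   # 3
```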
Example
Consider a two-player game with the game tree as shown in Figure 5.6. The ∆-nodes are
“MAX nodes” in which it is MAX’s turn to play and the ∇-nodes are the “MIN nodes”.
The terminal nodes show the utility values for MAX. The minimax values of the other
nodes are computed as follows.
5.6. THE MINIMAX ALGORITHM 90
MINIMAX(C) = min_{a ∈ {c1, c2, c3}} MINIMAX(RESULT(C, a))
           = min{MINIMAX(H), MINIMAX(I), MINIMAX(J)}
           = min{2, 4, 6}
           = 2.
MINIMAX(D) = min_{a ∈ {d1, d2, d3}} MINIMAX(RESULT(D, a))
           = min{MINIMAX(K), MINIMAX(L), MINIMAX(M)}
           = min{14, 5, 2}
           = 2.
MINIMAX(A) = max_{a ∈ {a1, a2, a3}} MINIMAX(RESULT(A, a))
           = max{MINIMAX(B), MINIMAX(C), MINIMAX(D)}
           = max{3, 2, 2}
           = 3.
Remarks
1. The algorithm is presented using the pseudocode employed by Russell and Norvig
in their book “Artificial Intelligence: A Modern Approach”.
2. The algorithm determines the action to be taken when the game is at a given state. To
determine the action to be taken at the initial state, the function value MINIMAX-
DECISION(initial state) has to be computed.
3. In the algorithm the notation “←” denotes the assignment operator: “x ← y” means
assign y to x.
4. The notation arg max_{a ∈ S} f(a) denotes the element a of the set S that has the
maximum value of f(a). For example, if
S = {1, 2, 3, 4, 5}
f(1) = 28, f(2) = 39, f(3) = 38, f(4) = 32, f(5) = 30,
then arg max_{a ∈ S} f(a) = 2, since f(2) = 39 is the largest of these values.
Solution
We are required to compute MINIMAX-DECISION(A). According to the minimax algo-
rithm, this is given by
MINIMAX-DECISION(A) = arg max_{a ∈ ACTIONS(A)} MIN-VALUE(RESULT(A, a)).
1. Next we compute MIN-VALUE(B). Since B is not a terminal state and since ACTIONS(B)=
{BD, BE}, MIN-VALUE(B) is computed as follows:
v←∞
v ← MIN(v, MAX-VALUE(RESULT(B, BD)))
v ← MIN(v, MAX-VALUE(RESULT(B, BE)))
MIN-VALUE(B) ← v.
(a) v ← ∞
(b) We have to compute
MAX-VALUE(RESULT(B, BD)) = MAX-VALUE(D).
Since ACTIONS(D) = {DH, DI} and since D is not a terminal state, MAX-VALUE(D)
is computed as follows:
w ← −∞
w ← MAX(w, MIN-VALUE(RESULT(D, DH)))
w ← MAX(w, MIN-VALUE(RESULT(D, DI)))
MAX-VALUE(D) ← w.
(i) w ← −∞.
(ii) We compute
MIN-VALUE(RESULT(D, DH)) = MAX-VALUE(H).
Since H is a terminal state, we have
MAX-VALUE(H) = UTILITY(H) = −1.
Thus we have
w ← MAX(−∞, −1)
w ← −1
(iii) We compute
MIN-VALUE(RESULT(D, DI)) = MAX-VALUE(I).
Since I is a terminal state, we have
MAX-VALUE(I) = UTILITY(I) = 4.
Thus we have
w ← MAX(−1, 4)
w ← 4
(iv) Finally we get
MAX-VALUE(D) ← 4.
Now
v ← MIN(v, MAX-VALUE(RESULT(B, BD))),
that is, v ← MIN(∞, 4), that is, v ← 4.
(c) We compute
MAX-VALUE(RESULT(B, BE)) = MAX-VALUE(E).
5.7. MINIMAX WITH ALPHA-BETA PRUNING 93
Since E is not a terminal state, and since ACTIONS(E) = {EJ, EK}, MAX-VALUE(E)
is computed as follows:
w ← −∞
w ← MAX(w, MIN-VALUE(RESULT(E, EJ)))
w ← MAX(w, MIN-VALUE(RESULT(E, EK)))
MAX-VALUE(E) ← w.
(i) w ← −∞
(ii) We compute
MIN-VALUE(RESULT(E, EJ)) = MAX-VALUE(J).
Since J is a terminal state, we have
MAX-VALUE(J) = UTILITY(J) = 2.
Thus we have
w ← MAX(−∞, 2)
w ← 2
(iii) We compute
MIN-VALUE(RESULT(E, EK)) = MAX-VALUE(K).
Since K is a terminal state, we have
MAX-VALUE(K) = UTILITY(K) = 6.
Thus we have
w ← MAX(2, 6)
w ← 6
v ← MIN(v, MAX-VALUE(RESULT(B, BE)))
that is, v ← MIN(4, 6), that is v ← 4.
(d) We now get
MIN-VALUE(B) ← 4.
5.7.2 Example
Example 1
Figure 5.8 represents a game tree showing which nodes can be pruned. Suppose the values
of the leaf nodes are given or are computed given the definition of the game. The
numbers at the bottom show some of these values. The other values are irrelevant, as we
show here.
2. Just by considering the leftmost child of i with a value of 6, we know that the value
of i is less than or equal to 6.
3. Therefore, at node d, the maximizing agent will go left. We do not have to evaluate
the other child of i.
4. Similarly, the value of j is 11, so the value of e is at least 11, and so the minimizing
agent at node b will choose to go left.
5. The value of l is less than or equal to 5, and the value of m is less than or equal
to 4; thus, the value of f is less than or equal to 5, so the value of c will be less
than or equal to 5.
Notice that this argument did not depend on the values of the unnumbered leaves.
Moreover, it did not depend on the size of the subtrees that were not explored.
Example 2
Consider the game tree shown in Figure 5.6. We show below why the leaves c2 and c3 can
be pruned without affecting the final decision.
Figure 5.9 shows the various stages in the calculation of the optimal decision for the
game tree in Figure 5.6. At each point, we also show the range of possible values for each
node.
Figure 5.9: Stages in the calculation of the optimal decision for the game tree in Figure 5.6
(a) The first leaf below B has the value 3. Hence, B, which is a MIN node, has a
value of at most 3.
(b) The second leaf below B has a value of 12; MIN would avoid this move, so the
value of B is still at most 3.
(c) The third leaf below B has a value of 8; we have seen all B’s successor states,
so the value of B is exactly 3. Now, we can infer that the value of the root is at
least 3, because MAX has a choice worth 3 at the root.
(d) The first leaf below C has the value 2. Hence, C, which is a MIN node, has a
value of at most 2. But we know that B is worth 3, so MAX would never
choose
C. Therefore, there is no point in looking at the other successor states of C. This
is an example of alpha-beta pruning.
(e) The first leaf below D has the value 14, so D is worth at most 14. This is still
higher than MAX’s best alternative (that is, 3), so we need to keep exploring
D’s successor states. Notice also that we now have bounds on all of the
successors of the root, so the root’s value is also at most 14.
(f) The second successor of D is worth 5, so again we need to keep exploring. The
third successor is worth 2, so now D is worth exactly 2. MAX’s decision at the
root is to move to B, giving a value of 3.
Example 3
Figure 5.10 illustrates alpha-beta pruning. The grayed-out subtrees are not explored (when
moves are evaluated from left to right). The max and min levels represent the turn of the
player and the adversary, respectively.
5.7.3 Algorithm
The minimax with alpha-beta pruning algorithm is given below.
Algorithm
function alphabeta(node, α, β, Player)
    if node is a terminal node then
        return the value of node
    if Player = MAX then
        value := −∞
        for each child of node do
            value := max(value, alphabeta(child, α, β, MIN))
            α := max(α, value)
            if α ≥ β then
                break (* β cutoff *)
        return value
    else
        value := +∞
        for each child of node do
            value := min(value, alphabeta(child, α, β, MAX))
            β := min(β, value)
            if α ≥ β then
                break (* α cutoff *)
        return value
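The pseudocode can be transcribed into runnable Python as follows. The game tree used to exercise it is the one from the minimax worked example (root A with MIN children B, C and D, and assumed leaf names E to M); as in that example, pruning skips the second and third leaves of C.

```python
def alphabeta(node, alpha, beta, player, tree, utility):
    """Minimax with alpha-beta pruning; search is cut off when alpha >= beta."""
    if node not in tree:                        # terminal node
        return utility[node]
    if player == "MAX":
        value = float("-inf")
        for child in tree[node]:
            value = max(value, alphabeta(child, alpha, beta, "MIN", tree, utility))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                           # beta cutoff
        return value
    else:
        value = float("inf")
        for child in tree[node]:
            value = min(value, alphabeta(child, alpha, beta, "MAX", tree, utility))
            beta = min(beta, value)
            if alpha >= beta:
                break                           # alpha cutoff
        return value

tree = {"A": ["B", "C", "D"],
        "B": ["E", "F", "G"], "C": ["H", "I", "J"], "D": ["K", "L", "M"]}
utility = {"E": 3, "F": 12, "G": 8, "H": 2, "I": 4, "J": 6,
           "K": 14, "L": 5, "M": 2}

print(alphabeta("A", float("-inf"), float("inf"), "MAX", tree, utility))   # 3
```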
5.8. SAMPLE QUESTIONS 99

5. Why is the alpha-beta pruning method better than the minimax search method in
solving a game?

Figure 5.18: A game tree with root S (a MAX node), children A and B (MIN nodes),
and terminal nodes C, D, E, F, G with utilities 5, 3, 8, 4 and 2.
5. Determine which of the branches in the game tree shown in Figure 5.20 will be
pruned if we apply alpha-beta pruning to solve the game.
Figure 5.20: A game tree with the following levels, from top to bottom:
MAX: A
MIN: B, C
MAX: D, E, F, G
Terminal: H, I, J, K, L, M, N, O with utilities 3, 5, 6, 9, 1, 2, 0, −1.
Knowledge representation
6.1 Knowledge
Knowledge is awareness or familiarity gained by experiences of facts, data, and
situations. Knowledge is a familiarity, awareness, or understanding of someone or
something, such as facts (propositional knowledge), skills (procedural knowledge), or
objects (acquaintance knowledge). By most accounts, knowledge can be acquired in
many different ways and from many sources, including but not limited to perception,
reason, memory, testimony, scientific inquiry, education, and practice. The philosophical
study of knowledge is called epistemology.
6.1.1 Knowledge in AI
The following are the kinds of knowledge which need to be represented in AI systems:
• Object:
All the facts about objects in our world domain, e.g., guitars contain strings and
trumpets are brass instruments.
• Events:
Events are the actions which occur in our world.
• Performance:
It describes behaviour, i.e., knowledge about how to do things.
• Meta-knowledge:
It is knowledge about what we know.
• Facts:
Facts are the truths about the real world and what we represent.
• Knowledge base:
It is the totality of information, structured or unstructured, stored in a system.
6.2. REPRESENTATION 101
2. Procedural knowledge
Procedural knowledge, also known as imperative knowledge, is a type of
knowledge which is responsible for knowing how to do something. It can be
directly applied to any task and it includes rules, strategies, procedures, agendas,
etc.
3. Meta-knowledge
Knowledge about the other types of knowledge is called meta-knowledge.
4. Heuristic knowledge
Heuristic knowledge is knowledge of experts in a field or subject. It is “rule of
thumb” knowledge based on previous experience and an awareness of approaches
that usually work well but are not guaranteed to.
5. Structural knowledge
It describes relationships between various concepts such as kind of, part of, and
grouping of something. It describes the relationship that exists between concepts
or objects.
6.2 Representation
By “representation” we mean a relationship between two domains where the first is
meant to “stand for” or take place of the second. Usually the first domain is more
concrete, or accessible in some way than the second. For example, a drawing of a
milkshake and a hamburger on a sign might stand for a less visible fast food restaurant.
all the desired properties. Because of this, there are different knowledge representation
systems having varying capabilities.
1. Representational adequacy
This the ability to represent all kinds of knowledge that are needed in a domain.
2. Inferential adequacy
This is the ability to derive new knowledge from old.
3. Inferential efficiency
This is the ability to incorporate additional information into the knowledge structure
so as to focus the inference mechanisms in the most promising directions.
4. Acquisitional efficiency
This is the ability to acquire new knowledge easily.
• Semantic networks
• Frames
• Conceptual dependencies
6.6.1 Definition
A semantic network is a graph constructed from a set of vertices (or nodes) and a set of
directed and labelled edges. The vertices or nodes represent concepts or objects, and the
edges represent relations between the nodes.
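By the definition above, a semantic network is simply a set of labelled directed edges, which suggests a very direct machine representation. The following Python sketch is illustrative; the triples are taken from the examples in the next subsection.

```python
# A semantic network as a set of (source, relation, target) triples.
network = {
    ("Sparrow", "is-a", "Bird"),
    ("Tweety", "instance-of", "Birds"),
    ("Birds", "subset-of", "Animals"),
    ("Bird", "has-a", "Feather"),
}

def related(network, source, relation):
    """All targets reachable from `source` by one edge labelled `relation`."""
    return {t for (s, r, t) in network if s == source and r == relation}

print(related(network, "Sparrow", "is-a"))     # {'Bird'}
print(related(network, "Birds", "subset-of"))  # {'Animals'}
```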
6.6.2 Examples
Example 1
Consider the knowledge contained in the following sentence:
S: “Sparrow is a bird.”
6.6. SEMANTIC NETWORKS 103
There are two concepts in the sentence, namely, “Sparrow” and “Bird”. The relation be-
tween these concepts is indicated by “is a”. We represent the two concepts by two nodes
in a graph and the relation between them by a directed edge with the label “is-a”. (We
have represented the “is a” relation by “is-a” as it is the convention in the theory of
semantic networks.) The knowledge contained in S can now be represented by the graph
shown in Figure 6.1. This is the semantic network representation of the knowledge
contained in S.
Sparrow --is-a--> Bird
Example 2
Consider the knowledge contained in the following sentences:
S: “Tweety is a bird.”, “Birds are animals.”
The sentences involve three objects: “Tweety”, “Birds” and “Animals”. Tweety is the name of
a particular bird. So the sentence “Tweety is a bird” means that the bird Tweety is an
instance of the class of things indicated by the word “Birds”. The sentence “Birds are
animals” means that all members of the class of things indicated by “Birds” are also
members of the class of things indicated by “Animals”.
To represent the knowledge contained in S by a semantic network, we need three
vertices to denote the three objects. These vertices are to be connected by directed edges
with appropriate labels. The knowledge contained in S is represented by the graph
exhibited in Figure 6.2.
Tweety --instance-of--> Birds --subset-of--> Animals
Figure 6.2: Semantic network representation of the sentences “Tweety is a bird.” and “Birds
are animals.”
We may represent the nodes as rectangles as in Figure 6.3.
Tweety --instance-of--> Birds --subset-of--> Animals
Figure 6.3: Semantic network representation of the sentences “Tweety is a bird.” and “Birds
are animals.”
Example 3
Consider the sentence:
S: “A bird has feathers.”
This sentence means that the object “Feather” is part of the object “Bird”. In a
semantic network representation of the sentence, the edge joining the nodes is labelled
“HAS-A” (or “HASA”, or “PART-OF”) as in Figure 6.4. Such a relation is referred to as
a part-whole relation. The link or edge is from whole to part.
Bird --HAS-A--> Feather
Example 4
Consider the sentence:
S: “Bill is taller than John.”
A naive representation joins the two nodes by an edge labelled “is-taller-than”:
Bill --is-taller-than--> John
Figure 6.5: Inappropriate semantic network representation of “Bill is taller than John”
This representation does not capture the knowledge expressed by the sentence because
the relation “is-taller-than” cannot be taken as a basic relation. What the sentence
actually means is that the height of Bill is greater than the height of John. Thus the
proper semantic representation of the sentence “Bill is taller than John” is the one given
in Figure 6.6. In this figure we have used the basic relation “is-greater-than” and also
the relation “height”.
Bill --height--> h1
John --height--> h2
h1 --is-greater-than--> h2
Example 5
Consider the sentence:
S: “Sara has brown eyes.”
This sentence has three concepts: “Sara”, “Brown” and “Eyes”. “Brown” is a property of
“Eyes” and, in a semantic representation of the sentence, the property can be denoted by
the label “Property” or more specifically by the label “Colour”. The semantic network is
shown in Figure 6.7.
Figure 6.7: Semantic network representation of the sentence “Sara has brown eyes.”
Example 6
Consider the knowledge represented by the following sentences:
• A crow is a bird.
The semantic network representation of the sentences is given in Figure 6.8. Note the use
of the labels “Property”, “Child-of” and “Color” in the network.
Example 7
Consider the following sentences:
Scooter --IS-A--> Two-wheeler
Motor-bike --IS-A--> Two-wheeler
Two-wheeler --IS-A--> Moving vehicle
Moving vehicle --HAS-A--> Brakes
Moving vehicle --HAS-A--> Engine
Example 8
Figure 6.10 shows another simple semantic network. Note the use of the label “is” instead
of “IS-A”. This is perfectly permissible as there is no standardisation of the notation for
labels. Also note the labels “lives in” and “works in”.
John --is--> Person
John --lives in--> Thrissur --is--> City
John --works in--> Cochin --is--> City
Example 9
Figure 6.11 shows a semantic network with several objects and categories.
Example 10
Figure 6.12 shows a semantic network with two objects (John and Mary) and four
categories (Mammals, Persons, FemalePersons and MalePersons).
Example 11
Figure 6.13 shows a semantic network with “IS-A” and “AKO” (A-KIND-OF) links. The
link AKO is used to relate one class (or category) to another. AKO relates generic nodes
to generic nodes while the IS-A relates an instance or individual to a generic class.
Example 12
Figure 6.14 illustrates a semantic network where the principal concept is “financial prod-
uct,” which can be a loan or an investment. An example of a long-term loan is a
mortgage, and an example of a low-risk investment is a government bond. Note the
variety of labels used to represent the various relations. Note also the presence of two-
way links connecting nodes.
6.6.3 Remarks
It appears that there are no universally agreed conventions on the use of geometrical
shapes to represent nodes. Ovals, circles and rectangles are all used to represent nodes.
Also, there is no complete set of standard terminology to label the edges in semantic
graphs. However, the following are some of the most common semantic relations used in
semantic networks.
• IS-A (ISA)
• IS-PART-OF
• IS-SUBSET-OF
• A-KIND-OF (AKO)
1. Definitional networks
These emphasize the subtype relation between a concept type and a newly defined
subtype.
2. Assertional networks
Assertional semantic networks, also known as propositional semantic networks,
are designed to represent assertions or propositions. Figure 6.16 shows a semantic
network representation of the proposition “All men are mortal”. In the figure, node
M1 corresponds to the proposition that “All men are mortal.” Node V1 is the variable
corresponding to “all men.” Node M2 is the proposition that corresponds to the
arbitrary man being a member of the class of men.
3. Implicational networks
These use implication as the primary relation for connecting nodes.
4. Executable networks
These include some mechanism which can perform inferences, pass messages, or
search for patterns and associations.
5. Learning networks
These build or extend their representations by acquiring knowledge from examples.
6. Hybrid networks
These combine two or more of the previous techniques.
Example 1
Consider the proposition: “John gave the book to Mary.” This is a proposition involving
three elements.
This proposition can be considered as an instance of the category of events called
“Give”. Let us call this instance EV7. There are three components to this event,
namely, an agent (the giver), a beneficiary and an object. The object “the book” is a
particular instance of the category “Book”. Let us denote this particular instance of
Book by BK23. The proposition can now be represented by the semantic net shown in
Figure 6.21.
Figure 6.21: A semantic network representing the sentence: “John gave the book to Mary.”
Example 2
Consider the proposition: “The score in Cubs vs. Dodgers game was 5 - 3.” This is also a
proposition involving three elements. This may be represented as a semantic network as
in Figure 6.22. In the figure, “G23” is the particular instance of the category “Game”
played by Cubs and Dodgers.
Figure 6.22: A semantic network representing the sentence: “The score in Cubs vs. Dodgers
game was 5 - 3.”
6.7.2 Disadvantages
1. There is no standard definition of link names.
5. Nodes that represent classes and nodes that represent individual objects are not
distinguished.
6.8 Frames
6.8.1 Definition
A frame is a collection of attributes and possible values that describe some entity in the
world. A frame system is a collection of frames that are connected to each other by the
fact that the value of an attribute in one frame may be another frame.
Remarks
1. The basic characteristic of a frame is that it represents related knowledge about a
narrow subject.
2. A frame system is a good choice for describing a mechanical device, for example a
car.
6.8. FRAMES 113
3. Just as with semantic nets, there are no standards for defining frame-based systems.
4. A frame is analogous to a record structure: the slots and slot fillers of a frame
correspond to the fields and values of a record.
5. A frame is basically a group of slots and fillers that defines a stereotypical object.
6. A frame is also known as slot-filler knowledge representation in artificial
intelligence.
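As the remarks suggest, a frame is essentially a named record of slot-filler pairs, with frames linked through slots whose fillers name other frames. A Python sketch is given below; the ALEX and BOY frames echo Example 4, but their slot contents here are illustrative assumptions, as is the use of an "a-kind-of" slot for inheritance.

```python
# Frames as dictionaries; a slot's filler may name another frame,
# and lookups fall back to the parent frame via the "a-kind-of" link.
frames = {
    "BOY":  {"sex": "male", "age": "under 12"},
    "ALEX": {"a-kind-of": "BOY", "name": "Alex", "age": 8},
}

def get_slot(frames, frame, slot):
    """Return a slot filler, inheriting from the parent frame if absent."""
    while frame is not None:
        if slot in frames[frame]:
            return frames[frame][slot]
        frame = frames[frame].get("a-kind-of")  # climb the frame hierarchy
    return None

print(get_slot(frames, "ALEX", "age"))   # 8       (local slot)
print(get_slot(frames, "ALEX", "sex"))   # 'male'  (inherited from BOY)
```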
6.8.3 Examples
Example 1
Table 6.1 is a frame for a book.
Slot Value
Title Artificial Intelligence
Genre Computer Science
Author Stuart Russell and Peter Norvig
Edition Third Edition
Year 2010
Page 1152
Example 2
Table 6.2 is the frame representation of the statement: “Peter is an engineer as a profession,
and his age is 25. He lives in city London, and the country is England.”
Slot Value
Name Peter
Profession Engineer
Age 25
City London
Country England
Marital status Single
Weight 78
Example 3
Table 6.3 displays another frame. In this frame, the car is the object, the slot name is the
attribute, and the filler is the value.
Slots Fillers
Manufacturer General Motors
Model Chevrolet Caprice
Year 1979
Transmission Automatic
Tyres 4
Colour Blue
Engine Petrol
Example 4
Table 6.4 defines a frame with name ALEX. The various slots in the frame, their names,
their values and their types are all specified in the table.
The parent frame BOY of the frame named ALEX is shown in Table 6.5.
Example 5
Figure 6.23 shows a simplified frame system. There are two types of attributes that can
be associated with a class or set: those that are about the set itself and those inherited by
each element of the set. In the figure, the latter type of attributes are prefixed by an
asterisk (*).
One of the most influential early frame languages was KL-ONE. KL-ONE spawned several
subsequent frame languages. One of the most widely used successors to KL-ONE was
the Loom language developed by Robert MacGregor at the Information Sciences
Institute.
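The slot-and-filler structure with parent frames described above can be sketched in a few lines of code. The following Python sketch is purely illustrative: the class design and the BOY/ALEX slot values are our own assumptions, not part of KL-ONE, Loom, or any standard frame language.

```python
class Frame:
    """A frame: a named collection of slots, optionally linked to a parent frame."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent      # another Frame, e.g. ALEX's parent frame BOY
        self.slots = dict(slots)  # slot name -> filler

    def get(self, slot):
        # Look up a slot locally; if absent, inherit the filler from the parent.
        if slot in self.slots:
            return self.slots[slot]
        if self.parent is not None:
            return self.parent.get(slot)
        return None

# A parent frame and a child frame, in the spirit of the BOY/ALEX example.
boy = Frame("BOY", isa="Person", sex="Male")
alex = Frame("ALEX", parent=boy, age=10, home="Delhi")

print(alex.get("age"))   # own slot -> 10
print(alex.get("sex"))   # inherited from BOY -> Male
```

The inheritance lookup in `get` is what makes a collection of frames a frame *system*: the value of the `parent` attribute of one frame is itself another frame.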
“John ran.” may be represented as John ⇐⇒ ran, an instance of the basic dependency
PP ⇐⇒ ACT
Example 2
Consider the English
sentence:
“John gave Mary a book.”
John ⇐⇒(p) gave ←o− book
Now, consider the action “gave”. It has two meanings: simply a physical transfer, or an
ownership transfer. In the former case it is an example of PTRANS. In the latter case it is
an abstract transfer which in CD theory is indicated by ATRANS. Assuming that it is the
latter, we may represent “John gave a book.” as
John ⇐⇒(p) ATRANS ←o− book
Next, let us consider the dependency of “Mary” on the other concepts in the sentence.
The dependency of “Mary” on “John” is one of a “recipient-donor” relationship and the
“book” is dependent on this relation. So the dependency can be represented as follows:
6.9. CONCEPTUAL DEPENDENCY 118
book ←R− (to: Mary, from: John)
In the diagram, R indicates a recipient-donor relationship. Putting all the components to-
gether, we obtain the CD representation of the sentence “John gave Mary a book.” as
shown below:
John ⇐⇒(p) ATRANS ←o− book ←R− (to: Mary, from: John)
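A conceptualisation like this can also be held in a simple data structure. The Python sketch below is one possible encoding; the field names (`actor`, `obj`, `recipient`, `donor`, `tense`) are our own choices, not notation fixed by CD theory.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Conceptualisation:
    actor: str                       # the PP performing the ACT
    action: str                      # one of the 11 primitive ACTs, e.g. ATRANS
    obj: str                         # filler of the objective (o) case
    recipient: Optional[str] = None  # the "to" part of the recipient (R) case
    donor: Optional[str] = None      # the "from" part of the R case
    tense: str = "p"                 # conceptual tense; "p" means past

# "John gave Mary a book." as an ownership transfer (ATRANS).
gave = Conceptualisation(actor="John", action="ATRANS", obj="book",
                         recipient="Mary", donor="John")
print(gave.action, gave.recipient)  # ATRANS Mary
```

Because the structure records the underlying concepts rather than the surface words, the same object would also represent "Mary was given a book by John."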
Example 3
Consider the English sentence:
“John ate the ice cream with a spoon.”
The verb “ate” which is the past tense of “eat” is conceptualised as “ingesting” or “taking
in” and is indicated by the primitive action INGEST. In the sentence, the spoon is the
instrument of the action. In CD theory, an instrument of action is indicated by the letter
“I”. The sentence can be represented in the following form.
John ⇐⇒(p) INGEST ←o− ice cream
        ↑I
John ⇐⇒ do ←o− spoon
Remark
In any domain in which we build a conceptual representation system, we will have to
choose an appropriate level for primitive actions.
Primitive Meaning
ACT Real world actions. There are only eleven of these ACTs
(see Table 6.7).
PP Real world objects (picture producers). Only physical
objects are PPs.
AA Modifiers of actions (action aiders) (Attributes of actions)
PA Attributes of an object (picture aiders). PAs take the form:
STATE(VALUE). That is, a PA is an attribute characteristic
(like color or size) plus a value for that characteristic (red or
10 cm).
T Times
LOC Location
3. Conceptual tenses
There are definite symbols in CD theory to indicate the tenses of verbs; for example,
“p” indicates the past tense. The available symbols are given in Table 6.8.
4. Indicators of dependencies
Symbol Meaning
−→, ←− Direction of dependency
⇐⇒ Two-way link between the actor (PP) and the action (ACT)
⇐⇒ Dependency between a PP and a PA
Symbol Meaning
o Objective case
R Recipient case
I Instrumental case (e.g., eat with a spoon)
D Directive case (e.g., going home)
6. States
States of objects are described by scales which have numerical values. For example, the
state of health of a person is represented by a numerical value which goes from −10 to
+10, “−10” indicating “dead” and “+10” indicating “perfect health”. The intermediate
values may represent various states of health as indicated below:
Various other states like the states of “fear”, “anger”, “mental state” (sad, happy, etc.),
hunger, etc. have also been defined in the CD theory with appropriate numerical values.
There are also states that have absolute measures like, “length”, “mass”, “weight”,
“speed”, “size”, etc. Further, there are also states which indicate relationships between
objects like “possession” indicated by “POSS” in CD theory.
6.9.3 Rules
The conceptual categories combine in certain specified ways. The rules for combination
of conceptual categories are called the conceptual syntax rules. These are used to
represent natural language statements. The various rules are summarised in Table 6.11.
7. John ⇐⇒(p) ATRANS ←o− book ←R− (to: John, from: Mary)
“John took the book from Mary.”
8. John ⇐⇒(p) INGEST ←o− ice cream, with instrument John ⇐⇒ do ←o− spoon
“John ate ice cream with a spoon.”
9. John ⇐⇒(p) PTRANS ←o− fertilizer ←D− (to: field, from: bag)
“John fertilized the field.” (“D” denotes the directive case.)
10. plants ⇐⇒ (size = x → size > x)
“The plants grew.”
2. CD can act as Interlingua and can facilitate translation from one language to
another based on the idea of the statement and not just translation of the words used
in the statement.
3. We can fill missing pieces in the fixed structures by fetching them from the context.
Disadvantages
1. It is only a theory of representation of events.
6. The set of 11 primitive acts is incomplete. All knowledge must be expressed in terms
of these acts, which is sometimes inefficient and even impossible.
8. In the generalized representation, the finer meaning is lost. For example there is a
difference between giving and gifting something which is not captured by CD.
15. Give a CD representation of the sentence: “Mohan took the book from Laila.”
16. Express the statement “He gave the pen” through conceptual dependency.
13. What do you understand by conceptual dependency? Give the conceptual depen-
dency representations of the following sentences:
In the last chapter we discussed knowledge representation using semantic networks and
conceptual dependency diagrams. In the present chapter we consider a third method of
knowledge representation, namely, representation using logic, especially first order
predicate logic.
In the first section we give a quick review of the basic ideas of propositional logic as
a preparation for studying first order predicate logic. The chapter concludes with a
summary of the various knowledge representation methods.
Examples
1. The statement “New Delhi is the capital of India.” is a proposition and its truth
value is True.
2. The statement “India became a republic in 1947.” is a proposition and its truth
value is False.
3. The sentence “It may rain tomorrow.” is not a proposition because it has no
definite truth value.
7.1. PROPOSITIONAL LOGIC 128
1. Negation
The negation of a proposition P is denoted by ¬P . By definition, the truth value of ¬P is
False if the truth value of P is True and the truth value of ¬P is True if the truth value of
P is false. The truth table of ¬P is exhibited in Table 7.1.
P ¬P
T F
F T
2. Disjunction
The disjunction of two propositions P and Q is denoted by P ∨ Q. The truth value of
P ∨ Q is True if P is True, or Q is True, or both P and Q are True. The truth value of
P ∨ Q is False in all other cases. The truth values are given in Table 7.2.
P Q P ∨Q
T T T
T F T
F T T
F F F
3. Conjunction
The conjunction of two propositions P and Q is denoted by P ∧ Q. The truth value
of P ∧ Q is True only if both P and Q are True. The truth value of P ∧ Q is False in all
other cases. The truth values are given in Table 7.3.
P Q P ∧Q
T T T
T F F
F T F
F F F
4. Implication
The implication of proposition Q from P is denoted by P ⇒ Q. The truth value of P ⇒
Q is False only if P is True and Q is False. The truth value of P ⇒ Q is True in all
other cases. The truth values are given in Table 7.4.
P Q P ⇒Q
T T T
T F F
F T T
F F T
5. Biconditional
The biconditional of two propositions P and Q is denoted by P ⇔ Q. The truth value of
P ⇔ Q is True when P and Q have the same truth values, and False otherwise. The truth
values are given in Table 7.5.
P Q P ⇔Q
T T T
T F F
F T F
F F T
Examples
For example it can be seen that the expression (¬P ) ∨ (P ∨ Q) is a tautology. We
prove this by constructing a truth table of the expression.
P Q ¬P P ∨ Q (¬P ) ∨ (P ∨ Q)
T T F T T
T F F T T
F T T T T
F F T F T
A similar truth table shows that the expression ¬P ∧ (P ∧ Q) is False under every
assignment of truth values; such an expression is called a contradiction.
P Q ¬P P ∧ Q ¬P ∧ (P ∧ Q)
T T F T F
T F F F F
F T T F F
F F T F F
7.1.4 Identities
Two logical expressions X and Y are said to be logically equivalent if the expression
X ⇔ Y is a tautology. If X and Y are logically equivalent we write X ≡ Y and say that
X ≡ Y is a logical identity.
Example
Prove the identity:
(P ⇒ Q) ≡ ((¬P ) ∨ Q)
We have to show that the expression
(P ⇒ Q) ⇔ ((¬P ) ∨ Q)
P Q P ⇒Q ¬P (¬P ∨ Q) (P ⇒ Q) ⇔ ((¬P ) ∨ Q)
T T T F T T
T F F F F T
F T T T T T
F F T T T T
1. P ∨ (¬P ) ≡ T
2. P ∧ (¬P ) ≡ F
3. P ∨ P ≡ P
4. P ∧ P ≡ P
5. P ∨ Q ≡ Q ∨ P
6. P ∧ Q ≡ Q ∧ P
7. P ∨ (Q ∨ R) ≡ (P ∨ Q) ∨ R
8. P ∧ (Q ∧ R) ≡ (P ∧ Q) ∧ R
9. (P ⇒ Q) ≡ ((¬P ) ∨ Q)
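Identities such as these can be verified mechanically by enumerating all truth assignments, exactly as in the truth-table proofs above. A small Python sketch (the helper names `is_tautology` and `implies` are our own):

```python
from itertools import product

def is_tautology(expr, num_vars):
    """expr is a function of boolean arguments; return True if expr evaluates
    to True under every assignment of truth values to its variables."""
    return all(expr(*values) for values in product([True, False], repeat=num_vars))

def implies(p, q):
    # Truth table of P => Q: False only when P is True and Q is False.
    return (not p) or q

# Identity 9: (P => Q) is logically equivalent to ((not P) or Q).
print(is_tautology(lambda p, q: implies(p, q) == ((not p) or q), 2))  # True

# The tautology (not P) or (P or Q) from Section 7.1.
print(is_tautology(lambda p, q: (not p) or (p or q), 2))  # True
```

The same helper decides logical equivalence, since X ≡ Y exactly when X ⇔ Y is a tautology.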
7.1.5 Inferences
Arguments
An argument in propositional logic is a sequence of propositions. All but the final
proposition are called premises and the final proposition is called the conclusion. An
argument is valid if the truth of all its premises implies that the conclusion is true.
An example of an argument is given in Figure 7.1. It means that if the statements P
∨Q, P ⇒ R and Q ⇒ R are true then R is also true.
P ∨Q
Premises
P⇒R
Q⇒R
Conclusion ∴ R
P
P ⇒Q
∴ Q
This well-known rule of inference is known as “modus ponens”. Table 7.9 lists the
various rules of inference and their names.
¬Q
2 P ⇒Q Modus tollens
∴ ¬P
P ⇒Q
3 Q⇒R Hypothetical syllogism
∴ P ⇒R
P ∨Q
4 ¬P Disjunctive syllogism
∴ Q
P
5 Addition
∴ P ∨Q
P ∧Q
6 Simplification
∴ P
P
7 Q Conjunction
∴ P ∧Q
P ∨Q
8 ¬P ∨ R Resolution
∴ Q∨R
Examples
Sentences Examples
Atomic sentences P, Q, R, . . .
Literals P, ¬P, Q, ¬Q, R, ¬R, . . .
Clauses P ∨ Q ∨ ¬R, ¬Q ∨ R, . . .
CNF (P ∨ Q ∨ ¬R) ∧ (¬P ∨ Q)
Conversion to CNF
Every sentence of propositional logic is logically equivalent to a conjunction of clauses. We
now describe a procedure for converting to CNF.
1. Eliminate “⇔” by replacing α ⇔ β with (α ⇒ β) ∧ (β ⇒ α).
2. Eliminate “⇒” by replacing α ⇒ β with ¬α ∨ β.
3. Move ¬ inwards by repeated application of the equivalences:
¬(¬α) ≡ α
¬(α ∧ β) ≡ (¬α ∨ ¬β)
¬(α ∨ β) ≡ (¬α ∧ ¬β)
4. Apply the distributivity laws distributing ∨ over ∧ wherever possible, that is, we
apply the equivalence:
α ∨ (β ∧ γ) ≡ (α ∨ β) ∧ (α ∨ γ).
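Applying these steps to the sentence P ⇔ (Q ∨ R) considered below yields the CNF (¬P ∨ Q ∨ R) ∧ (P ∨ ¬Q) ∧ (P ∨ ¬R), and the equivalence can be confirmed by brute force over the eight truth assignments. A quick Python check (the function names are ours):

```python
from itertools import product

def original(p, q, r):
    # P <=> (Q or R): True exactly when both sides have the same truth value.
    return p == (q or r)

def cnf(p, q, r):
    # The conjunction of clauses produced by the conversion procedure.
    return ((not p) or q or r) and (p or not q) and (p or not r)

# The two forms agree on every one of the eight truth assignments.
print(all(original(p, q, r) == cnf(p, q, r)
          for p, q, r in product([True, False], repeat=3)))  # True
```

Such a brute-force check is feasible for any propositional sentence with a small number of variables, since n variables give only 2^n assignments.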
Example
Transform the following sentence into CNF:
P ⇔ (Q ∨ R)
Solution
Given sentence:
P ⇔ (Q ∨ R)
Eliminating “⇔” we get the equivalent sentence:
(P ⇒ (Q ∨ R)) ∧ ((Q ∨ R) ⇒ P )
By associativity we have
1. Select two clauses that contain conflicting terms (that is, one term in one of the
clauses is the negation of a term in the other clause).
What makes the resolution proof technique so important is that this one simple technique
is capable of performing the same kinds of reasoning tasks as modus ponens, modus
tollens, syllogisms and many other logical operations.
A resolution refutation proof is proof by contradiction using resolution. Like every
proof by contradiction, we begin by assuming the opposite of what we wish to prove, and
then show that this “fact” would lead to a contradiction.
The algorithm of the resolution-refutation proof technique is given below.
Step 1. Convert all the propositions of the premises F to clause form.
Step 2. Negate P and convert the result to clause form. Add it to the set of clauses
obtained in Step 1.
Step 3. Repeat until either a contradiction is found, no progress can be made, or a
predetermined amount of effort has been expended.
If a contradiction has been found, we conclude that P is a valid conclusion from the
premises F . Otherwise, that is if the algorithm terminates in such a status that no progress
can be made, we conclude that P is not a valid conclusion from F .
Example
Using resolution-refutation determine whether the argument shown in Figure 7.1 is valid.
7.2. FIRST ORDER PREDICATE LOGIC (FOPL) 136
Solution
The premises are
P ∨ Q, P ⇒ R, Q ⇒ R,
Converting these to the clause form we get
P ∨ Q, ¬P ∨ R, ¬Q ∨ R.
Adding the negation ¬R of the conclusion to these, we get the clause set:
P ∨ Q, ¬P ∨ R, ¬Q ∨ R, ¬R.
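The refutation itself can be carried out mechanically. The sketch below saturates a propositional clause set with resolvents until the empty clause appears; the clause encoding (strings such as `'P'` and `'~P'` inside frozensets) is our own convention.

```python
from itertools import combinations

def resolve(c1, c2):
    """All resolvents of two clauses; a clause is a frozenset of literals,
    a literal being a string such as 'P' or '~P'."""
    out = []
    for lit in c1:
        neg = lit[1:] if lit.startswith("~") else "~" + lit
        if neg in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {neg})))
    return out

def refutes(clauses):
    """Saturate the clause set; return True if the empty clause is derived."""
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:
                    return True          # empty clause: contradiction found
                new.add(r)
        if new <= clauses:
            return False                 # no progress: no refutation exists
        clauses |= new

# Premises P∨Q, ¬P∨R, ¬Q∨R together with the negated conclusion ¬R.
kb = [frozenset({"P", "Q"}), frozenset({"~P", "R"}),
      frozenset({"~Q", "R"}), frozenset({"~R"})]
print(refutes(kb))  # True: the argument of Figure 7.1 is valid
```

Since the empty clause is derivable, the negated conclusion is inconsistent with the premises, which is exactly what the refutation proof requires.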
To study the first of the three sentences given above, we consider the sentence “x
wears boots”. Here we assume that x is the name of an unspecified arbitrary child in
the photograph. As such it has no truth value because we have not specified what x is. If
we replace x by the name of a child in the photograph, say “John”, we get the
sentence “John wears boots.” which has a truth value, namely True. The statement
“All children wear boots” is true if all sentences obtained by replacing x in “x wears
boots” by the names of each and every child in the photograph are true. This is indeed
the case and so we conclude that the statement “All children wear boots” is true.
To discuss the third of the three statements, we consider the sentence “x is a girl”.
For the statement “There is a girl among the children” to be true it is enough that there is
at least one name, say “Sara”, among the set of names of children which makes the
sentence “Sara is a girl” true. This is indeed the case and so the truth value of the
sentence “There is a girl among the children” is True.
Sentences like “x wears boots” and “x is a girl” are called predicates. The predicate
“x wears boots” may be symbolically represented as “wearsboots(x)”, or
“WearsBoots(x)” or simply “W (x)”. Similarly, the sentence “x is a girl” may be
represented as “girl(x)” or “Girl(x)” or simply “G(x)”.
Figure 7.3: A model containing five objects, two binary relations, three unary relations
(indicated by labels on the objects), and one unary function, left-leg.
7.2.2 Predicates
A predicate is a statement (specifying some relation) that contains variables (predicate
variables), and they may be true or false depending on the values of these variables. The
domain is the set of values that variables can take.
Definition
A predicate is a sentence that contains a finite number of variables and becomes a
statement when specific values are substituted for the variables.
Examples
The following are predicates in appropriate domains:
1. “x is a boy.”
2. “x wears boots.”
3. “x is a king.”
• The symbols are of three kinds: constant symbols, which stand for objects;
predicate symbols, which stand for relations; and function symbols, which stand
for functions.
• Each predicate and function symbol comes with an arity that fixes the number of
arguments.
A predicate with arity 1 is called a unary predicate. The predicate “x is a king” is a unary
predicate. A predicate with arity 2 is called a binary predicate. The predicate “x is a
brother of y” is a binary predicate. In general, a predicate with arity n is called an n-ary
predicate.
Notation
We adopt the convention that the symbols begin with uppercase letters. For example, for the
world specified in Figure 7.3, we might use the constant symbols Richard and John; the
predicate symbols Brother, OnHead, Person, King, and Crown; and the function symbol
LeftLeg. The choice of names is entirely up to the user.
7.3 Quantifiers
7.3.1 Universal quantification
Consider the statement:
In the notations of first-order logic we write the sentence “All kings are persons.” as
follows:
∀ x King(x) → Person(x)
The symbol ∀ is usually pronounced “For all . . .” and is called the universal quantifier.
Thus, the above sentence says:
Crown(x) : “x is a crown”
OnHead(x, John) : “x is on the head of John.”
To say that “King John has a crown on his head”, we write
∃ x Crown(x) ∧ OnHead(x, John)
∃x is pronounced “There exists an x such that . . .” or “For some x . . .”. The symbol “∃”
is called the existential quantifier. Notice that, by our definition, the sentence would
also be true in a model in which King John was wearing two crowns. This is entirely
consistent with the original sentence “King John has a crown on his head.”
Intuitively, the sentence ∃ x P says that P is true for at least one object x.
• “Everybody loves somebody” means that for every person, there is someone that
person loves:
∀x∃y Loves(x, y)
3. “If two persons speak the same language, they understand each other.”
∀ x, y, l SpeaksLanguage(x, l) ∧ SpeaksLanguage(y, l)
⇒ Understands(x, y) ∧ Understands(y, x)
4. “There is a barber who shaves all men in town who do not shave themselves.”
∃x Barber(x) ∧ (∀ y Man(y) ∧ ¬ Shaves(y, y) ⇒ Shaves(x, y))
7. “The best score in French is always higher than the best score in German in all
semesters.”
∀ s ∃ x ∀ y Score(x, French, s) > Score(y, German, s)
(Score(x, g, s) denotes the score obtained by student x in subject g in semester s.)
∀x Person(x) ∧ Born(x)∧
4. If A and B are wffs, then so are ¬A, (A ∨ B), (A ∧ B), (A ⇒ B), and A ⇔ B.
2. Eliminate implications
Replace a ⇒ b by ¬a ∨ b.
3. Move ¬ inwards
Make replacements as follows:
¬(a ∨ b) by ¬a ∧ ¬b
¬(a ∧ b) by ¬a ∨ ¬b
¬ ∀ x P by ∃ x ¬ P
¬ ∃ x P by ∀ x ¬ P
4. Standardize variables
Change variables so that each quantifier binds a unique variable.
For example, for sentences like (∃ x P (x))∨(∃ x Q(x)) which use the same
variable name twice, change the name of one of the variables.
5. Skolemize
Skolemization is the process of removing existential quantifiers by elimination. (The
terminology is in honour of Thoralf Albert Skolem (1887 - 1963), a Norwegian
mathematician who worked in mathematical logic and set theory.)
7.4. WELL-FORMED FORMULAS IN FOPL 143
(a) Existentially quantified variables which are not inside the scope of a universal
quantifier are replaced by creating new constants. For example, ∃ x P (x) is
changed to P (c), where c is a new constant.
(b) Replace every existentially quantified variable y with a term f (x1, . . . , xn)
whose function symbol f is new. The variables of this term are the variables
that are universally quantified and whose quantifiers precede that of y.
• For example, in the formula ∀ x ∃ y ∀ z P (x, y, z), Skolemization replaces
y with f (x), where f is a new function symbol, and removes the
quantifi- cation over y. The resulting formula is ∀x ∀z P (x, f (x), z).
The Skolem term f (x) contains x but not z, because the quantifier to be
removed ∃ y is in the scope of ∀ x, but not in that of ∀z
• As a second example, when we Skolemize
∀ x ∀ y ∃ z ∀ u ∃ v P (x, y, z, u, v)
we get
∀ x ∀ y ∀ u P (x, y, f (x, y), u, g(x, y, u))
where f and g are new Skolem function symbols.
6. Drop universal quantifiers
At this stage all remaining variables are universally quantified, so the universal
quantifiers may be dropped.
7. Distribute ∨ over ∧
Convert the resulting expression into a conjunction of disjunctions by using
distribu- tivity.
Replace a ∨ (b ∧ c) by (a ∨ b) ∧ (a ∨ c).
8. CNF of the given wff
The sentence is now in CNF.
7.4.4 Examples
Example 1
Convert the following wff to the clausal form:
Solution
Step 1. Eliminate implications
∀ x ¬[∀ y ¬A(y) ∨ L(x, y)] ∨ [∃ y L(y, x)]
that is,
∀ x [¬∀ y ¬A(y) ∨ L(x, y)] ∨ [∃ y L(y, x)]
Example 2
Convert the following wff to the clausal form:
∀ x ([E(x)] ⇔ [∀ y E(T (x, y))])
Solution
Step 1. Remove biconditional
∀ x ([E(x)] ⇒ [∀y E(T (x, y))]) ∧ ([∀y E(T (x, y))] ⇒ [E(x)])
Step 2. Remove implications
∀ x (¬[E(x)] ∨ [∀y E(T (x, y))]) ∧ (¬[∀y E(T (x, y))] ∨ [E(x)])
Step 3. Move ¬ inwards
∀ x (¬[E(x)] ∨ [∀y E(T (x, y))]) ∧ ([∃y ¬ E(T (x, y))] ∨ [E(x)])
Step 4. Standardize variables
∀ x (¬[E(x)] ∨ [∀y E(T (x, y))]) ∧ ([∃z ¬ E(T (x, z))] ∨ [E(x)])
Step 5. Skolemize
∀ x (¬[E(x)] ∨ [∀y E(T (x, y))]) ∧ ([¬ E(T (x, F (x)))] ∨ [E(x)])
where F (x) is a Skolem function.
Step 6. Drop universal quantifiers
(¬[E(x)] ∨ [E(T (x, y))]) ∧ ([¬ E(T (x, F (x)))] ∨ [E(x)])
Step 7. This is the CNF of the given sentence.
7.5. INFERENCE RULES IN FOPL 145
Example 3
Transform to the CNF the following wff:
This is a symbolic representation of the sentence: “All Romans who know Marcus
either hate Caesar or think that anyone who hates anyone is crazy.” (see RKN, p.109) To
simplify the writing of the various steps we shall write R for “Romans”, K for “know”, H
for “hate”, T for “thinkcrazy”, M for “Marcus” and C for “Caesar”.
In the simplified notations, the given sentence can be written as
∀ x [R(x) ∧ K(x, M )] ⇒ [H(x, C) ∨ (∀ y (∃ z H(y, z)) ⇒ T (x, y))]
Solution
Step 1. Remove implications
∀ x ¬[R(x) ∧ K(x, M )] ∨ [H(x, C) ∨ (∀ y ¬ (∃ z H(y, z)) ∨ T (x, y))]
Step 4. Skolemize
No change.
Example
Consider the following predicate formula:
If this is true then we can infer that the following are all true.
Example
For example, from the sentence
(Q1 ∧ Q2 ∧ · · · ∧ QN ) ⇒ R,
Remarks
1. Recall the modus ponens inference rule of propositional logic:
P
P ⇒Q
∴ Q
This means that we are required to replace all occurrences of variable symbol vi by
term ti. Substitutions are made in left-to-right order in the list.
For example, let the substitutions be
θ = {x/IceCream, y/Ziggy}
Unification
Unification is the process of finding substitutions that make different logical expressions
look identical.
For example, consider the logical expressions P (x, y) and P (a, f (z)). If we
substitute x by a and y by f (z) in the first expression, the first expression will be identical
to the second expression. We say that the unifier of the expressions is the set
θ = {a/x, f (z)/y}.
As another example, consider the sentences:
Since both sentences contain the same variable, we apply standardisation and change the x
in Knows(x, Elizabeth) to y and get the following sentences:
Now, it is easy to see that the two sentences are unified by the substitutions
θ = {x/Elizabeth, y/John}.
For unification of two sentences, the predicate symbol must be same and number of
arguments must also be the same.
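The unification procedure can be sketched as a short recursive function. The term encoding below (lowercase strings for variables, capitalized strings for constants, tuples for compound terms) is our own convention, and the sketch omits the occurs check that a production unifier needs.

```python
def is_var(t):
    # Convention: a lowercase string is a variable; anything else is a constant
    # or, if it is a tuple, a compound term (functor, arg1, ...).
    return isinstance(t, str) and t[:1].islower()

def unify(x, y, theta=None):
    """Return a substitution (dict: variable -> term) unifying x and y, or None."""
    if theta is None:
        theta = {}
    if x == y:
        return theta
    if is_var(x):
        return unify_var(x, y, theta)
    if is_var(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        # Same predicate/function symbol and same number of arguments required.
        for xi, yi in zip(x, y):
            theta = unify(xi, yi, theta)
            if theta is None:
                return None
        return theta
    return None

def unify_var(v, t, theta):
    if v in theta:
        return unify(theta[v], t, theta)
    if is_var(t) and t in theta:
        return unify(v, theta[t], theta)
    return {**theta, v: t}   # NOTE: no occurs check in this sketch

# P(x, y) and P(A, F(z)) unify with x -> A, y -> F(z).
print(unify(("P", "x", "y"), ("P", "A", ("F", "z"))))
```

Running the same function on the standardised sentences `Knows(John, x)` and `Knows(y, Elizabeth)` produces the substitution {y: John, x: Elizabeth}, matching the example above.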
7.6. INFERENCE METHODS IN FOPL 148
Example
Consider the clauses:
Let us denote
P1 = P (x, f (a)), Q1 = ¬ P (z, f (a)).
We have
¬ Q1 = ¬(¬ P (z, f (a))) = P (z, f (a)).
Clearly, P1 and ¬ Q1 unify under the substitution
θ = {x/z}.
Hence using the resolution inference rule we get the resolvent clause
Step 2. Negate P and convert the result to clause form. Add it to the set of clauses
obtained in Step 1.
Step 3. Repeat until either a contradiction is found, no progress can be made, or a pre-
determined effort has been expended.
If a contradiction has been found, we conclude that P is a valid conclusion from the
premises F . Otherwise, that is if the algorithm terminates in such a status that no progress
can be made, we conclude that P is not a valid conclusion from F .
Remark
If the clauses to be resolved are chosen in certain systematic ways, the resolution
procedure will find a contradiction if one exists. However it may take a long time.
7.6.3 Examples
The various steps in applying the above algorithm to a given simple problem can be
summarised as follows:
Example 1
Prove the validity of the following argument:
3. Josephine is a cat.
Solution
First, we express the given sentences and the negated goal G in first-order logic in the CNF.
1. ¬ cat(x) ∨ likes(x, fish)
3. cat(jo)
The resolution proof that Josephine eats fish is given in Figure 7.4.
The final step of the proof resolves ¬ cat(jo) with cat(jo), yielding a contradiction.
Example 2
Prove that the conclusion
hate(Marcus, Caesar)
can be derived from the following premises:
1. man(Marcus)
2. Pompeian(Marcus)
3. ¬ Pompeian(x1) ∨ Roman(x1)
4. ruler(Caesar)
6. loyalto(x3, f (x3))
8. tryassassinate(Marcus, Caesar)
Solution
The various stages in the derivation are shown in Figure 7.5.
Example 3
Use the resolution-refutation method to check the validity of the following argument:
Everyone who loves all animals is loved by someone.
Anyone who kills an animal is loved by no one.
Jack loves all animals.
Either Jack or Curiosity killed the cat, who is named Tuna.
Therefore, Curiosity killed the cat.
Solution
First, we express the given sentences, some background knowledge, and the negated goal
G in first-order logic:
C. ∀ x Animal(x) ⇒ Loves(Jack, x)
E. Cat(Tuna)
F. ∀ x Cat(x) ⇒ Animal(x)
4. C: ¬Animal(x) ∨ Loves(Jack, x)
6. E: Cat(Tuna)
8. G: ¬Kills(Curiosity, Tuna)
The resolution proof that Curiosity killed the cat is given in Figure 7.6.
Loves(G(Jack), Jack)
Contradiction
Example 4
Given the premises:
“The law says that it is a crime for an American to sell weapons to hostile
nations. The country Nono, an enemy of America, has some missiles, and all
of its missiles were sold to it by Colonel West, who is American.”
Solution
First, we represent the facts as first-order clauses.
8. “West is a criminal.”
Criminal(West)
4. ¬Missile(x) ∨ Weapon(x)
5. Missile(M 1)
6. Owns(Nono, M 1)
7.7. SUMMARY OF KNOWLEDGE REPRESENTATION METHODS 154
7. American(West)
8. Enemy(Nono, America)
9. Criminal(West)
The various stages of the resolution proof are shown in Figure 7.7.
Am(West)    ¬Am(West) ∨ ¬We(y) ∨ ¬Se(West, y, z) ∨ ¬Ho(z)
¬Mi(x) ∨ We(x)    ¬We(y) ∨ ¬Se(West, y, z) ∨ ¬Ho(z)
Mi(M1)    ¬Mi(y) ∨ ¬Se(West, y, z) ∨ ¬Ho(z)
¬Mi(x) ∨ ¬Ow(Nono, x) ∨ Se(West, x, Nono)    ¬Se(West, M1, z) ∨ ¬Ho(z)
Mi(M1)    ¬Mi(M1) ∨ ¬Ow(Nono, M1) ∨ ¬Ho(Nono)
Ow(Nono, M1)    ¬Ow(Nono, M1) ∨ ¬Ho(Nono)
Contradiction
database, but it also enables an intelligent machine to learn from that knowledge and
experiences so that it can behave intelligently like a human.
A knowledge representation structure (or, knowledge representation system) is a par-
ticular set of definitions, rules and procedures for setting up a representation that captures
information and knowledge about the world.
There are several knowledge representation systems such as the following:
(b) Frames
2. Semantic networks
Semantic networks are graphs in which vertices or nodes represent concepts or objects,
and the edges represent relations between the nodes. It is a knowledge representation
format that can be used to store the “meanings” of words so that human-like use of these
meanings is possible.
Figure 7.8 shows a semantic network with four objects (John, Mary, 1 and 2) and four
categories (Mammals, Persons, FemalePersons and MalePersons).
3. Frames
The frame contains information on how to use the frame, what to expect next, and what
to do when these expectations are not met. Some information in the frame may remain
unchanged while others may change. The changing information are stored in “terminals”.
Terminals can be considered as variables. Different frames may share the same terminals.
Table 7.12 defines a frame with name ALEX. The various slots in the frame, their
names, their values and their types are all specified in the table.
4. Conceptual dependency
Conceptual dependency (CD) is a theory of knowledge representation to represent
natural language sentences in such way that the representation is independent of the
language in which the sentences are stated and it facilitates drawing inferences from
sentences.
The principal conceptual categories in CD theory are “real world actions” (denoted
by ACT), “real world objects” (PP), “modifiers of actions” (AA), “attributes of objects”
(PA), “times” (T) and “location” (LOC). The CD theory recognises 11 primitive ACTs
like “MOVE” (movement of a body part by its owner), “PROPEL” (application of
physical force to an object), etc.
The English sentence:
“John ate the ice cream with a spoon.”
is represented in CD theory as in Figure 7.9.
John ⇐⇒(p) INGEST ←o− ice cream
        ↑I
John ⇐⇒ do ←o− spoon
Figure 7.9: Representation of the sentence “John ate the ice cream with a spoon.” in CD
theory
Logic makes statements about the world which are true (or false) if the state of affairs
it represents is the case (or not the case). Compared to natural languages and
programming languages logic combines the advantages of natural languages and formal
languages. Logic is concise, unambiguous, context insensitive, expressive and effective
for inferences.
A logic is defined by the syntax that describes the possible configurations that
constitute sentences, semantics that determines what facts in the world the sentences refer
to and a proof theory which is a set of rules for generating new sentences that are
necessarily true given that the old sentences are true.
Two important kinds of logic are propositional logic and first-order logic. For example,
knowledge contained in the following sentences can be represented by propositional
logic:
“If the program is efficient, it executes quickly. Either the program is efficient, or it
has a bug. It does not execute quickly. Therefore, the program has a bug.”
We write
E: the program is efficient
Q: the program executes quickly
B: the program has a bug
The given argument can be represented as follows:
((E ⇒ Q) ∧ (¬E ⇒ B) ∧ ¬Q) ⇒ B
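The validity of this argument can be confirmed by enumerating the eight truth assignments of E, Q and B and checking that the implication holds in every one:

```python
from itertools import product

# ((E => Q) and (not E => B) and not Q) => B must hold for all assignments.
valid = all(
    (not (((not e) or q) and (e or b) and (not q))) or b
    for e, q, b in product([True, False], repeat=3)
)
print(valid)  # True: the argument is valid
```

Whenever all three premises are true, ¬Q forces E to be false, and then ¬E ⇒ B forces B to be true, which is what the enumeration confirms.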
The knowledge contained in the following statements can be represented using
predicate logic.
3. Josephine is a cat.
This argument can be represented using the notations of predicate logic as follows:
1. ∀ x cat(x) ⇒ likes(x, fish)
3. cat(jo)
2. From the sentence “Heads I win, tails you lose,” prove using resolution refutation
that “I win.”
8. Tony, Shi-Kuo and Ellen belong to the Hoofers Club. Every member of the
Hoofers Club is either a skier or a mountain climber or both. No mountain climber
likes rain, and all skiers like snow. Ellen dislikes whatever Tony likes and likes
whatever Tony dislikes. Tony likes rain and snow. Is there a member of the
Hoofers Club who is a mountain climber but not a skier?
9. Anyone passing his history exams and winning the lottery is happy. But anyone
who studies or is lucky can pass all his exams. John did not study but John is
lucky. Anyone who is lucky wins the lottery. Is John happy?
Use resolution refutation to show that if John is a light sleeper, then John does not
have any mice.
Chapter 8
Planning
Planning is the task of finding a procedural course of action for a system to reach its
goals while optimizing overall performance measures. It is the first and foremost activity
to achieve desired results. Planning is a fundamental property of intelligent behavior.
This makes planning an important sub-area of artificial intelligence.
Planning techniques have been applied in a variety of tasks including robotics, process
planning, web-based information gathering, autonomous agents and spacecraft
mission control.
8.1 Overview
8.1.1 Decomposition of problems
When trying to solve a complicated problem, it may be necessary to work on small pieces
of the problem separately and then combine the partial solutions at the end into a
complete problem solution.
The ability to decompose a problem into small pieces to find a solution to the
problem is important because of the following reasons.
1. The decomposition may help us to avoid having to recompute the entire problem
state when we move from one state to the next. This in turn helps us to consider only
that part of the state that may have changed.
• For example, if we move from one room to another, it does not affect the
locations of the doors and the windows in the two rooms.
• Consider the problem of guiding a robot in an ordinary house. The description
of a single state is very large since it must describe the locations of each and
every object in the house as well as that of the robot. A single action by the
robot will only change a small part of the total state. Instead of writing rules
that describe transformations of one entire state into another, we would like to
write rules that describe only the affected parts of the state description.
2. The decomposition may divide a single difficult problem into several easier
subproblems.
8.2. COMPONENTS OF A PLANNING SYSTEM 161
8.1.3 Planning
Planning is the process of decomposing a problem into appropriate subparts and of
recording and handling interactions among the subparts as they are detected during the
problem solving process.
The word planning also refers to the process of computing several steps of a
problem-solving procedure before executing them.
2. Applying rules
3. Detecting a solution
1. Isolate a set of differences between the desired goal state and the current state.
1. Describe each of the changes to the state description that the application of the rule
makes, and add a statement that everything else remains unchanged. To do this, we
may require the explicit statement of a large number of premises or axioms.
2. Describe each operator by a list of new predicates (called ADD) that the operator
causes to become true and a list of old predicates (called DELETE) that the
operator causes to become false. There is also a third list (called
PRECONDITION) which specifies the predicates that must be true for the operator
to be applied. Any predicate not included in either ADD or DELETE is assumed
to be unaffected by the rule.
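The second scheme is essentially the STRIPS operator format, and it is easy to render in code. In the sketch below the dictionary layout and the STACK(A, B) operator are our own illustrative assumptions; a state is simply a set of predicate strings.

```python
def applicable(state, op):
    """An operator may fire only if every PRECONDITION predicate holds."""
    return op["pre"] <= state

def apply_op(state, op):
    """New state = old state minus the DELETE list plus the ADD list."""
    assert applicable(state, op)
    return (state - op["delete"]) | op["add"]

# A hypothetical STACK(A, B) operator from the blocks world.
stack_a_b = {
    "pre":    {"HOLDING(A)", "CLEAR(B)"},
    "add":    {"ON(A,B)", "ARMEMPTY"},
    "delete": {"HOLDING(A)", "CLEAR(B)"},
}

s0 = {"HOLDING(A)", "CLEAR(B)", "ONTABLE(B)"}
print(sorted(apply_op(s0, stack_a_b)))
# -> ['ARMEMPTY', 'ON(A,B)', 'ONTABLE(B)']
```

Note how ONTABLE(B), which appears in neither ADD nor DELETE, carries over unchanged, which is exactly the assumption described above.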
original problem. Since the problem is only nearly decomposable, the combined solution
may not be the correct solution; it will only be an almost correct solution. The following
are some of the ways in which an almost correct solution may be repaired.
1. Throw out the almost correct solution and look for another one and hope that it is
better!
2. Consider the situation arising from the execution of the sequence of operations
specified in the proposed almost correct solution. The difference between this and
the goal will be much less than the difference between the initial state and the goal.
The problem solving system may be applied again to eliminate the difference.
3. Find exactly what went wrong and try a direct patch.
4. Apply the least commitment strategy. This means leaving the almost correct
solution incompletely specified until the last possible moment; when as much
information as possible is available, complete the specification in such a way that
no conflicts arise.
A real-world stack allows operations at one end only. For example, we can place or
remove a book or a plate from the top of the stack only. Likewise, the Stack ADT allows
all data operations at one end only. At any given time, we can only access the top
element of a stack.
8.3.5 Algorithm
1. Push the compound predicate describing the goal state into the Stack.
2. Push the individual predicates of the goal state into the Stack.
3. Loop until the Stack is empty:
(a) Pop an element E from the Stack.
(b) If E is a predicate
i. If E is True
A. Do nothing.
ii. Else
A. Push the relevant action into the Stack.
B. Push the individual predicates of the precondition of the action into the Stack.
(c) Else if E is an action “a”
i. Apply the action “a” to the current state.
ii. Add the action “a” to the plan.
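The loop above can be sketched in Python. For simplicity the sketch assumes that every unsatisfied predicate has exactly one achieving action; the tiny tea-making domain used to exercise it is invented for illustration and is not the blocks world of this section.

```python
# A simplified goal stack planning loop. Stack entries are tagged tuples:
# ("pred", p) for a single predicate, ("conj", s) for a compound goal,
# ("act", a) for an action. Each predicate has one achieving action.

def goal_stack_plan(initial, goals, achiever):
    state, plan = set(initial), []
    stack = [("conj", frozenset(goals))]          # step 1: compound goal
    stack.extend(("pred", g) for g in goals)      # step 2: individual goals
    while stack:                                  # step 3
        kind, e = stack.pop()
        if kind == "pred":
            if e in state:
                continue                          # already true: do nothing
            act = achiever[e]                     # the relevant action
            stack.append(("act", act))
            stack.append(("conj", frozenset(act["pre"])))
            stack.extend(("pred", p) for p in act["pre"])
        elif kind == "conj":
            if not e <= state:                    # a component was undone
                stack.append(("conj", e))
                stack.extend(("pred", p) for p in e - state)
        else:                                     # an action: apply and record
            state = (state - e["delete"]) | e["add"]
            plan.append(e["name"])
    return plan

boil = {"name": "boil_water", "pre": {"kettle_on"},
        "add": {"water_boiled"}, "delete": set()}
make = {"name": "make_tea", "pre": {"water_boiled"},
        "add": {"have_tea"}, "delete": {"water_boiled"}}
achiever = {"water_boiled": boil, "have_tea": make}
print(goal_stack_plan({"kettle_on"}, ["have_tea"], achiever))
```

The popped compound goal is re-checked against the current state (the "conj" branch), which is what catches interactions where achieving one subgoal undoes another.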
2. Predicates
The following predicates are used to describe states in the blocks world.
Predicate    Meaning
ON(A, B)     Block A is on block B.
ONTABLE(A)   Block A is on the table.
CLEAR(A)     Nothing is on top of block A.
HOLDING(A)   The arm is holding block A.
ARMEMPTY     The arm is holding nothing.
3. Actions
The following are the available actions in the blocks world.
Action        Meaning
STACK(A, B)   Place block A (held by the arm) on block B.
UNSTACK(A, B) Pick up block A from the top of block B.
PICKUP(A)     Pick up block A from the table.
PUTDOWN(A)    Put block A (held by the arm) down on the table.
The PRECONDITION list (the list of predicates that should be true to apply the action),
ADD list (the list of predicates that become true after the action) and DELETE list (the
list of predicates that become false after the action) associated with the various actions
are given below.
4. Problem
Given the initial and the final states of a blocks world with four blocks as shown in Figure 8.2, use the goal stack planning algorithm to obtain a plan for achieving the goal state.
Figure 8.2: A simple blocks world problem: Initial state on the left and the goal state on the
right
5. Solution
Step 1. The initial state can be described by the statement:
ON (B, A)∧ONTABLE(A)∧ONTABLE(C)∧ONTABLE(D)∧ARMEMPTY
Step 4. The goal state can be divided into four components, namely,
Of these the last two are true in the initial state. We push the first two
components to the stack GS.
ON (C, A)
ON (B, D)
ON (C, A) ∧ ON (B, D) ∧ ONTABLE(A) ∧ ONTABLE(D)
Note: The choice of the order in which the components are stacked is indeed important. For example, the two new components may be stacked in the following order also. We have chosen the order as above in a purely arbitrary manner. However, to make an informed choice, we may need additional knowledge about the world in which we are operating.
ON (B, D)
ON (C, A)
ON (C, A) ∧ ON (B, D) ∧ ONTABLE(A) ∧ ONTABLE(D)
Step 5. Consider the top item in GS, which is ON (C, A). Check whether it is true in the current state. It is not. Find an action which makes it true. The action is STACK(C, A). We replace ON (C, A) by STACK(C, A) to get the following GS.
STACK(C, A)
ON (B, D)
ON (C, A) ∧ ON (B, D) ∧ ONTABLE(A) ∧ ONTABLE(D)
Step 6. The precondition of the action STACK(C, A) is CLEAR(A) ∧ HOLDING(C). We push this compound predicate and then its individual components into GS.
CLEAR(A)
HOLDING(C)
CLEAR(A) ∧ HOLDING(C)
STACK(C, A)
ON (B, D)
ON (C, A) ∧ ON (B, D) ∧ ONTABLE(A) ∧ ONTABLE(D)
Note: As in Step 4, here also the order in which the components CLEAR(A) and HOLDING(C) are stacked has been chosen arbitrarily. The component HOLDING(C) may be stacked above CLEAR(A).
Step 7. We now check whether CLEAR(A) is true. It is not. The only operator that makes it true is UNSTACK(B, A). So we replace CLEAR(A) by UNSTACK(B, A).
UNSTACK(B, A)
HOLDING(C)
CLEAR(A) ∧ HOLDING(C)
STACK(C, A)
ON (B, D)
ON (C, A) ∧ ON (B, D) ∧ ONTABLE(A) ∧ ONTABLE(D)
Step 8. The precondition of UNSTACK(B, A) is ON (B, A) ∧ CLEAR(B) ∧ ARMEMPTY. We push this compound predicate and then its individual components into GS.
ON (B, A)
CLEAR(B)
ARMEMPTY
ON (B, A) ∧ CLEAR(B) ∧ ARMEMPTY
UNSTACK(B, A)
HOLDING(C)
CLEAR(A) ∧ HOLDING(C)
STACK(C, A)
ON (B, D)
ON (C, A) ∧ ON (B, D) ∧ ONTABLE(A) ∧ ONTABLE(D)
Step 9. The top element ON (B, A) is true and so we pop it off the stack. The next element CLEAR(B) is also true and so we pop it off as well. In the current state, the arm is not holding anything and so ARMEMPTY is also true, and we pop it off too. Since each of the components of the next statement is true, the compound statement is also true. Thus it is also popped off from the stack GS. At this stage the stack is as follows.
UNSTACK(B, A)
HOLDING(C)
CLEAR(A) ∧ HOLDING(C)
STACK(C, A)
ON (B, D)
ON (C, A) ∧ ON (B, D) ∧ ONTABLE(A) ∧ ONTABLE(D)
Step 10. The top element of the stack is UNSTACK(B, A). The preconditions for applying this action are satisfied. So we pop it off, apply it to the current state and add it to the plan. The stack GS is now as follows.
HOLDING(C)
CLEAR(A) ∧ HOLDING(C)
STACK(C, A)
ON (B, D)
ON (C, A) ∧ ON (B, D) ∧ ONTABLE(A) ∧ ONTABLE(D)
Example
• For example, let it be required to travel at short notice to Mumbai with minimum expense for an interview. The task of the highest priority is to check the availability of a flight at an affordable cost. After finding a plan to solve this problem, the person may find plans for solving the next-level problems such as preparing baggage and hiring a taxi. Attempts to solve these problems lead to problems at the third level: detailed plans for preparing the baggage and hiring a taxi. These will in turn lead to problems at the next lower level.
Learning
1. Unsupervised learning
In unsupervised learning the agent learns patterns in the input even though no explicit feedback is supplied. The most common unsupervised learning task is clustering: detecting potentially useful clusters of input examples. For example, a taxi agent might gradually develop a concept of “good traffic days” and “bad traffic days” without ever being given labeled examples of each by a teacher.
2. Reinforcement learning
In reinforcement learning the agent learns from a series of reinforcements: rewards or punishments. For example, the lack of a tip at the end of the journey gives the taxi agent an indication that it did something wrong. The two points for a win at the end of a chess game tells the agent it did something right.
3. Supervised learning
In supervised learning the agent observes some example input-output pairs and learns a
function that maps from input to output.
H. A. Simon, “Why Should Machines Learn?”, Chapter 2 in Machine Learning: An Artificial Intelligence Approach, Tioga Publishing Company, 1983.
4. Semi-supervised learning
In semi-supervised learning we are given a few labeled examples and must make what we can of a large collection of unlabeled examples. Imagine that you are trying to build a system to guess a person’s age from a photo. We gather some labeled examples by snapping pictures of people and asking their age. That is supervised learning. But in reality some of the people may have lied about their age, and uncovering them is an unsupervised learning problem.
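The supervised setting described above can be illustrated with a minimal sketch: learning a linear function from example input-output pairs by least squares. The toy dataset below is invented for illustration.

```python
# Minimal supervised learning sketch: learn a linear function y = a*x + b
# from example input-output pairs by ordinary least squares.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]        # toy data generated by y = 2x + 1

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# slope = covariance(x, y) / variance(x); intercept from the means
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x
print(a, b)   # learned slope and intercept
```

Given noiseless data, the learned function reproduces the generating rule exactly; with noisy labels it would return the best-fitting line instead.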
• Take high-level, abstract advice and convert it into rules that can guide the performance elements of the system. Automate all aspects of advice taking.
• Develop sophisticated tools such as knowledge base editors and debugging aids. These are used to help an expert translate his expertise into detailed rules. Here the expert is an integral part of the learning system.
Arthur Samuel was an American pioneer in the field of computer gaming and artificial intelligence. He popularized the term “machine learning” in 1959. The Samuel Checkers-playing Program was among the world’s first successful self-learning programs, and as such a very early demonstration of the fundamental concept of artificial intelligence (AI).
John McCarthy (1927 – 2011) was an American computer scientist and cognitive scientist. McCarthy was one of the founders of the discipline of artificial intelligence. He co-authored the document that coined the term “artificial intelligence” (AI), and was very influential in the early development of AI.
General advice can be given, such as “Avoid taking points”. A human must first convert it into a FOO representation. FOO then operationalises the advice by translating it into expressions that it can use in the game.
• Features that appear to be good predictors will have their weights increased, and bad ones will have their weights decreased.
The evaluation function is a polynomial of the form c1t1 + c2t2 + · · · + c16t16, where the t terms are values of the 16 features and the c terms are weights. As learning progresses, the c values change.
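The polynomial evaluation and the weight adjustment can be sketched as follows. The feature values and the simple gradient-style update are illustrative assumptions, not Samuel's exact learning scheme.

```python
# Sketch of a Samuel-style linear evaluation function: a weighted sum of
# board features. The update rule shown is a generic gradient-style
# correction, used here only to illustrate how the c values change.

def evaluate(features, weights):
    """Score a position as c1*t1 + c2*t2 + ... + cn*tn."""
    return sum(c * t for c, t in zip(weights, features))

def adjust(weights, features, error, rate=0.01):
    """Nudge each weight in proportion to its feature's contribution
    to the prediction error."""
    return [c - rate * error * t for c, t in zip(weights, features)]

weights = [0.5, -0.2, 0.1]
features = [2.0, 1.0, 4.0]        # e.g. piece advantage, mobility, ...
score = evaluate(features, weights)   # 0.5*2 - 0.2*1 + 0.1*4 = 1.2
new_weights = adjust(weights, features, error=0.5)
print(score, new_weights)
```

A feature that contributed to an over-estimate has its weight reduced, which is the mechanism described in the bullet above.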
3. Learning by chunking
Chunking involves ideas similar to macro-operators and originates from psychological ideas
on memory and problem solving. Its computational basis is in production systems.
• Winston’s learning programme: This programme operated in the blocks world domain. Its goal was to construct representations of concepts in the blocks domain.
• A goal concept: A high level description of what the program is supposed to learn.
• A domain theory: A set of rules that describe relationships between objects and
actions in a domain.
From this, EBL computes a generalisation of the training example that is sufficient to describe the goal concept and also satisfies the operationality criterion.
9.3.6 Discovery
Discovery is a restricted form of learning in which one entity acquires knowledge without the
help of a teacher. Discovery can be of two types:
• Prime Numbers: Factorisation of numbers and numbers with only one factor were
discovered.
• Goldbach’s Conjecture: Every even number greater than 2 can be written as the sum of two primes. E.g. 28 = 17 + 11.
9.3.7 Analogy
Analogy involves a mapping between what might appear to be two dissimilar concepts.
For example, consider the following statement: “Finding a good man is like finding
a needle in a haystack.” This involves a mapping between two worlds: One, a world of
men, good men and searching for good men; two, a world of small objects, a haystack
and searching for a needle.
There are two methods of analogical problem solving in AI: transformational analogy
and derivational analogy.
Transformational analogy
Carbonell (1983) describes a T-space method to transform old solutions into new ones.
• T-operators prescribe methods of transforming existing solution states into new ones.
Derivational analogy
Derivational analogy is a method of solving problems based upon the transfer of past experience to new problem situations. The experience transfer process consists of recreating lines of reasoning, including decision sequences and accompanying justifications, that proved effective in solving particular problems requiring similar initial analysis. The derivational analogy approach is advocated as a means of implementing reasoning from individual cases in expert systems.
Named after the philosopher of science Sir Francis Bacon (1561 – 1626).
In an ANN, each neuron receives one or more inputs and produces one or more outputs. In its simplest form, known as a perceptron, a neuron takes a weighted sum of its inputs and produces an output of 1 if the sum is greater than a certain pre-defined threshold value, and 0 otherwise. The relation between the inputs and the output of a perceptron can be represented as in Figure 9.3. In the figure the output is defined as follows:

y = 1 if w1x1 + w2x2 + · · · + wnxn > w
y = 0 otherwise
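The definition above translates directly into code. The weights and threshold below, which make the perceptron compute a two-input AND gate, are a standard illustrative choice, not taken from the text.

```python
# A perceptron as defined above: output 1 if the weighted sum of the inputs
# exceeds the threshold w, else 0. The weights (1, 1) and threshold 1.5
# realise a 2-input AND gate: only x1 = x2 = 1 gives a sum above 1.5.

def perceptron(inputs, weights, threshold):
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, perceptron((x1, x2), (1.0, 1.0), 1.5))
```

No single perceptron can realise XOR, which is one motivation for the multilayer networks described next.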
1. Input layer: All the inputs are fed into the model through this layer.
2. Hidden layers: There can be more than one hidden layer, which are used for processing the inputs received from the input layer.
3. Output layer: The data after processing is made available at the output layer.
9.4.4 Applications
Artificial neural networks have found applications in many disciplines. Application areas include medical diagnosis, data mining, visualization and machine translation, among others.
9.5.1 Outline
In a genetic algorithm, a population of candidate solutions (called individuals, creatures, or
phenotypes) to an optimization problem is evolved toward better solutions. Each
candidate solution has a set of properties (called its chromosomes or genotype) which can
be mutated and altered; traditionally, solutions are represented in binary as strings of 0s
and 1s.
The evolution usually starts from a population of randomly generated individuals, and is an iterative process. The population in each iteration is called a generation. In each generation, the fitness of every individual in the population is evaluated; the fitness is usually the value of the objective function in the optimization problem being solved. The more fit individuals are randomly selected from the current population, and each individual’s genome (the set of chromosomes) is modified (recombined and possibly randomly mutated) to form a new generation. The new generation of candidate solutions is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum
number of generations has been produced, or a satisfactory fitness level has been reached
for the population.
The chromosome representation, selection, crossover (or, recombination), mutation,
and fitness function computation are the key elements of the genetic algorithm.
A genetic algorithm flowchart is shown in Figure 9.5.
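The outline above can be sketched as a short program. The objective (maximising the number of 1-bits in a binary chromosome, the classic "OneMax" toy problem) and the particular selection, crossover and mutation choices are illustrative assumptions.

```python
import random

# A minimal genetic algorithm following the outline above: binary
# chromosomes, fitness = number of 1-bits, tournament selection,
# one-point crossover and bit-flip mutation.
random.seed(0)

LENGTH, POP, GENS = 20, 30, 60

def fitness(ind):
    return sum(ind)

def select(pop):
    # tournament selection: the fitter of two randomly chosen individuals
    return max(random.sample(pop, 2), key=fitness)

def crossover(p1, p2):
    cut = random.randrange(1, LENGTH)   # one-point crossover
    return p1[:cut] + p2[cut:]

def mutate(ind, rate=0.02):
    return [1 - b if random.random() < rate else b for b in ind]

pop = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]
for _ in range(GENS):
    pop = [mutate(crossover(select(pop), select(pop))) for _ in range(POP)]

best = max(pop, key=fitness)
print(fitness(best))
```

Here the termination criterion is simply a fixed number of generations; a fitness threshold, as mentioned above, would work equally well.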
4. With the help of a flowchart, explain the basic concepts of the genetic algorithm.
Chapter 10
Expert systems
10.1 Introduction
An expert system is a computer program that uses artificial intelligence methods to solve problems within a specialized domain that ordinarily requires human expertise. The first expert system was developed in 1965 by Edward Feigenbaum and Joshua Lederberg of Stanford University. Their system, known as Dendral, was designed to analyze chemical compounds. Expert systems now have commercial applications in fields as diverse as medical diagnosis, petroleum engineering, and financial investing.
In order to accomplish feats of apparent intelligence, an expert system relies on two
components: a knowledge base and an inference engine. A knowledge base is an
organized collection of facts about the system’s domain. An inference engine interprets
and evaluates the facts in the knowledge base in order to provide an answer. Typical
tasks for expert systems involve classification, diagnosis, monitoring, design, scheduling,
and planning for specialized endeavours.
Facts for a knowledge base must be acquired from human experts through interviews
and observations. This knowledge is then usually represented in the form of “if-then”
rules. The knowledge base of a major expert system includes thousands of rules. A
probability factor is often attached to the conclusion of each rule and to the ultimate
recommendation, because the conclusion is not a certainty.
• System controlled, where the system drives the dialogue through questioning the user.
• User controlled, where the user drives the consultation by providing information to
the system.
• Mixed control, where both user and system can direct the consultation.
accessing the appropriate rules, executing the rules and determining when an acceptable
solution has been found.
The reasoning process can be simple or complicated, depending on the structure of the knowledge base. If the knowledge base consists of simple rules and facts, forward chaining suffices. However, for a knowledge base which consists of rules and unstructured logic (facts, data and variables), both sophisticated forward and backward chaining with well thought-out search strategies may be required. Other methods such as problem reduction, pattern matching, unification, etc. are also used to implement the reasoning process.
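Forward chaining over simple rules and facts can be sketched as follows; the two-rule knowledge base is invented for illustration and is not from any particular expert system.

```python
# A minimal forward chaining sketch: rules are (premises, conclusion) pairs;
# the engine repeatedly fires any rule whose premises all hold, until no
# new facts can be derived.

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and premises <= facts:
                facts.add(conclusion)   # rule fires: assert its conclusion
                changed = True
    return facts

rules = [
    ({"has_fever", "has_rash"}, "suspect_measles"),
    ({"suspect_measles"}, "recommend_isolation"),
]
derived = forward_chain({"has_fever", "has_rash"}, rules)
print(sorted(derived))
```

Note that the rules here are pure facts-to-facts; nothing in the engine is specific to the domain, which is exactly the separation of knowledge from control discussed next.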
Mostly, the knowledge is not intertwined with the control structure. This has a value-added advantage: the same inference engine which works well in one expert system may work just as well with a different knowledge base to make another expert system, thus reducing expert system development time.
For example, the inference engine of MYCIN is available separately as EMYCIN (essential MYCIN). EMYCIN has been used with different knowledge bases to create many new expert systems, eliminating the need to develop a new inference engine. This will be taken up after having studied expert system shells.
conflict occurs, the editor helps the user resolve the conflict by explaining what caused it
and describing ways to remove it.
Another potentially useful but generally unavailable facility for knowledge base editors is knowledge extraction, where the editor helps the user enter new knowledge into the system. It combines syntax and consistency checking with sophisticated prompting and explanation to let even naive users add or modify rules.
• Knowledge engineer
The knowledge engineer is the person who encodes the expert’s knowledge in a
declarative form that can be used by the expert system.
• User
The user is the person who will be consulting with the system to get advice which
would have been provided by the expert.
• System engineer
The system engineer builds the user interface, designs the declarative format of the
knowledge base, and implements the inference engine.
languages such as LISP and PROLOG. Symbol-manipulation languages are designed for artificial intelligence applications; for example, LISP has mechanisms for manipulating symbols in the form of list structures. A list is simply a collection of items enclosed by parentheses, where each item can be either a symbol or another list. List structures are useful building blocks for representing complex concepts.
The most popular and widely used programming languages for expert system applications are LISP and PROLOG. Symbol-manipulation languages like these are more suitable for work in artificial intelligence, although a few expert systems have been written in problem-oriented languages like FORTRAN and PASCAL.
Expert systems are also written in CLIPS (C Language Integrated Production System, developed in the mid 1980s). The major advantages of these languages, as compared to conventional programming languages, are the simplicity of the addition, elimination or substitution of rules and their memory management capabilities.
A skeletal knowledge engineering language is a stripped-down expert system, that is, an expert system with its domain-specific knowledge removed, leaving only the inference engine and support facilities. A general-purpose knowledge engineering language can handle many different problem areas such as knowledge extraction, inference and user interfaces, though its use is rather tedious. These languages have a range of generality and flexibility. Table 10.1 shows some well known knowledge engineering languages.
1. Dendral
Dendral was a project in artificial intelligence of the 1960s. Its primary aim was to study hypothesis formation and discovery in science. For that, a specific task in science was chosen: help organic chemists in identifying unknown organic molecules, by analyzing their mass spectra and using knowledge of chemistry. It began at Stanford University in 1965 and spans approximately half the history of AI research. Dendral is considered the first expert system because it automated the decision-making process and problem-solving behavior of organic chemists.
2. MYCIN
MYCIN was an early backward chaining expert system that used artificial intelligence to identify bacteria causing severe infections, such as bacteremia and meningitis, and to recommend antibiotics, with the dosage adjusted for the patient's body weight. The name was derived from the antibiotics themselves, as many antibiotics have the suffix "-mycin". The MYCIN system was also used for the diagnosis of blood clotting diseases. MYCIN was developed over five or six years in the early 1970s at Stanford University. It was written in Lisp.
MYCIN operated using a fairly simple inference engine and a knowledge base of
approximately 600 rules. It would query the physician running the program via a
long series of simple yes/no or textual questions. At the end, it provided a list of
possible culprit bacteria ranked from high to low based on the probability of each
diagnosis, its confidence in each diagnosis’ probability, the reasoning behind each
diagnosis, and its recommended course of drug treatment.
3. DART
This system was designed to assist a technician in finding the faults in a computer system. The primary goal of the DART Project was to develop programs that capture the special design knowledge and diagnostic abilities of the experts and to make them available to field engineers. The practical goal is the construction of an automated diagnostician capable of pinpointing the functional units responsible for observed malfunctions in arbitrary system configurations.
4. XCON
XCON (eXpert CONfigurer), written in 1978, is a system with the ability to select specific software to generate a computer system as per user preference.
5. PXDES
This system could easily determine the type and the degree of lung cancer in
patients based on limited data.
6. CaDet
This is a clinical support system that identifies cancer in early stages.
7. PIP
PIP (Process Invention Procedure) is a hierarchical expert system for the synthesis of chemical process flowsheets. It uses a combination of qualitative knowledge (heuristics) and quantitative knowledge (design and cost calculations), arranged in a hierarchic manner. A hybrid expert system control architecture was developed for PIP that allows these two types of knowledge bases to interact with each other.
8. INTERNIST
INTERNIST-I was designed between 1972 and 1973 to provide computer-assisted diagnosis in general internal medicine by attempting to model the reasoning of clinicians.
analysis or simple classification. They typically perform very limited tasks which can be performed by a professional in a few minutes or hours.
4. Give the names of some of the programming languages used to develop expert systems.
2. With the help of a neat diagram, describe the architecture of an expert system.
Fuzzy logic
• As multivalued logic
In one sense the term “fuzzy logic” means the extension of ordinary logic, in which the set of truth values is the two-element set {0, 1}, to the case where the set of truth values is any finite or infinite subset of the closed interval [0, 1] (the set of all numbers between 0 and 1, including both). In this sense, fuzzy logic is sometimes referred to as “many-valued logic” or “multivalued logic”. There are many ways in which this extension can be carried out and consequently there are many different multivalued logics.
• As fuzzy sets
In the second sense, the term “fuzzy logic” refers to the use of fuzzy sets in the representation and manipulation of vague information for the purpose of making decisions or taking actions. It is the theory of fuzzy sets. A fuzzy set is a generalisation of the concept of a set.
In Section 11.2 of this chapter we discuss fuzzy logic in the first sense and in the remaining sections we discuss fuzzy logic in the second sense.
Lukasiewicz logic
The truth tables of the connectives in the three-valued Lukasiewicz logic, with truth values 0, u and 1, are as follows.

∨ | 0  u  1        ∧ | 0  u  1        ¬ |
0 | 0  u  1        0 | 0  0  0        0 | 1
u | u  u  1        u | 0  u  u        u | u
1 | 1  1  1        1 | 0  u  1        1 | 0

⇒ | 0  u  1        ⇔ | 0  u  1
0 | 1  1  1        0 | 1  u  0
u | u  1  1        u | u  1  u
1 | 0  u  1        1 | 0  u  1
Kleene logic
The Kleene logic has the same tables for AND, OR, and NOT as the Lukasiewicz logic
given above, but differs in its definition of implication.
⇒ | 0  u  1
0 | 1  1  1
u | u  u  1
1 | 0  u  1
Example
Using the definitions of logical connectives in Lukasiewicz logic, construct the truth table
of the compound proposition (P ∧ Q) ⇒ P .
Solution
P Q P ∧ Q (P ∧ Q) ⇒ P
0 0 0 1
0 u 0 1
0 1 0 1
u 0 0 1
u u u 1
u 1 u 1
1 0 0 1
1 u u 1
1 1 1 1
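The table above can be checked mechanically by encoding u as 0.5 and using min for ∧ and min(1, 1 − a + b) for ⇒, as in Lukasiewicz logic:

```python
# Lukasiewicz three-valued connectives, with u represented as 0.5:
# AND = min, a => b = min(1, 1 - a + b). The loop re-derives the last
# column of the truth table of (P AND Q) => P from the example above.

def l_and(a, b):
    return min(a, b)

def l_implies(a, b):
    return min(1.0, 1.0 - a + b)

values = [0.0, 0.5, 1.0]
for p in values:
    for q in values:
        print(p, q, l_implies(l_and(p, q), p))
```

Every row evaluates to 1, confirming that (P ∧ Q) ⇒ P is a tautology in Lukasiewicz logic: since min(p, q) ≤ p, the implication 1 − min(p, q) + p never falls below 1.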
done in it. In these natural languages the meaning of words is very often vague. Examples
are words such as “birds” (how about penguins, bats, etc.?), “red roses”, but also terms
such as “tall men”, “beautiful women”, etc. The term “tall men” is fuzzy because the
meaning of “tall” is fuzzy and dependent on the context (height of observer, culture, etc.).
11.3.2 Example
Imagine the set of all young men residing in a small town. Since the number of people residing in the town is not very large, it is indeed possible to prepare a list of all persons in the town. But can we prepare a list of all young men in the town? It is not possible, because the attribute of being “young” is not well defined; it is vague and imprecise. We say that the attribute is a “fuzzy” attribute. However, all of us will agree that a person aged 80 years cannot be a member of the “set of young men”! Similarly, a three-year-old boy cannot be a member of the set either. Also, all of us would agree that a 25-year-old man residing in the town should be a member of the set.
To describe the set of young men in the town, we may arbitrarily assign a number to each person which may indicate the “youngness” of the person. We assign the smallest number to a person having the least “youngness” property and the highest value to those who have the highest “youngness” property. The assignment of numbers is purely arbitrary. However, it should reflect a human being’s understanding of the notion of “youngness”: a higher value must indicate a higher level of “youngness”. A set whose elements are assigned numbers like this is called a fuzzy set. We shall consider a formal definition later. In practice, however, we usually assign values in the range 0 to 1, including both.
Figure 11.1: The graph of the measure “youngness”, “middle-agedness” and “oldness”
vs. “age”
Types of membership
Let x ∈ U . Then x is said to be
The α-cut (also called the α-level cut) of A, denoted by A≥α or Aα, is defined as the crisp set
A≥α = {x ∈ U : µA(x) ≥ α}.
Notation
Let A = (U, m) be a fuzzy set and let U be a finite set. Let the support of A be Supp (A) =
{x1, x2, . . . , xn}. Then the fuzzy set A is sometimes written as a set of ordered pairs as follows:
A = {(x1, m(x1)), (x2, m(x2)), . . . , (xn, m(xn))}
are also used to denote the fuzzy set A. Still others write
Note that in these representations, the symbols “/” and “+” are just symbols and they do
not represent the corresponding arithmetical operations.
The membership function of a crisp set is called the characteristic function of the set.
11.5.2 Examples
Example 1
Consider the set
U = {5, 10, 15, 20, 30, 35, 40, 45, 60, 70}
and the function m : U → [0, 1] defined by the following table:
x 5 10 15 20 30 35 40 45 60 70
m(x) 1 1 1 1 0.6 0.5 0.4 0.2 0 0
Supp (A) = {5, 10, 15, 20, 30, 35, 40, 45}
or as
Example 2
A realtor wants to classify the houses he offers to his clients. One indicator of comfort of
these houses is the number of bedrooms in it. Let
Example 3
The fuzzy set defined as “integers close to 10” may be specified by
Example 4
Let the universe U = R be the set of all real numbers. Then the fuzzy set A described
as “the set of real numbers close to 10” can be specified by the membership function
µA(x) = 1/(1 + (x − 10)²).
A trapezoidal membership function with parameters a ≤ b ≤ c ≤ d is given by

µA(x) = 0                  if x ≤ a
µA(x) = (x − a)/(b − a)    if a < x ≤ b
µA(x) = 1                  if b < x ≤ c
µA(x) = (d − x)/(d − c)    if c < x ≤ d
µA(x) = 0                  if d < x

A Gaussian-type membership function with centre m, width s and exponent k is given by

µA(x) = exp( −(1/2) ((x − m)/s)^k )
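The trapezoidal membership function can be transcribed directly into code; the parameter values used in the demonstration below are arbitrary.

```python
# The trapezoidal membership function: 0 outside [a, d], a rising edge on
# (a, b], a plateau of 1 on (b, c], and a falling edge on (c, d].

def trapezoid(x, a, b, c, d):
    if x <= a or x > d:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    if x <= c:
        return 1.0                 # plateau
    return (d - x) / (d - c)       # falling edge

for x in (0.0, 0.5, 1.5, 2.5, 3.5):
    print(x, trapezoid(x, 0.0, 1.0, 2.0, 3.0))
```

Setting b = c collapses the plateau to a single point and gives the common triangular membership function as a special case.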
1. Equality
A and B are said to be equal, denoted by A = B, if µA(x) = µB(x) for all x in U .
2. Union
The union of A and B, denoted by A ∪ B, is the fuzzy set C whose membership function is defined by
µC(x) = max{µA(x), µB(x)}, x ∈ U.
3. Intersection
The intersection of A and B, denoted by A ∩ B, is the fuzzy set D whose membership function is defined by
µD(x) = min{µA(x), µB(x)}, x ∈ U.
4. Complement
The complement of A, denoted by A, is the fuzzy set E whose membership function
is defined by
µE(x) = 1 − µA(x), x ∈ U.
Remarks
1. The membership functions of the union, intersection and complement of fuzzy sets may also be written in the forms
µA∪B(x) = max{µA(x), µB(x)}, µA∩B(x) = min{µA(x), µB(x)}, µA¯(x) = 1 − µA(x).
2. Unlike crisp sets, fuzzy sets do not in general obey the laws of excluded middle and contradiction:
A ∪ A¯ ≠ U, A ∩ A¯ ≠ ∅.
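The three operations can be sketched as follows, representing a fuzzy set as a dictionary from elements to membership grades (elements not listed have grade 0); the example data is taken from Example 2 below.

```python
# Union, intersection and complement of fuzzy sets over a common universe,
# using the max / min / (1 - grade) definitions above.

def f_union(A, B):
    return {x: max(A.get(x, 0), B.get(x, 0)) for x in set(A) | set(B)}

def f_intersection(A, B):
    return {x: min(A[x], B[x]) for x in set(A) & set(B) if min(A[x], B[x]) > 0}

def f_complement(A, universe):
    return {x: 1 - A.get(x, 0) for x in universe}

# Example 2 data: U = {1, ..., 10}, A = 0.2/1 + 0.4/2 + 0.6/3 + 0.8/4 + 1.0/5
A = {1: 0.2, 2: 0.4, 3: 0.6, 4: 0.8, 5: 1.0}
comp = f_complement(A, range(1, 11))
print(comp)
```

Elements outside the support of A receive grade 1 in the complement, which is why the complement is computed over the whole universe rather than over Supp(A) alone.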
11.6.2 Examples
Example 1
Given the fuzzy sets
find A ∪ B and A ∩ B.
Solution
We have:
Let us compute A ∪ B.
Hence
Therefore
A ∪ B = {0.3/2, 0.4/3, 0.7/4, 0.8/5, 1.0/6, 0.02/7, 0.75/8}.
Now, let us calculate A ∩ B. We have
Supp (A ∩ B) = Supp (A) ∩ Supp (B) = {4, 5, 6}
and
µA∩B(x) = min{µA(x), µB(x)}.
Hence
µA∩B(4) = min{µA(4), µB(4)} = min{0.1, 0.7} = 0.1
µA∩B(5) = min{µA(5), µB(5)} = min{0.8, 0.5} = 0.5
µA∩B(6) = min{µA(6), µB(6)} = min{1.0, 1.0} = 1.0
Therefore
A ∩ B = {0.1/4, 0.5/5, 1.0/6}.
Example 2
Let the universe be U = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} and let
A = {0.2/1, 0.4/2, 0.6/3, 0.8/4, 1.0/5}.
Compute A¯ .
Solution
By definition,
µA¯(x) = 1 − µA(x), for all x ∈ U.
Therefore, we have:
A¯ = {0.8/1, 0.6/2, 0.4/3, 0.2/4, 1.0/6, 1.0/7, 1.0/8, 1.0/9, 1.0/10}.
2. Fuzzy subset
A fuzzy set A is said to be a subset of a fuzzy set B, denoted by A ⊆ B, if µA(x) ≤ µB(x) for all x ∈ U.
Examples
1. Let
A = {0.12/2, 0.23/4}, B = {0.15/2, 0.37/4, 0.15/6},
Then A ⊆ B.
Then the following properties hold.
1. Commutativity
A ∪ B = B ∪ A
A ∩ B = B ∩ A
2. Associativity
A ∪ (B ∪ C) = (A ∪ B) ∪ C
A ∩ (B ∩ C) = (A ∩ B) ∩ C
3. Distributivity
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C),
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
4. Identity
A ∪ ∅ = A, A ∪ X = X
A ∩ ∅ = ∅, A ∩ X = A
5. Involution
(A¯)¯ = A.
6. De Morgan’s laws
(A ∪ B)¯ = A¯ ∩ B¯
(A ∩ B)¯ = A¯ ∪ B¯.
Example
Given the universe U = {a, b, c, d, e, f, g, h} and the fuzzy sets
A = {0.1/a, 0.2/b, 0.3/c, 1.0/d, 0.1/e, 0.5/h}
B = {0.75/b, 0.2/c, 0.35/d, 0.4/e, 0.5/f }
C = {1.0/e, 0.9/f, 0.8/g, 0.7/h}
verify the distributivity properties.
Solution
11.9.1 Definition
A fuzzy variable x is a variable whose possible values are fuzzy sets in some universe of
discourse.
11.9.2 Examples
Example 1
We define a fuzzy variable x as follows. Let the universe of discourse be
U = {1, 2, 3, 4, 5}.
or
B = {1.0/1, 0.8/2, 0.6/3, 0.4/4, 0.2/5}.
Example 2
Consider a variable which denotes the age of a person in years. We may denote this variable by age. As a crisp variable, age can assume nonnegative integers as values. We may take age as a fuzzy variable. To do this, we have to define a universe of discourse U, which we may take as the set of all nonnegative integers. As a fuzzy variable, the possible values that can be assigned to age are fuzzy sets in U. These fuzzy sets may be denoted by linguistic terms indicative of the age of a person, such as infant, young, middle-aged, old, etc. Thus the fuzzy variable age can be thought of as a variable whose possible values are such linguistic terms. However, it is to be emphasised that the value is not the linguistic term but the fuzzy set represented by the term.
The device
The cooler is implemented using a fan encased in a box with wool or hay that is continuously moistened with a trickle of water. A motorised pump controls the rate of flow of water. Two sensors measure the fan motor speed and the temperature.
Objective
The basic aim is to achieve a smooth control and to save water. We have to control the rate of flow of water based on the fan motor speed (measured in rpm) and the temperature (measured in ◦C). To make the problem precise, let it be required to find the rate of flow of water if the temperature is 42◦C and the fan motor speed is 31 rpm.
Variables
The input variables of the system are “temperature” and “fan-motor-speed”. We take them as fuzzy variables. We assign the fuzzy sets denoted by “Cold, Cool, Moderate, Warm, Hot” as the values of the variable temperature and the fuzzy sets denoted by “Slack, Low, Medium, Brisk, Fast” as the values of fan-motor-speed. The output variable is “water-flow-rate” and it may take the fuzzy sets “Strong-Negative (SN), Negative (N), Low-Negative (LN), Medium (M), Low-Positive (LP), Positive (P), High-Positive (HP)” as its values.
Membership functions
Based on observations and experimentation we have to construct the membership
functions to the various fuzzy sets introduced above. We assume that the membership
functions are as shown in Figure 11.8.
Fuzzification
We have to express the observed values of 42◦C and 31 rpm as values of the fuzzy variables
temperature and fan-motor-speed.
From the graphs of the membership functions of the values of the fuzzy variable temperature, we see that a value of 42◦C corresponds to the fuzzy set Warm with a membership grade of 0.142 and also to the fuzzy set Hot with a membership grade of 0.2.
Similarly, from the graphs of the membership functions of the values of the fuzzy variable fan-motor-speed, we see that a value of 31 rpm corresponds to the fuzzy set Medium with a membership grade of 0.25 and also to the fuzzy set Brisk with a membership grade of 0.286.
This process of expressing crisp values as values of fuzzy variables is known as fuzzification.
Application of rules
Since we have two possible fuzzy values for temperature and two possible fuzzy values
for fan-motor-speed, we have the combinations of values shown in Table 11.3. The table
also shows the applicable rules, the value of the output variable obtained by applying
each rule, and the membership grade of that output value. Note that the membership
grade of the output variable water-flow-rate is taken as the minimum of the membership
grades of the two input variables.
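The rule-application step above amounts to taking, for each rule, the minimum of its two input grades; when several rules yield the same output value, their grades can be combined with max. A minimal sketch follows, using the four membership grades obtained in the fuzzification step; the rule consequents in the dictionary are hypothetical stand-ins for the actual entries of Table 11.3.

```python
# Membership grades from the fuzzification step.
temperature = {"Warm": 0.142, "Hot": 0.2}
speed = {"Medium": 0.25, "Brisk": 0.286}

# Hypothetical rule base mapping (temperature, speed) -> output fuzzy value;
# the actual consequents are those listed in Table 11.3.
rules = {
    ("Warm", "Medium"): "P",
    ("Warm", "Brisk"): "LP",
    ("Hot", "Medium"): "LP",
    ("Hot", "Brisk"): "M",
}

firing = {}
for (t, s), out in rules.items():
    grade = min(temperature[t], speed[s])      # min of the two input grades
    firing[out] = max(firing.get(out, 0.0), grade)  # combine same consequents with max
print(firing)
```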
Defuzzification
From the above set of possible fuzzy values for the output variable, we have to derive a
crisp value for the output variable. This process is known as defuzzification. One method
for doing this is known as the “centre of gravity method”.
To obtain the crisp value, we consider the value P with membership grade 0.2 and
the graph of the membership function of the fuzzy set P. In this graph, we
consider the shaded region shown in Figure 11.9. In a similar way, by considering the
other possible output values and their membership grades, we construct shaded regions
in the graphs of the respective membership functions. In this way, we get four shaded
regions. We next construct, in one graph, a composite region consisting of all four
shaded regions, as in Figure 11.10.
The x-coordinate of the centre of gravity, or the centroid, of this composite region is
the defuzzified crisp value of the output variable water-flow-rate.
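The centre-of-gravity computation can be sketched numerically: clip each output set at its rule's membership grade, combine the clipped regions with max to form the composite region, and take the centroid of the result. The output range and the two triangular sets below are illustrative assumptions, not the actual shapes of Figures 11.9 and 11.10.

```python
import numpy as np

def tri(xs, a, b, c):
    """Vectorised triangular membership function over the grid xs."""
    return np.maximum(np.minimum((xs - a) / (b - a), (c - xs) / (c - b)), 0.0)

def centroid(xs, mu):
    """x-coordinate of the centre of gravity of the region under mu."""
    return float((xs * mu).sum() / mu.sum())

# Discretised universe for water-flow-rate (hypothetical range).
xs = np.linspace(0.0, 100.0, 1001)

# Clip each output set at its firing grade and combine with max to build
# the composite region; the sets and grades here are illustrative only.
composite = np.maximum(
    np.minimum(tri(xs, 30.0, 45.0, 60.0), 0.25),  # e.g. LP clipped at 0.25
    np.minimum(tri(xs, 40.0, 55.0, 70.0), 0.2),   # e.g. P clipped at 0.2
)

print(round(centroid(xs, composite), 2))  # the defuzzified crisp output
```

A quick sanity check on the method: the centroid of an unclipped symmetric triangle is its peak.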
11.11. SAMPLE QUESTIONS 209
5. Define the support and kernel of a fuzzy set and illustrate with examples.
7. Define the union and intersection of fuzzy sets. Illustrate with examples.
12. What is the purpose of defuzzification? Name at least one method used for defuzzi-
fication.
find A ∪ B and A ∩ B.
2. Let
A = 0.5/1 + 0.9/2 + 1/5, B = 0.7/2 + 0.9/3 + 0.1/4.
Compute A ∪ B and A ∩ B.
3. Suppose U = {1, 2, 3, 4, 5} and A = 0.5/1 + 0.3/3 + 1/5. Compute:
(a) Ā
(b) A ∩ Ā
(c) A ∪ Ā
4. Let U = {1, 2, 3, 4, 5, 6, 7, 8} and let
A = {0.4/1, 0.7/3, 0.5/5, 0.8/7}, B = {0.5/1, 0.2/2, 1.0/6, 0.1/8}.
Verify De Morgan’s laws.
5. Consider two fuzzy subsets of the set X = {a, b, c, d, e} defined by
A = {1/a, 0.3/b, 0.2/c, 0.8/d}, B = {0.6/a, 0.9/b, 0.1/c, 0.3/d, 0.2/e}.
Compute the following: Supp(A), Supp(B), A ∪ B, A ∩ B, Ā and B̄.
6. Let
A = 0.2/1 + 0.5/2 + 0.7/3 + 1.0/4 + 0.8/5 + 0.4/6 + 0.2/7.
Compute the α-level set of A for α = 0.5.
7. Given the fuzzy sets
C = 0.2/1 + 0.5/2 + 0.8/3 + 1.0/4 + 0.7/5 + 0.3/6,
D = 0.2/3 + 0.4/4 + 0.6/5 + 0.8/6 + 1.0/8 + 1.0/9 + 1.0/10,
compute C ∩ D and C ∪ D.
8. Let
A = 0.1/2 + 0.5/3 + 0.4/4 + 0.5/5, B = 0.1/2 + 0.6/3 + 1/4 + 0.6/5 + 0.1/6.
Is it true that B ⊄ A?
9. Consider these (very subjective) membership functions for the height of a person:
10. Construct a fuzzy set (universal set and membership function) for the concept
“early in the morning”. (Hint: Choose the universal set as {1, 2, . . . , 23, 24}
(times of a day) and define a membership function to indicate “early in the
morning”.)
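Several of the exercises above can be checked numerically. Here is a small sketch, assuming the standard max/min definitions of union and intersection and the usual definition of the α-level set, with fuzzy sets represented as dictionaries from universe elements to membership grades; the sets used are those of questions 2 and 6.

```python
def fuzzy_union(A, B):
    """Standard fuzzy union: pointwise maximum of the membership grades."""
    return {x: max(A.get(x, 0.0), B.get(x, 0.0)) for x in set(A) | set(B)}

def fuzzy_intersection(A, B):
    """Standard fuzzy intersection: pointwise minimum of the membership grades."""
    return {x: min(A.get(x, 0.0), B.get(x, 0.0)) for x in set(A) | set(B)}

def alpha_cut(A, alpha):
    """Crisp alpha-level set: elements whose grade is at least alpha."""
    return {x for x, g in A.items() if g >= alpha}

# Question 2: A = 0.5/1 + 0.9/2 + 1/5, B = 0.7/2 + 0.9/3 + 0.1/4.
A = {1: 0.5, 2: 0.9, 5: 1.0}
B = {2: 0.7, 3: 0.9, 4: 0.1}
print(fuzzy_union(A, B))         # grade of 2 is max(0.9, 0.7) = 0.9
print(fuzzy_intersection(A, B))  # grade of 2 is min(0.9, 0.7) = 0.7

# Question 6: the 0.5-level set of the given fuzzy set.
C = {1: 0.2, 2: 0.5, 3: 0.7, 4: 1.0, 5: 0.8, 6: 0.4, 7: 0.2}
print(sorted(alpha_cut(C, 0.5)))  # -> [2, 3, 4, 5]
```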
Index
Walter Pitts, 1
Warren McCulloch, 1
water jug problem, 23
well-formed formula, 142
wff, 142
working memory, 29
world, 138
XCON, 189
zero-sum game, 84