IAI Unit2
Initial call:
Minimax(node, 3, true)
Step 1: In the first step, the algorithm generates the entire game tree and applies the utility function to obtain the utility values for the terminal states. In the tree diagram below, let A be the initial state of the tree. The maximizer takes the first turn and has the worst-case initial value -∞, and the minimizer takes the next turn and has the worst-case initial value +∞.
Step 2: Now we find the utility value for the maximizer. Its initial value is -∞, so we compare each terminal-state value with this initial value and determine the values of the higher nodes; the maximizer takes the maximum among them all.
Note: To better understand this topic, kindly study the minimax algorithm.
The main condition required for alpha-beta pruning is: α >= β.
Step 1: At the first step, the Max player starts the first move from node A, where α = -∞ and β = +∞. These values of alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same values to its child D.
Step 2: At node D, the value of α is calculated because it is Max's turn. The value of α is compared first with 2 and then with 3, so max(2, 3) = 3 becomes the value of α at node D, and the node value is also 3.
Step 3: The algorithm now backtracks to node B, where the value of β changes because it is Min's turn. β = +∞ is compared with the available successor value, i.e. min(∞, 3) = 3; hence at node B, α = -∞ and β = 3.
In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and β = 3 are passed down as well.
Step 4: At node E, Max takes its turn, and the value of alpha changes. The current value of alpha is compared with 5, so max(-∞, 5) = 5; hence at node E, α = 5 and β = 3. Since α >= β, the right successor of E is pruned, the algorithm does not traverse it, and the value at node E is 5.
Step 5: In the next step, the algorithm again backtracks the tree, from node B to node A. At node A, the value of alpha changes to the maximum available value, which is 3, since max(-∞, 3) = 3, while β = +∞. These two values are now passed to the right successor of A, which is node C.
At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Step 6: At node F, the value of α is again compared, first with the left child, which is 0, giving max(3, 0) = 3, and then with the right child, which is 1, giving max(3, 1) = 3. α therefore remains 3, but the node value of F becomes 1.
Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞, and the value of beta changes: it is compared with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, which again satisfies the condition α >= β, so the next child of C, which is G, is pruned and the algorithm does not compute the entire subtree rooted at G.
Step 8: C returns the value 1 to A, where the best value for A is max(3, 1) = 3. The final game tree shows the nodes that were computed and the nodes that were never computed. Hence the optimal value for the maximizer in this example is 3.
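The walkthrough above can be reproduced with a short program. The sketch below is a minimal minimax-with-alpha-beta implementation run on the example tree: the leaf values 2 and 3 under D, 0 and 1 under F, and the single explored leaf 5 under E come from the steps above, while the pruned leaves (the right child of E and the children of G) are hypothetical placeholders, since the text does not give their values.

import math

def alphabeta(node, depth, alpha, beta, maximizing):
    # Minimax with alpha-beta pruning; `node` is either a leaf value or a list of children.
    if depth == 0 or not isinstance(node, list):      # terminal state: return its utility
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:                         # beta cut-off: remaining children are pruned
                break
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:                         # alpha cut-off
                break
        return value

# Example tree from the walkthrough: A -> (B, C), B -> (D, E), C -> (F, G).
# The values 9 (right child of E) and 7, 5 (children of G) are placeholders.
tree = [[[2, 3], [5, 9]],    # B = (D, E)
        [[0, 1], [7, 5]]]    # C = (F, G)
print(alphabeta(tree, 3, -math.inf, math.inf, True))  # prints 3, the optimal value for the maximizer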
Move Ordering in Alpha-Beta pruning:
The effectiveness of alpha-beta pruning is highly dependent on the order in which the nodes are examined. Move ordering is therefore an important aspect of alpha-beta pruning.
o Worst ordering: In some cases the alpha-beta pruning algorithm does not prune any of the leaves of the tree and works exactly like the minimax algorithm. It then also consumes more time because of the alpha-beta bookkeeping; such an ordering is called the worst ordering. In this case, the best move occurs on the right side of the tree, and the time complexity for such an order is O(b^m).
o Ideal ordering: The ideal ordering for alpha-beta pruning occurs when a lot of pruning happens in the tree and the best moves occur on the left side of the tree. Since we apply DFS, it searches the left part of the tree first and can go twice as deep as the minimax algorithm in the same amount of time. The complexity with ideal ordering is O(b^(m/2)).
Rules to find good ordering:
Following are some rules to find good ordering in alpha-beta pruning:
Constraint Satisfaction Problems
This section examines the constraint satisfaction methodology, another approach to solving real-world problems. As its name implies, constraint satisfaction means that a problem must be solved while adhering to a set of restrictions or rules.
In constraint satisfaction, domains are the sets from which the variables take their values, subject to the restrictions that are specific to the task. Three components make up a constraint satisfaction problem: variables, domains, and constraints. A constraint is a pair ⟨scope, rel⟩: the scope is a tuple of the variables that participate in the constraint, and rel is a relation that lists the combinations of values the variables may take in order to satisfy the restrictions of the problem.
For a constraint satisfaction problem (CSP), the following must be given:
o a state space, and
o the notion of a solution.
A state in this state space is defined by assigning values to some or all of the variables, such as {X1 = v1, X2 = v2, ...}.
The domains from which the variables take their values can be of two kinds:
o Discrete domain: an infinite domain, which can have one state with numerous variables; for instance, a start state can be allocated infinitely many times for each variable.
o Finite domain: a finite domain with continuous states that describes one domain for one specific variable; it is also called a continuous domain.
Constraints are of the following kinds:
o Unary constraints: the simplest kind of constraint, because they restrict the value of only one variable.
o Binary constraints: these constraints relate two variables; for example, a variable x2 may be constrained to take a value between x1 and x3.
o Global constraints: this kind of constraint involves an arbitrary number of variables.
The main kinds of constraints are handled using particular kinds of resolution techniques:
o Linear constraints: frequently used in linear programming, where every variable carrying an integer value occurs only in linear form.
o Non-linear constraints: used in non-linear programming, where each variable (carrying an integer value) occurs in a non-linear form.
Note: The preference constraint is a special kind of constraint that occurs in real-world problems.
Think of a Sudoku puzzle in which some of the squares are initially filled with certain digits. You must complete the empty squares with numbers between 1 and 9, making sure that no row, column, or block contains a repeated digit. This is a fairly elementary example of a constraint satisfaction problem: a problem must be solved while certain limitations are taken into consideration.
The integer range (1-9) that can occupy the empty squares is referred to as the domain, while the empty squares themselves are referred to as variables. The values of the variables are drawn from the domain, and the constraints are the rules that determine which values a variable may take.
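To make the variables/domains/constraints vocabulary concrete, here is a tiny sketch in Python. It uses one row of a 4x4 grid rather than a full 9x9 Sudoku, and the row contents and helper names are made up for the illustration.

from itertools import permutations

# One row of a 4x4 grid with two cells pre-filled. The empty cells (None) are the
# variables, the digits 1..4 form the domain, and the constraint is that all
# values in the row must be pairwise different.
row = [3, None, None, 2]
domain = {1, 2, 3, 4}

def consistent(assignment):
    # Constraint check: the filled-in values must be pairwise distinct.
    values = [v for v in assignment if v is not None]
    return len(values) == len(set(values))

empty_positions = [i for i, v in enumerate(row) if v is None]
free_values = domain - set(v for v in row if v is not None)

for perm in permutations(free_values, len(empty_positions)):
    candidate = list(row)
    for pos, val in zip(empty_positions, perm):
        candidate[pos] = val
    if consistent(candidate):
        print("solution:", candidate)   # prints [3, 1, 4, 2] and [3, 4, 1, 2]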
Constraint Propagation
We have already seen constraint propagation in action, but let's define it a bit more precisely.
The basic step in constraint propagation is called filtering a constraint. As input, filtering
takes a constraint and the current domains for its variables. It then tries to remove all the
values in the domains that do not appear in any solution to that constraint. That is, it tries to
reduce the domains so that all the solutions are preserved. Thus “constraint filtering” in CSPs
corresponds to a “unit clause propagation” step in CNF SAT solvers, but it is usually more
complex because the domains are larger and the constraints are more complex. Filtering on
the global constraint level can also be more effective than unit clause propagation in pruning
the search space.
Constraint propagation then means the process of applying filtering to the constraints in the
CSP instance at hand. As filtering one constraint can reduce the domains of its variables, it
can trigger further reduction when filtering another constraint that also involves the reduced-
domain variables. Thus a constraint can be filtered multiple times during the constraint
propagation process. Usually the process is run to the end, meaning until no more reduction
happens in filtering any of the constraints. Or if this takes too long, then until some resource
usage limit is exceeded.
Formally, filtering a constraint C over the variables y1, …, yk with current domains D_before(y1), …, D_before(yk) produces new domains D_after(yi) ⊆ D_before(yi) for all i ∈ [1..k] such that C ∩ (D_after(y1) × … × D_after(yk)) = C ∩ (D_before(y1) × … × D_before(yk)), i.e., all the solutions of the constraint are preserved.
Example
When D_before(x) = {1, 2, 4} and D_before(y) = {1, 2, 5}, filtering the constraint x = y optimally gives the domains D_after(x) = {1, 2} and D_after(y) = {1, 2}. This is because the only solutions of the constraint are x = 1, y = 1 and x = 2, y = 2.
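A minimal sketch of this filtering step for the equality constraint, using the domains from the example (the function name filter_equality is illustrative, not from the text):

def filter_equality(dom_x, dom_y):
    # Filter the constraint x = y: a value can appear in a solution only if it lies
    # in both domains, so both filtered domains are the intersection.
    common = dom_x & dom_y
    return common, common

d_x, d_y = filter_equality({1, 2, 4}, {1, 2, 5})
print(d_x, d_y)   # {1, 2} {1, 2}, matching the example above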
Whenever feasible, our goal is to reduce the domains so that the constraint in question
becomes arc-consistent.
We usually omit “generalized” and just use the term arc consistent (the term “arc consistency”
is sometimes used for “generalized arc consistency” when the constraint is binary (i.e., over
two variables)).
Example
Assume two variables, x and y, and the domains D(x) = {1, 2, 4} and
D(y) = {1, 2, 5} . In this case:
The constraint x = y, corresponding to the set {(1, 1), (2, 2)}, is not arc consistent, as for 4 ∈ D(x) there is no tuple of the form (4, vy) in the set because 4 ∉ D(y). Similarly, 5 ∈ D(y) but 5 ∉ D(x).
The constraint x ≠ y, corresponding to the set {(1, 2), (1, 5), (2, 1), (2, 5), (4, 1), (4, 2), (4, 5)}, is arc consistent, as every value in D(x) occurs as the first component of some tuple in the set and every value in D(y) occurs as the second component of some tuple.
Note that if a CSP is arc consistent and the domains of the variables are non-empty, then the
individual constraints in the CSP have solutions. But this does not mean that the CSP as a
whole has solutions, as illustrated by the next example.
Example
Consider, for instance, a CSP with three variables x, y, and z, each with the domain {1, 2}, and the constraints x ≠ y, y ≠ z, and x ≠ z. All the constraints in it are arc consistent, but the CSP as a whole does not have any solutions.
Usually, global constraints filter better than a (possibly large) corresponding set of primitive
binary constraints, as illustrated in the next example.
Example
Of course, filtering complex global constraints is algorithmically more difficult than filtering simple constraints such as binary equality and disequality.
How can we check that a constraint C over the variables y1, …, yk is consistent, i.e., that C ∩ (D(y1) × … × D(yk)) ≠ ∅? For simple constraints this is easy:
For equality, check whether the domains contain at least one common element.
For disequality, check whether the domains are not empty and not equal to a common singleton set.
Testing consistency of more complex global constraints is algorithmically more involved. In the following, we introduce some basic algorithms for consistency checking and filtering of some of the global constraints we saw earlier.
Sum constraints have the form c1x1 + … + cnxn = z.
Consistency checking
Consistency checking and filtering of sum constraints are NP-hard in general. This is because the NP-complete subset sum problem reduces to them in a rather straightforward way. However, one obtains pseudo-polynomial time algorithms by using dynamic programming [Trick2003] as follows.
We define the predicate p(i, v), where i ∈ [0..n], that evaluates to true if and only if the partial sum c1x1 + … + cixi can evaluate to v, recursively as follows:
p(0, v) = T for v = 0 and F otherwise. That is, summing nothing always gives zero.
p(i, v) = ⋁_{d ∈ D(xi)} p(i − 1, v − ci·d) for i ∈ [1..n]. That is, c1x1 + … + cixi can evaluate to v if and only if c1x1 + … + ci−1xi−1 can evaluate to v − ci·d for some value d in the domain of xi.
Example
Consider the constraint 2x1 + 2x2 + 2x3 + x4 + x5 = z under the domains D(x1) = {0, 2, 3}, D(x2) = {0, 3}, D(x3) = {0, 3}, D(x4) = {2, 3}, D(x5) = {2, 3}, and D(z) = {4, 7, 9}.
We see that p(5, 4) and p(5, 9) are true, therefore the constraint is consistent.
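A small sketch of this dynamic-programming consistency check, applied to the example constraint and domains above (the helper name reachable_sums and the set-based formulation are assumptions of the illustration; the i-th set it builds is exactly {v : p(i, v) is true}):

def reachable_sums(coeffs, domains):
    # Start from {0} (p(0, v) holds only for v = 0) and extend one variable at a time.
    sums = {0}
    for c, dom in zip(coeffs, domains):
        sums = {v + c * d for v in sums for d in dom}
    return sums

coeffs = [2, 2, 2, 1, 1]
domains = [{0, 2, 3}, {0, 3}, {0, 3}, {2, 3}, {2, 3}]
dom_z = {4, 7, 9}

final = reachable_sums(coeffs, domains)
print(sorted(final & dom_z))   # [4, 9]: p(5, 4) and p(5, 9) hold, so the constraint is consistent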
Filtering
The graph representation shown above allows us to find all the solutions as well: they correspond to the paths from the node (0, 0) to the nodes (n, v) with v ∈ D(z). Namely, if (0, 0), (1, v1), …, (n, vn) is such a path, then the corresponding solution assigns xi = (vi − vi−1)/ci for each i ∈ [1..n] and z = vn. The filtered domains D′ are therefore given by:
1. v ∈ D′(z) iff v ∈ D(z) and there is a path from the node (0, 0) to (n, v), and
2. v ∈ D′(xi) iff there is a path (0, 0), (1, v1), …, (n, vn) such that vn ∈ D(z) and v = (vi − vi−1)/ci.
Finding all the edges that take part in some of the paths above is easy: just follow the edges backwards from the nodes (n, v) with v ∈ D(z).
Example
Consider again the constraint 2x1 + 2x2 + 2x3 + x4 + x5 = z under the domains D(x1) = {0, 2, 3}, D(x2) = {0, 3}, D(x3) = {0, 3}, D(x4) = {2, 3}, D(x5) = {2, 3}, and D(z) = {4, 7, 9}. From the highlighted edges we can read off the solutions, for example
x1 = 0, x2 = 0, x3 = 0, x4 = 2, x5 = 2 (giving z = 4),
x1 = 2, x2 = 0, x3 = 0, x4 = 2, x5 = 3 (giving z = 9), and
x1 = 2, x2 = 0, x3 = 0, x4 = 3, x5 = 2 (giving z = 9).
The constraint alldifferent(y1, …, yn) requires that the values assigned to the variables y1, …, yn are pairwise distinct. We now show a graph-theoretical way of consistency checking and filtering of all-different constraints.
Consistency checking
We can reduce the consistency problem of alldifferent constraints to the matching problem
in unweighted bipartite graphs [vanHoeveKatriel2006]. As there are efficient algorithms for
solving the matching problem, the consistency problem can be solved efficiently as well.
A bipartite graph is a graph G = (U, W, E) where U and W are two disjoint, finite and non-empty sets of vertices, and E ⊆ U × W is a set of edges.
A matching in such a graph is a subset M ⊆ E of the edges such that for all distinct edges (u1, w1), (u2, w2) ∈ M it holds that u1 ≠ u2 and w1 ≠ w2. In other words, a matching associates each vertex with at most one edge. A matching M covers a vertex u ∈ U if (u, w) ∈ M for some w ∈ W.
Example
A bipartite graph with U = {x1, x2, x3, x4} and W = {1, 2, 3, 4, 5} is shown below. The matching {(x1, 4), (x2, 3), (x4, 1)} is shown with the highlighted edges. It covers the vertices x1, x2 and x4 but not the vertex x3.
The reduction from alldifferent constraints to undirected bipartite graphs is rather simple: the value graph of the constraint alldifferent(y1, …, yk) is the bipartite graph with U = {y1, …, yk}, W = D(y1) ∪ … ∪ D(yk), and an edge (yi, v) whenever v ∈ D(yi). The vertices in {y1, …, yk} are called variable vertices and the ones in D(y1) ∪ … ∪ D(yk) value vertices.
Example
The value graph for an example alldifferent constraint is shown below.
By construction, each solution to the constraint corresponds to a matching that covers all the
variable vertices of the value graph. Conversely, each matching that covers all the variable
vertices of the value graph corresponds to a solution to the constraint. Therefore, the
constraint is consistent if the value graph has a matching that covers all the variable vertices.
Example
Whether a value graph has a matching covering all variable vertices can be found by
computing a maximum matching, a matching having largest possible number of edges, for the
graph and checking whether it covers all the variable vertices. A maximum matching can be
found, e.g., with the Hopcroft-Karp algorithm. A simplified version of this algorithm is
presented next.
An augmenting path for a matching M is a path that starts and ends in an M-free vertex (a vertex not covered by M) and alternates between edges not in M and edges in M. The following two facts about augmenting paths allow us to find a maximum matching:
1. If M is a matching and P is an augmenting path for M, then replacing the edges of M on P by the other edges of P (the symmetric difference of M and P) gives a matching with one more edge.
2. A matching M is maximum if and only if there is no augmenting path for M.
Example
The value graph for the constraint alldifferent(x1 , x2 , x3 ) with the domains
D(x 1 ) = {1, 2, 3, 4} , D(x 2 ) = {1, 3}, and D(x 3 ) = {1, 3} is shown below.
We start with the empty matching M = ∅. We may first find the augmenting path [x1 , 1].
After this, the matching is M = {(x1 , 1)}, shown below with highlighted edges.
The next augmenting path could be [x2, 3]. After this, the matching is
M = {(x 1 , 1), (x 2 , 3)}:
The next augmenting path could be [x3, 1, x1, 2], giving the matching
M = {(x 1 , 2), (x 2 , 3), (x 3 , 1)} :
There are no more augmenting paths. All variable vertices are covered, and a solution is
{x 1 ↦ 2, x 2 ↦ 3, x 3 ↦ 1}.
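The simplified augmenting-path procedure used in this example can be sketched as follows. This is the classical recursive augmenting-path search for bipartite matching; the dictionary-based representation of the value graph is an assumption made for the illustration.

def max_matching(domains):
    # Build a maximum matching between variables and their domain values by
    # repeatedly searching for an augmenting path from each uncovered variable.
    match = {}                              # value -> variable currently matched to it

    def try_assign(var, seen):
        for val in domains[var]:
            if val in seen:
                continue
            seen.add(val)
            # val is free, or the variable currently holding val can be re-matched elsewhere
            if val not in match or try_assign(match[val], seen):
                match[val] = var
                return True
        return False

    covered = sum(try_assign(var, set()) for var in domains)
    return match, covered

domains = {"x1": [1, 2, 3, 4], "x2": [1, 3], "x3": [1, 3]}
match, covered = max_matching(domains)
print(covered == len(domains))                    # True: all variable vertices are covered
print({var: val for val, var in match.items()})   # a solution such as {'x1': 2, 'x2': 3, 'x3': 1} (key order may differ)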
Filtering
Recall that
each solution to the constraint corresponds to a matching of the value graph that covers all the variable vertices, and
each matching of the value graph that covers all the variable vertices corresponds to a solution to the constraint.
Thus if an edge (x, v) in the value graph does not appear in any matching that covers all the variable vertices, the element v can be removed from the domain D(x).
Once one maximum matching is found, we can find all edges that appear in some maximum
matching [Tassa2012] [vanHoeveKatriel2006]. Assume a bipartite graph G = (U , W , E)
and a maximum matching M ⊆ E. Build the directed graph
G_M = (U ∪ W, {(u, w) ∣ (u, w) ∈ M} ∪ {(w, u) ∣ (u, w) ∈ E ∖ M}), i.e., the matching edges are directed from U to W and all the other edges from W to U. Now an edge (u, w) ∈ E belongs to some maximum matching if and only if
1. (u, w) ∈ M,
2. (u, w) is an edge inside a strongly connected component of G_M, or
3. (u, w) appears in a path of G_M that starts from an M-free vertex.
Strongly connected components can be computed efficiently with Tarjan’s algorithm or with
Kosaraju’s algorithm.
Example
Consider again our running example having the value graph G and a maximum matching
M shown below.
The directed graph GM is the following.
Now:
The edges (x1, 2), (x2, 3) and (x3, 1) are in M and thus belong to some maximum matching.
The directed graph has only one non-singleton strongly connected component, {x2, 3, x3, 1}. Thus the edges (x2, 1) and (x3, 3) belong to some maximum matching.
As [4, x1] is a path of the form required by the third case, (x1, 4) belongs to some maximum matching.
The edge (x1, 1) is not covered by any of the three cases above. Therefore, the value 1 can be removed from the domain of x1. The same applies to the edge (x1, 3), so the value 3 can be removed as well. The reduced domains still contain, for example, the solution {x1 ↦ 4, x2 ↦ 3, x3 ↦ 1}.
Backtracking Search
How can we solve CSP instances? That is, how can we deduce whether a set of constraints is
satisfiable or not, or to find an optimal solution for them? When the domains are finite, CSP instances could be solved by expanding them to the Boolean level (for example, by translating them into CNF SAT) and running a SAT solver.
Indeed, some CSP solvers do this. However, performing backtracking search and constraint propagation at the level of global constraints can result in better performance, as
1. the expansion of non-binary domains and global constraints to the Boolean level can be
quite large, and
2. one can use efficient custom algorithms for propagating global constraints.
def solve(C, D):
    D′ ← propagate(C, D)
    if some domain in D′ is empty: return unsat
    if all domains in D′ are singletons: return the corresponding solution
    select a variable x whose domain D′(x) = {v1, . . . , vk} has at least two values
    for each v ∈ {v1, . . . , vk}:
        r ← solve(C, D′x↦{v})    (where D′x↦{v} is D′ with the domain of x restricted to {v})
        if r ≠ unsat: return r
    return unsat
Example
Consider a simple ARM architecture assembly program with symbolic register names
below.
LDR a, a_addr // a_addr is the address of the input array
MOV i, #0 // i = 0
MOV s, #0 // s = 0
loop:
CMP i, #10 // if i >= 10, goto end
BGE end
LDR t, [a, i, LSL#2] // t = a[i]
MUL v, t, t // v = t * t
ADD s, s, v // s = s + v
ADD i, i, #1 // i = i + 1
B loop
end:
LDR b, b_addr // save s to the memory location b_addr
STR s, [b]
If we perform a liveness analysis for the registers, we get a register-interference graph (see e.g. Aho, Sethi, and Ullman: “Compilers: Principles, Techniques, and Tools”) shown below.
It has an edge between register nodes if and only if there is a line in which one register is
written to and the other can be read later without being written to in between. The graph
is k-colorable iff the program can be implemented by using only k hardware registers and
no spills to the memory.
The graph k-coloring problem can be easily expressed as a CSP problem by taking a
variable for each vertex. The domain of all the variables is a set of k colors. The
constraints are x ≠ y for each edge between x and y in the graph. In our simple example,
the constraints are
a ≠ i, a ≠ s, a ≠ t, a ≠ v, i ≠ s,
i ≠ t, i ≠ v, s ≠ t, s ≠ v, s ≠ b.
A search tree for our instance when three colors are allowed is shown below. Each node
describes the domains either when entering the solve call or after constraint
propagation.
The instance has no solutions as all the branches end in a situation where the domain of
some variable is empty.
Our example graph is colorable with 4 colors and thus the program can be implemented
with 4 registers:
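The coloring itself is shown in a figure that is not reproduced here. As a rough stand-in, the following sketch applies the backtracking-with-propagation scheme to exactly the inequality constraints listed above: it reports that no 3-coloring exists and prints one 4-coloring. The propagation step only handles binary ≠ constraints, which is all this instance needs.

from copy import deepcopy

# Register-interference edges taken from the inequality constraints listed above.
EDGES = [("a", "i"), ("a", "s"), ("a", "t"), ("a", "v"),
         ("i", "s"), ("i", "t"), ("i", "v"),
         ("s", "t"), ("s", "v"), ("s", "b")]
VARS = ["a", "i", "s", "t", "v", "b"]

def propagate(domains):
    # Filter x != y constraints: once one endpoint is fixed, remove its value from the other.
    changed = True
    while changed:
        changed = False
        for x, y in EDGES:
            for u, w in ((x, y), (y, x)):
                if len(domains[u]) == 1:
                    v = next(iter(domains[u]))
                    if v in domains[w]:
                        domains[w] = domains[w] - {v}
                        changed = True
    return domains

def solve(domains):
    domains = propagate(deepcopy(domains))
    if any(not d for d in domains.values()):
        return None                                      # some domain became empty: dead end
    if all(len(d) == 1 for d in domains.values()):
        return {x: next(iter(d)) for x, d in domains.items()}
    x = min((y for y in VARS if len(domains[y]) > 1), key=lambda y: len(domains[y]))
    for v in sorted(domains[x]):                         # branch on each value of the chosen variable
        r = solve({**domains, x: {v}})
        if r is not None:
            return r
    return None

for k in (3, 4):
    print(k, "colors:", solve({x: set(range(k)) for x in VARS}))
# 3 colors: None (no solution); 4 colors: a valid register assignment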
The above description of the backtracking search scheme is simplified and omits many
details. For instance, the variable selection heuristic, i.e. how one chooses the variable x
whose domain {v1 , … , vk } is split into {v1 },…,{vk }, is very important but not discussed
further in this course. And one could also split the domain {v1 , … , vk } of the branching
variable x differently, for instance into {v1 , … , vk/2 } and {vk/2+1 , … , vk } if k is large.
Some solvers allow the user to control these aspects and thus tune the solver for specific
problem types.
Knowledge-Based Agent in Artificial
intelligence
o An intelligent agent needs knowledge about the real world for taking decisions
and reasoning to act efficiently.
o Knowledge-based agents are those agents who have the capability
of maintaining an internal state of knowledge, reason over that
knowledge, update their knowledge after observations and take actions.
These agents can represent the world with some formal representation
and act intelligently.
o Knowledge-based agents are composed of two main parts:
o Knowledge-base and
o Inference system.
Inference system
Inference means deriving new sentences from old. Inference system allows us to add
a new sentence to the knowledge base. A sentence is a proposition about the world.
Inference system applies logical rules to the KB to deduce new information.
An inference system generates new facts so that an agent can update the KB. An inference system mainly works with the following two methods:
o Forward chaining
o Backward chaining
1. TELL: This operation tells the knowledge base what it perceives from the
environment.
2. ASK: This operation asks the knowledge base what action it should perform.
3. Perform: It performs the selected action.
function KB-AGENT(percept) returns an action
    persistent: KB, a knowledge base
                t, a counter, initially 0, indicating time
    TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
    action ← ASK(KB, MAKE-ACTION-QUERY(t))
    TELL(KB, MAKE-ACTION-SENTENCE(action, t))
    t ← t + 1
    return action
The knowledge-based agent takes percept as input and returns an action as output.
The agent maintains the knowledge base, KB, and it initially has some background
knowledge of the real world. It also has a counter to indicate the time for the whole
process, and this counter is initialized with zero.
Each time the function is called, it performs these three operations:
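The TELL/ASK cycle above can also be written as a small program loop. The following Python sketch is only an illustration: the class name, the tuple-based sentence encoding, and the fixed action returned by ask() are hypothetical stand-ins for a real knowledge base and inference procedure.

class KnowledgeBasedAgent:
    def __init__(self):
        self.kb = set()     # the knowledge base (background knowledge could be pre-loaded here)
        self.t = 0          # time counter, initially 0

    def tell(self, sentence):
        # Add a sentence to the knowledge base.
        self.kb.add(sentence)

    def ask(self, query):
        # Placeholder "inference": always pick the same action. A real agent would
        # derive the action from the KB using logical inference.
        return "move-forward"

    def step(self, percept):
        self.tell(("percept", percept, self.t))        # TELL the KB what was perceived at time t
        action = self.ask(("what-action", self.t))     # ASK the KB which action to perform
        self.tell(("did", action, self.t))             # TELL the KB the chosen action
        self.t += 1
        return action

agent = KnowledgeBasedAgent()
print(agent.step("breeze"))   # 'move-forward'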
1. Knowledge level
The knowledge level is the first level of a knowledge-based agent. At this level, we need to specify what the agent knows and what the agent's goals are. With these specifications, we can fix its behavior. For example, suppose an automated taxi agent needs to go from station A to station B and it knows the way from A to B; this belongs to the knowledge level.
2. Logical level:
At this level, we understand how the knowledge is represented and stored. Here, sentences are encoded into different logics; that is, an encoding of knowledge into logical sentences occurs at the logical level. At the logical level we can expect the automated taxi agent to reach destination B.
3. Implementation level:
This is the physical representation of logic and knowledge. At the implementation level, the agent performs actions according to the logical and knowledge levels. At this level, the automated taxi agent actually implements its knowledge and logic so that it can reach the destination.
However, in the real world, a successful agent can be built by combining both
declarative and procedural approaches, and declarative knowledge can often be
compiled into more efficient procedural code.
Propositional logic in Artificial intelligence
Propositional logic (PL) is the simplest form of logic where all the statements are made
by propositions. A proposition is a declarative statement which is either true or false.
It is a technique of knowledge representation in logical and mathematical form.
Example:
a) It is Sunday.
b) The Sun rises from the West. (False proposition)
c) 3 + 3 = 7. (False proposition)
d) 5 is a prime number.
Propositions can be of two types:
a. Atomic propositions: simple propositions that consist of a single proposition symbol and cannot be broken down further.
b. Compound propositions: propositions constructed by combining simpler propositions using logical connectives.
Logical Connectives:
Logical connectives are used to connect two simpler propositions or to represent a sentence logically. We can create compound propositions with the help of logical connectives. There are mainly five connectives:
o Negation (¬)
o Conjunction (∧, AND)
o Disjunction (∨, OR)
o Implication (→)
o Biconditional (⇔)
Truth Table:
In propositional logic, we need to know the truth values of propositions in all possible scenarios. We can combine all the possible truth-value combinations with the logical connectives, and the representation of these combinations in a tabular format is called a truth table.
Following are the truth tables for all the logical connectives:
Truth table with three propositions:
We can build a proposition composed of three propositions P, Q, and R. Its truth table has 8 (= 2³) rows, as we have taken three proposition symbols.
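As a small illustration of how such a table can be produced mechanically, the sketch below enumerates all 2^3 = 8 assignments for three proposition symbols and evaluates a compound proposition on each row. The particular formula (P ∧ Q) ∨ ¬R is a made-up example, not one taken from the text.

from itertools import product

def truth_table(symbols, formula):
    # Enumerate every combination of truth values and evaluate the formula on it.
    rows = []
    for values in product([True, False], repeat=len(symbols)):
        env = dict(zip(symbols, values))
        rows.append((env, formula(env)))
    return rows

# Example compound proposition: (P AND Q) OR (NOT R)
formula = lambda e: (e["P"] and e["Q"]) or (not e["R"])
for env, result in truth_table(["P", "Q", "R"], formula):
    print(env["P"], env["Q"], env["R"], "->", result)   # prints the 8 rows of the table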
Precedence of connectives:
Just like arithmetic operators, there is a precedence order for propositional connectors
or logical operators. This order should be followed while evaluating a propositional
problem. Following is the list of the precedence order for operators:
Precedence            Operators
First precedence      Parenthesis
Second precedence     Negation
Third precedence      Conjunction (AND)
Fourth precedence     Disjunction (OR)
Fifth precedence      Implication
Sixth precedence      Biconditional
Note: For better understanding, use parentheses to make sure of the correct interpretation. For example, ¬R ∨ Q is interpreted as (¬R) ∨ Q.
Logical equivalence:
Logical equivalence is one of the features of propositional logic. Two propositions are said to be logically equivalent if and only if their columns in the truth table are identical to each other.
Let's take two propositions A and B; their logical equivalence is written as A ⇔ B. In the truth table below we can see that the columns for ¬A ∨ B and A → B are identical, hence ¬A ∨ B is equivalent to A → B.
Properties of Operators:
o Commutativity:
o P∧ Q= Q ∧ P, or
o P ∨ Q = Q ∨ P.
o Associativity:
o (P ∧ Q) ∧ R= P ∧ (Q ∧ R),
o (P ∨ Q) ∨ R= P ∨ (Q ∨ R)
o Identity element:
o P ∧ True = P,
o P ∨ True= True.
o Distributive:
o P∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
o P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
o DE Morgan's Law:
o ¬ (P ∧ Q) = (¬P) ∨ (¬Q)
o ¬ (P ∨ Q) = (¬ P) ∧ (¬Q).
o Double-negation elimination:
o ¬ (¬P) = P.
Limitations of Propositional logic:
o We cannot represent relations like ALL, some, or none with propositional logic. Example:
a. All the girls are intelligent.
b. Some apples are sweet.
o Propositional logic has limited expressive power.
o In propositional logic, we cannot describe statements in terms of their properties or logical relationships.
Rules of Inference in Artificial intelligence
Inference:
In artificial intelligence, we need intelligent computers that can create new logic from old logic or from evidence; generating conclusions from evidence and facts is termed inference.
Inference rules:
Inference rules are the templates for generating valid arguments. Inference rules are
applied to derive proofs in artificial intelligence, and the proof is a sequence of the
conclusion that leads to the desired goal.
In inference rules, the implication among all the connectives plays an important role.
Following are some terminologies related to inference rules:
o Implication: one of the logical connectives, represented as P → Q.
o Converse: the converse of the implication P → Q is Q → P.
o Contrapositive: the contrapositive of P → Q is ¬Q → ¬P.
o Inverse: the inverse of P → Q is ¬P → ¬Q.
From the above terms, some of the compound statements are equivalent to each other, which we can prove using a truth table:
Hence from the truth table, we can prove that P → Q is equivalent to ¬Q → ¬P, and Q → P is equivalent to ¬P → ¬Q.
Types of Inference rules:
1. Modus Ponens:
The Modus Ponens rule is one of the most important rules of inference. It states that if P and P → Q are both true, then we can infer that Q is true. It can be represented as:
Example:
2. Modus Tollens:
The Modus Tollens rule states that if P → Q is true and ¬Q is true, then ¬P will also be true. It can be represented as:
3. Hypothetical Syllogism:
The Hypothetical Syllogism rule states that if P → Q is true and Q → R is true, then P → R will also be true.
Example:
Statement-1: If you have my home key then you can unlock my home. P→Q
Statement-2: If you can unlock my home then you can take my money. Q→R
Conclusion: If you have my home key then you can take my money. P→R
4. Disjunctive Syllogism:
The Disjunctive Syllogism rule states that if P ∨ Q is true and ¬P is true, then Q will be true. It can be represented as:
Example:
5. Addition:
The Addition rule is one of the common inference rules; it states that if P is true, then P ∨ Q will be true.
Example:
Proof by Truth-Table:
6. Simplification:
The Simplification rule states that if P ∧ Q is true, then P is true and Q is true. It can be represented as:
Proof by Truth-Table:
7. Resolution:
The Resolution rule states that if P ∨ Q and ¬P ∨ R are true, then Q ∨ R will also be true. It can be represented as:
Proof by Truth-Table:
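The truth table itself is not reproduced above. As a rough substitute, the following sketch checks by brute force that ((P ∨ Q) ∧ (¬P ∨ R)) → (Q ∨ R) is true under every assignment, which is exactly what the truth-table proof of the resolution rule would show.

from itertools import product

def implies(a, b):
    # Material implication: a -> b is false only when a is true and b is false.
    return (not a) or b

valid = all(
    implies((p or q) and ((not p) or r), q or r)      # premises imply the resolvent
    for p, q, r in product([True, False], repeat=3)   # all 8 truth assignments
)
print(valid)   # True: the resolution rule is a valid inference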
First-Order Logic in Artificial intelligence
In the topic of propositional logic, we have seen how to represent statements using propositional logic. Unfortunately, in propositional logic we can only represent facts that are either true or false. PL is not sufficient to represent complex sentences or natural-language statements; its expressive power is very limited. Consider sentences such as "All the girls are intelligent" or "Some apples are sweet", which we cannot represent using PL.
To represent such statements, PL is not sufficient, so we require a more powerful logic, such as first-order logic.
First-Order logic:
o First-order logic is another way of knowledge representation in artificial
intelligence. It is an extension to propositional logic.
o FOL is sufficiently expressive to represent the natural language statements in a
concise way.
o First-order logic is also known as Predicate logic or First-order predicate logic. First-order logic is a powerful language that expresses information about objects in a more natural way and can also express the relationships between those objects.
o First-order logic (like natural language) does not only assume that the world
contains facts like propositional logic but also assumes the following things in
the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits,
wumpus, ......
o Relations: unary relations such as red, round, is adjacent, or n-ary relations such as the sister of, brother of, has color, comes between
o Function: Father of, best friend, third inning of, end of, ......
o As a natural language, first-order logic also has two main parts:
a. Syntax
b. Semantics
Syntax of First-Order logic:
The syntax of FOL determines which collection of symbols is a logical expression in
first-order logic. The basic syntactic elements of first-order logic are symbols. We write
statements in short-hand notation in FOL.
Variables: x, y, z, a, b, ...
Connectives: ∧, ∨, ¬, ⇒, ⇔
Equality: =
Quantifiers: ∀, ∃
Atomic sentences:
o Atomic sentences are the most basic sentences of first-order logic. These
sentences are formed from a predicate symbol followed by a parenthesis with
a sequence of terms.
o We can represent atomic sentences as Predicate (term1, term2, ......, term n).
Complex Sentences:
o Complex sentences are made by combining atomic sentences using connectives.
First-order logic statements can be divided into two parts: a subject and a predicate. Consider the statement "x is an integer": it consists of two parts; the first part, x, is the subject of the statement, and the second part, "is an integer", is known as the predicate.
Universal Quantifier:
Universal quantifier is a symbol of logical representation, which specifies that the
statement within its range is true for everything or every instance of a particular thing.
o For all x
o For each x
o For every x.
Example:
All men drink coffee.
Let x be a variable that refers to a man; then all x can be represented in the UOD as:
∀x man(x) → drink(x, coffee).
It will be read as: For all x, if x is a man, then x drinks coffee.
Existential Quantifier:
Existential quantifiers are the type of quantifiers, which express that the statement
within its scope is true for at least one instance of something.
If x is a variable, then the existential quantifier will be ∃x or ∃(x). It will be read as: "there exists an x", "for some x", or "for at least one x".
Example:
Some boys are intelligent.
∃x: boys(x) ∧ intelligent(x)
It will be read as: There exists some x such that x is a boy and x is intelligent.
Points to remember:
o The main connective for universal quantifier ∀ is implication →.
o The main connective for existential quantifier ∃ is and ∧.
Properties of Quantifiers:
o In universal quantifier, ∀x∀y is similar to ∀y∀x.
o In Existential quantifier, ∃x∃y is similar to ∃y∃x.
o ∃x∀y is not similar to ∀y∃x.
At the first level or highest level, we will examine the functionality of the circuit:
3. Decide on vocabulary:
The next step of the process is to select functions, predicate, and constants to
represent the circuits, terminals, signals, and gates. Firstly we will distinguish the gates
from each other and from other objects. Each gate is represented as an object which
is named by a constant, such as, Gate(X1). The functionality of each gate is determined
by its type, which is taken as constants such as AND, OR, XOR, or NOT. Circuits will
be identified by a predicate: Circuit (C1).
For gate inputs, we will use the function In(1, X1) to denote the first input terminal of the gate, and for the output terminal we will use Out(1, X1). The function Arity(c, i, j) is used to denote that circuit c has i inputs and j outputs.
We use a unary predicate On (t), which is true if the signal at a terminal is on.
o If two terminals are connected, then they have the same signal; this can be represented as:
1. ∀ t1, t2 Terminal (t1) ∧ Terminal (t2) ∧ Connect (t1, t2) → Signal (t1) = Signal (t2).
o The signal at every terminal has either the value 0 or 1; this can be represented as:
1. ∀ t Terminal (t) → Signal (t) = 1 ∨ Signal (t) = 0.
o The output of an AND gate is zero if and only if any of its inputs is zero:
1. ∀ g Gate(g) ∧ Type(g) = AND → (Signal (Out(1, g)) = 0 ⇔ ∃ n Signal (In(n, g)) = 0).
o The output of an XOR gate is 1 if and only if its two inputs are different:
1. ∀ g Gate(g) ∧ Type(g) = XOR → Signal (Out(1, g)) = 1 ⇔ Signal (In(1, g)) ≠ Signal (In(2, g)).
o All the gates in the above circuit have two inputs and one output (except NOT
gate).
For the given circuit C1, we can encode the problem instance in atomic sentences as
below:
Since in the circuit there are two XOR, two AND, and one OR gate so atomic sentences
for these gates will be:
What combination of inputs would make the first output of circuit C1 be 0 and the second output be 1?
1. ∃ i1, i2, i3 Signal (In(1, C1))=i1 ∧ Signal (In(2, C1))=i2 ∧ Signal (In(3, C1))= i3
2. ∧ Signal (Out(1, C1)) =0 ∧ Signal (Out(2, C1))=1
Substitution:
Substitution is a fundamental operation performed on terms and formulas, and it occurs in all inference systems of first-order logic. Writing F[a/x] means substituting the constant a for the variable x in F.
Note: First-order logic is capable of expressing facts about some or all objects in the
universe.
Equality:
First-order logic does not only use predicates and terms for making atomic sentences; it also uses equality. The equality symbol specifies that two terms refer to the same object. For example: Brother (John) = Smith.
In this example, the object referred to by Brother (John) is the same as the object referred to by Smith. The equality symbol can also be used with negation to express that two terms are not the same object.
o Universal Generalization
o Universal Instantiation
o Existential Instantiation
o Existential introduction
1. Universal Generalization:
o Universal generalization is a valid inference rule which states that if premise P(c) is true for any arbitrary element c in the universe of discourse, then we can conclude ∀x P(x).
Example: Let's represent, P(c): "A byte contains 8 bits", so for ∀ x P(x) "All bytes
contain 8 bits.", it will also be true.
2. Universal Instantiation:
Universal Instantiation (also called universal elimination) is a valid inference rule which states that from ∀x P(x) we can infer P(c) for any ground term c, i.e., we may substitute any object of the universe of discourse for the universally quantified variable.
Example:
"All kings who are greedy are Evil." So let our knowledge base contains this detail as
in the form of FOL:
So from this information, we can infer any of the following statements using Universal
Instantiation:
o King(John) ∧ Greedy (John) → Evil (John),
o King(Richard) ∧ Greedy (Richard) → Evil (Richard),
o King(Father(John)) ∧ Greedy (Father(John)) → Evil (Father(John)),
3. Existential Instantiation:
Existential Instantiation (also called existential elimination) is a valid inference rule which states that from ∃x P(x) we can infer P(K) for a new constant symbol K that does not appear elsewhere in the knowledge base.
Example:
From ∃x Crown(x) ∧ OnHead(x, John) we can infer Crown(K) ∧ OnHead(K, John), as long as K does not appear anywhere else in the knowledge base.
4. Existential Introduction:
Existential introduction (also called existential generalization) states that if P(c) is true for some element c, then we can infer ∃x P(x).
Generalized Modus Ponens Rule:
Generalized Modus Ponens can be summarized as: "P implies Q and P is asserted to be true, therefore Q must be true."
According to Generalized Modus Ponens, for atomic sentences pi, pi′, and q, where there is a substitution θ such that SUBST(θ, pi′) = SUBST(θ, pi), the rule can be represented as:
Example:
We will use this rule for "All kings who are greedy are evil": we find some x such that x is a king and x is greedy, and we can then infer that x is evil. The substitution θ = {John/x} is a unifier for these atoms, and applying this substitution makes both expressions identical.
o The UNIFY algorithm is used for unification, which takes two atomic sentences
and returns a unifier for those sentences (If any exist).
o Unification is a key component of all first-order inference algorithms.
o It returns fail if the expressions do not match with each other.
o The substitution variables are called Most General Unifier or MGU.
E.g. Let's say there are two different expressions, P(x, y), and P(a, f(z)).
In this example, we need to make both above statements identical to each other. For
this, we will perform the substitution.
o Substitute x with a, and y with f(z) in the first expression, and it will be
represented as a/x and f(z)/y.
o With both the substitutions, the first expression will be identical to the second
expression and the substitution set will be: [a/x, f(z)/y].
Unification Algorithm:
Algorithm: Unify(Ψ1, Ψ2)
For each pair of the following atomic sentences find the most general unifier (If
exist).
5. Find the MGU of Q(a, g(x, a), f(y)), Q(a, g(f(b), a), x)}
SUBST θ = {f(b)/x, b/y}
S1 => {Q(a, g(f(b), a), f(b)); Q(a, g(f(b), a), f(b))}, Successfully Unified.
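The steps of the unification algorithm are not listed above, so the following is only a compact sketch of one possible implementation. The term representation (nested tuples such as ("Q", "a", ("g", "x", "a"), ("f", "y"))), the VARS set that marks which symbols are variables, and the omission of the occurs-check are assumptions made for this illustration.

# Terms are nested tuples: Q(a, g(x, a), f(y)) becomes ("Q", "a", ("g", "x", "a"), ("f", "y")).
VARS = {"x", "y", "z"}

def substitute(term, theta):
    # Apply a substitution to a term.
    if isinstance(term, tuple):
        return tuple(substitute(t, theta) for t in term)
    return theta.get(term, term)

def unify(t1, t2, theta=None):
    # Return a most general unifier of t1 and t2, or None if they do not unify.
    theta = {} if theta is None else theta
    t1, t2 = substitute(t1, theta), substitute(t2, theta)
    if t1 == t2:
        return theta
    if t1 in VARS:   # bind the variable t1 to t2 and update earlier bindings
        return {**{v: substitute(s, {t1: t2}) for v, s in theta.items()}, t1: t2}
    if t2 in VARS:
        return unify(t2, t1, theta)
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):   # unify argument by argument
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None

psi1 = ("Q", "a", ("g", "x", "a"), ("f", "y"))
psi2 = ("Q", "a", ("g", ("f", "b"), "a"), "x")
print(unify(psi1, psi2))   # {'x': ('f', 'b'), 'y': 'b'}, i.e. the MGU {f(b)/x, b/y}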
Resolution is used when various statements are given and we need to prove a conclusion from those statements. Unification is a key concept in proofs by resolution.
Resolution is a single inference rule which can efficiently operate on the conjunctive
normal form or clausal form.
Clause: A disjunction of literals (atomic sentences or their negations) is called a clause. A clause containing exactly one literal is known as a unit clause.
Note: To better understand this topic, firstly learns the FOL in AI.
This rule is also called the binary resolution rule because it only resolves exactly two
literals.
Example:
We can resolve two clauses which are given below:
The two complementary literals are Loves (f(x), x) and ¬ Loves (a, b).
These literals can be unified with the unifier θ = {f(x)/a, x/b}, which generates the resolvent clause:
To better understand all the above steps, we will take an example in which we will
apply resolution.
Example:
Step-1: In the first step, we convert all the given statements into first-order logic.
Step-2: Conversion of FOL into CNF
In first-order logic resolution, it is required to convert the FOL statements into CNF, because the CNF form makes resolution proofs easier.
Step-3: Negate the statement to be proved. In this step, we apply negation to the conclusion statement, which will be written as ¬likes(John, Peanuts).
Step-4: Draw the resolution graph. In this step, we solve the problem by building a resolution tree using substitution. For the above problem, it is given as follows:
Hence the negation of the conclusion has been proved to be a complete contradiction with the given set of statements.
Inference engine:
The inference engine is the component of an intelligent system in artificial intelligence that applies logical rules to the knowledge base to infer new information from known facts. The first inference engine was part of an expert system. An inference engine commonly proceeds in two modes, which are:
a. Forward chaining
b. Backward chaining
Horn clauses and definite clauses are forms of sentences that enable a knowledge base to use a more restricted and efficient inference algorithm. Logical inference algorithms use forward and backward chaining approaches, which require the KB to be in the form of first-order definite clauses.
Definite clause: A clause which is a disjunction of literals with exactly one positive
literal is known as a definite clause or strict horn clause.
Horn clause: A clause which is a disjunction of literals with at most one positive
literal is known as horn clause. Hence all the definite clauses are horn clauses.
Example: (¬ p ∨ ¬ q ∨ k) is a definite clause; it is equivalent to p ∧ q → k.
A. Forward Chaining
Forward chaining is also known as forward deduction or forward reasoning when using an inference engine. Forward chaining is a form of reasoning which starts with the atomic sentences in the knowledge base and applies inference rules (Modus Ponens) in the forward direction to extract more data until a goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules whose premises are satisfied, and adds their conclusions to the known facts. This process repeats until the problem is solved.
Properties of Forward-Chaining:
Consider the following famous example which we will use in both approaches:
Example:
"As per the law, it is a crime for an American to sell weapons to hostile nations.
Country A, an enemy of America, has some missiles, and all the missiles were sold
to it by Robert, who is an American citizen."
To solve the above problem, first, we will convert all the above facts into first-order
definite clauses, and then we will use a forward-chaining algorithm to reach the goal.
Step-1: In the first step, we start with the known facts and choose the sentences that do not have implications, such as American(Robert), Enemy(A, America), Owns(A, T1), and Missile(T1). All these facts will be represented as below.
Step-2:
In the second step, we consider the facts that can be inferred from the available facts whose premises are satisfied.
Rule-(1) does not yet have its premises satisfied, so it is not applied in the first iteration.
Rule-(4) is satisfied with the substitution {p/T1}, so Sells (Robert, T1, A) is added, which is inferred from the conjunction of Rules (2) and (3).
Rule-(6) is satisfied with the substitution {p/A}, so Hostile(A) is added, which is inferred from Rule-(7).
Step-3:
At step-3, Rule-(1) is satisfied with the substitution {p/Robert, q/T1, r/A}, so we can add Criminal(Robert), which is inferred from all the available facts. Hence we have reached our goal statement.
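As a rough illustration of this repeat-until-fixed-point process, here is a small ground (propositionalized) sketch of the crime example. The rule list is reconstructed from the step descriptions, including a "missiles are weapons" rule that the steps rely on but do not state explicitly, so treat it as an approximation rather than the exact rule set of the text.

# Known facts from Step-1.
facts = {"American(Robert)", "Enemy(A, America)", "Owns(A, T1)", "Missile(T1)"}

# Rules written out already instantiated with the substitutions mentioned above
# ({p/T1}, {p/A}, {p/Robert, q/T1, r/A}), as (premises, conclusion) pairs.
rules = [
    ({"Missile(T1)"}, "Weapon(T1)"),
    ({"Missile(T1)", "Owns(A, T1)"}, "Sells(Robert, T1, A)"),
    ({"Enemy(A, America)"}, "Hostile(A)"),
    ({"American(Robert)", "Weapon(T1)",
      "Sells(Robert, T1, A)", "Hostile(A)"}, "Criminal(Robert)"),
]

changed = True
while changed:                      # repeat until no rule adds a new fact
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print("Criminal(Robert)" in facts)  # True: the goal Criminal(Robert) is reached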
B. Backward Chaining:
Backward-chaining is also known as a backward deduction or backward reasoning
method when using an inference engine. A backward chaining algorithm is a form of
reasoning, which starts with the goal and works backward, chaining through rules to
find known facts that support the goal.
Backward-Chaining proof:
In Backward chaining, we will start with our goal predicate, which is Criminal(Robert),
and then infer further rules.
Step-1:
In the first step, we take the goal fact, and from the goal fact we infer other facts, which we finally prove to be true. Our goal fact is "Robert is a criminal", and the following is its predicate form.
Step-2:
In the second step, we infer other facts from the goal fact that satisfy the rules. As we can see in Rule-(1), the goal predicate Criminal (Robert) is present with the substitution {Robert/p}, so we add all the conjunctive facts below the first level and replace p with Robert.
Step-4:
At step-4, we can infer the facts Missile(T1) and Owns(A, T1) from Sells(Robert, T1, r), which satisfies Rule-(4) with the substitution of A in place of r. So these two statements are proved here.
Step-5:
At step-5, the subgoal Hostile(A) is reduced to the known fact Enemy(A, America) using Rule-(6). Hence all the statements are proved true using backward chaining.
Difference between backward chaining and
forward chaining
Following is the difference between the forward chaining and backward
chaining:
o Forward chaining, as the name suggests, starts from the known facts and moves forward by applying inference rules to extract more data, and it continues until it reaches the goal, whereas backward chaining starts from the goal and moves backward by using inference rules to determine the facts that satisfy the goal.
o Forward chaining is called a data-driven inference technique, whereas
backward chaining is called a goal-driven inference technique.
o Forward chaining is known as the bottom-up approach, whereas backward chaining is known as a top-down approach.
o Forward chaining uses breadth-first search strategy, whereas backward
chaining uses depth-first search strategy.
o Forward and backward chaining both applies Modus ponens inference rule.
o Forward chaining can be used for tasks such as planning, design process
monitoring, diagnosis, and classification, whereas backward chaining can be
used for classification and diagnosis tasks.
o Forward chaining can be like an exhaustive search, whereas backward chaining
tries to avoid the unnecessary path of reasoning.
o In forward chaining there can be many ASK queries to the knowledge base, whereas in backward chaining there are fewer ASK queries.
o Forward chaining is slower, as it checks all the rules, whereas backward chaining is faster, as it checks only the few required rules.
Forward chaining starts from known facts and applies inference rules to extract more data until it reaches the goal; backward chaining starts from the goal and works backward through inference rules to find the required facts that support the goal.
Forward chaining tests all the available rules; backward chaining only tests the few required rules.
Forward chaining is aimed at any conclusion; backward chaining is only aimed at the required data.