ADSA Lecture Notes

[23A05302T] ADVANCED DATA STRUCTURES & ALGORITHM ANALYSIS

STRASSEN’S MATRIX MULTIPLICATION

Let A and B be two n × n matrices. The product matrix C = AB is also an n × n matrix whose (i, j)th element is formed by taking the elements in the ith row of A and the jth column of B and multiplying them pairwise:

C(i, j) = ∑ A(i, k) B(k, j) , the sum over 1 ≤ k ≤ n, for all i and j between 1 and n.

The time complexity of this conventional method is O(n³).

The divide and conquer strategy suggests another way to compute the product of two
n × n matrices. Here, assume n is a power of 2. The two matrices A and B are each
partitioned into 4 submatrices, every submatrix having dimensions n/2 × n/2. Then the
product AB can be computed by the formula:

A11  A12      B11  B12      C11  C12
A21  A22   ×  B21  B22   =  C21  C22

Here,
C11 = A11 B11 + A12 B21
C12 = A11 B12 + A12 B22
C21 = A21 B11 + A22 B21
C22 = A21 B12 + A22 B22

Here, to compute the product AB, this scheme performs 8 multiplications and 4
additions of n/2 × n/2 matrices. The overall computing time T(n) is given by the recurrence relation:

T(n) = b                 ; n ≤ 2
T(n) = 8 T(n/2) + c n²   ; n > 2

Solving this recurrence, T(n) = O(n³).

Volker Strassen discovered a way to compute the Cij's using only 7 multiplications
and 18 additions or subtractions. His method first computes the seven n/2 × n/2
matrices P, Q, R, S, T, U and V as:

P = ( A11 + A22 ) ( B11 + B22 )


Q = ( A21 + A22 ) B11
R = A11 ( B12 - B22 )
S = A22 ( B21 - B11 )
T = ( A11 + A12 ) B22
U = ( A21 - A11 ) ( B11 + B12 )
V = ( A12 - A22 ) ( B21 + B22 )

II/IV B.TECH, I SEM, 2024-25, PBR VITS (Autonomous): Kavali, Page 48



Then, Cij `s can be obtained as:

C11 = P+S–T+V
C12 = R+T
C21 = Q+S
C22 = P+R–Q+U

The resulting recurrence relation for T(n) is:

T(n) = b                 ; n ≤ 2
T(n) = 7 T(n/2) + a n²   ; n > 2

Solving this recurrence, T(n) = O(n^log₂7) = O(n^2.81).

Example: Calculate product of the given two matrices using Strassen’s matrix
multiplication where

A = | 1 2 4 1 |      B = | 1 2 3 4 |
    | 2 3 2 4 |          | 4 3 2 1 |
    | 1 5 1 2 |          | 1 3 1 2 |
    | 3 1 4 2 |          | 4 1 2 3 |
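As a runnable sketch of the scheme above (assuming, as the notes do, that n is a power of 2), the following Python computes the seven products P through V recursively and reassembles C; it is applied to the example instance:

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def strassen(A, B):
    n = len(A)
    if n == 1:                       # base case: 1 x 1 matrices
        return [[A[0][0] * B[0][0]]]
    h = n // 2                       # split into n/2 x n/2 quadrants
    A11 = [r[:h] for r in A[:h]]; A12 = [r[h:] for r in A[:h]]
    A21 = [r[:h] for r in A[h:]]; A22 = [r[h:] for r in A[h:]]
    B11 = [r[:h] for r in B[:h]]; B12 = [r[h:] for r in B[:h]]
    B21 = [r[:h] for r in B[h:]]; B22 = [r[h:] for r in B[h:]]
    # the seven products P..V of Strassen's method
    P = strassen(add(A11, A22), add(B11, B22))
    Q = strassen(add(A21, A22), B11)
    R = strassen(A11, sub(B12, B22))
    S = strassen(A22, sub(B21, B11))
    T = strassen(add(A11, A12), B22)
    U = strassen(sub(A21, A11), add(B11, B12))
    V = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(P, S), T), V)      # C11 = P + S - T + V
    C12 = add(R, T)                      # C12 = R + T
    C21 = add(Q, S)                      # C21 = Q + S
    C22 = add(sub(add(P, R), Q), U)      # C22 = P + R - Q + U
    # reassemble the quadrants into one n x n matrix
    return [C11[i] + C12[i] for i in range(h)] + \
           [C21[i] + C22[i] for i in range(h)]

A = [[1, 2, 4, 1], [2, 3, 2, 4], [1, 5, 1, 2], [3, 1, 4, 2]]
B = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 3, 1, 2], [4, 1, 2, 3]]
C = strassen(A, B)
```

The result agrees with the conventional row-by-column product, which is an easy way to check any such implementation.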

CONVEX HULL

The convex hull of a set S of points in the plane is defined to be the smallest convex
polygon containing all points of S.

Note: A polygon is defined to be convex if for any two points p1 and p2 inside the polygon,
the directed line segment from p1 to p2 is fully contained in the polygon.

Example: (figure) an input set of points, and the convex hull enclosing them.
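The notes do not fix an algorithm here; as an illustrative sketch, Andrew's monotone-chain method (one standard O(n log n) approach, not the only one) builds the hull from the points sorted by coordinates:

```python
def cross(o, a, b):
    # z-component of (a - o) x (b - o); positive means a counter-clockwise turn
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Smallest convex polygon containing all points, returned as a
    counter-clockwise list of hull vertices (collinear boundary points dropped)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for chain, seq in ((lower, pts), (upper, list(reversed(pts)))):
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()          # last point cannot lie on the hull boundary
            chain.append(p)
    return lower[:-1] + upper[:-1]   # endpoints are shared, so drop duplicates
```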


UNIT – III

Greedy Method: General Method, Job Sequencing with deadlines, Knapsack Problem,
Minimum cost spanning trees, Single Source Shortest Paths.
Dynamic Programming: General Method, All pairs shortest paths, Single Source Shortest
Paths – General Weights (Bellman Ford Algorithm), Optimal Binary Search Trees, 0/1
Knapsack, String Editing, Travelling Salesperson problem.
***

GREEDY METHOD
The greedy method is a straightforward design technique applicable to problems whose
solution can be expressed as a subset that satisfies some constraints.

 Any subset that satisfies the constraints of the problem statement is called a “feasible
solution”.
 A feasible solution that either maximizes or minimizes the given objective function is
called an “optimal solution”.

To obtain an optimal solution, the greedy method suggests devising an algorithm that
works in stages, considering one input at a time. At each stage, a decision is made regarding
whether a particular input is in an optimal solution.
If the inclusion of the next input into the partially constructed optimal solution would result
in an infeasible solution, then this input is not added to the partial solution. Otherwise, it is
added.
Algorithms that build a solution as a subset of the inputs in this way are said to follow the
"Subset Paradigm"; note that for some problems this procedure generates only suboptimal solutions.

 Control abstraction algorithm of greedy strategy is as follows:

Algorithm Greedy ( a , n )
{
// a[1 : n] contains n inputs
solution : = Φ;
for i : = 1 to n do
{
x : = Select ( a );
if Feasible ( solution , x ) then
solution : = Union ( solution , x );
}
return solution;
}

Here,
 The function Select( ) selects an input from a[ ] and is assigned to x.
 Feasible( ) is a Boolean valued function that determines whether x can be included
into the solution vector.
 Union( ) function combines x with the solution and updates the objective function.
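This control abstraction can be sketched directly in Python; Select corresponds to the sort order, Feasible to the predicate, and Union to appending. The coin-picking instance at the end is purely illustrative and not from the notes:

```python
def greedy(a, feasible, key):
    """Subset-paradigm greedy: consider the inputs one at a time in the
    order given by key, keeping every input that stays feasible."""
    solution = []
    for x in sorted(a, key=key):       # Select: best remaining candidate first
        if feasible(solution, x):      # Feasible: may x join the solution?
            solution.append(x)         # Union: commit x to the solution
    return solution

# Illustrative instance: pick coins, largest first, without exceeding a budget.
budget = 11
picked = greedy([5, 10, 1, 5, 2],
                lambda sol, c: sum(sol) + c <= budget,
                key=lambda c: -c)
```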


APPLICATIONS

Greedy strategy can be applicable to solve different problems such as:

 Knapsack problem
 Job sequencing with dead lines
 Minimum cost spanning trees
 Single source shortest path etc,

KNAPSACK PROBLEM

Consider n objects and a knapsack or bag. Every object i has a weight wi and a profit pi,
and the knapsack has a capacity m. If a fraction xi, 0 ≤ xi ≤ 1, of object i is placed into the
knapsack, then a profit of pixi is earned. The objective is to obtain a filling of the knapsack
that maximizes the total profit earned.

Formally, the problem statement can be stated as:

maximize ∑ pi xi → 1
1≤i≤n

subject to ∑ wi xi ≤ m → 2
1≤i≤n

and 0 ≤ xi ≤ 1 , 1≤i≤n → 3

 A feasible solution is any set ( x1, x2, - - - - -, xn ) satisfying equation 2 and 3.


 An optimal solution is a feasible solution for which equation 1 is maximized.

Example: Find an optimal solution to the knapsack instance n = 3, m = 20,
(p1, p2, p3) = (25, 24, 15) and (w1, w2, w3) = (18, 15, 10).

Algorithm GreedyKnapsack ( m , n )
{
// p[1 : n] and w[1 : n] contain the profits and weights respectively of n objects
such that p[i] / w[i] ≥ p[i+1] / w[i+1] ≥ - - - - - - - -.

for i : = 1 to n do
x[i] : = 0.0;
U : = m;
for i : = 1 to n do
{
if ( w[i] > U ) then
break;


else
{
x[i] : = 1.0;
U : = U – w[i];
}
}
if ( i ≤ n ) then
x[i] : = U / w[i];
}

Here,
 If the sum of all the weights is ≤ m, then xi = 1, 1 ≤ i ≤ n, is an optimal solution.
 If p1 / w1 ≥ p2 / w2 ≥ - - - ≥ pn / wn , then GreedyKnapsack generates an optimal
solution to the given instance of the Knapsack problem.
 If the objects are already sorted by profit/weight ratio, GreedyKnapsack runs in O(n)
time; including the initial sort, it takes O(n log n).
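A sketch of GreedyKnapsack in Python (sorting by profit/weight ratio internally rather than assuming pre-sorted input), applied to the example instance above:

```python
def greedy_knapsack(p, w, m):
    """Fractional knapsack: take objects in non-increasing p[i]/w[i] order,
    whole while they fit, then a fraction of the first one that does not."""
    n = len(p)
    order = sorted(range(n), key=lambda i: p[i] / w[i], reverse=True)
    x = [0.0] * n
    cap = m                          # remaining capacity (U in the notes)
    for i in order:
        if w[i] > cap:               # object does not fit entirely
            x[i] = cap / w[i]        # take the fitting fraction and stop
            break
        x[i] = 1.0
        cap -= w[i]
    return x

x = greedy_knapsack([25, 24, 15], [18, 15, 10], 20)
profit = sum(pi * xi for pi, xi in zip([25, 24, 15], x))
```

For the example instance this yields x = (0, 1, 1/2) with total profit 31.5.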

JOB SEQUENCING WITH DEADLINES


Consider a set of n jobs. Every job i has a deadline di ≥ 0 and a profit pi > 0. For any
job i the profit pi is earned iff the job is completed by its deadline. Only one machine is
available to process the jobs, and each job takes one unit of processing time on that machine.

 A feasible solution for this problem is a subset J of the jobs such that each job in the
subset can be completed by its deadline. The value of the feasible solution J is the sum of
the profits of its jobs, ∑ pi over i ∈ J.
 An optimal solution is a feasible solution with maximum profit value.

Example: Obtain optimal solution of Job sequencing with deadlines for the instance n=5,
( p1, p2, p3, p4, p5 ) = (20, 15, 10, 5, 1) and ( d1, d2, d3, d4, d5 ) = (2, 2, 1, 3, 3).

Algorithm GreedyJob ( d, J, n )
{
J : = { 1 };
for i : = 2 to n do
{
if ( all jobs in J U { i } can be completed by their deadlines ) then
J : = J U { i };
}
}

 The computing time of Job sequencing with deadlines is O(n²).
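The greedy procedure (jobs in non-increasing profit order, each placed in the latest free slot on or before its deadline) can be sketched as follows; jobs are 0-indexed here, so the result corresponds to jobs 1, 2 and 4 of the example:

```python
def job_sequencing(p, d):
    """Returns (J, total_profit): the selected jobs and their profit."""
    n = len(p)
    slot = [None] * (max(d) + 1)          # slot[t] holds the job run at time t
    for i in sorted(range(n), key=lambda i: -p[i]):
        for t in range(d[i], 0, -1):      # latest free slot not after d[i]
            if slot[t] is None:
                slot[t] = i               # job i scheduled; otherwise rejected
                break
    J = [i for i in slot if i is not None]
    return J, sum(p[i] for i in J)

J, total = job_sequencing([20, 15, 10, 5, 1], [2, 2, 1, 3, 3])
```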


MINIMUM COST SPANNING TREES

Let G = (V, E) be an undirected connected graph. A subgraph T = (V, E′) of G is a
spanning tree iff T is a tree.

Example: (figure) a connected graph and some of its possible spanning trees.

If the graph contains n vertices, a spanning tree contains exactly n-1 edges connecting
those n vertices.

Consider a weighted graph in which every edge is labeled with a cost / weight value.
Cost of the spanning tree is obtained by adding the cost of the selected edges.

“A minimum cost spanning tree of G is a spanning tree T having minimum cost”.

Most important methods used to find the minimum cost spanning trees are:

 Prim’s Algorithm
 Kruskal’s Algorithm

PRIM’s ALGORITHM

In this method, the minimum-cost spanning tree is built edge by edge. The next
edge to include is chosen according to some optimization criterion. The simplest criterion is
to choose an edge that results in a minimum increase in the sum of the costs of the edges so
far included.

Procedure:

 The procedure starts with a tree T that contains only the starting vertex.
 Add a least-cost edge (u, v) to T such that T ∪ { (u, v) } is also a tree.
 To do so, select the edge (u, v) such that exactly one of u and v is already in T.
 If selecting the edge (u, v) would form a cycle, discard it; otherwise, include it in the
tree structure.
 Repeat the above procedure until T contains n-1 edges.


Example: Obtain minimum-cost spanning tree for the graph

Algorithm Prim( )
{
T : = Φ ;
TV : = { 1 } ; // start the tree from an arbitrary vertex, say 1
while ( T contains less than n-1 edges ) do
{
Let (u, v) be a least cost edge such that u ∈ TV and v ∉ TV;
if ( there is no such edge ) then
break;
else
{
Add v to TV;
Add (u, v) to T;
}
}
}

 Time complexity of Prim's algorithm is O(n²).
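A heap-based sketch of Prim's procedure in Python; with a heap the bound becomes O(|E| log n) instead of the O(n²) adjacency-matrix version above. The small graph at the end is invented for illustration, since the notes' figure is not reproduced here:

```python
import heapq

def prim(n, edges, start=0):
    """Prim's algorithm on an undirected graph with vertices 0..n-1.
    edges: (u, v, cost) triples.  Returns the MST cost, or None if the
    graph is not connected."""
    adj = [[] for _ in range(n)]
    for u, v, c in edges:
        adj[u].append((c, v))
        adj[v].append((c, u))
    in_tree = [False] * n
    heap = [(0, start)]                  # least-cost edges reaching the frontier
    total = picked = 0
    while heap and picked < n:
        c, v = heapq.heappop(heap)
        if in_tree[v]:
            continue                     # this edge would close a cycle: discard
        in_tree[v] = True                # add v to TV and the edge cost to T
        total += c
        picked += 1
        for cost, w in adj[v]:
            if not in_tree[w]:
                heapq.heappush(heap, (cost, w))
    return total if picked == n else None

mst_cost = prim(4, [(0, 1, 2), (1, 2, 3), (0, 3, 6), (1, 3, 8), (2, 3, 7)])
```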

KRUSKAL’s ALGORITHM

Kruskal’s algorithm builds a minimum-cost spanning tree T by adding the edges one
at a time. It selects the edges in non-decreasing order of their costs. An edge is added to T if
it does not create a cycle.

Procedure:

 The procedure starts with all n vertices as a forest of n single-vertex trees.
 Select an edge (u, v) of minimum cost and add it to T if it does not form a cycle.
 Repeat the above step until T contains n-1 edges.

Example: Obtain minimum-cost spanning tree for the graph


Algorithm Kruskal( )
{
T:=Φ;
while (( T has less than n-1 edges) and (E ≠ Φ )) do
{
Choose an edge (u,v) from E of Lowest cost;
Delete (u, v) from E;
if (u, v) does not create a cycle in T then
add (u,v) to T;
else
discard (u, v);
}
}

 The time complexity of Kruskal’s algorithm is O( |E| log n ) time.
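A union-find sketch of Kruskal's procedure in Python, applied to a small invented graph (the notes' figure is not reproduced here):

```python
def kruskal(n, edges):
    """Kruskal's algorithm: scan edges in non-decreasing cost order,
    accepting an edge only if it joins two different trees of the forest."""
    parent = list(range(n))
    def find(x):                         # root of the tree containing x
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x
    total = count = 0
    for u, v, c in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:                     # no cycle: add (u, v) to T
            parent[ru] = rv
            total += c
            count += 1
    return total if count == n - 1 else None

mst_cost = kruskal(4, [(0, 1, 2), (1, 2, 3), (0, 3, 6), (1, 3, 8), (2, 3, 7)])
```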

Note: The main differences between Prim's and Kruskal's algorithms are:
 Prim's algorithm maintains a single tree at every stage, whereas Kruskal's maintains a
forest, so its partial solution need not be a tree.
 Finally, both methods produce a spanning tree of the same minimum cost.

SINGLE-SOURCE SHORTEST PATHS


Graphs can be used to represent a highway structure, with vertices representing cities
and edges representing the links between the cities. These edges are labeled
with weights representing the distances between the cities.
In general, one wants to travel from one city to another by a route of
minimum distance; this leads to the shortest path problem.

 Let G = (V, E) be a weighted directed graph with source vertex v0. The
problem is to determine the shortest paths from v0 to all the remaining vertices of
G.
 The greedy method generates shortest paths from vertex v0 to the remaining
vertices in non-decreasing order of path length.
 First, a shortest path to the nearest vertex is generated. Then, a shortest path to the
second nearest vertex is generated and so on.

Example: Identify shortest paths of the following graph by starting vertex as ‘5’.


Algorithm ShortestPath ( v, Cost, Dist, n )
{
// Graph G is represented by its cost adjacency matrix Cost[1:n , 1:n]

for i : = 1 to n do
{
S[i] : = False;
Dist[i] : = Cost[v, i];
}
S[v] : = True;
Dist[v] : = 0.0;
for num : = 1 to n-1 do
{
Choose u from among those vertices not in S such that Dist[u] is minimum;
S[u] : = True;
for ( each w adjacent to u with S[w] = False ) do
{
if ( Dist[w] > Dist[u] + Cost[u, w] ) then
Dist[w] : = Dist[u] + Cost[u, w];
}
}
}

 The time complexity of the single-source shortest path algorithm is O(n²).
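The ShortestPath procedure above (Dijkstra's method) can be sketched in Python with the same structure; the graph instance is invented for illustration, with float('inf') standing for a missing edge:

```python
INF = float('inf')

def shortest_path(cost, v):
    """Single-source shortest paths from v.  cost: n x n matrix with 0 on
    the diagonal and INF where there is no edge.  Returns Dist."""
    n = len(cost)
    S = [False] * n                      # S[i]: shortest path to i finalized
    dist = list(cost[v])
    S[v], dist[v] = True, 0
    for _ in range(n - 1):
        # choose the vertex u not in S with minimum dist
        u = min((i for i in range(n) if not S[i]), key=lambda i: dist[i])
        if dist[u] == INF:
            break                        # remaining vertices are unreachable
        S[u] = True
        for w in range(n):
            if not S[w] and dist[u] + cost[u][w] < dist[w]:
                dist[w] = dist[u] + cost[u][w]
    return dist

cost = [[0, 10, INF, 30],
        [INF, 0, 50, INF],
        [INF, INF, 0, INF],
        [INF, INF, 20, 0]]
dist = shortest_path(cost, 0)
```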

***

DYNAMIC PROGRAMMING

Dynamic programming is an algorithm design method that can be used when the
solution to a problem statement can be viewed as the result of a sequence of decisions.

To obtain an optimal solution, first generate all possible decision sequences, and then apply the
principle of optimality.

“The principle of optimality states that an optimal sequence of decisions has the
property that whatever the initial state and decisions are, the remaining decisions must
constitute an optimal decision sequence with respect to the state resulting from the first
decision”.

Note: The essential difference between the Greedy method and the Dynamic programming
is, in the greedy method only one decision sequence is ever generated. In dynamic
programming, many decision sequences may be generated.

Applications: Some of the problem statements that can be solved using dynamic
programming are:
 All pairs shortest paths
 Single-source shortest paths
 Optimal binary search trees
 0/1 Knapsack problem etc,


ALL-PAIRS SHORTEST PATHS

Let G = (V, E) be a directed graph with n vertices. Let cost be a cost adjacency
matrix such that cost(i, i) = 0, 1 ≤ i ≤ n. Then cost(i, j) is the length of edge <i, j> if <i, j>
∈ E(G), and cost(i, j) = ∞ if <i, j> ∉ E(G).

The all-pairs shortest path problem is to determine a matrix A such that A(i, j) is the
length of a shortest path from i to j. The matrix A can be obtained by solving n single-source
shortest path problems.

 By applying dynamic programming it can be solved as:

Step 1: To examine a shortest path from i to j: if k is an intermediate vertex on it, then the
path from i to j breaks into a shortest path from i to k followed by a shortest
path from k to j.
Step 2: Let Aᵏ(i, j) be the length of the shortest path from vertex i to j such that every
intermediate vertex on it is ≤ k.
Then compute Aᵏ for k = 1, 2, 3, - - -, n.

When the intermediate vertex k is introduced, there are two possible cases:

 the path goes from i to j via k, or
 the path does not go via k.
Thus, the principle of optimality holds.
Step 3: Shortest path can be computed using the formula

Aᵏ(i, j) = min { Aᵏ⁻¹(i, j) , Aᵏ⁻¹(i, k) + Aᵏ⁻¹(k, j) } , k ≥ 1

Here, if vertex k is used on the path, then Aᵏ(i, j) = Aᵏ⁻¹(i, k) + Aᵏ⁻¹(k, j);
otherwise, Aᵏ(i, j) = Aᵏ⁻¹(i, j).

Initially, A⁰(i, j) = cost(i, j) , 1 ≤ i, j ≤ n.

Example: Compute All-pairs shortest paths for the graph


Algorithm AllPaths ( cost, A, n )


{
// cost[1:n , 1:n] is the Cost Adjacency Matrix
// A[i, j] is the cost of shortest path from vertex i to j.
// cost[i, i] = 0 , ∀ 1 ≤ i ≤ n.

for i : = 1 to n do
for j : = 1 to n do
A[i, j] : = cost[i, j];
for k : = 1 to n do
for i : = 1 to n do
for j : = 1 to n do
A[i, j] : = min [ A[i, j] , A[i, k] + A[k, j] ] ;
}

Analysis:

Here, the initialization (the first nested loop) takes O(n²) time and the triple loop takes O(n³) time.
Hence, the overall time complexity is O(n³).
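The AllPaths computation in Python, run on a standard 3-vertex instance (invented here, as the notes' example graph is not reproduced):

```python
INF = float('inf')

def all_paths(cost):
    """Floyd-Warshall: returns A with A[i][j] = length of a shortest path
    from i to j, allowing intermediates from {0..k} at step k."""
    n = len(cost)
    A = [row[:] for row in cost]         # A0 = cost
    for k in range(n):                   # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if A[i][k] + A[k][j] < A[i][j]:
                    A[i][j] = A[i][k] + A[k][j]
    return A

A = all_paths([[0, 4, 11],
               [6, 0, 2],
               [3, INF, 0]])
```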

SINGLE-SOURCE SHORTEST PATHS [ General Weights ]


Consider a directed graph G that may have edges of negative length. When negative edge
lengths are permitted, we require that the graph have no cycles of negative length. When
there are no cycles of negative length, there is a shortest path between every two vertices of an
n-vertex graph that has at most n-1 edges on it.

Let distˡ[u] be the length of a shortest path from the source vertex v to vertex u that
contains at most l edges. Then,

dist¹[u] = cost[v, u] , 1 ≤ u ≤ n

The main objective is to compute distⁿ⁻¹[u] for all u. This can be done by the
dynamic programming methodology of the Bellman-Ford algorithm:

 If the shortest path from v to u with at most k, k > 1, edges actually has no more than
k-1 edges, then

distᵏ[u] = distᵏ⁻¹[u]

 If the shortest path from v to u with at most k, k > 1, edges has exactly k edges, then
it is made up of a shortest path from v to some vertex i with at most k-1 edges,
followed by the edge <i, u>; then

distᵏ[u] = min { distᵏ⁻¹[i] + cost[i, u] } , the minimum over all i

From these two cases, distᵏ can be shown as:

distᵏ[u] = min { distᵏ⁻¹[u] , min { distᵏ⁻¹[i] + cost[i, u] } }

This recurrence can be used to compute dist¹, dist², - - -, distⁿ⁻¹.


Example: Obtain single-source shortest paths for the graph

Algorithm BellmanFord ( v, cost, dist, n )
{
for i : = 1 to n do
dist[ i ] : = cost[ v, i ];
for k : = 2 to n-1 do
for each u such that u ≠ v and u has at least one incoming edge do
for each edge <i, u> in the graph do
if dist[ u ] > dist[ i ] + cost[ i, u ] then
dist[ u ] : = dist[ i ] + cost[ i, u ];
}

 The time complexity of the above representation is O(n³).
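An edge-list sketch of BellmanFord in Python; the instance, with one negative edge and no negative cycle, is invented for illustration:

```python
INF = float('inf')

def bellman_ford(n, edges, v):
    """Single-source shortest paths allowing negative edge lengths
    (but no negative-length cycles).  edges: (i, u, cost) triples."""
    dist = [INF] * n
    dist[v] = 0
    for _ in range(n - 1):               # computes dist^k for k = 1 .. n-1
        changed = False
        for i, u, c in edges:
            if dist[i] + c < dist[u]:
                dist[u] = dist[i] + c
                changed = True
        if not changed:
            break                        # distances already stable
    return dist

dist = bellman_ford(4, [(0, 1, 4), (0, 2, 5), (1, 3, 3), (2, 1, -2), (2, 3, 4)], 0)
```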

OPTIMAL BINARY SEARCH TREES


Consider a fixed set of identifiers from which a binary search tree is to be created.
Different arrangements lead to different binary search trees with different performance characteristics.

Example: Assume the set of identifiers is: { do, for, if, int, while }.

For this, two of the possible binary search trees are:

        for                      for
       /    \                   /    \
     do      while            do      int
            /                        /    \
          int                      if      while
         /
       if

(among others)


 The first tree takes 1, 2, 2, 3, 4 comparisons to find the identifiers. Thus, the average
number of comparisons is (1 + 2 + 2 + 3 + 4) / 5 = 12/5.
 The second tree takes 1, 2, 2, 3, 3 comparisons to find the identifiers. Thus, the
average number of comparisons is (1 + 2 + 2 + 3 + 3) / 5 = 11/5.
 The number of comparisons at different levels to search a particular node is
considered as a cost.
 For a given set of identifiers, designing a binary search tree with minimum cost is known
as constructing an "Optimal Binary Search Tree".

For this, let us assume that the given set of identifiers is { a1 , a2 , - - - , an }
with a1 < a2 < - - - < an. Let p(i) be the probability of a successful search for element ai.
Let q(i) be the probability of an unsuccessful search for an element x with ai < x < ai+1.
Then,

∑ p(i) + ∑ q(i) = 1 , the sums over 1 ≤ i ≤ n and 0 ≤ i ≤ n respectively.

To obtain the cost function of binary search trees, it is useful to add an external node,
indicated by a square, at every empty subtree.

(figure: the first tree above redrawn with square external nodes at each empty subtree)

If a binary search tree represents ‘n’ identifiers, then there will be exactly n internal
nodes and n+1 external nodes. Every internal node represents a point where a successful
search may terminate. Every external node represents a point where an unsuccessful search
may terminate.
Now, if a successful search terminates at internal node ai, its expected cost contribution
is p(i) * level(ai). If an unsuccessful search terminates at external node Ei, its expected
cost contribution is q(i) * ( level(Ei) - 1 ).
Finally, cost of the binary search tree is

∑ p(i) * level(ai) + ∑ q(i) * ( level(Ei) - 1 ) , the sums over 1 ≤ i ≤ n and 0 ≤ i ≤ n

The aim of the problem statement is to construct the binary search tree for which the above
cost is minimum.


 To apply dynamic programming, a decision is made as to which ak should be
assigned to the root of the tree. This splits the identifiers into a left part
(a1, ..., ak-1) and a right part (ak+1, ..., an). Hence, the tree becomes

ak

l r
Here,

cost(l) = ∑ p(i) * level(ai) + ∑ q(i) * ( level(Ei) - 1 ) , the sums over 1 ≤ i < k and 0 ≤ i < k

cost(r) = ∑ p(i) * level(ai) + ∑ q(i) * ( level(Ei) - 1 ) , the sums over k < i ≤ n and k ≤ i ≤ n

Finally,

c(i, j) = min { c(i, k-1) + c(k, j) } + w(i, j) , i < k ≤ j

The problem is solved for c(0, n) by computing the c(i, j) in the order j - i = 1, 2, - - -.

Initially, c(i, i) = 0 ;
r(i, i) = 0 ;
w(i, i) = q(i) , 0 ≤ i ≤ n ; and
w(i, j) = p(j) + q(j) + w(i, j - 1).

During this computation, record the root r(i, j) of each tree.

Example: Construct an optimal binary search tree for n = 4 and
(a1 , a2 , a3 , a4) = (do, if, int, while).
The values of p's and q's are p(1:4) = (3, 3, 1, 1) and q(0:4) = (2, 3, 1, 1, 1).

Algorithm OBST ( p, q, n )
{
for i : = 0 to n-1 do
{
w[i, i] : = q[i];
c[i, i] : = 0.0;
r[i, i] : = 0;
w[i, i+1] : = q[i] + q[i+1] + p[i+1] ;
c[i, i+1] : = q[i] + q[i+1] + p[i+1] ;
r[i, i+1] : = i+1;
}
w[n, n] : = q[n];
c[n, n] : = 0.0;
r[n, n] : = 0;


for m : = 2 to n do
{
for i : = 0 to n - m do
{
j : = i + m;
w[i, j] : = w[i, j-1] + p[ j ] + q[ j ];
k : = Find(c, r, i, j); // returns the k, i < k ≤ j, minimizing c[i, k-1] + c[k, j]
c[i, j] : = w[i, j] + c[i, k - 1] + c[k, j];
r[i, j] : = k;
}
}
write ( c[0, n], w[0, n], r[0, n] );
}

 The time complexity of constructing an optimal binary search tree is O(n³).
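The OBST computation in Python (the Find step realized as a direct minimum over k), applied to the example instance with the scaled probabilities p = (3, 3, 1, 1) and q = (2, 3, 1, 1, 1):

```python
def obst(p, q, n):
    """Optimal binary search tree tables.  p[1..n]: success probabilities,
    q[0..n]: failure probabilities (p[0] unused).  Returns (c, w, r)."""
    c = [[0] * (n + 1) for _ in range(n + 1)]
    w = [[0] * (n + 1) for _ in range(n + 1)]
    r = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        w[i][i] = q[i]                   # empty trees: weight q(i), cost 0
    for m in range(1, n + 1):            # interval length m = j - i
        for i in range(n - m + 1):
            j = i + m
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            # Find: the root k in (i, j] minimizing c[i][k-1] + c[k][j]
            k = min(range(i + 1, j + 1), key=lambda t: c[i][t - 1] + c[t][j])
            c[i][j] = w[i][j] + c[i][k - 1] + c[k][j]
            r[i][j] = k
    return c, w, r

c, w, r = obst([0, 3, 3, 1, 1], [2, 3, 1, 1, 1], 4)
```

For this instance the minimum cost is c(0, 4) = 32 (with the probabilities scaled by 16) and the root is r(0, 4) = 2, i.e., the identifier "if".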

0 / 1 KNAPSACK PROBLEM
Consider n objects and a knapsack. Every object i has a weight wi, and the knapsack
has a capacity m. If object i is placed into the knapsack (xi = 1), a profit of pi is earned;
here every xi is restricted to 0 or 1. The objective is to fill the knapsack so as to maximize
the total profit earned.
Formally, the problem statement can be stated as:

maximize ∑ pi xi → 1
1≤i≤n

subject to ∑ wi xi ≤ m → 2
1≤i≤n

and xi = 0 or 1 , 1≤i≤n → 3

 Solution to the Knapsack problem can be obtained by making a sequence of decision


on the variables x1, x2, - - - - -, xn .

Let us assume that the decisions on the xi are made in the order xn , xn-1 , - - - , x1. For the
decision on xn the possible states are:

 xn = 0 ; The knapsack capacity is m and no profit earned.


 xn = 1 ; The knapsack capacity is m – wn and a profit of pn is earned.

Now, the remaining decisions xn-1 , - - - - , x1 must be optimal with respect to the
problem statement resulting from the decision on x n.


 To solve this problem using dynamic programming, follow the procedure as:

Step 1: Sⁱ is a set of pairs (P, W), where P is the total profit and W the total weight of a
feasible filling built from the first i objects.

Initially, S⁰ = { (0, 0) }.

Then, S1ⁱ is calculated by adding (pi, wi) to every pair of Sⁱ⁻¹:

S1ⁱ = { (P + pi , W + wi) | (P, W) ∈ Sⁱ⁻¹ } , i = 1, 2, - - -.

Now, Sⁱ is obtained by merging Sⁱ⁻¹ and S1ⁱ and applying the purging rule.

Purging Rule: If Sⁱ contains (Pj , Wj) and (Pk , Wk) with the condition that Wj > Wk
and Pj ≤ Pk , then the pair (Pj , Wj) can be discarded.

Step 2: Generate all possible Si’s.

Step 3: Select the last pair (P1 , W1), with W1 ≤ m, from the final set Sⁿ. Then

set xn = 0 if (P1 , W1) ∈ Sⁿ⁻¹; otherwise, set xn = 1.

Now, continue with the pair (P1 - pn , W1 - wn) against Sⁿ⁻¹ and repeat Step 3
until all decisions are completed.

Example: Obtain 0/1 Knapsack solution for the instance n=3, m=6, (p 1 , p2 , p3) = (1, 2, 5)
and (w1 , w2 , w3) = (2, 3, 4).

Algorithm DKP ( p, w, n, m)
{
S0 : = { (0, 0) };
for i : = 1 to n – 1 do
{
S1ⁱ : = { (P, W) | (P - pi , W - wi) ∈ Sⁱ⁻¹ and W ≤ m } ;
Sⁱ : = MergePurge( Sⁱ⁻¹ , S1ⁱ );
}
(PX, WX) : = last pair in Sⁿ⁻¹;
(PY, WY) : = (P1 + pn , W1 + wn), where (P1, W1) is the last pair in Sⁿ⁻¹ with W1 ≤ m - wn;

if (PX > PY) then


xn : = 0;
else
xn : = 1;
TraceBackFor( xn-1 , - - - - , x1 );
}

 The time complexity of 0/1 Knapsack using dynamic programming is O(2ⁿ).
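The sets-of-pairs method in Python: S holds the undominated (P, W) pairs, S1 is formed by adding (pi, wi) to each pair, and the merge applies the purging rule. It is run on the example instance n = 3, m = 6:

```python
def dkp(p, w, m):
    """0/1 knapsack via merge-purge of (profit, weight) pair sets.
    Returns the maximum total profit achievable within capacity m."""
    S = [(0, 0)]
    for pi, wi in zip(p, w):
        S1 = [(P + pi, W + wi) for (P, W) in S if W + wi <= m]
        merged = sorted(set(S) | set(S1), key=lambda t: (t[1], -t[0]))
        S, best = [], -1
        for P, W in merged:              # purging rule: drop dominated pairs
            if P > best:
                S.append((P, W))
                best = P
    return max(P for P, W in S)

best_profit = dkp([1, 2, 5], [2, 3, 4], 6)
```

For the example the optimal filling is x = (1, 0, 1) with profit 6.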


STRING EDITING
Consider two strings X = x1, x2, - - -, xn and Y = y1 , y2 , - - -, ym, where the xi , 1 ≤ i ≤ n,
and yj, 1 ≤ j ≤ m, are members of a finite set of symbols known as the alphabet. We want to
transform X into Y using a sequence of edit operations on X. The permissible edit operations
are insert, delete, and change; each operation has an associated cost. The cost of a sequence
of operations is the sum of the costs of the individual operations in the sequence.

 "The problem of string editing is to identify a minimum-cost sequence of edit
operations that will transform X into Y".

For this,

Let D(xi) be the cost of deleting the symbol xi from X,
I(yj) be the cost of inserting the symbol yj into X, and
C(xi , yj) be the cost of changing the symbol xi of X into yj.

A dynamic programming solution for this problem can be stated as:

 Define cost(i, j) to be the minimum cost of any edit sequence transforming x1 ⋯ xi into
y1 ⋯ yj. cost(i, j) can be computed using the recurrence:

cost(i, j) = 0                          ; i = j = 0
cost(i, j) = cost(i-1, 0) + D(xi)       ; j = 0 , i > 0
cost(i, j) = cost(0, j-1) + I(yj)       ; j > 0 , i = 0
cost(i, j) = cost′(i, j)                ; i > 0 , j > 0

where

cost′(i, j) = min { cost(i-1, j) + D(xi) , cost(i, j-1) + I(yj) , cost(i-1, j-1) + C(xi , yj) }

 Compute cost(i, j) for all possible values of i and j. There are (n+1)(m+1) such values.
They can be computed in the form of a table M, where each row corresponds
to a value of i and each column to a value of j.

 The final cost(n, m) shows the cost of an optimal edit sequence.

 Time complexity of the string editing problem is O(mn).

Example: Solve string edit problem for X = a, a, b, a, a, b and Y = b, a, b, b.
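A Python sketch of the recurrence; the notes leave the costs D, I and C general, so unit costs (and zero cost for changing a symbol into itself) are assumed here as defaults:

```python
def edit_cost(X, Y,
              D=lambda x: 1,                       # cost of deleting x
              I=lambda y: 1,                       # cost of inserting y
              C=lambda x, y: 0 if x == y else 1):  # cost of changing x to y
    """Minimum cost of transforming X into Y by insert/delete/change."""
    n, m = len(X), len(Y)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + D(X[i - 1])
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + I(Y[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(cost[i - 1][j] + D(X[i - 1]),
                             cost[i][j - 1] + I(Y[j - 1]),
                             cost[i - 1][j - 1] + C(X[i - 1], Y[j - 1]))
    return cost[n][m]

result = edit_cost("aabaab", "babb")
```

With these unit costs the example instance has minimum edit cost 3.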


TRAVELING SALESPERSON PROBLEM


Let G = <V, E> be a directed graph, where V denotes a set of vertices and E a set
of edges. The edges are assigned weight values cij, with cij > 0 if there is an edge between
i and j, and cij = ∞ otherwise.

 A tour of the graph visits every vertex exactly once and returns to the starting vertex;
the cost of the tour is the sum of the costs of its edges. The traveling salesperson
problem is to find such a tour with minimum cost.

 To solve this problem, dynamic programming strategy can be stated as:

Step 1: Let the tour start and end at vertex 1. Then the function g( 1, V - {1} ) is
the total length of an optimal tour, where g(i, S) is defined as the length of the
shortest path starting at vertex i, going through all vertices in S, and
terminating at vertex 1.

Step 2: According to the principle of optimality, taking the decisions on the vertices
v1 , v2 , - - -, vn in turn, the functional value can be shown as:

g(i, S) = min { cij + g( j, S - {j} ) } , the minimum taken over j ∈ S

Initially, g( i, Ø ) = ci1 , 1 ≤ i ≤ n.

Obtain g(i, S) for |S| = 1, |S| = 2, - - -, |S| = n-1.

 The complexity of the Traveling salesperson problem using the dynamic programming
approach is O(n² 2ⁿ) time.
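The g(i, S) recursion can be written in Python with S as a bitmask (vertices are 0-indexed here, vertex 0 playing the role of vertex 1); the cost matrix is a standard 4-city instance used for illustration:

```python
INF = float('inf')

def tsp(c):
    """Held-Karp DP: g[S][i] = length of the shortest path from vertex i
    through all vertices in bitmask S (over vertices 1..n-1) back to 0.
    Returns the minimum tour cost."""
    n = len(c)
    full = 1 << (n - 1)
    g = [[INF] * n for _ in range(full)]
    for i in range(1, n):
        g[0][i] = c[i][0]                        # g(i, empty) = c[i][1]
    for S in range(1, full):
        for i in range(1, n):
            if S & (1 << (i - 1)):
                continue                         # require i not in S
            for j in range(1, n):
                if S & (1 << (j - 1)):           # try j as the next vertex
                    g[S][i] = min(g[S][i],
                                  c[i][j] + g[S ^ (1 << (j - 1))][j])
    all_but_0 = full - 1
    return min(c[0][j] + g[all_but_0 ^ (1 << (j - 1))][j] for j in range(1, n))

tour_cost = tsp([[0, 10, 15, 20],
                 [5, 0, 9, 10],
                 [6, 13, 0, 12],
                 [8, 8, 9, 0]])
```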

END


UNIT – IV

Backtracking: General Method, 8-Queens Problem, Sum of Subsets problem, Graph


Coloring, 0/1 Knapsack Problem.
Branch and Bound: The General Method, 0/1 Knapsack Problem, Travelling Salesperson
problem.
***

BACKTRACKING

Backtracking is one of the general techniques used to obtain solutions of a problem
statement that satisfy some constraints. For this, candidate solutions are enumerated
systematically and the ones that satisfy the requirements are selected.

 In the backtracking concept, the desired solution is expressible as an n-tuple ( x1, - - , xn ),
where xi is chosen from a finite set Si.
 The backtracking strategy determines the solution by systematically searching the
solution space, which consists of all possible solutions.
 Problem statements solved with the backtracking strategy must satisfy a
complex set of constraints, divided into Explicit and Implicit constraints.

 Explicit constraints are rules that restrict each xi to take values only from a given set Si.

Example: Si = { 0 , 1 }

=> xi = 0 or 1.

 Implicit constraints are rules that determine which of the tuples in the solution
space satisfy the criterion function.

Basic Terminology:

Backtracking strategy determines the problem solution by systematic searching for the
solution using Tree structure.

 Each node in the tree is called as “Problem State”.


 A node which has been generated, and not all of whose children have yet been generated,
is called a "Live Node".
 A live node whose children are currently being expanded is called “E-Node”.
 A node which is not to be expanded further or all of its children have been generated
is called a “Dead Node”.
 “Bounding Function” is used to kill live nodes without generating its children.
 All paths from root to other nodes define “State Space” of the problem statement.
 State space defines all possible solutions called “Solution Space”.
 Finally, from the solution space, identify the “Answer State” that satisfies implicit
constraints.
 Depth first node generation with bounding function is called “Backtracking”.


The control abstraction algorithm can be shown as:

Algorithm BackTracking ( n )
{
k : = 1;
while ( k ≠ 0 ) do
{
if ( there remains an untried x[k] ∈ T( x[1], x[2], - - - , x[k-1] ) and
Bk( x[1], - - - , x[k] ) is True ) then
{
if ( x[1], - - - , x[k] ) is a path to answer node then
write x[1 : k];
k : = k + 1;
}
else
k : = k – 1;
}
}

APPLICATIONS

Some of the problems can be solved using backtracking are:

 8-Queens problem
 Sum of subsets problem
 Graph coloring
 0/1 Knapsack problem etc,

N-QUEENS PROBLEM

Let us consider N=4, then the problem is termed as “4 - Queens Problem”.

4 – QUEENS PROBLEM

The problem statement is to place 4 queens on a 4 × 4 chessboard such that no two
queens attack each other, i.e., no two of them are on the same row, column or diagonal.

 Solution of a 4 – Queens problem can be represented as a 4-tuple ( x1 , x2 , x3 , x4 )


where xi is a column number on which queen i is placed.
 Here, solution space generates 4! tuples.
 It satisfies the specific constraints as:

 Explicit constraints: Si = { 1, 2, 3, 4 }
Xi should select from Si
 Implicit constraints: No two queens can be placed in the same row,
column or diagonal.


Suppose two queens are placed at positions ( i, j ) and ( k, l ). Then they are on the same
diagonal only if

i - j = k - l   or   i + j = k + l

i.e.   j - l = i - k   or   j - l = k - i

Hence | j - l | = | i - k |.

8 – QUEENS PROBLEM

The problem statement is to place 8 queens on an 8 × 8 chessboard such that no two
queens attack each other, i.e., no two of them are on the same row, column or diagonal.

 Solution of a 8 – Queens problem can be represented as a 8-tuple ( x1 , x2 , x3 , x4 , x5 ,


x6 , x7 , x8 ) where xi is a column number on which queen i is placed.
 Here, solution space generates 8! tuples.
 It satisfies the specific constraints as:

 Explicit constraints: Si = { 1, 2, 3, 4, 5, 6, 7, 8 }
Xi should select from Si
 Implicit constraints: No two queens can be placed in the same row,
column or diagonal.

N – QUEENS PROBLEM

The problem statement is to place N queens on an N × N chessboard such that no two
queens attack each other, i.e., no two of them are on the same row, column or diagonal.

Solution of a N – Queens problem can be represented as a N-tuple ( x1 , x2 - - - xn )


where xi is a column number on which queen i is placed. Here, solution space generates N!
tuples.

Algorithm NQueens ( k, n )
{
for i : = 1 to n do
{
if Place ( k, i ) then
{
x[ k ] : = i;
if ( k = n ) then
write x [ 1 : n ];
else
NQueens( k+1 , n );
}
}
}


Algorithm Place ( k , i )
{
// Returns True if a queen can be placed in kth row and ith column.

for j : = 1 to k – 1 do
{
if ( ( x[ j ] = i ) or (Abs ( x[ j ] – i ) = Abs ( j – k ) ) ) then
return False;
}
return True;
}

Analysis:

 Place ( k, i ) returns a Boolean value True if a queen can be placed in row k, column i. For
this, it takes O(k – 1) time.
 The solution space contains n! tuples, so in the worst case the backtracking algorithm
runs in time exponential in n; each individual call to Place costs at most O(n).
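The NQueens and Place algorithms above can be sketched as runnable code. The following Python version is an illustrative adaptation, not the notes' own code: the names `place` and `n_queens` are chosen here, columns are handled 0-based internally, and each solution is reported as a 1-based tuple ( x1 , ... , xn ).

```python
def place(x, k, i):
    # True if a queen may go in row k, column i, given queens in rows 0..k-1.
    # x[j] is the column of the queen already placed in row j.
    for j in range(k):
        if x[j] == i or abs(x[j] - i) == abs(j - k):
            return False
    return True

def n_queens(n):
    # Backtrack row by row, collecting every valid placement.
    solutions = []
    x = [0] * n

    def solve(k):
        for i in range(n):
            if place(x, k, i):
                x[k] = i
                if k == n - 1:
                    # report 1-based column numbers, as in the notes
                    solutions.append(tuple(c + 1 for c in x))
                else:
                    solve(k + 1)

    solve(0)
    return solutions
```

For n = 4 this yields the two well-known solutions (2, 4, 1, 3) and (3, 1, 4, 2); for n = 8 it finds all 92 placements.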

SUM OF SUBSETS
Consider ‘n’ distinct positive numbers given by the set w = ( w1 , w2 , - - - , wn ). The
aim of the problem is to find all combinations of these numbers whose sums are m.

 For this problem statement solution is represented by an n-tuple ( x1 , x2 - - - xn ) such


that xi = 0 or 1.
 xi = 1 means wi is included
 xi = 0 means wi is not included

 In the solution space tree, the left subtree of the root defines all subsets containing w1 i.e., x1 = 1.
 The right subtree of the root defines all subsets not containing w1 i.e., x1 = 0.

Example: Solve sum of subsets problem when n = 4, w = (7, 11, 13, 24) and m = 31.

Algorithm SumOfSub ( s, k, r )
{
x [ k ] : = 1;
if ( s + w[k] = m ) then
write ( x [ 1 : k ] );
else if ( s + w [ k ] + w [ k + 1 ] ≤ m ) then
SumOfSub ( s + w[k], k+1, r-w[k] );
if ( ( s + r – w[k] ≥ m ) and ( s + w[k+1] ≤ m ) ) then
{
x [ k ] : = 0;
SumOfSub ( s , k+1, r-w[k] );
}
}

 In the worst case the algorithm examines all 2ⁿ subsets, so the time complexity of sum of subsets is O(2ⁿ).
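The SumOfSub algorithm above can be sketched in Python as follows. This is an illustrative adaptation, not the notes' own code: weights are assumed sorted in nondecreasing order, indices are 0-based, and explicit `k + 1 < n` guards are added where the pseudocode relies on array bounds.

```python
def sum_of_subsets(w, m):
    # w: weights sorted in nondecreasing order; returns all 0/1 tuples
    # (x1, ..., xn) whose selected weights sum to m.
    n = len(w)
    x = [0] * n
    results = []

    def solve(s, k, r):
        # s: sum of weights included so far; r: sum of w[k:] still undecided.
        x[k] = 1
        if s + w[k] == m:                       # left child is an answer node
            results.append(tuple(x[:k + 1] + [0] * (n - k - 1)))
        elif k + 1 < n and s + w[k] + w[k + 1] <= m:
            solve(s + w[k], k + 1, r - w[k])
        # Bounding: take the right child only if a solution can still exist.
        if s + r - w[k] >= m and k + 1 < n and s + w[k + 1] <= m:
            x[k] = 0
            solve(s, k + 1, r - w[k])

    solve(0, 0, sum(w))
    return results
```

For w = (7, 11, 13, 24) and m = 31 it reports the tuples (1, 1, 1, 0) and (1, 0, 0, 1), i.e. the subsets {7, 11, 13} and {7, 24} from the example.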


GRAPH COLORING
Let G be a graph and m be a given positive integer. Graph coloring is a problem of
coloring each vertex in such a way that no two adjacent nodes have the same color and only
m colors are used.

 In general, the problem is treated as the m-colorability decision problem; the smallest
integer m for which G can be colored is the chromatic number of the graph.

Example: Consider the following graph and m = 3 where colors are 1, 2, 3.

(Figure: a graph on vertices A–F, shown with a proper 3-coloring A = 1, B = 2, C = 3, D = 1, E = 2, F = 3.)

 Suppose we represent a graph by its adjacency matrix G[1:n , 1:n] where G[i, j] = 1 if
(i, j) is an edge of G; otherwise, G[i, j] = 0;

 The colors are represented by the integers 1, 2, - - -- , m and the solutions are given
by the n-tuple ( x1 , x2 - - - xn ) where xi is the color of node i.

Algorithm mColoring ( k )
{
// k is the index of the next vertex to color.

repeat
{
NextValue ( k ); // Assign to x[k] a legal color
if ( x[ k ] = 0 ) then
return;
if ( k = n ) then
write ( x [ 1 : n ] ) ;
else
mColoring ( k+1 ) ;
}
until(false);
}


Algorithm NextValue ( k )
{
repeat
{
x [ k ] : = ( x [k] + 1 ) mod (m + 1);
if ( x[ k ] = 0 ) then
return;
for j : =1 to n do
{
if ( ( G[k, j] ≠ 0 ) and ( x[k] = x[j] )) then
break;
}
if ( j = n + 1 ) then
return;
}
until (false);
}

Analysis:

 Computing time for NextValue to determine a legal color is O(mn).
 NextValue is invoked once for each node of the solution space tree, which has O(mⁿ) internal nodes.
 Hence, the time complexity of the graph coloring problem is O(n·mⁿ).
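A compact Python sketch of mColoring/NextValue is given below. It is an illustrative adaptation (the names `m_coloring` and `ok` are not from the notes): instead of the mod-based NextValue, it simply iterates over the candidate colors 1..m for each vertex.

```python
def m_coloring(adj, m):
    # adj: adjacency matrix (list of lists of 0/1). Returns every proper
    # coloring as a tuple of colors 1..m, one entry per vertex.
    n = len(adj)
    x = [0] * n          # 0 means "no color assigned yet"
    out = []

    def ok(k):
        # Legal iff no neighbor of k carries the same color.
        return all(not (adj[k][j] and x[k] == x[j]) for j in range(n) if j != k)

    def color(k):
        for c in range(1, m + 1):    # NextValue: try each candidate color
            x[k] = c
            if ok(k):
                if k == n - 1:
                    out.append(tuple(x))
                else:
                    color(k + 1)
        x[k] = 0                     # undo before backtracking

    color(0)
    return out
```

For the triangle K3 and m = 3 it finds all 3! = 6 proper colorings, and none for m = 2.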

0/1 KNAPSACK PROBLEM

Consider there are n objects and a Knapsack or bag. Every object i has a weight wi
and the knapsack has a capacity m. If object i is placed into the knapsack, then a profit of
pixi is earned, where xi = 0 or 1. The objective is to fill the knapsack so that the total
profit earned is maximized.
Formally, the problem statement can be stated as:

maximize ∑ pi xi
1≤i≤n

subject to ∑ wi xi ≤ m
1≤i≤n

and xi = 0 or 1 , 1≤i≤n

 In backtracking, 0/1 knapsack problem solution is represented by an n-tuple (x1, x2, - -


- - -, xn ) such that xi = 0 or 1.
 Assume the given objects are arranged in non-increasing order of the profit/weight
ratio. Now, generate a solution space tree in which one child of a node fixes
xi = 1 and the other fixes xi = 0.

Example: Obtain 0/1 Knapsack solution for the instance n = 3, m = 6, (p1 , p2 , p3) = (5, 3, 6)
and (w1 , w2 , w3) = (3, 2, 4).


Algorithm Bound ( cp, cw, k )


{
// cp is the current profit total and cw is the current weight total
// k is the index of the last removed item

b : = cp;
c : = cw;
for i : = k + 1 to n do
{
c : = c + w[ i ];
if ( c < m ) then
b : = b + p[ i ];
else
return b + ( 1 – (c – m )/w[ i ] ) * p[ i ];
}
return b;
}

Algorithm BKnap( k, cp, cw )


{
if ( cw + w[k] ≤ m ) then
{
y[ k ] : = 1;
if ( k < n ) then
BKnap ( k+1, cp+p[k] , cw+w[k] );
if ( ( cp + p[k] > fp ) and ( k = n ) ) then
{
fp : = cp + p[k];
fw : = cw + w[k];
for j : = 1 to k do
x[ j ] : = y [ j ];
}
}
if ( Bound( cp, cw, k ) ≥ fp ) then
{
y[ k ] : = 0;
if ( k < n ) then
BKnap ( k+1, cp , cw );
if ( ( cp > fp) and (k = n) ) then
{
fp : = cp;
fw : = cw;
for j : = 1 to k do
x[ j ] : = y [ j ];
}
}
}

 Time complexity of the above representation is O(2ⁿ).
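A runnable sketch of Bound/BKnap is shown below, assuming items are already sorted by nonincreasing p/w ratio. This Python version is illustrative, not the notes' own code (the names `bound` and `bknap` are chosen here), and it returns only the best profit rather than the solution vector.

```python
def bound(cp, cw, k, p, w, m):
    # Upper bound on profit reachable once items 0..k are decided: add whole
    # items while they fit, then a fraction of the first item that does not.
    b, c = cp, cw
    for i in range(k + 1, len(p)):
        c += w[i]
        if c < m:
            b += p[i]
        else:
            return b + (1 - (c - m) / w[i]) * p[i]
    return b

def bknap(p, w, m):
    # Backtracking 0/1 knapsack; returns the maximum achievable profit.
    n = len(p)
    best = [0]

    def go(k, cp, cw):
        if k == n:
            best[0] = max(best[0], cp)
            return
        if cw + w[k] <= m:                        # try x[k] = 1
            go(k + 1, cp + p[k], cw + w[k])
        if bound(cp, cw, k, p, w, m) > best[0]:   # try x[k] = 0 if promising
            go(k + 1, cp, cw)

    go(0, 0, 0)
    return best[0]
```

For the example instance n = 3, m = 6, p = (5, 3, 6), w = (3, 2, 4) the best profit is 9, obtained by taking objects 2 and 3.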


BRANCH AND BOUND


Branch and bound is also used to find an optimal solution of the given problem. The
term branch and bound refers to all state space search methods in which all
children of the E-node are generated before any other live node can become the E-node.

 Here, each node explores its children by using either BFS or D-Search.
 In branch and bound terminology, a BFS state space search is called FIFO (First
In First Out) search, as the list of live nodes is a First-In-First-Out list (queue).
 A D-Search state space search is called LIFO (Last In First Out) search, as the
list of live nodes is a Last-In-First-Out list (stack).
 As in backtracking, bounding functions are used to kill live nodes
that cannot lead to an answer state.

LEAST COST (LC) SEARCH

With FIFO and LIFO branch and bound, the selection of the next E-node is rigid and can
waste time exploring unpromising nodes. To improve performance, a method that ranks
live nodes by a cost function is used, referred to as least cost search.

In LC search cost function c( . ) is defined as:

 If x is an answer node, then c(x) is the cost of reaching x from the root of the state
space tree.
 If x is not an answer node and its subtree contains no answer node, then c(x) = ∞.

Control abstraction algorithm for LC Search can be shown as:

Algorithm LCSearch ( t )
{
E : = t;
repeat
{
for each child x of E do
{
if x is an answer node then
write the path from x to t and return;
Add(x);
x → parent : = E;
}
if there are no more live nodes then
{
write ‘No Answer Node‘;
return;
}
E : = Least( );
}
until (False);
}
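The control abstraction above can be sketched with a binary min-heap as the list of live nodes, so that Least( ) becomes a heap pop and Add( ) a heap push. This generic Python version is illustrative, not from the notes; `children`, `is_answer`, and `c_hat` are caller-supplied functions.

```python
import heapq

def lc_search(root, children, is_answer, c_hat):
    # Least-cost search: live nodes sit in a min-heap keyed by c_hat, so the
    # cheapest live node is always selected as the next E-node.
    counter = 0                  # tie-breaker so the heap never compares nodes
    heap = [(c_hat(root), counter, root, [root])]
    while heap:
        _, _, node, path = heapq.heappop(heap)   # Least(): E-node selection
        if is_answer(node):
            return path                          # path from the root to x
        for child in children(node):
            counter += 1
            heapq.heappush(heap, (c_hat(child), counter, child, path + [child]))
    return None                  # no answer node
```

As a toy usage: searching the tree with root 1 and children n → 2n, 2n + 1 for the node 7, with Ĉ(n) = |7 − n|, expands 1, then 3, then pops 7 directly.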


BOUNDING
Bounding functions are used to avoid generation of subtrees that do not contain the
answer node. In bounding function, lower and upper bound values are generated at each
node.

 Assume each node x has a cost c(x) associated with it and a minimum cost answer
node is to be found.
 A cost function Ĉ(.) such that Ĉ(x) ≤ c(x) is used to provide lower bounds on
solution obtained from any node x.
 If upper is an upper bound on the cost of a minimum-cost solution, then all live nodes
x with Ĉ(x) > upper can be killed.
 Initially, upper is set to ∞. Each time a new answer node is found, the value of upper
can be updated.

Note:

1) If the list of live nodes is implemented as a Queue with Least( ) and Add( ) functions,
then LC search is treated as “FIFO Branch and Bound”.
2) If the list of live nodes is implemented as a Stack with Least( ) and Add( ) functions,
then LC search is treated as “LIFO Branch and Bound”.

0/1 KNAPSACK PROBLEM

Consider there are n objects and a Knapsack or bag. Every object i has a weight wi
and the knapsack has a capacity m. If object i is placed into the knapsack, then a profit of
pixi is earned, where xi = 0 or 1. The objective is to fill the knapsack so that the total
profit earned is maximized.
In the branch and bound strategy, the problem statement can be changed into a
minimization problem as:

minimize - ∑ pi xi
1≤i≤n

subject to ∑ wi xi ≤ m
1≤i≤n

and xi = 0 or 1 , 1≤i≤n


 For a minimum-cost answer node, calculate for every node x

u( x ) = - ∑ pi xi
1≤i≤n

and

Ĉ(x) = u( x ) − ( ( m − current total weight ) / ( weight of the next remaining object ) ) × ( profit of the next remaining object )

i.e., Ĉ(x) additionally credits the fractional profit of the first remaining object that does not fit wholly, so that Ĉ(x) ≤ c(x) ≤ u(x).

Example: Obtain LCBB solution for the given knapsack instance n = 4, m = 15,
(p1 , p2, p3 , p4) = (10, 10, 12, 18) and ( w1 , w2, w3 , w4) = (2, 4, 6, 9).
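As a sketch (not from the notes), the two root-node bounds for this example can be computed as follows, assuming objects are ordered by nonincreasing p/w ratio; `u_and_c_hat` is an illustrative name.

```python
def u_and_c_hat(p, w, m):
    # Root-node bounds for the minimization form of 0/1 knapsack.
    # u: negated profit of the whole items that fit greedily.
    # c_hat: u further reduced by the fractional profit of the next item.
    profit, weight = 0, 0
    for i in range(len(p)):
        if weight + w[i] <= m:
            profit += p[i]
            weight += w[i]
        else:
            u = -profit
            c_hat = u - (m - weight) / w[i] * p[i]
            return u, c_hat
    return -profit, -profit      # everything fits: bounds coincide
```

For n = 4, m = 15, p = (10, 10, 12, 18), w = (2, 4, 6, 9): objects 1–3 fit wholly (profit 32, weight 12), so u(root) = −32 and Ĉ(root) = −32 − (3/9)·18 = −38.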

TRAVELING SALESPERSON
Let G = (V, E) be a directed graph defining an instance of the traveling salesperson
problem. Let cij be the cost of edge <i, j> є E, and cij = ∞ if <i, j> ∉ E. Assume every tour
starts and ends at vertex 1.

To use LCBB to search the traveling salesperson state space tree, define a cost
function c( . ) and two other functions Ĉ( . ) and u( . ) such that Ĉ(r) ≤ c(r) ≤ u(r) for all
nodes r. The cost function c( . ) is such that the solution node with least c( . ) corresponds
to the shortest tour in G.

To obtain traveling salesperson solution, apply branch and bound strategy as:

Step 1: Calculate the reduced cost matrix corresponding to G. A row (column) is said to be
reduced iff it contains at least one zero and all remaining entries are non-negative.
A matrix is said to be reduced iff every row and column is reduced.

For this, choose the minimum entry in row i (column j) and subtract it from all
entries in row i (column j).

Step 2: The total amount subtracted from the columns and rows is a lower bound on the
length of a minimum-cost tour and can be used as Ĉ value for the root of the state
space tree.

Step 3: Obtain a reduced cost matrix for every node in the traveling salesperson state
space tree.

Let A be the reduced cost matrix for node R. Let S be a child of R such that the tree
edge (R, S) corresponds to including edge <i, j> in the tour. If S is not a leaf node, then
reduced cost matrix for S may be obtained as follows:


 Change all entries in row i and column j of A to ∞.
 Set A(j, 1) to ∞.
 Reduce all rows and columns in the resulting matrix except for rows and columns
containing only ∞.
 Let the resulting matrix be B. If r is the total amount subtracted in sub-step (3)
then,
Ĉ(S) = Ĉ(R) + A[i, j] + r
 Initially assume upper = ∞. Each time an answer node is found, kill the live nodes
which have Ĉ(x) > upper.
 When the process reached to leaf node, it shows optimal tour of the traveling
salesperson problem.

Example: Obtain optimal tour for the cost matrix

∞ 20 30 10 11
15 ∞ 16 4 2
3 5 ∞ 2 4
19 6 18 ∞ 3
16 4 7 16 ∞
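Step 1 and Step 2 can be sketched in Python as follows (an illustrative helper, not from the notes), representing the absent edges by `float('inf')`.

```python
INF = float('inf')

def reduce_matrix(a):
    # Subtract each row's minimum from that row, then each column's minimum
    # from that column; return (reduced matrix, total amount subtracted).
    n = len(a)
    a = [row[:] for row in a]                 # work on a copy
    total = 0
    for i in range(n):                        # row reduction
        mv = min(a[i])
        if 0 < mv < INF:
            total += mv
            a[i] = [x - mv if x < INF else x for x in a[i]]
    for j in range(n):                        # column reduction
        mv = min(a[i][j] for i in range(n))
        if 0 < mv < INF:
            total += mv
            for i in range(n):
                if a[i][j] < INF:
                    a[i][j] -= mv
    return a, total
```

Applied to the cost matrix above, the rows give up 10 + 2 + 2 + 3 + 4 = 21 and columns 1 and 3 a further 1 + 3 = 4, so Ĉ(root) = 25.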

END


UNIT – V

NP Hard and NP Complete Problems: Basic Concepts, Cook’s theorem.


NP Hard Graph Problems: Clique Decision Problem (CDP), Chromatic Number Decision
Problem (CNDP), Traveling Salesperson Decision Problem (TSP).
NP Hard Scheduling Problems: Scheduling Identical Processors, Job Shop Scheduling.
***

P & NP-Class Problems: Depending on their computational complexity, decision problems
can be classified into two classes: P-Class and NP-Class problems.

A set of decision problems that can be solved in deterministic polynomial time are
called P-Class problems.
Example: Searching : O(log n)
Sorting techniques : O(n²), O(n log n)
Matrix multiplication : O(n³), etc.

A problem whose solution can be verified in polynomial time (equivalently, solved in
non-deterministic polynomial time), but for which no deterministic polynomial-time
algorithm is known, is called an NP-Class problem.
Example: Knapsack problem
Traveling sales person problem etc,

Closely related to NP-Class problems are two further categories: NP-Hard and NP-
Complete problems.

NP-Hard Problem

A problem L is NP-Hard, iff SAT problem reduces to L.

 If any NP-Hard problem can be solved in polynomial time, then every problem in NP
can be solved in polynomial time.
 In particular, if an NP-Hard problem can be solved in polynomial time, then all
NP-Complete problems can be solved in polynomial time.

NP-Complete Problem

A problem L is NP-Complete if
 It belongs to NP and NP-Hard
 It has the property that it can be solved in polynomial time iff all other NP-Complete
problems can also be solved in polynomial time.

 All NP-Complete problems are NP-Hard, but some NP-Hard problems can’t be NP-
Complete.

Properties:

P and NP: P is the set of all decision problems solvable by deterministic algorithms in
polynomial time. NP is the set of all decision problems solvable by nondeterministic
algorithms in polynomial time.


Since deterministic algorithms are just a special case of nondeterministic ones,


we conclude that P⊆NP.

 A problem L is NP-Hard iff the satisfiability problem reduces to L. A problem L is NP-
Complete iff L is NP-Hard and L є NP.
 Let L1 and L2 be two problems. Problem L1 reduces to L2 iff any deterministic
polynomial-time algorithm for L2 can be used to solve L1 in deterministic polynomial time.

Cook’s Theorem
Cook’s theorem states that satisfiability is in P iff P = NP. Stephen Cook proved in
1971 that the Boolean satisfiability problem is NP-Complete.

SAT is in NP, because a non-deterministic algorithm can guess an assignment of truth
values for the variables. An expression is satisfiable if its value is true for some
assignment of Boolean values.

Example: Consider the Boolean formula:

(x1 ∨ x2 ∨ ¬x3) ∧ (¬x1 ∨ ¬x2 ∨ ¬x3) ∧ (x1 ∨ ¬x2)

Let x1 = 1 , x2 = 0, x3 = 0

Then, result of formula = ( 1 ∨ 0 ∨ 1 ) ∧ ( 0 ∨ 1 ∨ 1 ) ∧ ( 1 ∨ 1 )
= 1 ∧ 1 ∧ 1
= 1 (TRUE)
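The evaluation above can be mechanized. In this illustrative Python sketch (not from the notes) a CNF formula is a list of clauses, each clause a list of integer literals: i stands for xi and −i for ¬xi.

```python
def evaluate(clauses, assignment):
    # assignment maps variable index -> bool; the formula is true iff every
    # clause contains at least one true literal.
    def lit(l):
        v = assignment[abs(l)]
        return v if l > 0 else not v
    return all(any(lit(l) for l in clause) for clause in clauses)
```

Under x1 = 1, x2 = 0, x3 = 0 the example formula evaluates to TRUE, while e.g. the all-true assignment falsifies the second clause.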

The non-deterministic algorithm can then evaluate the expression under the guessed
assignment and accept if the entire expression is true.
The proof shows that every problem in NP reduces to the Boolean satisfiability problem
in polynomial time. Hence, if satisfiability could be solved in polynomial time, all the
problems in NP could be solved in polynomial time.


NP Hard Graph Problems


Clique Decision Problem: A clique is a subgraph of a graph such that all the vertices in
this subgraph are connected with each other, that is, the subgraph is a complete graph. The
Maximal Clique Problem is to find the maximum-sized clique of a given graph G, that is, a
complete graph which is a subgraph of G and contains the maximum number of vertices.
This is an optimization problem.

The Clique Decision Problem belongs to NP – If a problem belongs to the NP class, then
it should have polynomial-time verifiability, that is given a certificate, we should be able to
verify in polynomial time if it is a solution to the problem.

Proof:
1. Certificate – Let the certificate be a set S consisting of nodes in the clique and S is a
subgraph of G.

2. Verification – We have to check if there exists a clique of size k in the graph. Hence,
verifying that the number of nodes in S equals k takes O(1) time. Verifying whether each
vertex has degree (k − 1) within S takes O(k²) time. (Since in a complete graph, each
vertex is connected to every other vertex through an edge, the total number of
edges in a complete graph = C(k, 2) = k(k − 1)/2.) Therefore, to check if the graph formed
by the k nodes in S is complete or not, it takes O(k²) = O(n²) time (since k ≤ n, where n
is the number of vertices in G).

Therefore, the Clique Decision Problem has polynomial time verifiability and hence
belongs to the NP Class.

The Clique Decision Problem belongs to NP-Hard – A problem L belongs to NP-Hard if


every NP problem is reducible to L in polynomial time. Now, let the Clique Decision
Problem be C. To prove that C is NP-Hard, we take an already known NP-Hard problem,
say S, and reduce it to C for a particular instance. If this reduction can be done in
polynomial time, then C is also an NP-Hard problem. The Boolean Satisfiability Problem
(S) is an NP-Complete problem as proved by the Cook’s theorem. Therefore, every
problem in NP can be reduced to S in polynomial time. Thus, if S is reducible to C in
polynomial time, every NP problem can be reduced to C in polynomial time, thereby
proving C to be NP-Hard.

Conclusion
The Clique Decision Problem is NP and NP-Hard. Therefore, the Clique decision problem
is NP-Complete.

Traveling Salesperson problem

The traveling salesman problem consists of a salesman and a set of cities. The
salesman has to visit each one of the cities starting from a certain one and returning to the
same city. The challenge of the problem is that the traveling salesman wants to minimize the
total length of the trip.

THE END