Asymptotic Analysis of Algorithms

The document discusses asymptotic analysis and notations used to analyze algorithms. It introduces asymptotic analysis as a way to evaluate algorithm performance based on input size rather than actual running time. It discusses worst case, average case, and best case analysis and how asymptotic notations like Big-O, Theta, and Omega are used to represent upper and lower time complexity bounds. Common notations are defined precisely, and examples like linear search and insertion sort are used to illustrate different analysis cases and notation usage.

Uploaded by

ash23ish

Analysis of Algorithms | Set 1 (Asymptotic Analysis)

Why performance analysis?


There are many important things that should be taken care of, like user friendliness, modularity, security, maintainability, etc. Why worry about
performance?
The answer is simple: we can have all of the above only if we have performance. So performance is like a currency with which we can
buy all the above things. Another reason for studying performance is that speed is fun!
Given two algorithms for a task, how do we find out which one is better?
One naive way of doing this is to implement both algorithms, run the two programs on your computer for different inputs, and see which one
takes less time. There are many problems with this approach to the analysis of algorithms.
1) It might be possible that for some inputs the first algorithm performs better than the second, and for other inputs the second performs better.
2) It might also be possible that for some inputs the first algorithm performs better on one machine, while the second works better on another machine for
some other inputs.
Asymptotic Analysis is the big idea that handles the above issues in analyzing algorithms. In Asymptotic Analysis, we evaluate the performance of an
algorithm in terms of input size (we don't measure the actual running time). We calculate how the time (or space) taken by an algorithm
increases with the input size.
For example, let us consider the search problem (searching for a given item) in a sorted array. One way to search is Linear Search (order of growth is
linear) and the other way is Binary Search (order of growth is logarithmic). To understand how Asymptotic Analysis solves the above mentioned
problems in analyzing algorithms, let us say we run the Linear Search on a fast computer and the Binary Search on a slow computer. For small values
of input array size n, the fast computer may take less time. But after a certain value of input array size, the Binary Search will definitely start taking
less time than the Linear Search, even though the Binary Search is being run on a slow machine. The reason is that the order of growth of
Binary Search with respect to input size is logarithmic, while the order of growth of Linear Search is linear. So the machine dependent constants can
always be ignored after a certain value of input size.
Does Asymptotic Analysis always work?
Asymptotic Analysis is not perfect, but it is the best way available for analyzing algorithms. For example, say there are two sorting algorithms that
take 1000nLogn and 2nLogn time respectively on a machine. Both of these algorithms are asymptotically the same (order of growth is nLogn). So
with Asymptotic Analysis, we can't judge which one is better, as we ignore constants in Asymptotic Analysis.
Also, in Asymptotic Analysis, we always talk about input sizes larger than a constant value. It might be possible that those large inputs are never
given to your software, and an algorithm which is asymptotically slower always performs better for your particular situation. So you may end up
choosing an algorithm that is asymptotically slower but faster for your software.
We will be covering more on analysis of algorithms in some more posts on this topic.

Analysis of Algorithms | Set 2 (Worst, Average and Best Cases)


In the previous post, we discussed how Asymptotic Analysis overcomes the problems of the naive way of analyzing algorithms. In this post, we will
take the example of Linear Search and analyze it using Asymptotic Analysis.
We can have three cases to analyze an algorithm:
1) Worst Case
2) Average Case
3) Best Case
Let us consider the following implementation of Linear Search.
#include <stdio.h>

// Linearly search x in arr[]. If x is present then return the index,
// otherwise return -1
int search(int arr[], int n, int x)
{
    int i;
    for (i=0; i<n; i++)
    {
        if (arr[i] == x)
            return i;
    }
    return -1;
}

/* Driver program to test above functions*/
int main()
{
    int arr[] = {1, 10, 30, 15};
    int x = 30;
    int n = sizeof(arr)/sizeof(arr[0]);
    printf("%d is present at index %d", x, search(arr, n, x));
    getchar();
    return 0;
}

Worst Case Analysis (Usually Done)


In the worst case analysis, we calculate an upper bound on the running time of an algorithm. We must know the case that causes the maximum number of
operations to be executed. For Linear Search, the worst case happens when the element to be searched (x in the above code) is not present in the
array. When x is not present, the search() function compares it with all the elements of arr[] one by one. Therefore, the worst case time
complexity of linear search would be Θ(n).
Average Case Analysis (Sometimes done)
In average case analysis, we take all possible inputs and calculate the computing time for each of them. We sum all the calculated values and divide the
sum by the total number of inputs. We must know (or predict) the distribution of cases. For the linear search problem, let us assume that all cases are
uniformly distributed (including the case of x not being present in the array). So we sum all the cases and divide the sum by (n+1). Following is the
value of the average case time complexity.
Average Case Time = (Θ(1) + Θ(2) + ... + Θ(n+1)) / (n+1)
                  = Θ((n+1)*(n+2)/2) / (n+1)
                  = Θ(n)

Best Case Analysis (Bogus)


In the best case analysis, we calculate a lower bound on the running time of an algorithm. We must know the case that causes the minimum number of
operations to be executed. In the linear search problem, the best case occurs when x is present at the first location. The number of operations in
the best case is constant (not dependent on n). So the time complexity in the best case would be Θ(1).

Most of the time, we do worst case analysis to analyze algorithms. In the worst case analysis, we guarantee an upper bound on the running time of an
algorithm, which is good information.
The average case analysis is not easy to do in most practical cases and it is rarely done. In the average case analysis, we must know (or
predict) the mathematical distribution of all possible inputs.
The Best Case analysis is bogus. Guaranteeing a lower bound on an algorithm doesn't provide any information, as in the worst case an algorithm
may take years to run.

For some algorithms, all the cases are asymptotically the same, i.e., there are no worst and best cases. For example, Merge Sort does Θ(nLogn)
operations in all cases. Most of the other sorting algorithms have worst and best cases. For example, in the typical implementation of
Quick Sort (where the pivot is chosen as a corner element), the worst case occurs when the input array is already sorted and the best case occurs when the pivot
elements always divide the array in two halves. For insertion sort, the worst case occurs when the array is reverse sorted and the best case occurs
when the array is sorted in the same order as the output.

Analysis of Algorithms | Set 3 (Asymptotic Notations)


We have discussed Asymptotic Analysis, and Worst, Average and Best Cases of Algorithms. The main idea of asymptotic analysis is to have a
measure of efficiency of algorithms that doesn't depend on machine specific constants, and doesn't require algorithms to be implemented and the time
taken by programs to be compared. Asymptotic notations are mathematical tools to represent the time complexity of algorithms for asymptotic
analysis. The following 3 asymptotic notations are mostly used to represent the time complexity of algorithms.

1) Θ Notation: The theta notation bounds a function from above and below, so it defines exact asymptotic behavior.
A simple way to get the Theta notation of an expression is to drop low order terms and ignore leading constants. For example, consider the following
expression.
3n^3 + 6n^2 + 6000 = Θ(n^3)
Dropping lower order terms is always fine because there will always be an n0 after which Θ(n^3) has higher values than Θ(n^2), irrespective of the constants involved.
For a given function g(n), we denote by Θ(g(n)) the following set of functions.
Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that
           0 <= c1*g(n) <= f(n) <= c2*g(n) for all n >= n0}

The above definition means that if f(n) is theta of g(n), then the value f(n) is always between c1*g(n) and c2*g(n) for large values of n (n >= n0). The
definition of theta also requires that f(n) must be non-negative for values of n greater than n0.

2) Big O Notation: The Big O notation defines an upper bound of an algorithm; it bounds a function only from
above. For example, consider the case of Insertion Sort. It takes linear time in the best case and quadratic time in the worst case. We can safely say that
the time complexity of Insertion Sort is O(n^2). Note that O(n^2) also covers linear time.
If we use Θ notation to represent the time complexity of Insertion Sort, we have to use two statements for the best and worst cases:
1. The worst case time complexity of Insertion Sort is Θ(n^2).
2. The best case time complexity of Insertion Sort is Θ(n).
The Big O notation is useful when we only have an upper bound on the time complexity of an algorithm. Many times we can easily find an upper bound by
simply looking at the algorithm.
O(g(n)) = { f(n): there exist positive constants c and n0 such that
            0 <= f(n) <= c*g(n) for all n >= n0}

3) Ω Notation: Just as Big O notation provides an asymptotic upper bound on a function, Ω notation provides an
asymptotic lower bound.
Ω notation can be useful when we have a lower bound on the time complexity of an algorithm. As discussed in the previous post, the best case
performance of an algorithm is generally not useful, so the Omega notation is the least used notation among all three.
For a given function g(n), we denote by Ω(g(n)) the following set of functions.
Ω(g(n)) = {f(n): there exist positive constants c and n0 such that
           0 <= c*g(n) <= f(n) for all n >= n0}.

Let us consider the same Insertion Sort example here. The time complexity of Insertion Sort can be written as Ω(n), but that is not very useful
information about Insertion Sort, as we are generally interested in the worst case and sometimes in the average case.
Exercise:
Which of the following statements is/are valid?
1. Time Complexity of QuickSort is Θ(n^2)
2. Time Complexity of QuickSort is O(n^2)
3. For any two functions f(n) and g(n), we have f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).
4. Time complexity of all computer algorithms can be written as Ω(1)

Analysis of Algorithms | Set 4 (Analysis of Loops)


We have discussed Asymptotic Analysis, Worst, Average and Best Cases and Asymptotic Notations in previous posts. In this post, analysis of
iterative programs with simple examples is discussed.
1) O(1): Time complexity of a function (or set of statements) is considered as O(1) if it doesn't contain a loop, recursion, or a call to any other non-constant time function.
// set of non-recursive and non-loop statements

For example, the swap() function has O(1) time complexity.


A loop or recursion that runs a constant number of times is also considered as O(1). For example the following loop is O(1).
// Here c is a constant
for (int i = 1; i <= c; i++) {
// some O(1) expressions
}

2) O(n): Time Complexity of a loop is considered as O(n) if the loop variable is incremented / decremented by a constant amount. For example
the following loops have O(n) time complexity.
// Here c is a positive integer constant
for (int i = 1; i <= n; i += c) {
// some O(1) expressions
}
for (int i = n; i > 0; i -= c) {
// some O(1) expressions
}

3) O(n^c): Time complexity of nested loops is equal to the number of times the innermost statement is executed. For example the following sample
loops have O(n^2) time complexity
for (int i = 1; i <=n; i += c) {
    for (int j = 1; j <=n; j += c) {
        // some O(1) expressions
    }
}
for (int i = n; i > 0; i -= c) {
    for (int j = i+1; j <=n; j += c) {
        // some O(1) expressions
    }
}

For example Selection Sort and Insertion Sort have O(n^2) time complexity.
4) O(Logn): Time Complexity of a loop is considered as O(Logn) if the loop variable is divided / multiplied by a constant amount.
for (int i = 1; i <=n; i *= c) {
// some O(1) expressions
}
for (int i = n; i > 0; i /= c) {
// some O(1) expressions
}

For example Binary Search(refer iterative implementation) has O(Logn) time complexity.
5) O(LogLogn): Time Complexity of a loop is considered as O(LogLogn) if the loop variable is reduced / increased exponentially by a constant
amount.
// Here c is a constant greater than 1
for (int i = 2; i <=n; i = pow(i, c)) {
    // some O(1) expressions
}
// Here fun is sqrt or cuberoot or any other constant root.
// The loop stops at 1 because a root of 1 is still 1.
for (int i = n; i > 1; i = fun(i)) {
    // some O(1) expressions
}


How to combine time complexities of consecutive loops?


When there are consecutive loops, we calculate time complexity as sum of time complexities of individual loops.
for (int i = 1; i <=m; i += c) {
// some O(1) expressions
}
for (int i = 1; i <=n; i += c) {
// some O(1) expressions
}
Time complexity of above code is O(m) + O(n) which is O(m+n)
If m == n, the time complexity becomes O(2n) which is O(n).

How to calculate time complexity when there are many if, else statements inside loops?
As discussed in the post on worst, average and best cases, worst case time complexity is the most useful of the three. Therefore we need to consider the worst case. We
evaluate the situation when the values in if-else conditions cause the maximum number of statements to be executed.
For example, consider the linear search function, where we consider the case when the element is present at the end or not present at all.
When the code is too complex to consider all if-else cases, we can get an upper bound by ignoring if-else and other complex control statements.
How to calculate time complexity of recursive functions?
Time complexity of a recursive function can be written as a mathematical recurrence relation. To calculate time complexity, we must know how to
solve recurrences. We will soon be discussing recurrence solving techniques in a separate post.

Analysis of Algorithm | Set 4 (Solving Recurrences)


In the previous post, we discussed analysis of loops. Many algorithms are recursive in nature. When we analyze them, we get a recurrence relation
for time complexity. We get the running time on an input of size n as a function of n and the running time on inputs of smaller sizes. For example in
Merge Sort, to sort a given array, we divide it in two halves and recursively repeat the process for the two halves. Finally we merge the results.
Time complexity of Merge Sort can be written as T(n) = 2T(n/2) + cn. There are many other such algorithms, like Binary Search, Tower of Hanoi, etc.
There are mainly three ways of solving recurrences.
1) Substitution Method: We make a guess for the solution and then we use mathematical induction to prove that the guess is correct or incorrect.
For example consider the recurrence T(n) = 2T(n/2) + n
We guess the solution as T(n) = O(nLogn). Now we use induction
to prove our guess.
We need to prove that T(n) <= cnLogn. We can assume that it is true
for values smaller than n.
T(n) = 2T(n/2) + n
     <= 2(c(n/2)Log(n/2)) + n
     = cnLogn - cnLog2 + n
     = cnLogn - cn + n
     <= cnLogn      (for c >= 1)

2) Recurrence Tree Method: In this method, we draw a recurrence tree and calculate the time taken by every level of the tree. Finally, we sum the
work done at all levels. To draw the recurrence tree, we start from the given recurrence and keep drawing till we find a pattern among levels. The
pattern is typically an arithmetic or geometric series.
For example consider the recurrence relation
T(n) = T(n/4) + T(n/2) + cn^2

           cn^2
         /      \
     T(n/4)    T(n/2)

If we further break down the expression T(n/4) and T(n/2),
we get following recursion tree.

                cn^2
            /          \
     c(n^2)/16        c(n^2)/4
      /     \          /     \
  T(n/16)  T(n/8)   T(n/8)  T(n/4)

Breaking down further gives us following

                       cn^2
                 /              \
        c(n^2)/16                c(n^2)/4
        /       \                /       \
 c(n^2)/256  c(n^2)/64    c(n^2)/64   c(n^2)/16
   /  \        /  \         /  \        /  \
To know the value of T(n), we need to calculate the sum of tree
nodes level by level. If we sum the above tree level by level,
we get the following series
T(n) = c(n^2 + 5(n^2)/16 + 25(n^2)/256 + ....)
The above series is a geometric progression with ratio 5/16.
To get an upper bound, we can sum the infinite series.
We get the sum as c(n^2)/(1 - 5/16), which is O(n^2)

3) Master Method:
Master Method is a direct way to get the solution. The master method works only for following type of recurrences or for recurrences that can be
transformed to following type.
T(n) = aT(n/b) + f(n) where a >= 1 and b > 1

There are following three cases:


1. If f(n) = Θ(n^c) where c < Log_b(a) then T(n) = Θ(n^(Log_b(a)))
2. If f(n) = Θ(n^c) where c = Log_b(a) then T(n) = Θ(n^c Log n)
3. If f(n) = Θ(n^c) where c > Log_b(a) then T(n) = Θ(f(n))

How does this work?


Master method is mainly derived from the recurrence tree method. If we draw the recurrence tree of T(n) = aT(n/b) + f(n), we can see that the work done
at the root is f(n) and the work done at all leaves is Θ(n^c) where c is Log_b(a). And the height of the recurrence tree is Log_b(n)

In recurrence tree method, we calculate total work done. If the work done at leaves is polynomially more, then leaves are the dominant part, and
our result becomes the work done at leaves (Case 1). If work done at leaves and root is asymptotically same, then our result becomes height
multiplied by work done at any level (Case 2). If work done at root is asymptotically more, then our result becomes work done at root (Case 3).
Examples of some standard algorithms whose time complexity can be evaluated using Master Method
Merge Sort: T(n) = 2T(n/2) + Θ(n). It falls in case 2 as c is 1 and Log_b(a) is also 1. So the solution is Θ(n Logn)
Binary Search: T(n) = T(n/2) + Θ(1). It also falls in case 2 as c is 0 and Log_b(a) is also 0. So the solution is Θ(Logn)
Notes:
1) A recurrence of the form T(n) = aT(n/b) + f(n) cannot always be solved using the Master Theorem. The given three cases have some
gaps between them. For example, the recurrence T(n) = 2T(n/2) + n/Logn cannot be solved using the master method.
2) Case 2 can be extended for f(n) = Θ(n^c Log^k(n)):
If f(n) = Θ(n^c Log^k(n)) for some constant k >= 0 and c = Log_b(a), then T(n) = Θ(n^c Log^(k+1)(n))

Analysis of Algorithm | Set 5 (Amortized Analysis Introduction)


Amortized Analysis is used for algorithms where an occasional operation is very slow, but most of the other operations are faster. In Amortized
Analysis, we analyze a sequence of operations and guarantee a worst case average time which is lower than the worst case time of a particular
expensive operation.
Example data structures whose operations are analyzed using Amortized Analysis are Hash Tables, Disjoint Sets and Splay Trees.
Let us consider the example of simple hash table insertions. How do we decide the table size? There is a trade-off between space and time: if we
make the hash table big, search time becomes fast, but the space required becomes high.

The solution to this trade-off problem is to use a Dynamic Table (or Array). The idea is to increase the size of the table whenever it becomes full. Following
are the steps to follow when the table becomes full.
1) Allocate memory for a larger table, typically twice the size of the old table.
2) Copy the contents of the old table to the new table.
3) Free the old table.
If the table has space available, we simply insert the new item in the available space.
What is the time complexity of n insertions using the above scheme?
If we use simple analysis, the worst case cost of an insertion is O(n). Therefore, the worst case cost of n inserts is n * O(n) which is O(n^2). This
analysis gives an upper bound, but not a tight upper bound for n insertions, as not all insertions take Θ(n) time. With doubling, the copying done
across all resizes is 1 + 2 + 4 + ... which is less than 2n, so n insertions cost O(n) in total.

So using Amortized Analysis, we can prove that the Dynamic Table scheme has O(1) insertion time, which is a great result used in hashing. Also,
the concept of a dynamic table is used in vectors in C++ and ArrayList in Java.
Following are few important notes.
1) Amortized cost of a sequence of operations can be seen as the expenses of a salaried person. The average monthly expense of the person is less
than or equal to the salary, but the person can spend more money in a particular month by buying a car or something. In other months, he or she
saves money for the expensive month.
2) The above Amortized Analysis done for the Dynamic Array example is called the Aggregate Method. There are two more powerful ways to do
Amortized Analysis, called the Accounting Method and the Potential Method. We will be discussing the other two methods in separate posts.
3) Amortized analysis doesn't involve probability. There is also a different notion of average case running time, where algorithms use
randomization to make them faster and the expected running time is faster than the worst case running time. These algorithms are analyzed using
Randomized Analysis. Examples of these algorithms are Randomized Quick Sort, Quick Select and Hashing. We will soon be covering
Randomized Analysis in a different post.

What does Space Complexity mean?


Space Complexity:
The term Space Complexity is misused for Auxiliary Space in many places. Following are the correct definitions of Auxiliary Space and Space
Complexity.
Auxiliary Space is the extra space or temporary space used by an algorithm.
Space Complexity of an algorithm is the total space taken by the algorithm with respect to the input size. Space complexity includes both auxiliary
space and the space used by the input.
For example, if we want to compare standard sorting algorithms on the basis of space, then Auxiliary Space would be a better criterion than Space
Complexity. Merge Sort uses O(n) auxiliary space, while Insertion Sort and Heap Sort use O(1) auxiliary space. The space complexity of all these sorting
algorithms is O(n) though.

NP-Completeness | Set 1 (Introduction)


We have been writing about efficient algorithms to solve complex problems, like shortest path, Euler graph, minimum spanning tree, etc. Those
were all success stories of algorithm designers. In this post, failure stories of computer science are discussed.
Can all computational problems be solved by a computer? There are computational problems that cannot be solved by algorithms even with
unlimited time. An example is the Turing Halting problem (given a program and an input, decide whether the program will eventually halt when run with that
input, or will run forever). Alan Turing proved that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist.
A key part of the proof is that a Turing machine was used as the mathematical definition of a computer and program (Source Halting Problem).
The status of NP-complete problems is another failure story. NP-complete problems are problems whose status is unknown. No polynomial time
algorithm has yet been discovered for any NP-complete problem, nor has anybody yet been able to prove that no polynomial time algorithm exists
for any of them. The interesting part is that if any one of the NP-complete problems can be solved in polynomial time, then all of them can be solved.
What are NP, P, NP-complete and NP-Hard problems?
P is the set of problems that can be solved by a deterministic Turing machine in polynomial time.
NP is the set of decision problems that can be solved by a non-deterministic Turing machine in polynomial time. P is a subset of NP (any problem that
can be solved by a deterministic machine in polynomial time can also be solved by a non-deterministic machine in polynomial time).
Informally, NP is the set of decision problems which can be solved in polynomial time via a "Lucky Algorithm", a magical algorithm that always makes
a right guess among the given set of choices (Source Ref 1).
NP-complete problems are the hardest problems in the NP set. A decision problem L is NP-complete if:

1) L is in NP (any given solution for NP-complete problems can be verified quickly, but there is no efficient known solution).
2) Every problem in NP is reducible to L in polynomial time (reduction is defined below).
A problem is NP-Hard if it follows property 2 mentioned above; it doesn't need to follow property 1. Therefore, the NP-Complete set is also a subset
of the NP-Hard set.

Decision vs Optimization Problems

NP-completeness applies to the realm of decision problems. It was set up this way because it's easier to compare the difficulty of decision
problems than that of optimization problems. In reality, though, being able to solve a decision problem in polynomial time will often permit us to
solve the corresponding optimization problem in polynomial time (using a polynomial number of calls to the decision problem). So, discussing the
difficulty of decision problems is often really equivalent to discussing the difficulty of optimization problems (Source Ref 2).
For example, consider the vertex cover problem (given a graph, find the minimum sized vertex set that covers all edges). It is an optimization
problem. The corresponding decision problem is: given an undirected graph G and a number k, is there a vertex cover of size at most k?
What is Reduction?

Let L1 and L2 be two decision problems. Suppose algorithm A2 solves L2. That is, if y is an input for L2 then algorithm A2 will answer Yes or No
depending upon whether y belongs to L2 or not.
The idea is to find a transformation from L1 to L2 so that the algorithm A2 can be part of an algorithm A1 to solve L1.

Learning reduction in general is very important. For example, if we have library functions to solve a certain problem and we can reduce a new
problem to one of the solved problems, we save a lot of time. Consider the example of a problem where we have to find the minimum product path in
a given directed graph, where the product of a path is the multiplication of the weights of the edges along the path. If we have code for Dijkstra's algorithm to find
the shortest path, we can take the log of all weights (assuming every weight is at least 1, so that the logs are non-negative) and use Dijkstra's algorithm to find
the minimum product path rather than writing fresh code for this new problem.

How to prove that a given problem is NP complete?

From the definition of NP-complete, it appears impossible to prove that a problem L is NP-Complete. By definition, it requires us to show that
every problem in NP is polynomial time reducible to L. Fortunately, there is an alternate way to prove it. The idea is to take a known NP-Complete problem and reduce it to L. If a polynomial time reduction is possible, we can prove that L is NP-Complete by transitivity of reduction (if
an NP-Complete problem is reducible to L in polynomial time, then all problems in NP are reducible to L in polynomial time).
What was the first problem proved as NP-Complete?
There must be some first NP-Complete problem proved from the definition of NP-Complete problems. SAT (Boolean satisfiability problem) is the first
NP-Complete problem, proved by Cook (see the CLRS book for the proof).
It is always useful to know about NP-Completeness, even for engineers. Suppose you are asked to write an efficient algorithm to solve an
extremely important problem for your company. After a lot of thinking, you can only come up with an exponential time approach, which is impractical. If
you don't know about NP-Completeness, you can only say that you could not come up with an efficient algorithm. If you know about NP-Completeness
and prove that the problem is NP-complete, you can proudly say that a polynomial time solution is unlikely to exist. If a polynomial time
solution is possible, then that solution solves a big problem of computer science that many scientists have been working on for years.
We will soon be discussing more NP-Complete problems and their proofs of NP-Completeness.
References:
MIT Video Lecture on Computational Complexity
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
http://www.ics.uci.edu/~eppstein/161/960312.html

A Time Complexity Question


What is the time complexity of the following function fun()? Assume that log(x) returns the log value in base 2.
void fun(int n)
{
    int i, j;
    for (i=1; i<=n; i++)
        for (j=1; j<=log(i); j++)
            printf("GeeksforGeeks");
}

Time Complexity of the above function can be written as Θ(log 1) + Θ(log 2) + Θ(log 3) + . . . . + Θ(log n) which is Θ(log n!)
Order of growth of log n! and n log n is the same for large values of n, i.e., Θ(log n!) = Θ(n log n). So the time complexity of fun() is Θ(n log n).
The expression Θ(log n!) = Θ(n log n) can be easily derived from the following Stirling's approximation (or Stirling's formula), stated here for natural
logarithms; changing the base of the logarithm only changes constant factors.
log n! = n log n - n + O(log(n))

Sources:
http://en.wikipedia.org/wiki/Stirling%27s_approximation

Time Complexity of building a heap


Consider the following algorithm for building a Heap of an input array A.
BUILD-HEAP(A)
heapsize := size(A);
for i := floor(heapsize/2) downto 1
do HEAPIFY(A, i);
end for
END

What is the worst case time complexity of the above algorithm?


Although the worst case complexity looks like O(nLogn), since HEAPIFY costs O(Logn) and is called about n/2 times, a tighter upper bound on the time complexity is O(n). See the following links for the proof of time
complexity.
http://www.cse.iitk.ac.in/users/sbaswana/Courses/ESO211/heap.pdf/
http://www.cs.sfu.ca/CourseCentral/307/petra/2009/SLN_2.pdf

Time Complexity where loop variable is incremented by 1, 2, 3, 4 ..


What is the time complexity of below code?
void fun(int n)
{
int j = 1, i = 0;
while (i < n)
{
// Some O(1) task
i = i + j;
j++;
}
}

The loop variable i is incremented by 1, 2, 3, 4, ... until i becomes greater than or equal to n.


The value of i is x(x+1)/2 after x iterations. So the loop stops at the smallest x for which x(x+1)/2 >= n, which means x grows like the square root of 2n. Therefore the time complexity can be written as Θ(√n).

Time Complexity of Loop with Powers


What is the time complexity of the function below?
void fun(int n, int k)
{
    for (int i=1; i<=n; i++)
    {
        int p = pow(i, k);
        for (int j=1; j<=p; j++)
        {
            // Some O(1) work
        }
    }
}

The time complexity of the above function can be written as 1^k + 2^k + 3^k + ... + n^k.

Let us try a few examples:
k=1
Sum = 1 + 2 + 3 + ... + n
    = n(n+1)/2
    = n^2/2 + n/2

k=2
Sum = 1^2 + 2^2 + 3^2 + ... + n^2
    = n(n+1)(2n+1)/6
    = n^3/3 + n^2/2 + n/6

k=3
Sum = 1^3 + 2^3 + 3^3 + ... + n^3
    = n^2(n+1)^2/4
    = n^4/4 + n^3/2 + n^2/4

In general, the asymptotic value can be written as n^(k+1)/(k+1) + Θ(n^k).

Note that in asymptotic notations like Θ we can always ignore lower order terms. So the time complexity is Θ(n^(k+1) / (k+1)), which for a constant k is Θ(n^(k+1)).
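The leading term can be compared against the exact sum. The sketch below is illustrative only; power_sum and leading_term_ratio are hypothetical helper names.

```python
def power_sum(n, k):
    """Total inner-loop iterations of fun(n, k): 1^k + 2^k + ... + n^k."""
    return sum(i ** k for i in range(1, n + 1))


def leading_term_ratio(n, k):
    """Exact sum divided by the leading term n^(k+1) / (k+1)."""
    return power_sum(n, k) * (k + 1) / n ** (k + 1)
```

For n = 10 the sums are 55, 385 and 3025 for k = 1, 2, 3, matching the closed forms above, and the ratio to the leading term tends to 1 as n grows.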

Binary Search
Given a sorted array arr[] of n elements, write a function to search a given element x in arr[].
A simple approach is to do a linear search, i.e., start from the leftmost element of arr[] and compare x with each element of arr[] one by one. If x matches an element, return its index. If x doesn't match any of the elements, return -1.

C/C++
// Linearly search x in arr[]. If x is present then return its
// location, otherwise return -1
int search(int arr[], int n, int x)
{
    int i;
    for (i=0; i<n; i++)
        if (arr[i] == x)
            return i;
    return -1;
}

Python
# Searching an element in a list/array in python
# can be simply done using 'in' operator
# Example:
# if x in arr:
#     print(arr.index(x))

# If you want to implement Linear Search in python

# Linearly search x in arr[]
# If x is present then return its location
# else return -1
def search(arr, x):
    for i in range(len(arr)):
        if arr[i] == x:
            return i
    return -1

The idea of binary search is to use the information that the array is sorted and reduce the time complexity to O(Logn). We basically ignore half of the elements after just one comparison.
1) Compare x with the middle element.
2) If x matches the middle element, we return the mid index.
3) Else if x is greater than the mid element, then x can only lie in the right half subarray after the mid element. So we recur for the right half.
4) Else (x is smaller) recur for the left half.
Following is a recursive implementation of Binary Search.

C/C++
#include <stdio.h>

// A recursive binary search function. It returns location of x in
// given array arr[l..r] if present, otherwise -1
int binarySearch(int arr[], int l, int r, int x)
{
    if (r >= l)
    {
        int mid = l + (r - l)/2;

        // If the element is present at the middle itself
        if (arr[mid] == x) return mid;

        // If element is smaller than mid, then it can only be present
        // in left subarray
        if (arr[mid] > x) return binarySearch(arr, l, mid-1, x);

        // Else the element can only be present in right subarray
        return binarySearch(arr, mid+1, r, x);
    }

    // We reach here when element is not present in array
    return -1;
}

int main(void)
{
    int arr[] = {2, 3, 4, 10, 40};
    int n = sizeof(arr)/ sizeof(arr[0]);
    int x = 10;
    int result = binarySearch(arr, 0, n-1, x);
    (result == -1)? printf("Element is not present in array")
                  : printf("Element is present at index %d", result);
    return 0;
}

Python
# Python Program for recursive binary search.
# Returns index of x in arr if present, else -1
def binarySearch(arr, l, r, x):
    # Check base case
    if r >= l:
        # Integer division, so that mid is a valid index
        mid = l + (r - l) // 2

        # If element is present at the middle itself
        if arr[mid] == x:
            return mid

        # If element is smaller than mid, then it can only
        # be present in left subarray
        elif arr[mid] > x:
            return binarySearch(arr, l, mid-1, x)

        # Else the element can only be present in right subarray
        else:
            return binarySearch(arr, mid+1, r, x)
    else:
        # Element is not present in the array
        return -1

# Test array
arr = [ 2, 3, 4, 10, 40 ]
x = 10

# Function call
result = binarySearch(arr, 0, len(arr)-1, x)
if result != -1:
    print("Element is present at index %d" % result)
else:
    print("Element is not present in array")

Element is present at index 3

Following is Iterative implementation of Binary Search.

C/C++
#include <stdio.h>

// An iterative binary search function. It returns location of x in
// given array arr[l..r] if present, otherwise -1
int binarySearch(int arr[], int l, int r, int x)
{
    while (l <= r)
    {
        int m = l + (r-l)/2;

        if (arr[m] == x) return m;  // Check if x is present at mid

        if (arr[m] < x) l = m + 1;  // If x greater, ignore left half

        else r = m - 1;             // If x is smaller, ignore right half
    }
    return -1; // if we reach here, then element was not present
}

int main(void)
{
    int arr[] = {2, 3, 4, 10, 40};
    int n = sizeof(arr)/ sizeof(arr[0]);
    int x = 10;
    int result = binarySearch(arr, 0, n-1, x);
    (result == -1)? printf("Element is not present in array")
                  : printf("Element is present at index %d", result);
    return 0;
}

Python
# Iterative Binary Search Function
# It returns location of x in given array arr if present,
# else returns -1
def binarySearch(arr, l, r, x):
    while l <= r:
        # Integer division, so that mid is a valid index
        mid = l + (r - l) // 2

        # Check if x is present at mid
        if arr[mid] == x:
            return mid

        # If x is greater, ignore left half
        elif arr[mid] < x:
            l = mid + 1

        # If x is smaller, ignore right half
        else:
            r = mid - 1

    # If we reach here, then the element was not present
    return -1

# Test array
arr = [ 2, 3, 4, 10, 40 ]
x = 10

# Function call
result = binarySearch(arr, 0, len(arr)-1, x)
if result != -1:
    print("Element is present at index %d" % result)
else:
    print("Element is not present in array")

Element is present at index 3

Time Complexity:
The time complexity of Binary Search can be written as
T(n) = T(n/2) + c

The above recurrence can be solved using either the Recurrence Tree method or the Master method. It falls in case II of the Master Method and the solution of the recurrence is Θ(Logn).
Auxiliary Space: O(1) in case of iterative implementation. In case of recursive implementation, O(Logn) recursion call stack space.
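To make the logarithmic bound concrete, the iterative version can be instrumented to count how many array elements it inspects. This is a sketch under the assumption that one "probe" means one comparison against arr[mid]; binary_search_steps is a hypothetical name.

```python
def binary_search_steps(arr, x):
    """Iterative binary search; returns (index or -1, number of probes)."""
    l, r, probes = 0, len(arr) - 1, 0
    while l <= r:
        mid = l + (r - l) // 2
        probes += 1                 # one look at arr[mid]
        if arr[mid] == x:
            return mid, probes
        if arr[mid] < x:
            l = mid + 1             # x can only be in the right half
        else:
            r = mid - 1             # x can only be in the left half
    return -1, probes
```

On a sorted array of n elements the number of probes never exceeds floor(log2(n)) + 1, for example at most 10 probes for n = 1000.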
Algorithmic Paradigm: Divide and Conquer
Following are some interesting articles based on Binary Search.
The Ubiquitous Binary Search
Interpolation search vs Binary search
Find the minimum element in a sorted and rotated array
Find a peak element
Find a Fixed Point in a given array
Count the number of occurrences in a sorted array
Median of two sorted arrays
Floor and Ceiling in a sorted array
Find the maximum element in an array which is first increasing and then decreasing

Selection Sort
The selection sort algorithm sorts an array by repeatedly finding the minimum element (considering ascending order) from the unsorted part and putting it at the beginning. The algorithm maintains two subarrays in a given array.
1) The subarray which is already sorted.
2) The remaining subarray which is unsorted.
In every iteration of selection sort, the minimum element (considering ascending order) from the unsorted subarray is picked and moved to the sorted subarray.
The following example explains the above steps:
arr[] = 64 25 12 22 11
// Find the minimum element in arr[0...4] and place it at beginning
11 25 12 22 64
// Find the minimum element in arr[1...4] and
// place it at beginning of arr[1...4]
11 12 25 22 64
// Find the minimum element in arr[2...4] and
// place it at beginning of arr[2...4]
11 12 22 25 64
// Find the minimum element in arr[3...4] and
// place it at beginning of arr[3...4]
11 12 22 25 64
// C program for implementation of selection sort
#include <stdio.h>

void swap(int *xp, int *yp)
{
    int temp = *xp;
    *xp = *yp;
    *yp = temp;
}

void selectionSort(int arr[], int n)
{
    int i, j, min_idx;

    // One by one move boundary of unsorted subarray
    for (i = 0; i < n-1; i++)
    {
        // Find the minimum element in unsorted array
        min_idx = i;
        for (j = i+1; j < n; j++)
            if (arr[j] < arr[min_idx])
                min_idx = j;

        // Swap the found minimum element with the first element
        swap(&arr[min_idx], &arr[i]);
    }
}

/* Function to print an array */
void printArray(int arr[], int size)
{
    int i;
    for (i=0; i < size; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

// Driver program to test above functions
int main()
{
    int arr[] = {64, 25, 12, 22, 11};
    int n = sizeof(arr)/sizeof(arr[0]);
    selectionSort(arr, n);
    printf("Sorted array: \n");
    printArray(arr, n);
    return 0;
}

Output:

Sorted array:
11 12 22 25 64

Time Complexity: O(n*n) as there are two nested loops.
Auxiliary Space: O(1)
The good thing about selection sort is that it never makes more than O(n) swaps, so it can be useful when memory writes are a costly operation.
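The bound on swaps can be seen by counting them. The sketch below is a variant, not a translation of the C program: as an assumption for illustration it skips the swap when the minimum is already in place, and selection_sort is a hypothetical name.

```python
def selection_sort(arr):
    """Sort arr in place; return the number of swaps performed."""
    n, swaps = len(arr), 0
    for i in range(n - 1):
        # Find the minimum element in the unsorted part arr[i..n-1]
        min_idx = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        # Variation from the C code: no write if already in place
        if min_idx != i:
            arr[i], arr[min_idx] = arr[min_idx], arr[i]
            swaps += 1
    return swaps
```

For the example array 64 25 12 22 11 this performs 3 swaps, and in general at most n-1, regardless of how many comparisons are made.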
Other Sorting Algorithms on GeeksforGeeks/GeeksQuiz:
Bubble Sort
Insertion Sort
Merge Sort
Heap Sort
QuickSort
Radix Sort
Counting Sort
Bucket Sort
ShellSort

Bubble Sort
Bubble Sort is the simplest sorting algorithm. It works by repeatedly swapping adjacent elements if they are in the wrong order.
Example:
First Pass:
( 5 1 4 2 8 ) -> ( 1 5 4 2 8 ), Here, the algorithm compares the first two elements, and swaps since 5 > 1.
( 1 5 4 2 8 ) -> ( 1 4 5 2 8 ), Swap since 5 > 4
( 1 4 5 2 8 ) -> ( 1 4 2 5 8 ), Swap since 5 > 2
( 1 4 2 5 8 ) -> ( 1 4 2 5 8 ), Now, since these elements are already in order (8 > 5), the algorithm does not swap them.
Second Pass:
( 1 4 2 5 8 ) -> ( 1 4 2 5 8 )
( 1 4 2 5 8 ) -> ( 1 2 4 5 8 ), Swap since 4 > 2
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
Now, the array is already sorted, but our algorithm does not know if it is completed. The algorithm needs one whole pass without any swap to know it is sorted.
Third Pass:
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
( 1 2 4 5 8 ) -> ( 1 2 4 5 8 )
Following is a C implementation of Bubble Sort.
// C program for implementation of Bubble sort
#include <stdio.h>

void swap(int *xp, int *yp)
{
    int temp = *xp;
    *xp = *yp;
    *yp = temp;
}

// A function to implement bubble sort
void bubbleSort(int arr[], int n)
{
    int i, j;
    for (i = 0; i < n-1; i++)
        for (j = 0; j < n-i-1; j++) // Last i elements are already in place
            if (arr[j] > arr[j+1])
                swap(&arr[j], &arr[j+1]);
}

/* Function to print an array */
void printArray(int arr[], int size)
{
    int i;
    for (i=0; i < size; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

// Driver program to test above functions
int main()
{
    int arr[] = {64, 34, 25, 12, 22, 11, 90};
    int n = sizeof(arr)/sizeof(arr[0]);
    bubbleSort(arr, n);
    printf("Sorted array: \n");
    printArray(arr, n);
    return 0;
}

Output:
Sorted array:
11 12 22 25 34 64 90

Optimized Implementation:
The above function always runs in O(n^2) time even if the array is sorted. It can be optimized by stopping the algorithm if the inner loop didn't cause any swap.
// Optimized implementation of Bubble sort
#include <stdio.h>
#include <stdbool.h> // for bool, true and false

void swap(int *xp, int *yp)
{
    int temp = *xp;
    *xp = *yp;
    *yp = temp;
}

// An optimized version of Bubble Sort
void bubbleSort(int arr[], int n)
{
    int i, j;
    bool swapped;
    for (i = 0; i < n-1; i++)
    {
        swapped = false;
        for (j = 0; j < n-i-1; j++)
        {
            if (arr[j] > arr[j+1])
            {
                swap(&arr[j], &arr[j+1]);
                swapped = true;
            }
        }

        // If no two elements were swapped by inner loop, then break
        if (swapped == false)
            break;
    }
}

/* Function to print an array */
void printArray(int arr[], int size)
{
    int i;
    for (i=0; i < size; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

// Driver program to test above functions
int main()
{
    int arr[] = {64, 34, 25, 12, 22, 11, 90};
    int n = sizeof(arr)/sizeof(arr[0]);
    bubbleSort(arr, n);
    printf("Sorted array: \n");
    printArray(arr, n);
    return 0;
}

Output:
Sorted array:
11 12 22 25 34 64 90

Worst and Average Case Time Complexity: O(n*n). Worst case occurs when array is reverse sorted.
Best Case Time Complexity: O(n). Best case occurs when array is already sorted.
Auxiliary Space: O(1)
Boundary Cases: Bubble sort takes minimum time (Order of n) when elements are already sorted.
Sorting In Place: Yes
Stable: Yes
Due to its simplicity, bubble sort is often used to introduce the concept of a sorting algorithm.
In computer graphics it is popular for its capability to detect a very small error (like swap of just two elements) in almost-sorted arrays and fix it
with just linear complexity (2n). For example, it is used in a polygon filling algorithm, where bounding lines are sorted by their x coordinate at a
specific scan line (a line parallel to x axis) and with incrementing y their order changes (two elements are swapped) only at intersections of two lines
(Source: Wikipedia)
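The behaviour on almost-sorted input can be illustrated by counting passes. The sketch below mirrors the optimized version with the early exit; bubble_sort is a hypothetical name, and the example assumes the single error is a swap of two adjacent elements.

```python
def bubble_sort(arr):
    """Optimized bubble sort in place; returns the number of passes made."""
    n, passes = len(arr), 0
    for i in range(n - 1):
        swapped = False
        passes += 1
        for j in range(n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        # A pass without any swap means the array is sorted
        if not swapped:
            break
    return passes
```

An already sorted array needs a single pass. A sorted array with one adjacent pair exchanged needs two passes: one to fix the pair and one to confirm that nothing is left to swap, which is where the figure of about 2n comparisons comes from.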


Insertion Sort
Insertion sort is a simple sorting algorithm that works the way we sort playing cards in our hands.

Algorithm
// Sort an arr[] of size n
insertionSort(arr, n)
Loop from i = 1 to n-1.
a) Pick element arr[i] and insert it into sorted sequence arr[0..i-1]
Example:
12, 11, 13, 5, 6
Let us loop for i = 1 (second element of the array) to 4 (last element of the array)
i = 1. Since 11 is smaller than 12, move 12 and insert 11 before 12
11, 12, 13, 5, 6
i = 2. 13 will remain at its position as all elements in arr[0..i-1] are smaller than 13
11, 12, 13, 5, 6
i = 3. 5 will move to the beginning and all other elements from 11 to 13 will move one position ahead of their current position.
5, 11, 12, 13, 6
i = 4. 6 will move to position after 5, and elements from 11 to 13 will move one position ahead of their current position.
5, 6, 11, 12, 13
// C program for insertion sort
#include <stdio.h>

/* Function to sort an array using insertion sort */
void insertionSort(int arr[], int n)
{
    int i, key, j;
    for (i = 1; i < n; i++)
    {
        key = arr[i];
        j = i-1;

        /* Move elements of arr[0..i-1], that are
           greater than key, to one position ahead
           of their current position */
        while (j >= 0 && arr[j] > key)
        {
            arr[j+1] = arr[j];
            j = j-1;
        }
        arr[j+1] = key;
    }
}

// A utility function to print an array of size n
void printArray(int arr[], int n)
{
    int i;
    for (i=0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

/* Driver program to test insertion sort */
int main()
{
    int arr[] = {12, 11, 13, 5, 6};
    int n = sizeof(arr)/sizeof(arr[0]);
    insertionSort(arr, n);
    printArray(arr, n);
    return 0;
}

Output:
5 6 11 12 13

Time Complexity: O(n*n)
Auxiliary Space: O(1)
Boundary Cases: Insertion sort takes maximum time to sort if elements are sorted in reverse order. And it takes minimum time (Order of n) when elements are already sorted.
Algorithmic Paradigm: Incremental Approach
Sorting In Place: Yes
Stable: Yes
Online: Yes
Uses: Insertion sort is used when the number of elements is small. It can also be useful when the input array is almost sorted, i.e., only a few elements are misplaced in a large array.
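The boundary cases can be made visible by counting element shifts, which equal the number of inversions in the input. This is a minimal sketch; insertion_sort is a hypothetical name and the shift counter is added only for illustration.

```python
def insertion_sort(arr):
    """Sort arr in place; return how many element shifts were needed."""
    shifts = 0
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        # Move elements of arr[0..i-1] that are greater than key
        # one position ahead of their current position
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
            shifts += 1
        arr[j + 1] = key
    return shifts
```

A sorted array needs 0 shifts, a reverse-sorted array of 10 elements needs 45 (that is, n(n-1)/2), and the example array 12, 11, 13, 5, 6 needs 7.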

Merge Sort
MergeSort is a Divide and Conquer algorithm. It divides the input array in two halves, calls itself for the two halves and then merges the two sorted halves. The merge() function is used for merging two halves. The call merge(arr, l, m, r) is the key process: it assumes that arr[l..m] and arr[m+1..r] are sorted and merges the two sorted sub-arrays into one. See the following C implementation for details.
MergeSort(arr[], l, r)
If r > l
     1. Find the middle point to divide the array into two halves:
             middle m = (l+r)/2
     2. Call mergeSort for first half:
             Call mergeSort(arr, l, m)
     3. Call mergeSort for second half:
             Call mergeSort(arr, m+1, r)
     4. Merge the two halves sorted in step 2 and 3:
             Call merge(arr, l, m, r)

The following diagram from Wikipedia shows the complete merge sort process for an example array {38, 27, 43, 3, 9, 82, 10}. If we take a closer look at the diagram, we can see that the array is recursively divided in two halves till the size becomes 1. Once the size becomes 1, the merge process comes into action and starts merging arrays back till the complete array is merged.

/* C program for merge sort */
#include<stdlib.h>
#include<stdio.h>

/* Function to merge the two halves arr[l..m] and arr[m+1..r] of array arr[] */
void merge(int arr[], int l, int m, int r)
{
    int i, j, k;
    int n1 = m - l + 1;
    int n2 = r - m;

    /* create temp arrays */
    int L[n1], R[n2];

    /* Copy data to temp arrays L[] and R[] */
    for(i = 0; i < n1; i++)
        L[i] = arr[l + i];
    for(j = 0; j < n2; j++)
        R[j] = arr[m + 1+ j];

    /* Merge the temp arrays back into arr[l..r]*/
    i = 0;
    j = 0;
    k = l;
    while (i < n1 && j < n2)
    {
        if (L[i] <= R[j])
        {
            arr[k] = L[i];
            i++;
        }
        else
        {
            arr[k] = R[j];
            j++;
        }
        k++;
    }

    /* Copy the remaining elements of L[], if there are any */
    while (i < n1)
    {
        arr[k] = L[i];
        i++;
        k++;
    }

    /* Copy the remaining elements of R[], if there are any */
    while (j < n2)
    {
        arr[k] = R[j];
        j++;
        k++;
    }
}

/* l is for left index and r is right index of the sub-array
   of arr to be sorted */
void mergeSort(int arr[], int l, int r)
{
    if (l < r)
    {
        int m = l+(r-l)/2; // Same as (l+r)/2, but avoids overflow for large l and r
        mergeSort(arr, l, m);
        mergeSort(arr, m+1, r);
        merge(arr, l, m, r);
    }
}

/* UTILITY FUNCTIONS */
/* Function to print an array */
void printArray(int A[], int size)
{
    int i;
    for (i=0; i < size; i++)
        printf("%d ", A[i]);
    printf("\n");
}

/* Driver program to test above functions */
int main()
{
    int arr[] = {12, 11, 13, 5, 6, 7};
    int arr_size = sizeof(arr)/sizeof(arr[0]);

    printf("Given array is \n");
    printArray(arr, arr_size);

    mergeSort(arr, 0, arr_size - 1);

    printf("\nSorted array is \n");
    printArray(arr, arr_size);
    return 0;
}

Output:
Given array is
12 11 13 5 6 7
Sorted array is
5 6 7 11 12 13

Time Complexity: Merge Sort is a recursive algorithm and its time complexity can be expressed as the following recurrence relation.
T(n) = 2T(n/2) + Θ(n)
The above recurrence can be solved using either the Recurrence Tree method or the Master method. It falls in case II of the Master Method and the solution of the recurrence is Θ(nLogn).
The time complexity of Merge Sort is Θ(nLogn) in all 3 cases (worst, average and best) as merge sort always divides the array in two halves and takes linear time to merge two halves.
Auxiliary Space: O(n)
Algorithmic Paradigm: Divide and Conquer
Sorting In Place: No in a typical implementation
Stable: Yes
Applications of Merge Sort
1) Merge Sort is useful for sorting linked lists in O(nLogn) time. Other nLogn algorithms like Heap Sort and Quick Sort (average case nLogn) are harder to apply to linked lists, because they are typically implemented with random access into an array.
2) Inversion Count Problem
3) Used in External Sorting
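As an illustration of the second application, the merge step can be extended to count inversions, i.e., pairs i < j with arr[i] > arr[j]. The following is a minimal sketch rather than a reference implementation; count_inversions is a hypothetical name and the function works on copies instead of sorting in place.

```python
def count_inversions(arr):
    """Return (sorted copy of arr, number of pairs i < j with arr[i] > arr[j])."""
    if len(arr) <= 1:
        return list(arr), 0
    mid = len(arr) // 2
    left, inv_left = count_inversions(arr[:mid])
    right, inv_right = count_inversions(arr[mid:])

    merged, inversions = [], inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining element of left,
            # so it forms an inversion with each of them
            merged.append(right[j])
            j += 1
            inversions += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions
```

For the array 12, 11, 13, 5, 6, 7 used in the driver program this reports 10 inversions, still in O(nLogn) time because the counting adds only constant work per merge step.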

Heap Sort
Heap sort is a comparison based sorting technique based on the Binary Heap data structure. It is similar to selection sort where we first find the maximum element and place the maximum element at the end. We repeat the same process for the remaining elements.
What is Binary Heap?
Let us first define a Complete Binary Tree. A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible (Source: Wikipedia).
A Binary Heap is a Complete Binary Tree where items are stored in a special order such that the value in a parent node is greater (or smaller) than the values in its two children nodes. The former is called a max heap and the latter is called a min heap. The heap can be represented by a binary tree or an array.
Why array based representation for Binary Heap?
Since a Binary Heap is a Complete Binary Tree, it can be easily represented as an array, and the array based representation is space efficient. If the parent node is stored at index I, the left child can be calculated by 2 * I + 1 and the right child by 2 * I + 2.
Heap Sort Algorithm for sorting in increasing order:
1. Build a max heap from the input data.
2. At this point, the largest item is stored at the root of the heap. Replace it with the last item of the heap followed by reducing the size of heap by 1. Finally, heapify the root of tree.
3. Repeat the above steps while the size of the heap is greater than 1.
How to build the heap?
The heapify procedure can be applied to a node only if its children nodes are heapified. So the heapification must be performed in the bottom up order.
Let's understand with the help of an example:
Input data: 4, 10, 3, 5, 1
         4(0)
        /   \
    10(1)    3(2)
    /   \
 5(3)    1(4)

The numbers in bracket represent the indices in the array
representation of data.

Applying heapify procedure to index 1:
         4(0)
        /   \
    10(1)    3(2)
    /   \
 5(3)    1(4)

Applying heapify procedure to index 0:
        10(0)
        /   \
     5(1)    3(2)
    /   \
 4(3)    1(4)
The heapify procedure calls itself recursively to build heap
in top down manner.
// C implementation of Heap Sort
#include <stdio.h>
#include <stdlib.h>

// A heap has current size and array of elements
struct MaxHeap
{
    int size;
    int* array;
};

// A utility function to swap two integers
void swap(int* a, int* b) { int t = *a; *a = *b; *b = t; }

// The main function to heapify a Max Heap. The function
// assumes that everything under given root (element at
// index idx) is already heapified
void maxHeapify(struct MaxHeap* maxHeap, int idx)
{
    int largest = idx;          // Initialize largest as root
    int left = (idx << 1) + 1;  // left = 2*idx + 1
    int right = (idx + 1) << 1; // right = 2*idx + 2

    // See if left child of root exists and is greater than
    // root
    if (left < maxHeap->size &&
        maxHeap->array[left] > maxHeap->array[largest])
        largest = left;

    // See if right child of root exists and is greater than
    // the largest so far
    if (right < maxHeap->size &&
        maxHeap->array[right] > maxHeap->array[largest])
        largest = right;

    // Change root, if needed
    if (largest != idx)
    {
        swap(&maxHeap->array[largest], &maxHeap->array[idx]);
        maxHeapify(maxHeap, largest);
    }
}

// A utility function to create a max heap of given capacity
struct MaxHeap* createAndBuildHeap(int *array, int size)
{
    int i;
    struct MaxHeap* maxHeap =
        (struct MaxHeap*) malloc(sizeof(struct MaxHeap));
    maxHeap->size = size;   // initialize size of heap
    maxHeap->array = array; // Assign address of first element of array

    // Start from bottommost and rightmost internal node and heapify all
    // internal nodes in bottom up way
    for (i = (maxHeap->size - 2) / 2; i >= 0; --i)
        maxHeapify(maxHeap, i);
    return maxHeap;
}

// The main function to sort an array of given size
void heapSort(int* array, int size)
{
    // Build a heap from the input data.
    struct MaxHeap* maxHeap = createAndBuildHeap(array, size);

    // Repeat following steps while heap size is greater than 1.
    // The last element in max heap will be the minimum element
    while (maxHeap->size > 1)
    {
        // The largest item in Heap is stored at the root. Replace
        // it with the last item of the heap followed by reducing the
        // size of heap by 1.
        swap(&maxHeap->array[0], &maxHeap->array[maxHeap->size - 1]);
        --maxHeap->size; // Reduce heap size

        // Finally, heapify the root of tree.
        maxHeapify(maxHeap, 0);
    }

    // Release the heap descriptor (the array itself belongs to the caller)
    free(maxHeap);
}

// A utility function to print a given array of given size
void printArray(int* arr, int size)
{
    int i;
    for (i = 0; i < size; ++i)
        printf("%d ", arr[i]);
}

/* Driver program to test above functions */
int main()
{
    int arr[] = {12, 11, 13, 5, 6, 7};
    int size = sizeof(arr)/sizeof(arr[0]);

    printf("Given array is \n");
    printArray(arr, size);

    heapSort(arr, size);

    printf("\nSorted array is \n");
    printArray(arr, size);
    return 0;
}

Output:
Given array is
12 11 13 5 6 7
Sorted array is
5 6 7 11 12 13

Notes:
Heap sort is an in-place algorithm.
Its typical implementation is not stable, but can be made stable.
Time Complexity: Time complexity of heapify is O(Logn). Time complexity of createAndBuildHeap() is O(n) and overall time complexity of
Heap Sort is O(nLogn).
Applications of HeapSort
1. Sort a nearly sorted (or K sorted) array
2. k largest(or smallest) elements in an array
The Heap sort algorithm has limited uses because Quicksort and Mergesort are better in practice. Nevertheless, the Heap data structure itself is very widely used. See Applications of Heap Data Structure.
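The second application, finding the k largest elements, shows the heap used on its own. The sketch below relies on Python's standard heapq module, which provides a min heap; k_largest is a hypothetical helper that keeps only the best k values seen so far, which takes O(n Log k) time.

```python
import heapq


def k_largest(arr, k):
    """Return the k largest elements of arr in descending order."""
    if k <= 0:
        return []
    heap = []                            # min heap of the best k seen so far
    for x in arr:
        if len(heap) < k:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            # x beats the smallest of the current top k: replace it
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)
```

For the array 12, 11, 13, 5, 6, 7 and k = 3 this returns 13, 12, 11. The standard library call heapq.nlargest(k, arr) gives the same result and is usually the simpler choice in practice.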

QuickSort
Like Merge Sort, QuickSort is a Divide and Conquer algorithm. It picks an element as pivot and partitions the given array around the picked
pivot. There are many different versions of quickSort that pick pivot in different ways.
1) Always pick first element as pivot.
2) Always pick last element as pivot (implemented below)
3) Pick a random element as pivot.
4) Pick median as pivot.
The key process in quickSort is partition(). The target of partition is, given an array and an element x of the array as pivot, to put x at its correct position in the sorted array, put all smaller elements (smaller than x) before x, and put all greater elements (greater than x) after x. All this should be done in linear time.
Partition Algorithm
There can be many ways to do partition; the following code adopts the method given in the CLRS book. The logic is simple: we start from the leftmost element and keep track of the index of smaller (or equal to) elements as i. While traversing, if we find a smaller element, we swap the current element with arr[i]. Otherwise we ignore the current element.
Implementation:
Following is C++ implementation of QuickSort.
/* A typical recursive implementation of quick sort */
#include<stdio.h>

// A utility function to swap two elements
void swap(int* a, int* b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* This function takes last element as pivot, places the pivot element at its
   correct position in sorted array, and places all smaller (smaller than pivot)
   to left of pivot and all greater elements to right of pivot */
int partition (int arr[], int l, int h)
{
    int x = arr[h];  // pivot
    int i = (l - 1); // Index of smaller element

    for (int j = l; j <= h- 1; j++)
    {
        // If current element is smaller than or equal to pivot
        if (arr[j] <= x)
        {
            i++;    // increment index of smaller element
            swap(&arr[i], &arr[j]); // Swap current element with index
        }
    }
    swap(&arr[i + 1], &arr[h]);
    return (i + 1);
}

/* arr[] --> Array to be sorted, l --> Starting index, h --> Ending index */
void quickSort(int arr[], int l, int h)
{
    if (l < h)
    {
        int p = partition(arr, l, h); /* Partitioning index */
        quickSort(arr, l, p - 1);
        quickSort(arr, p + 1, h);
    }
}

/* Function to print an array */
void printArray(int arr[], int size)
{
    int i;
    for (i=0; i < size; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

// Driver program to test above functions
int main()
{
    int arr[] = {10, 7, 8, 9, 1, 5};
    int n = sizeof(arr)/sizeof(arr[0]);
    quickSort(arr, 0, n-1);
    printf("Sorted array: \n");
    printArray(arr, n);
    return 0;
}

Output:
Sorted array:
1 5 7 8 9 10

Analysis of QuickSort
Time taken by QuickSort in general can be written as following.
T(n) = T(k) + T(n-k-1) + Θ(n)

The first two terms are for two recursive calls, the last term is for the partition process. k is the number of elements which are smaller than pivot.
The time taken by QuickSort depends upon the input array and partition strategy. Following are three cases.
Worst Case: The worst case occurs when the partition process always picks the greatest or smallest element as pivot. If we consider the above partition strategy where the last element is always picked as pivot, the worst case would occur when the array is already sorted in increasing or decreasing order. Following is the recurrence for the worst case.
T(n) = T(0) + T(n-1) + Θ(n)
which is equivalent to
T(n) = T(n-1) + Θ(n)

The solution of the above recurrence is Θ(n^2).

Best Case: The best case occurs when the partition process always picks the median element as pivot, so that the array is split into two halves. Following is the recurrence for the best case.
T(n) = 2T(n/2) + Θ(n)

The solution of the above recurrence is Θ(nLogn). It can be solved using case 2 of the Master Theorem.
Average Case:
To do average case analysis, we need to consider all possible permutations of the array and calculate the time taken by every permutation, which doesn't look easy.
We can get an idea of the average case by considering the case when partition puts O(n/10) elements in one set and O(9n/10) elements in the other set. Following is the recurrence for this case.
T(n) = T(n/10) + T(9n/10) + Θ(n)

The solution of the above recurrence is also O(nLogn).

Although the worst case time complexity of QuickSort is O(n^2), which is more than that of many other sorting algorithms like Merge Sort and Heap Sort, QuickSort is faster in practice, because its inner loop can be efficiently implemented on most architectures and it performs well on most real-world data. QuickSort can be implemented in different ways by changing the choice of pivot, so that the worst case rarely occurs for a given type of data. However, merge sort is generally considered better when data is huge and stored in external storage.
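One common way to change the choice of pivot is to pick it at random (option 3 in the list of pivot strategies). The following is a minimal sketch, assuming the same last-element partition scheme as the C code with a random element moved to the end first; quick_sort is a hypothetical name.

```python
import random


def quick_sort(arr, l=0, h=None):
    """In-place quicksort of arr[l..h] with a randomly chosen pivot."""
    if h is None:
        h = len(arr) - 1
    if l < h:
        # Move a random element to the end, then partition around it
        r = random.randint(l, h)
        arr[r], arr[h] = arr[h], arr[r]
        x, i = arr[h], l - 1
        for j in range(l, h):
            # If current element is smaller than or equal to pivot
            if arr[j] <= x:
                i += 1
                arr[i], arr[j] = arr[j], arr[i]
        arr[i + 1], arr[h] = arr[h], arr[i + 1]
        p = i + 1                   # final position of the pivot
        quick_sort(arr, l, p - 1)
        quick_sort(arr, p + 1, h)
```

With a random pivot, an already sorted input is no longer a systematic worst case; the Θ(n^2) behaviour is still possible but no particular input triggers it reliably.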
References:
http://en.wikipedia.org/wiki/Quicksort

Radix Sort
The lower bound for comparison based sorting algorithms (Merge Sort, Heap Sort, Quick-Sort, etc.) is Ω(nLogn), i.e., they cannot do better than nLogn.
Counting sort is a linear time sorting algorithm that sorts in O(n+k) time when elements are in the range from 1 to k.
What if the elements are in the range from 1 to n^2?
We can't use counting sort because counting sort will take O(n^2), which is worse than comparison based sorting algorithms. Can we sort such an array in linear time?
Radix Sort is the answer. The idea of Radix Sort is to do a digit by digit sort starting from the least significant digit to the most significant digit. Radix sort uses counting sort as a subroutine to sort.
The Radix Sort Algorithm
1) Do the following for each digit i where i varies from the least significant digit to the most significant digit.
   a) Sort the input array using counting sort (or any stable sort) according to the ith digit.
Example:
Original, unsorted list:
170, 45, 75, 90, 802, 24, 2, 66
Sorting by least significant digit (1s place) gives: [*Notice that we keep 802 before 2, because 802 occurred before 2 in the original list, and similarly for pairs 170 & 90 and 45 & 75.]
170, 90, 802, 2, 24, 45, 75, 66
Sorting by next digit (10s place) gives: [*Notice that 802 again comes before 2 as 802 comes before 2 in the previous list.]
802, 2, 24, 45, 66, 170, 75, 90
Sorting by most significant digit (100s place) gives:
2, 24, 45, 66, 75, 90, 170, 802
What is the running time of Radix Sort?
Let there be d digits in input integers. Radix Sort takes O(d*(n+b)) time where b is the base for representing numbers, for example, for the decimal system, b is 10. What is the value of d? If k is the maximum possible value, then d would be O(log_b(k)). So the overall time complexity is O((n+b) * log_b(k)), which looks more than the time complexity of comparison based sorting algorithms for a large k. Let us first limit k. Let k <= n^c where c is a constant. In that case, the complexity becomes O(n log_b(n)). But it still doesn't beat comparison based sorting algorithms.
What if we make the value of b larger? What should be the value of b to make the time complexity linear? If we set b as n, we get the time complexity as O(n). In other words, we can sort an array of integers with range from 1 to n^c in linear time if the numbers are represented in base n (or every digit takes log2(n) bits).
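The effect of the base can be tried out with a version of radix sort that takes b as a parameter. This is a sketch under stated assumptions: inputs are non-negative integers, radix_sort is a hypothetical name, and the default base max(n, 2) is chosen here only to mirror the "b = n" argument above.

```python
def radix_sort(arr, base=None):
    """LSD radix sort for non-negative ints; returns a new sorted list."""
    if not arr:
        return []
    # Assumption for illustration: default to base n (at least 2)
    b = base if base is not None else max(len(arr), 2)
    out, exp, largest = list(arr), 1, max(arr)
    while largest // exp > 0:
        # Stable counting sort on the digit (x // exp) % b
        count = [0] * b
        for x in out:
            count[(x // exp) % b] += 1
        for d in range(1, b):
            count[d] += count[d - 1]
        nxt = [0] * len(out)
        for x in reversed(out):      # reverse scan keeps the sort stable
            digit = (x // exp) % b
            count[digit] -= 1
            nxt[count[digit]] = x
        out = nxt
        exp *= b
    return out
```

With base n and values below n^2, the loop makes at most two passes of O(n) work each, which is the linear-time case described above. Passing base=10 reproduces the decimal digit-by-digit example.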
Is Radix Sort preferable to Comparison based sorting algorithms like Quick-Sort?
If we have log2(n) bits for every digit, the running time of Radix Sort appears to be better than Quick Sort for a wide range of input numbers. However, the constant factors hidden in asymptotic notation are higher for Radix Sort, and Quick-Sort uses hardware caches more effectively. Also, Radix sort uses counting sort as a subroutine and counting sort takes extra space to sort numbers.
Implementation of Radix Sort
Following is a simple C++ implementation of Radix Sort. For simplicity, the base b is assumed to be 10. We recommend you to see Counting Sort for details of the countSort() function in the code below.

C/C++
// C++ implementation of Radix Sort
#include<iostream>
using namespace std;
// A utility function to get maximum value in arr[]
int getMax(int arr[], int n)
{
int mx = arr[0];
for (int i = 1; i < n; i++)
if (arr[i] > mx)
mx = arr[i];
return mx;
}

// A function to do counting sort of arr[] according to


// the digit represented by exp.
void countSort(int arr[], int n, int exp)
{
int output[n]; // output array
int i, count[10] = {0};
// Store count of occurrences in count[]
for (i = 0; i < n; i++)
count[ (arr[i]/exp)%10 ]++;
// Change count[i] so that count[i] now contains actual
// position of this digit in output[]
for (i = 1; i < 10; i++)
count[i] += count[i - 1];
// Build the output array
for (i = n - 1; i >= 0; i--)
{
output[count[ (arr[i]/exp)%10 ] - 1] = arr[i];
count[ (arr[i]/exp)%10 ]--;
}
// Copy the output array to arr[], so that arr[] now
// contains sorted numbers according to current digit
for (i = 0; i < n; i++)
arr[i] = output[i];
}
// The main function that sorts arr[] of size n using
// Radix Sort
void radixsort(int arr[], int n)
{
// Find the maximum number to know number of digits
int m = getMax(arr, n);
// Do counting sort for every digit. Note that instead
// of passing digit number, exp is passed. exp is 10^i
// where i is current digit number
for (int exp = 1; m/exp > 0; exp *= 10)
countSort(arr, n, exp);
}
// A utility function to print an array
void print(int arr[], int n)
{
for (int i = 0; i < n; i++)
cout << arr[i] << " ";
}
// Driver program to test above functions
int main()
{
int arr[] = {170, 45, 75, 90, 802, 24, 2, 66};
int n = sizeof(arr)/sizeof(arr[0]);
radixsort(arr, n);
print(arr, n);
return 0;
}

Java
// Radix sort Java implementation
import java.io.*;
import java.util.*;
class Radix {
// A utility function to get maximum value in arr[]
static int getMax(int arr[], int n)
{
int mx = arr[0];
for (int i = 1; i < n; i++)
if (arr[i] > mx)
mx = arr[i];
return mx;
}
// A function to do counting sort of arr[] according to
// the digit represented by exp.

static void countSort(int arr[], int n, int exp)
{
int output[] = new int[n]; // output array
int i;
int count[] = new int[10];
Arrays.fill(count,0);
// Store count of occurrences in count[]
for (i = 0; i < n; i++)
count[ (arr[i]/exp)%10 ]++;
// Change count[i] so that count[i] now contains
// actual position of this digit in output[]
for (i = 1; i < 10; i++)
count[i] += count[i - 1];
// Build the output array
for (i = n - 1; i >= 0; i--)
{
output[count[ (arr[i]/exp)%10 ] - 1] = arr[i];
count[ (arr[i]/exp)%10 ]--;
}
// Copy the output array to arr[], so that arr[] now
// contains sorted numbers according to current digit
for (i = 0; i < n; i++)
arr[i] = output[i];
}
// The main function that sorts arr[] of size n using
// Radix Sort
static void radixsort(int arr[], int n)
{
// Find the maximum number to know number of digits
int m = getMax(arr, n);
// Do counting sort for every digit. Note that instead
// of passing digit number, exp is passed. exp is 10^i
// where i is current digit number
for (int exp = 1; m/exp > 0; exp *= 10)
countSort(arr, n, exp);
}
// A utility function to print an array
static void print(int arr[], int n)
{
for (int i=0; i<n; i++)
System.out.print(arr[i]+" ");
}
/*Driver function to check for above function*/
public static void main (String[] args)
{
int arr[] = {170, 45, 75, 90, 802, 24, 2, 66};
int n = arr.length;
radixsort(arr, n);
print(arr, n);
}
}
/* This code is contributed by Devesh Agrawal */

Output:
2 24 45 66 75 90 170 802

References:

http://en.wikipedia.org/wiki/Radix_sort
http://alg12.wikischolars.columbia.edu/file/view/RADIX.pdf
MIT Video Lecture
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest

Counting Sort
Counting sort is a sorting technique based on keys between a specific range. It works by counting the number of objects having each distinct key value
(a kind of hashing), then doing some arithmetic to calculate the position of each object in the output sequence.
Let us understand it with the help of an example.
For simplicity, consider the data in the range 0 to 9.
Input data: 1, 4, 1, 2, 7, 5, 2
1) Take a count array to store the count of each unique object.
Index: 0 1 2 3 4 5 6 7 8 9
Count: 0 2 2 0 1 1 0 1 0 0

2) Modify the count array such that each element at each index
stores the sum of previous counts.
Index: 0 1 2 3 4 5 6 7 8 9
Count: 0 2 4 4 5 6 6 7 7 7
The modified count array indicates the position of each object in
the output sequence.
3) Output each object from the input sequence followed by
decreasing its count by 1.
Process the input data: 1, 4, 1, 2, 7, 5, 2. Position of 1 is 2.
Put data 1 at index 2 in output. Decrease count by 1 to place
next data 1 at an index 1 smaller than this index.
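The three steps above can be sketched for integer keys as follows. This is a minimal sketch rather than the article's implementation: the function name countingSort and the parameter k (the largest possible key) are assumptions, and the input is scanned from the end in step 3 so that equal keys keep their input order.

```cpp
#include <vector>

// Sketch: counting sort for integers known to lie in [0, k].
// k is an assumed parameter giving the largest possible key.
std::vector<int> countingSort(const std::vector<int>& in, int k)
{
    std::vector<int> count(k + 1, 0);
    std::vector<int> out(in.size());
    // Step 1: count the occurrences of each key
    for (int v : in)
        count[v]++;
    // Step 2: prefix sums, so count[v] = number of keys <= v
    for (int v = 1; v <= k; v++)
        count[v] += count[v - 1];
    // Step 3: count[key] is the 1-based position of key in the output;
    // decrement it and store the key at the resulting 0-based index.
    for (int i = static_cast<int>(in.size()) - 1; i >= 0; i--)
        out[--count[in[i]]] = in[i];
    return out;
}
```

For the input data 1, 4, 1, 2, 7, 5, 2 with k = 9, the sketch returns 1, 1, 2, 2, 4, 5, 7.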

Following is C implementation of counting sort.


// C Program for counting sort
#include <stdio.h>
#include <string.h>
#define RANGE 255
// The main function that sorts the given string str in alphabetical order
void countSort(char *str)
{
// The output character array that will have sorted str
char output[strlen(str)];
// Create a count array to store count of individual characters and
// initialize count array as 0
int count[RANGE + 1], i;
memset(count, 0, sizeof(count));
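// Store count of each character (cast to unsigned char so that a
// negative char value is never used as an array index)
for(i = 0; str[i]; ++i)
++count[(unsigned char)str[i]];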
// Change count[i] so that count[i] now contains actual position of
// this character in output array
for (i = 1; i <= RANGE; ++i)
count[i] += count[i-1];
// Build the output character array
for (i = 0; str[i]; ++i)
{
output[count[(unsigned char)str[i]]-1] = str[i];
--count[(unsigned char)str[i]];
}
// Copy the output array to str, so that str now
// contains sorted characters
for (i = 0; str[i]; ++i)
str[i] = output[i];
}
// Driver program to test above function
int main()
{
char str[] = "geeksforgeeks";//"applepp";
countSort(str);
printf("Sorted string is %s\n", str);
return 0;
}

Output:

Sorted string is eeeefggkkorss

Time Complexity: O(n+k) where n is the number of elements in input array and k is the range of input.
Auxiliary Space: O(n+k)
Points to be noted:
1. Counting sort is efficient if the range of input data is not significantly greater than the number of objects to be sorted. Consider the situation
where the input sequence is between range 1 to 10K and the data is 10, 5, 10K, 5K: the count array needs 10K entries to sort just four values.
2. It is not a comparison based sorting algorithm. Its running time complexity is O(n+k), which is O(n) when the range k is O(n), with space proportional to the range of data.
3. It is often used as a sub-routine to another sorting algorithm like radix sort.
4. Counting sort uses a partial hashing to count the occurrence of the data object in O(1).
5. Counting sort can be extended to work for negative inputs also.
Exercise:
1. Modify above code to sort the input data in the range from M to N.
2. Modify above code to sort negative input data.
3. Is counting sort stable and online?
4. Thoughts on parallelizing the counting sort algorithm.

Bucket Sort
Bucket sort is mainly useful when input is uniformly distributed over a range. For example, consider the following problem.
Sort a large set of floating point numbers which are in range from 0.0 to 1.0 and are uniformly distributed across the range. How do we
sort the numbers efficiently?
A simple way is to apply a comparison based sorting algorithm. The lower bound for comparison based sorting algorithms (Merge Sort, Heap
Sort, Quick-Sort, etc.) is Ω(n Log n), i.e., they cannot do better than nLogn.
Can we sort the array in linear time? Counting sort can not be applied here as we use keys as index in counting sort. Here keys are floating point
numbers.
The idea is to use bucket sort. Following is bucket algorithm.
bucketSort(arr[], n)
1) Create n empty buckets (Or lists).
2) Do following for every array element arr[i].
.......a) Insert arr[i] into bucket[n*arr[i]]
3) Sort individual buckets using insertion sort.
4) Concatenate all sorted buckets.

Following diagram (taken from CLRS book) demonstrates working of bucket sort.

Time Complexity: If we assume that insertion in a bucket takes O(1) time then steps 1 and 2 of the above algorithm clearly take O(n) time. The
O(1) is easily possible if we use a linked list to represent a bucket (In the following code, C++ vector is used for simplicity). Step 4 also takes
O(n) time as there will be n items in all buckets.
The main step to analyze is step 3. This step also takes O(n) time on average if all numbers are uniformly distributed (please refer to the CLRS book for
more details).
Following is C++ implementation of the above algorithm.
// C++ program to sort an array using bucket sort
#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;
// Function to sort arr[] of size n using bucket sort
void bucketSort(float arr[], int n)
{
// 1) Create n empty buckets (a vector of vectors, since a
// variable-length array is not standard C++)
vector<vector<float> > b(n);
// 2) Put array elements in different buckets
for (int i=0; i<n; i++)
{
int bi = n*arr[i]; // Index in bucket
b[bi].push_back(arr[i]);
}
// 3) Sort individual buckets
for (int i=0; i<n; i++)
sort(b[i].begin(), b[i].end());
// 4) Concatenate all buckets into arr[]
int index = 0;
for (int i = 0; i < n; i++)
for (int j = 0; j < b[i].size(); j++)
arr[index++] = b[i][j];
}
/* Driver program to test above function */
int main()
{
float arr[] = {0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434};
int n = sizeof(arr)/sizeof(arr[0]);
bucketSort(arr, n);
cout << "Sorted array is \n";
for (int i=0; i<n; i++)
cout << arr[i] << " ";
return 0;
}

Output:
Sorted array is
0.1234 0.3434 0.565 0.656 0.665 0.897

References:
Introduction to Algorithms 3rd Edition by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
http://en.wikipedia.org/wiki/Bucket_sort

ShellSort
ShellSort is mainly a variation of Insertion Sort. In insertion sort, we move elements only one position ahead. When an element has to be moved far
ahead, many movements are involved. The idea of ShellSort is to allow the exchange of far items. In ShellSort, we make the array h-sorted for a large
value of h. We keep reducing the value of h until it becomes 1. An array is said to be h-sorted if all sublists formed by taking every h-th element are sorted.
Following is C++ implementation of ShellSort.
#include <iostream>
using namespace std;
/* function to sort arr using shellSort */
int shellSort(int arr[], int n)
{
// Start with a big gap, then reduce the gap
for (int gap = n/2; gap > 0; gap /= 2)
{
// Do a gapped insertion sort for this gap size.
// The first gap elements a[0..gap-1] are already in gapped order
// keep adding one more element until the entire array is
// gap sorted
for (int i = gap; i < n; i += 1)
{
// add a[i] to the elements that have been gap sorted
// save a[i] in temp and make a hole at position i
int temp = arr[i];
// shift earlier gap-sorted elements up until the correct
// location for a[i] is found
int j;
for (j = i; j >= gap && arr[j - gap] > temp; j -= gap)
arr[j] = arr[j - gap];
// put temp (the original a[i]) in its correct location
arr[j] = temp;
}
}
return 0;
}
void printArray(int arr[], int n)
{
for (int i=0; i<n; i++)
cout << arr[i] << " ";
}
int main()
{
int arr[] = {12, 34, 54, 2, 3}, i;
int n = sizeof(arr)/sizeof(arr[0]);
cout << "Array before sorting: \n";
printArray(arr, n);
shellSort(arr, n);
cout << "\nArray after sorting: \n";
printArray(arr, n);
return 0;
}

Output:
Array before sorting:
12 34 54 2 3
Array after sorting:
2 3 12 34 54

Time Complexity: The time complexity of the above implementation of ShellSort is O(n^2). In the above implementation, the gap is reduced by half in every
iteration. There are many other ways to reduce the gap which lead to better time complexity.
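One such alternative can be sketched as follows, assuming Knuth's gap sequence 1, 4, 13, 40, ... (h = 3h + 1), which is commonly cited as giving an O(n^(3/2)) worst case. The function name shellSortKnuth is made up for this illustration, and only the choice of gaps differs from the implementation above.

```cpp
#include <vector>

// Sketch: shell sort using Knuth's gap sequence 1, 4, 13, 40, ...
// (h = 3*h + 1) instead of halving the gap.
void shellSortKnuth(std::vector<int>& a)
{
    int n = static_cast<int>(a.size());
    // Find the largest gap of the sequence that is below n/3
    int gap = 1;
    while (gap < n / 3)
        gap = 3 * gap + 1;
    // Integer division walks the sequence backwards: 40 -> 13 -> 4 -> 1 -> 0
    for (; gap > 0; gap /= 3)
    {
        // Gapped insertion sort, same as in the halving version
        for (int i = gap; i < n; i++)
        {
            int temp = a[i];
            int j;
            for (j = i; j >= gap && a[j - gap] > temp; j -= gap)
                a[j] = a[j - gap];
            a[j] = temp;
        }
    }
}
```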
References:
https://www.youtube.com/watch?v=pGhazjsFW28
http://en.wikipedia.org/wiki/Shellsort

Interpolation search vs Binary search


Interpolation search works better than Binary Search for a sorted and uniformly distributed array.
On average the interpolation search makes about log(log(n)) comparisons (if the elements are uniformly distributed), where n is the number of
elements to be searched. In the worst case (for instance where the numerical values of the keys increase exponentially) it can make up to O(n)
comparisons.
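A minimal sketch of interpolation search on a sorted array of integers is given below. The function name and the return convention (index of the key, or -1 when it is absent) are assumptions made for this illustration. The probe position is estimated by linear interpolation between the values at the two ends of the current range, instead of always taking the middle as Binary Search does.

```cpp
#include <vector>

// Sketch: returns the index of x in the sorted vector a, or -1 if absent.
int interpolationSearch(const std::vector<int>& a, int x)
{
    int lo = 0, hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi && x >= a[lo] && x <= a[hi])
    {
        // All remaining keys are equal: avoid dividing by zero
        if (a[lo] == a[hi])
            return (a[lo] == x) ? lo : -1;
        // Estimate the position of x, assuming keys are spread uniformly
        long long num = (static_cast<long long>(x) - a[lo]) * (hi - lo);
        long long den = static_cast<long long>(a[hi]) - a[lo];
        int pos = lo + static_cast<int>(num / den);
        if (a[pos] == x)
            return pos;
        if (a[pos] < x)
            lo = pos + 1;
        else
            hi = pos - 1;
    }
    return -1;
}
```

On uniformly spaced keys such as 10, 20, ..., 100 the first probe already lands on the key being searched for; on exponentially growing keys the estimate is poor and the search degrades towards a linear scan.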
Sources:
http://en.wikipedia.org/wiki/Interpolation_search

Stability in sorting algorithms


A sorting algorithm is said to be stable if two objects with equal keys appear in the same order in sorted output as they appear in the input
unsorted array. Some sorting algorithms are stable by nature like Insertion sort, Merge Sort, Bubble Sort, etc. And some sorting algorithms are
not, like Heap Sort, Quick Sort, etc.
However, any given sorting algorithm which is not stable can be modified to be stable. There can be algorithm-specific ways to make it stable, but in
general, any comparison based sorting algorithm which is not stable by nature can be modified to be stable by changing the key comparison
operation so that the comparison of two keys considers position as a factor for objects with equal keys.
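The position-as-tiebreaker idea can be sketched as follows. The Record type and the function name stableByIndex are hypothetical and used only for this illustration; std::sort is used as the underlying sort because it is not guaranteed to be stable.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type used only for this illustration.
struct Record
{
    int key;
    std::string name;
};

// Sort by key with std::sort (not guaranteed stable), but compare
// (key, original index) so that equal keys keep their input order.
std::vector<Record> stableByIndex(const std::vector<Record>& in)
{
    std::vector<std::pair<Record, int> > tagged;
    for (int i = 0; i < static_cast<int>(in.size()); i++)
        tagged.push_back(std::make_pair(in[i], i));
    std::sort(tagged.begin(), tagged.end(),
              [](const std::pair<Record, int>& a, const std::pair<Record, int>& b) {
                  if (a.first.key != b.first.key)
                      return a.first.key < b.first.key;
                  return a.second < b.second; // tie: earlier position first
              });
    std::vector<Record> out;
    for (const auto& t : tagged)
        out.push_back(t.first);
    return out;
}
```

The price of this approach is O(n) extra space for the stored positions.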
References:
http://www.math.uic.edu/~leon/cs-mcs401-s08/handouts/stability.pdf
http://en.wikipedia.org/wiki/Sorting_algorithm#Stability

When does the worst case of Quicksort occur?


The answer depends on the strategy for choosing the pivot. In early versions of Quick Sort, where the leftmost (or rightmost) element is chosen as pivot, the
worst case occurs in the following cases.
1) Array is already sorted in same order.
2) Array is already sorted in reverse order.
3) All elements are same (special case of case 1 and 2)
Since these cases are very common use cases, the problem was easily solved by choosing either a random index for the pivot, choosing the middle
index of the partition or (especially for longer partitions) choosing the median of the first, middle and last element of the partition for the pivot. With
these modifications, the worst case of Quick Sort is less likely to occur, but it can still occur if the input array is such that the
maximum (or minimum) element is always chosen as pivot.
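The median-of-three choice mentioned above can be sketched as follows. The helper name medianOfThree is an assumption made for this illustration; it only reports which of the three candidate indexes holds the median, leaving the partition routine itself untouched.

```cpp
// Sketch: returns the index (l, mid or h) that holds the median of
// arr[l], arr[mid] and arr[h], where mid is the middle of the partition.
int medianOfThree(const int arr[], int l, int h)
{
    int mid = l + (h - l) / 2;
    int a = arr[l], b = arr[mid], c = arr[h];
    // b lies between a and c
    if ((a <= b && b <= c) || (c <= b && b <= a))
        return mid;
    // a lies between b and c
    if ((b <= a && a <= c) || (c <= a && a <= b))
        return l;
    // otherwise c is the median
    return h;
}
```

With a partition routine that uses the last element as pivot, the element at the returned index can be swapped with arr[h] before partitioning. On an already sorted array the pivot is then the middle element, which splits the partition evenly.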
References:
http://en.wikipedia.org/wiki/Quicksort

Lower bound for comparison based sorting algorithms


The problem of sorting can be viewed as following.
Input: A sequence of n numbers <a1, a2, . . . , an>.
Output: A permutation (reordering) <a'1, a'2, . . . , a'n> of the input sequence such that a'1 <= a'2 <= . . . <= a'n.
A sorting algorithm is comparison based if it uses comparison operators to find the order between two numbers. Comparison sorts can be viewed
abstractly in terms of decision trees. A decision tree is a full binary tree that represents the comparisons between elements that are performed by a
particular sorting algorithm operating on an input of a given size. The execution of the sorting algorithm corresponds to tracing a path from the root
of the decision tree to a leaf. At each internal node, a comparison ai <= aj is made. The left subtree then dictates subsequent comparisons for ai <= aj,
and the right subtree dictates subsequent comparisons for ai > aj. When we come to a leaf, the sorting algorithm has established the ordering. So
we can say the following about the decision tree.
1) Each of the n! permutations on n elements must appear as one of the leaves of the decision tree for the sorting algorithm to sort properly.
2) Let x be the maximum number of comparisons in a sorting algorithm. The maximum height of the decision tree would be x. A tree with maximum
height x has at most 2^x leaves.
After combining the above two facts, we get following relation.
n! <= 2^x
Taking Log on both sides.
log2(n!) <= x
Since log2(n!) = Θ(nLogn), we can say
x = Ω(nLog2n)

Therefore, any comparison based sorting algorithm must make Ω(nLog2n) comparisons in the worst case to sort the input array, and Heapsort and merge sort
are asymptotically optimal comparison sorts.
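To get a feel for the bound n! <= 2^x, the smallest x that satisfies it can be computed for small n. The following is a rough sketch with an assumed function name; it sums logarithms so that n! itself is never formed, and it relies on floating point arithmetic, so it is meant for small values of n only.

```cpp
#include <cmath>

// Sketch: smallest x with 2^x >= n!, i.e. ceil(log2(n!)).
// This is the minimum number of comparisons that any comparison
// based sort needs in the worst case for n distinct elements.
int minWorstCaseComparisons(int n)
{
    double bits = 0.0;
    // log2(n!) = log2(2) + log2(3) + ... + log2(n)
    for (int i = 2; i <= n; i++)
        bits += std::log2(static_cast<double>(i));
    return static_cast<int>(std::ceil(bits));
}
```

For example, 3 elements have 3! = 6 possible orderings, and since 2^2 < 6 <= 2^3, at least 3 comparisons are needed in the worst case.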
References:
Introduction to Algorithms, by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein

Which sorting algorithm makes minimum number of memory writes?


Minimizing the number of writes is useful when making writes to some huge data set is very expensive, such as with EEPROMs or Flash memory,
where each write reduces the lifespan of the memory.
Among the sorting algorithms that we generally study in our data structure and algorithm courses, Selection Sort makes the least number of writes (it
makes O(n) swaps). But Cycle Sort almost always makes fewer writes than Selection Sort. In Cycle Sort, each value is either
written zero times, if it's already in its correct position, or written one time to its correct position. This matches the minimal number of overwrites
required for a completed in-place sort.
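A minimal sketch of Cycle Sort that also counts the writes it makes to the array is given below. It follows the commonly described form of the algorithm; the function name and the choice to return the write count are assumptions made for this illustration.

```cpp
#include <utility>
#include <vector>

// Sketch: in-place cycle sort; returns the number of writes to the array.
int cycleSort(std::vector<int>& a)
{
    int n = static_cast<int>(a.size());
    int writes = 0;
    for (int start = 0; start + 1 < n; start++)
    {
        int item = a[start];
        // Final position of item = start + number of smaller elements after it
        int pos = start;
        for (int i = start + 1; i < n; i++)
            if (a[i] < item)
                pos++;
        if (pos == start) // already in its correct position: zero writes
            continue;
        while (item == a[pos]) // skip past duplicates of item
            pos++;
        std::swap(item, a[pos]);
        writes++;
        // Rotate the rest of the cycle until an item lands at start
        while (pos != start)
        {
            pos = start;
            for (int i = start + 1; i < n; i++)
                if (a[i] < item)
                    pos++;
            while (item == a[pos])
                pos++;
            std::swap(item, a[pos]);
            writes++;
        }
    }
    return writes;
}
```

The price for the small number of writes is O(n^2) comparisons, since the position of every item is found by a linear scan.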
Sources:
http://en.wikipedia.org/wiki/Cycle_sort
http://en.wikipedia.org/wiki/Selection_sort

Find the Minimum length Unsorted Subarray, sorting which makes the complete array sorted
Given an unsorted array arr[0..n-1] of size n, find the minimum length subarray arr[s..e] such that sorting this subarray makes the whole array
sorted.
Examples:
1) If the input array is [10, 12, 20, 30, 25, 40, 32, 31, 35, 50, 60], your program should be able to find that the subarray lies between the indexes
3 and 8.
2) If the input array is [0, 1, 15, 25, 6, 7, 30, 40, 50], your program should be able to find that the subarray lies between the indexes 2 and 5.
Solution:
1) Find the candidate unsorted subarray
a) Scan from left to right and find the first element which is greater than the next element. Let s be the index of such an element. In the above
example 1, s is 3 (index of 30).
b) Scan from right to left and find the first element (first in right to left order) which is smaller than the next element (next in right to left order). Let e
be the index of such an element. In the above example 1, e is 7 (index of 31).
2) Check whether sorting the candidate unsorted subarray makes the complete array sorted or not. If not, then include more
elements in the subarray.
a) Find the minimum and maximum values in arr[s..e]. Let minimum and maximum values be min and max. min and max for [30, 25, 40, 32, 31]
are 25 and 40 respectively.
b) Find the first element (if there is any) in arr[0..s-1] which is greater than min, change s to index of this element. There is no such element in
above example 1.
c) Find the last element (if there is any) in arr[e+1..n-1] which is smaller than max, change e to index of this element. In the above example 1, e is
changed to 8 (index of 35)
3) Print s and e.

Implementation:
#include<stdio.h>
void printUnsorted(int arr[], int n)
{
int s = 0, e = n-1, i, max, min;
// step 1(a) of above algo
for (s = 0; s < n-1; s++)
{
if (arr[s] > arr[s+1])
break;
}
if (s == n-1)
{
printf("The complete array is sorted");
return;
}
// step 1(b) of above algo
for(e = n - 1; e > 0; e--)
{
if(arr[e] < arr[e-1])
break;
}
// step 2(a) of above algo
max = arr[s]; min = arr[s];
for(i = s + 1; i <= e; i++)
{
if(arr[i] > max)
max = arr[i];
if(arr[i] < min)
min = arr[i];
}
// step 2(b) of above algo
for( i = 0; i < s; i++)
{
if(arr[i] > min)
{
s = i;
break;
}
}
// step 2(c) of above algo
for( i = n -1; i >= e+1; i--)
{
if(arr[i] < max)
{
e = i;
break;
}
}
// step 3 of above algo
printf(" The unsorted subarray which makes the given array "
" sorted lies between the indexes %d and %d", s, e);
return;
}
int main()
{
int arr[] = {10, 12, 20, 30, 25, 40, 32, 31, 35, 50, 60};
int arr_size = sizeof(arr)/sizeof(arr[0]);
printUnsorted(arr, arr_size);
getchar();
return 0;
}

Time Complexity: O(n)

Merge Sort for Linked Lists


Merge sort is often preferred for sorting a linked list. The slow random-access performance of a linked list makes some other algorithms (such as
quicksort) perform poorly, and others (such as heapsort) completely impossible.
Let head be the first node of the linked list to be sorted and headRef be the pointer to head. Note that we need a reference to head in MergeSort()
as the below implementation changes next links to sort the linked lists (not data at the nodes), so head node has to be changed if the data at original
head is not the smallest value in linked list.
MergeSort(headRef)
1) If head is NULL or there is only one element in the Linked List
then return.
2) Else divide the linked list into two halves.
FrontBackSplit(head, &a, &b); /* a and b are two halves */
3) Sort the two halves a and b.
MergeSort(a);
MergeSort(b);
4) Merge the sorted a and b (using SortedMerge(), defined in the code below)
and update the head pointer using headRef.
*headRef = SortedMerge(a, b);
#include<stdio.h>
#include<stdlib.h>
/* Link list node */
struct node
{
int data;
struct node* next;
};
/* function prototypes */
struct node* SortedMerge(struct node* a, struct node* b);
void FrontBackSplit(struct node* source,
struct node** frontRef, struct node** backRef);
/* sorts the linked list by changing next pointers (not data) */
void MergeSort(struct node** headRef)
{
struct node* head = *headRef;
struct node* a;
struct node* b;
/* Base case -- length 0 or 1 */
if ((head == NULL) || (head->next == NULL))
{
return;
}
/* Split head into 'a' and 'b' sublists */
FrontBackSplit(head, &a, &b);
/* Recursively sort the sublists */
MergeSort(&a);
MergeSort(&b);
/* answer = merge the two sorted lists together */
*headRef = SortedMerge(a, b);
}
/* See http://geeksforgeeks.org/?p=3622 for details of this
function */
struct node* SortedMerge(struct node* a, struct node* b)
{
struct node* result = NULL;
/* Base cases */
if (a == NULL)
return(b);
else if (b==NULL)
return(a);
/* Pick either a or b, and recur */
if (a->data <= b->data)
{
result = a;
result->next = SortedMerge(a->next, b);
}
else
{
result = b;
result->next = SortedMerge(a, b->next);
}
return(result);
}
/* UTILITY FUNCTIONS */
/* Split the nodes of the given list into front and back halves,
and return the two lists using the reference parameters.
If the length is odd, the extra node should go in the front list.
Uses the fast/slow pointer strategy. */
void FrontBackSplit(struct node* source,
struct node** frontRef, struct node** backRef)
{
struct node* fast;
struct node* slow;
if (source==NULL || source->next==NULL)
{
/* length < 2 cases */
*frontRef = source;
*backRef = NULL;
}
else
{
slow = source;
fast = source->next;
/* Advance 'fast' two nodes, and advance 'slow' one node */
while (fast != NULL)
{
fast = fast->next;
if (fast != NULL)
{
slow = slow->next;
fast = fast->next;
}
}
/* 'slow' is before the midpoint in the list, so split it in two
at that point. */
*frontRef = source;
*backRef = slow->next;
slow->next = NULL;
}
}
/* Function to print nodes in a given linked list */
void printList(struct node *node)
{
while(node!=NULL)
{
printf("%d ", node->data);
node = node->next;
}
}
/* Function to insert a node at the beginning of the linked list */
void push(struct node** head_ref, int new_data)
{
/* allocate node */
struct node* new_node =
(struct node*) malloc(sizeof(struct node));
/* put in the data */
new_node->data = new_data;
/* link the old list off the new node */
new_node->next = (*head_ref);
/* move the head to point to the new node */
(*head_ref) = new_node;
}
/* Driver program to test above functions*/
int main()
{
/* Start with the empty list */
struct node* a = NULL;
/* Let us create an unsorted linked list to test the functions.
Created list shall be a: 2->3->20->5->10->15 */
push(&a, 15);
push(&a, 10);
push(&a, 5);
push(&a, 20);
push(&a, 3);
push(&a, 2);
/* Sort the above created Linked List */
MergeSort(&a);
printf("\n Sorted Linked List is: \n");
printList(a);
getchar();
return 0;
}

Time Complexity: O(nLogn)


Sources:
http://en.wikipedia.org/wiki/Merge_sort
http://cslibrary.stanford.edu/105/LinkedListProblems.pdf

Sort a nearly sorted (or K sorted) array


Given an array of n elements, where each element is at most k away from its target position, devise an algorithm that sorts in O(n log k) time.
For example, if k is 2, an element at index 7 in the sorted array can be at indexes 5, 6, 7, 8, or 9 in the given array.
Source: Nearly sorted algorithm
We can use Insertion Sort to sort the elements efficiently. Following is the C code for standard Insertion Sort.
/* Function to sort an array using insertion sort*/
void insertionSort(int A[], int size)
{
int i, key, j;
for (i = 1; i < size; i++)
{
key = A[i];
j = i-1;
/* Move elements of A[0..i-1], that are greater than key, to one
position ahead of their current position.
This loop will run at most k times */
while (j >= 0 && A[j] > key)
{
A[j+1] = A[j];
j = j-1;
}
A[j+1] = key;
}
}

The inner loop will run at most k times, because to move every element to its correct place, at most k elements need to be moved. So the overall complexity
will be O(nk).
We can sort such arrays more efficiently with the help of the Heap data structure. Following is the detailed process that uses a Heap.
1) Create a Min Heap of size k+1 with the first k+1 elements. This will take O(k) time.
2) One by one remove the min element from the heap, put it in the result array, and add a new element to the heap from the remaining elements.
Removing an element and adding a new element to the min heap will take O(Logk) time. So the overall complexity will be O(k) + O((n-k)*Logk).
#include<iostream>
using namespace std;
// Prototype of a utility function to swap two integers
void swap(int *x, int *y);
// A class for Min Heap
class MinHeap
{
int *harr; // pointer to array of elements in heap
int heap_size; // size of min heap
public:
// Constructor
MinHeap(int a[], int size);
// to heapify a subtree with root at given index
void MinHeapify(int );
// to get index of left child of node at index i
int left(int i) { return (2*i + 1); }
// to get index of right child of node at index i
int right(int i) { return (2*i + 2); }
// to remove min (or root), add a new value x, and return old root
int replaceMin(int x);
// to extract the root which is the minimum element
int extractMin();
};
// Given an array of size n, where every element is at most k away from its
// target position, sorts the array in O(nLogk) time.
void sortK(int arr[], int n, int k)
{
// Create a Min Heap of the first (k+1) elements of the input array;
// the heap holds all n elements when k+1 > n
int size = (k + 1 < n) ? k + 1 : n;
int *harr = new int[size];
for (int i = 0; i < size; i++)
harr[i] = arr[i];
MinHeap hp(harr, size);
// i is the index of the remaining elements in arr[] and ti
// is the target index of the current minimum element in
// Min Heap 'hp'.
for(int i = size, ti = 0; ti < n; i++, ti++)
{
// If there are remaining elements, then place
// root of heap at target index and add arr[i]
// to Min Heap
if (i < n)
arr[ti] = hp.replaceMin(arr[i]);
// Otherwise place root at its target index and
// reduce heap size
else
arr[ti] = hp.extractMin();
}
delete[] harr; // free the heap's backing array
}
// FOLLOWING ARE IMPLEMENTATIONS OF STANDARD MIN HEAP METHODS FROM CORMEN BOOK
// Constructor: Builds a heap from a given array a[] of given size
MinHeap::MinHeap(int a[], int size)
{
heap_size = size;
harr = a; // store address of array
int i = (heap_size - 1)/2;
while (i >= 0)
{
MinHeapify(i);
i--;
}
}
// Method to remove minimum element (or root) from min heap
int MinHeap::extractMin()
{
int root = harr[0];
if (heap_size > 1)
{
harr[0] = harr[heap_size-1];
heap_size--;
MinHeapify(0);
}
return root;
}
// Method to change root with given value x, and return the old root
int MinHeap::replaceMin(int x)
{
int root = harr[0];
harr[0] = x;
if (root < x)
MinHeapify(0);
return root;
}
// A recursive method to heapify a subtree with root at given index
// This method assumes that the subtrees are already heapified
void MinHeap::MinHeapify(int i)
{
int l = left(i);
int r = right(i);
int smallest = i;
if (l < heap_size && harr[l] < harr[i])
smallest = l;
if (r < heap_size && harr[r] < harr[smallest])
smallest = r;
if (smallest != i)
{
swap(&harr[i], &harr[smallest]);
MinHeapify(smallest);
}
}
// A utility function to swap two elements
void swap(int *x, int *y)
{
int temp = *x;
*x = *y;
*y = temp;
}
// A utility function to print array elements
void printArray(int arr[], int size)
{
for (int i=0; i < size; i++)
cout << arr[i] << " ";
cout << endl;
}
// Driver program to test above functions
int main()
{
int k = 3;
int arr[] = {2, 6, 3, 12, 56, 8};
int n = sizeof(arr)/sizeof(arr[0]);
sortK(arr, n, k);
cout << "Following is sorted array\n";
printArray (arr, n);
return 0;
}

Output:
Following is sorted array
2 3 6 8 12 56

The Min Heap based method takes O(nLogk) time and uses O(k) auxiliary space.
We can also use a Balanced Binary Search Tree instead of a Heap to store K+1 elements. The insert and delete operations on a Balanced BST
also take O(Logk) time. So the Balanced BST based method will also take O(nLogk) time, but the Heap based method seems to be more efficient
as the minimum element will always be at the root. Also, a Heap doesn't need extra space for left and right pointers.
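The same Min Heap idea can also be sketched with the standard library's std::priority_queue in place of the hand-written MinHeap class. This is an alternative illustration, not the implementation above, and the function name is an assumption.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Sketch: sorts a k-sorted vector in place in O(n Log k) time using a
// min-heap (std::priority_queue with std::greater) of at most k+1 elements.
void sortKWithPriorityQueue(std::vector<int>& a, int k)
{
    std::priority_queue<int, std::vector<int>, std::greater<int> > heap;
    int n = static_cast<int>(a.size());
    int ti = 0; // next target index in a
    for (int i = 0; i < n; i++)
    {
        heap.push(a[i]);
        // Once the heap holds k+1 elements, its minimum is the next
        // element of the sorted output
        if (static_cast<int>(heap.size()) > k)
        {
            a[ti++] = heap.top();
            heap.pop();
        }
    }
    // Drain the elements that are still in the heap
    while (!heap.empty())
    {
        a[ti++] = heap.top();
        heap.pop();
    }
}
```

Writing into a[ti] is safe because ti never gets ahead of i, so every overwritten slot has already been pushed into the heap.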

Iterative Quick Sort


Following is a typical recursive implementation of Quick Sort that uses last element as pivot.
/* A typical recursive implementation of quick sort */
/* This function takes last element as pivot, places the pivot element at its
correct position in sorted array, and places all smaller (smaller than pivot)
to left of pivot and all greater elements to right of pivot */
int partition (int arr[], int l, int h)
{
int x = arr[h];
int i = (l - 1);
for (int j = l; j <= h- 1; j++)
{
if (arr[j] <= x)
{
i++;
swap (&arr[i], &arr[j]);
}
}
swap (&arr[i + 1], &arr[h]);
return (i + 1);
}
/* A[] --> Array to be sorted, l --> Starting index, h --> Ending index */
void quickSort(int A[], int l, int h)
{
if (l < h)
{
int p = partition(A, l, h); /* Partitioning index */
quickSort(A, l, p - 1);
quickSort(A, p + 1, h);
}
}

The above implementation can be optimized in many ways


1) The above implementation uses the last index as pivot. This causes worst-case behavior on already sorted arrays, which is a commonly occurring
case. The problem can be solved by choosing either a random index for the pivot, or choosing the middle index of the partition, or choosing the
median of the first, middle and last element of the partition for the pivot.
2) To reduce the recursion depth, recur first for the smaller half of the array, and use a tail call to recurse into the other.
3) Insertion sort works better for small subarrays. Insertion sort can be used for invocations on such small arrays (i.e. where the length is less than
a threshold t determined experimentally). For example, one library implementation of qsort uses insertion sort below size 7.
Despite above optimizations, the function remains recursive and uses function call stack to store intermediate values of l and h. The function call
stack stores other bookkeeping information together with parameters. Also, function calls involve overheads like storing activation record of the
caller function and then resuming execution.
The above function can be easily converted to iterative version with the help of an auxiliary stack. Following is an iterative implementation of the
above recursive code.
// An iterative implementation of quick sort
#include <stdio.h>
// A utility function to swap two elements
void swap ( int* a, int* b )
{
int t = *a;
*a = *b;
*b = t;
}
/* This function is same in both iterative and recursive*/
int partition (int arr[], int l, int h)
{
int x = arr[h];
int i = (l - 1);
for (int j = l; j <= h- 1; j++)
{
if (arr[j] <= x)
{
i++;
swap (&arr[i], &arr[j]);
}
}
swap (&arr[i + 1], &arr[h]);
return (i + 1);
}
/* A[] --> Array to be sorted, l --> Starting index, h --> Ending index */
void quickSortIterative (int arr[], int l, int h)
{
// Create an auxiliary stack
int stack[ h - l + 1 ];
// initialize top of stack
int top = -1;
// push initial values of l and h to stack
stack[ ++top ] = l;
stack[ ++top ] = h;
// Keep popping from stack while it is not empty
while ( top >= 0 )
{
// Pop h and l
h = stack[ top-- ];
l = stack[ top-- ];
// Set pivot element at its correct position in sorted array
int p = partition( arr, l, h );
// If there are elements on left side of pivot, then push left
// side to stack
if ( p-1 > l )
{
stack[ ++top ] = l;
stack[ ++top ] = p - 1;
}
// If there are elements on right side of pivot, then push right
// side to stack
if ( p+1 < h )
{
stack[ ++top ] = p + 1;
stack[ ++top ] = h;
}
}
}
// A utility function to print contents of arr
void printArr( int arr[], int n )
{
int i;
for ( i = 0; i < n; ++i )
printf( "%d ", arr[i] );
}
// Driver program to test above functions
int main()
{
int arr[] = {4, 3, 5, 2, 1, 3, 2, 3};
int n = sizeof( arr ) / sizeof( *arr );
quickSortIterative( arr, 0, n - 1 );
printArr( arr, n );
return 0;
}

Output:
1 2 2 3 3 3 4 5

The optimizations mentioned above for recursive quick sort can also be applied to the iterative version.
1) The partition process is the same in both the recursive and iterative versions, so the same techniques for choosing a good pivot apply.
2) To reduce the stack size, push the indexes of the larger half first, so that the smaller half is popped and processed first. This keeps the stack depth at O(Logn).
3) Use insertion sort when the size drops below an experimentally determined threshold.
References:
http://en.wikipedia.org/wiki/Quicksort

QuickSort on Singly Linked List


QuickSort on Doubly Linked List is discussed separately, where QuickSort on Singly Linked List was given as an exercise. Following is a C++ implementation for
the same. The important points about the implementation are that it changes pointers rather than swapping data, and that its time complexity is the same as that of the
implementation for Doubly Linked List.
In partition(), we consider the last element as the pivot. We traverse the current list, and if a node has a value greater than the pivot, we move it after
the tail. If the node has a smaller value, we keep it at its current position.
In QuickSortRecur(), we first call partition(), which places the pivot at its correct position and returns the pivot. After the pivot is placed at its correct position, we
find the tail node of the left side (the list before the pivot) and recur for the left list. Finally, we recur for the right list.
// C++ program for Quick Sort on Singly Linked List
#include <iostream>
#include <cstdio>
using namespace std;
/* a node of the singly linked list */
struct node
{
int data;
struct node *next;
};
/* A utility function to insert a node at the beginning of linked list */
void push(struct node** head_ref, int new_data)
{
/* allocate node */
struct node* new_node = new node;
/* put in the data */
new_node->data = new_data;
/* link the old list off the new node */
new_node->next = (*head_ref);
/* move the head to point to the new node */
(*head_ref) = new_node;
}
/* A utility function to print linked list */
void printList(struct node *node)
{
while (node != NULL)
{
printf("%d ", node->data);
node = node->next;
}
printf("\n");
}
// Returns the last node of the list
struct node *getTail(struct node *cur)
{
while (cur != NULL && cur->next != NULL)
cur = cur->next;
return cur;
}
// Partitions the list taking the last element as the pivot
struct node *partition(struct node *head, struct node *end,
struct node **newHead, struct node **newEnd)
{
struct node *pivot = end;
struct node *prev = NULL, *cur = head, *tail = pivot;
// During partition, both the head and end of the list might change
// which is updated in the newHead and newEnd variables
while (cur != pivot)
{
if (cur->data < pivot->data)
{
// First node that has a value less than the pivot - becomes
// the new head
if ((*newHead) == NULL)
(*newHead) = cur;
prev = cur;
cur = cur->next;
}
else // If cur node is greater than pivot

{
// Move cur node to next of tail, and change tail
if (prev)
prev->next = cur->next;
struct node *tmp = cur->next;
cur->next = NULL;
tail->next = cur;
tail = cur;
cur = tmp;
}
}
// If the pivot data is the smallest element in the current list,
// pivot becomes the head
if ((*newHead) == NULL)
(*newHead) = pivot;
// Update newEnd to the current last node
(*newEnd) = tail;
// Return the pivot node
return pivot;
}
// Here the sorting happens exclusive of the end node
struct node *quickSortRecur(struct node *head, struct node *end)
{
// base condition
if (!head || head == end)
return head;
node *newHead = NULL, *newEnd = NULL;
// Partition the list, newHead and newEnd will be updated
// by the partition function
struct node *pivot = partition(head, end, &newHead, &newEnd);
// If pivot is the smallest element - no need to recur for
// the left part.
if (newHead != pivot)
{
// Set the node before the pivot node as NULL
struct node *tmp = newHead;
while (tmp->next != pivot)
tmp = tmp->next;
tmp->next = NULL;
// Recur for the list before pivot
newHead = quickSortRecur(newHead, tmp);
// Change next of last node of the left half to pivot
tmp = getTail(newHead);
tmp->next = pivot;
}
// Recur for the list after the pivot element
pivot->next = quickSortRecur(pivot->next, newEnd);
return newHead;
}
// The main function for quick sort. This is a wrapper over recursive
// function quickSortRecur()
void quickSort(struct node **headRef)
{
(*headRef) = quickSortRecur(*headRef, getTail(*headRef));
return;
}
// Driver program to test above functions
int main()
{
struct node *a = NULL;
push(&a, 5);
push(&a, 20);
push(&a, 4);
push(&a, 3);
push(&a, 30);
cout << "Linked List before sorting \n";

printList(a);
quickSort(&a);
cout << "Linked List after sorting \n";
printList(a);
return 0;
}

Output:
Linked List before sorting
30 3 4 20 5
Linked List after sorting
3 4 5 20 30

QuickSort on Doubly Linked List


Following is a typical recursive implementation of QuickSort for arrays. The implementation uses last element as pivot.
/* A typical recursive implementation of Quicksort for array*/
/* This function takes last element as pivot, places the pivot element at its
correct position in sorted array, and places all smaller (smaller than
pivot) to left of pivot and all greater elements to right of pivot */
int partition (int arr[], int l, int h)
{
int x = arr[h];
int i = (l - 1);
for (int j = l; j <= h- 1; j++)
{
if (arr[j] <= x)
{
i++;
swap (&arr[i], &arr[j]);
}
}
swap (&arr[i + 1], &arr[h]);
return (i + 1);
}
/* A[] --> Array to be sorted, l --> Starting index, h --> Ending index */
void quickSort(int A[], int l, int h)
{
if (l < h)
{
int p = partition(A, l, h); /* Partitioning index */
quickSort(A, l, p - 1);
quickSort(A, p + 1, h);
}
}

Can we use same algorithm for Linked List?


Following is C++ implementation for doubly linked list. The idea is simple, we first find out pointer to last node. Once we have pointer to last node,
we can recursively sort the linked list using pointers to first and last nodes of linked list, similar to the above recursive function where we pass
indexes of first and last array elements. The partition function for linked list is also similar to partition for arrays. Instead of returning index of the
pivot element, it returns pointer to the pivot element. In the following implementation, quickSort() is just a wrapper function, the main recursive
function is _quickSort() which is similar to quickSort() for array implementation.

// A C++ program to sort a linked list using Quicksort


#include <iostream>
#include <stdio.h>
using namespace std;
/* a node of the doubly linked list */
struct node
{
int data;
struct node *next;
struct node *prev;
};
/* A utility function to swap two elements */
void swap ( int* a, int* b )
{ int t = *a;
*a = *b;
*b = t; }
// A utility function to find last node of linked list
struct node *lastNode(node *root)
{
while (root && root->next)
root = root->next;
return root;
}
/* Considers last element as pivot, places the pivot element at its
correct position in sorted array, and places all smaller (smaller than
pivot) to left of pivot and all greater elements to right of pivot */

node* partition(node *l, node *h)


{
// set pivot as h element
int x = h->data;
// similar to i = l-1 for array implementation
node *i = l->prev;
// Similar to "for (int j = l; j <= h- 1; j++)"
for (node *j = l; j != h; j = j->next)
{
if (j->data <= x)
{
// Similar to i++ for array
i = (i == NULL)? l : i->next;
swap(&(i->data), &(j->data));
}
}
i = (i == NULL)? l : i->next; // Similar to i++
swap(&(i->data), &(h->data));
return i;
}
/* A recursive implementation of quicksort for linked list */
void _quickSort(struct node* l, struct node *h)
{
if (h != NULL && l != h && l != h->next)
{
struct node *p = partition(l, h);
_quickSort(l, p->prev);
_quickSort(p->next, h);
}
}
// The main function to sort a linked list. It mainly calls _quickSort()
void quickSort(struct node *head)
{
// Find last node
struct node *h = lastNode(head);
// Call the recursive QuickSort
_quickSort(head, h);
}
// A utility function to print contents of the linked list
void printList(struct node *head)
{
while (head)
{
cout << head->data << " ";
head = head->next;
}
cout << endl;
}
/* Function to insert a node at the beginning of the Doubly Linked List */
void push(struct node** head_ref, int new_data)
{
/* allocate node */
struct node* new_node = new node;
new_node->data = new_data;
/* since we are adding at the beginning, prev is always NULL */
new_node->prev = NULL;
/* link the old list off the new node */
new_node->next = (*head_ref);
/* change prev of head node to new node */
if ((*head_ref) != NULL) (*head_ref)->prev = new_node ;
/* move the head to point to the new node */
(*head_ref) = new_node;
}
/* Driver program to test above function */
int main()
{
struct node *a = NULL;
push(&a, 5);
push(&a, 20);

push(&a, 4);
push(&a, 3);
push(&a, 30);
cout << "Linked List before sorting \n";
printList(a);
quickSort(a);
cout << "Linked List after sorting \n";
printList(a);
return 0;
}

Output :
Linked List before sorting
30 3 4 20 5
Linked List after sorting
3 4 5 20 30

Time Complexity: The time complexity of the above implementation is the same as that of QuickSort() for arrays. It takes O(n^2) time in the
worst case and O(nLogn) in the average and best cases. The worst case occurs when the linked list is already sorted.
Can we implement random quick sort for linked list?
Quicksort can be implemented efficiently for a linked list only when we pick a fixed point as the pivot (like the last element in the above implementation). Random
QuickSort cannot be implemented as efficiently, because a linked list has no constant-time random access and reaching a randomly chosen pivot requires a traversal.
Exercise:
The above implementation is for a doubly linked list. Modify it for a singly linked list. Note that we don't have a prev pointer in a singly linked list.
Refer to QuickSort on Singly Linked List for the solution.

Find k closest elements to a given value


Given a sorted array arr[] and a value X, find the k closest elements to X in arr[].
Examples:
Input: K = 4, X = 35
arr[] = {12, 16, 22, 30, 35, 39, 42,
45, 48, 50, 53, 55, 56}
Output: 30 39 42 45

Note that if the element is present in array, then it should not be in output, only the other closest elements are required.
In the following solutions, it is assumed that all elements of array are distinct.
A simple solution is to do linear search for k closest elements.
1) Start from the first element and search for the crossover point (The point before which elements are smaller than or equal to X and after which
elements are greater). This step takes O(n) time.
2) Once we find the crossover point, we can compare elements on both sides of crossover point to print k closest elements. This step takes O(k)
time.
The time complexity of the above solution is O(n).
An Optimized Solution is to find k elements in O(Logn + k) time. The idea is to use Binary Search to find the crossover point. Once we find
index of crossover point, we can print k closest elements in O(k) time.
#include<stdio.h>
/* Function to find the cross over point (the point before
which elements are smaller than or equal to x and after
which greater than x)*/
int findCrossOver(int arr[], int low, int high, int x)
{
// Base cases
if (arr[high] <= x) // x is greater than all
return high;
if (arr[low] > x) // x is smaller than all
return low;
// Find the middle point
int mid = (low + high)/2; /* low + (high - low)/2 */
/* If mid is the crossover point, i.e., arr[mid] <= x < arr[mid+1],
then return mid */
if (arr[mid] <= x && arr[mid+1] > x)
return mid;
/* If x is greater than arr[mid], then the crossover point
lies in arr[mid+1...high] */
if(arr[mid] < x)
return findCrossOver(arr, mid+1, high, x);
return findCrossOver(arr, low, mid - 1, x);
}
// This function prints k closest elements to x in arr[].
// n is the number of elements in arr[]
void printKclosest(int arr[], int x, int k, int n)
{
// Find the crossover point
int l = findCrossOver(arr, 0, n-1, x); // Left index to search
int r = l+1; // Right index to search
int count = 0; // To keep track of count of elements already printed
// If x is present in arr[], then reduce left index
// Assumption: all elements in arr[] are distinct
if (arr[l] == x) l--;
// Compare elements on left and right of crossover
// point to find the k closest elements
while (l >= 0 && r < n && count < k)
{
if (x - arr[l] < arr[r] - x)
printf("%d ", arr[l--]);
else
printf("%d ", arr[r++]);
count++;
}
// If there are no more elements on right side, then
// print left elements
while (count < k && l >= 0)
printf("%d ", arr[l--]), count++;
// If there are no more elements on left side, then
// print right elements
while (count < k && r < n)
printf("%d ", arr[r++]), count++;
}
/* Driver program to check above functions */
int main()
{
int arr[] ={12, 16, 22, 30, 35, 39, 42,
45, 48, 50, 53, 55, 56};
int n = sizeof(arr)/sizeof(arr[0]);
int x = 35, k = 4;
printKclosest(arr, x, k, n);
return 0;
}

Output:
39 30 42 45

The time complexity of this method is O(Logn + k).
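As a cross-check on the binary search step, the crossover point can also be obtained from the standard library. The sketch below is a variant, not the method above: crossOverIndex and kClosest are hypothetical names, the result is returned in a vector instead of being printed, and crossOverIndex returns -1 (rather than 0) when every element is greater than x. Distinct elements are assumed, as in the text.

```cpp
#include <algorithm>
#include <vector>

// Index of the last element <= x in sorted arr, or -1 if every element is greater.
int crossOverIndex(const std::vector<int>& arr, int x) {
    return static_cast<int>(std::upper_bound(arr.begin(), arr.end(), x) - arr.begin()) - 1;
}

// Returns up to k elements closest to x, excluding x itself.
std::vector<int> kClosest(const std::vector<int>& arr, int x, int k) {
    int n = static_cast<int>(arr.size());
    int l = crossOverIndex(arr, x);
    int r = l + 1;
    if (l >= 0 && arr[l] == x) --l;  // skip x itself if it is present
    std::vector<int> out;
    while (static_cast<int>(out.size()) < k && (l >= 0 || r < n)) {
        // Take from the left when the right side is exhausted or the left is strictly closer.
        bool takeLeft = (r >= n) || (l >= 0 && x - arr[l] < arr[r] - x);
        out.push_back(takeLeft ? arr[l--] : arr[r++]);
    }
    return out;
}
```

std::upper_bound runs in O(Logn) on random-access iterators, so the overall cost stays O(Logn + k).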


Exercise: Extend the optimized solution to work for duplicates also, i.e., to work for arrays where elements don't have to be distinct.

Sort n numbers in range from 0 to n^2 - 1 in linear time


Given an array of n numbers whose elements are in the range from 0 to n^2 - 1, sort the given array in linear time.
Examples:
Since there are 5 elements, the elements can be from 0 to 24.
Input: arr[] = {0, 23, 14, 12, 9}
Output: arr[] = {0, 9, 12, 14, 23}
Since there are 3 elements, the elements can be from 0 to 8.
Input: arr[] = {7, 0, 2}
Output: arr[] = {0, 2, 7}

Solution: If we use Counting Sort, it would take O(n^2) time as the given range is of size n^2. Using any comparison based sorting algorithm like Merge
Sort or Heap Sort would take O(nLogn) time.
Now the question arises: how can we do this in O(n)? Is it even possible? Can we use the information given in the question, namely that there are n numbers in the range from 0 to n^2 - 1?
The idea is to use Radix Sort. Following is the standard Radix Sort algorithm.
1) Do following for each digit i where i varies from least
significant digit to the most significant digit.
..a) Sort input array using counting sort (or any stable
sort) according to the ith digit

Let there be d digits in the input integers. Radix Sort takes O(d*(n+b)) time, where b is the base used to represent the numbers; for example, for the decimal
system, b is 10. Since n^2 - 1 is the maximum possible value, the value of d would be O(log_b(n)). So the overall time complexity is O((n+b)*log_b(n)),
which for a small fixed base b looks no better than the time complexity of comparison based sorting algorithms. The idea is to change the base b. If we set b as n,
every value has at most 2 digits, so log_b(n) becomes O(1) and the overall time complexity becomes O(n).
arr[] = {0, 10, 13, 12, 7}
Let us consider the elements in base 5. For example 13 in
base 5 is 23, and 7 in base 5 is 12.
arr[] = {00(0), 20(10), 23(13), 22(12), 12(7)}
After first iteration (Sorting according to the last digit in
base 5), we get.
arr[] = {00(0), 20(10), 12(7), 22(12), 23(13)}
After second iteration, we get
arr[] = {00(0), 12(7), 20(10), 22(12), 23(13)}

Following is a C++ implementation to sort an array of size n where elements are in the range from 0 to n^2 - 1.
#include<iostream>
using namespace std;
// A function to do counting sort of arr[] according to
// the digit represented by exp.
void countSort(int arr[], int n, int exp)
{
int output[n]; // output array
int i, count[n] ;
for (i = 0; i < n; i++)
count[i] = 0;
// Store count of occurrences in count[]
for (i = 0; i < n; i++)
count[ (arr[i]/exp)%n ]++;
// Change count[i] so that count[i] now contains actual
// position of this digit in output[]
for (i = 1; i < n; i++)
count[i] += count[i - 1];
// Build the output array
for (i = n - 1; i >= 0; i--)
{
output[count[ (arr[i]/exp)%n] - 1] = arr[i];
count[(arr[i]/exp)%n]--;
}
// Copy the output array to arr[], so that arr[] now
// contains sorted numbers according to current digit
for (i = 0; i < n; i++)
arr[i] = output[i];
}

// The main function that sorts arr[] of size n using Radix Sort
void sort(int arr[], int n)
{
// Do counting sort for first digit in base n. Note that
// instead of passing digit number, exp (n^0 = 1) is passed.
countSort(arr, n, 1);
// Do counting sort for second digit in base n. Note that
// instead of passing digit number, exp (n^1 = n) is passed.
countSort(arr, n, n);
}
// A utility function to print an array
void printArr(int arr[], int n)
{
for (int i = 0; i < n; i++)
cout << arr[i] << " ";
}
// Driver program to test above functions
int main()
{
// Since array size is 7, elements should be from 0 to 48
int arr[] = {40, 12, 45, 32, 33, 1, 22};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "Given array is \n";
printArr(arr, n);
sort(arr, n);
cout << "\nSorted array is \n";
printArr(arr, n);
return 0;
}

Output:
Given array is
40 12 45 32 33 1 22
Sorted array is
1 12 22 32 33 40 45

How to sort if range is from 1 to n^2?


If the range is from 1 to n^2, the above process cannot be applied directly; it must be changed. Consider n = 100 and the range from 1 to 10000. Since
the base is 100, a digit must be from 0 to 99 and there should be 2 digits in the numbers. But the number 10000 has more than 2 digits in base 100. So to sort
numbers in a range from 1 to n^2, we can use the following process.
1) Subtract 1 from all numbers.
2) Since the range is now 0 to n^2 - 1, do counting sort twice as done in the above implementation.
3) After the elements are sorted, add 1 to all numbers to obtain the original numbers.
How to sort if range is from 0 to n^3 - 1?
Since there can be 3 digits in base n, we need to call counting sort 3 times.
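Both variations can be expressed with one helper that takes the number of base-n digits as a parameter. This is a minimal sketch, assuming the input really lies in the stated range; the names countSortByDigit, sortBaseN and sortOneToNSquared are hypothetical, and std::vector is used in place of the variable-length arrays of the code above.

```cpp
#include <vector>

// Stable counting sort of arr by the base-n digit selected by exp (exp = n^i),
// where n is the number of elements.
void countSortByDigit(std::vector<int>& arr, long long exp) {
    const int n = static_cast<int>(arr.size());
    std::vector<int> output(n), count(n, 0);
    for (int v : arr) count[(v / exp) % n]++;
    for (int i = 1; i < n; ++i) count[i] += count[i - 1];
    for (int i = n - 1; i >= 0; --i) {          // backwards pass keeps the sort stable
        int d = static_cast<int>((arr[i] / exp) % n);
        output[--count[d]] = arr[i];
    }
    arr = output;
}

// Sorts n values in the range 0 .. n^digits - 1 using `digits` counting-sort passes.
void sortBaseN(std::vector<int>& arr, int digits) {
    long long exp = 1;
    for (int pass = 0; pass < digits; ++pass) {
        countSortByDigit(arr, exp);
        exp *= static_cast<long long>(arr.size());
    }
}

// Range 1 .. n^2: shift to 0 .. n^2 - 1, sort with 2 passes, shift back.
void sortOneToNSquared(std::vector<int>& arr) {
    for (int& v : arr) --v;
    sortBaseN(arr, 2);
    for (int& v : arr) ++v;
}
```

Calling sortBaseN(arr, 3) covers the range 0 to n^3 - 1; each extra digit adds one O(n) pass, so the total stays linear for any constant number of digits.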

A Problem in Many Binary Search Implementations


Consider the following C implementation of Binary Search function, is there anything wrong in this?
// An iterative binary search function. It returns location of x in
// given array arr[l..r] if present, otherwise -1
int binarySearch(int arr[], int l, int r, int x)
{
while (l <= r)
{
// find index of middle element
int m = (l+r)/2;
// Check if x is present at mid
if (arr[m] == x) return m;
// If x greater, ignore left half
if (arr[m] < x) l = m + 1;
// If x is smaller, ignore right half
else r = m - 1;
}
// if we reach here, then element was not present
return -1;
}

The above looks fine except for one subtle thing: the expression m = (l+r)/2. It fails for large values of l and r. Specifically, it fails if the sum of l
and r is greater than the maximum positive int value (2^31 - 1). Signed overflow is undefined behavior in C; in practice the sum typically wraps to a negative value, and the value stays negative when divided
by two. In C this causes an array index out of bounds with unpredictable results.
What is the way to resolve this problem?
Following is one way:
int mid = low + ((high - low) / 2);

Probably faster, and arguably as clear, is the following (works only in Java):
int mid = (low + high) >>> 1;

In C and C++ (where you don't have the >>> operator), you can do this:
mid = ((unsigned int)low + (unsigned int)high) >> 1;

A similar problem appears in Merge Sort as well.


The above content is taken from the Google Research blog.
Please also refer to the sources listed below; one of them points out that the above solutions may not always work.
The above problem occurs when the array length is 2^30 or greater and the search repeatedly moves to the second half of the array. Arrays of this size
are unlikely to appear most of the time. For example, when we try the below program with the 32 bit Code Blocks compiler, we get a compiler error.
int main()
{
int arr[1<<30];
return 0;
}

Output:
error: size of array 'arr' is too large

When we try a boolean array instead, the program compiles fine but crashes when run on Windows 7 with the 32 bit Code Blocks compiler.
#include <stdbool.h>
int main()
{
bool arr[1<<30];
return 0;
}

Output: No compiler error, but crashes at run time.


Sources:
http://googleresearch.blogspot.in/2006/06/extra-extra-read-all-about-it-nearly.html

http://locklessinc.com/articles/binary_search/

Search in an almost sorted array


Given an array which is sorted, but after sorting some elements are moved to either of the adjacent positions, i.e., arr[i] may be present at arr[i+1]
or arr[i-1]. Write an efficient function to search an element in this array. Basically the element arr[i] can only be swapped with either arr[i+1] or
arr[i-1].
For example consider the array {2, 3, 10, 4, 40}, 4 is moved to next position and 10 is moved to previous position.
Example:
Input: arr[] = {10, 3, 40, 20, 50, 80, 70}, key = 40
Output: 2
Output is index of 40 in given array
Input: arr[] = {10, 3, 40, 20, 50, 80, 70}, key = 90
Output: -1
-1 is returned to indicate element is not present

A simple solution is to linearly search for the given key in the given array. The time complexity of this solution is O(n). We can modify binary search to do it in
O(Logn) time.
The idea is to compare the key with the middle 3 elements; if it is present, return its index. If it is not present, compare the key with the middle element
to decide whether to go to the left half or the right half. Comparing with the middle element is enough, as all the elements after mid+2 must be greater than
the element at mid and all elements before mid-2 must be smaller than the element at mid.
Following is C++ implementation of this approach.
// C++ program to find an element in an almost sorted array
#include <stdio.h>
// A recursive binary search based function. It returns index of x in
// given array arr[l..r] if present, otherwise -1
int binarySearch(int arr[], int l, int r, int x)
{
if (r >= l)
{
int mid = l + (r - l)/2;
// If the element is present at one of the middle 3 positions
if (arr[mid] == x) return mid;
if (mid > l && arr[mid-1] == x) return (mid - 1);
if (mid < r && arr[mid+1] == x) return (mid + 1);
// If element is smaller than mid, then it can only be present
// in left subarray
if (arr[mid] > x) return binarySearch(arr, l, mid-2, x);
// Else the element can only be present in right subarray
return binarySearch(arr, mid+2, r, x);
}
// We reach here when element is not present in array
return -1;
}
// Driver program to test above function
int main(void)
{
int arr[] = {3, 2, 10, 4, 40};
int n = sizeof(arr)/ sizeof(arr[0]);
int x = 4;
int result = binarySearch(arr, 0, n-1, x);
(result == -1)? printf("Element is not present in array")
: printf("Element is present at index %d", result);
return 0;
}

Output:
Element is present at index 3

Time complexity of the above function is O(Logn).
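The same search can be written without recursion. The following is a minimal iterative sketch of the idea described above; the function name searchAlmostSorted and the use of std::vector are assumptions made for illustration.

```cpp
#include <vector>

// Returns the index of x in an almost sorted array (each element at most one
// position away from its sorted position), or -1 if x is not present.
int searchAlmostSorted(const std::vector<int>& arr, int x) {
    int l = 0, r = static_cast<int>(arr.size()) - 1;
    while (l <= r) {
        int mid = l + (r - l) / 2;
        // Check the middle 3 positions, staying inside [l, r].
        if (arr[mid] == x) return mid;
        if (mid > l && arr[mid - 1] == x) return mid - 1;
        if (mid < r && arr[mid + 1] == x) return mid + 1;
        // Positions mid-1 and mid+1 are already ruled out, so skip 2 at a time.
        if (arr[mid] > x) r = mid - 2;
        else l = mid + 2;
    }
    return -1;
}
```

The loop halves the range on every step just like the recursive version, so the running time remains O(Logn) while the call stack stays flat.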

Sort an array in wave form


Given an unsorted array of integers, sort the array into a wave like array. An array arr[0..n-1] is sorted in wave form if arr[0] >= arr[1] <= arr[2]
>= arr[3] <= arr[4] >= ..
Examples:
Input: arr[] = {10, 5, 6, 3, 2, 20, 100, 80}
Output: arr[] = {10, 5, 6, 2, 20, 3, 100, 80} OR
{20, 5, 10, 2, 80, 6, 100, 3} OR
any other array that is in wave form
Input: arr[] = {20, 10, 8, 6, 4, 2}
Output: arr[] = {20, 8, 10, 4, 6, 2} OR
{10, 8, 20, 2, 6, 4} OR
any other array that is in wave form
Input: arr[] = {2, 4, 6, 8, 10, 20}
Output: arr[] = {4, 2, 8, 6, 20, 10} OR
any other array that is in wave form
Input: arr[] = {3, 6, 5, 10, 7, 20}
Output: arr[] = {6, 3, 10, 5, 20, 7} OR
any other array that is in wave form

A Simple Solution is to use sorting. First sort the input array, then swap all adjacent elements.
For example, let the input array be {3, 6, 5, 10, 7, 20}. After sorting, we get {3, 5, 6, 7, 10, 20}. After swapping adjacent elements, we get {5,
3, 7, 6, 20, 10}.
Below are implementations of this simple approach.

C++
// A C++ program to sort an array in wave form using a sorting function
#include<iostream>
#include<algorithm>
using namespace std;
// A utility method to swap two numbers.
void swap(int *x, int *y)
{
int temp = *x;
*x = *y;
*y = temp;
}
// This function sorts arr[0..n-1] in wave form, i.e.,
// arr[0] >= arr[1] <= arr[2] >= arr[3] <= arr[4] >= arr[5]..
void sortInWave(int arr[], int n)
{
// Sort the input array
sort(arr, arr+n);
// Swap adjacent elements
for (int i=0; i<n-1; i += 2)
swap(&arr[i], &arr[i+1]);
}
// Driver program to test above function
int main()
{
int arr[] = {10, 90, 49, 2, 1, 5, 23};
int n = sizeof(arr)/sizeof(arr[0]);
sortInWave(arr, n);
for (int i=0; i<n; i++)
cout << arr[i] << " ";
return 0;
}

Python
# Python function to sort the array arr[0..n-1] in wave form,
# i.e., arr[0] >= arr[1] <= arr[2] >= arr[3] <= arr[4] >= arr[5]
def sortInWave(arr, n):
    # Sort the array
    arr.sort()
    # Swap adjacent elements
    for i in range(0, n-1, 2):
        arr[i], arr[i+1] = arr[i+1], arr[i]

# Driver program
arr = [10, 90, 49, 2, 1, 5, 23]
sortInWave(arr, len(arr))
for i in range(0, len(arr)):
    print(arr[i], end=" ")
# This code is contributed by __Devesh Agrawal__

Output:
2 1 10 5 49 23 90

The time complexity of the above solution is O(nLogn) if an O(nLogn) sorting algorithm like Merge Sort or Heap Sort is used.
This can be done in O(n) time by doing a single traversal of the given array. The idea is based on the fact that if we make sure that all even
positioned (at index 0, 2, 4, ..) elements are greater than or equal to their adjacent odd positioned elements, we don't need to worry about the odd positioned elements.
Following are the simple steps.
1) Traverse all even positioned elements of input array, and do following.
.a) If current element is smaller than previous odd element, swap previous and current.
.b) If current element is smaller than next odd element, swap next and current.
Below are implementations of above simple algorithm.

C++
// A O(n) program to sort an input array in wave form
#include<iostream>
using namespace std;
// A utility method to swap two numbers.
void swap(int *x, int *y)
{
int temp = *x;
*x = *y;
*y = temp;
}
// This function sorts arr[0..n-1] in wave form, i.e., arr[0] >=
// arr[1] <= arr[2] >= arr[3] <= arr[4] >= arr[5] ....
void sortInWave(int arr[], int n)
{
// Traverse all even elements
for (int i = 0; i < n; i+=2)
{
// If current even element is smaller than previous
if (i>0 && arr[i-1] > arr[i] )
swap(&arr[i], &arr[i-1]);
// If current even element is smaller than next
if (i<n-1 && arr[i] < arr[i+1] )
swap(&arr[i], &arr[i + 1]);
}
}
// Driver program to test above function
int main()
{
int arr[] = {10, 90, 49, 2, 1, 5, 23};
int n = sizeof(arr)/sizeof(arr[0]);
sortInWave(arr, n);
for (int i=0; i<n; i++)
cout << arr[i] << " ";
return 0;
}

Python
# Python function to sort the array arr[0..n-1] in wave form,
# i.e., arr[0] >= arr[1] <= arr[2] >= arr[3] <= arr[4] >= arr[5]
def sortInWave(arr, n):
    # Traverse all even elements
    for i in range(0, n, 2):
        # If current even element is smaller than previous
        if (i > 0 and arr[i] < arr[i-1]):
            arr[i], arr[i-1] = arr[i-1], arr[i]
        # If current even element is smaller than next
        if (i < n-1 and arr[i] < arr[i+1]):
            arr[i], arr[i+1] = arr[i+1], arr[i]

# Driver program
arr = [10, 90, 49, 2, 1, 5, 23]
sortInWave(arr, len(arr))
for i in range(0, len(arr)):
    print(arr[i], end=" ")
# This code is contributed by __Devesh Agrawal__

Output:
90 10 49 1 5 2 23
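Since many different arrangements are valid wave forms, comparing against a single expected output is not a reliable test. A small checker for the definition arr[0] >= arr[1] <= arr[2] >= arr[3] <= ... is more useful; the sketch below uses the hypothetical name isWave.

```cpp
#include <cstddef>
#include <vector>

// True if arr[0] >= arr[1] <= arr[2] >= arr[3] <= ... holds for every adjacent pair.
bool isWave(const std::vector<int>& arr) {
    for (std::size_t i = 0; i + 1 < arr.size(); ++i) {
        bool evenIndex = (i % 2 == 0);
        if (evenIndex && arr[i] < arr[i + 1]) return false;   // need arr[i] >= arr[i+1]
        if (!evenIndex && arr[i] > arr[i + 1]) return false;  // need arr[i] <= arr[i+1]
    }
    return true;
}
```

Both outputs shown above, {2, 1, 10, 5, 49, 23, 90} from the sorting approach and {90, 10, 49, 1, 5, 2, 23} from the single traversal, satisfy this check.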

Why is Binary Search preferred over Ternary Search?


The following is a simple recursive Binary Search function in C++.
// A recursive binary search function. It returns location of x in
// given array arr[l..r] if present, otherwise -1
int binarySearch(int arr[], int l, int r, int x)
{
if (r >= l)
{
int mid = l + (r - l)/2;
// If the element is present at the middle itself
if (arr[mid] == x) return mid;
// If element is smaller than mid, then it can only be present
// in left subarray
if (arr[mid] > x) return binarySearch(arr, l, mid-1, x);
// Else the element can only be present in right subarray
return binarySearch(arr, mid+1, r, x);
}
// We reach here when element is not present in array
return -1;
}

The following is a simple recursive Ternary Search function in C++.


// A recursive ternary search function. It returns location of x in
// given array arr[l..r] if present, otherwise -1
int ternarySearch(int arr[], int l, int r, int x)
{
if (r >= l)
{
int mid1 = l + (r - l)/3;
int mid2 = mid1 + (r - l)/3;
// If x is present at the mid1
if (arr[mid1] == x) return mid1;
// If x is present at the mid2
if (arr[mid2] == x) return mid2;
// If x is present in left one-third
if (arr[mid1] > x) return ternarySearch(arr, l, mid1-1, x);
// If x is present in right one-third
if (arr[mid2] < x) return ternarySearch(arr, mid2+1, r, x);
// If x is present in middle one-third
return ternarySearch(arr, mid1+1, mid2-1, x);
}
// We reach here when element is not present in array
return -1;
}

Which of the above two does fewer comparisons in the worst case?


At first glance, it seems that ternary search does fewer comparisons, as it makes Log_3(n) recursive calls while binary search makes Log_2(n)
recursive calls. Let us take a closer look.
The following is the recursive formula for counting comparisons in the worst case of Binary Search.
T(n) = T(n/2) + 2, T(1) = 1

The following is the recursive formula for counting comparisons in the worst case of Ternary Search.
T(n) = T(n/3) + 4, T(1) = 1

In binary search, there are 2*Log_2(n) + 1 comparisons in the worst case. In ternary search, there are 4*Log_3(n) + 1 comparisons in the worst case.
Therefore, the comparison of Ternary and Binary Searches boils down to the comparison of the expressions 2*Log_3(n) and Log_2(n). The value of 2*Log_3(n)
can be written as (2 / Log_2(3)) * Log_2(n). Since the value of (2 / Log_2(3)) is more than one, Ternary Search does more comparisons than Binary
Search in the worst case.
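The change of base used in the argument above can be written out step by step. The subscripted names T_bin and T_ter are introduced here only to tell the two recurrences apart, and the numeric values are rounded approximations.

```latex
\begin{align*}
T_{\text{bin}}(n) &= 2\log_2 n + 1, \\
T_{\text{ter}}(n) &= 4\log_3 n + 1
                   = \frac{4}{\log_2 3}\,\log_2 n + 1
                   = \frac{2}{\log_2 3}\cdot 2\log_2 n + 1
                   \approx 1.26 \cdot 2\log_2 n + 1 .
\end{align*}
```

For example, with n = 10^6 the two counts come to roughly 41 comparisons for binary search and roughly 51 for ternary search.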
Exercise:
Why Merge Sort divides input array in two halves, why not in three or more parts?

Kth Smallest/Largest Element in Unsorted Array | Set 2 (Expected Linear Time)


We recommend reading the following post as a prerequisite of this post.
Kth Smallest/Largest Element in Unsorted Array | Set 1
Given an array and a number k where k is smaller than the size of the array, we need to find the kth smallest element in the given array. It is given that all
array elements are distinct.
Examples:
Input: arr[] = {7, 10, 4, 3, 20, 15}
k = 3
Output: 7
Input: arr[] = {7, 10, 4, 3, 20, 15}
k = 4
Output: 10

Three different solutions were discussed in the previous post (Set 1).


In this post, method 4 is discussed, which is mainly an extension of method 3 (QuickSelect) discussed in the previous post. The idea is to randomly
pick a pivot element. To implement randomized partition, we use a random function, rand(), to generate an index between l and r, swap the element at the
randomly generated index with the last element, and finally call the standard partition process, which uses the last element as pivot.
Following is a C++ implementation of the above Randomized QuickSelect.
// C++ implementation of randomized quickSelect
#include<iostream>
#include<climits>
#include<cstdlib>
using namespace std;
int randomPartition(int arr[], int l, int r);
// This function returns k'th smallest element in arr[l..r] using
// QuickSort based method. ASSUMPTION: ALL ELEMENTS IN ARR[] ARE DISTINCT
int kthSmallest(int arr[], int l, int r, int k)
{
// If k is smaller than number of elements in array
if (k > 0 && k <= r - l + 1)
{
// Partition the array around a random element and
// get position of pivot element in sorted array
int pos = randomPartition(arr, l, r);
// If position is same as k
if (pos-l == k-1)
return arr[pos];
if (pos-l > k-1) // If position is more, recur for left subarray
return kthSmallest(arr, l, pos-1, k);
// Else recur for right subarray
return kthSmallest(arr, pos+1, r, k-pos+l-1);
}
// If k is more than number of elements in array
return INT_MAX;
}
void swap(int *a, int *b)
{
int temp = *a;
*a = *b;
*b = temp;
}
// Standard partition process of QuickSort(). It considers the last
// element as pivot and moves all smaller element to left of it and
// greater elements to right. This function is used by randomPartition()
int partition(int arr[], int l, int r)
{
int x = arr[r], i = l;
for (int j = l; j <= r - 1; j++)
{
if (arr[j] <= x)
{
swap(&arr[i], &arr[j]);

i++;
}
}
swap(&arr[i], &arr[r]);
return i;
}
// Picks a random pivot element between l and r and partitions
// arr[l..r] around the randomly picked element using partition()
int randomPartition(int arr[], int l, int r)
{
int n = r-l+1;
int pivot = rand() % n;
swap(&arr[l + pivot], &arr[r]);
return partition(arr, l, r);
}
// Driver program to test above methods
int main()
{
int arr[] = {12, 3, 5, 7, 4, 19, 26};
int n = sizeof(arr)/sizeof(arr[0]), k = 3;
cout << "K'th smallest element is " << kthSmallest(arr, 0, n-1, k);
return 0;
}

Output:
K'th smallest element is 5

Time Complexity:
The worst case time complexity of the above solution is still O(n^2): in the worst case, the random function may always pick a corner (smallest or largest) element as pivot. The expected time complexity of the above randomized QuickSelect is Θ(n); see the CLRS book or the MIT video lecture for the proof. The analysis assumes that the random number generator is equally likely to generate any index in the input range.
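The uniformity assumption can be made explicit in code. The following is a minimal sketch, not the listing above: it assumes C++11 `<random>` is available, uses `std::vector` instead of raw arrays, and the generator parameter `gen` and the iterative loop are illustrative choices.

```cpp
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Lomuto partition of a[l..r] around a pivot drawn uniformly at random.
// Returns the final index of the pivot.
int randomPartition(std::vector<int>& a, int l, int r, std::mt19937& gen)
{
    assert(l <= r);
    // Every index in [l, r] is equally likely, which is the
    // assumption the expected-time analysis relies on.
    std::uniform_int_distribution<int> pick(l, r);
    std::swap(a[pick(gen)], a[r]);

    int x = a[r], i = l;
    for (int j = l; j < r; j++)
        if (a[j] <= x)
            std::swap(a[i++], a[j]);
    std::swap(a[i], a[r]);
    return i;
}

// Iterative QuickSelect on a copy of the input. k is 1-based and the
// elements are assumed to be distinct, as in the article.
int kthSmallest(std::vector<int> a, int k, std::mt19937& gen)
{
    assert(k >= 1 && k <= static_cast<int>(a.size()));
    int l = 0, r = static_cast<int>(a.size()) - 1;
    while (true)
    {
        int pos = randomPartition(a, l, r, gen);
        if (pos == k - 1) return a[pos];
        if (pos > k - 1) r = pos - 1;   // answer lies left of the pivot
        else             l = pos + 1;   // answer lies right of the pivot
    }
}
```

Because the search range is tracked with absolute indexes, k never has to be adjusted when moving to the right part. A caller would create one generator, for example `std::mt19937 gen(std::random_device{}());`, and reuse it across calls.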
Sources:
MIT Video Lecture on Order Statistics, Median
Introduction to Algorithms by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L.

Kth Smallest/Largest Element in Unsorted Array | Set 3 (Worst Case Linear Time)
We recommend reading the following posts as a prerequisite for this post.
Kth Smallest/Largest Element in Unsorted Array | Set 1
Kth Smallest/Largest Element in Unsorted Array | Set 2 (Expected Linear Time)
Given an array and a number k, where k is smaller than the size of the array, we need to find the kth smallest element in the given array. It is given that all array elements are distinct.
Examples:
Input: arr[] = {7, 10, 4, 3, 20, 15}
k = 3
Output: 7
Input: arr[] = {7, 10, 4, 3, 20, 15}
k = 4
Output: 10

In the previous post, we discussed an expected linear time algorithm. In this post, a worst case linear time method is discussed. The idea of this new method is similar to quickSelect(): we get worst case linear time by selecting a pivot that divides the array in a balanced way (so that there are never very few elements on one side and many on the other). After the array is divided in a balanced way, we apply the same steps as in quickSelect() to decide whether to go left or right of the pivot.
Following is the complete algorithm.
kthSmallest(arr[0..n-1], k)
1) Divide arr[] into ⌈n/5⌉ groups where size of each group is 5
except possibly the last group which may have less than 5 elements.
2) Sort the above created ⌈n/5⌉ groups and find median
of all groups. Create an auxiliary array 'median[]' and store medians
of all ⌈n/5⌉ groups in this median array.
// Recursively call this method to find median of median[0..⌈n/5⌉-1]
3) medOfMed = kthSmallest(median[0..⌈n/5⌉-1], ⌈n/10⌉)
4) Partition arr[] around medOfMed and obtain its position.
pos = partition(arr, n, medOfMed)
5) If pos == k return medOfMed
6) If pos > k return kthSmallest(arr[l..pos-1], k)
7) If pos < k return kthSmallest(arr[pos+1..r], k-pos+l-1)

In the above algorithm, the last 3 steps are the same as in the algorithm of the previous post. The first four steps are used to obtain a good point for partitioning the array (to make sure that there are not too many elements on either side of the pivot).
Following is a C++ implementation of the above algorithm.
// C++ implementation of worst case linear time algorithm
// to find k'th smallest element
#include<iostream>
#include<algorithm>
#include<climits>
using namespace std;
int partition(int arr[], int l, int r, int k);
// A simple function to find median of arr[]. This is called
// only for an array of size 5 in this program.
int findMedian(int arr[], int n)
{
sort(arr, arr+n); // Sort the array
return arr[n/2]; // Return middle element
}
// Returns k'th smallest element in arr[l..r] in worst case
// linear time. ASSUMPTION: ALL ELEMENTS IN ARR[] ARE DISTINCT
int kthSmallest(int arr[], int l, int r, int k)
{
// If k is smaller than number of elements in array
if (k > 0 && k <= r - l + 1)
{
int n = r-l+1; // Number of elements in arr[l..r]
// Divide arr[] in groups of size 5, calculate median
// of every group and store it in median[] array.
int i, median[(n+4)/5]; // There will be floor((n+4)/5) groups
for (i=0; i<n/5; i++)
median[i] = findMedian(arr+l+i*5, 5);
if (i*5 < n) //For last group with less than 5 elements
{
median[i] = findMedian(arr+l+i*5, n%5);
i++;
}
// Find median of all medians using recursive call.
// If median[] has only one element, then no need
// of recursive call
int medOfMed = (i == 1)? median[i-1]:
kthSmallest(median, 0, i-1, i/2);
// Partition the array around medOfMed and
// get position of pivot element in sorted array
int pos = partition(arr, l, r, medOfMed);
// If position is same as k
if (pos-l == k-1)
return arr[pos];
if (pos-l > k-1) // If position is more, recur for left
return kthSmallest(arr, l, pos-1, k);
// Else recur for right subarray
return kthSmallest(arr, pos+1, r, k-pos+l-1);
}
// If k is more than number of elements in array
return INT_MAX;
}
void swap(int *a, int *b)
{
int temp = *a;
*a = *b;
*b = temp;
}
// It searches for x in arr[l..r], and partitions the array
// around x.
int partition(int arr[], int l, int r, int x)
{
// Search for x in arr[l..r] and move it to end
int i;
for (i=l; i<r; i++)
if (arr[i] == x)
break;
swap(&arr[i], &arr[r]);
// Standard partition algorithm
i = l;
for (int j = l; j <= r - 1; j++)
{
if (arr[j] <= x)
{
swap(&arr[i], &arr[j]);
i++;
}
}
swap(&arr[i], &arr[r]);
return i;
}
// Driver program to test above methods
int main()
{
int arr[] = {12, 3, 5, 7, 4, 19, 26};
int n = sizeof(arr)/sizeof(arr[0]), k = 3;
cout << "K'th smallest element is "
<< kthSmallest(arr, 0, n-1, k);
return 0;
}

Output:
K'th smallest element is 5

Time Complexity:

The worst case time complexity of the above algorithm is O(n). Let us analyze all steps.
Steps 1) and 2) take O(n) time, as finding the median of an array of size 5 takes O(1) time and there are n/5 arrays of size 5.
Step 3) takes T(n/5) time. Step 4) is the standard partition and takes O(n) time.
The interesting steps are 6) and 7). At most one of them is executed. These are recursive steps. What is the worst case size of these recursive calls? The answer is the maximum number of elements greater than medOfMed (obtained in step 3) or the maximum number of elements smaller than medOfMed.
How many elements are greater than medOfMed and how many are smaller?
At least half of the medians found in step 2 are greater than or equal to medOfMed. Thus, at least half of the ⌈n/5⌉ groups contribute 3 elements that are greater than medOfMed, except for the one group that has fewer than 5 elements and the one group that contains medOfMed itself. Therefore, the number of elements greater than medOfMed is at least

3(⌈(1/2)⌈n/5⌉⌉ - 2) >= 3n/10 - 6.

Similarly, the number of elements that are less than medOfMed is at least 3n/10 - 6. In the worst case, the function recurs for at most n - (3n/10 - 6), which is 7n/10 + 6 elements.
Note that 7n/10 + 6 < n for n > 20 and that any input of 80 or fewer elements requires O(1) time. We can therefore obtain the recurrence

T(n) <= Θ(1) if n <= 80
T(n) <= T(⌈n/5⌉) + T(7n/10 + 6) + O(n) if n > 80

We show that the running time is linear by substitution. We claim that T(n) <= cn for some constant c. The bound holds for n <= 80 if c is chosen large enough. For n > 80, substituting the inductive hypothesis into the right-hand side of the recurrence yields
T(n) <= c⌈n/5⌉ + c(7n/10 + 6) + O(n)
<= cn/5 + c + 7cn/10 + 6c + O(n)
<= 9cn/10 + 7c + O(n)
<= cn,

since we can pick c large enough so that c(n/10 - 7) is larger than the function described by the O(n) term for all n > 80. The worst-case running time of the algorithm is therefore linear (Source: http://staff.ustc.edu.cn/~csli/graduate/algorithms/book6/chap10.htm ).
Note that the above algorithm is linear in the worst case, but its constants are very high. Therefore, this algorithm doesn't work well in practical situations; randomized quickSelect works much better and is preferred.
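For everyday use, the C++ standard library already provides a selection routine, std::nth_element, whose complexity is specified as linear on average. The sketch below is only an illustration of how it could stand in for a hand-written quickSelect; the wrapper name kthSmallestStd is an assumption and is not part of the article's code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the k'th smallest element (k is 1-based). The vector is taken
// by value, so the caller's data is not reordered.
int kthSmallestStd(std::vector<int> v, int k)
{
    assert(k >= 1 && k <= static_cast<int>(v.size()));
    // After this call, v[k-1] holds the element that would be at
    // index k-1 if v were fully sorted.
    std::nth_element(v.begin(), v.begin() + (k - 1), v.end());
    return v[k - 1];
}
```

With the article's example array {7, 10, 4, 3, 20, 15}, calling kthSmallestStd with k = 3 gives 7 and with k = 4 gives 10.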
Sources:
MIT Video Lecture on Order Statistics, Median
Introduction to Algorithms by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L.
http://staff.ustc.edu.cn/~csli/graduate/algorithms/book6/chap10.htm

Find the closest pair from two sorted arrays


Given two sorted arrays and a number x, find the pair whose sum is closest to x, where the pair has one element from each array.
We are given two arrays ar1[0..m-1] and ar2[0..n-1] and a number x; we need to find the pair ar1[i] + ar2[j] such that the absolute value of (ar1[i] + ar2[j] - x) is minimum.
Example:
Input: ar1[] = {1, 4, 5, 7};
ar2[] = {10, 20, 30, 40};
x = 32
Output: 1 and 30
Input: ar1[] = {1, 4, 5, 7};
ar2[] = {10, 20, 30, 40};
x = 50
Output: 7 and 40

A Simple Solution is to run two loops. The outer loop considers every element of first array and inner loop checks for the pair in second array.
We keep track of minimum difference between ar1[i] + ar2[j] and x.
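The two-loop idea can be sketched as follows. This is an illustrative O(m*n) version and not code from the article: the function name closestPairBrute and the use of std::vector and std::pair are assumptions. It is mostly useful as a reference to check faster solutions against.

```cpp
#include <cassert>
#include <cstdlib>
#include <utility>
#include <vector>

// Checks every (i, j) combination: O(m*n) time, O(1) extra space.
// Returns the values {ar1[i], ar2[j]} whose sum is closest to x.
std::pair<int, int> closestPairBrute(const std::vector<int>& ar1,
                                     const std::vector<int>& ar2, int x)
{
    assert(!ar1.empty() && !ar2.empty());
    std::pair<int, int> best{ar1[0], ar2[0]};
    int diff = std::abs(ar1[0] + ar2[0] - x);
    for (int a : ar1)
        for (int b : ar2)
            if (std::abs(a + b - x) < diff)
            {
                // Strictly closer pair found: remember it
                diff = std::abs(a + b - x);
                best = {a, b};
            }
    return best;
}
```

Note that this version does not need the arrays to be sorted, which is exactly the information the faster approach exploits.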
We can do it in O(m + n) time using the following steps.
1) Merge the two given arrays into an auxiliary array of size m+n using the merge process of merge sort. While merging, keep another boolean array of size m+n to indicate whether the current element in the merged array is from ar1[] or ar2[].
2) Consider the merged array and use the linear time algorithm to find the pair with sum closest to x. One extra condition is that we consider only those pairs which have one element from ar1[] and the other from ar2[]; we use the boolean array for this purpose.
Can we do it in a single pass and O(1) extra space?
The idea is to start from the left side of one array and the right side of the other array, and use the same algorithm as in step 2 of the above approach. Following is the detailed algorithm.
1) Initialize a variable diff as infinite (diff is used to store the
difference between pair sum and x). We need to find the minimum diff.
2) Initialize two index variables l and r in the given sorted arrays.
(a) Initialize first to the leftmost index in ar1: l = 0
(b) Initialize second to the rightmost index in ar2: r = n-1
3) Loop while l < m and r >= 0
(a) If abs(ar1[l] + ar2[r] - x) < diff then
update diff and result
(b) If (ar1[l] + ar2[r] > x) then r--
(c) Else l++
4) Print the result.

Following is C++ implementation of this approach.


// C++ program to find the pair from two sorted arrays such
// that the sum of pair is closest to a given number x
#include <iostream>
#include <climits>
#include <cstdlib>
using namespace std;
// ar1[0..m-1] and ar2[0..n-1] are two given sorted arrays
// and x is given number. This function prints the pair from
// both arrays such that the sum of the pair is closest to x.
void printClosest(int ar1[], int ar2[], int m, int n, int x)
{
// Initialize the diff between pair sum and x.
int diff = INT_MAX;
// res_l and res_r are result indexes from ar1[] and ar2[]
// respectively (initialized so they are never read unset)
int res_l = 0, res_r = 0;
// Start from left side of ar1[] and right side of ar2[]
int l = 0, r = n-1;
while (l<m && r>=0)
{
// If this pair is closer to x than the previously
// found closest, then update res_l, res_r and diff
if (abs(ar1[l] + ar2[r] - x) < diff)
{
res_l = l;
res_r = r;
diff = abs(ar1[l] + ar2[r] - x);

}
// If sum of this pair is more than x, move to smaller
// side
if (ar1[l] + ar2[r] > x)
r--;
else // move to the greater side
l++;
}
// Print the result
cout << "The closest pair is [" << ar1[res_l] << ", "
<< ar2[res_r] << "] \n";
}
// Driver program to test above functions
int main()
{
int ar1[] = {1, 4, 5, 7};
int ar2[] = {10, 20, 30, 40};
int m = sizeof(ar1)/sizeof(ar1[0]);
int n = sizeof(ar2)/sizeof(ar2[0]);
int x = 38;
printClosest(ar1, ar2, m, n, x);
return 0;
}

Output:
The closest pair is [7, 30]

Find common elements in three sorted arrays


Given three arrays sorted in non-decreasing order, print all common elements in these arrays.
Examples:
ar1[] = {1, 5, 10, 20, 40, 80}
ar2[] = {6, 7, 20, 80, 100}
ar3[] = {3, 4, 15, 20, 30, 70, 80, 120}
Output: 20, 80

ar1[] = {1, 5, 5}
ar2[] = {3, 4, 5, 5, 10}
ar3[] = {5, 5, 10, 20}
Output: 5, 5

A simple solution is to first find intersection of two arrays and store the intersection in a temporary array, then find the intersection of third array
and temporary array. Time complexity of this solution is O(n1 + n2 + n3) where n1, n2 and n3 are sizes of ar1[], ar2[] and ar3[] respectively.
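The two-step approach with a temporary array could look like the sketch below. It relies on std::set_intersection from the standard library, which expects sorted input and keeps repeated values as often as they occur in both ranges. The function name commonTwoStep and the use of std::vector are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Intersects ar1 with ar2 into a temporary vector, then intersects the
// temporary vector with ar3. All inputs must be sorted (non-decreasing).
std::vector<int> commonTwoStep(const std::vector<int>& ar1,
                               const std::vector<int>& ar2,
                               const std::vector<int>& ar3)
{
    assert(std::is_sorted(ar1.begin(), ar1.end()));
    assert(std::is_sorted(ar2.begin(), ar2.end()));
    assert(std::is_sorted(ar3.begin(), ar3.end()));

    std::vector<int> temp, result;
    std::set_intersection(ar1.begin(), ar1.end(), ar2.begin(), ar2.end(),
                          std::back_inserter(temp));
    std::set_intersection(temp.begin(), temp.end(), ar3.begin(), ar3.end(),
                          std::back_inserter(result));
    return result;
}
```

The vector temp is the extra space this approach needs, and each of the two calls is one pass over its inputs.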
The above solution requires extra space and two loops. We can find the common elements using a single loop and without extra space. The idea is similar to the intersection of two arrays: as in the two-array loop, we run one loop and traverse all three arrays.
Let the current element traversed in ar1[] be x, in ar2[] be y and in ar3[] be z. We can have the following cases inside the loop.
1) If x, y and z are same, we can simply print any of them as common element and move ahead in all three arrays.
2) Else if x < y, we can move ahead in ar1[] as x cannot be a common element.
3) Else if y < z, we can move ahead in ar2[] as y cannot be a common element.
4) Else (we reach here when x >= y and y >= z, and the three are not all equal), we can simply move ahead in ar3[] as z cannot be a common element.
Following are implementations of the above idea.

C++
// C++ program to print common elements in three arrays
#include <iostream>
using namespace std;
// This function prints common elements in ar1[], ar2[] and ar3[]
void findCommon(int ar1[], int ar2[], int ar3[], int n1, int n2, int n3)
{
// Initialize starting indexes for ar1[], ar2[] and ar3[]
int i = 0, j = 0, k = 0;
// Iterate through three arrays while all arrays have elements
while (i < n1 && j < n2 && k < n3)
{
// If x = y and y = z, print any of them and move ahead
// in all arrays
if (ar1[i] == ar2[j] && ar2[j] == ar3[k])
{ cout << ar1[i] << " "; i++; j++; k++; }
// x < y
else if (ar1[i] < ar2[j])
i++;
// y < z
else if (ar2[j] < ar3[k])
j++;
// We reach here when x >= y and z <= y, i.e., z is smallest
else
k++;
}
}
// Driver program to test above function
int main()
{
int ar1[] = {1, 5, 10, 20, 40, 80};
int ar2[] = {6, 7, 20, 80, 100};
int ar3[] = {3, 4, 15, 20, 30, 70, 80, 120};
int n1 = sizeof(ar1)/sizeof(ar1[0]);
int n2 = sizeof(ar2)/sizeof(ar2[0]);
int n3 = sizeof(ar3)/sizeof(ar3[0]);
cout << "Common Elements are ";
findCommon(ar1, ar2, ar3, n1, n2, n3);
return 0;
}

Python
# Python 3 function to print common elements in three sorted arrays
def findCommon(ar1, ar2, ar3, n1, n2, n3):
    # Initialize starting indexes for ar1[], ar2[] and ar3[]
    i, j, k = 0, 0, 0
    # Iterate through three arrays while all arrays have elements
    while i < n1 and j < n2 and k < n3:
        # If x = y and y = z, print any of them and move ahead
        # in all arrays
        if ar1[i] == ar2[j] and ar2[j] == ar3[k]:
            print(ar1[i], end=" ")
            i += 1
            j += 1
            k += 1
        # x < y
        elif ar1[i] < ar2[j]:
            i += 1
        # y < z
        elif ar2[j] < ar3[k]:
            j += 1
        # We reach here when x >= y and z <= y, i.e., z is smallest
        else:
            k += 1

# Driver program to check above function
ar1 = [1, 5, 10, 20, 40, 80]
ar2 = [6, 7, 20, 80, 100]
ar3 = [3, 4, 15, 20, 30, 70, 80, 120]
n1 = len(ar1)
n2 = len(ar2)
n3 = len(ar3)
print("Common elements are", end=" ")
findCommon(ar1, ar2, ar3, n1, n2, n3)
# This code is contributed by __Devesh Agrawal__

Output:
Common Elements are 20 80

Time complexity of the above solution is O(n1 + n2 + n3). In worst case, the largest sized array may have all small elements and middle sized array
has all middle elements.

Given a sorted array and a number x, find the pair in array whose sum is closest to x
Given a sorted array and a number x, find a pair in array whose sum is closest to x.
Examples:
Input: arr[] = {10, 22, 28, 29, 30, 40}, x = 54
Output: 22 and 30
Input: arr[] = {1, 3, 4, 7, 10}, x = 15
Output: 4 and 10

A simple solution is to consider every pair and keep track of the closest pair (the one for which the absolute difference between pair sum and x is minimum). Finally, print the closest pair. Time complexity of this solution is O(n^2).
An efficient solution can find the pair in O(n) time. The idea is similar to method 2 of this post. Following is the detailed algorithm.
1) Initialize a variable diff as infinite (diff is used to store the
difference between pair sum and x). We need to find the minimum diff.
2) Initialize two index variables l and r in the given sorted array.
(a) Initialize first to the leftmost index: l = 0
(b) Initialize second to the rightmost index: r = n-1
3) Loop while l < r.
(a) If abs(arr[l] + arr[r] - x) < diff then
update diff and result
(b) If (arr[l] + arr[r] > x) then r--
(c) Else l++

Following is C++ implementation of above algorithm.

C++
// Simple C++ program to find the pair with sum closest to a given no.
#include <iostream>
#include <climits>
#include <cstdlib>
using namespace std;
// Prints the pair with sum closest to x
void printClosest(int arr[], int n, int x)
{
int res_l = 0, res_r = 0; // To store indexes of result pair
// Initialize left and right indexes and difference between
// pair sum and x
int l = 0, r = n-1, diff = INT_MAX;
// While there are elements between l and r
while (r > l)
{
// Check if this pair is closer than the closest pair so far
if (abs(arr[l] + arr[r] - x) < diff)
{
res_l = l;
res_r = r;
diff = abs(arr[l] + arr[r] - x);
}
// If this pair has more sum, move to smaller values.
if (arr[l] + arr[r] > x)
r--;
else // Move to larger values
l++;
}
cout <<" The closest pair is " << arr[res_l] << " and " << arr[res_r];
}
// Driver program to test above functions
int main()
{
int arr[] = {10, 22, 28, 29, 30, 40}, x = 54;
int n = sizeof(arr)/sizeof(arr[0]);
printClosest(arr, n, x);
return 0;
}

Java
// Java program to find pair with sum closest to x
import java.io.*;
import java.util.*;
import java.lang.Math;
class CloseSum {
// Prints the pair with sum closest to x
static void printClosest(int arr[], int n, int x)
{
int res_l=0, res_r=0; // To store indexes of result pair
// Initialize left and right indexes and difference between
// pair sum and x
int l = 0, r = n-1, diff = Integer.MAX_VALUE;
// While there are elements between l and r
while (r > l)
{
// Check if this pair is closer than the closest pair so far
if (Math.abs(arr[l] + arr[r] - x) < diff)
{
res_l = l;
res_r = r;
diff = Math.abs(arr[l] + arr[r] - x);
}
// If this pair has more sum, move to smaller values.
if (arr[l] + arr[r] > x)
r--;
else // Move to larger values
l++;
}
System.out.println(" The closest pair is "+arr[res_l]+" and "+ arr[res_r]);
}
// Driver program to test above function
public static void main(String[] args)
{
int arr[] = {10, 22, 28, 29, 30, 40}, x = 54;
int n = arr.length;
printClosest(arr, n, x);
}
}
/*This code is contributed by Devesh Agrawal*/

Output:
The closest pair is 22 and 30

Count 1s in a sorted binary array


Given a binary array sorted in non-increasing order, count the number of 1s in it.
Examples:
Input: arr[] = {1, 1, 0, 0, 0, 0, 0}
Output: 2
Input: arr[] = {1, 1, 1, 1, 1, 1, 1}
Output: 7
Input: arr[] = {0, 0, 0, 0, 0, 0, 0}
Output: 0

A simple solution is to linearly traverse the array. The time complexity of the simple solution is O(n). We can use Binary Search to find the count in O(Logn) time. The idea is to look for the last occurrence of 1 using Binary Search. Once we find the index of the last occurrence, we return index + 1 as the count.
Following are implementations of the above idea.

C++
// C++ program to count one's in a boolean array
#include <iostream>
using namespace std;
/* Returns counts of 1's in arr[low..high]. The array is
assumed to be sorted in non-increasing order */
int countOnes(bool arr[], int low, int high)
{
if (high >= low)
{
// get the middle index
int mid = low + (high - low)/2;
// check if the element at middle index is last 1
if ( (mid == high || arr[mid+1] == 0) && (arr[mid] == 1))
return mid+1;
// If element is not last 1, recur for right side
if (arr[mid] == 1)
return countOnes(arr, (mid + 1), high);
// else recur for left side
return countOnes(arr, low, (mid -1));
}
return 0;
}
/* Driver program to test above functions */
int main()
{
bool arr[] = {1, 1, 1, 1, 0, 0, 0};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "Count of 1's in given array is " << countOnes(arr, 0, n-1);
return 0;
}

Python
# Python 3 program to count one's in a boolean array
# Returns counts of 1's in arr[low..high]. The array is
# assumed to be sorted in non-increasing order
def countOnes(arr, low, high):
    if high >= low:
        # get the middle index (integer division)
        mid = low + (high - low) // 2
        # check if the element at middle index is last 1
        if (mid == high or arr[mid + 1] == 0) and arr[mid] == 1:
            return mid + 1
        # If element is not last 1, recur for right side
        if arr[mid] == 1:
            return countOnes(arr, mid + 1, high)
        # else recur for left side
        return countOnes(arr, low, mid - 1)
    return 0

# Driver function
arr = [1, 1, 1, 1, 0, 0, 0]
print("Count of 1's in given array is", countOnes(arr, 0, len(arr) - 1))
# This code is contributed by __Devesh Agrawal__

Output:
Count of 1's in given array is 4

Time complexity of the above solution is O(Logn)
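The same binary search can also be written without recursion using std::partition_point from the standard library, which finds the first element that fails a predicate in a range where all passing elements come first. This is a sketch under the assumption that the input is a std::vector<int> of 0s and 1s; the name countOnesIter is illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Counts the 1s in a binary array sorted in non-increasing order
// (all 1s before all 0s) using O(Logn) comparisons.
int countOnesIter(const std::vector<int>& arr)
{
    // Precondition: no element is larger than the one before it.
    assert(std::is_sorted(arr.begin(), arr.end(), std::greater<int>()));
    // Iterator to the first 0, or end() if the array holds only 1s.
    auto firstZero = std::partition_point(arr.begin(), arr.end(),
                                          [](int v) { return v == 1; });
    return static_cast<int>(firstZero - arr.begin());
}
```

The index of the first 0 equals the number of 1s, so no separate "index + 1" adjustment is needed, and an empty array yields 0.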

Binary Insertion Sort


We can use binary search to reduce the number of comparisons in normal insertion sort. Binary Insertion Sort uses binary search to find the proper location to insert the selected item at each iteration.
In normal insertion sort, finding that location takes O(i) comparisons (at the ith iteration) in the worst case. We can reduce it to O(log i) by using binary search.
// C program for implementation of binary insertion sort
#include <stdio.h>
// A binary search based function to find the position
// where item should be inserted in a[low..high]
int binarySearch(int a[], int item, int low, int high)
{
if (high <= low)
return (item > a[low])? (low + 1): low;
int mid = (low + high)/2;
if(item == a[mid])
return mid+1;
if(item > a[mid])
return binarySearch(a, item, mid+1, high);
return binarySearch(a, item, low, mid-1);
}
// Function to sort an array a[] of size 'n'
void insertionSort(int a[], int n)
{
int i, loc, j, selected;
for (i = 1; i < n; ++i)
{
j = i - 1;
selected = a[i];
// find location where selected should be inserted
loc = binarySearch(a, selected, 0, j);
// Move all elements after location to create space
while (j >= loc)
{
a[j+1] = a[j];
j--;
}
a[j+1] = selected;
}
}
// Driver program to test above function
int main()
{
int a[] = {37, 23, 0, 17, 12, 72, 31,
46, 100, 88, 54};
int n = sizeof(a)/sizeof(a[0]), i;
insertionSort(a, n);
printf("Sorted array: \n");
for (i = 0; i < n; i++)
printf("%d ",a[i]);
return 0;
}

Output:
Sorted array:
0 12 17 23 31 37 46 54 72 88 100

Time Complexity: The algorithm as a whole still has a worst case running time of O(n^2) because of the series of shifts required for each insertion.
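The split between cheap searching and expensive shifting can be seen in a compact sketch built from two standard library calls: std::upper_bound does the binary search and std::rotate does the shifting. This is an alternative formulation and not the article's C code; the use of std::vector and iterators is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Binary insertion sort: O(log i) comparisons but O(i) element moves
// for the i'th item, so the worst case stays O(n^2).
void binaryInsertionSort(std::vector<int>& a)
{
    for (auto it = a.begin(); it != a.end(); ++it)
    {
        // Binary search in the sorted prefix [begin, it) for the first
        // element greater than *it (keeps equal items in input order).
        auto loc = std::upper_bound(a.begin(), it, *it);
        // Shift [loc, it) one slot to the right and move *it into loc.
        std::rotate(loc, it, it + 1);
    }
    assert(std::is_sorted(a.begin(), a.end()));
}
```

Using upper_bound rather than lower_bound places an item after any equal items already in the prefix, which keeps the sort stable.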

Insertion Sort for Singly Linked List


We have discussed Insertion Sort for arrays. In this article, the same is discussed for a linked list.
Below is a simple insertion sort algorithm for a linked list.
1) Create an empty sorted (or result) list
2) Traverse the given list, do following for every node.
......a) Insert current node in sorted way in sorted or result list.
3) Change head of given linked list to head of sorted (or result) list.

The main step is (2.a) which has been covered in below post.
Sorted Insert for Singly Linked List
Below is C implementation of above algorithm
/* C program for insertion sort on a linked list */
#include<stdio.h>
#include<stdlib.h>
/* Link list node */
struct node
{
int data;
struct node* next;
};
// Function to insert a given node in a sorted linked list
void sortedInsert(struct node**, struct node*);
// function to sort a singly linked list using insertion sort
void insertionSort(struct node **head_ref)
{
// Initialize sorted linked list
struct node *sorted = NULL;
// Traverse the given linked list and insert every
// node to sorted
struct node *current = *head_ref;
while (current != NULL)
{
// Store next for next iteration
struct node *next = current->next;
// insert current in sorted linked list
sortedInsert(&sorted, current);
// Update current
current = next;
}
// Update head_ref to point to sorted linked list
*head_ref = sorted;
}
/* function to insert a new_node in a list. Note that this
function expects a pointer to head_ref as this can modify the
head of the input linked list (similar to push())*/
void sortedInsert(struct node** head_ref, struct node* new_node)
{
struct node* current;
/* Special case for the head end */
if (*head_ref == NULL || (*head_ref)->data >= new_node->data)
{
new_node->next = *head_ref;
*head_ref = new_node;
}
else
{
/* Locate the node before the point of insertion */
current = *head_ref;
while (current->next!=NULL &&
current->next->data < new_node->data)
{
current = current->next;
}
new_node->next = current->next;
current->next = new_node;

}
}
/* BELOW FUNCTIONS ARE JUST UTILITY TO TEST sortedInsert */
/* A utility function to create a new node */
struct node *newNode(int new_data)
{
/* allocate node */
struct node* new_node =
(struct node*) malloc(sizeof(struct node));
/* put in the data */
new_node->data = new_data;
new_node->next = NULL;
return new_node;
}
/* Function to print linked list */
void printList(struct node *head)
{
struct node *temp = head;
while(temp != NULL)
{
printf("%d ", temp->data);
temp = temp->next;
}
}
/* A utility function to insert a node at the beginning of linked list */
void push(struct node** head_ref, int new_data)
{
/* allocate node (malloc, since this is a C program) */
struct node* new_node =
(struct node*) malloc(sizeof(struct node));
/* put in the data */
new_node->data = new_data;
/* link the old list off the new node */
new_node->next = (*head_ref);
/* move the head to point to the new node */
(*head_ref) = new_node;
}
// Driver program to test above functions
int main()
{
struct node *a = NULL;
push(&a, 5);
push(&a, 20);
push(&a, 4);
push(&a, 3);
push(&a, 30);
printf("Linked List before sorting \n");
printList(a);
insertionSort(&a);
printf("\nLinked List after sorting \n");
printList(a);
return 0;
}
Output:
Linked List before sorting
30 3 4 20 5
Linked List after sorting
3 4 5 20 30

Why is Quick Sort preferred for Arrays and Merge Sort for Linked Lists?
Why is Quick Sort preferred for arrays?
Below are recursive and iterative implementations of Quick Sort and Merge Sort for arrays.
Recursive Quick Sort for array.
Iterative Quick Sort for arrays.
Recursive Merge Sort for arrays
Iterative Merge Sort for arrays
Quick Sort in its general form is an in-place sort (i.e. it doesn't require any auxiliary array), whereas merge sort requires O(N) extra storage, N denoting the array size, which may be quite expensive. Allocating and de-allocating the extra space used for merge sort increases the running time of the algorithm. Comparing average complexity, we find that both sorts have O(NlogN) average complexity, but the constants differ. For arrays, merge sort loses due to the use of extra O(N) storage space.
Most practical implementations of Quick Sort use the randomized version. The randomized version has expected time complexity of O(nLogn). The worst case is still possible in the randomized version, but it doesn't occur for a particular input pattern (like a sorted array), and randomized Quick Sort works well in practice.
Quick Sort is also a cache friendly sorting algorithm, as it has good locality of reference when used for arrays.
The second recursive call in Quick Sort is a tail call, so tail call optimization can be applied.
Why is Merge Sort preferred for Linked Lists?
Below are implementations of Quicksort and Mergesort for singly and doubly linked lists.
Quick Sort for Doubly Linked List
Quick Sort for Singly Linked List
Merge Sort for Singly Linked List
Merge Sort for Doubly Linked List
For linked lists the situation is different, mainly due to the difference in memory allocation of arrays and linked lists. Unlike arrays, linked list nodes may not be adjacent in memory. Also unlike arrays, in a linked list we can insert items in the middle in O(1) extra space and O(1) time (given a pointer to the position). Therefore, the merge operation of merge sort can be implemented without extra space for linked lists.
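To make the "no extra space" point concrete, here is a small sketch of merging two sorted singly linked lists purely by relinking nodes. The Node type and the function name mergeSorted are assumptions for illustration and are separate from the implementations listed above.

```cpp
#include <cassert>

struct Node
{
    int data;
    Node* next;
};

// Merges two sorted lists by relinking their nodes. No node is
// allocated or copied, so the extra space is O(1).
Node* mergeSorted(Node* a, Node* b)
{
    Node dummy{0, nullptr};   // stack-allocated placeholder head
    Node* tail = &dummy;
    while (a && b)
    {
        // Reference to whichever list head is smaller; ties take from a
        Node*& smaller = (b->data < a->data) ? b : a;
        // Inputs are required to be sorted
        assert(!smaller->next || smaller->data <= smaller->next->data);
        tail->next = smaller;        // append the node to the result
        tail = smaller;
        smaller = smaller->next;     // advance that input list
    }
    tail->next = a ? a : b;   // append whichever list still has nodes
    return dummy.next;
}
```

Merging two arrays the same way would need an output buffer of size m+n, which is where the O(N) extra storage of array merge sort comes from.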
In arrays, we can do random access because elements are contiguous in memory. Let us say we have an integer (4-byte) array A and let the address of A[0] be x; then to access A[i], we can directly access the memory at (x + i*4). Unlike arrays, we cannot do random access in a linked list. Quick Sort requires a lot of this kind of access. In a linked list, to access the ith index we have to travel through every node from the head to the ith node, as we don't have a contiguous block of memory. Therefore, the overhead increases for quick sort. Merge sort accesses data sequentially, and the need for random access is low.
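The difference in access cost can be illustrated with two tiny helper functions. Both names (arrayAt, listAt) and the ListNode type are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <cstddef>

struct ListNode
{
    int data;
    ListNode* next;
};

// Array: a single address computation, base + i * sizeof(int),
// no matter how large i is.
int arrayAt(const int* A, std::size_t i)
{
    return *(A + i);
}

// Linked list: i links must be followed from the head, so the cost
// of reaching the i'th node grows linearly with i.
int listAt(const ListNode* head, std::size_t i)
{
    while (i-- > 0)
    {
        assert(head != nullptr);   // i must lie within the list
        head = head->next;
    }
    assert(head != nullptr);
    return head->data;
}
```

A partition step that jumps between both ends of the range pays the arrayAt cost on arrays but the listAt cost on linked lists.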

Merge Sort for Doubly Linked List


Given a doubly linked list, write a function to sort the doubly linked list in increasing order using merge sort.
For example, the following doubly linked list should be changed to 2<->4<->8<->10

Below is C implementation of merge sort for doubly linked list.


// C program for merge sort on doubly linked list
#include<stdio.h>
#include<stdlib.h>
struct node
{
int data;
struct node *next, *prev;
};
struct node *split(struct node *head);
// Function to merge two linked lists
struct node *merge(struct node *first, struct node *second)
{
// If first linked list is empty
if (!first)
return second;
// If second linked list is empty
if (!second)
return first;
// Pick the smaller value
if (first->data < second->data)
{
first->next = merge(first->next,second);
first->next->prev = first;
first->prev = NULL;
return first;
}
else
{
second->next = merge(first,second->next);
second->next->prev = second;
second->prev = NULL;
return second;
}
}
// Function to do merge sort
struct node *mergeSort(struct node *head)
{
if (!head || !head->next)
return head;
struct node *second = split(head);
// Recur for left and right halves
head = mergeSort(head);
second = mergeSort(second);
// Merge the two sorted halves
return merge(head,second);
}
// A utility function to insert a new node at the
// beginning of doubly linked list
void insert(struct node **head, int data)
{
struct node *temp =
(struct node *)malloc(sizeof(struct node));
temp->data = data;
temp->next = temp->prev = NULL;
if (!(*head))
(*head) = temp;
else
{
temp->next = *head;
(*head)->prev = temp;
(*head) = temp;
}
}
// A utility function to print a doubly linked list in
// both forward and backward directions
void print(struct node *head)
{
struct node *temp = head;
printf("Forward Traversal using next pointer\n");
while (head)
{
printf("%d ",head->data);
temp = head;
head = head->next;
}
printf("\nBackward Traversal using prev pointer\n");
while (temp)
{
printf("%d ", temp->data);
temp = temp->prev;
}
}
// Utility function to swap two integers
void swap(int *A, int *B)
{
int temp = *A;
*A = *B;
*B = temp;
}
// Split a doubly linked list (DLL) into 2 DLLs of
// half sizes
struct node *split(struct node *head)
{
struct node *fast = head,*slow = head;
while (fast->next && fast->next->next)
{
fast = fast->next->next;
slow = slow->next;
}
struct node *temp = slow->next;
slow->next = NULL;
return temp;
}
// Driver program
int main(void)
{
struct node *head = NULL;
insert(&head,5);
insert(&head,20);
insert(&head,4);
insert(&head,3);
insert(&head,30);
insert(&head,10);
printf("Linked List before sorting\n");
print(head);
head = mergeSort(head);
printf("\n\nLinked List after sorting\n");
print(head);
return 0;
}

Output:
Linked List before sorting
Forward Traversal using next pointer
10 30 3 4 20 5
Backward Traversal using prev pointer
5 20 4 3 30 10
Linked List after sorting
Forward Traversal using next pointer
3 4 5 10 20 30
Backward Traversal using prev pointer
30 20 10 5 4 3

Thanks to Goku for providing above implementation in a comment here.


Time Complexity: The time complexity of the above implementation is the same as that of merge sort for arrays. It takes Θ(n log n) time.
You may also like to see QuickSort for doubly linked list

Greedy Algorithms | Set 1 (Activity Selection Problem)


Greedy is an algorithmic paradigm that builds up a solution piece by piece, always choosing the next piece that offers the most obvious and
immediate benefit. Greedy algorithms are used for optimization problems. An optimization problem can be solved using Greedy if the problem has
the following property: at every step, we can make the choice that looks best at the moment, and still arrive at an optimal solution of the complete
problem.
If a Greedy algorithm can solve a problem, it is generally the best method for that problem, because Greedy algorithms are in
general more efficient than other techniques like Dynamic Programming. But Greedy algorithms cannot always be applied. For example, the Fractional
Knapsack problem can be solved using Greedy, but 0-1 Knapsack cannot.
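The difference between the two knapsack variants can be seen on a small instance. The following Python sketch is only an illustration: the item values, weights and capacity are made-up numbers, and the function names are hypothetical. It applies the same greedy rule (best value-to-weight ratio first) to both variants and compares the 0-1 result with an exhaustive search.

```python
from itertools import combinations


def fractional_knapsack(values, weights, capacity):
    """Greedy by value/weight ratio; a fraction of an item may be taken."""
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    total = 0.0
    for value, weight in items:
        if capacity <= 0:
            break
        take = min(weight, capacity)
        total += value * take / weight
        capacity -= take
    return total


def greedy_01_knapsack(values, weights, capacity):
    """Same greedy order, but every item is taken whole or skipped."""
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    total = 0
    for value, weight in items:
        if weight <= capacity:
            total += value
            capacity -= weight
    return total


def best_01_knapsack(values, weights, capacity):
    """Exhaustive search over all subsets; only sensible for tiny inputs."""
    best = 0
    indices = range(len(values))
    for r in range(len(values) + 1):
        for subset in combinations(indices, r):
            if sum(weights[k] for k in subset) <= capacity:
                best = max(best, sum(values[k] for k in subset))
    return best


values, weights, capacity = [60, 100, 120], [10, 20, 30], 50
print(fractional_knapsack(values, weights, capacity))  # 240.0
print(greedy_01_knapsack(values, weights, capacity))   # 160
print(best_01_knapsack(values, weights, capacity))     # 220
```

On this instance the greedy rule is optimal for the fractional variant but leaves capacity unused in the 0-1 variant, where the best choice is the two heavier items.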
Following are some standard algorithms that are Greedy algorithms.
1) Kruskal's Minimum Spanning Tree (MST): In Kruskal's algorithm, we create an MST by picking edges one by one. The Greedy Choice is to
pick the smallest weight edge that doesn't cause a cycle in the MST constructed so far.
2) Prim's Minimum Spanning Tree: In Prim's algorithm also, we create an MST by picking edges one by one. We maintain two sets: the set of
vertices already included in the MST and the set of vertices not yet included. The Greedy Choice is to pick the smallest weight edge that connects
the two sets.
3) Dijkstra's Shortest Path: Dijkstra's algorithm is very similar to Prim's algorithm. The shortest path tree is built up, edge by edge. We
maintain two sets: the set of vertices already included in the tree and the set of vertices not yet included. The Greedy Choice is to pick the edge
that connects the two sets and is on the smallest weight path from the source to the set of not yet included vertices.
4) Huffman Coding: Huffman Coding is a lossless compression technique. It assigns variable length bit codes to different characters. The
Greedy Choice is to assign the shortest code to the most frequent character.
Greedy algorithms are sometimes also used to get an approximation for hard optimization problems. For example, the Traveling Salesman
Problem is an NP-hard problem. A Greedy choice for this problem is to pick the nearest unvisited city from the current city at every step. This
approach doesn't always produce the optimal solution, but it can be used to get an approximately optimal solution.
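The nearest-city rule for the Traveling Salesman Problem can be sketched in a few lines of Python. This is a minimal illustration, not a tuned heuristic: the 4-city distance matrix is invented for the example, and the function names are hypothetical. A brute-force search over all tours is included only to show how far the greedy tour can be from the best one.

```python
from itertools import permutations


def nearest_neighbour_tour(dist, start=0):
    """Greedy tour: always move to the nearest unvisited city."""
    n = len(dist)
    tour = [start]
    visited = {start}
    while len(tour) < n:
        current = tour[-1]
        nxt = min((c for c in range(n) if c not in visited),
                  key=lambda c: dist[current][c])
        tour.append(nxt)
        visited.add(nxt)
    return tour


def tour_length(dist, tour):
    """Length of the closed tour, including the trip back to the start."""
    return sum(dist[tour[k]][tour[(k + 1) % len(tour)]] for k in range(len(tour)))


def best_tour_length(dist):
    """Exhaustive search; only feasible for a handful of cities."""
    n = len(dist)
    return min(tour_length(dist, [0] + list(p)) for p in permutations(range(1, n)))


# Symmetric distances between 4 cities (made-up numbers)
DIST = [[0, 1, 2, 10],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [10, 2, 1, 0]]

greedy = nearest_neighbour_tour(DIST)
print(greedy, tour_length(DIST, greedy))  # [0, 1, 2, 3] 13
print(best_tour_length(DIST))             # 6
```

The greedy tour takes three cheap steps and is then forced to pay for the expensive trip home, which the tour 0, 1, 3, 2 avoids.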
Let us consider the Activity Selection problem as our first example of Greedy algorithms. Following is the problem statement.
You are given n activities with their start and finish times. Select the maximum number of activities that can be performed by a single
person, assuming that a person can only work on a single activity at a time.
Example:
Consider the following 6 activities.
start[] = {1, 3, 0, 5, 8, 5};
finish[] = {2, 4, 6, 7, 9, 9};
The maximum set of activities that can be executed
by a single person is {0, 1, 3, 4}

The greedy choice is to always pick the next activity whose finish time is least among the remaining activities and the start time is more than or
equal to the finish time of previously selected activity. We can sort the activities according to their finishing time so that we always consider the next
activity as minimum finishing time activity.
1) Sort the activities according to their finishing time
2) Select the first activity from the sorted array and print it.
3) Do following for remaining activities in the sorted array.
.a) If the start time of this activity is greater than or equal to the finish time of the previously selected activity, then select this activity and print it.
In the following C implementation, it is assumed that the activities are already sorted according to their finish time.

C++
#include<stdio.h>
// Prints a maximum set of activities that can be done by a single
// person, one at a time.
// n --> Total number of activities
// s[] --> An array that contains start time of all activities
// f[] --> An array that contains finish time of all activities
void printMaxActivities(int s[], int f[], int n)
{
int i, j;
printf ("Following activities are selected \n");
// The first activity always gets selected
i = 0;
printf("%d ", i);
// Consider rest of the activities
for (j = 1; j < n; j++)
{
// If this activity has start time greater than or
// equal to the finish time of previously selected
// activity, then select it
if (s[j] >= f[i])
{
printf ("%d ", j);
i = j;
}
}
}
// driver program to test above function
int main()
{
int s[] = {1, 3, 0, 5, 8, 5};
int f[] = {2, 4, 6, 7, 9, 9};
int n = sizeof(s)/sizeof(s[0]);
printMaxActivities(s, f, n);
getchar();
return 0;
}

Python
"""The following implementation assumes that the activities
are already sorted according to their finish time"""


def printMaxActivities(s, f):
    """Prints a maximum set of activities that can be done by a
    single person, one at a time.

    s --> list that contains the start time of all activities
    f --> list that contains the finish time of all activities
    """
    n = len(f)  # total number of activities
    print("The following activities are selected")
    # The first activity is always selected
    i = 0
    print(i, end=" ")
    # Consider the rest of the activities
    for j in range(1, n):
        # If this activity has start time greater than
        # or equal to the finish time of previously
        # selected activity, then select it
        if s[j] >= f[i]:
            print(j, end=" ")
            i = j


# Driver program to test above function
s = [1, 3, 0, 5, 8, 5]
f = [2, 4, 6, 7, 9, 9]
printMaxActivities(s, f)
# This code is contributed by Nikhil Kumar Singh

Following activities are selected


0 1 3 4
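The implementations above assume that the input is already sorted by finish time. When it is not, step 1 of the algorithm has to be done first. A minimal Python sketch of the complete procedure follows; the function name select_activities and the shuffled input are assumptions made for this illustration. It sorts activity indices by finish time and returns the selected indices instead of printing them.

```python
def select_activities(start, finish):
    """Return indices of a maximum set of mutually compatible activities."""
    # Step 1: order the activity indices by finish time
    order = sorted(range(len(finish)), key=lambda k: finish[k])
    selected = []
    last_finish = None
    # Steps 2 and 3: take each activity that starts no earlier than
    # the finish time of the previously selected one
    for k in order:
        if last_finish is None or start[k] >= last_finish:
            selected.append(k)
            last_finish = finish[k]
    return selected


# The example from the text, already sorted by finish time
print(select_activities([1, 3, 0, 5, 8, 5], [2, 4, 6, 7, 9, 9]))  # [0, 1, 3, 4]

# The same six activities in a shuffled order
print(select_activities([5, 1, 8, 0, 3, 5], [9, 2, 9, 6, 4, 7]))  # [1, 4, 5, 2]
```

With the sort included the running time is O(n log n); the selection pass itself is O(n).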

How does Greedy Choice work for Activities sorted according to finish time?
Let the given set of activities be S = {1, 2, 3, ..., n}, sorted by finish time. The greedy choice is to always pick activity 1. Why does
activity 1 always lead to one of the optimal solutions? We can prove it by showing that if there is another solution B whose first activity is not 1,
then there is also a solution A of the same size with activity 1 as its first activity. Let the first activity selected by B be k; then
A = (B - {k}) U {1} is such a solution. (Note that the activities in B are mutually compatible and k has the smallest finish time among them. Since
k is not 1, finish(k) >= finish(1), so activity 1 finishes no later than k and cannot overlap the remaining activities of B.)
References:
Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein
Algorithms by S. Dasgupta, C.H. Papadimitriou, and U.V. Vazirani
http://en.wikipedia.org/wiki/Greedy_algorithm

Greedy Algorithms | Set 2 (Kruskal's Minimum Spanning Tree Algorithm)


What is Minimum Spanning Tree?
Given a connected and undirected graph, a spanning tree of that graph is a subgraph that is a tree and connects all the vertices together. A single
graph can have many different spanning trees. A minimum spanning tree (MST) or minimum weight spanning tree for a weighted, connected and
undirected graph is a spanning tree with weight less than or equal to the weight of every other spanning tree. The weight of a spanning tree is the
sum of weights given to each edge of the spanning tree.
How many edges does a minimum spanning tree have?
A minimum spanning tree has (V - 1) edges, where V is the number of vertices in the given graph.
What are the applications of Minimum Spanning Tree?
See this for applications of MST.
Below are the steps for finding an MST using Kruskal's algorithm:
1. Sort all the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Check if it forms a cycle with the spanning tree
formed so far. If cycle is not formed, include this edge. Else, discard it.
3. Repeat step#2 until there are (V-1) edges in the spanning tree.

Step #2 uses the Union-Find algorithm to detect cycles, so we recommend reading the following posts as a prerequisite.
Union-Find Algorithm | Set 1 (Detect Cycle in a Graph)
Union-Find Algorithm | Set 2 (Union By Rank and Path Compression)
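For readers who have not seen those posts, the following minimal Python sketch shows the idea of cycle detection with Union-Find, using path compression and union by rank. The function names and the tiny test graphs are assumptions made for this illustration.

```python
def find(parent, i):
    """Return the root of i's set, compressing the path on the way back."""
    if parent[i] != i:
        parent[i] = find(parent, parent[i])
    return parent[i]


def union(parent, rank, x, y):
    """Merge the sets of x and y; return False if they were already joined."""
    xroot, yroot = find(parent, x), find(parent, y)
    if xroot == yroot:
        return False
    # Attach the tree of smaller rank under the root of higher rank
    if rank[xroot] < rank[yroot]:
        xroot, yroot = yroot, xroot
    parent[yroot] = xroot
    if rank[xroot] == rank[yroot]:
        rank[xroot] += 1
    return True


def has_cycle(num_vertices, edges):
    """An undirected graph has a cycle if some edge joins two vertices
    that are already in the same set."""
    parent = list(range(num_vertices))
    rank = [0] * num_vertices
    return any(not union(parent, rank, u, v) for u, v in edges)


print(has_cycle(3, [(0, 1), (1, 2), (2, 0)]))  # True
print(has_cycle(3, [(0, 1), (1, 2)]))          # False
```

Kruskal's algorithm uses exactly this test: an edge whose two endpoints already share a root would close a cycle and is discarded.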
The algorithm is a Greedy Algorithm. The Greedy Choice is to pick the smallest weight edge that does not cause a cycle in the MST constructed
so far. Let us understand it with an example: Consider the below input graph.

The graph contains 9 vertices and 14 edges. So, the minimum spanning tree formed will have (9 - 1) = 8 edges.
After sorting:
Weight   Src   Dest
1        7     6
2        8     2
2        6     5
4        0     1
4        2     5
6        8     6
7        2     3
7        7     8
8        0     7
8        1     2
9        3     4
10       5     4
11       1     7
14       3     5

Now pick all edges one by one from sorted list of edges
1. Pick edge 7-6: No cycle is formed, include it.

2. Pick edge 8-2: No cycle is formed, include it.

3. Pick edge 6-5: No cycle is formed, include it.

4. Pick edge 0-1: No cycle is formed, include it.

5. Pick edge 2-5: No cycle is formed, include it.

6. Pick edge 8-6: Since including this edge results in cycle, discard it.
7. Pick edge 2-3: No cycle is formed, include it.

8. Pick edge 7-8: Since including this edge results in cycle, discard it.
9. Pick edge 0-7: No cycle is formed, include it.

10. Pick edge 1-2: Since including this edge results in cycle, discard it.
11. Pick edge 3-4: No cycle is formed, include it.

Since the number of edges included equals (V - 1), the algorithm stops here.
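The walkthrough above can be replayed with a short Python sketch. It is a simplified illustration under stated assumptions: the function name kruskal and the tuple layout (weight, src, dest) are choices made here, and the union step omits union by rank for brevity. The edge list is the sorted table from the example.

```python
def kruskal(num_vertices, edges):
    """edges: list of (weight, src, dest). Returns the MST edges in pick order."""
    parent = list(range(num_vertices))

    def find(i):
        # Follow parent links to the root, shortening the path as we go
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    mst = []
    # sorted() is stable, so edges of equal weight keep their listed order
    for weight, src, dest in sorted(edges, key=lambda e: e[0]):
        x, y = find(src), find(dest)
        if x != y:                      # no cycle: include the edge
            parent[y] = x
            mst.append((src, dest, weight))
            if len(mst) == num_vertices - 1:
                break
    return mst


EDGES = [(1, 7, 6), (2, 8, 2), (2, 6, 5), (4, 0, 1), (4, 2, 5),
         (6, 8, 6), (7, 2, 3), (7, 7, 8), (8, 0, 7), (8, 1, 2),
         (9, 3, 4), (10, 5, 4), (11, 1, 7), (14, 3, 5)]

mst = kruskal(9, EDGES)
for src, dest, weight in mst:
    print(src, "--", dest, "==", weight)
print("Total weight:", sum(w for _, _, w in mst))  # Total weight: 37
```

The eight edges printed are the ones included in steps 1 to 11 above; edges 8-6, 7-8 and 1-2 are skipped because both endpoints already share a root.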

C/C++
// C++ program for Kruskal's algorithm to find Minimum Spanning Tree
// of a given connected, undirected and weighted graph
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// a structure to represent a weighted edge in graph
struct Edge
{
int src, dest, weight;

};
// a structure to represent a connected, undirected and weighted graph
struct Graph
{
// V-> Number of vertices, E-> Number of edges
int V, E;
// graph is represented as an array of edges. Since the graph is
// undirected, the edge from src to dest is also edge from dest
// to src. Both are counted as 1 edge here.
struct Edge* edge;
};
// Creates a graph with V vertices and E edges
struct Graph* createGraph(int V, int E)
{
struct Graph* graph = (struct Graph*) malloc( sizeof(struct Graph) );
graph->V = V;
graph->E = E;
graph->edge = (struct Edge*) malloc( graph->E * sizeof( struct Edge ) );
return graph;
}
// A structure to represent a subset for union-find
struct subset
{
int parent;
int rank;
};
// A utility function to find set of an element i
// (uses path compression technique)
int find(struct subset subsets[], int i)
{
// find root and make root as parent of i (path compression)
if (subsets[i].parent != i)
subsets[i].parent = find(subsets, subsets[i].parent);
return subsets[i].parent;
}
// A function that does union of two sets of x and y
// (uses union by rank)
void Union(struct subset subsets[], int x, int y)
{
int xroot = find(subsets, x);
int yroot = find(subsets, y);
// Attach smaller rank tree under root of high rank tree
// (Union by Rank)
if (subsets[xroot].rank < subsets[yroot].rank)
subsets[xroot].parent = yroot;
else if (subsets[xroot].rank > subsets[yroot].rank)
subsets[yroot].parent = xroot;
// If ranks are same, then make one as root and increment
// its rank by one
else
{
subsets[yroot].parent = xroot;
subsets[xroot].rank++;
}
}
// Compare two edges according to their weights.
// Used in qsort() for sorting an array of edges
int myComp(const void* a, const void* b)
{
struct Edge* a1 = (struct Edge*)a;
struct Edge* b1 = (struct Edge*)b;
// qsort() expects a negative, zero or positive result
return (a1->weight > b1->weight) - (a1->weight < b1->weight);
}
// The main function to construct MST using Kruskal's algorithm
void KruskalMST(struct Graph* graph)
{
int V = graph->V;
struct Edge result[V]; // This will store the resultant MST
int e = 0; // An index variable, used for result[]
int i = 0; // An index variable, used for sorted edges
// Step 1: Sort all the edges in non-decreasing order of their weight
// If we are not allowed to change the given graph, we can create a copy of
// array of edges
qsort(graph->edge, graph->E, sizeof(graph->edge[0]), myComp);
// Allocate memory for creating V subsets
struct subset *subsets =
(struct subset*) malloc( V * sizeof(struct subset) );
// Create V subsets with single elements
for (int v = 0; v < V; ++v)
{
subsets[v].parent = v;
subsets[v].rank = 0;
}
// Number of edges to be taken is equal to V-1. Also stop when the
// edges run out, which happens if the graph is not connected
while (e < V - 1 && i < graph->E)
{
// Step 2: Pick the smallest edge. And increment the index
// for next iteration
struct Edge next_edge = graph->edge[i++];
int x = find(subsets, next_edge.src);
int y = find(subsets, next_edge.dest);
// If including this edge doesn't cause a cycle, include it
// in result and increment the index of result for next edge
if (x != y)
{
result[e++] = next_edge;
Union(subsets, x, y);
}
// Else discard the next_edge
}
// print the contents of result[] to display the built MST
printf("Following are the edges in the constructed MST\n");
for (i = 0; i < e; ++i)
printf("%d -- %d == %d\n", result[i].src, result[i].dest,
result[i].weight);
free(subsets); // release the union-find array
return;
}
// Driver program to test above functions
int main()
{
/* Let us create following weighted graph
        10
   0--------1
   |  \     |
  6|   5\   |15
   |      \ |
   2--------3
        4
*/
int V = 4; // Number of vertices in graph
int E = 5; // Number of edges in graph
struct Graph* graph = createGraph(V, E);
// add edge 0-1
graph->edge[0].src = 0;
graph->edge[0].dest = 1;
graph->edge[0].weight = 10;
// add edge 0-2
graph->edge[1].src = 0;
graph->edge[1].dest = 2;
graph->edge[1].weight = 6;
// add edge 0-3
graph->edge[2].src = 0;
graph->edge[2].dest = 3;
graph->edge[2].weight = 5;
// add edge 1-3
graph->edge[3].src = 1;
graph->edge[3].dest = 3;
graph->edge[3].weight = 15;
// add edge 2-3
graph->edge[4].src = 2;
graph->edge[4].dest = 3;
graph->edge[4].weight = 4;
KruskalMST(graph);
return 0;
}

Java
// Java program for Kruskal's algorithm to find Minimum Spanning Tree
// of a given connected, undirected and weighted graph
import java.util.*;
import java.lang.*;
import java.io.*;
class Graph
{
// A class to represent a graph edge
class Edge implements Comparable<Edge>
{
int src, dest, weight;
// Comparator function used for sorting edges based on
// their weight
public int compareTo(Edge compareEdge)
{
// Integer.compare avoids overflow for very large weights
return Integer.compare(this.weight, compareEdge.weight);
}
};
// A class to represent a subset for union-find
class subset
{
int parent, rank;
};
int V, E;
// V-> no. of vertices & E->no.of edges
Edge edge[]; // collection of all edges
// Creates a graph with V vertices and E edges
Graph(int v, int e)
{
V = v;
E = e;
edge = new Edge[E];
for (int i=0; i<e; ++i)
edge[i] = new Edge();
}
// A utility function to find set of an element i
// (uses path compression technique)
int find(subset subsets[], int i)
{
// find root and make root as parent of i (path compression)
if (subsets[i].parent != i)
subsets[i].parent = find(subsets, subsets[i].parent);
return subsets[i].parent;
}
// A function that does union of two sets of x and y
// (uses union by rank)
void Union(subset subsets[], int x, int y)
{
int xroot = find(subsets, x);
int yroot = find(subsets, y);
// Attach smaller rank tree under root of high rank tree
// (Union by Rank)
if (subsets[xroot].rank < subsets[yroot].rank)
subsets[xroot].parent = yroot;
else if (subsets[xroot].rank > subsets[yroot].rank)
subsets[yroot].parent = xroot;
// If ranks are same, then make one as root and increment
// its rank by one
else
{
subsets[yroot].parent = xroot;
subsets[xroot].rank++;
}
}
// The main function to construct MST using Kruskal's algorithm
void KruskalMST()
{
Edge result[] = new Edge[V]; // This will store the resultant MST
int e = 0; // An index variable, used for result[]
int i = 0; // An index variable, used for sorted edges
for (i=0; i<V; ++i)
result[i] = new Edge();
// Step 1: Sort all the edges in non-decreasing order of their
// weight. If we are not allowed to change the given graph, we
// can create a copy of array of edges
Arrays.sort(edge);
// Allocate memory for creating V subsets
subset subsets[] = new subset[V];
for(i=0; i<V; ++i)
subsets[i]=new subset();
// Create V subsets with single elements
for (int v = 0; v < V; ++v)
{
subsets[v].parent = v;
subsets[v].rank = 0;
}
i = 0; // Index used to pick next edge
// Number of edges to be taken is equal to V-1. Also stop when the
// edges run out, which happens if the graph is not connected
while (e < V - 1 && i < E)
{
// Step 2: Pick the smallest edge. And increment the index
// for next iteration
Edge next_edge = edge[i++];
int x = find(subsets, next_edge.src);
int y = find(subsets, next_edge.dest);
// If including this edge doesn't cause a cycle, include it
// in result and increment the index of result for next edge
if (x != y)
{
result[e++] = next_edge;
Union(subsets, x, y);
}
// Else discard the next_edge
}
// print the contents of result[] to display the built MST
System.out.println("Following are the edges in the constructed MST");
for (i = 0; i < e; ++i)
System.out.println(result[i].src+" -- "+result[i].dest+" == "+
result[i].weight);
}
// Driver Program
public static void main (String[] args)
{
/* Let us create following weighted graph
        10
   0--------1
   |  \     |
  6|   5\   |15
   |      \ |
   2--------3
        4
*/
int V = 4; // Number of vertices in graph
int E = 5; // Number of edges in graph
Graph graph = new Graph(V, E);
// add edge 0-1
graph.edge[0].src = 0;
graph.edge[0].dest = 1;
graph.edge[0].weight = 10;
// add edge 0-2
graph.edge[1].src = 0;
graph.edge[1].dest = 2;
graph.edge[1].weight = 6;
// add edge 0-3
graph.edge[2].src = 0;
graph.edge[2].dest = 3;
graph.edge[2].weight = 5;
// add edge 1-3
graph.edge[3].src = 1;
graph.edge[3].dest = 3;
graph.edge[3].weight = 15;
// add edge 2-3
graph.edge[4].src = 2;
graph.edge[4].dest = 3;
graph.edge[4].weight = 4;
graph.KruskalMST();
}
}
//This code is contributed by Aakash Hasija
Following are the edges in the constructed MST
2 -- 3 == 4
0 -- 3 == 5
0 -- 1 == 10

Time Complexity: O(E log E) or O(E log V). Sorting the edges takes O(E log E) time. After sorting, we iterate through all edges and apply the find-union algorithm. The find and union operations take at most O(log V) time each. So the overall complexity is O(E log E + E log V). The value of E
can be at most O(V^2), so log E <= 2 log V, and O(log V) and O(log E) are the same. Therefore, the overall time complexity is O(E log E) or O(E log V).
References:
http://www.ics.uci.edu/~eppstein/161/960206.html
http://en.wikipedia.org/wiki/Minimum_spanning_tree

Greedy Algorithms | Set 3 (Huffman Coding)


Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters; the lengths of the assigned
codes are based on the frequencies of the corresponding characters. The most frequent character gets the shortest code and the least frequent
character gets the longest code.
The variable-length codes assigned to input characters are Prefix Codes, which means the codes (bit sequences) are assigned in such a way that the
code assigned to one character is not a prefix of the code assigned to any other character. This is how Huffman Coding makes sure that there is no
ambiguity when decoding the generated bit stream.
Let us understand prefix codes with a counterexample. Let there be four characters a, b, c and d, and let their corresponding variable-length codes
be 00, 01, 0 and 1. This coding leads to ambiguity because the code assigned to c is a prefix of the codes assigned to a and b. If the compressed bit
stream is 0001, the decompressed output may be cccd, ccb, cad, acd or ab.
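The ambiguity can be checked mechanically. The minimal Python sketch below enumerates every way a bit string can be split into code words. The function name decodings is hypothetical, and the second code table (a fixed-length code, which is trivially prefix-free) is an assumption added only for contrast.

```python
def decodings(bits, codes):
    """Return every way to split `bits` into code words from `codes`."""
    if not bits:
        return [""]
    results = []
    for symbol, word in codes.items():
        if bits.startswith(word):
            # Consume this code word and decode the rest recursively
            results += [symbol + rest for rest in decodings(bits[len(word):], codes)]
    return results


ambiguous = {"a": "00", "b": "01", "c": "0", "d": "1"}
print(sorted(decodings("0001", ambiguous)))
# ['ab', 'acd', 'cad', 'ccb', 'cccd']

prefix_free = {"a": "00", "b": "01", "c": "10", "d": "11"}
print(decodings("0001", prefix_free))
# ['ab']
```

With a prefix code, at most one code word can match at each position, so the decoder never has to choose.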
See this for applications of Huffman Coding.
There are two major parts in Huffman Coding:
1) Build a Huffman Tree from input characters.
2) Traverse the Huffman Tree and assign codes to characters.
Steps to build Huffman Tree
Input is array of unique characters along with their frequency of occurrences and output is Huffman Tree.
1. Create a leaf node for each unique character and build a min heap of all leaf nodes (Min Heap is used as a priority queue. The value of
frequency field is used to compare two nodes in min heap. Initially, the least frequent character is at root)
2. Extract two nodes with the minimum frequency from the min heap.
3. Create a new internal node with frequency equal to the sum of the two nodes' frequencies. Make the first extracted node its left child and the
other extracted node its right child. Add this node to the min heap.
4. Repeat steps#2 and #3 until the heap contains only one node. The remaining node is the root node and the tree is complete.
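The four steps above can be sketched in Python with the standard heapq module as the priority queue. This is a minimal illustration, not the C program given later: the node representation (tuples) and the names build_huffman_tree, FREQS and merge_sums are assumptions made for this sketch, and the input is assumed to contain at least one character.

```python
import heapq
from itertools import count


def build_huffman_tree(freqs):
    """freqs: dict of character -> frequency. Returns (root, merge_sums).

    A leaf is (freq, char); an internal node is (freq, left, right).
    """
    tiebreak = count()  # avoids comparing node tuples when frequencies tie
    # Step 1: one leaf per character, organised as a min heap
    heap = [(f, next(tiebreak), (f, ch)) for ch, f in freqs.items()]
    heapq.heapify(heap)
    merge_sums = []
    while len(heap) > 1:
        # Step 2: extract the two nodes of minimum frequency
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Step 3: new internal node; first extracted node is the left child
        merged = (f1 + f2, left, right)
        merge_sums.append(f1 + f2)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    # Step 4: the single remaining node is the root
    return heap[0][2], merge_sums


FREQS = {"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45}
root, merge_sums = build_huffman_tree(FREQS)
print(merge_sums)  # [14, 25, 30, 55, 100]
print(root[0])     # 100
```

The list merge_sums records the frequency of each internal node in the order it is created.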
Let us understand the algorithm with an example:
character   Frequency
a           5
b           9
c           12
d           13
e           16
f           45

Step 1. Build a min heap that contains 6 nodes where each node represents root of a tree with single node.
Step 2: Extract two minimum frequency nodes from min heap. Add a new internal node with frequency 5 + 9 = 14.

Now min heap contains 5 nodes where 4 nodes are roots of trees with single element each, and one heap node is root of tree with 3 elements.
character       Frequency
c               12
d               13
Internal Node   14
e               16
f               45

Step 3: Extract two minimum frequency nodes from heap. Add a new internal node with frequency 12 + 13 = 25

Now min heap contains 4 nodes where 2 nodes are roots of trees with single element each, and two heap nodes are roots of trees with more than
one node.
character       Frequency
Internal Node   14
e               16
Internal Node   25
f               45

Step 4: Extract two minimum frequency nodes. Add a new internal node with frequency 14 + 16 = 30

Now min heap contains 3 nodes.


character       Frequency
Internal Node   25
Internal Node   30
f               45

Step 5: Extract two minimum frequency nodes. Add a new internal node with frequency 25 + 30 = 55

Now min heap contains 2 nodes.


character       Frequency
f               45
Internal Node   55

Step 6: Extract two minimum frequency nodes. Add a new internal node with frequency 45 + 55 = 100

Now min heap contains only one node.


character       Frequency
Internal Node   100

Since the heap contains only one node, the algorithm stops here.
Steps to print codes from Huffman Tree:
Traverse the tree formed starting from the root. Maintain an auxiliary array. While moving to the left child, write 0 to the array. While moving to the
right child, write 1 to the array. Print the array when a leaf node is encountered.
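The traversal can be sketched in Python as follows. The tree literal is written out by hand from the example above, assuming a simple representation in which a leaf is a one-character string and an internal node is a (left, right) pair; the names TREE and assign_codes are hypothetical.

```python
def assign_codes(node, prefix="", table=None):
    """Walk the tree; append '0' going left and '1' going right."""
    if table is None:
        table = {}
    if isinstance(node, str):      # leaf: record the accumulated bits
        table[node] = prefix
        return table
    left, right = node
    assign_codes(left, prefix + "0", table)
    assign_codes(right, prefix + "1", table)
    return table


# The tree from the example: f directly under the root, then
# (c, d) and ((a, b), e) under the internal node of frequency 55
TREE = ("f", (("c", "d"), (("a", "b"), "e")))

for ch, code in assign_codes(TREE).items():
    print(ch, code)
```

The string prefix plays the role of the auxiliary array: it holds the bits on the path from the root to the current node.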

The codes are as follows:


character   code-word
f           0
c           100
d           101
a           1100
b           1101
e           111

// C program for Huffman Coding
#include <stdio.h>
#include <stdlib.h>
// This constant can be avoided by explicitly calculating height of Huffman Tree
#define MAX_TREE_HT 100
// A Huffman tree node
struct MinHeapNode
{
char data; // One of the input characters
unsigned freq; // Frequency of the character
struct MinHeapNode *left, *right; // Left and right child of this node
};
// A Min Heap: Collection of min heap (or Huffman tree) nodes
struct MinHeap
{
unsigned size; // Current size of min heap
unsigned capacity; // Capacity of min heap
struct MinHeapNode **array; // Array of min heap node pointers
};
// A utility function allocate a new min heap node with given character
// and frequency of the character
struct MinHeapNode* newNode(char data, unsigned freq)
{
struct MinHeapNode* temp =
(struct MinHeapNode*) malloc(sizeof(struct MinHeapNode));
temp->left = temp->right = NULL;
temp->data = data;
temp->freq = freq;
return temp;
}
// A utility function to create a min heap of given capacity
struct MinHeap* createMinHeap(unsigned capacity)
{
struct MinHeap* minHeap =
(struct MinHeap*) malloc(sizeof(struct MinHeap));
minHeap->size = 0; // current size is 0
minHeap->capacity = capacity;
minHeap->array =
(struct MinHeapNode**)malloc(minHeap->capacity * sizeof(struct MinHeapNode*));
return minHeap;
}
// A utility function to swap two min heap nodes
void swapMinHeapNode(struct MinHeapNode** a, struct MinHeapNode** b)
{
struct MinHeapNode* t = *a;
*a = *b;
*b = t;
}
// The standard minHeapify function.
void minHeapify(struct MinHeap* minHeap, int idx)
{
int smallest = idx;
int left = 2 * idx + 1;
int right = 2 * idx + 2;
if (left < minHeap->size &&
minHeap->array[left]->freq < minHeap->array[smallest]->freq)
smallest = left;
if (right < minHeap->size &&
minHeap->array[right]->freq < minHeap->array[smallest]->freq)
smallest = right;
if (smallest != idx)
{
swapMinHeapNode(&minHeap->array[smallest], &minHeap->array[idx]);
minHeapify(minHeap, smallest);
}
}
// A utility function to check if size of heap is 1 or not
int isSizeOne(struct MinHeap* minHeap)
{
return (minHeap->size == 1);
}
// A standard function to extract minimum value node from heap
struct MinHeapNode* extractMin(struct MinHeap* minHeap)
{
struct MinHeapNode* temp = minHeap->array[0];
minHeap->array[0] = minHeap->array[minHeap->size - 1];
--minHeap->size;
minHeapify(minHeap, 0);
return temp;
}
// A utility function to insert a new node to Min Heap
void insertMinHeap(struct MinHeap* minHeap, struct MinHeapNode* minHeapNode)
{
++minHeap->size;
int i = minHeap->size - 1;
while (i && minHeapNode->freq < minHeap->array[(i - 1)/2]->freq)
{
minHeap->array[i] = minHeap->array[(i - 1)/2];
i = (i - 1)/2;
}
minHeap->array[i] = minHeapNode;
}
// A standard function to build min heap
void buildMinHeap(struct MinHeap* minHeap)
{
int n = minHeap->size - 1;
int i;
for (i = (n - 1) / 2; i >= 0; --i)
minHeapify(minHeap, i);
}
// A utility function to print an array of size n
void printArr(int arr[], int n)
{
int i;
for (i = 0; i < n; ++i)
printf("%d", arr[i]);
printf("\n");
}
// Utility function to check if this node is leaf
int isLeaf(struct MinHeapNode* root)
{
return !(root->left) && !(root->right) ;
}
// Creates a min heap of capacity equal to size and inserts all character of
// data[] in min heap. Initially size of min heap is equal to capacity
struct MinHeap* createAndBuildMinHeap(char data[], int freq[], int size)
{
struct MinHeap* minHeap = createMinHeap(size);
for (int i = 0; i < size; ++i)
minHeap->array[i] = newNode(data[i], freq[i]);
minHeap->size = size;
buildMinHeap(minHeap);
return minHeap;
}
// The main function that builds Huffman tree
struct MinHeapNode* buildHuffmanTree(char data[], int freq[], int size)
{
struct MinHeapNode *left, *right, *top;
// Step 1: Create a min heap of capacity equal to size. Initially, there are
// nodes equal to size.
struct MinHeap* minHeap = createAndBuildMinHeap(data, freq, size);
// Iterate while size of heap doesn't become 1
while (!isSizeOne(minHeap))
{
// Step 2: Extract the two minimum freq items from min heap
left = extractMin(minHeap);
right = extractMin(minHeap);
// Step 3: Create a new internal node with frequency equal to the
// sum of the two nodes' frequencies. Make the two extracted nodes the
// left and right children of this new node. Add this node to the min heap.
// '$' is a placeholder character for internal nodes; it is never printed
top = newNode('$', left->freq + right->freq);
top->left = left;
top->right = right;
insertMinHeap(minHeap, top);
}
// Step 4: The remaining node is the root node and the tree is complete.
return extractMin(minHeap);
}
// Prints huffman codes from the root of Huffman Tree. It uses arr[] to
// store codes
void printCodes(struct MinHeapNode* root, int arr[], int top)
{
// Assign 0 to left edge and recur
if (root->left)
{
arr[top] = 0;
printCodes(root->left, arr, top + 1);
}
// Assign 1 to right edge and recur
if (root->right)
{
arr[top] = 1;
printCodes(root->right, arr, top + 1);
}
// If this is a leaf node, then it contains one of the input
// characters, print the character and its code from arr[]
if (isLeaf(root))
{
printf("%c: ", root->data);
printArr(arr, top);
}
}
// The main function that builds a Huffman Tree and print codes by traversing
// the built Huffman Tree
void HuffmanCodes(char data[], int freq[], int size)
{
// Construct Huffman Tree
struct MinHeapNode* root = buildHuffmanTree(data, freq, size);
// Print Huffman codes using the Huffman tree built above
int arr[MAX_TREE_HT], top = 0;
printCodes(root, arr, top);
}
// Driver program to test above functions
int main()
{
char arr[] = {'a', 'b', 'c', 'd', 'e', 'f'};
int freq[] = {5, 9, 12, 13, 16, 45};
int size = sizeof(arr)/sizeof(arr[0]);
HuffmanCodes(arr, freq, size);
return 0;
}
f: 0
c: 100
d: 101
a: 1100
b: 1101
e: 111

Time complexity: O(n log n), where n is the number of unique characters. If there are n nodes, extractMin() is called 2*(n - 1) times. extractMin()
takes O(log n) time as it calls minHeapify(). So, the overall complexity is O(n log n).
If the input array is sorted, there exists a linear time algorithm. We will discuss it in the next post.
Reference:
http://en.wikipedia.org/wiki/Huffman_coding

Greedy Algorithms | Set 4 (Efficient Huffman Coding for Sorted Input)


We recommend reading the following post as a prerequisite for this one.
Greedy Algorithms | Set 3 (Huffman Coding)
The time complexity of the algorithm discussed in the above post is O(n log n). If we know that the given array is sorted (in non-decreasing order
of frequency), we can generate Huffman codes in O(n) time. Following is an O(n) algorithm for sorted input.
1. Create two empty queues.
2. Create a leaf node for each unique character and Enqueue it to the first queue in non-decreasing order of frequency. Initially second queue is
empty.
3. Dequeue two nodes with the minimum frequency by examining the front of both queues. Repeat the following steps two times:
..a) If second queue is empty, dequeue from first queue.
..b) If first queue is empty, dequeue from second queue.
..c) Else, compare the front of two queues and dequeue the minimum.
4. Create a new internal node with frequency equal to the sum of the two nodes' frequencies. Make the first dequeued node its left child and the
second dequeued node its right child. Enqueue this node to the second queue.
5. Repeat steps #3 and #4 while there is more than one node in the queues. The remaining node is the root node and the tree is complete.
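Before the C program, the two-queue idea can be sketched compactly in Python using collections.deque. This is an illustration under stated assumptions: the function name huffman_sorted and the tuple node layout are hypothetical, ties between the two queue fronts are resolved in favour of the first queue (a choice made here, not taken from the C program), and the input is assumed to be non-empty and sorted by non-decreasing frequency.

```python
from collections import deque


def huffman_sorted(chars, freqs):
    """chars/freqs sorted by non-decreasing frequency. Returns char -> code."""
    # First queue holds leaves (freq, char); second holds internal
    # nodes (freq, left, right), which are created in non-decreasing order
    first = deque((f, c) for c, f in zip(chars, freqs))
    second = deque()

    def dequeue_min():
        if not second:
            return first.popleft()
        if not first:
            return second.popleft()
        # Compare the two fronts and take the smaller frequency
        if first[0][0] <= second[0][0]:
            return first.popleft()
        return second.popleft()

    while len(first) + len(second) > 1:
        left = dequeue_min()
        right = dequeue_min()
        second.append((left[0] + right[0], left, right))

    root = (first or second)[0]
    codes = {}

    def walk(node, prefix):
        if len(node) == 2:             # leaf
            codes[node[1]] = prefix
        else:
            walk(node[1], prefix + "0")
            walk(node[2], prefix + "1")

    walk(root, "")
    return codes


print(huffman_sorted("abcdef", [5, 9, 12, 13, 16, 45]))
```

Each node is enqueued and dequeued once, and every queue operation is O(1), which is where the O(n) bound comes from. No heap is needed because the sums produced in step 4 never decrease, so the second queue stays sorted on its own.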
// C Program for Efficient Huffman Coding for Sorted input
#include <stdio.h>
#include <stdlib.h>
// This constant can be avoided by explicitly calculating height of Huffman Tree
#define MAX_TREE_HT 100
// A node of huffman tree
struct QueueNode
{
char data;
unsigned freq;
struct QueueNode *left, *right;
};
// Structure for Queue: collection of Huffman Tree nodes (or QueueNodes)
struct Queue
{
int front, rear;
int capacity;
struct QueueNode **array;
};
// A utility function to create a new Queuenode
struct QueueNode* newNode(char data, unsigned freq)
{
struct QueueNode* temp =
(struct QueueNode*) malloc(sizeof(struct QueueNode));
temp->left = temp->right = NULL;
temp->data = data;
temp->freq = freq;
return temp;
}
// A utility function to create a Queue of given capacity
struct Queue* createQueue(int capacity)
{
struct Queue* queue = (struct Queue*) malloc(sizeof(struct Queue));
queue->front = queue->rear = -1;
queue->capacity = capacity;
queue->array =
(struct QueueNode**) malloc(queue->capacity * sizeof(struct QueueNode*));
return queue;
}
// A utility function to check if size of given queue is 1
int isSizeOne(struct Queue* queue)
{
return queue->front == queue->rear && queue->front != -1;
}
// A utility function to check if given queue is empty
int isEmpty(struct Queue* queue)

{
return queue->front == -1;
}
// A utility function to check if given queue is full
int isFull(struct Queue* queue)
{
return queue->rear == queue->capacity - 1;
}
// A utility function to add an item to queue
void enQueue(struct Queue* queue, struct QueueNode* item)
{
if (isFull(queue))
return;
queue->array[++queue->rear] = item;
if (queue->front == -1)
++queue->front;
}
// A utility function to remove an item from queue
struct QueueNode* deQueue(struct Queue* queue)
{
if (isEmpty(queue))
return NULL;
struct QueueNode* temp = queue->array[queue->front];
if (queue->front == queue->rear) // If there is only one item in queue
queue->front = queue->rear = -1;
else
++queue->front;
return temp;
}
// A utility function to get front of queue
struct QueueNode* getFront(struct Queue* queue)
{
if (isEmpty(queue))
return NULL;
return queue->array[queue->front];
}
/* A function to get minimum item from two queues */
struct QueueNode* findMin(struct Queue* firstQueue, struct Queue* secondQueue)
{
// Step 3.b: If first queue is empty, dequeue from second queue
if (isEmpty(firstQueue))
return deQueue(secondQueue);
// Step 3.a: If second queue is empty, dequeue from first queue
if (isEmpty(secondQueue))
return deQueue(firstQueue);
// Step 3.c: Else, compare the front of two queues and dequeue minimum
if (getFront(firstQueue)->freq < getFront(secondQueue)->freq)
return deQueue(firstQueue);
return deQueue(secondQueue);
}
// Utility function to check if this node is leaf
int isLeaf(struct QueueNode* root)
{
return !(root->left) && !(root->right) ;
}
// A utility function to print an array of size n
void printArr(int arr[], int n)
{
int i;
for (i = 0; i < n; ++i)
printf("%d", arr[i]);
printf("\n");
}
// The main function that builds Huffman tree
struct QueueNode* buildHuffmanTree(char data[], int freq[], int size)
{
struct QueueNode *left, *right, *top;
// Step 1: Create two empty queues
struct Queue* firstQueue = createQueue(size);

struct Queue* secondQueue = createQueue(size);


// Step 2:Create a leaf node for each unique character and Enqueue it to
// the first queue in non-decreasing order of frequency. Initially second
// queue is empty
for (int i = 0; i < size; ++i)
enQueue(firstQueue, newNode(data[i], freq[i]));
// Run while Queues contain more than one node. Finally, first queue will
// be empty and second queue will contain only one node
while (!(isEmpty(firstQueue) && isSizeOne(secondQueue)))
{
// Step 3: Dequeue two nodes with the minimum frequency by examining
// the front of both queues
left = findMin(firstQueue, secondQueue);
right = findMin(firstQueue, secondQueue);
// Step 4: Create a new internal node with frequency equal to the sum
// of the two nodes frequencies. Enqueue this node to second queue.
top = newNode('$' , left->freq + right->freq);
top->left = left;
top->right = right;
enQueue(secondQueue, top);
}
return deQueue(secondQueue);
}
// Prints huffman codes from the root of Huffman Tree. It uses arr[] to
// store codes
void printCodes(struct QueueNode* root, int arr[], int top)
{
// Assign 0 to left edge and recur
if (root->left)
{
arr[top] = 0;
printCodes(root->left, arr, top + 1);
}
// Assign 1 to right edge and recur
if (root->right)
{
arr[top] = 1;
printCodes(root->right, arr, top + 1);
}
// If this is a leaf node, then it contains one of the input
// characters, print the character and its code from arr[]
if (isLeaf(root))
{
printf("%c: ", root->data);
printArr(arr, top);
}
}
// The main function that builds a Huffman Tree and print codes by traversing
// the built Huffman Tree
void HuffmanCodes(char data[], int freq[], int size)
{
// Construct Huffman Tree
struct QueueNode* root = buildHuffmanTree(data, freq, size);
// Print Huffman codes using the Huffman tree built above
int arr[MAX_TREE_HT], top = 0;
printCodes(root, arr, top);
}
// Driver program to test above functions
int main()
{
char arr[] = {'a', 'b', 'c', 'd', 'e', 'f'};
int freq[] = {5, 9, 12, 13, 16, 45};
int size = sizeof(arr)/sizeof(arr[0]);
HuffmanCodes(arr, freq, size);
return 0;
}

Output:
f: 0
c: 100
d: 101
a: 1100
b: 1101
e: 111

Time complexity: O(n)


If the input is not sorted, it needs to be sorted first before it can be processed by the above algorithm. Sorting can be done using heap sort or merge sort, both of which run in Theta(n log n). So, the overall time complexity becomes O(n log n) for unsorted input.
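As a rough sketch of that preprocessing step, the characters and their frequencies can be paired up and sorted together before being handed to HuffmanCodes(). The struct SymFreq type and the sortByFrequency() helper below are hypothetical names introduced for this illustration. The sketch uses the standard library qsort(), whose running time is not fixed by the C standard, so heap sort or merge sort should be substituted if a guaranteed O(n log n) bound is required.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical helper type: one character paired with its frequency
struct SymFreq
{
    char sym;
    int freq;
};

// qsort() comparator: orders pairs by non-decreasing frequency
static int cmpByFreq(const void *a, const void *b)
{
    const struct SymFreq *x = (const struct SymFreq *) a;
    const struct SymFreq *y = (const struct SymFreq *) b;
    return (x->freq > y->freq) - (x->freq < y->freq);
}

// Reorders data[] and freq[] in place so that freq[] is non-decreasing
// and data[i] still belongs to freq[i]
void sortByFrequency(char data[], int freq[], int n)
{
    assert(n >= 0);
    if (n < 2)
        return; // nothing to reorder

    struct SymFreq *pairs =
        (struct SymFreq *) malloc((size_t)n * sizeof(struct SymFreq));
    if (pairs == NULL)
        return; // allocation failed; the input is left unchanged

    for (int i = 0; i < n; ++i)
    {
        pairs[i].sym = data[i];
        pairs[i].freq = freq[i];
    }
    qsort(pairs, (size_t)n, sizeof(struct SymFreq), cmpByFreq);
    for (int i = 0; i < n; ++i)
    {
        data[i] = pairs[i].sym;
        freq[i] = pairs[i].freq;
    }
    free(pairs);
}
```

A caller would run sortByFrequency(arr, freq, size) first and then HuffmanCodes(arr, freq, size).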
Reference:
http://en.wikipedia.org/wiki/Huffman_coding

Greedy Algorithms | Set 5 (Prim's Minimum Spanning Tree (MST))


We have discussed Kruskal's algorithm for Minimum Spanning Tree. Like Kruskal's algorithm, Prim's algorithm is also a Greedy algorithm. It starts with an empty spanning tree. The idea is to maintain two sets of vertices. The first set contains the vertices already included in the MST, the other set contains the vertices not yet included. At every step, it considers all the edges that connect the two sets, and picks the minimum weight edge from these edges. After picking the edge, it moves the other endpoint of the edge to the set containing MST.
A group of edges that connects two sets of vertices in a graph is called a cut in graph theory. So, at every step of Prim's algorithm, we find a cut (of two sets, one contains the vertices already included in MST and the other contains the rest of the vertices), pick the minimum weight edge from the cut, and include the vertex at its far end in the MST set (the set that contains already included vertices).
How does Prim's Algorithm Work? The idea behind Prim's algorithm is simple: a spanning tree means all vertices must be connected. So the two disjoint subsets (discussed above) of vertices must be connected to make a Spanning Tree. And they must be connected with the minimum weight edge to make it a Minimum Spanning Tree.
Algorithm
1) Create a set mstSet that keeps track of vertices already included in MST.
2) Assign a key value to all vertices in the input graph. Initialize all key values as INFINITE. Assign key value as 0 for the first vertex so that it is
picked first.
3) While mstSet doesn't include all vertices
.a) Pick a vertex u which is not there in mstSet and has minimum key value.
.b) Include u to mstSet.
.c) Update key value of all adjacent vertices of u. To update the key values, iterate through all adjacent vertices. For every adjacent vertex v, if weight of edge u-v is less than the previous key value of v, update the key value as weight of u-v.
The idea of using key values is to pick the minimum weight edge from the cut. The key values are used only for vertices which are not yet included in MST; the key value for these vertices indicates the minimum weight edge connecting them to the set of vertices included in MST.
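Step 3.c is the part that keeps the key values correct, so it is worth isolating. The sketch below assumes an adjacency matrix where 0 means "no edge"; the function name updateKeys() and the constant NV are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NV 5 // number of vertices in this illustration

// After vertex u has been moved into the MST set, lower the key of every
// neighbour v that is still outside the set if edge u-v is lighter than
// the best edge seen so far for v, and remember u as the parent of v.
void updateKeys(int graph[NV][NV], int u, const bool inMST[NV],
                int key[NV], int parent[NV])
{
    assert(u >= 0 && u < NV);
    for (int v = 0; v < NV; v++)
        if (graph[u][v] != 0 && !inMST[v] && graph[u][v] < key[v])
        {
            key[v] = graph[u][v];
            parent[v] = u;
        }
}
```

Because each key[v] always holds the lightest edge from v into the MST set, picking the vertex with the minimum key is the same as picking the minimum weight edge of the cut.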
Let us understand with the following example:

The set mstSet is initially empty and keys assigned to vertices are {0, INF, INF, INF, INF, INF, INF, INF, INF} where INF indicates infinite. Now pick the vertex with minimum key value. The vertex 0 is picked, include it in mstSet. So mstSet becomes {0}. After including it in mstSet, update key values of adjacent vertices. Adjacent vertices of 0 are 1 and 7. The key values of 1 and 7 are updated as 4 and 8. Following subgraph shows vertices and their key values; only the vertices with finite key values are shown. The vertices included in MST are shown in green color.

Pick the vertex with minimum key value and not already included in MST (not in mstSET). The vertex 1 is picked and added to mstSet. So mstSet
now becomes {0, 1}. Update the key values of adjacent vertices of 1. The key value of vertex 2 becomes 8.

Pick the vertex with minimum key value and not already included in MST (not in mstSet). We can either pick vertex 7 or vertex 2; let vertex 7 be picked. So mstSet now becomes {0, 1, 7}. Update the key values of adjacent vertices of 7. The key values of vertices 6 and 8 become finite (1 and 7 respectively).

Pick the vertex with minimum key value and not already included in MST (not in mstSET). Vertex 6 is picked. So mstSet now becomes {0, 1, 7,
6}. Update the key values of adjacent vertices of 6. The key value of vertex 5 and 8 are updated.

We repeat the above steps until mstSet includes all vertices of given graph. Finally, we get the following graph.

How to implement the above algorithm?


We use a boolean array mstSet[] to represent the set of vertices included in MST. If a value mstSet[v] is true, then vertex v is included in MST, otherwise not. Array key[] is used to store key values of all vertices. Another array parent[] is used to store indexes of parent nodes in MST. The parent array is the output array which is used to show the constructed MST.
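Since parent[] fully describes the tree, other quantities can be derived from it. As a small sketch, the total weight of the MST is the sum of the weights of the edges parent[v]-v. The function name mstWeight() and the constant NV are assumptions for this illustration, and vertex 0 is assumed to be the root (parent[0] is -1 and is skipped).

```c
#include <assert.h>

#define NV 5 // number of vertices in this illustration

// Sums the weights of the tree edges parent[v] - v for v = 1 .. NV-1.
// Vertex 0 is assumed to be the root, so it has no parent edge.
int mstWeight(int graph[NV][NV], const int parent[NV])
{
    int total = 0;
    for (int v = 1; v < NV; v++)
    {
        assert(parent[v] >= 0 && parent[v] < NV);
        total += graph[v][parent[v]];
    }
    return total;
}
```

For the five-vertex graph used in the program of this post, the parent array {-1, 0, 1, 0, 1} gives 2 + 3 + 6 + 5 = 16.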

C/C++
// A C / C++ program for Prim's Minimum Spanning Tree (MST) algorithm.
// The program is for adjacency matrix representation of the graph
#include <stdio.h>
#include <limits.h>
#include <stdbool.h> // bool, true and false when compiled as C
// Number of vertices in the graph
#define V 5
// A utility function to find the vertex with minimum key value, from
// the set of vertices not yet included in MST
int minKey(int key[], bool mstSet[])
{
// Initialize min value
int min = INT_MAX, min_index;
for (int v = 0; v < V; v++)
if (mstSet[v] == false && key[v] < min)
min = key[v], min_index = v;
return min_index;
}
// A utility function to print the constructed MST stored in parent[]
void printMST(int parent[], int n, int graph[V][V])
{
printf("Edge Weight\n");
for (int i = 1; i < V; i++)
printf("%d - %d    %d \n", parent[i], i, graph[i][parent[i]]);
}
// Function to construct and print MST for a graph represented using adjacency
// matrix representation
void primMST(int graph[V][V])
{
int parent[V]; // Array to store constructed MST
int key[V]; // Key values used to pick minimum weight edge in cut
bool mstSet[V]; // To represent set of vertices already included in MST
// Initialize all keys as INFINITE
for (int i = 0; i < V; i++)
key[i] = INT_MAX, mstSet[i] = false;
// Always include first vertex in MST.
key[0] = 0; // Make key 0 so that this vertex is picked as first vertex
parent[0] = -1; // First node is always root of MST
// The MST will have V vertices
for (int count = 0; count < V-1; count++)
{
// Pick the minimum key vertex from the set of vertices
// not yet included in MST
int u = minKey(key, mstSet);
// Add the picked vertex to the MST Set
mstSet[u] = true;
// Update key value and parent index of the adjacent vertices of
// the picked vertex. Consider only those vertices which are not yet
// included in MST
for (int v = 0; v < V; v++)
// graph[u][v] is non zero only for adjacent vertices of u
// mstSet[v] is false for vertices not yet included in MST
// Update the key only if graph[u][v] is smaller than key[v]
if (graph[u][v] && mstSet[v] == false && graph[u][v] < key[v])
parent[v] = u, key[v] = graph[u][v];
}
// print the constructed MST
printMST(parent, V, graph);
}
// driver program to test above function
int main()
{
/* Let us create the following graph
    2    3
(0)--(1)--(2)
 |   / \   |
6| 8/   \5 |7
 | /     \ |
(3)-------(4)
     9
*/
int graph[V][V] = {{0, 2, 0, 6, 0},
{2, 0, 3, 8, 5},
{0, 3, 0, 0, 7},
{6, 8, 0, 0, 9},
{0, 5, 7, 9, 0},
};
// Print the solution
primMST(graph);
return 0;
}

Java
// A Java program for Prim's Minimum Spanning Tree (MST) algorithm.
// The program is for adjacency matrix representation of the graph
import java.util.*;
import java.lang.*;
import java.io.*;
class MST
{
// Number of vertices in the graph
private static final int V=5;
// A utility function to find the vertex with minimum key
// value, from the set of vertices not yet included in MST
int minKey(int key[], Boolean mstSet[])
{
// Initialize min value
int min = Integer.MAX_VALUE, min_index=-1;

for (int v = 0; v < V; v++)


if (mstSet[v] == false && key[v] < min)
{
min = key[v];
min_index = v;
}
return min_index;
}
// A utility function to print the constructed MST stored in
// parent[]
void printMST(int parent[], int n, int graph[][])
{
System.out.println("Edge Weight");
for (int i = 1; i < V; i++)
System.out.println(parent[i]+" - "+ i+"    "+
graph[i][parent[i]]);
}
// Function to construct and print MST for a graph represented
// using adjacency matrix representation
void primMST(int graph[][])
{
// Array to store constructed MST
int parent[] = new int[V];
// Key values used to pick minimum weight edge in cut
int key[] = new int [V];
// To represent set of vertices already included in MST
Boolean mstSet[] = new Boolean[V];
// Initialize all keys as INFINITE
for (int i = 0; i < V; i++)
{
key[i] = Integer.MAX_VALUE;
mstSet[i] = false;
}
// Always include first vertex in MST.
// Make key 0 so that this vertex is picked as first vertex
key[0] = 0;
parent[0] = -1; // First node is always root of MST
// The MST will have V vertices
for (int count = 0; count < V-1; count++)
{
// Pick the minimum key vertex from the set of vertices
// not yet included in MST
int u = minKey(key, mstSet);
// Add the picked vertex to the MST Set
mstSet[u] = true;
// Update key value and parent index of the adjacent
// vertices of the picked vertex. Consider only those
// vertices which are not yet included in MST
for (int v = 0; v < V; v++)
// graph[u][v] is non zero only for adjacent vertices of u
// mstSet[v] is false for vertices not yet included in MST
// Update the key only if graph[u][v] is smaller than key[v]
if (graph[u][v]!=0 && mstSet[v] == false &&
graph[u][v] < key[v])
{
parent[v] = u;
key[v] = graph[u][v];
}
}
// print the constructed MST
printMST(parent, V, graph);
}
public static void main (String[] args)
{
/* Let us create the following graph
    2    3
(0)--(1)--(2)
 |   / \   |
6| 8/   \5 |7
 | /     \ |
(3)-------(4)
     9
*/
MST t = new MST();
int graph[][] = new int[][] {{0, 2, 0, 6, 0},
{2, 0, 3, 8, 5},
{0, 3, 0, 0, 7},
{6, 8, 0, 0, 9},
{0, 5, 7, 9, 0},
};
// Print the solution
t.primMST(graph);
}
}
// This code is contributed by Aakash Hasija

Output:
Edge Weight
0 - 1    2
1 - 2    3
0 - 3    6
1 - 4    5

The time complexity of the above program is O(V^2). If the input graph is represented using an adjacency list, then the time complexity of Prim's algorithm can be reduced to O(E log V) with the help of a binary heap. Please see Prim's MST for Adjacency List Representation for more details.

Greedy Algorithms | Set 6 (Prim's MST for Adjacency List Representation)


We recommend reading the following two posts as a prerequisite for this post.
1. Greedy Algorithms | Set 5 (Prim's Minimum Spanning Tree (MST))
2. Graph and its representations
We have discussed Prim's algorithm and its implementation for adjacency matrix representation of graphs. The time complexity for the matrix representation is O(V^2). In this post, an O(ELogV) algorithm for adjacency list representation is discussed.
As discussed in the previous post, in Prim's algorithm, two sets are maintained: one set contains the vertices already included in MST, the other set contains vertices not yet included. With adjacency list representation, all vertices of a graph can be traversed in O(V+E) time using BFS. The idea is to traverse all vertices of the graph using BFS and use a Min Heap to store the vertices not yet included in MST. The Min Heap is used as a priority queue to get the minimum weight edge from the cut, because operations like extracting the minimum element and decreasing a key value take O(LogV) time in a Min Heap.
Following are the detailed steps.
1) Create a Min Heap of size V where V is the number of vertices in the given graph. Every node of min heap contains vertex number and key
value of the vertex.
2) Initialize Min Heap with first vertex as root (the key value assigned to first vertex is 0). The key value assigned to all other vertices is INF
(infinite).
3) While Min Heap is not empty, do following
..a) Extract the min value node from Min Heap. Let the extracted vertex be u.
..b) For every adjacent vertex v of u, check if v is in Min Heap (not yet included in MST). If v is in Min Heap and its key value is more than weight
of u-v, then update the key value of v as weight of u-v.
Let us understand the above algorithm with the following example:

Initially, key value of first vertex is 0 and INF (infinite) for all other vertices. So vertex 0 is extracted from Min Heap and key values of vertices
adjacent to 0 (1 and 7) are updated. Min Heap contains all vertices except vertex 0.
The vertices in green color are the vertices included in MST.

Since key value of vertex 1 is minimum among all nodes in Min Heap, it is extracted from Min Heap and key values of vertices adjacent to 1 are updated (a key is updated if the vertex is still in Min Heap and its previous key value is greater than the weight of the edge from 1 to that vertex). Min Heap contains all vertices except vertices 0 and 1.

Since key value of vertex 7 is minimum among all nodes in Min Heap, it is extracted from Min Heap and key values of vertices adjacent to 7 are updated (a key is updated if the vertex is still in Min Heap and its previous key value is greater than the weight of the edge from 7 to that vertex). Min Heap contains all vertices except vertices 0, 1 and 7.

Since key value of vertex 6 is minimum among all nodes in Min Heap, it is extracted from Min Heap and key values of vertices adjacent to 6 are updated (a key is updated if the vertex is still in Min Heap and its previous key value is greater than the weight of the edge from 6 to that vertex). Min Heap contains all vertices except vertices 0, 1, 7 and 6.

The above steps are repeated for the rest of the nodes in Min Heap until Min Heap becomes empty.

// C / C++ program for Prim's MST for adjacency list representation of graph
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <stdbool.h> // bool, true and false when compiled as C
// A structure to represent a node in adjacency list
struct AdjListNode
{
int dest;
int weight;
struct AdjListNode* next;
};
// A structure to represent an adjacency list
struct AdjList
{
struct AdjListNode *head; // pointer to head node of list
};
// A structure to represent a graph. A graph is an array of adjacency lists.
// Size of array will be V (number of vertices in graph)
struct Graph
{
int V;
struct AdjList* array;
};
// A utility function to create a new adjacency list node
struct AdjListNode* newAdjListNode(int dest, int weight)
{
struct AdjListNode* newNode =
(struct AdjListNode*) malloc(sizeof(struct AdjListNode));
newNode->dest = dest;
newNode->weight = weight;
newNode->next = NULL;
return newNode;
}
// A utility function that creates a graph of V vertices
struct Graph* createGraph(int V)
{
struct Graph* graph = (struct Graph*) malloc(sizeof(struct Graph));
graph->V = V;
// Create an array of adjacency lists. Size of array will be V
graph->array = (struct AdjList*) malloc(V * sizeof(struct AdjList));
// Initialize each adjacency list as empty by making head as NULL
for (int i = 0; i < V; ++i)
graph->array[i].head = NULL;
return graph;
}
// Adds an edge to an undirected graph
void addEdge(struct Graph* graph, int src, int dest, int weight)
{
// Add an edge from src to dest. A new node is added to the adjacency
// list of src. The node is added at the beginning
struct AdjListNode* newNode = newAdjListNode(dest, weight);
newNode->next = graph->array[src].head;

graph->array[src].head = newNode;
// Since graph is undirected, add an edge from dest to src also
newNode = newAdjListNode(src, weight);
newNode->next = graph->array[dest].head;
graph->array[dest].head = newNode;
}
// Structure to represent a min heap node
struct MinHeapNode
{
int v;
int key;
};
// Structure to represent a min heap
struct MinHeap
{
int size; // Number of heap nodes present currently
int capacity; // Capacity of min heap
int *pos; // This is needed for decreaseKey()
struct MinHeapNode **array;
};
// A utility function to create a new Min Heap Node
struct MinHeapNode* newMinHeapNode(int v, int key)
{
struct MinHeapNode* minHeapNode =
(struct MinHeapNode*) malloc(sizeof(struct MinHeapNode));
minHeapNode->v = v;
minHeapNode->key = key;
return minHeapNode;
}
// A utility function to create a Min Heap
struct MinHeap* createMinHeap(int capacity)
{
struct MinHeap* minHeap =
(struct MinHeap*) malloc(sizeof(struct MinHeap));
minHeap->pos = (int *)malloc(capacity * sizeof(int));
minHeap->size = 0;
minHeap->capacity = capacity;
minHeap->array =
(struct MinHeapNode**) malloc(capacity * sizeof(struct MinHeapNode*));
return minHeap;
}
// A utility function to swap two nodes of min heap. Needed for min heapify
void swapMinHeapNode(struct MinHeapNode** a, struct MinHeapNode** b)
{
struct MinHeapNode* t = *a;
*a = *b;
*b = t;
}
// A standard function to heapify at given idx
// This function also updates position of nodes when they are swapped.
// Position is needed for decreaseKey()
void minHeapify(struct MinHeap* minHeap, int idx)
{
int smallest, left, right;
smallest = idx;
left = 2 * idx + 1;
right = 2 * idx + 2;
if (left < minHeap->size &&
minHeap->array[left]->key < minHeap->array[smallest]->key )
smallest = left;
if (right < minHeap->size &&
minHeap->array[right]->key < minHeap->array[smallest]->key )
smallest = right;
if (smallest != idx)
{
// The nodes to be swapped in min heap
struct MinHeapNode *smallestNode = minHeap->array[smallest];
struct MinHeapNode *idxNode = minHeap->array[idx];
// Swap positions
minHeap->pos[smallestNode->v] = idx;

minHeap->pos[idxNode->v] = smallest;
// Swap nodes
swapMinHeapNode(&minHeap->array[smallest], &minHeap->array[idx]);
minHeapify(minHeap, smallest);
}
}
// A utility function to check if the given minHeap is empty or not
int isEmpty(struct MinHeap* minHeap)
{
return minHeap->size == 0;
}
// Standard function to extract minimum node from heap
struct MinHeapNode* extractMin(struct MinHeap* minHeap)
{
if (isEmpty(minHeap))
return NULL;
// Store the root node
struct MinHeapNode* root = minHeap->array[0];
// Replace root node with last node
struct MinHeapNode* lastNode = minHeap->array[minHeap->size - 1];
minHeap->array[0] = lastNode;
// Update position of last node
minHeap->pos[root->v] = minHeap->size-1;
minHeap->pos[lastNode->v] = 0;
// Reduce heap size and heapify root
--minHeap->size;
minHeapify(minHeap, 0);
return root;
}
// Function to decrease key value of a given vertex v. This function
// uses pos[] of min heap to get the current index of node in min heap
void decreaseKey(struct MinHeap* minHeap, int v, int key)
{
// Get the index of v in heap array
int i = minHeap->pos[v];
// Get the node and update its key value
minHeap->array[i]->key = key;
// Travel up while the complete tree is not heapified.
// This is an O(Logn) loop
while (i && minHeap->array[i]->key < minHeap->array[(i - 1) / 2]->key)
{
// Swap this node with its parent
minHeap->pos[minHeap->array[i]->v] = (i-1)/2;
minHeap->pos[minHeap->array[(i-1)/2]->v] = i;
swapMinHeapNode(&minHeap->array[i], &minHeap->array[(i - 1) / 2]);
// move to parent index
i = (i - 1) / 2;
}
}
// A utility function to check if a given vertex
// 'v' is in min heap or not
bool isInMinHeap(struct MinHeap *minHeap, int v)
{
if (minHeap->pos[v] < minHeap->size)
return true;
return false;
}
// A utility function used to print the constructed MST
void printArr(int arr[], int n)
{
for (int i = 1; i < n; ++i)
printf("%d - %d\n", arr[i], i);
}
// The main function that constructs Minimum Spanning Tree (MST)
// using Prim's algorithm

void PrimMST(struct Graph* graph)


{
int V = graph->V; // Get the number of vertices in graph
int parent[V]; // Array to store constructed MST
int key[V]; // Key values used to pick minimum weight edge in cut
// minHeap represents set E
struct MinHeap* minHeap = createMinHeap(V);
// Initialize min heap with all vertices. Key value of
// all vertices (except 0th vertex) is initially infinite
for (int v = 1; v < V; ++v)
{
parent[v] = -1;
key[v] = INT_MAX;
minHeap->array[v] = newMinHeapNode(v, key[v]);
minHeap->pos[v] = v;
}
// Make key value of 0th vertex as 0 so that it
// is extracted first
key[0] = 0;
minHeap->array[0] = newMinHeapNode(0, key[0]);
minHeap->pos[0] = 0;
// Initially size of min heap is equal to V
minHeap->size = V;
// In the following loop, min heap contains all nodes
// not yet added to MST.
while (!isEmpty(minHeap))
{
// Extract the vertex with minimum key value
struct MinHeapNode* minHeapNode = extractMin(minHeap);
int u = minHeapNode->v; // Store the extracted vertex number
// Traverse through all adjacent vertices of u (the extracted
// vertex) and update their key values
struct AdjListNode* pCrawl = graph->array[u].head;
while (pCrawl != NULL)
{
int v = pCrawl->dest;
// If v is not yet included in MST and weight of u-v is
// less than key value of v, then update key value and
// parent of v
if (isInMinHeap(minHeap, v) && pCrawl->weight < key[v])
{
key[v] = pCrawl->weight;
parent[v] = u;
decreaseKey(minHeap, v, key[v]);
}
pCrawl = pCrawl->next;
}
}
// print edges of MST
printArr(parent, V);
}
// Driver program to test above functions
int main()
{
// Let us create the graph given in above figure
int V = 9;
struct Graph* graph = createGraph(V);
addEdge(graph, 0, 1, 4);
addEdge(graph, 0, 7, 8);
addEdge(graph, 1, 2, 8);
addEdge(graph, 1, 7, 11);
addEdge(graph, 2, 3, 7);
addEdge(graph, 2, 8, 2);
addEdge(graph, 2, 5, 4);
addEdge(graph, 3, 4, 9);
addEdge(graph, 3, 5, 14);
addEdge(graph, 4, 5, 10);
addEdge(graph, 5, 6, 2);
addEdge(graph, 6, 7, 1);
addEdge(graph, 6, 8, 6);
addEdge(graph, 7, 8, 7);

PrimMST(graph);
return 0;
}

Output:
0 - 1
5 - 2
2 - 3
3 - 4
6 - 5
7 - 6
0 - 7
2 - 8

Time Complexity: The time complexity of the above code/algorithm looks like O(V^2) as there are two nested while loops. If we take a closer look, we can observe that the statements in the inner loop are executed O(V+E) times (similar to BFS). The inner loop has a decreaseKey() operation which takes O(LogV) time. So the overall time complexity is O(E+V)*O(LogV), which is O((E+V)*LogV) = O(ELogV) (for a connected graph, V = O(E)).
References:
Introduction to Algorithms by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L.
http://en.wikipedia.org/wiki/Prims_algorithm

Greedy Algorithms | Set 7 (Dijkstra's shortest path algorithm)


Given a graph and a source vertex in the graph, find shortest paths from the source to all vertices in the given graph.
Dijkstra's algorithm is very similar to Prim's algorithm for minimum spanning tree. Like Prim's MST, we generate an SPT (shortest path tree) with the given source as root. We maintain two sets: one set contains vertices included in the shortest path tree, the other set includes vertices not yet included in the shortest path tree. At every step of the algorithm, we find a vertex which is in the other set (set of not yet included) and has minimum distance from the source.
Below are the detailed steps used in Dijkstra's algorithm to find the shortest path from a single source vertex to all other vertices in the given graph.
Algorithm
1) Create a set sptSet (shortest path tree set) that keeps track of vertices included in shortest path tree, i.e., whose minimum distance from source
is calculated and finalized. Initially, this set is empty.
2) Assign a distance value to all vertices in the input graph. Initialize all distance values as INFINITE. Assign distance value as 0 for the source
vertex so that it is picked first.
3) While sptSet doesn't include all vertices
.a) Pick a vertex u which is not there in sptSet and has minimum distance value.
.b) Include u to sptSet.
.c) Update distance value of all adjacent vertices of u. To update the distance values, iterate through all adjacent vertices. For every adjacent vertex v, if the sum of the distance value of u (from source) and the weight of edge u-v is less than the distance value of v, then update the distance value of v.
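The update in step 3.c is often called relaxing an edge. A minimal sketch of it is shown below; the function name relaxEdge() is hypothetical, and INT_MAX is assumed to stand for INFINITE, as in the programs of this post. The explicit guards keep dist[u] + weight from overflowing an int.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

// Tries to improve dist[v] by going through u along an edge of the given
// weight. Returns true if dist[v] was lowered, false otherwise.
bool relaxEdge(int dist[], int u, int v, int weight)
{
    assert(weight >= 0); // Dijkstra's algorithm needs non-negative weights
    if (dist[u] == INT_MAX)
        return false; // u is not reachable yet, nothing to propagate
    if (dist[u] > INT_MAX - weight)
        return false; // the sum would overflow, treat it as infinite
    if (dist[u] + weight < dist[v])
    {
        dist[v] = dist[u] + weight;
        return true;
    }
    return false;
}
```

With distances {0, INF, INF}, relaxing an edge 0-1 of weight 4 sets dist[1] to 4, and relaxing an edge 1-2 of weight 8 afterwards sets dist[2] to 12.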
Let us understand with the following example:

The set sptSet is initially empty and distances assigned to vertices are {0, INF, INF, INF, INF, INF, INF, INF, INF} where INF indicates infinite. Now pick the vertex with minimum distance value. The vertex 0 is picked, include it in sptSet. So sptSet becomes {0}. After including 0 in sptSet, update distance values of its adjacent vertices. Adjacent vertices of 0 are 1 and 7. The distance values of 1 and 7 are updated as 4 and 8. Following subgraph shows vertices and their distance values; only the vertices with finite distance values are shown. The vertices included in SPT are shown in green color.

Pick the vertex with minimum distance value and not already included in SPT (not in sptSET). The vertex 1 is picked and added to sptSet. So
sptSet now becomes {0, 1}. Update the distance values of adjacent vertices of 1. The distance value of vertex 2 becomes 12.

Pick the vertex with minimum distance value and not already included in SPT (not in sptSet). Vertex 7 is picked. So sptSet now becomes {0, 1, 7}. Update the distance values of adjacent vertices of 7. The distance values of vertices 6 and 8 become finite (9 and 15 respectively).

Pick the vertex with minimum distance value and not already included in SPT (not in sptSet). Vertex 6 is picked. So sptSet now becomes {0, 1, 7, 6}. Update the distance values of adjacent vertices of 6. The distance value of vertex 5 becomes 11; the distance value of vertex 8 stays 15, because the path through 6 (9 + 6) is not shorter.

We repeat the above steps until sptSet includes all vertices of the given graph. Finally, we get the following Shortest Path Tree (SPT).

How to implement the above algorithm?


We use a boolean array sptSet[] to represent the set of vertices included in SPT. If a value sptSet[v] is true, then vertex v is included in SPT,
otherwise not. Array dist[] is used to store shortest distance values of all vertices.
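The arrays described above record only distances. If the paths themselves are needed, one common extension (an assumption here, not part of the program in this post) is to keep a parent[] array as in Prim's implementation, setting parent[v] = u whenever dist[v] is updated through u, with parent[src] = -1. The hypothetical helper buildPath() below then recovers the path to any vertex by walking the parent links back to the source.

```c
#include <assert.h>

// Writes the vertices on the path from the source to target into path[]
// (source first) and returns how many vertices were written.
// parent[] must hold -1 for the source. Returns -1 if path[] is too small.
int buildPath(const int parent[], int target, int path[], int maxLen)
{
    assert(target >= 0 && maxLen > 0);
    int len = 0;
    // Walk from target back to the source, collecting vertices
    for (int v = target; v != -1; v = parent[v])
    {
        if (len == maxLen)
            return -1;
        path[len++] = v;
    }
    // The walk produced the path backwards, so reverse it in place
    for (int i = 0, j = len - 1; i < j; ++i, --j)
    {
        int t = path[i];
        path[i] = path[j];
        path[j] = t;
    }
    return len;
}
```

For the nine-vertex example graph with source 0, the shortest path tree corresponds to parent = {-1, 0, 1, 2, 5, 6, 7, 0, 2}, and buildPath() for vertex 4 yields 0, 7, 6, 5, 4.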

C/C++
// A C / C++ program for Dijkstra's single source shortest path algorithm.
// The program is for adjacency matrix representation of the graph
#include <stdio.h>
#include <limits.h>
#include <stdbool.h> // bool, true and false when compiled as C
// Number of vertices in the graph
#define V 9
// A utility function to find the vertex with minimum distance value, from
// the set of vertices not yet included in shortest path tree
int minDistance(int dist[], bool sptSet[])
{
// Initialize min value
int min = INT_MAX, min_index;
for (int v = 0; v < V; v++)
if (sptSet[v] == false && dist[v] <= min)
min = dist[v], min_index = v;
return min_index;
}
// A utility function to print the constructed distance array
void printSolution(int dist[], int n)
{
printf("Vertex Distance from Source\n");
for (int i = 0; i < V; i++)
printf("%d \t\t %d\n", i, dist[i]);
}
// Function that implements Dijkstra's single source shortest path algorithm
// for a graph represented using adjacency matrix representation
void dijkstra(int graph[V][V], int src)
{
int dist[V]; // The output array. dist[i] will hold the shortest
// distance from src to i
bool sptSet[V]; // sptSet[i] will be true if vertex i is included in shortest
// path tree or shortest distance from src to i is finalized
// Initialize all distances as INFINITE and sptSet[] as false
for (int i = 0; i < V; i++)
dist[i] = INT_MAX, sptSet[i] = false;
// Distance of source vertex from itself is always 0
dist[src] = 0;
// Find shortest path for all vertices
for (int count = 0; count < V-1; count++)
{
// Pick the minimum distance vertex from the set of vertices not
// yet processed. u is always equal to src in first iteration.
int u = minDistance(dist, sptSet);
// Mark the picked vertex as processed
sptSet[u] = true;
// Update dist value of the adjacent vertices of the picked vertex.
for (int v = 0; v < V; v++)
//
//
//
if

Update dist[v] only if is not in sptSet, there is an edge from


u to v, and total weight of path from src to v through u is
smaller than current value of dist[v]
(!sptSet[v] && graph[u][v] && dist[u] != INT_MAX
&& dist[u]+graph[u][v] < dist[v])
dist[v] = dist[u] + graph[u][v];

}
// print the constructed distance array
printSolution(dist, V);
}
// driver program to test above function
int main()
{
/* Let us create the example graph discussed above */
int graph[V][V] = {{0, 4, 0, 0, 0, 0, 0, 8, 0},
{4, 0, 8, 0, 0, 0, 0, 11, 0},
{0, 8, 0, 7, 0, 4, 0, 0, 2},
{0, 0, 7, 0, 9, 14, 0, 0, 0},
{0, 0, 0, 9, 0, 10, 0, 0, 0},
{0, 0, 4, 0, 10, 0, 2, 0, 0},
{0, 0, 0, 14, 0, 2, 0, 1, 6},
{8, 11, 0, 0, 0, 0, 1, 0, 7},
{0, 0, 2, 0, 0, 0, 6, 7, 0}
};
dijkstra(graph, 0);
return 0;
}

Java
// A Java program for Dijkstra's single source shortest path algorithm.
// The program is for adjacency matrix representation of the graph
import java.util.*;
import java.lang.*;
import java.io.*;
class ShortestPath
{
// A utility function to find the vertex with minimum distance value,
// from the set of vertices not yet included in shortest path tree
static final int V=9;
int minDistance(int dist[], Boolean sptSet[])
{
// Initialize min value
int min = Integer.MAX_VALUE, min_index=-1;
for (int v = 0; v < V; v++)
if (sptSet[v] == false && dist[v] <= min)
{
min = dist[v];
min_index = v;
}
return min_index;
}
// A utility function to print the constructed distance array
void printSolution(int dist[], int n)
{
System.out.println("Vertex Distance from Source");
for (int i = 0; i < V; i++)
System.out.println(i+" \t\t "+dist[i]);
}
// Function that implements Dijkstra's single source shortest path
// algorithm for a graph represented using adjacency matrix
// representation
void dijkstra(int graph[][], int src)
{
int dist[] = new int[V]; // The output array. dist[i] will hold
// the shortest distance from src to i
// sptSet[i] will be true if vertex i is included in shortest
// path tree or shortest distance from src to i is finalized
Boolean sptSet[] = new Boolean[V];
// Initialize all distances as INFINITE and sptSet[] as false
for (int i = 0; i < V; i++)
{
dist[i] = Integer.MAX_VALUE;
sptSet[i] = false;
}
// Distance of source vertex from itself is always 0
dist[src] = 0;
// Find shortest path for all vertices
for (int count = 0; count < V-1; count++)
{
// Pick the minimum distance vertex from the set of vertices
// not yet processed. u is always equal to src in first
// iteration.
int u = minDistance(dist, sptSet);
// Mark the picked vertex as processed
sptSet[u] = true;
// Update dist value of the adjacent vertices of the
// picked vertex.
for (int v = 0; v < V; v++)
// Update dist[v] only if it is not in sptSet, there is an
// edge from u to v, and total weight of path from src to
// v through u is smaller than current value of dist[v]
if (!sptSet[v] && graph[u][v]!=0 &&
dist[u] != Integer.MAX_VALUE &&
dist[u]+graph[u][v] < dist[v])
dist[v] = dist[u] + graph[u][v];
}
// print the constructed distance array
printSolution(dist, V);
}
// Driver method
public static void main (String[] args)
{
/* Let us create the example graph discussed above */
int graph[][] = new int[][]{{0, 4, 0, 0, 0, 0, 0, 8, 0},
{4, 0, 8, 0, 0, 0, 0, 11, 0},
{0, 8, 0, 7, 0, 4, 0, 0, 2},
{0, 0, 7, 0, 9, 14, 0, 0, 0},
{0, 0, 0, 9, 0, 10, 0, 0, 0},
{0, 0, 4, 0, 10, 0, 2, 0, 0},
{0, 0, 0, 14, 0, 2, 0, 1, 6},
{8, 11, 0, 0, 0, 0, 1, 0, 7},
{0, 0, 2, 0, 0, 0, 6, 7, 0}
};
ShortestPath t = new ShortestPath();
t.dijkstra(graph, 0);
}
}
//This code is contributed by Aakash Hasija

Vertex   Distance from Source
0        0
1        4
2        12
3        19
4        21
5        11
6        9
7        8
8        14

Notes:

1) The code calculates the shortest distances, but doesn't calculate the path information. We can create a parent array, update the parent array when
a distance is updated (like Prim's implementation) and use it to show the shortest path from the source to different vertices.
2) The code is for an undirected graph; the same dijkstra function can be used for directed graphs also.
3) The code finds shortest distances from the source to all vertices. If we are interested only in the shortest distance from the source to a single target, we can
break the for loop when the picked minimum distance vertex is equal to the target (Step 3.a of algorithm).
4) Time Complexity of the implementation is O(V^2). If the input graph is represented using an adjacency list, it can be reduced to O(E log V) with
the help of a binary heap. Please see
Dijkstra's Algorithm for Adjacency List Representation for more details.
5) Dijkstra's algorithm doesn't work for graphs with negative weight edges. For graphs with negative weight edges, the Bellman-Ford algorithm can be
used; we will soon be discussing it as a separate post.
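Note 1 can be sketched as follows. This is only an illustration under stated assumptions, not the code of the post: the function names dijkstraWithParents and pathTo are hypothetical, the graph is passed as a std::vector adjacency matrix (0 meaning "no edge"), and parent[v] stores the vertex from which dist[v] was last improved.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Hypothetical helper: O(V^2) Dijkstra on an adjacency matrix (0 = no edge).
// Besides dist[], it fills parent[], where parent[v] is the vertex that
// precedes v on a shortest path from src (-1 for src itself and for
// vertices that cannot be reached).
void dijkstraWithParents(const std::vector<std::vector<int>>& graph, int src,
                         std::vector<int>& dist, std::vector<int>& parent)
{
    int V = static_cast<int>(graph.size());
    dist.assign(V, INT_MAX);
    parent.assign(V, -1);
    std::vector<bool> sptSet(V, false);
    dist[src] = 0;

    for (int count = 0; count < V; count++) {
        // Pick the closest vertex that is not yet finalized
        int u = -1;
        for (int v = 0; v < V; v++)
            if (!sptSet[v] && (u == -1 || dist[v] < dist[u]))
                u = v;
        if (u == -1 || dist[u] == INT_MAX)
            break; // every remaining vertex is unreachable
        sptSet[u] = true;

        for (int v = 0; v < V; v++)
            if (!sptSet[v] && graph[u][v] && dist[u] + graph[u][v] < dist[v]) {
                dist[v] = dist[u] + graph[u][v];
                parent[v] = u; // remember how v was reached
            }
    }
}

// Walk parent[] back from target and return the path in source-to-target
// order. For an unreachable target the result contains only the target.
std::vector<int> pathTo(const std::vector<int>& parent, int target)
{
    std::vector<int> path;
    for (int v = target; v != -1; v = parent[v])
        path.push_back(v);
    std::reverse(path.begin(), path.end());
    return path;
}
```

For the example graph above with source 0, pathTo(parent, 4) yields the vertices 0, 7, 6, 5, 4, which matches the distance 21 printed for vertex 4.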

Greedy Algorithms | Set 8 (Dijkstra's Algorithm for Adjacency List Representation)


We recommend reading the following two posts as a prerequisite of this post.
1. Greedy Algorithms | Set 7 (Dijkstra's shortest path algorithm)
2. Graph and its representations
We have discussed Dijkstra's algorithm and its implementation for adjacency matrix representation of graphs. The time complexity for the matrix
representation is O(V^2). In this post, an O(ELogV) algorithm for adjacency list representation is discussed.
As discussed in the previous post, in Dijkstra's algorithm, two sets are maintained: one set contains the vertices already included in SPT (Shortest
Path Tree), the other set contains vertices not yet included. With adjacency list representation, all vertices of a graph can be traversed in O(V+E) time
using BFS. The idea is to traverse all vertices of the graph using BFS and use a Min Heap to store the vertices not yet included in SPT (or the vertices
for which shortest distance is not finalized yet). Min Heap is used as a priority queue to get the minimum distance vertex from the set of not
yet included vertices. Time complexity of operations like extract-min and decrease-key is O(LogV) for Min Heap.
Following are the detailed steps.
1) Create a Min Heap of size V where V is the number of vertices in the given graph. Every node of min heap contains vertex number and distance
value of the vertex.
2) Initialize Min Heap with source vertex as root (the distance value assigned to source vertex is 0). The distance value assigned to all other vertices
is INF (infinite).
3) While Min Heap is not empty, do following
..a) Extract the vertex with minimum distance value node from Min Heap. Let the extracted vertex be u.
..b) For every adjacent vertex v of u, check if v is in Min Heap. If v is in Min Heap and distance value is more than weight of u-v plus distance
value of u, then update the distance value of v.
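The steps above can also be sketched with the standard library heap. Note the swapped-in technique: std::priority_queue has no decrease-key operation, so this sketch pushes a new (distance, vertex) entry whenever a distance improves and skips outdated entries when they are popped ("lazy deletion"), instead of the position-tracking heap used in the program below. The names AdjList, addUndirectedEdge and dijkstraLazy are hypothetical.

```cpp
#include <climits>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (neighbour, weight) pairs; weights are assumed non-negative
using AdjList = std::vector<std::vector<std::pair<int, int>>>;

// Hypothetical helper: adds an undirected edge u-v with weight w
void addUndirectedEdge(AdjList& adj, int u, int v, int w)
{
    adj[u].push_back({v, w});
    adj[v].push_back({u, w});
}

// Returns the shortest distance from src to every vertex
// (INT_MAX for vertices that cannot be reached).
std::vector<int> dijkstraLazy(const AdjList& adj, int src)
{
    std::vector<int> dist(adj.size(), INT_MAX);
    // Min heap ordered by the first member of the pair (the distance)
    std::priority_queue<std::pair<int, int>,
                        std::vector<std::pair<int, int>>,
                        std::greater<std::pair<int, int>>> heap;
    dist[src] = 0;
    heap.push({0, src});

    while (!heap.empty()) {
        std::pair<int, int> top = heap.top();
        heap.pop();
        int d = top.first, u = top.second;
        if (d > dist[u])
            continue; // outdated entry: u was already finalized with a smaller distance
        for (const std::pair<int, int>& edge : adj[u]) {
            int v = edge.first, w = edge.second;
            if (dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
                heap.push({dist[v], v}); // stands in for decrease-key
            }
        }
    }
    return dist;
}
```

The heap may hold up to O(E) entries in this variant, so each operation costs O(LogE), which is still O(LogV) because E is at most V^2.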
Let us understand with the following example. Let the given source vertex be 0

Initially, distance value of source vertex is 0 and INF (infinite) for all other vertices. So source vertex is extracted from Min Heap and distance
values of vertices adjacent to 0 (1 and 7) are updated. Min Heap contains all vertices except vertex 0.
The vertices in green color are the vertices for which minimum distances are finalized and are not in Min Heap

Since the distance value of vertex 1 is minimum among all nodes in Min Heap, it is extracted from Min Heap and distance values of vertices adjacent
to 1 are updated (a distance is updated if the vertex is still in Min Heap and the distance through 1 is shorter than the previous distance). Min Heap
contains all vertices except vertex 0 and 1.

Pick the vertex with minimum distance value from min heap. Vertex 7 is picked. So min heap now contains all vertices except 0, 1 and 7. Update
the distance values of adjacent vertices of 7. The distance values of vertices 6 and 8 become finite (9 and 15 respectively).

Pick the vertex with minimum distance from min heap. Vertex 6 is picked. So min heap now contains all vertices except 0, 1, 7 and 6. Update the
distance values of adjacent vertices of 6. The distance value of vertex 5 and 8 are updated.

The above steps are repeated until the min heap becomes empty. Finally, we get the following shortest path tree.

// C / C++ program for Dijkstra's shortest path algorithm for adjacency
// list representation of graph
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <limits.h>
// A structure to represent a node in adjacency list
struct AdjListNode
{
int dest;
int weight;
struct AdjListNode* next;
};
// A structure to represent an adjacency list
struct AdjList
{
struct AdjListNode *head; // pointer to head node of list
};
// A structure to represent a graph. A graph is an array of adjacency lists.
// Size of array will be V (number of vertices in graph)
struct Graph
{
int V;
struct AdjList* array;
};
// A utility function to create a new adjacency list node
struct AdjListNode* newAdjListNode(int dest, int weight)
{
struct AdjListNode* newNode =
(struct AdjListNode*) malloc(sizeof(struct AdjListNode));
newNode->dest = dest;
newNode->weight = weight;
newNode->next = NULL;
return newNode;
}
// A utility function that creates a graph of V vertices
struct Graph* createGraph(int V)
{
struct Graph* graph = (struct Graph*) malloc(sizeof(struct Graph));
graph->V = V;
// Create an array of adjacency lists. Size of array will be V
graph->array = (struct AdjList*) malloc(V * sizeof(struct AdjList));
// Initialize each adjacency list as empty by making head as NULL
for (int i = 0; i < V; ++i)
graph->array[i].head = NULL;
return graph;
}

// Adds an edge to an undirected graph
void addEdge(struct Graph* graph, int src, int dest, int weight)
{
// Add an edge from src to dest. A new node is added to the adjacency
// list of src. The node is added at the beginning
struct AdjListNode* newNode = newAdjListNode(dest, weight);
newNode->next = graph->array[src].head;
graph->array[src].head = newNode;
// Since graph is undirected, add an edge from dest to src also
newNode = newAdjListNode(src, weight);
newNode->next = graph->array[dest].head;
graph->array[dest].head = newNode;
}
// Structure to represent a min heap node
struct MinHeapNode
{
int v;
int dist;
};
// Structure to represent a min heap
struct MinHeap
{
int size; // Number of heap nodes present currently
int capacity; // Capacity of min heap
int *pos; // This is needed for decreaseKey()
struct MinHeapNode **array;
};
// A utility function to create a new Min Heap Node
struct MinHeapNode* newMinHeapNode(int v, int dist)
{
struct MinHeapNode* minHeapNode =
(struct MinHeapNode*) malloc(sizeof(struct MinHeapNode));
minHeapNode->v = v;
minHeapNode->dist = dist;
return minHeapNode;
}
// A utility function to create a Min Heap
struct MinHeap* createMinHeap(int capacity)
{
struct MinHeap* minHeap =
(struct MinHeap*) malloc(sizeof(struct MinHeap));
minHeap->pos = (int *)malloc(capacity * sizeof(int));
minHeap->size = 0;
minHeap->capacity = capacity;
minHeap->array =
(struct MinHeapNode**) malloc(capacity * sizeof(struct MinHeapNode*));
return minHeap;
}
// A utility function to swap two nodes of min heap. Needed for min heapify
void swapMinHeapNode(struct MinHeapNode** a, struct MinHeapNode** b)
{
struct MinHeapNode* t = *a;
*a = *b;
*b = t;
}
// A standard function to heapify at given idx
// This function also updates position of nodes when they are swapped.
// Position is needed for decreaseKey()
void minHeapify(struct MinHeap* minHeap, int idx)
{
int smallest, left, right;
smallest = idx;
left = 2 * idx + 1;
right = 2 * idx + 2;
if (left < minHeap->size &&
minHeap->array[left]->dist < minHeap->array[smallest]->dist )
smallest = left;
if (right < minHeap->size &&
minHeap->array[right]->dist < minHeap->array[smallest]->dist )
smallest = right;
if (smallest != idx)
{
// The nodes to be swapped in min heap
struct MinHeapNode *smallestNode = minHeap->array[smallest];
struct MinHeapNode *idxNode = minHeap->array[idx];
// Swap positions
minHeap->pos[smallestNode->v] = idx;
minHeap->pos[idxNode->v] = smallest;
// Swap nodes
swapMinHeapNode(&minHeap->array[smallest], &minHeap->array[idx]);
minHeapify(minHeap, smallest);
}
}
// A utility function to check if the given minHeap is empty or not
int isEmpty(struct MinHeap* minHeap)
{
return minHeap->size == 0;
}
// Standard function to extract minimum node from heap
struct MinHeapNode* extractMin(struct MinHeap* minHeap)
{
if (isEmpty(minHeap))
return NULL;
// Store the root node
struct MinHeapNode* root = minHeap->array[0];
// Replace root node with last node
struct MinHeapNode* lastNode = minHeap->array[minHeap->size - 1];
minHeap->array[0] = lastNode;
// Update position of last node
minHeap->pos[root->v] = minHeap->size-1;
minHeap->pos[lastNode->v] = 0;
// Reduce heap size and heapify root
--minHeap->size;
minHeapify(minHeap, 0);
return root;
}
// Function to decrease dist value of a given vertex v. This function
// uses pos[] of min heap to get the current index of node in min heap
void decreaseKey(struct MinHeap* minHeap, int v, int dist)
{
// Get the index of v in heap array
int i = minHeap->pos[v];
// Get the node and update its dist value
minHeap->array[i]->dist = dist;
// Travel up while the complete tree is not heapified.
// This is an O(Logn) loop
while (i && minHeap->array[i]->dist < minHeap->array[(i - 1) / 2]->dist)
{
// Swap this node with its parent
minHeap->pos[minHeap->array[i]->v] = (i-1)/2;
minHeap->pos[minHeap->array[(i-1)/2]->v] = i;
swapMinHeapNode(&minHeap->array[i], &minHeap->array[(i - 1) / 2]);
// move to parent index
i = (i - 1) / 2;
}
}
// A utility function to check if a given vertex
// 'v' is in min heap or not
bool isInMinHeap(struct MinHeap *minHeap, int v)
{
if (minHeap->pos[v] < minHeap->size)
return true;
return false;
}
// A utility function used to print the solution
void printArr(int dist[], int n)
{
printf("Vertex Distance from Source\n");
for (int i = 0; i < n; ++i)
printf("%d \t\t %d\n", i, dist[i]);
}
// The main function that calculates distances of shortest paths from src to all
// vertices. It is an O(ELogV) function
void dijkstra(struct Graph* graph, int src)
{
int V = graph->V; // Get the number of vertices in graph
int dist[V]; // dist[v] holds the shortest known distance from src to v
// minHeap holds the vertices whose shortest distance is not yet finalized
struct MinHeap* minHeap = createMinHeap(V);
// Initialize min heap with all vertices. dist value of all vertices
// is initially INT_MAX (infinite)
for (int v = 0; v < V; ++v)
{
dist[v] = INT_MAX;
minHeap->array[v] = newMinHeapNode(v, dist[v]);
minHeap->pos[v] = v;
}
// Make dist value of src vertex as 0 so that it is extracted first
minHeap->array[src] = newMinHeapNode(src, dist[src]);
minHeap->pos[src] = src;
dist[src] = 0;
decreaseKey(minHeap, src, dist[src]);
// Initially size of min heap is equal to V
minHeap->size = V;
// In the following loop, min heap contains all nodes
// whose shortest distance is not yet finalized.
while (!isEmpty(minHeap))
{
// Extract the vertex with minimum distance value
struct MinHeapNode* minHeapNode = extractMin(minHeap);
int u = minHeapNode->v; // Store the extracted vertex number
// Traverse through all adjacent vertices of u (the extracted
// vertex) and update their distance values
struct AdjListNode* pCrawl = graph->array[u].head;
while (pCrawl != NULL)
{
int v = pCrawl->dest;
// If shortest distance to v is not finalized yet, and distance to v
// through u is less than its previously calculated distance
if (isInMinHeap(minHeap, v) && dist[u] != INT_MAX &&
pCrawl->weight + dist[u] < dist[v])
{
dist[v] = dist[u] + pCrawl->weight;
// update distance value in min heap also
decreaseKey(minHeap, v, dist[v]);
}
pCrawl = pCrawl->next;
}
}
// print the calculated shortest distances
printArr(dist, V);
}
// Driver program to test above functions
int main()
{
// create the graph given in above figure
int V = 9;
struct Graph* graph = createGraph(V);
addEdge(graph, 0, 1, 4);
addEdge(graph, 0, 7, 8);
addEdge(graph, 1, 2, 8);
addEdge(graph, 1, 7, 11);
addEdge(graph, 2, 3, 7);
addEdge(graph, 2, 8, 2);
addEdge(graph, 2, 5, 4);
addEdge(graph, 3, 4, 9);

addEdge(graph, 3, 5, 14);
addEdge(graph, 4, 5, 10);
addEdge(graph, 5, 6, 2);
addEdge(graph, 6, 7, 1);
addEdge(graph, 6, 8, 6);
addEdge(graph, 7, 8, 7);

dijkstra(graph, 0);
return 0;
}

Output:
Vertex   Distance from Source
0        0
1        4
2        12
3        19
4        21
5        11
6        9
7        8
8        14

Time Complexity: The time complexity of the above code/algorithm may look like O(V^2) as there are two nested while loops. If we take a closer look,
we can observe that the statements in the inner loop are executed O(V+E) times (similar to BFS). The inner loop has a decreaseKey() operation which
takes O(LogV) time. So the overall time complexity is O(E+V)*O(LogV), which is O((E+V)*LogV) = O(ELogV) for a connected graph, where V is O(E).
Note that the above code uses Binary Heap for Priority Queue implementation. Time complexity can be reduced to O(E + VLogV) using
Fibonacci Heap. The reason is that Fibonacci Heap takes O(1) amortized time for the decrease-key operation while Binary Heap takes O(Logn) time.
Notes:
1) The code calculates the shortest distances, but doesn't calculate the path information. We can create a parent array, update the parent array when
a distance is updated (like Prim's implementation) and use it to show the shortest path from the source to different vertices.
2) The code is for an undirected graph; the same dijkstra function can be used for directed graphs also.
3) The code finds shortest distances from the source to all vertices. If we are interested only in the shortest distance from the source to a single target, we can
break the while loop when the extracted minimum distance vertex is equal to the target (Step 3.a of algorithm).
4) Dijkstra's algorithm doesn't work for graphs with negative weight edges. For graphs with negative weight edges, the Bellman-Ford algorithm can be
used; we will soon be discussing it as a separate post.
References:
Introduction to Algorithms by Clifford Stein, Thomas H. Cormen, Charles E. Leiserson, Ronald L.
Algorithms by Sanjoy Dasgupta, Christos Papadimitriou, Umesh Vazirani

Job Sequencing Problem | Set 1 (Greedy Algorithm)


We are given an array of jobs, where every job has a deadline and an associated profit that is earned only if the job is finished on or before its deadline. It is also given that every job
takes a single unit of time, so the minimum possible deadline for any job is 1. How do we maximize the total profit if only one job can be scheduled at a time?
Examples:
Input: Four Jobs with following deadlines and profits
JobID   Deadline   Profit
a       4          20
b       1          10
c       1          40
d       1          30
Output: Following is maximum profit sequence of jobs
c, a
Input: Five Jobs with following deadlines and profits
JobID   Deadline   Profit
a       2          100
b       1          19
c       2          27
d       1          25
e       3          15
Output: Following is maximum profit sequence of jobs
c, a, e

A Simple Solution is to generate all subsets of given set of jobs and check individual subset for feasibility of jobs in that subset. Keep track of
maximum profit among all feasible subsets. The time complexity of this solution is exponential.
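The simple solution can be sketched as below, mainly as a reference to check the greedy answer on small inputs. It is a sketch under assumptions: the name bestProfitBruteForce is hypothetical, and feasibility of a subset is tested by sorting its deadlines, since a set of unit-time jobs can be scheduled exactly when the job with the t-th smallest deadline has a deadline of at least t.

```cpp
#include <algorithm>
#include <vector>

// A job as described in the problem statement
struct Job
{
    char id;    // Job Id
    int dead;   // Deadline of job
    int profit; // Profit if job is finished on or before its deadline
};

// Hypothetical helper: tries all 2^n subsets of jobs and returns the
// largest total profit of a feasible subset. Only usable for small n.
int bestProfitBruteForce(const std::vector<Job>& jobs)
{
    int n = static_cast<int>(jobs.size());
    int best = 0;
    for (int mask = 0; mask < (1 << n); mask++) {
        std::vector<int> deadlines;
        int profit = 0;
        for (int i = 0; i < n; i++)
            if (mask & (1 << i)) {
                deadlines.push_back(jobs[i].dead);
                profit += jobs[i].profit;
            }
        // Schedule the chosen jobs in order of deadline: the job placed
        // in time slot t (1-based) must have a deadline of at least t.
        std::sort(deadlines.begin(), deadlines.end());
        bool feasible = true;
        for (int t = 0; t < static_cast<int>(deadlines.size()); t++)
            if (deadlines[t] < t + 1)
                feasible = false;
        if (feasible)
            best = std::max(best, profit);
    }
    return best;
}
```

For the two examples above it returns 60 (jobs c and a) and 142 (jobs c, a and e).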
This is a standard Greedy Algorithm problem. Following is algorithm.
1) Sort all jobs in decreasing order of profit.
2) Initialize the result sequence as first job in sorted jobs.
3) Do following for remaining n-1 jobs
.......a) If the current job can fit in the current result sequence
without missing the deadline, add current job to the result.
Else ignore the current job.

Following is a C++ implementation of the above algorithm.


// Program to find the maximum profit job sequence from a given array
// of jobs with deadlines and profits
#include<iostream>
#include<algorithm>
using namespace std;
// A structure to represent a job
struct Job
{
char id; // Job Id
int dead; // Deadline of job
int profit; // Profit if job is over before or on deadline
};
// This function is used for sorting all jobs according to profit
bool comparison(Job a, Job b)
{
return (a.profit > b.profit);
}
// Prints the maximum profit sequence of jobs
void printJobScheduling(Job arr[], int n)
{
// Sort all jobs according to decreasing order of profit
sort(arr, arr+n, comparison);
int result[n]; // To store result (Sequence of jobs)
bool slot[n]; // To keep track of free time slots
// Initialize all slots to be free
for (int i=0; i<n; i++)
slot[i] = false;
// Iterate through all given jobs
for (int i=0; i<n; i++)
{
// Find a free slot for this job (Note that we start
// from the last possible slot)
for (int j=min(n, arr[i].dead)-1; j>=0; j--)
{
// Free slot found
if (slot[j]==false)
{
result[j] = i; // Add this job to result
slot[j] = true; // Make this slot occupied
break;
}
}
}
// Print the result
for (int i=0; i<n; i++)
if (slot[i])
cout << arr[result[i]].id << " ";
}
// Driver program to test methods
int main()
{
Job arr[5] = { {'a', 2, 100}, {'b', 1, 19}, {'c', 2, 27},
{'d', 1, 25}, {'e', 3, 15}};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "Following is maximum profit sequence of jobs\n";
printJobScheduling(arr, n);
return 0;
}

Output:
Following is maximum profit sequence of jobs
c a e

Time Complexity of the above solution is O(n^2). Apart from the initial sort, it can be optimized to almost O(n) by using a union-find data structure. We will soon be discussing
the optimized solution.
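One possible shape of the union-find idea is sketched below; it is not the optimized solution of the later post, and the names findSlot and maxProfitUnionFind are hypothetical. Here parent[t] leads to the latest free time slot that is not after t, and slot 0 acts as a sentinel meaning "no free slot". The sketch assumes every deadline is at least 1, as stated in the problem.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

struct Job
{
    char id;    // Job Id
    int dead;   // Deadline of job (assumed >= 1)
    int profit; // Profit if job is finished on or before its deadline
};

// Returns the latest free slot <= t, or 0 when no such slot is left.
// Path compression keeps later lookups short.
int findSlot(std::vector<int>& parent, int t)
{
    if (parent[t] == t)
        return t;
    return parent[t] = findSlot(parent, parent[t]);
}

// Hypothetical helper: returns the total profit of the greedy schedule
int maxProfitUnionFind(std::vector<Job> jobs)
{
    int n = static_cast<int>(jobs.size());
    // Greedy order: most profitable job first
    std::sort(jobs.begin(), jobs.end(),
              [](const Job& a, const Job& b) { return a.profit > b.profit; });

    // Slots 1..n are free initially; slot 0 is the sentinel
    std::vector<int> parent(n + 1);
    std::iota(parent.begin(), parent.end(), 0);

    int total = 0;
    for (const Job& job : jobs) {
        int s = findSlot(parent, std::min(n, job.dead));
        if (s > 0) {
            total += job.profit;
            parent[s] = s - 1; // slot s is taken: redirect it to the slot before
        }
    }
    return total;
}
```

The inner linear scan for a free slot is replaced by findSlot, so the sort becomes the dominant cost of this sketch.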
Sources:
http://ocw.mit.edu/courses/civil-and-environmental-engineering/1-204-computer-algorithms-in-systems-engineering-spring-2010/lecturenotes/MIT1_204S10_lec10.pdf

Greedy Algorithm to find Minimum number of Coins


Given a value V, if we want to make change for V Rs, and we have infinite supply of each of the denominations in Indian currency, i.e., we have
infinite supply of { 1, 2, 5, 10, 20, 50, 100, 500, 1000} valued coins/notes, what is the minimum number of coins and/or notes needed to make
the change?
Examples:
Input: V = 70
Output: 2
We need a 50 Rs note and a 20 Rs note.
Input: V = 121
Output: 3
We need a 100 Rs note, a 20 Rs note and a
1 Rs coin.
1) Initialize result as empty.
2) Find the largest denomination that is
smaller than or equal to V.
3) Add found denomination to result. Subtract
value of found denomination from V.
4) If V becomes 0, then print result.
Else repeat steps 2 and 3 for new value of V

Below is a C++ implementation of the above algorithm.


// C++ program to find minimum number of denominations
#include <bits/stdc++.h>
using namespace std;
// All denominations of Indian Currency
int deno[] = {1, 2, 5, 10, 20, 50, 100, 500, 1000};
int n = sizeof(deno)/sizeof(deno[0]);
// Prints the coins/notes picked greedily to make change for V
void findMin(int V)
{
// Initialize result
vector<int> ans;
// Traverse through all denomination
for (int i=n-1; i>=0; i--)
{
// Find denominations
while (V >= deno[i])
{
V -= deno[i];
ans.push_back(deno[i]);
}
}
// Print result
for (int i = 0; i < ans.size(); i++)
cout << ans[i] << " ";
}
// Driver program
int main()
{
int n = 93;
cout << "Following is minimal number of change for " << n << " is ";
findMin(n);
return 0;
}

Output:
Following is minimal number of change for 93 is 50 20 20 2 1

Note that the above approach may not work for all denominations. For example, it doesn't work for denominations {9, 6, 5, 1} and V = 11. The
above approach would print 9, 1 and 1. But we can use just 2 coins, 5 and 6.
For general input, we use the dynamic programming approach described in the post
Find minimum number of coins that make a given value
Thanks to Utkarsh for providing the above solution.
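The counterexample can be checked with a short sketch that counts coins both ways. The function names greedyCoinCount and dpCoinCount are hypothetical, and the dynamic programming part is only a minimal table-based version written for this comparison, not the code of the referenced post.

```cpp
#include <algorithm>
#include <climits>
#include <functional>
#include <vector>

// Greedy: repeatedly take the largest denomination that still fits.
// Returns the number of coins used, or -1 if V cannot be reached this way.
int greedyCoinCount(std::vector<int> deno, int V)
{
    std::sort(deno.begin(), deno.end(), std::greater<int>());
    int count = 0;
    for (int d : deno)
        while (V >= d) {
            V -= d;
            count++;
        }
    return V == 0 ? count : -1;
}

// Dynamic programming: best[v] is the minimum number of coins that
// sum to v (INT_MAX while v is not known to be reachable).
int dpCoinCount(const std::vector<int>& deno, int V)
{
    std::vector<int> best(V + 1, INT_MAX);
    best[0] = 0;
    for (int v = 1; v <= V; v++)
        for (int d : deno)
            if (d <= v && best[v - d] != INT_MAX)
                best[v] = std::min(best[v], best[v - d] + 1);
    return best[V] == INT_MAX ? -1 : best[V];
}
```

With denominations {9, 6, 5, 1} and V = 11 the greedy count is 3 while the table gives 2; for the Indian denominations and V = 93 both give 5.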

K Centers Problem | Set 1 (Greedy Approximate Algorithm)


Given n cities and distances between every pair of cities, select k cities to place warehouses (or ATMs) such that the maximum distance of a city
to a warehouse (or ATM) is minimized.
For example, consider the following four cities, 0, 1, 2 and 3, and the distances between them. How do we place 2 ATMs among these 4 cities so that the
maximum distance of a city to an ATM is minimized?

No polynomial time solution is known for this problem, as the problem is NP-Hard. There is a polynomial time Greedy
approximate algorithm; the greedy algorithm provides a solution which is never worse than twice the optimal solution. The greedy solution works
only if the distances between cities follow the Triangle Inequality (the distance between two points is never greater than the sum of distances through a
third point).
The 2-Approximate Greedy Algorithm:
1) Choose the first center arbitrarily.
2) Choose remaining k-1 centers using the following criteria.
Let c1, c2, c3, ..., ci be the already chosen centers. Choose
(i+1)th center by picking the city which is farthest from already
selected centers, i.e., the point p which has following value as maximum
Min[dist(p, c1), dist(p, c2), dist(p, c3), ..., dist(p, ci)]
The following diagram illustrates the above algorithm.

Example (k = 3 in the above shown Graph)


a) Let the first arbitrarily picked vertex be 0.
b) The next vertex is 1 because 1 is the farthest vertex from 0.
c) Remaining cities are 2 and 3. Calculate their distances from already selected centers (0 and 1). The greedy algorithm basically calculates
following values.
Minimum of all distances from 2 to already considered centers
Min[dist(2, 0), dist(2, 1)] = Min[7, 8] = 7
Minimum of all distances from 3 to already considered centers
Min[dist(3, 0), dist(3, 1)] = Min[6, 5] = 5
After computing the above values, the city 2 is picked as the value corresponding to 2 is maximum.
Note that the greedy algorithm doesn't give the best solution for k = 2, as this is just an approximate algorithm with a bound of twice the optimal.
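The greedy selection can be sketched as follows. This is an illustration under assumptions: the function name greedyKCenters is hypothetical, and the distance matrix in the usage note takes dist(0,2) = 7, dist(1,2) = 8, dist(0,3) = 6 and dist(1,3) = 5 from the example, while dist(0,1) = 10 and dist(2,3) = 12 are assumed values chosen so that city 1 is the farthest from city 0.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Hypothetical helper: picks k centers greedily, starting from city
// `first`, and returns them in the order in which they were chosen.
std::vector<int> greedyKCenters(const std::vector<std::vector<int>>& dist,
                                int k, int first)
{
    int n = static_cast<int>(dist.size());
    std::vector<int> centers;
    // nearest[p] = distance from city p to its closest chosen center
    std::vector<int> nearest(n, INT_MAX);
    int next = first;
    for (int c = 0; c < k && c < n; c++) {
        centers.push_back(next);
        for (int p = 0; p < n; p++)
            nearest[p] = std::min(nearest[p], dist[next][p]);
        // The next center is the city farthest from all chosen centers,
        // i.e. the one maximizing Min[dist(p, c1), ..., dist(p, ci)]
        next = static_cast<int>(
            std::max_element(nearest.begin(), nearest.end()) - nearest.begin());
    }
    return centers;
}
```

With the matrix {{0, 10, 7, 6}, {10, 0, 8, 5}, {7, 8, 0, 12}, {6, 5, 12, 0}}, k = 3 and first = 0, the centers come out as 0, 1, 2, as in the walk-through above.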
Proof that the above greedy algorithm is 2-approximate.
Let OPT be the maximum distance of a city from a center in the Optimal solution. We need to show that the maximum distance obtained from
the Greedy algorithm is at most 2*OPT.
The proof can be done using contradiction.
a) Assume that the distance from the furthest point to all centers is > 2*OPT.
b) This means that distances between all centers are also > 2*OPT (each center was the farthest point at the time it was picked).
c) We have k + 1 points with distances > 2*OPT between every pair.
d) Each point has a center of the optimal solution with distance <= OPT to it.
e) There exists a pair of points with the same center X in the optimal solution (pigeonhole principle: k optimal centers, k+1 points)
f) The distance between them is at most 2*OPT (triangle inequality) which is a contradiction.
Source:
http://algo2.iti.kit.edu/vanstee/courses/kcenter.pdf

Dynamic Programming | Set 1 (Overlapping Subproblems Property)


Dynamic Programming is an algorithmic paradigm that solves a given complex problem by breaking it into subproblems and stores the results of
subproblems to avoid computing the same results again. Following are the two main properties of a problem that suggest that the given problem
can be solved using Dynamic programming.
1) Overlapping Subproblems
2) Optimal Substructure
1) Overlapping Subproblems:
Like Divide and Conquer, Dynamic Programming combines solutions to sub-problems. Dynamic Programming is mainly used when solutions of
the same subproblems are needed again and again. In dynamic programming, computed solutions to subproblems are stored in a table so that these
don't have to be recomputed. So Dynamic Programming is not useful when there are no common (overlapping) subproblems because there is no point
storing the solutions if they are not needed again. For example, Binary Search doesn't have common subproblems. If we take the example of the following
recursive program for Fibonacci Numbers, there are many subproblems which are solved again and again.
/* simple recursive program for Fibonacci numbers */
int fib(int n)
{
if ( n <= 1 )
return n;
return fib(n-1) + fib(n-2);
}

Recursion tree for execution of fib(5)

                          fib(5)
                      /            \
                fib(4)              fib(3)
              /       \            /      \
         fib(3)      fib(2)     fib(2)    fib(1)
        /     \      /    \     /    \
   fib(2)  fib(1) fib(1) fib(0) fib(1) fib(0)
   /    \
fib(1) fib(0)

We can see that the function fib(3) is being called 2 times. If we had stored the value of fib(3), then instead of computing it again, we could
have reused the old stored value. There are following two different ways to store the values so that these values can be reused.
a) Memoization (Top Down):
b) Tabulation (Bottom Up):
a) Memoization (Top Down): The memoized program for a problem is similar to the recursive version with a small modification that it looks into a
lookup table before computing solutions. We initialize a lookup array with all initial values as NIL. Whenever we need solution to a subproblem,
we first look into the lookup table. If the precomputed value is there then we return that value, otherwise we calculate the value and put the result in
lookup table so that it can be reused later.
Following is the memoized version for nth Fibonacci Number.
/* Memoized version for nth Fibonacci number */
#include<stdio.h>
#define NIL -1
#define MAX 100
int lookup[MAX];
/* Function to initialize NIL values in lookup table */
void _initialize()
{
int i;
for (i = 0; i < MAX; i++)
lookup[i] = NIL;
}
/* function for nth Fibonacci number */
int fib(int n)
{
if (lookup[n] == NIL)
{
if ( n <= 1 )
lookup[n] = n;
else
lookup[n] = fib(n-1) + fib(n-2);
}
return lookup[n];
}
int main ()
{
int n = 40;
_initialize();
printf("Fibonacci number is %d ", fib(n));
getchar();
return 0;
}

b) Tabulation (Bottom Up): The tabulated program for a given problem builds a table in bottom up fashion and returns the last entry from table.
/* tabulated version */
#include<stdio.h>
int fib(int n)
{
int f[n+2]; /* one extra entry so that f[1] exists even when n is 0 */
int i;
f[0] = 0; f[1] = 1;
for (i = 2; i <= n; i++)
f[i] = f[i-1] + f[i-2];
return f[n];
}
int main ()
{
int n = 9;
printf("Fibonacci number is %d ", fib(n));
getchar();
return 0;
}

Both the tabulated and the memoized versions store the solutions of subproblems. In the memoized version, the table is filled on demand while in the tabulated version,
starting from the first entry, all entries are filled one by one. Unlike the tabulated version, all entries of the lookup table are not necessarily filled in
the memoized version. For example, a memoized solution of the LCS problem doesn't necessarily fill all entries.
To see the optimization achieved by the memoized and tabulated versions over the basic recursive version, compare the time taken by the following
three programs for the 40th Fibonacci number: the simple recursive program, the memoized version and the tabulated version.
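Instead of wall-clock time, the saving can also be seen by counting function calls. The sketch below is an assumption-laden illustration: the counters naiveCalls and memoCalls and the functions fibNaive and fibMemo are hypothetical names, and the lookup table is passed in as a std::vector filled with -1 in place of the global array used above.

```cpp
#include <vector>

long long naiveCalls = 0; // number of calls made by fibNaive
long long memoCalls = 0;  // number of calls made by fibMemo

// Simple recursive version, instrumented with a call counter
long long fibNaive(int n)
{
    naiveCalls++;
    if (n <= 1)
        return n;
    return fibNaive(n - 1) + fibNaive(n - 2);
}

// Memoized version: lookup must have at least n+1 entries, all set to -1
long long fibMemo(int n, std::vector<long long>& lookup)
{
    memoCalls++;
    if (lookup[n] == -1)
        lookup[n] = (n <= 1) ? n : fibMemo(n - 1, lookup) + fibMemo(n - 2, lookup);
    return lookup[n];
}
```

For n = 20 both functions return 6765, but fibNaive is entered 21891 times while fibMemo is entered only 39 times, because every value is computed once and then read from the table.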
Also see method 2 of Ugly Number post for one more simple example where we have overlapping subproblems and we store the results of
subproblems.
We will be covering Optimal Substructure Property and some more example problems in future posts on Dynamic Programming.
Try following questions as an exercise of this post.
1) Write a memoized version for LCS problem. Note that the tabular version is given in the CLRS book.
2) How would you choose between Memoization and Tabulation?
References:
http://www.youtube.com/watch?v=V5hZoJ6uK-s

Dynamic Programming | Set 2 (Optimal Substructure Property)


As we discussed in Set 1, following are the two main properties of a problem that suggest that the given problem can be solved using Dynamic
programming.
1) Overlapping Subproblems
2) Optimal Substructure
We have already discussed Overlapping Subproblem property in the Set 1. Let us discuss Optimal Substructure property here.
2) Optimal Substructure: A given problem has the Optimal Substructure Property if an optimal solution of the given problem can be obtained by using
optimal solutions of its subproblems.
For example, the shortest path problem has the following optimal substructure property: If a node x lies in the shortest path from a source node u to
destination node v, then the shortest path from u to v is a combination of the shortest path from u to x and the shortest path from x to v. Standard shortest
path algorithms like Floyd-Warshall (all pairs) and Bellman-Ford (single source) are typical examples of Dynamic Programming.
On the other hand, the Longest path problem doesn't have the Optimal Substructure property. Here by Longest Path we mean the longest simple path
(path without cycle) between two nodes. Consider the following unweighted graph given in the CLRS book. There are two longest paths from q to
t: q->r->t and q->s->t. Unlike shortest paths, these longest paths do not have the optimal substructure property. For example, the longest path
q->r->t is not a combination of the longest path from q to r and the longest path from r to t, because the longest path from q to r is q->s->t->r and the longest path from r to t is r->q->s->t.
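The property can be seen directly in the update rule of Floyd-Warshall, sketched below under assumptions: the function name allPairsShortestPaths and the constant INF are hypothetical, and the input matrix is expected to hold edge weights, 0 on the diagonal and INF where there is no edge. Each step asks whether going through an intermediate node x gives a shorter u-to-v path than the best one known so far.

```cpp
#include <vector>

// "No edge" marker; small enough that INF + INF still fits in an int
const int INF = 1000000000;

// Hypothetical helper: turns a matrix of direct edge weights into a
// matrix of shortest path lengths (Floyd-Warshall, O(n^3)).
void allPairsShortestPaths(std::vector<std::vector<int>>& dist)
{
    int n = static_cast<int>(dist.size());
    for (int x = 0; x < n; x++)         // intermediate node
        for (int u = 0; u < n; u++)     // source
            for (int v = 0; v < n; v++) // destination
                // Shortest u->v through x = shortest u->x + shortest x->v
                if (dist[u][x] + dist[x][v] < dist[u][v])
                    dist[u][v] = dist[u][x] + dist[x][v];
}
```

For a three-node graph with edges 0->1 of weight 4, 1->2 of weight 3 and 0->2 of weight 10, the entry for (0, 2) drops from 10 to 7 because the path through node 1 combines two shortest sub-paths.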

We will be covering some example problems in future posts on Dynamic Programming.


References:
http://en.wikipedia.org/wiki/Optimal_substructure
CLRS book

Dynamic Programming | Set 3 (Longest Increasing Subsequence)


We have discussed Overlapping Subproblems and Optimal Substructure properties in Set 1 and Set 2 respectively.
Let us discuss Longest Increasing Subsequence (LIS) problem as an example problem that can be solved using Dynamic Programming.
The Longest Increasing Subsequence (LIS) problem is to find the length of the longest subsequence of a given sequence such that all elements of the
subsequence are sorted in increasing order. For example, the length of the LIS for { 10, 22, 9, 33, 21, 50, 41, 60, 80 } is 6 and the LIS is {10, 22, 33, 50,
60, 80}.
Optimal Substructure:
Let arr[0..n-1] be the input array and L(i) be the length of the LIS ending at index i, i.e., the longest increasing subsequence whose last element is
arr[i]. Then L(i) can be written recursively as:
L(i) = 1 + max( L(j) ) where 0 <= j < i and arr[j] < arr[i]; if there is no such j then L(i) = 1
To get the LIS of a given array, we need to return max(L(i)) where 0 <= i < n.
So the LIS problem has the optimal substructure property as the main problem can be solved using solutions to subproblems.
Overlapping Subproblems:
Following is a simple recursive implementation of the LIS problem. The implementation simply follows the recursive structure mentioned above. The
length of the LIS ending with every element is returned using max_ending_here. The overall LIS length is returned using a pointer to a variable max.

C/C++
/* A Naive C/C++ recursive implementation of LIS problem */
#include<stdio.h>
#include<stdlib.h>

/* To make use of recursive calls, this function must return
   two things:
   1) Length of LIS ending with element arr[n-1]. We use
      max_ending_here for this purpose.
   2) Overall maximum, as the LIS may end with an element
      before arr[n-1]. max_ref is used for this purpose.
   The value of LIS of full array of size n is stored in
   *max_ref which is our final result */
int _lis( int arr[], int n, int *max_ref)
{
    /* Base case */
    if (n == 1)
        return 1;

    // 'max_ending_here' is length of LIS ending with arr[n-1]
    int res, max_ending_here = 1;

    /* Recursively get all LIS ending with arr[0], arr[1] ...
       arr[n-2]. If arr[i-1] is smaller than arr[n-1], and
       max ending with arr[n-1] needs to be updated, then
       update it */
    for (int i = 1; i < n; i++)
    {
        res = _lis(arr, i, max_ref);
        if (arr[i-1] < arr[n-1] && res + 1 > max_ending_here)
            max_ending_here = res + 1;
    }

    // Compare max_ending_here with the overall max. And
    // update the overall max if needed
    if (*max_ref < max_ending_here)
        *max_ref = max_ending_here;

    // Return length of LIS ending with arr[n-1]
    return max_ending_here;
}

// The wrapper function for _lis()
int lis(int arr[], int n)
{
    // The max variable holds the result
    int max = 1;

    // The function _lis() stores its result in max
    _lis( arr, n, &max );

    // returns max
    return max;
}

/* Driver program to test above function */
int main()
{
    int arr[] = { 10, 22, 9, 33, 21, 50, 41, 60 };
    int n = sizeof(arr)/sizeof(arr[0]);
    printf("Length of LIS is %d\n", lis( arr, n ));
    return 0;
}

Python
# A naive Python implementation of LIS problem

# Global variable to store the overall maximum
maximum = 1


def _lis(arr, n):
    """To make use of recursive calls, this function must produce
    two things:
    1) Length of LIS ending with element arr[n-1]. We use
       maxEndingHere for this purpose.
    2) Overall maximum, as the LIS may end with an element
       before arr[n-1]. The global variable maximum is used
       for this purpose.
    The value of LIS of the full array of size n is stored in
    maximum, which is our final result."""
    # to allow the access of global variable
    global maximum

    # Base Case
    if n == 1:
        return 1

    # maxEndingHere is the length of LIS ending with arr[n-1]
    maxEndingHere = 1

    # Recursively get all LIS ending with arr[0], arr[1]..arr[n-2].
    # If arr[i-1] is smaller than arr[n-1], and max ending with
    # arr[n-1] needs to be updated, then update it
    for i in range(1, n):
        res = _lis(arr, i)
        if arr[i-1] < arr[n-1] and res + 1 > maxEndingHere:
            maxEndingHere = res + 1

    # Compare maxEndingHere with overall maximum and update
    # the overall maximum if needed
    maximum = max(maximum, maxEndingHere)
    return maxEndingHere


def lis(arr):
    # to allow the access of global variable
    global maximum
    # length of arr
    n = len(arr)
    # maximum variable holds the result
    maximum = 1
    # The function _lis() stores its result in maximum
    _lis(arr, n)
    return maximum


# Driver program to test the above function
arr = [10, 22, 9, 33, 21, 41, 60]
print("Length of LIS is", lis(arr))
# This code is contributed by NIKHIL KUMAR SINGH

Recursion tree of the above implementation for an array of size 4:

                 lis(4)
            /      |      \
       lis(3)    lis(2)    lis(1)
       /    \      |
  lis(2)  lis(1) lis(1)
    |
  lis(1)

We can see that there are many subproblems which are solved again and again. So this problem has the Overlapping Subproblems property and
recomputation of the same subproblems can be avoided by using either Memoization or Tabulation. Following is a tabulated implementation for the
LIS problem.

C/C++
/* Dynamic Programming C/C++ implementation of LIS problem */
#include<stdio.h>
#include<stdlib.h>

/* lis() returns the length of the longest increasing
   subsequence in arr[] of size n */
int lis( int arr[], int n )
{
    int *lis, i, j, max = 0;
    lis = (int*) malloc ( sizeof( int ) * n );

    /* Initialize LIS values for all indexes */
    for ( i = 0; i < n; i++ )
        lis[i] = 1;

    /* Compute optimized LIS values in bottom up manner */
    for ( i = 1; i < n; i++ )
        for ( j = 0; j < i; j++ )
            if ( arr[i] > arr[j] && lis[i] < lis[j] + 1)
                lis[i] = lis[j] + 1;

    /* Pick maximum of all LIS values */
    for ( i = 0; i < n; i++ )
        if ( max < lis[i] )
            max = lis[i];

    /* Free memory to avoid memory leak */
    free( lis );

    return max;
}

/* Driver program to test above function */
int main()
{
    int arr[] = { 10, 22, 9, 33, 21, 50, 41, 60 };
    int n = sizeof(arr)/sizeof(arr[0]);
    printf("Length of LIS is %d\n", lis( arr, n ) );
    return 0;
}

Python
# Dynamic programming Python implementation of LIS problem

# lis returns length of the longest increasing subsequence
# in arr of size n
def lis(arr):
    n = len(arr)
    # Declare the list (array) for LIS and initialize LIS
    # values for all indexes
    lis = [1]*n
    # Compute optimized LIS values in bottom up manner
    for i in range(1, n):
        for j in range(0, i):
            if arr[i] > arr[j] and lis[i] < lis[j] + 1:
                lis[i] = lis[j] + 1
    # Initialize maximum to 0 to get the maximum of all
    # LIS
    maximum = 0
    # Pick maximum of all LIS values
    for i in range(n):
        maximum = max(maximum, lis[i])
    return maximum
# end of lis function

# Driver program to test above function
arr = [10, 22, 9, 33, 21, 50, 41, 60]
print("Length of LIS is", lis(arr))
# This code is contributed by Nikhil Kumar Singh

Length of LIS is 5

Note that the time complexity of the above Dynamic Programming (DP) solution is O(n^2) and there is an O(n log n) solution for the LIS problem.
We have not discussed the O(n log n) solution here as the purpose of this post is to explain Dynamic Programming with a simple example. See the
post below for the O(n log n) solution.
Longest Increasing Subsequence Size (N log N)
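For orientation, here is a minimal Python sketch of one common O(n log n) approach, which keeps the smallest possible tail value for each subsequence length and updates it with a binary search. It is not necessarily the method used in the linked post, and the names lis_length and tails are illustrative.

```python
from bisect import bisect_left


def lis_length(arr):
    """Length of the longest strictly increasing subsequence of arr."""
    # tails[k] is the smallest value that can end an increasing
    # subsequence of length k + 1 among the elements seen so far.
    tails = []
    for x in arr:
        pos = bisect_left(tails, x)   # first tail that is >= x
        if pos == len(tails):
            tails.append(x)           # x extends the longest subsequence
        else:
            tails[pos] = x            # x gives a smaller tail for length pos + 1
    return len(tails)


print(lis_length([10, 22, 9, 33, 21, 50, 41, 60]))
```

The list tails is always sorted, which is what makes the binary search valid. Note that tails itself is generally not an actual LIS; only its length is meaningful.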

Dynamic Programming | Set 4 (Longest Common Subsequence)


We have discussed Overlapping Subproblems and Optimal Substructure properties in Set 1 and Set 2 respectively. We also discussed one
example problem in Set 3. Let us discuss Longest Common Subsequence (LCS) problem as one more example problem that can be solved using
Dynamic Programming.
LCS Problem Statement: Given two sequences, find the length of the longest subsequence present in both of them. A subsequence is a sequence that
appears in the same relative order, but not necessarily contiguously. For example, "abc", "abg", "bdf", "aeg", "acefg", etc. are subsequences of
"abcdefg". So a string of length n has up to 2^n different possible subsequences.
It is a classic computer science problem, the basis of diff (a file comparison program that outputs the differences between two files), and has
applications in bioinformatics.
Examples:
LCS for input Sequences ABCDGH and AEDFHR is ADH of length 3.
LCS for input Sequences AGGTAB and GXTXAYB is GTAB of length 4.
The naive solution for this problem is to generate all subsequences of both given sequences and find the longest matching subsequence. This
solution is exponential in terms of time complexity. Let us see how this problem possesses both important properties of a Dynamic Programming
(DP) problem.
1) Optimal Substructure:
Let the input sequences be X[0..m-1] and Y[0..n-1] of lengths m and n respectively, and let L(X[0..m-1], Y[0..n-1]) be the length of the LCS of the
two sequences X and Y. Following is the recursive definition of L(X[0..m-1], Y[0..n-1]).
If the last characters of both sequences match (X[m-1] == Y[n-1]) then
L(X[0..m-1], Y[0..n-1]) = 1 + L(X[0..m-2], Y[0..n-2])
If the last characters of both sequences do not match (X[m-1] != Y[n-1]) then
L(X[0..m-1], Y[0..n-1]) = MAX ( L(X[0..m-2], Y[0..n-1]), L(X[0..m-1], Y[0..n-2]) )
Examples:
1) Consider the input strings AGGTAB and GXTXAYB. Last characters match for the strings. So length of LCS can be written as:
L(AGGTAB, GXTXAYB) = 1 + L(AGGTA, GXTXAY)
2) Consider the input strings ABCDGH and AEDFHR. Last characters do not match for the strings. So length of LCS can be written as:
L(ABCDGH, AEDFHR) = MAX ( L(ABCDG, AEDFHR), L(ABCDGH, AEDFH) )
So the LCS problem has optimal substructure property as the main problem can be solved using solutions to subproblems.
2) Overlapping Subproblems:
Following is a simple recursive implementation of the LCS problem. The implementation simply follows the recursive structure mentioned above.

C/C++
/* A Naive recursive implementation of LCS problem */
#include<stdio.h>
#include<stdlib.h>
#include<string.h>

int max(int a, int b);

/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n )
{
    if (m == 0 || n == 0)
        return 0;
    if (X[m-1] == Y[n-1])
        return 1 + lcs(X, Y, m-1, n-1);
    else
        return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n));
}

/* Utility function to get max of 2 integers */
int max(int a, int b)
{
    return (a > b)? a : b;
}

/* Driver program to test above function */
int main()
{
    char X[] = "AGGTAB";
    char Y[] = "GXTXAYB";
    int m = strlen(X);
    int n = strlen(Y);
    printf("Length of LCS is %d\n", lcs( X, Y, m, n ) );
    return 0;
}

Python
# A Naive recursive Python implementation of LCS problem
def lcs(X, Y, m, n):
    if m == 0 or n == 0:
        return 0
    elif X[m-1] == Y[n-1]:
        return 1 + lcs(X, Y, m-1, n-1)
    else:
        return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n))

# Driver program to test the above function
X = "AGGTAB"
Y = "GXTXAYB"
print("Length of LCS is", lcs(X, Y, len(X), len(Y)))

Length of LCS is 4

Time complexity of the above naive recursive approach is O(2^n) in worst case and worst case happens when all characters of X and Y mismatch
i.e., length of LCS is 0.
Considering the above implementation, following is a partial recursion tree for input strings "AXYT" and "AYZX":

                          lcs("AXYT", "AYZX")
                         /                   \
          lcs("AXY", "AYZX")              lcs("AXYT", "AYZ")
           /            \                   /             \
lcs("AX", "AYZX")  lcs("AXY", "AYZ")  lcs("AXY", "AYZ")  lcs("AXYT", "AY")

In the above partial recursion tree, lcs("AXY", "AYZ") is being solved twice. If we draw the complete recursion tree, then we can see that there are
many subproblems which are solved again and again. So this problem has the Overlapping Subproblems property and recomputation of the same
subproblems can be avoided by using either Memoization or Tabulation. Following is a tabulated implementation for the LCS problem.

C/C++
/* Dynamic Programming C/C++ implementation of LCS problem */
#include<stdio.h>
#include<stdlib.h>
#include<string.h>   /* for strlen() */

int max(int a, int b);

/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n )
{
    int L[m+1][n+1];
    int i, j;

    /* Following steps build L[m+1][n+1] in bottom up fashion. Note
       that L[i][j] contains length of LCS of X[0..i-1] and Y[0..j-1] */
    for (i=0; i<=m; i++)
    {
        for (j=0; j<=n; j++)
        {
            if (i == 0 || j == 0)
                L[i][j] = 0;
            else if (X[i-1] == Y[j-1])
                L[i][j] = L[i-1][j-1] + 1;
            else
                L[i][j] = max(L[i-1][j], L[i][j-1]);
        }
    }

    /* L[m][n] contains length of LCS for X[0..m-1] and Y[0..n-1] */
    return L[m][n];
}

/* Utility function to get max of 2 integers */
int max(int a, int b)
{
    return (a > b)? a : b;
}

/* Driver program to test above function */
int main()
{
    char X[] = "AGGTAB";
    char Y[] = "GXTXAYB";
    int m = strlen(X);
    int n = strlen(Y);
    printf("Length of LCS is %d\n", lcs( X, Y, m, n ) );
    return 0;
}

Python
# Dynamic Programming implementation of LCS problem
def lcs(X, Y):
    # find the length of the strings
    m = len(X)
    n = len(Y)
    # declaring the array for storing the dp values
    L = [[None]*(n+1) for i in range(m+1)]
    # Following steps build L[m+1][n+1] in bottom up fashion
    # Note: L[i][j] contains length of LCS of X[0..i-1]
    # and Y[0..j-1]
    for i in range(m+1):
        for j in range(n+1):
            if i == 0 or j == 0:
                L[i][j] = 0
            elif X[i-1] == Y[j-1]:
                L[i][j] = L[i-1][j-1] + 1
            else:
                L[i][j] = max(L[i-1][j], L[i][j-1])
    # L[m][n] contains the length of LCS of X[0..m-1] & Y[0..n-1]
    return L[m][n]
# end of function lcs

# Driver program to test the above function
X = "AGGTAB"
Y = "GXTXAYB"
print("Length of LCS is", lcs(X, Y))
# This code is contributed by Nikhil Kumar Singh(nickzuck_007)
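Memoization, the other option mentioned above, keeps the top-down recursion and caches each (m, n) result so that it is computed only once. A minimal Python sketch, assuming the standard library's functools.lru_cache is used as the cache; the names lcs_memo and L are illustrative.

```python
from functools import lru_cache


def lcs_memo(X, Y):
    """Length of the LCS of X and Y, computed top-down with caching."""

    @lru_cache(maxsize=None)
    def L(m, n):
        # L(m, n) is the LCS length of the prefixes X[0..m-1] and Y[0..n-1]
        if m == 0 or n == 0:
            return 0
        if X[m-1] == Y[n-1]:
            return 1 + L(m-1, n-1)
        return max(L(m, n-1), L(m-1, n))

    return L(len(X), len(Y))


print(lcs_memo("AGGTAB", "GXTXAYB"))
```

There are at most (m+1)*(n+1) distinct (m, n) pairs, so the running time drops to O(mn). For very long strings the recursion depth can become a practical limit in Python, which is one reason to prefer the tabulated version.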

The above algorithm/code returns only length of LCS. Please see the following post for printing the LCS.
Printing Longest Common Subsequence
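As a rough idea of how the subsequence itself can be recovered, the sketch below fills the same table L and then walks back from L[m][n] to L[0][0]. This is one possible approach and not necessarily the one used in the linked post; the name lcs_string is illustrative.

```python
def lcs_string(X, Y):
    """Return one longest common subsequence of X and Y as a string."""
    m, n = len(X), len(Y)
    # L[i][j] = length of LCS of X[0..i-1] and Y[0..j-1]
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i-1] == Y[j-1]:
                L[i][j] = L[i-1][j-1] + 1
            else:
                L[i][j] = max(L[i-1][j], L[i][j-1])

    # Walk back from the bottom-right corner, collecting matched characters
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if X[i-1] == Y[j-1]:
            out.append(X[i-1])
            i -= 1
            j -= 1
        elif L[i-1][j] >= L[i][j-1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs_string("AGGTAB", "GXTXAYB"))
```

When several subsequences share the maximum length, the tie-breaking rule in the walk-back decides which one is returned.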
References:
http://www.youtube.com/watch?v=V5hZoJ6uK-s
http://www.algorithmist.com/index.php/Longest_Common_Subsequence
http://www.ics.uci.edu/~eppstein/161/960229.html
http://en.wikipedia.org/wiki/Longest_common_subsequence_problem

Dynamic Programming | Set 5 (Edit Distance)


Given two strings str1 and str2 and the below operations that can be performed on str1, find the minimum number of edits (operations) required to
convert str1 into str2.
a. Insert
b. Remove
c. Replace
All of the above operations are of equal cost.
Examples:
Input: str1 = "geek", str2 = "gesek"
Output: 1
We can convert str1 into str2 by inserting a 's'.
Input: str1 = "cat", str2 = "cut"
Output: 1
We can convert str1 into str2 by replacing 'a' with 'u'.
Input: str1 = "sunday", str2 = "saturday"
Output: 3
Last three and first characters are same. We basically
need to convert "un" to "atur". This can be done using
below three operations.
Replace 'n' with 'r', insert t, insert a

What are the subproblems in this case?


The idea is to process all characters one by one, starting from either the left or the right side of both strings.
Let us traverse from the right end; there are two possibilities for every pair of characters being traversed.
m: Length of str1 (first string)
n: Length of str2 (second string)

1. If the last characters of the two strings are the same, there is nothing much to do. Ignore the last characters and get the count for the remaining
strings. So we recur for lengths m-1 and n-1.
2. Else (if the last characters are not the same), we consider all three operations on the last character of the first string, recursively compute the
minimum cost for all three operations and take the minimum of the three values.
a. Insert: Recur for m and n-1
b. Remove: Recur for m-1 and n
c. Replace: Recur for m-1 and n-1
Below are C++ and Python implementations of the above naive recursive solution.

C++
// A Naive recursive C++ program to find minimum number
// operations to convert str1 to str2
#include<bits/stdc++.h>
using namespace std;

// Utility function to find minimum of three numbers
int min(int x, int y, int z)
{
    return min(min(x, y), z);
}

int editDist(string str1, string str2, int m, int n)
{
    // If first string is empty, the only option is to
    // insert all characters of second string into first
    if (m == 0) return n;

    // If second string is empty, the only option is to
    // remove all characters of first string
    if (n == 0) return m;

    // If last characters of two strings are same, nothing
    // much to do. Ignore last characters and get count for
    // remaining strings.
    if (str1[m-1] == str2[n-1])
        return editDist(str1, str2, m-1, n-1);

    // If last characters are not same, consider all three
    // operations on last character of first string, recursively
    // compute minimum cost for all three operations and take
    // minimum of three values.
    return 1 + min ( editDist(str1, str2, m, n-1),    // Insert
                     editDist(str1, str2, m-1, n),    // Remove
                     editDist(str1, str2, m-1, n-1)   // Replace
                   );
}

// Driver program
int main()
{
    string str1 = "sunday";
    string str2 = "saturday";
    cout << editDist( str1 , str2 , str1.length(), str2.length());
    return 0;
}

Python
# A Naive recursive Python program to find minimum number
# operations to convert str1 to str2
def editDistance(str1, str2, m, n):
    # If first string is empty, the only option is to
    # insert all characters of second string into first
    if m == 0:
        return n
    # If second string is empty, the only option is to
    # remove all characters of first string
    if n == 0:
        return m
    # If last characters of two strings are same, nothing
    # much to do. Ignore last characters and get count for
    # remaining strings.
    if str1[m-1] == str2[n-1]:
        return editDistance(str1, str2, m-1, n-1)
    # If last characters are not same, consider all three
    # operations on last character of first string, recursively
    # compute minimum cost for all three operations and take
    # minimum of three values.
    return 1 + min(editDistance(str1, str2, m, n-1),    # Insert
                   editDistance(str1, str2, m-1, n),    # Remove
                   editDistance(str1, str2, m-1, n-1)   # Replace
                   )

# Driver program to test the above function
str1 = "sunday"
str2 = "saturday"
print(editDistance(str1, str2, len(str1), len(str2)))
# This code is contributed by Bhavya Jain

The time complexity of the above solution is exponential. In the worst case, we may end up doing O(3^m) operations. The worst case happens when none
of the characters of the two strings match. Below is a recursive call diagram for the worst case.

We can see that many subproblems are solved again and again, for example eD(2,2) is called three times. Since the same subproblems are called again,
this problem has the Overlapping Subproblems property. So the Edit Distance problem has both properties (Overlapping Subproblems and Optimal
Substructure, see Set 1 and Set 2) of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems, recomputations of the
same subproblems can be avoided by constructing a temporary array that stores results of subproblems.

C++
// A Dynamic Programming based C++ program to find minimum
// number operations to convert str1 to str2
#include<bits/stdc++.h>
using namespace std;

// Utility function to find minimum of three numbers
int min(int x, int y, int z)
{
    return min(min(x, y), z);
}

int editDistDP(string str1, string str2, int m, int n)
{
    // Create a table to store results of subproblems
    int dp[m+1][n+1];

    // Fill dp[][] in bottom up manner
    for (int i=0; i<=m; i++)
    {
        for (int j=0; j<=n; j++)
        {
            // If first string is empty, only option is to
            // insert all characters of second string
            if (i==0)
                dp[i][j] = j;  // Min. operations = j

            // If second string is empty, only option is to
            // remove all characters of first string
            else if (j==0)
                dp[i][j] = i;  // Min. operations = i

            // If last characters are same, ignore last char
            // and recur for remaining string
            else if (str1[i-1] == str2[j-1])
                dp[i][j] = dp[i-1][j-1];

            // If last characters are different, consider all
            // possibilities and find minimum
            else
                dp[i][j] = 1 + min(dp[i][j-1],    // Insert
                                   dp[i-1][j],    // Remove
                                   dp[i-1][j-1]); // Replace
        }
    }

    return dp[m][n];
}

// Driver program
int main()
{
    string str1 = "sunday";
    string str2 = "saturday";
    cout << editDistDP(str1, str2, str1.length(), str2.length());
    return 0;
}

Python
# A Dynamic Programming based Python program for edit
# distance problem
def editDistDP(str1, str2, m, n):
    # Create a table to store results of subproblems
    dp = [[0 for x in range(n+1)] for x in range(m+1)]

    # Fill dp[][] in bottom up manner
    for i in range(m+1):
        for j in range(n+1):
            # If first string is empty, only option is to
            # insert all characters of second string
            if i == 0:
                dp[i][j] = j    # Min. operations = j
            # If second string is empty, only option is to
            # remove all characters of first string
            elif j == 0:
                dp[i][j] = i    # Min. operations = i
            # If last characters are same, ignore last char
            # and recur for remaining string
            elif str1[i-1] == str2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            # If last characters are different, consider all
            # possibilities and find minimum
            else:
                dp[i][j] = 1 + min(dp[i][j-1],      # Insert
                                   dp[i-1][j],      # Remove
                                   dp[i-1][j-1])    # Replace
    return dp[m][n]

# Driver program
str1 = "sunday"
str2 = "saturday"
print(editDistDP(str1, str2, len(str1), len(str2)))
# This code is contributed by Bhavya Jain

Output:
3

Time Complexity: O(m x n)


Auxiliary Space: O(m x n)
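Each row of dp[][] depends only on the row directly above it, so the auxiliary space can be reduced to O(n) by keeping just two rows. The following is a sketch of that idea in Python; it is an addition for illustration rather than part of the original solution, and the name edit_distance_two_rows is hypothetical.

```python
def edit_distance_two_rows(str1, str2):
    """Edit distance between str1 and str2 using two rows of the DP table."""
    m, n = len(str1), len(str2)
    # prev holds row i-1 of the table; row 0 is dp[0][j] = j (insert j chars)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        # curr holds row i; dp[i][0] = i (remove i chars)
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            if str1[i-1] == str2[j-1]:
                curr[j] = prev[j-1]
            else:
                curr[j] = 1 + min(curr[j-1],   # Insert
                                  prev[j],     # Remove
                                  prev[j-1])   # Replace
        prev = curr
    return prev[n]


print(edit_distance_two_rows("sunday", "saturday"))
```

The time complexity stays O(m x n). The trade-off is that the full table is no longer available, so the actual sequence of edits cannot be read back from it.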
Applications: There are many practical applications of the edit distance algorithm; refer to the Lucene API for a sample. Another example is displaying
all the words in a dictionary that are in near proximity to a given word or incorrectly spelled word.
Thanks to Vivek Kumar for suggesting above updates.

Dynamic Programming | Set 6 (Min Cost Path)


Given a cost matrix cost[][] and a position (m, n) in cost[][], write a function that returns cost of minimum cost path to reach (m, n) from (0, 0).
Each cell of the matrix represents a cost to traverse through that cell. Total cost of a path to reach (m, n) is sum of all the costs on that path
(including both source and destination). You can only traverse down, right and diagonally lower cells from a given cell, i.e., from a given cell (i, j),
cells (i+1, j), (i, j+1) and (i+1, j+1) can be traversed. You may assume that all costs are positive integers.
For example, in the following cost matrix (the one used in the programs below), what is the minimum cost path to (2, 2)?
1 2 3
4 8 2
1 5 3
The path with minimum cost is (0, 0) -> (0, 1) -> (1, 2) -> (2, 2). The cost of the path is 8 (1 + 2 + 2 + 3).

1) Optimal Substructure
The path to reach (m, n) must be through one of the 3 cells: (m-1, n-1) or (m-1, n) or (m, n-1). So minimum cost to reach (m, n) can be written as
minimum of the 3 cells plus cost[m][n].
minCost(m, n) = min (minCost(m-1, n-1), minCost(m-1, n), minCost(m, n-1)) + cost[m][n]
2) Overlapping Subproblems
Following is simple recursive implementation of the MCP (Minimum Cost Path) problem. The implementation simply follows the recursive structure
mentioned above.
/* A Naive recursive implementation of MCP(Minimum Cost Path) problem */
#include<stdio.h>
#include<limits.h>
#define R 3
#define C 3

int min(int x, int y, int z);

/* Returns cost of minimum cost path from (0,0) to (m, n) in mat[R][C]*/
int minCost(int cost[R][C], int m, int n)
{
    if (n < 0 || m < 0)
        return INT_MAX;
    else if (m == 0 && n == 0)
        return cost[m][n];
    else
        return cost[m][n] + min( minCost(cost, m-1, n-1),
                                 minCost(cost, m-1, n),
                                 minCost(cost, m, n-1) );
}

/* A utility function that returns minimum of 3 integers */
int min(int x, int y, int z)
{
    if (x < y)
        return (x < z)? x : z;
    else
        return (y < z)? y : z;
}

/* Driver program to test above functions */
int main()
{
    int cost[R][C] = { {1, 2, 3},
                       {4, 8, 2},
                       {1, 5, 3} };
    printf(" %d ", minCost(cost, 2, 2));
    return 0;
}

It should be noted that the above function computes the same subproblems again and again. See the following recursion tree; there are many nodes
which appear more than once. The time complexity of this naive recursive solution is exponential and it is terribly slow.
mC refers to minCost()

                                   mC(2, 2)
                       /               |               \
                      /                |                \
              mC(1, 1)             mC(1, 2)             mC(2, 1)
             /   |   \            /   |   \            /   |   \
            /    |    \          /    |    \          /    |    \
     mC(0,0) mC(0,1) mC(1,0) mC(0,1) mC(0,2) mC(1,1) mC(1,0) mC(1,1) mC(2,0)

So the MCP problem has both properties (Overlapping Subproblems and Optimal Substructure, see Set 1 and Set 2) of a dynamic programming problem.
Like other typical Dynamic Programming (DP) problems, recomputations of the same subproblems can be avoided by constructing a temporary array
tc[][] in bottom up manner.

C++
/* Dynamic Programming implementation of MCP problem */
#include<stdio.h>
#include<limits.h>
#define R 3
#define C 3

int min(int x, int y, int z);

int minCost(int cost[R][C], int m, int n)
{
    int i, j;

    // Instead of following line, we can use int tc[m+1][n+1] or
    // dynamically allocate memory to save space. The following line is
    // used to keep the program simple and make it work on all compilers.
    int tc[R][C];

    tc[0][0] = cost[0][0];

    /* Initialize first column of total cost(tc) array */
    for (i = 1; i <= m; i++)
        tc[i][0] = tc[i-1][0] + cost[i][0];

    /* Initialize first row of tc array */
    for (j = 1; j <= n; j++)
        tc[0][j] = tc[0][j-1] + cost[0][j];

    /* Construct rest of the tc array */
    for (i = 1; i <= m; i++)
        for (j = 1; j <= n; j++)
            tc[i][j] = min(tc[i-1][j-1], tc[i-1][j], tc[i][j-1]) + cost[i][j];

    return tc[m][n];
}

/* A utility function that returns minimum of 3 integers */
int min(int x, int y, int z)
{
    if (x < y)
        return (x < z)? x : z;
    else
        return (y < z)? y : z;
}

/* Driver program to test above functions */
int main()
{
    int cost[R][C] = { {1, 2, 3},
                       {4, 8, 2},
                       {1, 5, 3} };
    printf(" %d ", minCost(cost, 2, 2));
    return 0;
}

Python

# Dynamic Programming Python implementation of Min Cost Path
# problem
R = 3
C = 3

def minCost(cost, m, n):
    # Instead of following line, we can use a table of size
    # (m+1) x (n+1) to save space. The following line is used
    # to keep the program simple.
    tc = [[0 for x in range(C)] for x in range(R)]

    tc[0][0] = cost[0][0]

    # Initialize first column of total cost(tc) array
    for i in range(1, m+1):
        tc[i][0] = tc[i-1][0] + cost[i][0]

    # Initialize first row of tc array
    for j in range(1, n+1):
        tc[0][j] = tc[0][j-1] + cost[0][j]

    # Construct rest of the tc array
    for i in range(1, m+1):
        for j in range(1, n+1):
            tc[i][j] = min(tc[i-1][j-1], tc[i-1][j], tc[i][j-1]) + cost[i][j]

    return tc[m][n]

# Driver program to test above functions
cost = [[1, 2, 3],
        [4, 8, 2],
        [1, 5, 3]]
print(minCost(cost, 2, 2))
# This code is contributed by Bhavya Jain

Time Complexity of the DP implementation is O(mn) which is much better than Naive Recursive implementation.
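The tc[][] table also contains enough information to recover the path itself, not only its cost. The sketch below is an illustrative extension (the name min_cost_path and the backtracking step are not from the original post): after filling the table, it walks back from (m, n), each time moving to the cheapest of the allowed predecessor cells.

```python
def min_cost_path(cost, m, n):
    """Return (minimum cost, path) for reaching (m, n) from (0, 0)."""
    # tc[i][j] = minimum cost to reach cell (i, j) from (0, 0)
    tc = [[0] * (n + 1) for _ in range(m + 1)]
    tc[0][0] = cost[0][0]
    for i in range(1, m + 1):
        tc[i][0] = tc[i-1][0] + cost[i][0]
    for j in range(1, n + 1):
        tc[0][j] = tc[0][j-1] + cost[0][j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            tc[i][j] = min(tc[i-1][j-1], tc[i-1][j], tc[i][j-1]) + cost[i][j]

    # Backtrack: from each cell step to the cheapest valid predecessor
    path = [(m, n)]
    i, j = m, n
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((tc[i-1][j-1], (i-1, j-1)))
        if i > 0:
            candidates.append((tc[i-1][j], (i-1, j)))
        if j > 0:
            candidates.append((tc[i][j-1], (i, j-1)))
        _, (i, j) = min(candidates)
        path.append((i, j))
    path.reverse()
    return tc[m][n], path


cost = [[1, 2, 3],
        [4, 8, 2],
        [1, 5, 3]]
print(min_cost_path(cost, 2, 2))
```

For the example matrix this reproduces the path (0, 0) -> (0, 1) -> (1, 2) -> (2, 2) with cost 8 given at the start of the post.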

Dynamic Programming | Set 7 (Coin Change)


Given a value N, if we want to make change for N cents, and we have an infinite supply of each of S = { S1, S2, .. , Sm} valued coins, how many
ways can we make the change? The order of coins doesn't matter.
For example, for N = 4 and S = {1,2,3}, there are four solutions: {1,1,1,1},{1,1,2},{2,2},{1,3}. So the output should be 4. For N = 10 and S = {2,
5, 3, 6}, there are five solutions: {2,2,2,2,2}, {2,2,3,3}, {2,2,6}, {2,3,5} and {5,5}. So the output should be 5.
1) Optimal Substructure
To count the total number of solutions, we can divide all solutions into two sets.
1) Solutions that do not contain the mth coin (or Sm).
2) Solutions that contain at least one Sm.
Let count(S[], m, n) be the function to count the number of solutions, then it can be written as the sum of count(S[], m-1, n) and count(S[], m, n-Sm).
Therefore, the problem has the optimal substructure property as the problem can be solved using solutions to subproblems.
2) Overlapping Subproblems
Following is a simple recursive implementation of the Coin Change problem. The implementation simply follows the recursive structure mentioned
above.
#include<stdio.h>

// Returns the count of ways we can sum S[0...m-1] coins to get sum n
int count( int S[], int m, int n )
{
    // If n is 0 then there is 1 solution (do not include any coin)
    if (n == 0)
        return 1;

    // If n is less than 0 then no solution exists
    if (n < 0)
        return 0;

    // If there are no coins and n is greater than 0, then no solution exists
    if (m <= 0 && n >= 1)
        return 0;

    // count is sum of solutions (i) excluding S[m-1] (ii) including S[m-1]
    return count( S, m - 1, n ) + count( S, m, n-S[m-1] );
}

// Driver program to test above function
int main()
{
    int arr[] = {1, 2, 3};
    int m = sizeof(arr)/sizeof(arr[0]);
    printf("%d ", count(arr, m, 4));
    return 0;
}

It should be noted that the above function computes the same subproblems again and again. See the following recursion tree for S = {1, 2, 3} and
n = 5.
The function C({1}, 3) is called two times. If we draw the complete tree, then we can see that there are many subproblems being called more than
once.
C() --> count()

                                C({1,2,3}, 5)
                               /             \
                              /               \
                  C({1,2,3}, 2)                 C({1,2}, 5)
                 /       \                      /         \
                /         \                    /           \
   C({1,2,3}, -1)   C({1,2}, 2)         C({1,2}, 3)       C({1}, 5)
                     /     \             /     \           /     \
                    /       \           /       \         /       \
           C({1,2},0)  C({1},2)  C({1,2},1)  C({1},3)  C({1}, 4)  C({}, 5)
                         / \        / \        / \       /     \
                        /   \      /   \      /   \     /       \
                       .     .    .     .    .     .  C({1}, 3)  C({}, 4)
                                                        /  \
                                                       /    \
                                                      .      .

Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Coin Change problem has both properties
(Overlapping Subproblems and Optimal Substructure, see Set 1 and Set 2) of a dynamic programming problem. Like other typical Dynamic
Programming (DP) problems, recomputations of the same subproblems can be avoided by constructing a temporary array table[][] in bottom up manner.
Dynamic Programming Solution

C
#include<stdio.h>

int count( int S[], int m, int n )
{
    int i, j, x, y;

    // We need n+1 rows as the table is constructed in bottom up manner using
    // the base case 0 value case (n = 0)
    int table[n+1][m];

    // Fill the entries for 0 value case (n = 0)
    for (i=0; i<m; i++)
        table[0][i] = 1;

    // Fill rest of the table entries in bottom up manner
    for (i = 1; i < n+1; i++)
    {
        for (j = 0; j < m; j++)
        {
            // Count of solutions including S[j]
            x = (i-S[j] >= 0)? table[i - S[j]][j]: 0;

            // Count of solutions excluding S[j]
            y = (j >= 1)? table[i][j-1]: 0;

            // total count
            table[i][j] = x + y;
        }
    }
    return table[n][m-1];
}

// Driver program to test above function
int main()
{
    int arr[] = {1, 2, 3};
    int m = sizeof(arr)/sizeof(arr[0]);
    int n = 4;
    printf(" %d ", count(arr, m, n));
    return 0;
}

Python
# Dynamic Programming Python implementation of Coin Change problem
def count(S, m, n):
    # We need n+1 rows as the table is constructed in bottom up
    # manner using the base case 0 value case (n = 0)
    table = [[0 for x in range(m)] for x in range(n+1)]

    # Fill the entries for 0 value case (n = 0)
    for i in range(m):
        table[0][i] = 1

    # Fill rest of the table entries in bottom up manner
    for i in range(1, n+1):
        for j in range(m):
            # Count of solutions including S[j]
            x = table[i - S[j]][j] if i-S[j] >= 0 else 0

            # Count of solutions excluding S[j]
            y = table[i][j-1] if j >= 1 else 0

            # total count
            table[i][j] = x + y

    return table[n][m-1]

# Driver program to test above function
arr = [1, 2, 3]
m = len(arr)
n = 4
print(count(arr, m, n))
# This code is contributed by Bhavya Jain

Time Complexity: O(mn)


Following is a simplified version of the above Dynamic Programming solution. The auxiliary space required here is O(n) only.
#include<string.h>  // for memset()

int count( int S[], int m, int n )
{
    // table[i] will be storing the number of solutions for
    // value i. We need n+1 entries as the table is constructed
    // in bottom up manner using the base case (n = 0)
    int table[n+1];

    // Initialize all table values as 0
    memset(table, 0, sizeof(table));

    // Base case (If given value is 0)
    table[0] = 1;

    // Pick all coins one by one and update the table[] values
    // after the index greater than or equal to the value of the
    // picked coin
    for(int i=0; i<m; i++)
        for(int j=S[i]; j<=n; j++)
            table[j] += table[j-S[i]];

    return table[n];
}

Thanks to Rohan Laishram for suggesting this space optimized version.
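The same O(n) space idea can be sketched in Python. The second function below is an added contrast, not part of the original post: swapping the two loops counts ordered sequences of coins instead of unordered sets, which is why the coin loop has to be the outer one when the order of coins doesn't matter. The names count_ways and count_ordered are illustrative.

```python
def count_ways(coins, n):
    """Number of ways to make n when the order of coins does not matter."""
    table = [0] * (n + 1)
    table[0] = 1                      # one way to make 0: use no coins
    for coin in coins:                # coins in the outer loop
        for value in range(coin, n + 1):
            table[value] += table[value - coin]
    return table[n]


def count_ordered(coins, n):
    """Contrast: number of ordered coin sequences that sum to n."""
    table = [0] * (n + 1)
    table[0] = 1
    for value in range(1, n + 1):     # values in the outer loop
        for coin in coins:
            if coin <= value:
                table[value] += table[value - coin]
    return table[n]


print(count_ways([1, 2, 3], 4), count_ways([2, 5, 3, 6], 10))
print(count_ordered([1, 2, 3], 4))
```

With the coin loop outside, every solution is built by adding coins in one fixed order, so {1,1,2} is counted once; with the value loop outside, 1+1+2, 1+2+1 and 2+1+1 are counted separately.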


References:
http://www.algorithmist.com/index.php/Coin_Change

Dynamic Programming | Set 8 (Matrix Chain Multiplication)


Given a sequence of matrices, find the most efficient way to multiply these matrices together. The problem is not actually to perform the
multiplications, but merely to decide in which order to perform the multiplications.
We have many options to multiply a chain of matrices because matrix multiplication is associative. In other words, no matter how we parenthesize
the product, the result will be the same. For example, if we had four matrices A, B, C, and D, we would have:
(ABC)D = (AB)(CD) = A(BCD) = ....

However, the order in which we parenthesize the product affects the number of simple arithmetic operations needed to compute the product, or
the efficiency. For example, suppose A is a 10 x 30 matrix, B is a 30 x 5 matrix, and C is a 5 x 60 matrix. Then,
(AB)C = (10*30*5) + (10*5*60) = 1500 + 3000 = 4500 operations
A(BC) = (30*5*60) + (10*30*60) = 9000 + 18000 = 27000 operations.

Clearly the first parenthesization requires fewer operations.
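The arithmetic above follows from the fact that multiplying a p x q matrix by a q x r matrix takes p*q*r scalar multiplications. A small Python check of the two orders, with the helper name mult_cost chosen only for illustration:

```python
def mult_cost(p, q, r):
    """Scalar multiplications needed to multiply a p x q matrix by a q x r matrix."""
    return p * q * r


# A is 10 x 30, B is 30 x 5, C is 5 x 60

# (AB)C: AB is 10 x 5, then (AB) times C
cost_ab_then_c = mult_cost(10, 30, 5) + mult_cost(10, 5, 60)

# A(BC): BC is 30 x 60, then A times (BC)
cost_a_then_bc = mult_cost(30, 5, 60) + mult_cost(10, 30, 60)

print(cost_ab_then_c, cost_a_then_bc)
```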


We are given an array p[] that represents the chain of matrices such that the ith matrix Ai has dimensions p[i-1] x p[i]. We need to write a
function MatrixChainOrder() that returns the minimum number of multiplications needed to multiply the chain.
Input: p[] = {40, 20, 30, 10, 30}
Output: 26000
There are 4 matrices of dimensions 40x20, 20x30, 30x10 and 10x30.
Let the input 4 matrices be A, B, C and D. The minimum number of
multiplications is obtained by placing the parentheses in the following way:
(A(BC))D --> 20*30*10 + 40*20*10 + 40*10*30

Input: p[] = {10, 20, 30, 40, 30}
Output: 30000
There are 4 matrices of dimensions 10x20, 20x30, 30x40 and 40x30.
Let the input 4 matrices be A, B, C and D. The minimum number of
multiplications is obtained by placing the parentheses in the following way:
((AB)C)D --> 10*20*30 + 10*30*40 + 10*40*30

Input: p[] = {10, 20, 30}
Output: 6000
There are only two matrices of dimensions 10x20 and 20x30. So there
is only one way to multiply the matrices, the cost of which is 10*20*30.

1) Optimal Substructure:
A simple solution is to place parentheses at all possible places, calculate the cost for each placement and return the minimum value. In a chain of
n matrices, we can place the first set of parentheses in n-1 ways. For example, if the given chain has 4 matrices, say ABCD,
then there are 3 ways to place the first set of parentheses: A(BCD), (AB)(CD) and (ABC)D. So when we place a set of parentheses, we divide the
problem into subproblems of smaller size. Therefore, the problem has the optimal substructure property and can be easily solved using recursion.
Minimum number of multiplications needed to multiply a chain of size n = Minimum over all n-1 placements (these placements create subproblems of
smaller size)
2) Overlapping Subproblems
Following is a recursive implementation that simply follows the above optimal substructure property.
/* A naive recursive implementation that simply follows the above optimal
   substructure property */
#include<stdio.h>
#include<limits.h>
// Matrix Ai has dimension p[i-1] x p[i] for i = 1..n
int MatrixChainOrder(int p[], int i, int j)
{
    if(i == j)
        return 0;
    int k;
    int min = INT_MAX;
    int count;
    // place parenthesis at different places between first and last matrix,
    // recursively calculate count of multiplications for each parenthesis
    // placement and return the minimum count
    for (k = i; k <j; k++)
    {
        count = MatrixChainOrder(p, i, k) +
                MatrixChainOrder(p, k+1, j) +
                p[i-1]*p[k]*p[j];
        if (count < min)
            min = count;
    }
    // Return minimum count
    return min;
}
// Driver program to test above function
int main()
{
    int arr[] = {1, 2, 3, 4, 3};
    int n = sizeof(arr)/sizeof(arr[0]);
    printf("Minimum number of multiplications is %d ",
           MatrixChainOrder(arr, 1, n-1));
    getchar();
    return 0;
}

Time complexity of the above naive recursive approach is exponential. It should be noted that the above function computes the same subproblems
again and again. Consider the recursion tree for a matrix chain of size 4: the function MatrixChainOrder(p, 3, 4) is called two times, and
many other subproblems are also called more than once.

Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Matrix Chain Multiplication problem has both
properties of a dynamic programming problem (optimal substructure and overlapping subproblems). Like other typical Dynamic Programming (DP) problems, recomputations of
the same subproblems can be avoided by constructing a temporary array m[][] in bottom up manner.
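
Before moving to the bottom-up table, the effect of caching repeated subproblems can be illustrated with a top-down (memoized) variant of the naive recursion. This is a sketch of a standard alternative, not the approach used in the listings that follow; the name matrix_chain_order and the inner helper best are our own, and the sketch assumes p has at least two entries.

```python
from functools import lru_cache


def matrix_chain_order(p):
    """Minimum scalar multiplications for A1..An, where Ai has dimension p[i-1] x p[i]."""

    @lru_cache(maxsize=None)
    def best(i, j):
        # A single matrix needs no multiplication
        if i == j:
            return 0
        # Try every split point k and keep the cheapest; each (i, j) pair
        # is computed only once thanks to the cache
        return min(
            best(i, k) + best(k + 1, j) + p[i - 1] * p[k] * p[j]
            for k in range(i, j)
        )

    return best(1, len(p) - 1)


print(matrix_chain_order([40, 20, 30, 10, 30]))  # 26000
```

There are only O(n^2) distinct (i, j) pairs, and each one tries at most n split points, so the cached version does O(n^3) work, the same order as the bottom-up table.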
Dynamic Programming Solution
Following are C and Python implementations for the Matrix Chain Multiplication problem using Dynamic Programming.

C
// See the Cormen book for details of the following algorithm
#include<stdio.h>
#include<limits.h>
// Matrix Ai has dimension p[i-1] x p[i] for i = 1..n
int MatrixChainOrder(int p[], int n)
{
    /* For simplicity of the program, one extra row and one extra column are
       allocated in m[][]. 0th row and 0th column of m[][] are not used */
    int m[n][n];
    int i, j, k, L, q;
    /* m[i,j] = Minimum number of scalar multiplications needed to compute
       the matrix A[i]A[i+1]...A[j] = A[i..j] where dimension of A[i] is
       p[i-1] x p[i] */
    // cost is zero when multiplying one matrix.
    for (i = 1; i < n; i++)
        m[i][i] = 0;
    // L is chain length.
    for (L=2; L<n; L++)
    {
        // i < n-L+1 keeps j = i+L-1 within the valid index range 1..n-1
        for (i=1; i<n-L+1; i++)
        {
            j = i+L-1;
            m[i][j] = INT_MAX;
            for (k=i; k<=j-1; k++)
            {
                // q = cost/scalar multiplications
                q = m[i][k] + m[k+1][j] + p[i-1]*p[k]*p[j];
                if (q < m[i][j])
                    m[i][j] = q;
            }
        }
    }
    return m[1][n-1];
}
int main()
{
    int arr[] = {1, 2, 3, 4};
    int size = sizeof(arr)/sizeof(arr[0]);
    printf("Minimum number of multiplications is %d ",
           MatrixChainOrder(arr, size));
    getchar();
    return 0;
}

Python
# Dynamic Programming Python implementation of Matrix Chain Multiplication
# See the Cormen book for details of the following algorithm
import sys
# Matrix Ai has dimension p[i-1] x p[i] for i = 1..n
def MatrixChainOrder(p, n):
    # For simplicity of the program, one extra row and one extra column are
    # allocated in m[][]. 0th row and 0th column of m[][] are not used
    m = [[0 for x in range(n)] for x in range(n)]
    # m[i,j] = Minimum number of scalar multiplications needed to compute
    # the matrix A[i]A[i+1]...A[j] = A[i..j] where dimension of A[i] is
    # p[i-1] x p[i]
    # cost is zero when multiplying one matrix.
    for i in range(1, n):
        m[i][i] = 0
    # L is chain length.
    for L in range(2, n):
        for i in range(1, n-L+1):
            j = i+L-1
            m[i][j] = sys.maxsize
            for k in range(i, j):
                # q = cost/scalar multiplications
                q = m[i][k] + m[k+1][j] + p[i-1]*p[k]*p[j]
                if q < m[i][j]:
                    m[i][j] = q
    return m[1][n-1]
# Driver program to test above function
arr = [1, 2, 3, 4]
size = len(arr)
print("Minimum number of multiplications is " + str(MatrixChainOrder(arr, size)))
# This Code is contributed by Bhavya Jain

Minimum number of multiplications is 18

Time Complexity: O(n^3)


Auxiliary Space: O(n^2)
References:
http://en.wikipedia.org/wiki/Matrix_chain_multiplication
http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/Dynamic/chainMatrixMult.htm

Dynamic Programming | Set 9 (Binomial Coefficient)


Following are common definitions of Binomial Coefficients.
1) A binomial coefficient C(n, k) can be defined as the coefficient of X^k in the expansion of (1 + X)^n.
2) A binomial coefficient C(n, k) also gives the number of ways, disregarding order, that k objects can be chosen from among n objects; more
formally, the number of k-element subsets (or k-combinations) of an n-element set.
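
To see that the two definitions agree, here is a small Python sketch that computes C(n, k) both ways: by expanding (1 + X)^n as a list of coefficients, and by counting k-element subsets directly. The function names coeff_from_expansion and count_subsets are illustrative assumptions, and both are brute-force checks rather than efficient methods.

```python
from itertools import combinations


def coeff_from_expansion(n, k):
    """Coefficient of X^k in (1 + X)^n, obtained by multiplying the polynomial out."""
    poly = [1]  # coefficients of (1 + X)^0, lowest power first
    for _ in range(n):
        # Multiplying by (1 + X): new[i] = old[i] + old[i-1]
        poly = [a + b for a, b in zip(poly + [0], [0] + poly)]
    return poly[k]


def count_subsets(n, k):
    """Number of k-element subsets of an n-element set, counted one by one."""
    return sum(1 for _ in combinations(range(n), k))


print(coeff_from_expansion(4, 2), count_subsets(4, 2))  # 6 6
print(coeff_from_expansion(5, 2), count_subsets(5, 2))  # 10 10
```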
The Problem
Write a function that takes two parameters n and k and returns the value of Binomial Coefficient C(n, k). For example, your function
should return 6 for n = 4 and k = 2, and it should return 10 for n = 5 and k = 2.
1) Optimal Substructure
The value of C(n, k) can be recursively calculated using the following standard formula for Binomial Coefficients.
C(n, k) = C(n-1, k-1) + C(n-1, k)
C(n, 0) = C(n, n) = 1

2) Overlapping Subproblems
Following is a simple recursive implementation that follows the recursive structure mentioned above.
// A Naive Recursive Implementation
#include<stdio.h>
// Returns value of Binomial Coefficient C(n, k)
int binomialCoeff(int n, int k)
{
    // Base Cases
    if (k==0 || k==n)
        return 1;
    // Recur
    return binomialCoeff(n-1, k-1) + binomialCoeff(n-1, k);
}
/* Driver program to test above function*/
int main()
{
    int n = 5, k = 2;
    printf("Value of C(%d, %d) is %d ", n, k, binomialCoeff(n, k));
    return 0;
}

It should be noted that the above function computes the same subproblems again and again. See the following recursion tree for n = 5 and k = 2.
The function C(3, 1) is called two times. For large values of n, there will be many common subproblems.

                             C(5, 2)
                    /                      \
           C(4, 1)                           C(4, 2)
            /   \                          /           \
       C(3, 0)   C(3, 1)             C(3, 1)               C(3, 2)
                /    \               /     \               /     \
         C(2, 0)    C(2, 1)      C(2, 0)   C(2, 1)     C(2, 1)   C(2, 2)
                    /    \                 /    \       /    \
               C(1, 0)  C(1, 1)       C(1, 0) C(1, 1) C(1, 0) C(1, 1)

Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Binomial Coefficient problem has both
properties of a dynamic programming problem (optimal substructure and overlapping subproblems). Like other typical Dynamic Programming (DP) problems, recomputations of
the same subproblems can be avoided by constructing a temporary array C[][] in bottom up manner. Following is a Dynamic Programming based
implementation.

C
// A Dynamic Programming based solution that uses table C[][] to
// calculate the Binomial Coefficient
#include<stdio.h>
// Prototype of a utility function that returns minimum of two integers
int min(int a, int b);
// Returns value of Binomial Coefficient C(n, k)
int binomialCoeff(int n, int k)
{
    int C[n+1][k+1];
    int i, j;
    // Calculate value of Binomial Coefficient in bottom up manner
    for (i = 0; i <= n; i++)
    {
        for (j = 0; j <= min(i, k); j++)
        {
            // Base Cases
            if (j == 0 || j == i)
                C[i][j] = 1;
            // Calculate value using previously stored values
            else
                C[i][j] = C[i-1][j-1] + C[i-1][j];
        }
    }
    return C[n][k];
}
// A utility function to return minimum of two integers
int min(int a, int b)
{
    return (a<b)? a: b;
}
/* Driver program to test above function*/
int main()
{
    int n = 5, k = 2;
    printf ("Value of C(%d, %d) is %d ", n, k, binomialCoeff(n, k) );
    return 0;
}

Python
# A Dynamic Programming based Python Program that uses table C[][]
# to calculate the Binomial Coefficient
# Returns value of Binomial Coefficient C(n, k)
def binomialCoef(n, k):
    C = [[0 for x in range(k+1)] for x in range(n+1)]
    # Calculate value of Binomial Coefficient in bottom up manner
    for i in range(n+1):
        for j in range(min(i, k)+1):
            # Base Cases
            if j == 0 or j == i:
                C[i][j] = 1
            # Calculate value using previously stored values
            else:
                C[i][j] = C[i-1][j-1] + C[i-1][j]
    return C[n][k]
# Driver program to test above function
n = 5
k = 2
print("Value of C[" + str(n) + "][" + str(k) + "] is "
      + str(binomialCoef(n,k)))
# This code is contributed by Bhavya Jain

Value of C[5][2] is 10

Time Complexity: O(n*k)


Auxiliary Space: O(n*k)
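Following is a space optimized version of the above code. It uses only O(k) auxiliary space. Thanks to AK for suggesting this method.
// A space optimized Dynamic Programming Solution
// (uses the min() utility function defined in the program above)
#include<stdlib.h> // for calloc() and free()
int binomialCoeff(int n, int k)
{
    int* C = (int*)calloc(k+1, sizeof(int));
    int i, j, res;
    C[0] = 1;
    for(i = 1; i <= n; i++)
    {
        // Compute row i of Pascal's triangle from row i-1. Going from right
        // to left ensures C[j-1] still holds the value from the previous row
        for(j = min(i, k); j > 0; j--)
            C[j] = C[j] + C[j-1];
    }
    res = C[k]; // Store the result before freeing memory
    free(C); // free dynamically allocated memory to avoid memory leak
    return res;
}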

Time Complexity: O(n*k)


Auxiliary Space: O(k)
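
The same single-row idea can be sketched in Python. This is an illustrative translation only; the name binomial_coeff_1d is our own and does not appear in the original listings.

```python
def binomial_coeff_1d(n, k):
    """C(n, k) using one row of Pascal's triangle, O(k) extra space."""
    row = [0] * (k + 1)
    row[0] = 1  # C(i, 0) = 1 for every i

    for i in range(1, n + 1):
        # Update right to left so that row[j - 1] is still the value
        # from the previous row when row[j] is recomputed
        for j in range(min(i, k), 0, -1):
            row[j] += row[j - 1]
    return row[k]


print(binomial_coeff_1d(5, 2))  # 10
```

Updating from left to right instead would overwrite row[j - 1] before it is used, which is why the inner loop runs downward.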
References:
http://www.csl.mtu.edu/cs4321/www/Lectures/Lecture%2015%20-%20Dynamic%20Programming%20Binomial%20Coefficients.htm

Dynamic Programming | Set 10 ( 0-1 Knapsack Problem)


Given weights and values of n items, put these items in a knapsack of capacity W to get the maximum total value in the knapsack. In other words,
we are given two integer arrays val[0..n-1] and wt[0..n-1] which represent values and weights associated with n items respectively, and an integer
W which represents the knapsack capacity. Find the maximum value subset of val[] such that the sum of the weights of this subset is smaller than or
equal to W. You cannot break an item: either pick the complete item, or don't pick it (0-1 property).
A simple solution is to consider all subsets of items and calculate the total weight and value of each subset. Consider only the subsets whose total
weight is at most W. From all such subsets, pick the maximum value subset.
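
The simple solution described above can be sketched directly in Python. It enumerates all 2^n subsets, so it is only practical for small n; the function name knapsack_brute_force is an assumption made for this illustration.

```python
from itertools import combinations


def knapsack_brute_force(W, wt, val):
    """Try every subset of items and return the best value that fits in capacity W."""
    n = len(val)
    best = 0  # the empty subset always fits and has value 0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            weight = sum(wt[i] for i in subset)
            if weight <= W:
                best = max(best, sum(val[i] for i in subset))
    return best


print(knapsack_brute_force(50, [10, 20, 30], [60, 100, 120]))  # 220
```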
1) Optimal Substructure:
To consider all subsets of items, there can be two cases for every item: (1) the item is included in the optimal subset, (2) not included in the optimal
set.
Therefore, the maximum value that can be obtained from n items is max of following two values.
1) Maximum value obtained by n-1 items and W weight (excluding nth item).
2) Value of nth item plus maximum value obtained by n-1 items and W minus weight of the nth item (including nth item).
If weight of nth item is greater than W, then the nth item cannot be included and case 1 is the only possibility.
2) Overlapping Subproblems
Following is a recursive implementation that simply follows the recursive structure mentioned above.

C/C++
/* A Naive recursive implementation of 0-1 Knapsack problem */
#include<stdio.h>
// A utility function that returns maximum of two integers
int max(int a, int b) { return (a > b)? a : b; }
// Returns the maximum value that can be put in a knapsack of capacity W
int knapSack(int W, int wt[], int val[], int n)
{
    // Base Case
    if (n == 0 || W == 0)
        return 0;
    // If weight of the nth item is more than Knapsack capacity W, then
    // this item cannot be included in the optimal solution
    if (wt[n-1] > W)
        return knapSack(W, wt, val, n-1);
    // Return the maximum of two cases:
    // (1) nth item included
    // (2) not included
    else return max( val[n-1] + knapSack(W-wt[n-1], wt, val, n-1),
                     knapSack(W, wt, val, n-1)
                   );
}
// Driver program to test above function
int main()
{
    int val[] = {60, 100, 120};
    int wt[] = {10, 20, 30};
    int W = 50;
    int n = sizeof(val)/sizeof(val[0]);
    printf("%d", knapSack(W, wt, val, n));
    return 0;
}

Python
# A naive recursive implementation of 0-1 Knapsack Problem
# Returns the maximum value that can be put in a knapsack of
# capacity W
def knapSack(W, wt, val, n):
    # Base Case
    if n == 0 or W == 0:
        return 0
    # If weight of the nth item is more than Knapsack of capacity
    # W, then this item cannot be included in the optimal solution
    if wt[n-1] > W:
        return knapSack(W, wt, val, n-1)
    # return the maximum of two cases:
    # (1) nth item included
    # (2) not included
    else:
        return max(val[n-1] + knapSack(W-wt[n-1], wt, val, n-1),
                   knapSack(W, wt, val, n-1))
# end of function knapSack
# To test above function
val = [60, 100, 120]
wt = [10, 20, 30]
W = 50
n = len(val)
print(knapSack(W, wt, val, n))
# This code is contributed by Nikhil Kumar Singh

220

It should be noted that the above function computes the same subproblems again and again. In the following recursion tree, K(1, 1) is
evaluated twice. Time complexity of this naive recursive solution is exponential, O(2^n).
In the following recursion tree, K() refers to knapSack(). The two
parameters indicated in the following recursion tree are n and W.
The recursion tree is for the following sample inputs.
wt[] = {1, 1, 1}, W = 2, val[] = {10, 20, 30}

                       K(3, 2)         ---------> K(n, W)
                   /            \
                 /                \
            K(2,2)                  K(2,1)
          /       \                  /    \
        /           \              /        \
     K(1,2)       K(1,1)        K(1,1)     K(1,0)
     /  \         /   \          /   \
   /      \     /       \      /       \
K(0,2)  K(0,1) K(0,1)  K(0,0) K(0,1)   K(0,0)
Recursion tree for Knapsack capacity 2 units and 3 items of 1 unit weight.

Since subproblems are evaluated again, this problem has the Overlapping Subproblems property. So the 0-1 Knapsack problem has both properties
of a dynamic programming problem (optimal substructure and overlapping subproblems). Like other typical Dynamic Programming (DP) problems, recomputations of the same
subproblems can be avoided by constructing a temporary array K[][] in bottom up manner. Following is a Dynamic Programming based
implementation.

C++
// A Dynamic Programming based solution for 0-1 Knapsack problem
#include<stdio.h>
// A utility function that returns maximum of two integers
int max(int a, int b) { return (a > b)? a : b; }
// Returns the maximum value that can be put in a knapsack of capacity W
int knapSack(int W, int wt[], int val[], int n)
{
    int i, w;
    int K[n+1][W+1];
    // Build table K[][] in bottom up manner
    for (i = 0; i <= n; i++)
    {
        for (w = 0; w <= W; w++)
        {
            if (i==0 || w==0)
                K[i][w] = 0;
            else if (wt[i-1] <= w)
                K[i][w] = max(val[i-1] + K[i-1][w-wt[i-1]], K[i-1][w]);
            else
                K[i][w] = K[i-1][w];
        }
    }
    return K[n][W];
}
int main()
{
    int val[] = {60, 100, 120};
    int wt[] = {10, 20, 30};
    int W = 50;
    int n = sizeof(val)/sizeof(val[0]);
    printf("%d", knapSack(W, wt, val, n));
    return 0;
}

Python
# A Dynamic Programming based Python Program for 0-1 Knapsack problem
# Returns the maximum value that can be put in a knapsack of capacity W
def knapSack(W, wt, val, n):
    K = [[0 for x in range(W+1)] for x in range(n+1)]
    # Build table K[][] in bottom up manner
    for i in range(n+1):
        for w in range(W+1):
            if i==0 or w==0:
                K[i][w] = 0
            elif wt[i-1] <= w:
                K[i][w] = max(val[i-1] + K[i-1][w-wt[i-1]], K[i-1][w])
            else:
                K[i][w] = K[i-1][w]
    return K[n][W]
# Driver program to test above function
val = [60, 100, 120]
wt = [10, 20, 30]
W = 50
n = len(val)
print(knapSack(W, wt, val, n))
# This code is contributed by Bhavya Jain

220

Time Complexity: O(nW) where n is the number of items and W is the capacity of knapsack.
References:
http://www.es.ele.tue.nl/education/5MC10/Solutions/knapsack.pdf
http://www.cse.unl.edu/~goddard/Courses/CSCE310J/Lectures/Lecture8-DynamicProgramming.pdf

Dynamic Programming | Set 11 (Egg Dropping Puzzle)


The following is a description of the instance of this famous puzzle involving n=2 eggs and a building with k=36 floors.
Suppose that we wish to know which stories in a 36-story building are safe to drop eggs from, and which will cause the eggs to break on landing.
We make a few assumptions:
..An egg that survives a fall can be used again.
..A broken egg must be discarded.
..The effect of a fall is the same for all eggs.
..If an egg breaks when dropped, then it would break if dropped from a higher floor.
..If an egg survives a fall then it would survive a shorter fall.
..It is not ruled out that the first-floor windows break eggs, nor is it ruled out that the 36th-floor windows do not cause an egg to break.
If only one egg is available and we wish to be sure of obtaining the right result, the experiment can be carried out in only one way. Drop the egg
from the first-floor window; if it survives, drop it from the second floor window. Continue upward until it breaks. In the worst case, this method
may require 36 droppings. Suppose 2 eggs are available. What is the least number of egg-droppings that is guaranteed to work in all cases?
The problem is not actually to find the critical floor, but merely to decide the floors from which eggs should be dropped so that the total number of trials
is minimized.
Source: Wiki for Dynamic Programming
In this post, we will discuss a solution to the general problem with n eggs and k floors. The solution is to try dropping an egg from every floor (from 1
to k) and recursively calculate the minimum number of droppings needed in the worst case. The floor which gives the minimum value in the worst case is
going to be part of the solution.
In the following solutions, we return the minimum number of trials in the worst case; these solutions can be easily modified to print the floor numbers of
every trial also.
1) Optimal Substructure:
When we drop an egg from a floor x, there can be two cases: (1) the egg breaks, (2) the egg doesn't break.
1) If the egg breaks after dropping from the xth floor, then we only need to check the floors lower than x with the remaining eggs; so the problem reduces
to x-1 floors and n-1 eggs.
2) If the egg doesn't break after dropping from the xth floor, then we only need to check the floors higher than x; so the problem reduces to k-x
floors and n eggs.
Since we need to minimize the number of trials in the worst case, we take the maximum of the two cases. We consider the max of the above two cases for
every floor and choose the floor which yields the minimum number of trials.
k ==> Number of floors
n ==> Number of Eggs
eggDrop(n, k) ==> Minimum number of trials needed to find the critical
                  floor in worst case.
eggDrop(n, k) = 1 + min{max(eggDrop(n - 1, x - 1), eggDrop(n, k - x)):
                x in {1, 2, ..., k}}
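
The recurrence above can be written almost verbatim as a memoized Python function. This is a minimal sketch for checking small cases, not the implementation used in this post; the name egg_drop and the use of functools.lru_cache are our own choices.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def egg_drop(n, k):
    """Minimum number of trials needed in the worst case with n eggs and k floors."""
    # Zero floors need no trials, one floor needs one trial
    if k <= 1:
        return k
    # With a single egg we must test the floors one by one from the bottom
    if n == 1:
        return k
    # Try every first floor x: the worst case is the max of "breaks"
    # (n-1 eggs, x-1 floors) and "survives" (n eggs, k-x floors)
    return 1 + min(
        max(egg_drop(n - 1, x - 1), egg_drop(n, k - x))
        for x in range(1, k + 1)
    )


print(egg_drop(2, 10))  # 4
print(egg_drop(2, 36))  # 8
```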

2) Overlapping Subproblems
Following is a recursive implementation that simply follows the recursive structure mentioned above.
# include <stdio.h>
# include <limits.h>
// A utility function to get maximum of two integers
int max(int a, int b) { return (a > b)? a: b; }
/* Function to get minimum number of trials needed in worst
   case with n eggs and k floors */
int eggDrop(int n, int k)
{
    // If there are no floors, then no trials needed. OR if there is
    // one floor, one trial needed.
    if (k == 1 || k == 0)
        return k;
    // We need k trials for one egg and k floors
    if (n == 1)
        return k;
    int min = INT_MAX, x, res;
    // Consider all droppings from 1st floor to kth floor and
    // return the minimum of these values plus 1.
    for (x = 1; x <= k; x++)
    {
        res = max(eggDrop(n-1, x-1), eggDrop(n, k-x));
        if (res < min)
            min = res;
    }
    return min + 1;
}
/* Driver program to test above function */
int main()
{
    int n = 2, k = 10;
    printf ("\nMinimum number of trials in worst case with %d eggs and "
            "%d floors is %d \n", n, k, eggDrop(n, k));
    return 0;
}
Output:
Minimum number of trials in worst case with 2 eggs and 10 floors is 4

It should be noted that the above function computes the same subproblems again and again. See the following partial recursion tree, E(2, 2) is
being evaluated twice. There will be many repeated subproblems when you draw the complete recursion tree even for small values of n and k.

                         E(2,4)
                           |
          -------------------------------------
          |             |           |         |
          |             |           |         |
      x=1/\          x=2/\      x=3/ \    x=4/ \
        /  \           /  \       ....      ....
       /    \         /    \
 E(1,0)  E(2,3)     E(1,1)  E(2,2)
          /\  /\...         /  \
      x=1/  \               .....
        /    \
     E(1,0)  E(2,2)
            /   \
            ......
Partial recursion tree for 2 eggs and 4 floors.

Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Egg Dropping Puzzle has both properties
of a dynamic programming problem (optimal substructure and overlapping subproblems). Like other typical Dynamic Programming (DP) problems, recomputations of the same subproblems
can be avoided by constructing a temporary array eggFloor[][] in bottom up manner.
Dynamic Programming Solution
Following are C++ and Python implementations for Egg Dropping problem using Dynamic Programming.

C++
// A Dynamic Programming based C++ Program for the Egg Dropping Puzzle
# include <stdio.h>
# include <limits.h>
// A utility function to get maximum of two integers
int max(int a, int b) { return (a > b)? a: b; }
/* Function to get minimum number of trials needed in worst
   case with n eggs and k floors */
int eggDrop(int n, int k)
{
    /* A 2D table where entry eggFloor[i][j] will represent minimum
       number of trials needed for i eggs and j floors. */
    int eggFloor[n+1][k+1];
    int res;
    int i, j, x;
    // We need one trial for one floor and 0 trials for 0 floors
    for (i = 1; i <= n; i++)
    {
        eggFloor[i][1] = 1;
        eggFloor[i][0] = 0;
    }
    // We always need j trials for one egg and j floors.
    for (j = 1; j <= k; j++)
        eggFloor[1][j] = j;
    // Fill rest of the entries in table using optimal substructure
    // property
    for (i = 2; i <= n; i++)
    {
        for (j = 2; j <= k; j++)
        {
            eggFloor[i][j] = INT_MAX;
            for (x = 1; x <= j; x++)
            {
                res = 1 + max(eggFloor[i-1][x-1], eggFloor[i][j-x]);
                if (res < eggFloor[i][j])
                    eggFloor[i][j] = res;
            }
        }
    }
    // eggFloor[n][k] holds the result
    return eggFloor[n][k];
}
/* Driver program to test above function */
int main()
{
    int n = 2, k = 36;
    printf ("\nMinimum number of trials in worst case with %d eggs and "
            "%d floors is %d \n", n, k, eggDrop(n, k));
    return 0;
}

Python
# A Dynamic Programming based Python Program for the Egg Dropping Puzzle
INT_MAX = 32767
# Function to get minimum number of trials needed in worst
# case with n eggs and k floors
def eggDrop(n, k):
    # A 2D table where entry eggFloor[i][j] will represent minimum
    # number of trials needed for i eggs and j floors.
    eggFloor = [[0 for x in range(k+1)] for x in range(n+1)]
    # We need one trial for one floor and 0 trials for 0 floors
    for i in range(1, n+1):
        eggFloor[i][1] = 1
        eggFloor[i][0] = 0
    # We always need j trials for one egg and j floors.
    for j in range(1, k+1):
        eggFloor[1][j] = j
    # Fill rest of the entries in table using optimal substructure
    # property
    for i in range(2, n+1):
        for j in range(2, k+1):
            eggFloor[i][j] = INT_MAX
            for x in range(1, j+1):
                res = 1 + max(eggFloor[i-1][x-1], eggFloor[i][j-x])
                if res < eggFloor[i][j]:
                    eggFloor[i][j] = res
    # eggFloor[n][k] holds the result
    return eggFloor[n][k]
# Driver program to test above function
n = 2
k = 36
print("Minimum number of trials in worst case with " + str(n) + " eggs and "
      + str(k) + " floors is " + str(eggDrop(n, k)))
# This code is contributed by Bhavya Jain
Output:
Minimum number of trials in worst case with 2 eggs and 36 floors is 8

Time Complexity: O(nk^2)


Auxiliary Space: O(nk)
As an exercise, you may try modifying the above DP solution to print all intermediate floors (the floors used for the minimum trial solution).

References:
http://archive.ite.journal.informs.org/Vol4No1/Sniedovich/index.php

Dynamic Programming | Set 12 (Longest Palindromic Subsequence)


Given a sequence, find the length of the longest palindromic subsequence in it. For example, if the given sequence is BBABCBCAB, then the
output should be 7 as BABCBAB is the longest palindromic subsequence in it. BBBBB and BBCBB are also palindromic subsequences of the
given sequence, but not the longest ones.
The naive solution for this problem is to generate all subsequences of the given sequence and find the longest palindromic subsequence. This
solution is exponential in terms of time complexity. Let us see how this problem possesses both important properties of a Dynamic Programming
(DP) Problem and can be efficiently solved using Dynamic Programming.
1) Optimal Substructure:
Let X[0..n-1] be the input sequence of length n and L(0, n-1) be the length of the longest palindromic subsequence of X[0..n-1].
If the last and first characters of X are the same, then L(0, n-1) = L(1, n-2) + 2.
Else L(0, n-1) = MAX (L(1, n-1), L(0, n-2)).
Following is a general recursive solution with all cases handled.
// Every single character is a palindrome of length 1
L(i, i) = 1 for all indexes i in given sequence
// If first and last characters are not same
If (X[i] != X[j]) L(i, j) = max{L(i + 1, j), L(i, j - 1)}
// If there are only 2 characters and both are same
Else if (j == i + 1) L(i, j) = 2
// If there are more than two characters, and first and last
// characters are same
Else L(i, j) = L(i + 1, j - 1) + 2
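
The cases above translate line by line into a memoized Python function. This is only a sketch for experimenting with the recurrence; the name lps_length, the inner helper L, and the handling of the empty string (returning 0) are assumptions not present in the original text.

```python
from functools import lru_cache


def lps_length(seq):
    """Length of the longest palindromic subsequence of seq."""

    @lru_cache(maxsize=None)
    def L(i, j):
        # Every single character is a palindrome of length 1
        if i == j:
            return 1
        # First and last characters differ: drop one of them
        if seq[i] != seq[j]:
            return max(L(i + 1, j), L(i, j - 1))
        # Only two characters and both are the same
        if j == i + 1:
            return 2
        # More than two characters, first and last are the same
        return L(i + 1, j - 1) + 2

    return L(0, len(seq) - 1) if seq else 0


print(lps_length("BBABCBCAB"))  # 7
```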

2) Overlapping Subproblems
Following is a simple recursive implementation of the LPS problem. The implementation simply follows the recursive structure mentioned above.
#include<stdio.h>
#include<string.h>
// A utility function to get max of two integers
int max (int x, int y) { return (x > y)? x : y; }
// Returns the length of the longest palindromic subsequence in seq
int lps(char *seq, int i, int j)
{
    // Base Case 1: If there is only 1 character
    if (i == j)
        return 1;
    // Base Case 2: If there are only 2 characters and both are same
    if (seq[i] == seq[j] && i + 1 == j)
        return 2;
    // If the first and last characters match
    if (seq[i] == seq[j])
        return lps (seq, i+1, j-1) + 2;
    // If the first and last characters do not match
    return max( lps(seq, i, j-1), lps(seq, i+1, j) );
}
/* Driver program to test above functions */
int main()
{
    char seq[] = "GEEKSFORGEEKS";
    int n = strlen(seq);
    printf ("The length of the LPS is %d", lps(seq, 0, n-1));
    getchar();
    return 0;
}

Output:
The length of the LPS is 5

Considering the above implementation, following is a partial recursion tree for a sequence of length 6 with all different characters.
               L(0, 5)
             /        \
            /          \
        L(1,5)          L(0,4)
       /    \            /    \
      /      \          /      \
  L(2,5)    L(1,4)  L(1,4)  L(0,3)

In the above partial recursion tree, L(1, 4) is being solved twice. If we draw the complete recursion tree, then we can see that there are many
subproblems which are solved again and again. Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the
LPS problem has both properties of a dynamic programming problem (optimal substructure and overlapping subproblems). Like other typical Dynamic Programming (DP) problems,
recomputations of the same subproblems can be avoided by constructing a temporary array L[][] in bottom up manner.
Dynamic Programming Solution

C++
// A Dynamic Programming based C++ program for LPS problem
#include<stdio.h>
#include<string.h>
// A utility function to get max of two integers
int max (int x, int y) { return (x > y)? x : y; }
// Returns the length of the longest palindromic subsequence in str
int lps(char *str)
{
    int n = strlen(str);
    int i, j, cl;
    int L[n][n]; // Create a table to store results of subproblems
    // Strings of length 1 are palindromes of length 1
    for (i = 0; i < n; i++)
        L[i][i] = 1;
    // Build the table. Note that the lower diagonal values of table are
    // useless and not filled in the process. The values are filled in a
    // manner similar to Matrix Chain Multiplication DP solution (See
    // http://www.geeksforgeeks.org/archives/15553). cl is length of
    // substring
    for (cl=2; cl<=n; cl++)
    {
        for (i=0; i<n-cl+1; i++)
        {
            j = i+cl-1;
            if (str[i] == str[j] && cl == 2)
                L[i][j] = 2;
            else if (str[i] == str[j])
                L[i][j] = L[i+1][j-1] + 2;
            else
                L[i][j] = max(L[i][j-1], L[i+1][j]);
        }
    }
    return L[0][n-1];
}
/* Driver program to test above functions */
int main()
{
    char seq[] = "GEEKS FOR GEEKS";
    int n = strlen(seq);
    printf ("The length of the LPS is %d", lps(seq));
    getchar();
    return 0;
}

Python
# A Dynamic Programming based Python program for LPS problem
# Returns the length of the longest palindromic subsequence in seq
def lps(seq):
    n = len(seq)
    # Create a table to store results of subproblems
    L = [[0 for x in range(n)] for x in range(n)]
    # Strings of length 1 are palindromes of length 1
    for i in range(n):
        L[i][i] = 1
    # Build the table. Note that the lower diagonal values of table are
    # useless and not filled in the process. The values are filled in a
    # manner similar to Matrix Chain Multiplication DP solution (See
    # http://www.geeksforgeeks.org/dynamic-programming-set-8-matrix-chain-multiplication/).
    # cl is length of substring
    for cl in range(2, n+1):
        for i in range(n-cl+1):
            j = i+cl-1
            if seq[i] == seq[j] and cl == 2:
                L[i][j] = 2
            elif seq[i] == seq[j]:
                L[i][j] = L[i+1][j-1] + 2
            else:
                L[i][j] = max(L[i][j-1], L[i+1][j])
    return L[0][n-1]
# Driver program to test above functions
seq = "GEEKS FOR GEEKS"
n = len(seq)
print("The length of the LPS is " + str(lps(seq)))
# This code is contributed by Bhavya Jain

The length of the LPS is 7

Time Complexity of the above implementation is O(n^2), which is much better than the exponential worst case time complexity of the naive recursive
implementation.
This problem is close to the Longest Common Subsequence (LCS) problem. In fact, we can use LCS as a subroutine to solve this problem.
Following is the two step solution that uses LCS.
1) Reverse the given sequence and store the reverse in another array, say rev[0..n-1].
2) The length of the LCS of the given sequence and rev[] is the length of the longest palindromic subsequence.
This solution is also an O(n^2) solution.
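
The two steps can be sketched in Python as follows. The helper lcs_length is a standard row-by-row LCS table written for this illustration, and both function names are our own; note that the sketch returns only the length, not the subsequence itself.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous prefix of a
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[len(b)]


def lps_via_lcs(seq):
    """Length of the longest palindromic subsequence via LCS with the reverse."""
    rev = seq[::-1]  # step 1: reverse the sequence
    return lcs_length(seq, rev)  # step 2: LCS of the sequence and its reverse


print(lps_via_lcs("BBABCBCAB"))  # 7
```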
References:
http://users.eecs.northwestern.edu/~dda902/336/hw6-sol.pdf

Dynamic Programming | Set 13 (Cutting a Rod)


Given a rod of length n inches and an array of prices that contains the prices of all pieces of size up to n, determine the maximum value
obtainable by cutting up the rod and selling the pieces. For example, if the length of the rod is 8 and the values of different pieces are given as
follows, then the maximum obtainable value is 22 (by cutting it into two pieces of lengths 2 and 6).
length   | 1   2   3   4   5   6   7   8
--------------------------------------------
price    | 1   5   8   9  10  17  17  20

And if the prices are as follows, then the maximum obtainable value is 24 (by cutting it into eight pieces of length 1).
length   | 1   2   3   4   5   6   7   8
--------------------------------------------
price    | 3   5   8   9  10  17  17  20

The naive solution for this problem is to generate all configurations of different pieces and find the highest priced configuration. This solution is
exponential in terms of time complexity. Let us see how this problem possesses both important properties of a Dynamic Programming (DP)
Problem and can be efficiently solved using Dynamic Programming.
1) Optimal Substructure:
We can get the best price by making a cut at different positions and comparing the values obtained after a cut. We can recursively call the same
function for a piece obtained after a cut.
Let cutRod(n) be the required (best possible price) value for a rod of length n. cutRod(n) can be written as follows.
cutRod(n) = max(price[i] + cutRod(n-i-1)) for all i in {0, 1 .. n-1}
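
As a quick check of this recurrence against the two price tables above, here is a memoized Python sketch. The name cut_rod and the inner helper best are illustrative assumptions; price[i] is taken to be the price of a piece of length i+1, as in the formula.

```python
from functools import lru_cache


def cut_rod(price):
    """Best obtainable value for a rod of length len(price)."""

    @lru_cache(maxsize=None)
    def best(n):
        # A rod of length 0 yields nothing
        if n <= 0:
            return 0
        # First piece has length i+1; the remaining n-i-1 inches are cut optimally
        return max(price[i] + best(n - i - 1) for i in range(n))

    return best(len(price))


print(cut_rod([1, 5, 8, 9, 10, 17, 17, 20]))  # 22
print(cut_rod([3, 5, 8, 9, 10, 17, 17, 20]))  # 24
```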
2) Overlapping Subproblems
Following is a simple recursive implementation of the Rod Cutting problem. The implementation simply follows the recursive structure mentioned
above.
// A Naive recursive solution for Rod cutting problem
#include<stdio.h>
#include<limits.h>
// A utility function to get the maximum of two integers
int max(int a, int b) { return (a > b)? a : b;}
/* Returns the best obtainable price for a rod of length n and
   price[] as prices of different pieces */
int cutRod(int price[], int n)
{
    if (n <= 0)
        return 0;
    int max_val = INT_MIN;
    // Recursively cut the rod in different pieces and compare different
    // configurations
    for (int i = 0; i<n; i++)
        max_val = max(max_val, price[i] + cutRod(price, n-i-1));
    return max_val;
}
/* Driver program to test above functions */
int main()
{
    int arr[] = {1, 5, 8, 9, 10, 17, 17, 20};
    int size = sizeof(arr)/sizeof(arr[0]);
    printf("Maximum Obtainable Value is %d\n", cutRod(arr, size));
    getchar();
    return 0;
}

Output:
Maximum Obtainable Value is 22

Considering the above implementation, following is recursion tree for a Rod of length 4.
cR() ---> cutRod()

                              cR(4)
                /          /        \        \
           cR(3)         cR(2)      cR(1)    cR(0)
         /   |   \       /   \        |
     cR(2) cR(1) cR(0) cR(1) cR(0)  cR(0)
     /  \    |           |
 cR(1) cR(0) cR(0)     cR(0)
   |
 cR(0)

In the above recursion tree, cR(2) is solved twice. We can see that there are many subproblems which are solved again and again.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Rod Cutting problem has both properties
of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems, recomputations of the same
subproblems can be avoided by constructing a temporary array val[] in a bottom up manner.

C++
// A Dynamic Programming solution for Rod cutting problem
#include<stdio.h>
#include<limits.h>
// A utility function to get the maximum of two integers
int max(int a, int b) { return (a > b)? a : b;}
/* Returns the best obtainable price for a rod of length n and
price[] as prices of different pieces */
int cutRod(int price[], int n)
{
int val[n+1];
val[0] = 0;
int i, j;
// Build the table val[] in bottom up manner and return the last entry
// from the table
for (i = 1; i<=n; i++)
{
int max_val = INT_MIN;
for (j = 0; j < i; j++)
max_val = max(max_val, price[j] + val[i-j-1]);
val[i] = max_val;
}
return val[n];
}
/* Driver program to test above functions */
int main()
{
int arr[] = {1, 5, 8, 9, 10, 17, 17, 20};
int size = sizeof(arr)/sizeof(arr[0]);
printf("Maximum Obtainable Value is %d\n", cutRod(arr, size));
getchar();
return 0;
}

Python
# A Dynamic Programming solution for Rod cutting problem
INT_MIN = -32767
# Returns the best obtainable price for a rod of length n and
# price[] as prices of different pieces
def cutRod(price, n):
    val = [0 for x in range(n+1)]
    val[0] = 0
    # Build the table val[] in bottom up manner and return
    # the last entry from the table
    for i in range(1, n+1):
        max_val = INT_MIN
        for j in range(i):
            max_val = max(max_val, price[j] + val[i-j-1])
        val[i] = max_val
    return val[n]
# Driver program to test above functions
arr = [1, 5, 8, 9, 10, 17, 17, 20]
size = len(arr)
print("Maximum Obtainable Value is " + str(cutRod(arr, size)))

# This code is contributed by Bhavya Jain

Output:
Maximum Obtainable Value is 22

The time complexity of the above implementation is O(n^2), which is much better than the exponential worst case time complexity of the naive recursive
implementation.
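The bottom up table can also be extended to report which pieces give the best value. The following Python sketch is an assumption-laden illustration rather than part of the original solution: the function name cut_rod_with_pieces and the helper array first_piece are hypothetical, and price[k] is taken to be the price of a piece of length k+1 as in the code above.

```python
def cut_rod_with_pieces(price, n):
    """Return (best value, list of piece lengths) for a rod of length n."""
    val = [0] * (n + 1)          # val[i] = best value for a rod of length i
    first_piece = [0] * (n + 1)  # length of the first piece in one optimal cut of i
    for i in range(1, n + 1):
        best = float("-inf")
        for j in range(i):
            candidate = price[j] + val[i - j - 1]
            if candidate > best:
                best = candidate
                first_piece[i] = j + 1
        val[i] = best
    # Walk back through first_piece[] to list the pieces of one optimal solution
    pieces = []
    remaining = n
    while remaining > 0:
        pieces.append(first_piece[remaining])
        remaining -= first_piece[remaining]
    return val[n], pieces


print(cut_rod_with_pieces([1, 5, 8, 9, 10, 17, 17, 20], 8))  # (22, [2, 6])
```

Recording only the first piece for every length keeps the extra space at O(n) while still allowing the full cut to be rebuilt.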

Dynamic Programming | Set 14 (Maximum Sum Increasing Subsequence)


Given an array of n positive integers, write a program to find the sum of the maximum sum subsequence of the given array such that the integers in the
subsequence are sorted in increasing order. For example, if the input is {1, 101, 2, 3, 100, 4, 5}, then the output should be 106 (1 + 2 + 3 + 100); if the
input array is {3, 4, 5, 10}, then the output should be 22 (3 + 4 + 5 + 10); and if the input array is {10, 5, 4, 3}, then the output should be 10.
Solution
This problem is a variation of the standard Longest Increasing Subsequence (LIS) problem. We need a slight change in the Dynamic Programming
solution of the LIS problem. All we need to change is to use the sum as the criterion instead of the length of the increasing subsequence.
Following are C/C++ and Python implementations for Dynamic Programming solution of the problem.

C/C++
/* Dynamic Programming implementation of Maximum Sum Increasing
Subsequence (MSIS) problem */
#include<stdio.h>
/* maxSumIS() returns the maximum sum of increasing subsequence
in arr[] of size n */
int maxSumIS( int arr[], int n )
{
int i, j, max = 0;
int msis[n];
/* Initialize msis values for all indexes */
for ( i = 0; i < n; i++ )
msis[i] = arr[i];
/* Compute maximum sum values in bottom up manner */
for ( i = 1; i < n; i++ )
for ( j = 0; j < i; j++ )
if ( arr[i] > arr[j] && msis[i] < msis[j] + arr[i])
msis[i] = msis[j] + arr[i];
/* Pick maximum of all msis values */
for ( i = 0; i < n; i++ )
if ( max < msis[i] )
max = msis[i];
return max;
}
/* Driver program to test above function */
int main()
{
int arr[] = {1, 101, 2, 3, 100, 4, 5};
int n = sizeof(arr)/sizeof(arr[0]);
printf("Sum of maximum sum increasing subsequence is %d\n",
maxSumIS( arr, n ) );
return 0;
}

Python
# Dynamic Programming based Python implementation of Maximum Sum Increasing
# Subsequence (MSIS) problem
# maxSumIS() returns the maximum sum of increasing subsequence in arr[] of
# size n
def maxSumIS(arr, n):
    # named max_sum so that the built-in max() is not shadowed
    max_sum = 0
    msis = [0 for x in range(n)]
    # Initialize msis values for all indexes
    for i in range(n):
        msis[i] = arr[i]
    # Compute maximum sum values in bottom up manner
    for i in range(1, n):
        for j in range(i):
            if arr[i] > arr[j] and msis[i] < msis[j] + arr[i]:
                msis[i] = msis[j] + arr[i]
    # Pick maximum of all msis values
    for i in range(n):
        if max_sum < msis[i]:
            max_sum = msis[i]
    return max_sum
# Driver program to test above function
arr = [1, 101, 2, 3, 100, 4, 5]
n = len(arr)
print("Sum of maximum sum increasing subsequence is " +
      str(maxSumIS(arr, n)))
# This code is contributed by Bhavya Jain

Output:
Sum of maximum sum increasing subsequence is 106

Time Complexity: O(n^2)
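The same table can be used to recover the subsequence itself, not just its sum. The sketch below is a hedged illustration: the function name max_sum_increasing_subsequence and the prev[] array of back-pointers are assumptions introduced for this example and do not appear in the original programs.

```python
def max_sum_increasing_subsequence(arr):
    """Return (maximum sum, one increasing subsequence achieving it)."""
    n = len(arr)
    if n == 0:
        return 0, []
    msis = list(arr)   # msis[i] = best sum of an increasing subsequence ending at i
    prev = [-1] * n    # prev[i] = index of the previous element in that subsequence
    for i in range(1, n):
        for j in range(i):
            if arr[j] < arr[i] and msis[j] + arr[i] > msis[i]:
                msis[i] = msis[j] + arr[i]
                prev[i] = j
    # Start from the index with the largest sum and follow the back-pointers
    end = max(range(n), key=lambda i: msis[i])
    total = msis[end]
    seq = []
    while end != -1:
        seq.append(arr[end])
        end = prev[end]
    seq.reverse()
    return total, seq


print(max_sum_increasing_subsequence([1, 101, 2, 3, 100, 4, 5]))
# (106, [1, 2, 3, 100])
```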


Source: Maximum Sum Increasing Subsequence Problem

Dynamic Programming | Set 15 (Longest Bitonic Subsequence)


Given an array arr[0..n-1] containing n positive integers, a subsequence of arr[] is called Bitonic if it is first increasing, then decreasing. Write a
function that takes an array as argument and returns the length of the longest bitonic subsequence.
A sequence sorted in increasing order is considered Bitonic with the decreasing part empty. Similarly, a sequence sorted in decreasing order is considered
Bitonic with the increasing part empty.
Examples:
Input arr[] = {1, 11, 2, 10, 4, 5, 2, 1};
Output: 6 (A Longest Bitonic Subsequence of length 6 is 1, 2, 10, 4, 2, 1)
Input arr[] = {12, 11, 40, 5, 3, 1}
Output: 5 (A Longest Bitonic Subsequence of length 5 is 12, 11, 5, 3, 1)
Input arr[] = {80, 60, 30, 40, 20, 10}
Output: 5 (A Longest Bitonic Subsequence of length 5 is 80, 60, 30, 20, 10)

Source: Microsoft Interview Question


Solution
This problem is a variation of the standard Longest Increasing Subsequence (LIS) problem. Let the input array be arr[] of length n. We need to
construct two arrays lis[] and lds[] using the Dynamic Programming solution of the LIS problem. lis[i] stores the length of the longest increasing
subsequence ending with arr[i]. lds[i] stores the length of the longest decreasing subsequence starting from arr[i]. Finally, we need to return the
max value of lis[i] + lds[i] - 1 where i is from 0 to n-1. The 1 is subtracted because arr[i] is counted in both lis[i] and lds[i].
Following are C++ and Java implementations of the above Dynamic Programming solution.

C++
/* Dynamic Programming implementation of longest bitonic subsequence problem */
#include<stdio.h>
#include<stdlib.h>
/* lbs() returns the length of the Longest Bitonic Subsequence in
arr[] of size n. The function mainly creates two temporary arrays
lis[] and lds[] and returns the maximum lis[i] + lds[i] - 1.
lis[i] ==> Longest Increasing subsequence ending with arr[i]
lds[i] ==> Longest decreasing subsequence starting with arr[i]
*/
int lbs( int arr[], int n )
{
int i, j;
/* Allocate memory for LIS[] and initialize LIS values as 1 for
all indexes */
int *lis = new int[n];
for (i = 0; i < n; i++)
lis[i] = 1;
/* Compute LIS values from left to right */
for (i = 1; i < n; i++)
for (j = 0; j < i; j++)
if (arr[i] > arr[j] && lis[i] < lis[j] + 1)
lis[i] = lis[j] + 1;
/* Allocate memory for lds and initialize LDS values for
all indexes */
int *lds = new int [n];
for (i = 0; i < n; i++)
lds[i] = 1;
/* Compute LDS values from right to left */
for (i = n-2; i >= 0; i--)
for (j = n-1; j > i; j--)
if (arr[i] > arr[j] && lds[i] < lds[j] + 1)
lds[i] = lds[j] + 1;
/* Return the maximum value of lis[i] + lds[i] - 1 */
int max = lis[0] + lds[0] - 1;
for (i = 1; i < n; i++)
if (lis[i] + lds[i] - 1 > max)
max = lis[i] + lds[i] - 1;
/* Free the temporary arrays before returning */
delete[] lis;
delete[] lds;
return max;
}
/* Driver program to test above function */
int main()
{
int arr[] = {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5,
13, 3, 11, 7, 15};
int n = sizeof(arr)/sizeof(arr[0]);
printf("Length of LBS is %d\n", lbs( arr, n ) );
return 0;
}

Java
/* Dynamic Programming implementation in Java for longest bitonic
subsequence problem */
import java.util.*;
import java.lang.*;
import java.io.*;
class LBS
{
/* lbs() returns the length of the Longest Bitonic Subsequence in
arr[] of size n. The function mainly creates two temporary arrays
lis[] and lds[] and returns the maximum lis[i] + lds[i] - 1.
lis[i] ==> Longest Increasing subsequence ending with arr[i]
lds[i] ==> Longest decreasing subsequence starting with arr[i]
*/
static int lbs( int arr[], int n )
{
int i, j;
/* Allocate memory for LIS[] and initialize LIS values as 1 for
all indexes */
int[] lis = new int[n];
for (i = 0; i < n; i++)
lis[i] = 1;
/* Compute LIS values from left to right */
for (i = 1; i < n; i++)
for (j = 0; j < i; j++)
if (arr[i] > arr[j] && lis[i] < lis[j] + 1)
lis[i] = lis[j] + 1;
/* Allocate memory for lds and initialize LDS values for
all indexes */
int[] lds = new int [n];
for (i = 0; i < n; i++)
lds[i] = 1;
/* Compute LDS values from right to left */
for (i = n-2; i >= 0; i--)
for (j = n-1; j > i; j--)
if (arr[i] > arr[j] && lds[i] < lds[j] + 1)
lds[i] = lds[j] + 1;
/* Return the maximum value of lis[i] + lds[i] - 1*/
int max = lis[0] + lds[0] - 1;
for (i = 1; i < n; i++)
if (lis[i] + lds[i] - 1 > max)
max = lis[i] + lds[i] - 1;
return max;
}
public static void main (String[] args)
{
int arr[] = {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5,
13, 3, 11, 7, 15};
int n = arr.length;
System.out.println("Length of LBS is "+ lbs( arr, n ));
}
}

Output:
Length of LBS is 7

Time Complexity: O(n^2)


Auxiliary Space: O(n)
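For comparison with the C++ and Java programs above, the lis[] and lds[] construction can be sketched in Python as follows. The function name longest_bitonic_subsequence is an assumption made for this sketch; the logic is meant to mirror the two passes described in the solution.

```python
def longest_bitonic_subsequence(arr):
    """Length of the longest subsequence that first increases, then decreases."""
    n = len(arr)
    if n == 0:
        return 0
    # lis[i] = length of the longest increasing subsequence ending at arr[i]
    lis = [1] * n
    for i in range(1, n):
        for j in range(i):
            if arr[j] < arr[i]:
                lis[i] = max(lis[i], lis[j] + 1)
    # lds[i] = length of the longest decreasing subsequence starting at arr[i]
    lds = [1] * n
    for i in range(n - 2, -1, -1):
        for j in range(i + 1, n):
            if arr[j] < arr[i]:
                lds[i] = max(lds[i], lds[j] + 1)
    # arr[i] is counted in both arrays, hence the -1
    return max(lis[i] + lds[i] - 1 for i in range(n))


print(longest_bitonic_subsequence([1, 11, 2, 10, 4, 5, 2, 1]))  # 6
```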

Dynamic Programming | Set 16 (Floyd Warshall Algorithm)


The Floyd Warshall Algorithm is for solving the All Pairs Shortest Path problem. The problem is to find shortest distances between every pair of
vertices in a given edge weighted directed Graph.
Example:
Input:
graph[][] = { {0, 5, INF, 10},
{INF, 0, 3, INF},
{INF, INF, 0, 1},
{INF, INF, INF, 0} }
which represents the following graph
             10
       (0)------->(3)
        |         /|\
      5 |          |
        |          | 1
       \|/         |
       (1)------->(2)
            3
Note that the value of graph[i][j] is 0 if i is equal to j
And graph[i][j] is INF (infinite) if there is no edge from vertex i to j.
Output:
Shortest distance matrix
      0      5      8      9
    INF      0      3      4
    INF    INF      0      1
    INF    INF    INF      0

Floyd Warshall Algorithm


We initialize the solution matrix same as the input graph matrix as a first step. Then we update the solution matrix by considering all vertices as an
intermediate vertex. The idea is to one by one pick all vertices and update all shortest paths which include the picked vertex as an intermediate
vertex in the shortest path. When we pick vertex number k as an intermediate vertex, we already have considered vertices {0, 1, 2, .. k-1} as
intermediate vertices. For every pair (i, j) of source and destination vertices respectively, there are two possible cases.
1) k is not an intermediate vertex in the shortest path from i to j. We keep the value of dist[i][j] as it is.
2) k is an intermediate vertex in the shortest path from i to j. We update the value of dist[i][j] to dist[i][k] + dist[k][j], which happens exactly when dist[i][k] + dist[k][j] < dist[i][j].
The following figure is taken from the Cormen book. It shows the above optimal substructure property in the all-pairs shortest path problem.

Following are implementations of the Floyd Warshall algorithm.

C/C++
// C Program for Floyd Warshall Algorithm
#include<stdio.h>
// Number of vertices in the graph
#define V 4
/* Define Infinite as a large enough value. This value will be used
for vertices not connected to each other */
#define INF 99999
// A function to print the solution matrix
void printSolution(int dist[][V]);
// Solves the all-pairs shortest path problem using Floyd Warshall algorithm
void floydWarshell (int graph[][V])
{
/* dist[][] will be the output matrix that will finally have the shortest
distances between every pair of vertices */
int dist[V][V], i, j, k;
/* Initialize the solution matrix same as input graph matrix. Or
we can say the initial values of shortest distances are based
on shortest paths considering no intermediate vertex. */

for (i = 0; i < V; i++)
for (j = 0; j < V; j++)
dist[i][j] = graph[i][j];
/* Add all vertices one by one to the set of intermediate vertices.
---> Before the start of an iteration, we have shortest distances between all
pairs of vertices such that the shortest distances consider only the
vertices in set {0, 1, 2, .. k-1} as intermediate vertices.
----> After the end of an iteration, vertex no. k is added to the set of
intermediate vertices and the set becomes {0, 1, 2, .. k} */
for (k = 0; k < V; k++)
{
// Pick all vertices as source one by one
for (i = 0; i < V; i++)
{
// Pick all vertices as destination for the
// above picked source
for (j = 0; j < V; j++)
{
// If vertex k is on the shortest path from
// i to j, then update the value of dist[i][j]
if (dist[i][k] + dist[k][j] < dist[i][j])
dist[i][j] = dist[i][k] + dist[k][j];
}
}
}
// Print the shortest distance matrix
printSolution(dist);
}
/* A utility function to print solution */
void printSolution(int dist[][V])
{
printf ("Following matrix shows the shortest distances"
" between every pair of vertices \n");
for (int i = 0; i < V; i++)
{
for (int j = 0; j < V; j++)
{
if (dist[i][j] == INF)
printf("%7s", "INF");
else
printf ("%7d", dist[i][j]);
}
printf("\n");
}
}
// driver program to test above function
int main()
{
/* Let us create the following weighted graph
             10
       (0)------->(3)
        |         /|\
      5 |          |
        |          | 1
       \|/         |
       (1)------->(2)
            3
*/
int graph[V][V] = { {0, 5, INF, 10},
{INF, 0, 3, INF},
{INF, INF, 0, 1},
{INF, INF, INF, 0}
};
// Print the solution
floydWarshell(graph);
return 0;
}

Java
// A Java program for Floyd Warshall All Pairs Shortest
// Path algorithm.
import java.util.*;
import java.lang.*;
import java.io.*;

class AllPairShortestPath
{
final static int INF = 99999, V = 4;
void floydWarshall(int graph[][])
{
int dist[][] = new int[V][V];
int i, j, k;
/* Initialize the solution matrix same as input graph matrix.
Or we can say the initial values of shortest distances
are based on shortest paths considering no intermediate
vertex. */
for (i = 0; i < V; i++)
for (j = 0; j < V; j++)
dist[i][j] = graph[i][j];
/* Add all vertices one by one to the set of intermediate
vertices.
---> Before the start of an iteration, we have shortest
distances between all pairs of vertices such that
the shortest distances consider only the vertices in
set {0, 1, 2, .. k-1} as intermediate vertices.
----> After the end of an iteration, vertex no. k is added
to the set of intermediate vertices and the set
becomes {0, 1, 2, .. k} */
for (k = 0; k < V; k++)
{
// Pick all vertices as source one by one
for (i = 0; i < V; i++)
{
// Pick all vertices as destination for the
// above picked source
for (j = 0; j < V; j++)
{
// If vertex k is on the shortest path from
// i to j, then update the value of dist[i][j]
if (dist[i][k] + dist[k][j] < dist[i][j])
dist[i][j] = dist[i][k] + dist[k][j];
}
}
}
// Print the shortest distance matrix
printSolution(dist);
}
void printSolution(int dist[][])
{
System.out.println("Following matrix shows the shortest "+
"distances between every pair of vertices");
for (int i=0; i<V; ++i)
{
for (int j=0; j<V; ++j)
{
if (dist[i][j]==INF)
System.out.print("INF ");
else
System.out.print(dist[i][j]+" ");
}
System.out.println();
}
}
// Driver program to test above function
public static void main (String[] args)
{
/* Let us create the following weighted graph
             10
       (0)------->(3)
        |         /|\
      5 |          |
        |          | 1
       \|/         |
       (1)------->(2)
            3
*/
int graph[][] = { {0, 5, INF, 10},
{INF, 0, 3, INF},
{INF, INF, 0, 1},
{INF, INF, INF, 0}

};
AllPairShortestPath a = new AllPairShortestPath();
// Print the solution
a.floydWarshall(graph);
}
}
// Contributed by Aakash Hasija

Output:
Following matrix shows the shortest distances between every pair of vertices
      0      5      8      9
    INF      0      3      4
    INF    INF      0      1
    INF    INF    INF      0

Time Complexity: O(V^3)


The above program only prints the shortest distances. We can modify the solution to also print the shortest paths by storing the predecessor
information in a separate 2D matrix.
Also, the value of INF can be taken as INT_MAX from limits.h to make sure that we handle the maximum possible value. When we take INF as
INT_MAX, we need to change the if condition in the above program to avoid arithmetic overflow.
#include<limits.h>
#define INF INT_MAX
..........................
if (dist[i][k] != INF && dist[k][j] != INF && dist[i][k] + dist[k][j] < dist[i][j])
dist[i][j] = dist[i][k] + dist[k][j];
...........................
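The predecessor matrix mentioned above could look like the following Python sketch. It is a hedged illustration, not the original program: the names floyd_warshall_with_paths, shortest_path and pred are assumptions, float("inf") is used as INF so that no overflow check is needed, and the graph is assumed to contain no negative cycles.

```python
INF = float("inf")  # adding to float infinity stays infinity, so no overflow guard


def floyd_warshall_with_paths(graph):
    """Return (dist, pred); pred[i][j] is the vertex just before j on a shortest i -> j path."""
    n = len(graph)
    dist = [row[:] for row in graph]
    pred = [[i if i != j and graph[i][j] != INF else None for j in range(n)]
            for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    # the path now ends the same way as the shortest k -> j path
                    pred[i][j] = pred[k][j]
    return dist, pred


def shortest_path(pred, i, j):
    """Rebuild the vertex list of a shortest path from i to j ([] if unreachable)."""
    if i == j:
        return [i]
    if pred[i][j] is None:
        return []
    path = [j]
    while j != i:
        j = pred[i][j]
        path.append(j)
    path.reverse()
    return path


graph = [[0, 5, INF, 10],
         [INF, 0, 3, INF],
         [INF, INF, 0, 1],
         [INF, INF, INF, 0]]
dist, pred = floyd_warshall_with_paths(graph)
print(dist[0][3], shortest_path(pred, 0, 3))  # 9 [0, 1, 2, 3]
```

The extra matrix adds O(V^2) space and does not change the O(V^3) running time.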

Dynamic Programming | Set 17 (Palindrome Partitioning)


Given a string, a partitioning of the string is a palindrome partitioning if every substring of the partition is a palindrome. For example,
aba|b|bbabb|a|b|aba is a palindrome partitioning of ababbbabbababa. Determine the fewest cuts needed for palindrome partitioning of a given
string. For example, a minimum of 3 cuts are needed for ababbbabbababa. The three cuts are a|babbbab|b|ababa. If a string is a palindrome, then
0 cuts are needed. If a string of length n contains all different characters, then n-1 cuts are needed.
Solution
This problem is a variation of the Matrix Chain Multiplication problem. If the string is a palindrome, then we simply return 0. Else, like in the Matrix Chain
Multiplication problem, we try making cuts at all possible places, recursively calculate the cost for each cut and return the minimum value.
Let the given string be str and minPalPartion() be the function that returns the fewest cuts needed for palindrome partitioning. Following is the
optimal substructure property.
// i is the starting index and j is the ending index. i must be passed as 0 and j as n-1
minPalPartion(str, i, j) = 0 if i == j. // When string is of length 1.
minPalPartion(str, i, j) = 0 if str[i..j] is palindrome.
// If none of the above conditions is true, then minPalPartion(str, i, j) can be
// calculated recursively using the following formula.
minPalPartion(str, i, j) = Min { minPalPartion(str, i, k) + 1 +
minPalPartion(str, k+1, j) }
where k varies from i to j-1
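The recursive formula above can be tried out directly with memoization. The Python sketch below is only an illustration under stated assumptions: the name min_pal_partition and the inner helper cuts are hypothetical, and functools.lru_cache is used to avoid recomputing overlapping subproblems.

```python
from functools import lru_cache


def min_pal_partition(s):
    """Fewest cuts needed so that every part of s is a palindrome."""

    @lru_cache(maxsize=None)
    def cuts(i, j):
        sub = s[i:j + 1]
        # Covers both base cases: a single character and a palindromic substring
        if sub == sub[::-1]:
            return 0
        # Try every cut position k and keep the cheapest combination
        return min(cuts(i, k) + 1 + cuts(k + 1, j) for k in range(i, j))

    return cuts(0, len(s) - 1) if s else 0


print(min_pal_partition("ababbbabbababa"))  # 3
```

There are O(n^2) pairs (i, j) and each one tries O(n) cut positions, which matches the cubic behaviour of the table-based solution.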

Following is Dynamic Programming solution. It stores the solutions to subproblems in two arrays P[][] and C[][], and reuses the calculated values.
// Dynamic Programming Solution for Palindrome Partitioning Problem
#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <stdbool.h> // needed for bool, true and false when compiled as C
// A utility function to get minimum of two integers
int min (int a, int b) { return (a < b)? a : b; }
// Returns the minimum number of cuts needed to partition a string
// such that every part is a palindrome
int minPalPartion(char *str)
{
// Get the length of the string
int n = strlen(str);
/* Create two arrays to build the solution in bottom up manner
C[i][j] = Minimum number of cuts needed for palindrome partitioning
of substring str[i..j]
P[i][j] = true if substring str[i..j] is palindrome, else false
Note that C[i][j] is 0 if P[i][j] is true */
int C[n][n];
bool P[n][n];
int i, j, k, L; // different looping variables
// Every substring of length 1 is a palindrome
for (i=0; i<n; i++)
{
P[i][i] = true;
C[i][i] = 0;
}
/* L is substring length. Build the solution in bottom up manner by
considering all substrings of length starting from 2 to n.
The loop structure is same as Matrix Chain Multiplication problem (
See http://www.geeksforgeeks.org/archives/15553 )*/
for (L=2; L<=n; L++)
{
// For substring of length L, set different possible starting indexes
for (i=0; i<n-L+1; i++)
{
j = i+L-1; // Set ending index
// If L is 2, then we just need to compare two characters. Else
// need to check two corner characters and value of P[i+1][j-1]
if (L == 2)
P[i][j] = (str[i] == str[j]);
else
P[i][j] = (str[i] == str[j]) && P[i+1][j-1];
// If str[i..j] is palindrome, then C[i][j] is 0
if (P[i][j] == true)
C[i][j] = 0;
else
{
// Make a cut at every possible location starting from i to j,
// and get the minimum cost cut.
C[i][j] = INT_MAX;
for (k=i; k<=j-1; k++)
C[i][j] = min (C[i][j], C[i][k] + C[k+1][j]+1);
}
}
}
// Return the min cut value for complete string. i.e., str[0..n-1]
return C[0][n-1];
}
// Driver program to test above function
int main()
{
char str[] = "ababbbabbababa";
printf("Min cuts needed for Palindrome Partitioning is %d",
minPalPartion(str));
return 0;
}

Output:
Min cuts needed for Palindrome Partitioning is 3

Time Complexity: O(n^3)


An optimization to above approach
In the above approach, the minimum cuts are calculated while the palindromic substrings are being found, using a 2D table C[][]. If we instead find all
palindromic substrings first and then calculate the minimum cuts for each prefix str[0..i] in a 1D array C[], the time complexity reduces to O(n^2).
Thanks to Vivek for suggesting this optimization.
// Dynamic Programming Solution for Palindrome Partitioning Problem
#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <stdbool.h> // needed for bool, true and false when compiled as C
// A utility function to get minimum of two integers
int min (int a, int b) { return (a < b)? a : b; }
// Returns the minimum number of cuts needed to partition a string
// such that every part is a palindrome
int minPalPartion(char *str)
{
// Get the length of the string
int n = strlen(str);
/* Create two arrays to build the solution in bottom up manner
C[i] = Minimum number of cuts needed for palindrome partitioning
of substring str[0..i]
P[i][j] = true if substring str[i..j] is palindrome, else false
Note that C[i] is 0 if P[0][i] is true */
int C[n];
bool P[n][n];
int i, j, k, L; // different looping variables
// Every substring of length 1 is a palindrome
for (i=0; i<n; i++)
{
P[i][i] = true;
}
/* L is substring length. Build the solution in bottom up manner by
considering all substrings of length starting from 2 to n. */
for (L=2; L<=n; L++)
{
// For substring of length L, set different possible starting indexes
for (i=0; i<n-L+1; i++)
{
j = i+L-1; // Set ending index
// If L is 2, then we just need to compare two characters. Else
// need to check two corner characters and value of P[i+1][j-1]
if (L == 2)
P[i][j] = (str[i] == str[j]);
else
P[i][j] = (str[i] == str[j]) && P[i+1][j-1];
}
}
for (i=0; i<n; i++)
{
if (P[0][i] == true)
C[i] = 0;
else
{
C[i] = INT_MAX;
for(j=0;j<i;j++)
{
if(P[j+1][i] == true && 1+C[j]<C[i])
C[i]=1+C[j];
}
}
}
// Return the min cut value for complete string. i.e., str[0..n-1]
return C[n-1];
}
// Driver program to test above function
int main()
{
char str[] = "ababbbabbababa";
printf("Min cuts needed for Palindrome Partitioning is %d",
minPalPartion(str));
return 0;
}

Output:
Min cuts needed for Palindrome Partitioning is 3

Time Complexity: O(n^2)

Dynamic Programming | Set 18 (Partition problem)


Partition problem is to determine whether a given set can be partitioned into two subsets such that the sum of elements in both subsets is same.
Examples
arr[] = {1, 5, 11, 5}
Output: true
The array can be partitioned as {1, 5, 5} and {11}
arr[] = {1, 5, 3}
Output: false
The array cannot be partitioned into equal sum sets.

Following are the two main steps to solve this problem:


1) Calculate the sum of the array. If the sum is odd, there cannot be two subsets with equal sum, so return false.
2) If the sum of the array elements is even, calculate sum/2 and find a subset of the array with sum equal to sum/2.
The first step is simple. The second step is crucial; it can be solved using either recursion or Dynamic Programming.
Recursive Solution
Following is the recursive property of the second step mentioned above.
Let isSubsetSum(arr, n, sum/2) be the function that returns true if
there is a subset of arr[0..n-1] with sum equal to sum/2
The isSubsetSum problem can be divided into two subproblems
a) isSubsetSum() without considering last element
(reducing n to n-1)
b) isSubsetSum considering the last element
(reducing sum/2 by arr[n-1] and n to n-1)
If any of the above subproblems returns true, then return true.
isSubsetSum (arr, n, sum/2) = isSubsetSum (arr, n-1, sum/2) ||
isSubsetSum (arr, n-1, sum/2 - arr[n-1])

C/C++
// A recursive C program for partition problem
#include <stdio.h>
#include <stdbool.h> // needed for bool, true and false when compiled as C
// A utility function that returns true if there is
// a subset of arr[] with sum equal to given sum
bool isSubsetSum (int arr[], int n, int sum)
{
// Base Cases
if (sum == 0)
return true;
if (n == 0 && sum != 0)
return false;
// If last element is greater than sum, then
// ignore it
if (arr[n-1] > sum)
return isSubsetSum (arr, n-1, sum);
/* else, check if sum can be obtained by any of
the following
(a) including the last element
(b) excluding the last element
*/
return isSubsetSum (arr, n-1, sum) ||
isSubsetSum (arr, n-1, sum-arr[n-1]);
}
// Returns true if arr[] can be partitioned in two
// subsets of equal sum, otherwise false
bool findPartiion (int arr[], int n)
{
// Calculate sum of the elements in array
int sum = 0;
for (int i = 0; i < n; i++)
sum += arr[i];
// If sum is odd, there cannot be two subsets
// with equal sum
if (sum%2 != 0)
return false;

// Find if there is subset with sum equal to
// half of total sum
return isSubsetSum (arr, n, sum/2);
}
// Driver program to test above function
int main()
{
int arr[] = {3, 1, 5, 9, 12};
int n = sizeof(arr)/sizeof(arr[0]);
if (findPartiion(arr, n) == true)
printf("Can be divided into two subsets "
"of equal sum");
else
printf("Can not be divided into two subsets"
" of equal sum");
return 0;
}

Java
// A recursive Java solution for partition problem
import java.io.*;
class Partition
{
// A utility function that returns true if there is a
// subset of arr[] with sum equal to given sum
static boolean isSubsetSum (int arr[], int n, int sum)
{
// Base Cases
if (sum == 0)
return true;
if (n == 0 && sum != 0)
return false;
// If last element is greater than sum, then ignore it
if (arr[n-1] > sum)
return isSubsetSum (arr, n-1, sum);
/* else, check if sum can be obtained by any of
the following
(a) including the last element
(b) excluding the last element
*/
return isSubsetSum (arr, n-1, sum) ||
isSubsetSum (arr, n-1, sum-arr[n-1]);
}
// Returns true if arr[] can be partitioned in two
// subsets of equal sum, otherwise false
static boolean findPartition (int arr[], int n)
{
// Calculate sum of the elements in array
int sum = 0;
for (int i = 0; i < n; i++)
sum += arr[i];
// If sum is odd, there cannot be two subsets
// with equal sum
if (sum%2 != 0)
return false;
// Find if there is subset with sum equal to half
// of total sum
return isSubsetSum (arr, n, sum/2);
}
/*Driver function to check for above function*/
public static void main (String[] args)
{
int arr[] = {3, 1, 5, 9, 12};
int n = arr.length;
if (findPartition(arr, n) == true)
System.out.println("Can be divided into two "+
"subsets of equal sum");
else
System.out.println("Can not be divided into " +
"two subsets of equal sum");
}
}
/* This code is contributed by Devesh Agrawal */

Output:
Can be divided into two subsets of equal sum

Time Complexity: O(2^n) In worst case, this solution tries two possibilities (whether to include or exclude) for every element.

Dynamic Programming Solution


The problem can be solved using dynamic programming when the sum of the elements is not too big. We can create a 2D array part[][] of size
(sum/2+1)*(n+1). We can then construct the solution in a bottom up manner such that every filled entry has the following property
part[i][j] = true if a subset of {arr[0], arr[1], ..arr[j-1]} has sum
equal to i, otherwise false

C/C++
// A Dynamic Programming based C program to partition problem
#include <stdio.h>
#include <stdbool.h> // needed for bool, true and false when compiled as C
// Returns true if arr[] can be partitioned in two subsets of
// equal sum, otherwise false
bool findPartiion (int arr[], int n)
{
int sum = 0;
int i, j;
// Calculate sum of all elements
for (i = 0; i < n; i++)
sum += arr[i];
if (sum%2 != 0)
return false;
bool part[sum/2+1][n+1];
// initialize top row as true
for (i = 0; i <= n; i++)
part[0][i] = true;
// initialize leftmost column, except part[0][0], as false
for (i = 1; i <= sum/2; i++)
part[i][0] = false;
// Fill the partition table in bottom up manner
for (i = 1; i <= sum/2; i++)
{
for (j = 1; j <= n; j++)
{
part[i][j] = part[i][j-1];
if (i >= arr[j-1])
part[i][j] = part[i][j] || part[i - arr[j-1]][j-1];
}
}
/* // uncomment this part to print table
for (i = 0; i <= sum/2; i++)
{
for (j = 0; j <= n; j++)
printf ("%4d", part[i][j]);
printf("\n");
} */
return part[sum/2][n];
}
// Driver program to test above function
int main()
{
int arr[] = {3, 1, 1, 2, 2, 1};
int n = sizeof(arr)/sizeof(arr[0]);
if (findPartiion(arr, n) == true)
printf("Can be divided into two subsets of equal sum");
else
printf("Can not be divided into two subsets of equal sum");
getchar();
return 0;
}

Java
// A dynamic programming based Java program for partition problem
import java.io.*;
class Partition {
// Returns true if arr[] can be partitioned in two subsets of
// equal sum, otherwise false
static boolean findPartition (int arr[], int n)
{
int sum = 0;
int i, j;
// Calculate sum of all elements
for (i = 0; i < n; i++)
sum += arr[i];
if (sum%2 != 0)
return false;
boolean part[][]=new boolean[sum/2+1][n+1];
// initialize top row as true
for (i = 0; i <= n; i++)
part[0][i] = true;
// initialize leftmost column, except part[0][0], as false
for (i = 1; i <= sum/2; i++)
part[i][0] = false;
// Fill the partition table in bottom up manner
for (i = 1; i <= sum/2; i++)
{
for (j = 1; j <= n; j++)
{
part[i][j] = part[i][j-1];
if (i >= arr[j-1])
part[i][j] = part[i][j] ||
part[i - arr[j-1]][j-1];
}
}
/* // uncomment this part to print table
for (i = 0; i <= sum/2; i++)
{
for (j = 0; j <= n; j++)
System.out.printf("%6b", part[i][j]);
System.out.println();
} */
return part[sum/2][n];
}
/*Driver function to check for above function*/
public static void main (String[] args)
{
int arr[] = {3, 1, 1, 2, 2,1};
int n = arr.length;
if (findPartition(arr, n) == true)
System.out.println("Can be divided into two " +
"subsets of equal sum");
else
System.out.println("Can not be divided into" +
" two subsets of equal sum");
}
}
/* This code is contributed by Devesh Agrawal */

Output:
Can be divided into two subsets of equal sum

The following diagram shows the values in the partition table. The diagram is taken from the wiki page of the partition problem.

Time Complexity: O(sum*n)


Auxiliary Space: O(sum*n)
Please note that this solution will not be feasible for arrays with big sum.
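When only the true/false answer is needed, the table can be shrunk to a single row. The Python sketch below shows this space-saving variant; it is a swapped-in variation and not the 2D part[][] method used in the programs above. The names can_partition and reachable are assumptions for this example, and the elements are assumed to be non-negative integers.

```python
def can_partition(arr):
    """True if arr can be split into two subsets with equal sums."""
    total = sum(arr)
    if total % 2 != 0:
        return False
    target = total // 2
    # reachable[s] is True if some subset of the elements seen so far sums to s
    reachable = [False] * (target + 1)
    reachable[0] = True
    for x in arr:
        # Go from high sums to low sums so that each element is used at most once
        for s in range(target, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return reachable[target]


print(can_partition([1, 5, 11, 5]))  # True
print(can_partition([1, 5, 3]))      # False
```

The running time stays O(sum*n), while the auxiliary space drops to O(sum).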
References:
http://en.wikipedia.org/wiki/Partition_problem

Dynamic Programming | Set 19 (Word Wrap Problem)


Given a sequence of words and a limit on the number of characters that can be put in one line (the line width), put line breaks in the given sequence
such that the lines are printed neatly. Assume that the length of each word is smaller than the line width.
Word processors like MS Word perform this task of placing line breaks. The idea is to have balanced lines; in other words, we want to avoid having a few lines with lots
of extra spaces and other lines with only a small number of extra spaces.
The extra spaces include spaces put at the end of every line except the last one.
The problem is to minimize the following total cost.
Cost of a line = (Number of extra spaces in the line)^3
Total Cost = Sum of costs for all lines
For example, consider the following string and line width M = 15
"Geeks for Geeks presents word wrap problem"
Following is the optimized arrangement of words in 3 lines
Geeks for Geeks
presents word
wrap problem
The extra spaces in line 1, line 2 and line 3 are 0, 2 and 3 respectively.
Counting all three lines with the cubic cost defined above, the total cost is 0 + 2*2*2 + 3*3*3 = 35.

Please note that the total cost function is not the sum of extra spaces, but the sum of cubes (the square is also used) of extra spaces. The idea behind this
cost function is to balance the spaces among lines. For example, consider the following two arrangements of the same set of words:
1) There are 3 lines. One line has 3 extra spaces and all other lines have 0 extra spaces. Total extra spaces = 3 + 0 + 0 = 3. Total cost = 3*3*3 +
0*0*0 + 0*0*0 = 27.
2) There are 3 lines. Each of the 3 lines has one extra space. Total extra spaces = 1 + 1 + 1 = 3. Total cost = 1*1*1 + 1*1*1 + 1*1*1 = 3.
Total extra spaces are 3 in both scenarios, but the second arrangement should be preferred because the extra spaces are balanced across all three lines. The
cost function with cubic sum serves the purpose because the value of total cost in the second scenario is less.
Method 1 (Greedy Solution)
The greedy solution is to place as many words as possible in the first line. Then do the same thing for the second line and so on until all words are
placed. This solution gives the optimal solution for many cases, but doesn't give the optimal solution in all cases. For example, consider the following string
"aaa bb cc ddddd" and line width 6. The greedy method will produce the following output.
aaa bb
cc
ddddd

Extra spaces in the above 3 lines are 0, 4 and 1 respectively. So total cost is 0 + 64 + 1 = 65.
But the above solution is not the best solution. The following arrangement has more balanced spaces and therefore a lower total cost.
aaa
bb cc
ddddd

Extra spaces in the above 3 lines are 3, 1 and 1 respectively. So total cost is 27 + 1 + 1 = 29.
Despite being sub-optimal in some cases, the greedy approach is used by many word processors like MS Word and OpenOffice.org Writer.
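The greedy method described above can be sketched in C as follows. This is only an illustration under stated assumptions: the function name greedyWrapCost is hypothetical, every word is assumed to fit within the line width, and the cost is the sum of cubes of extra spaces with the last line counted as well (as in the two examples above).

```c
/* Greedy line breaking: fill each line with as many words as fit.
   l[] holds word lengths, n is the word count, M is the line width.
   Returns the sum of cubes of extra spaces over all lines. */
long greedyWrapCost(const int l[], int n, int M)
{
    long total = 0;
    int i = 0;
    while (i < n)
    {
        int used = l[i];            /* characters used on the current line */
        int j = i + 1;
        /* add words while the next word plus one separating space fits */
        while (j < n && used + 1 + l[j] <= M)
        {
            used += 1 + l[j];
            j++;
        }
        long extra = M - used;      /* unused characters on this line */
        total += extra * extra * extra;
        i = j;                      /* the next line starts at word j */
    }
    return total;
}
```

For word lengths {3, 2, 2, 5} and M = 6, the function returns 65, which matches the greedy arrangement shown above.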
Method 2 (Dynamic Programming)
The following Dynamic Programming approach strictly follows the algorithm given in the solution of the Cormen book. First we compute the costs of all possible lines in a
2D table lc[][]. The value lc[i][j] indicates the cost of putting words from i to j in a single line, where i and j are indexes of words in the input
sequence. If a sequence of words from i to j cannot fit in a single line, then lc[i][j] is considered infinite (to prevent it from being a part of the
solution). Once we have the lc[][] table constructed, we can calculate the total cost using the following recursive formula. In the following formula, c[j] is
the optimized total cost for arranging words from 1 to j.

c[j] = 0 if j = 0
c[j] = min { c[i-1] + lc[i][j] : 1 <= i <= j } if j > 0

The above recursion has the overlapping subproblems property. For example, the solution of subproblem c[2] is used by c[3], c[4] and so on. So
Dynamic Programming is used to store the results of subproblems. The array c[] can be computed from left to right, since each value depends only
on earlier values.
To print the output, we keep track of which words go on which lines. We can keep a parallel array p[] that points to where each c value came from:
p[j] is the value of i that gives the minimum in the formula for c[j].

The last line starts at word p[n] and goes through word n. The previous line starts at word p[p[n] - 1] and goes through word p[n] - 1, etc. The function
printSolution() uses p[] to print the solution.
In the below program, the input is an array l[] that represents lengths of words in a sequence. The value l[i] indicates the length of the ith word (i starts
from 1) in the input sequence.
// A Dynamic programming solution for Word Wrap Problem
#include <limits.h>
#include <stdio.h>
#define INF INT_MAX
// A utility function to print the solution
int printSolution (int p[], int n);
// l[] represents lengths of different words in input sequence. For example,
// l[] = {3, 2, 2, 5} is for a sentence like "aaa bb cc ddddd". n is size of
// l[] and M is line width (maximum no. of characters that can fit in a line)
void solveWordWrap (int l[], int n, int M)
{
// For simplicity, 1 extra space is used in all below arrays
// extras[i][j] will have number of extra spaces if words from i
// to j are put in a single line
int extras[n+1][n+1];
// lc[i][j] will have cost of a line which has words from
// i to j
int lc[n+1][n+1];
// c[i] will have total cost of optimal arrangement of words
// from 1 to i
int c[n+1];
// p[] is used to print the solution.
int p[n+1];
int i, j;
// calculate extra spaces in a single line. The value extras[i][j]
// indicates extra spaces if words from word number i to j are
// placed in a single line
for (i = 1; i <= n; i++)
{
extras[i][i] = M - l[i-1];
for (j = i+1; j <= n; j++)
extras[i][j] = extras[i][j-1] - l[j-1] - 1;
}
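// Calculate line cost corresponding to the above calculated extra
// spaces. The value lc[i][j] indicates cost of putting words from
// word number i to j in a single line. Note that this program uses
// the square of the extra spaces as the line cost and charges
// nothing for the last line.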
for (i = 1; i <= n; i++)
{
for (j = i; j <= n; j++)
{
if (extras[i][j] < 0)
lc[i][j] = INF;
else if (j == n && extras[i][j] >= 0)
lc[i][j] = 0;
else
lc[i][j] = extras[i][j]*extras[i][j];
}
}
// Calculate minimum cost and find minimum cost arrangement.
// The value c[j] indicates optimized cost to arrange words
// from word number 1 to j.
c[0] = 0;
for (j = 1; j <= n; j++)
{
c[j] = INF;
for (i = 1; i <= j; i++)
{
if (c[i-1] != INF && lc[i][j] != INF && (c[i-1] + lc[i][j] < c[j]))
{
c[j] = c[i-1] + lc[i][j];
p[j] = i;
}
}
}

printSolution(p, n);
}
int printSolution (int p[], int n)
{
int k;
if (p[n] == 1)
k = 1;
else
k = printSolution (p, p[n]-1) + 1;
printf ("Line number %d: From word no. %d to %d \n", k, p[n], n);
return k;
}
// Driver program to test above functions
int main()
{
int l[] = {3, 2, 2, 5};
int n = sizeof(l)/sizeof(l[0]);
int M = 6;
solveWordWrap (l, n, M);
return 0;
}

Output:
Line number 1: From word no. 1 to 1
Line number 2: From word no. 2 to 3
Line number 3: From word no. 4 to 4

Time Complexity: O(n^2)


Auxiliary Space: O(n^2). The auxiliary space used in the above program can be optimized to O(n) (see reference 2 for details).
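One possible way to reach O(n) auxiliary space is to compute the line costs on the fly instead of storing the extras[][] and lc[][] tables. The following is a minimal sketch, not the method of the cited reference: the name wordWrapCost is hypothetical, the cost is the squared extra spaces with a free last line (the same convention as the program above), and only the minimum cost is returned rather than the line breaks.

```c
#include <limits.h>
#include <stdlib.h>

/* Minimum total cost (squared extra spaces, last line free) of wrapping
   words with lengths l[0..n-1] into lines of width M.
   Uses O(n) extra space. Returns -1 on allocation failure and LONG_MAX
   if some word is longer than M. */
long wordWrapCost(const int l[], int n, int M)
{
    long *c = malloc(((size_t)n + 1) * sizeof(long));
    if (c == NULL)
        return -1;
    c[0] = 0;
    for (int j = 1; j <= n; j++)
    {
        c[j] = LONG_MAX;
        int used = -1;   /* length of words i..j including separating spaces */
        /* try every possible first word i of the line that ends at word j */
        for (int i = j; i >= 1; i--)
        {
            used += l[i-1] + 1;
            if (used > M)
                break;   /* words i..j no longer fit on one line */
            long extra = M - used;
            long cost = (j == n) ? 0 : extra * extra;
            if (c[i-1] != LONG_MAX && c[i-1] + cost < c[j])
                c[j] = c[i-1] + cost;
        }
    }
    long result = c[n];
    free(c);
    return result;
}
```

For l[] = {3, 2, 2, 5} and M = 6 the function returns 10, the cost of the arrangement printed by the program above (9 + 1 + 0). The inner loop stops as soon as the words no longer fit, so no table indexed by both i and j is needed.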
References:
http://en.wikipedia.org/wiki/Word_wrap

Dynamic Programming | Set 20 (Maximum Length Chain of Pairs)


You are given n pairs of numbers. In every pair, the first number is always smaller than the second number. A pair (c, d) can follow another pair (a,
b) if b < c. Chain of pairs can be formed in this fashion. Find the longest chain which can be formed from a given set of pairs. Source: Amazon
Interview | Set 2
For example, if the given pairs are {{5, 24}, {39, 60}, {15, 28}, {27, 40}, {50, 90} }, then the longest chain that can be formed is of length 3,
and the chain is {{5, 24}, {27, 40}, {50, 90}}
This problem is a variation of the standard Longest Increasing Subsequence (LIS) problem. Following is a simple two-step process.
1) Sort the given pairs in increasing order of the first (or smaller) element.
2) Now run a modified LIS process in which we compare the second element of the already finalized LIS with the first element of the new LIS being
constructed.
The following code is a slight modification of method 2 of this post.
#include<stdio.h>
#include<stdlib.h>
// Structure for a pair
struct pair
{
int a;
int b;
};
// This function assumes that arr[] is sorted in increasing order
// according to the first (or smaller) values in pairs.
int maxChainLength( struct pair arr[], int n)
{
int i, j, max = 0;
int *mcl = (int*) malloc ( sizeof( int ) * n );
/* Initialize MCL (max chain length) values for all indexes */
for ( i = 0; i < n; i++ )
mcl[i] = 1;
/* Compute optimized chain length values in bottom up manner */
for ( i = 1; i < n; i++ )
for ( j = 0; j < i; j++ )
if ( arr[i].a > arr[j].b && mcl[i] < mcl[j] + 1)
mcl[i] = mcl[j] + 1;
// mcl[i] now stores the maximum chain length ending with pair i
/* Pick maximum of all MCL values */
for ( i = 0; i < n; i++ )
if ( max < mcl[i] )
max = mcl[i];
/* Free memory to avoid memory leak */
free( mcl );
return max;
}
/* Driver program to test above function */
int main()
{
struct pair arr[] = { {5, 24}, {15, 25},
{27, 40}, {50, 60} };
int n = sizeof(arr)/sizeof(arr[0]);
printf("Length of maximum size chain is %d\n",
maxChainLength( arr, n ));
return 0;
}

Output:
Length of maximum size chain is 3

Time Complexity: O(n^2) where n is the number of pairs.


The given problem is also a variation of the Activity Selection problem and can be solved in O(n Log n) time. To solve it as an activity selection problem,
consider the first element of a pair as the start time in the activity selection problem and the second element of the pair as the end time. Thanks to Palash for
suggesting this approach.
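The activity selection idea mentioned above can be sketched as follows. This is an illustrative version, assuming the same struct pair as in the program above. The names compareBySecond and maxChainLengthGreedy are hypothetical, and the function sorts the input array in place.

```c
#include <stdlib.h>

/* Structure for a pair */
struct pair
{
    int a;
    int b;
};

/* qsort comparator: order pairs by increasing second element */
static int compareBySecond(const void *x, const void *y)
{
    const struct pair *p = x;
    const struct pair *q = y;
    return (p->b > q->b) - (p->b < q->b);
}

/* Greedy (activity-selection style) chain length. Sorts arr[] in place. */
int maxChainLengthGreedy(struct pair arr[], int n)
{
    if (n <= 0)
        return 0;
    qsort(arr, (size_t)n, sizeof(arr[0]), compareBySecond);

    int count = 1;            /* the pair that ends first is always taken */
    int lastEnd = arr[0].b;   /* second element of the last chosen pair */
    for (int i = 1; i < n; i++)
    {
        /* a pair can follow the chain only if it starts after lastEnd */
        if (arr[i].a > lastEnd)
        {
            count++;
            lastEnd = arr[i].b;
        }
    }
    return count;
}
```

For the pairs {{5, 24}, {39, 60}, {15, 28}, {27, 40}, {50, 90}} the function returns 3. The sort dominates the running time, which gives the O(n Log n) bound.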

Dynamic Programming | Set 22 (Box Stacking Problem)


You are given a set of n types of rectangular 3-D boxes, where the i^th box has height h(i), width w(i) and depth d(i) (all real numbers). You want
to create a stack of boxes which is as tall as possible, but you can only stack a box on top of another box if the dimensions of the 2-D base of the
lower box are each strictly larger than those of the 2-D base of the higher box. Of course, you can rotate a box so that any side functions as its
base. It is also allowable to use multiple instances of the same type of box.
Source: http://people.csail.mit.edu/bdean/6.046/dp/. The link also has video for explanation of solution.

The Box Stacking problem is a variation of LIS problem. We need to build a maximum height stack.
Following are the key points to note in the problem statement:
1) A box can be placed on top of another box only if both width and depth of the upper placed box are smaller than width and depth of the lower
box respectively.
2) We can rotate boxes. For example, if there is a box with dimensions {1x2x3} where 1 is height and 2x3 is base, then there can be three possibilities,
{1x2x3}, {2x1x3} and {3x1x2}.
3) We can use multiple instances of boxes. What it means is, we can have two different rotations of a box as part of our maximum height stack.
Following is the solution based on DP solution of LIS problem.
1) Generate all 3 rotations of all boxes. The size of rotation array becomes 3 times the size of original array. For simplicity, we consider width as
always smaller than or equal to depth.
2) Sort the above generated 3n boxes in decreasing order of base area.
3) After sorting the boxes, the problem is same as LIS with following optimal substructure property.
MSH(i) = Maximum possible Stack Height with box i at top of stack
MSH(i) = { Max ( MSH(j) ) + height(i) } where j < i and width(j) > width(i) and depth(j) > depth(i).
If there is no such j then MSH(i) = height(i)
4) To get overall maximum height, we return max(MSH(i)) where 0 <= i < 3n.
Following is C++ implementation of the above solution.
/* Dynamic Programming implementation of Box Stacking problem */
#include<stdio.h>
#include<stdlib.h>
/* Representation of a box */
struct Box
{
// h --> height, w --> width, d --> depth
int h, w, d; // for simplicity of solution, always keep w <= d
};
// A utility function to get minimum of two integers
int min (int x, int y)
{ return (x < y)? x : y; }
// A utility function to get maximum of two integers
int max (int x, int y)
{ return (x > y)? x : y; }
/* Following function is needed for library function qsort(). We
use qsort() to sort boxes in decreasing order of base area.
Refer following link for help of qsort() and compare()
http://www.cplusplus.com/reference/clibrary/cstdlib/qsort/ */
int compare (const void *a, const void * b)
{
return ( (*(Box *)b).d * (*(Box *)b).w ) -
( (*(Box *)a).d * (*(Box *)a).w );
}
/* Returns the height of the tallest stack that can be formed with the given types of boxes */
int maxStackHeight( Box arr[], int n )
{
/* Create an array of all rotations of given boxes
For example, for a box {1, 2, 3}, we consider three
instances{{1, 2, 3}, {2, 1, 3}, {3, 1, 2}} */
Box rot[3*n];
int index = 0;
for (int i = 0; i < n; i++)
{
// Copy the original box
rot[index] = arr[i];
index++;
// First rotation of box
rot[index].h = arr[i].w;
rot[index].d = max(arr[i].h, arr[i].d);
rot[index].w = min(arr[i].h, arr[i].d);
index++;
// Second rotation of box
rot[index].h = arr[i].d;
rot[index].d = max(arr[i].h, arr[i].w);
rot[index].w = min(arr[i].h, arr[i].w);
index++;
}
// Now the number of boxes is 3n
n = 3*n;
/* Sort the array rot[] in decreasing order, using library
function for quick sort */
qsort (rot, n, sizeof(rot[0]), compare);
// Uncomment following two lines to print all rotations
// for (int i = 0; i < n; i++ )
//    printf("%d x %d x %d\n", rot[i].h, rot[i].w, rot[i].d);
/* Initialize msh values for all indexes
msh[i] --> Maximum possible Stack Height with box i on top */
int msh[n];
for (int i = 0; i < n; i++ )
msh[i] = rot[i].h;
/* Compute optimized msh values in bottom up manner */
for (int i = 1; i < n; i++ )
for (int j = 0; j < i; j++ )
if ( rot[i].w < rot[j].w &&
rot[i].d < rot[j].d &&
msh[i] < msh[j] + rot[i].h
)
{
msh[i] = msh[j] + rot[i].h;
}
/* Pick maximum of all msh values */
int max = -1;
for ( int i = 0; i < n; i++ )
if ( max < msh[i] )
max = msh[i];
return max;
}
/* Driver program to test above function */
int main()
{
Box arr[] = { {4, 6, 7}, {1, 2, 3}, {4, 5, 6}, {10, 12, 32} };
int n = sizeof(arr)/sizeof(arr[0]);
printf("The maximum possible height of stack is %d\n",
maxStackHeight (arr, n) );
return 0;
}

Output:
The maximum possible height of stack is 60

In the above program, given input boxes are {4, 6, 7}, {1, 2, 3}, {4, 5, 6}, {10, 12, 32}. Following are all rotations of the boxes in decreasing
order of base area.

10 x 12 x 32
12 x 10 x 32
32 x 10 x 12
4 x 6 x 7
4 x 5 x 6
6 x 4 x 7
5 x 4 x 6
7 x 4 x 6
6 x 4 x 5
1 x 2 x 3
2 x 1 x 3
3 x 1 x 2

The height 60 is obtained by boxes { {3, 1, 2}, {1, 2, 3}, {6, 4, 5}, {4, 5, 6}, {4, 6, 7}, {32, 10, 12}, {10, 12, 32}}
Time Complexity: O(n^2)
Auxiliary Space: O(n)

Program for Fibonacci numbers


The Fibonacci numbers are the numbers in the following integer sequence.
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ..
In mathematical terms, the sequence Fn of Fibonacci numbers is defined by the recurrence relation
Fn = Fn-1 + Fn-2

with seed values


F0 = 0 and F1 = 1.

Write a function int fib(int n) that returns Fn. For example, if n = 0, then fib() should return 0. If n = 1, then it should return 1. For n > 1, it should
return Fn-1 + Fn-2
Following are different methods to get the nth Fibonacci number.
Method 1 ( Use recursion )
A simple method that is a direct recursive implementation of the mathematical recurrence relation given above.
#include<stdio.h>
int fib(int n)
{
if (n <= 1)
return n;
return fib(n-1) + fib(n-2);
}
int main ()
{
int n = 9;
printf("%d", fib(n));
getchar();
return 0;
}

Time Complexity: T(n) = T(n-1) + T(n-2) which is exponential.


We can observe that this implementation does a lot of repeated work (see the following recursion tree). So this is a bad implementation for nth
Fibonacci number.
                          fib(5)
                     /                \
               fib(4)                  fib(3)
             /        \               /       \
         fib(3)      fib(2)       fib(2)     fib(1)
        /     \      /     \      /     \
   fib(2)  fib(1) fib(1) fib(0) fib(1) fib(0)
   /    \
fib(1) fib(0)

Extra Space: O(n) if we consider the function call stack size, otherwise O(1).
Method 2 ( Use Dynamic Programming )
We can avoid the repeated work done in method 1 by storing the Fibonacci numbers calculated so far.
#include<stdio.h>
int fib(int n)
{
/* Declare an array to store Fibonacci numbers.
One extra element keeps f[1] valid when n = 0. */
int f[n+2];
int i;
/* 0th and 1st number of the series are 0 and 1*/
f[0] = 0;
f[1] = 1;
for (i = 2; i <= n; i++)
{
/* Add the previous 2 numbers in the series
and store it */
f[i] = f[i-1] + f[i-2];
}

return f[n];
}
int main ()
{
int n = 9;
printf("%d", fib(n));
getchar();
return 0;
}

Time Complexity: O(n)


Extra Space: O(n)
Method 3 ( Space Optimized Method 2 )
We can optimize the space used in method 2 by storing only the previous two numbers, because that is all we need to get the next Fibonacci
number in the series.
#include<stdio.h>
int fib(int n)
{
int a = 0, b = 1, c, i;
if( n == 0)
return a;
for (i = 2; i <= n; i++)
{
c = a + b;
a = b;
b = c;
}
return b;
}
int main ()
{
int n = 9;
printf("%d", fib(n));
getchar();
return 0;
}

Time Complexity: O(n)


Extra Space: O(1)
Method 4 ( Using power of the matrix {{1,1},{1,0}} )
This is another O(n) method, which relies on the fact that if we multiply the matrix M = {{1,1},{1,0}} by itself n times (in other words, calculate power(M, n
)), then we get the (n+1)th Fibonacci number as the element at row and column (0, 0) in the resultant matrix.
The matrix representation gives the following closed expression for the Fibonacci numbers:
{{1,1},{1,0}}^n = {{F(n+1), F(n)}, {F(n), F(n-1)}}

#include <stdio.h>
/* Helper function that multiplies 2 matrices F and M of size 2*2, and
puts the multiplication result back to F[][] */
void multiply(int F[2][2], int M[2][2]);
/* Helper function that calculates F[][] raised to the power n and puts the
result in F[][]
Note that this function is designed only for fib() and won't work as a general
power function */
void power(int F[2][2], int n);
int fib(int n)
{
int F[2][2] = {{1,1},{1,0}};
if (n == 0)
return 0;
power(F, n-1);
return F[0][0];
}
void multiply(int F[2][2], int M[2][2])
{
int x = F[0][0]*M[0][0] + F[0][1]*M[1][0];
int y = F[0][0]*M[0][1] + F[0][1]*M[1][1];
int z = F[1][0]*M[0][0] + F[1][1]*M[1][0];
int w = F[1][0]*M[0][1] + F[1][1]*M[1][1];
F[0][0] = x;
F[0][1] = y;
F[1][0] = z;
F[1][1] = w;
}
void power(int F[2][2], int n)
{
int i;
int M[2][2] = {{1,1},{1,0}};
// n - 1 times multiply the matrix to {{1,0},{0,1}}
for (i = 2; i <= n; i++)
multiply(F, M);
}
/* Driver program to test above function */
int main()
{
int n = 9;
printf("%d", fib(n));
getchar();
return 0;
}

Time Complexity: O(n)


Extra Space: O(1)
Method 5 ( Optimized Method 4 )
Method 4 can be optimized to work in O(Logn) time complexity. We can do recursive multiplication to get power(M, n) in the previous
method (similar to the optimization done in this post).
#include <stdio.h>
void multiply(int F[2][2], int M[2][2]);
void power(int F[2][2], int n);
/* function that returns nth Fibonacci number */
int fib(int n)
{
int F[2][2] = {{1,1},{1,0}};
if (n == 0)
return 0;
power(F, n-1);
return F[0][0];
}
/* Optimized version of power() in method 4 */
void power(int F[2][2], int n)
{
if( n == 0 || n == 1)
return;
int M[2][2] = {{1,1},{1,0}};
power(F, n/2);
multiply(F, F);
if (n%2 != 0)
multiply(F, M);
}
void multiply(int F[2][2], int M[2][2])
{
int x = F[0][0]*M[0][0] + F[0][1]*M[1][0];
int y = F[0][0]*M[0][1] + F[0][1]*M[1][1];
int z = F[1][0]*M[0][0] + F[1][1]*M[1][0];
int w = F[1][0]*M[0][1] + F[1][1]*M[1][1];
F[0][0] = x;
F[0][1] = y;
F[1][0] = z;
F[1][1] = w;
}
/* Driver program to test above function */

int main()
{
int n = 9;
printf("%d", fib(n));
getchar();
return 0;
}

Time Complexity: O(Logn)


Extra Space: O(Logn) if we consider the function call stack size, otherwise O(1).
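Squaring the matrix {{1,1},{1,0}}^k gives two identities, F(2k) = F(k) * (2*F(k+1) - F(k)) and F(2k+1) = F(k)*F(k) + F(k+1)*F(k+1), which allow the same O(Logn) time without explicit matrix multiplication. This technique is usually called fast doubling; it is a different formulation from Method 5, shown here only as a sketch. The names fibPair and fibDoubling are hypothetical, and the result fits in 64 bits only for n up to about 90.

```c
/* Computes F(n) and F(n+1) together using the doubling identities. */
static void fibPair(unsigned n, unsigned long long *fn, unsigned long long *fn1)
{
    if (n == 0)
    {
        *fn = 0;
        *fn1 = 1;
        return;
    }
    unsigned long long a, b;                  /* a = F(k), b = F(k+1), k = n/2 */
    fibPair(n / 2, &a, &b);
    unsigned long long c = a * (2 * b - a);   /* F(2k)   */
    unsigned long long d = a * a + b * b;     /* F(2k+1) */
    if (n % 2 == 0)
    {
        *fn = c;
        *fn1 = d;
    }
    else
    {
        *fn = d;
        *fn1 = c + d;                         /* F(2k+2) = F(2k) + F(2k+1) */
    }
}

/* Returns the nth Fibonacci number */
unsigned long long fibDoubling(unsigned n)
{
    unsigned long long fn, fn1;
    fibPair(n, &fn, &fn1);
    return fn;
}
```

For n = 9, fibDoubling(9) returns 34, the same value as the programs above. The recursion depth is O(Logn), as in Method 5.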
References:
http://en.wikipedia.org/wiki/Fibonacci_number
http://www.ics.uci.edu/~eppstein/161/960109.html

Minimum number of jumps to reach end


Given an array of integers where each element represents the maximum number of steps that can be made forward from that element, write a function
to return the minimum number of jumps to reach the end of the array (starting from the first element). If an element is 0, then we cannot move through
that element.
Example:
Input: arr[] = {1, 3, 5, 8, 9, 2, 6, 7, 6, 8, 9}
Output: 3 (1-> 3 -> 8 ->9)

The first element is 1, so we can only go to 3. The second element is 3, so we can make at most 3 steps, e.g. to 5 or 8 or 9.
Method 1 (Naive Recursive Approach)
A naive approach is to start from the first element and recursively call for all the elements reachable from first element. The minimum number of
jumps to reach end from first can be calculated using minimum number of jumps needed to reach end from the elements reachable from first.
minJumps(start, end) = Min ( minJumps(k, end) ) for all k reachable from start
#include <stdio.h>
#include <limits.h>
// Returns minimum number of jumps to reach arr[h] from arr[l]
int minJumps(int arr[], int l, int h)
{
// Base case: when source and destination are same
if (h == l)
return 0;
// When nothing is reachable from the given source
if (arr[l] == 0)
return INT_MAX;
// Traverse through all the points reachable from arr[l]. Recursively
// get the minimum number of jumps needed to reach arr[h] from these
// reachable points.
int min = INT_MAX;
for (int i = l+1; i <= h && i <= l + arr[l]; i++)
{
int jumps = minJumps(arr, i, h);
if(jumps != INT_MAX && jumps + 1 < min)
min = jumps + 1;
}
return min;
}
// Driver program to test above function
int main()
{
int arr[] = {1, 3, 6, 3, 2, 3, 6, 8, 9, 5};
int n = sizeof(arr)/sizeof(arr[0]);
printf("Minimum number of jumps to reach end is %d ", minJumps(arr, 0, n-1));
return 0;
}

If we trace the execution of this method, we can see that there will be overlapping subproblems. For example, minJumps(3, 9) will be called two
times as arr[3] is reachable from arr[1] and arr[2]. So this problem has both properties (optimal substructure and overlapping subproblems) of
Dynamic Programming.
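One direct way to exploit the overlapping subproblems noted above is to cache the result of each recursive call. The following sketch adds memoization to the recursion of Method 1. The names minJumpsFrom, minJumpsMemo and memo are illustrative and do not appear in the original programs.

```c
#include <limits.h>
#include <stdlib.h>

/* memo[i] caches the minimum jumps from index i to index n-1;
   -1 means "not computed yet", INT_MAX means "end not reachable". */
static int minJumpsFrom(const int arr[], int i, int n, int memo[])
{
    if (i == n - 1)
        return 0;
    if (memo[i] != -1)
        return memo[i];

    int best = INT_MAX;
    /* try every index reachable from i in a single jump */
    for (int k = i + 1; k < n && k <= i + arr[i]; k++)
    {
        int jumps = minJumpsFrom(arr, k, n, memo);
        if (jumps != INT_MAX && jumps + 1 < best)
            best = jumps + 1;
    }
    memo[i] = best;
    return best;
}

/* Returns minimum number of jumps to reach arr[n-1] from arr[0],
   or INT_MAX if the end cannot be reached. */
int minJumpsMemo(const int arr[], int n)
{
    if (n <= 0)
        return INT_MAX;
    int *memo = malloc((size_t)n * sizeof(int));
    if (memo == NULL)
        return INT_MAX;   /* allocation failure: reported as unreachable in this sketch */
    for (int i = 0; i < n; i++)
        memo[i] = -1;
    int result = minJumpsFrom(arr, 0, n, memo);
    free(memo);
    return result;
}
```

Each index is solved only once, so the running time drops from exponential to O(n^2) in the worst case, at the cost of O(n) extra space for the cache.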

Method 2 (Dynamic Programming)


In this method, we build a jumps[] array from left to right such that jumps[i] indicates the minimum number of jumps needed to reach arr[i] from
arr[0]. Finally, we return jumps[n-1].
#include <stdio.h>
#include <limits.h>
int min(int x, int y) { return (x < y)? x: y; }
// Returns minimum number of jumps to reach arr[n-1] from arr[0]
int minJumps(int arr[], int n)
{
int i, j;
if (n == 0 || arr[0] == 0)
return INT_MAX;
// Allocate only after the early exit so that nothing is leaked there
int *jumps = new int[n]; // jumps[n-1] will hold the result
jumps[0] = 0;
// Find the minimum number of jumps to reach arr[i]
// from arr[0], and assign this value to jumps[i]
for (i = 1; i < n; i++)
{
jumps[i] = INT_MAX;
for (j = 0; j < i; j++)
{
if (i <= j + arr[j] && jumps[j] != INT_MAX)
{
jumps[i] = min(jumps[i], jumps[j] + 1);
break;
}
}
}
int result = jumps[n-1];
delete[] jumps; // release the table before returning
return result;
}
// Driver program to test above function
int main()
{
int arr[] = {1, 3, 6, 1, 0, 9};
int size = sizeof(arr)/sizeof(int);
printf("Minimum number of jumps to reach end is %d ", minJumps(arr,size));
return 0;
}

Output:
Minimum number of jumps to reach end is 3

Thanks to paras for suggesting this method.


Time Complexity: O(n^2)

Method 3 (Dynamic Programming)


In this method, we build the jumps[] array from right to left such that jumps[i] indicates the minimum number of jumps needed to reach arr[n-1] from
arr[i]. Finally, we return jumps[0].
int minJumps(int arr[], int n)
{
int *jumps = new int[n]; // jumps[0] will hold the result
int min;
// Minimum number of jumps needed to reach last element
// from last elements itself is always 0
jumps[n-1] = 0;
int i, j;
// Start from the second last element, move from right to left
// and construct the jumps[] array where jumps[i] represents
// minimum number of jumps needed to reach arr[n-1] from arr[i]
for (i = n-2; i >=0; i--)
{
// If arr[i] is 0 then arr[n-1] can't be reached from here
if (arr[i] == 0)
jumps[i] = INT_MAX;
// If we can directly reach the end point from here then
// jumps[i] is 1
else if (arr[i] >= n - i - 1)
jumps[i] = 1;
// Otherwise, to find out the minimum number of jumps needed
// to reach arr[n-1], check all the points reachable from here
// and jumps[] value for those points
else
{
min = INT_MAX; // initialize min value
// following loop checks with all reachable points and
// takes the minimum
for (j = i+1; j < n && j <= arr[i] + i; j++)
{
if (min > jumps[j])
min = jumps[j];
}
// Handle overflow
if (min != INT_MAX)
jumps[i] = min + 1;
else
jumps[i] = min; // or INT_MAX
}
}
int result = jumps[0];
delete[] jumps; // release the table before returning
return result;
}

Time Complexity: O(n^2) in worst case.


Thanks to Ashish for suggesting this solution.

Maximum size square sub-matrix with all 1s


Given a binary matrix, find out the maximum size square sub-matrix with all 1s.
For example, consider the below binary matrix.
0 1 1 0 1
1 1 0 1 0
0 1 1 1 0
1 1 1 1 0
1 1 1 1 1
0 0 0 0 0

The maximum square sub-matrix with all set bits is


1 1 1
1 1 1
1 1 1

Algorithm:
Let the given binary matrix be M[R][C]. The idea of the algorithm is to construct an auxiliary matrix S[][] in which each entry S[i][j] represents
the size of the largest square sub-matrix with all 1s that has M[i][j] as its rightmost and bottommost entry.
1) Construct an auxiliary matrix S[R][C] for the given M[R][C].
a) Copy first row and first column as they are from M[][] to S[][]
b) For other entries, use following expressions to construct S[][]
If M[i][j] is 1 then
S[i][j] = min(S[i][j-1], S[i-1][j], S[i-1][j-1]) + 1
Else /*If M[i][j] is 0*/
S[i][j] = 0
2) Find the maximum entry in S[R][C]
3) Using the value and coordinates of maximum entry in S[][], print
sub-matrix of M[][]

For the given M[R][C] in above example, constructed S[R][C] would be:
0 1 1 0 1
1 1 0 1 0
0 1 1 1 0
1 1 2 2 0
1 2 2 3 1
0 0 0 0 0

The value of maximum entry in above matrix is 3 and coordinates of the entry are (4, 3). Using the maximum value and its coordinates, we can find
out the required sub-matrix.
#include<stdio.h>
#define bool int
#define R 6
#define C 5
/* Function to get minimum of three values (defined below) */
int min(int a, int b, int c);
void printMaxSubSquare(bool M[R][C])
{
int i,j;
int S[R][C];
int max_of_s, max_i, max_j;
/* Set first column of S[][]*/
for(i = 0; i < R; i++)
S[i][0] = M[i][0];
/* Set first row of S[][]*/
for(j = 0; j < C; j++)
S[0][j] = M[0][j];
/* Construct other entries of S[][]*/
for(i = 1; i < R; i++)
{
for(j = 1; j < C; j++)
{
if(M[i][j] == 1)
S[i][j] = min(S[i][j-1], S[i-1][j], S[i-1][j-1]) + 1;
else
S[i][j] = 0;
}
}

/* Find the maximum entry, and indexes of maximum entry
in S[][] */
max_of_s = S[0][0]; max_i = 0; max_j = 0;
for(i = 0; i < R; i++)
{
for(j = 0; j < C; j++)
{
if(max_of_s < S[i][j])
{
max_of_s = S[i][j];
max_i = i;
max_j = j;
}
}
}
printf("\n Maximum size sub-matrix is: \n");
for(i = max_i; i > max_i - max_of_s; i--)
{
for(j = max_j; j > max_j - max_of_s; j--)
{
printf("%d ", M[i][j]);
}
printf("\n");
}
}
/* UTILITY FUNCTIONS */
/* Function to get minimum of three values */
int min(int a, int b, int c)
{
int m = a;
if (m > b)
m = b;
if (m > c)
m = c;
return m;
}
/* Driver function to test above functions */
int main()
{
bool M[R][C] = {{0, 1, 1, 0, 1},
{1, 1, 0, 1, 0},
{0, 1, 1, 1, 0},
{1, 1, 1, 1, 0},
{1, 1, 1, 1, 1},
{0, 0, 0, 0, 0}};
printMaxSubSquare(M);
getchar();
}

Time Complexity: O(m*n) where m is number of rows and n is number of columns in the given matrix.
Auxiliary Space: O(m*n) where m is number of rows and n is number of columns in the given matrix.
Algorithmic Paradigm: Dynamic Programming

Ugly Numbers
Ugly numbers are numbers whose only prime factors are 2, 3 or 5. The sequence
1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, ...
shows the first 11 ugly numbers. By convention, 1 is included.
Write a program to find and print the 150th ugly number.
METHOD 1 (Simple)
Thanks to Nedylko Draganov for suggesting this solution.
Algorithm:
Loop over all positive integers until the ugly number count reaches n; if an integer is ugly, then increment the ugly number count.
To check if a number is ugly, divide the number by the greatest divisible powers of 2, 3 and 5. If the number becomes 1, then it is an ugly number,
otherwise not.
For example, let us see how to check whether 300 is ugly or not. The greatest divisible power of 2 is 4; after dividing 300 by 4 we get 75. The greatest
divisible power of 3 is 3; after dividing 75 by 3 we get 25. The greatest divisible power of 5 is 25; after dividing 25 by 25 we get 1. Since we finally get 1,
300 is an ugly number.
Implementation:
# include<stdio.h>
# include<stdlib.h>
/*This function divides a by greatest divisible
power of b*/
int maxDivide(int a, int b)
{
while (a%b == 0)
a = a/b;
return a;
}
/* Function to check if a number is ugly or not */
int isUgly(int no)
{
no = maxDivide(no, 2);
no = maxDivide(no, 3);
no = maxDivide(no, 5);
return (no == 1)? 1 : 0;
}
/* Function to get the nth ugly number*/
int getNthUglyNo(int n)
{
int i = 1;
int count = 1; /* ugly number count */
/*Check for all integers until ugly count
becomes n*/
while (n > count)
{
i++;
if (isUgly(i))
count++;
}
return i;
}
/* Driver program to test above functions */
int main()
{
unsigned no = getNthUglyNo(150);
printf("150th ugly no. is %u ", no);
getchar();
return 0;
}

This method is not time efficient as it checks all integers until the ugly number count becomes n, but the space complexity of this method is O(1).
METHOD 2 (Use Dynamic Programming)
Here is a time efficient solution with O(n) extra space. The ugly-number sequence is 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, ...
Because every number can only be divided by 2, 3, 5, one way to look at the sequence is to split the sequence into three groups as below:
(1) 1*2, 2*2, 3*2, 4*2, 5*2, ...
(2) 1*3, 2*3, 3*3, 4*3, 5*3, ...
(3) 1*5, 2*5, 3*5, 4*5, 5*5, ...
We can find that every subsequence is the ugly-sequence itself (1, 2, 3, 4, 5, ...) multiplied by 2, 3 or 5. Then we use a merge method similar to merge sort
to get every ugly number from the three subsequences. At every step we choose the smallest one and move one step ahead in the subsequence it came from.
Algorithm:
1 Declare an array for ugly numbers: ugly[150]
2 Initialize first ugly no: ugly[0] = 1
3 Initialize three array index variables i2, i3, i5 to point to
1st element of the ugly array:
i2 = i3 = i5 = 0;
4 Initialize 3 choices for the next ugly no:
next_multiple_of_2 = ugly[i2]*2;
next_multiple_of_3 = ugly[i3]*3;
next_multiple_of_5 = ugly[i5]*5;
5 Now go in a loop to fill all ugly numbers till 150:
For (i = 1; i < 150; i++ )
{
/* These small steps are not optimized for good
readability. Will optimize them in C program */
next_ugly_no = Min(next_multiple_of_2,
next_multiple_of_3,
next_multiple_of_5);
/* Store the new number first, because the index updates
below may read ugly[i] */
ugly[i] = next_ugly_no;
if (next_ugly_no == next_multiple_of_2)
{
i2 = i2 + 1;
next_multiple_of_2 = ugly[i2]*2;
}
if (next_ugly_no == next_multiple_of_3)
{
i3 = i3 + 1;
next_multiple_of_3 = ugly[i3]*3;
}
if (next_ugly_no == next_multiple_of_5)
{
i5 = i5 + 1;
next_multiple_of_5 = ugly[i5]*5;
}
}/* end of for loop */
6 return next_ugly_no

Example:
Let us see how it works
initialize
ugly[] = | 1 |
i2 = i3 = i5 = 0;
First iteration
ugly[1] = Min(ugly[i2]*2, ugly[i3]*3, ugly[i5]*5)
= Min(2, 3, 5)
= 2
ugly[] = | 1 | 2 |
i2 = 1, i3 = i5 = 0 (i2 got incremented )
Second iteration
ugly[2] = Min(ugly[i2]*2, ugly[i3]*3, ugly[i5]*5)
= Min(4, 3, 5)
= 3
ugly[] = | 1 | 2 | 3 |
i2 = 1, i3 = 1, i5 = 0 (i3 got incremented )
Third iteration
ugly[3] = Min(ugly[i2]*2, ugly[i3]*3, ugly[i5]*5)
= Min(4, 6, 5)
= 4
ugly[] = | 1 | 2 | 3 | 4 |
i2 = 2, i3 = 1, i5 = 0 (i2 got incremented )
Fourth iteration
ugly[4] = Min(ugly[i2]*2, ugly[i3]*3, ugly[i5]*5)
= Min(6, 6, 5)
= 5
ugly[] = | 1 | 2 | 3 | 4 | 5 |

i2 = 2, i3 = 1, i5 = 1 (i5 got incremented )
Fifth iteration
ugly[5] = Min(ugly[i2]*2, ugly[i3]*3, ugly[i5]*5)
= Min(6, 6, 10)
= 6
ugly[] = | 1 | 2 | 3 | 4 | 5 | 6 |
i2 = 3, i3 = 2, i5 = 1 (i2 and i3 got incremented )
Will continue same way till i < 150

Program:
# include<stdio.h>
# include<stdlib.h>
# define bool int
/* Function to find minimum of 3 numbers */
unsigned min(unsigned , unsigned , unsigned );
/* Function to get the nth ugly number*/
unsigned getNthUglyNo(unsigned n)
{
unsigned *ugly =
(unsigned *)(malloc (sizeof(unsigned)*n));
unsigned i2 = 0, i3 = 0, i5 = 0;
unsigned i;
unsigned next_multiple_of_2 = 2;
unsigned next_multiple_of_3 = 3;
unsigned next_multiple_of_5 = 5;
unsigned next_ugly_no = 1;
*(ugly+0) = 1;
for(i=1; i<n; i++)
{
next_ugly_no = min(next_multiple_of_2,
next_multiple_of_3,
next_multiple_of_5);
*(ugly+i) = next_ugly_no;
if(next_ugly_no == next_multiple_of_2)
{
i2 = i2+1;
next_multiple_of_2 = *(ugly+i2)*2;
}
if(next_ugly_no == next_multiple_of_3)
{
i3 = i3+1;
next_multiple_of_3 = *(ugly+i3)*3;
}
if(next_ugly_no == next_multiple_of_5)
{
i5 = i5+1;
next_multiple_of_5 = *(ugly+i5)*5;
}
} /*End of for loop (i=1; i<n; i++) */
free(ugly); /* release the table before returning */
return next_ugly_no;
}
/* Function to find minimum of 3 numbers */
unsigned min(unsigned a, unsigned b, unsigned c)
{
if(a <= b)
{
if(a <= c)
return a;
else
return c;
}
if(b <= c)
return b;
else
return c;
}
/* Driver program to test above functions */
int main()
{
unsigned no = getNthUglyNo(150);
printf("%dth ugly no. is %u ", 150, no);
getchar();
return 0;
}

Algorithmic Paradigm: Dynamic Programming


Time Complexity: O(n)
Storage Complexity: O(n)

Largest Sum Contiguous Subarray


Write an efficient C program to find the sum of contiguous subarray within a one-dimensional array of numbers which has the largest sum.
Kadane's Algorithm:
Initialize:
max_so_far = 0
max_ending_here = 0
Loop for each element of the array
(a) max_ending_here = max_ending_here + a[i]
(b) if(max_ending_here < 0)
max_ending_here = 0
(c) if(max_so_far < max_ending_here)
max_so_far = max_ending_here
return max_so_far

Explanation:
The simple idea of Kadane's algorithm is to look for all positive contiguous segments of the array (max_ending_here is used for this) and keep
track of the maximum-sum contiguous segment among all positive segments (max_so_far is used for this). Each time we get a positive sum, we compare it
with max_so_far and update max_so_far if the sum is greater.
Let's take the example:
{-2, -3, 4, -1, -2, 1, 5, -3}
max_so_far = max_ending_here = 0
for i=0, a[0] = -2
max_ending_here = max_ending_here + (-2)
Set max_ending_here = 0 because max_ending_here < 0
for i=1, a[1] = -3
max_ending_here = max_ending_here + (-3)
Set max_ending_here = 0 because max_ending_here < 0
for i=2, a[2] = 4
max_ending_here = max_ending_here + (4)
max_ending_here = 4
max_so_far is updated to 4 because max_ending_here greater
than max_so_far which was 0 till now
for i=3, a[3] = -1
max_ending_here = max_ending_here + (-1)
max_ending_here = 3
for i=4, a[4] = -2
max_ending_here = max_ending_here + (-2)
max_ending_here = 1
for i=5, a[5] = 1
max_ending_here = max_ending_here + (1)
max_ending_here = 2
for i=6, a[6] = 5
max_ending_here = max_ending_here + (5)
max_ending_here = 7
max_so_far is updated to 7 because max_ending_here is
greater than max_so_far
for i=7, a[7] = -3
max_ending_here = max_ending_here + (-3)
max_ending_here = 4

Program:

C++
// C++ program to print largest contiguous array sum
#include<iostream>
using namespace std;
int maxSubArraySum(int a[], int size)
{
int max_so_far = 0, max_ending_here = 0;
for (int i = 0; i < size; i++)
{

max_ending_here = max_ending_here + a[i];


if (max_ending_here < 0)
max_ending_here = 0;
if (max_so_far < max_ending_here)
max_so_far = max_ending_here;
}
return max_so_far;
}
/*Driver program to test maxSubArraySum*/
int main()
{
int a[] = {-2, -3, 4, -1, -2, 1, 5, -3};
int n = sizeof(a)/sizeof(a[0]);
int max_sum = maxSubArraySum(a, n);
cout << "Maximum contiguous sum is " << max_sum;
return 0;
}

Python
# Python program to find maximum contiguous subarray
# Function to find the maximum contiguous subarray
def maxSubArraySum(a,size):
    max_so_far = 0
    max_ending_here = 0
    for i in range(0, size):
        max_ending_here = max_ending_here + a[i]
        if max_ending_here < 0:
            max_ending_here = 0
        if (max_so_far < max_ending_here):
            max_so_far = max_ending_here
    return max_so_far
# Driver function to check the above function
a = [-2, -3, 4, -1, -2, 1, 5, -3]
print("Maximum contiguous sum is", maxSubArraySum(a,len(a)))
#This code is contributed by _Devesh Agrawal_

Maximum contiguous sum is 7

Notes:
The algorithm doesn't work when all numbers are negative: it simply returns 0 in that case. To handle this, we can add an extra phase before the
actual implementation. The phase checks whether all numbers are negative and, if they are, returns the maximum of them (the smallest in terms of
absolute value). There may be other ways to handle it, though.
The above program can be optimized further if we compare max_so_far with max_ending_here only when max_ending_here is greater than 0.

C++
int maxSubArraySum(int a[], int size)
{
int max_so_far = 0, max_ending_here = 0;
for (int i = 0; i < size; i++)
{
max_ending_here = max_ending_here + a[i];
if (max_ending_here < 0)
max_ending_here = 0;
/* Do not compare for all elements. Compare only
when max_ending_here > 0 */
else if (max_so_far < max_ending_here)
max_so_far = max_ending_here;
}
return max_so_far;
}

Python

def maxSubArraySum(a,size):
    max_so_far = 0
    max_ending_here = 0
    for i in range(0, size):
        max_ending_here = max_ending_here + a[i]
        if max_ending_here < 0:
            max_ending_here = 0
        # Do not compare for all elements. Compare only
        # when max_ending_here > 0
        elif (max_so_far < max_ending_here):
            max_so_far = max_ending_here
    return max_so_far

Time Complexity: O(n)


Algorithmic Paradigm: Dynamic Programming
Following is another simple implementation suggested by Mohit Kumar. The implementation handles the case when all numbers in array are
negative.
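
The referenced listing is not reproduced in this copy. As a stand-in, here is a minimal sketch of one common way to handle all-negative input; it is not necessarily the implementation the text refers to. The name max_subarray_sum is illustrative, and the sketch assumes a non-empty list. It seeds both running values with the first element instead of 0, so the result is never forced up to 0.

```python
def max_subarray_sum(a):
    """Largest sum of a non-empty contiguous subarray of a (a must be non-empty)."""
    if not a:
        raise ValueError("array must contain at least one element")
    best = current = a[0]
    for x in a[1:]:
        # Either extend the segment ending at the previous element,
        # or start a new segment at x, whichever is larger.
        current = max(x, current + x)
        best = max(best, current)
    return best


if __name__ == "__main__":
    print(max_subarray_sum([-2, -3, 4, -1, -2, 1, 5, -3]))
    print(max_subarray_sum([-5, -2, -8]))
```

For arrays that contain at least one positive number, this returns the same value as the versions above.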

Longest Palindromic Substring | Set 1


Given a string, find the longest substring which is a palindrome. For example, if the given string is "forgeeksskeegfor", the output should be
"geeksskeeg".
Method 1 ( Brute Force )
The simple approach is to check whether each substring is a palindrome. We can run three loops: the outer two loops pick all
substrings one by one by fixing the corner characters, and the inner loop checks whether the picked substring is a palindrome.
Time complexity: O ( n^3 )
Auxiliary complexity: O ( 1 )
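
The three-loop idea of Method 1 can be sketched as follows. This is a minimal illustration under assumptions: the function name is hypothetical, and returning the substring as a slice costs extra space beyond the O(1) bookkeeping of start and max_len.

```python
def longest_palindrome_brute_force(s):
    """Return the longest palindromic substring of s by checking every substring."""
    n = len(s)
    start, max_len = 0, min(n, 1)   # a single character is already a palindrome
    for i in range(n):              # left corner
        for j in range(i, n):       # right corner
            lo, hi = i, j
            # Inner loop: walk inwards while the two ends match.
            while lo < hi and s[lo] == s[hi]:
                lo += 1
                hi -= 1
            if lo >= hi and j - i + 1 > max_len:
                start, max_len = i, j - i + 1
    return s[start:start + max_len]


if __name__ == "__main__":
    print(longest_palindrome_brute_force("forgeeksskeegfor"))
```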
Method 2 ( Dynamic Programming )
The time complexity can be reduced by storing results of subproblems. The idea is similar to this post. We maintain a boolean table[n][n] that is
filled in a bottom-up manner. The value of table[i][j] is true if the substring str[i..j] is a palindrome, otherwise false. To calculate table[i][j], we first check the
value of table[i+1][j-1]; if that value is true and str[i] is the same as str[j], then we make table[i][j] true. Otherwise, table[i][j] is made false.
// A dynamic programming solution for longest palindr.
// This code is adopted from following link
// http://www.leetcode.com/2011/11/longest-palindromic-substring-part-i.html
#include <stdio.h>
#include <string.h>
// A utility function to print a substring str[low..high]
void printSubStr( char* str, int low, int high )
{
for( int i = low; i <= high; ++i )
printf("%c", str[i]);
}
// This function prints the longest palindrome substring
// of str[].
// It also returns the length of the longest palindrome
int longestPalSubstr( char *str )
{
int n = strlen( str ); // get length of input string
// table[i][j] will be false if substring str[i..j]
// is not palindrome.
// Else table[i][j] will be true
bool table[n][n];
memset(table, 0, sizeof(table));
// All substrings of length 1 are palindromes
int maxLength = 1;
for (int i = 0; i < n; ++i)
table[i][i] = true;
// check for sub-string of length 2.
int start = 0;
for (int i = 0; i < n-1; ++i)
{
if (str[i] == str[i+1])
{
table[i][i+1] = true;
start = i;
maxLength = 2;
}
}
// Check for lengths greater than 2. k is length
// of substring
for (int k = 3; k <= n; ++k)
{
// Fix the starting index
for (int i = 0; i < n-k+1 ; ++i)
{
// Get the ending index of substring from
// starting index i and length k
int j = i + k - 1;
// checking for sub-string from ith index to
// jth index iff str[i+1] to str[j-1] is a
// palindrome
if (table[i+1][j-1] && str[i] == str[j])
{
table[i][j] = true;
if (k > maxLength)
{
start = i;
maxLength = k;
}
}
}
}
printf("Longest palindrome substring is: ");
printSubStr( str, start, start + maxLength - 1 );
return maxLength; // return length of LPS
}
// Driver program to test above functions
int main()
{
char str[] = "forgeeksskeegfor";
printf("\nLength is: %d\n", longestPalSubstr( str ) );
return 0;
}

Output:
Longest palindrome substring is: geeksskeeg
Length is: 10

Time complexity: O ( n^2 )


Auxiliary Space: O ( n^2 )
We will soon be adding more optimized methods as separate posts.

Dynamic Programming | Set 23 (BellmanFord Algorithm)


Given a graph and a source vertex src in the graph, find shortest paths from src to all vertices in the given graph. The graph may contain negative
weight edges.
We have discussed Dijkstra's algorithm for this problem. Dijkstra's algorithm is a Greedy algorithm and its time complexity is O(E + VLogV) (with the use
of a Fibonacci heap). Dijkstra doesn't work for graphs with negative weight edges; Bellman-Ford works for such graphs. Bellman-Ford is
also simpler than Dijkstra and suits distributed systems well. But the time complexity of Bellman-Ford is O(VE), which is more than
Dijkstra.
Algorithm
Following are the detailed steps.
Input: Graph and a source vertex src
Output: Shortest distance to all vertices from src. If there is a negative weight cycle, then shortest distances are not calculated, negative weight
cycle is reported.
1) This step initializes distances from source to all vertices as infinite and distance to source itself as 0. Create an array dist[] of size |V| with all
values as infinite except dist[src] where src is source vertex.
2) This step calculates shortest distances. Do following |V|-1 times where |V| is the number of vertices in given graph.
a) Do following for each edge u-v:
If dist[v] > dist[u] + weight of edge uv, then update dist[v]:
dist[v] = dist[u] + weight of edge uv
3) This step reports if there is a negative weight cycle in the graph. Do following for each edge u-v:
If dist[v] > dist[u] + weight of edge uv, then the graph contains a negative weight cycle.
The idea of step 3 is that step 2 guarantees shortest distances if the graph doesn't contain a negative weight cycle. If we iterate through all edges one more
time and get a shorter path for any vertex, then there is a negative weight cycle.
How does this work? Like other Dynamic Programming problems, the algorithm calculates shortest paths in a bottom-up manner. It first calculates
the shortest distances for the shortest paths which have at most one edge in the path. Then, it calculates shortest paths with at most 2 edges, and so
on. After the ith iteration of the outer loop, the shortest paths with at most i edges are calculated. There can be at most |V| - 1 edges in any simple
path, which is why the outer loop runs |V| - 1 times. The idea is, assuming that there is no negative weight cycle, if we have calculated shortest paths
with at most i edges, then an iteration over all edges guarantees to give the shortest paths with at most (i+1) edges (Proof is simple, you can refer this or
MIT Video Lecture)
Example
Let us understand the algorithm with the following example graph. The images are taken from this source.
Let the given source vertex be 0. Initialize all distances as infinite, except the distance to the source itself. The total number of vertices in the graph is 5, so
all edges must be processed 4 times.

Let all edges be processed in the following order: (B,E), (D,B), (B,D), (A,B), (A,C), (D,C), (B,C), (E,D). We get the following distances when all edges
are processed the first time. The first row shows initial distances. The second row shows distances when edges (B,E), (D,B), (B,D) and (A,B) are
processed. The third row shows distances when (A,C) is processed. The fourth row shows distances when (D,C), (B,C) and (E,D) are processed.

The first iteration guarantees to give all shortest paths which are at most 1 edge long. We get the following distances when all edges are processed
a second time (the last row shows final values).

The second iteration guarantees to give all shortest paths which are at most 2 edges long. The algorithm processes all edges 2 more times. The
distances are minimized after the second iteration, so the third and fourth iterations don't update the distances.
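
The pass-by-pass behaviour described in this example can be reproduced with a short Python sketch. It is an illustration under assumptions: the function name, the snapshot list and the vertex numbering A..E = 0..4 are choices made here, and the edge weights are the ones used by the driver code in the implementations.

```python
INF = float("inf")


def bellman_ford(num_vertices, edges, src):
    """Return (dist, snapshots, has_negative_cycle).

    edges is a list of (u, v, weight) triples, relaxed in the given order;
    snapshots[k] is a copy of dist after pass k + 1 over all edges.
    """
    dist = [INF] * num_vertices
    dist[src] = 0
    snapshots = []
    for _ in range(num_vertices - 1):          # step 2: |V| - 1 passes
        for u, v, w in edges:
            if dist[u] != INF and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
        snapshots.append(list(dist))
    # Step 3: any edge that can still be relaxed reveals a negative cycle.
    has_negative_cycle = any(
        dist[u] != INF and dist[u] + w < dist[v] for u, v, w in edges
    )
    return dist, snapshots, has_negative_cycle


A, B, C, D, E = range(5)
# Same processing order as in the example text.
EXAMPLE_EDGES = [(B, E, 2), (D, B, 1), (B, D, 2), (A, B, -1),
                 (A, C, 4), (D, C, 5), (B, C, 3), (E, D, -3)]

if __name__ == "__main__":
    dist, snapshots, cycle = bellman_ford(5, EXAMPLE_EDGES, A)
    for k, row in enumerate(snapshots, start=1):
        print("after pass", k, row)
    print("negative cycle:", cycle)
```

With this edge order, the snapshots stop changing after the second pass, which matches the observation that the third and fourth iterations leave the distances untouched.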
Implementation:

C++
// A C / C++ program for Bellman-Ford's single source
// shortest path algorithm.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>


// a structure to represent a weighted edge in graph


struct Edge
{
int src, dest, weight;
};
// a structure to represent a connected, directed and
// weighted graph
struct Graph
{
// V-> Number of vertices, E-> Number of edges
int V, E;
// graph is represented as an array of edges.
struct Edge* edge;
};
// Creates a graph with V vertices and E edges
struct Graph* createGraph(int V, int E)
{
struct Graph* graph =
(struct Graph*) malloc( sizeof(struct Graph) );
graph->V = V;
graph->E = E;
graph->edge =
(struct Edge*) malloc( graph->E * sizeof( struct Edge ) );
return graph;
}
// A utility function used to print the solution
void printArr(int dist[], int n)
{
printf("Vertex Distance from Source\n");
for (int i = 0; i < n; ++i)
printf("%d \t\t %d\n", i, dist[i]);
}
// The main function that finds shortest distances from src to
// all other vertices using Bellman-Ford algorithm. The function
// also detects negative weight cycle
void BellmanFord(struct Graph* graph, int src)
{
int V = graph->V;
int E = graph->E;
int dist[V];
// Step 1: Initialize distances from src to all other vertices
// as INFINITE
for (int i = 0; i < V; i++)
dist[i] = INT_MAX;
dist[src] = 0;

// Step 2: Relax all edges |V| - 1 times. A simple shortest


// path from src to any other vertex can have at-most |V| - 1
// edges
for (int i = 1; i <= V-1; i++)
{
for (int j = 0; j < E; j++)
{
int u = graph->edge[j].src;
int v = graph->edge[j].dest;
int weight = graph->edge[j].weight;
if (dist[u] != INT_MAX && dist[u] + weight < dist[v])
dist[v] = dist[u] + weight;
}
}
// Step 3: check for negative-weight cycles. The above step
// guarantees shortest distances if graph doesn't contain
// negative weight cycle. If we get a shorter path, then there
// is a cycle.
for (int i = 0; i < E; i++)
{
int u = graph->edge[i].src;
int v = graph->edge[i].dest;
int weight = graph->edge[i].weight;
if (dist[u] != INT_MAX && dist[u] + weight < dist[v])
printf("Graph contains negative weight cycle");
}
printArr(dist, V);
return;
}
// Driver program to test above functions
int main()
{
/* Let us create the graph given in above example */
int V = 5; // Number of vertices in graph
int E = 8; // Number of edges in graph
struct Graph* graph = createGraph(V, E);
// add edge 0-1 (or A-B in above figure)
graph->edge[0].src = 0;
graph->edge[0].dest = 1;
graph->edge[0].weight = -1;
// add edge 0-2 (or A-C in above figure)
graph->edge[1].src = 0;
graph->edge[1].dest = 2;
graph->edge[1].weight = 4;
// add edge 1-2 (or B-C in above figure)
graph->edge[2].src = 1;
graph->edge[2].dest = 2;
graph->edge[2].weight = 3;
// add edge 1-3 (or B-D in above figure)
graph->edge[3].src = 1;
graph->edge[3].dest = 3;
graph->edge[3].weight = 2;
// add edge 1-4 (or B-E in above figure)
graph->edge[4].src = 1;
graph->edge[4].dest = 4;
graph->edge[4].weight = 2;
// add edge 3-2 (or D-C in above figure)
graph->edge[5].src = 3;
graph->edge[5].dest = 2;
graph->edge[5].weight = 5;
// add edge 3-1 (or D-B in above figure)
graph->edge[6].src = 3;
graph->edge[6].dest = 1;
graph->edge[6].weight = 1;
// add edge 4-3 (or E-D in above figure)
graph->edge[7].src = 4;
graph->edge[7].dest = 3;
graph->edge[7].weight = -3;

BellmanFord(graph, 0);
return 0;
}

Java
// A Java program for Bellman-Ford's single source shortest path
// algorithm.
import java.util.*;
import java.lang.*;
import java.io.*;
// A class to represent a connected, directed and weighted graph
class Graph
{
// A class to represent a weighted edge in graph
class Edge {
int src, dest, weight;
Edge() {
src = dest = weight = 0;
}
};
int V, E;
Edge edge[];
// Creates a graph with V vertices and E edges
Graph(int v, int e)
{
V = v;
E = e;
edge = new Edge[e];
for (int i=0; i<e; ++i)
edge[i] = new Edge();
}
// The main function that finds shortest distances from src
// to all other vertices using Bellman-Ford algorithm. The
// function also detects negative weight cycle
void BellmanFord(Graph graph,int src)
{
int V = graph.V, E = graph.E;
int dist[] = new int[V];
// Step 1: Initialize distances from src to all other
// vertices as INFINITE
for (int i=0; i<V; ++i)
dist[i] = Integer.MAX_VALUE;
dist[src] = 0;
// Step 2: Relax all edges |V| - 1 times. A simple
// shortest path from src to any other vertex can
// have at-most |V| - 1 edges
for (int i=1; i<V; ++i)
{
for (int j=0; j<E; ++j)
{
int u = graph.edge[j].src;
int v = graph.edge[j].dest;
int weight = graph.edge[j].weight;
if (dist[u]!=Integer.MAX_VALUE &&
dist[u]+weight<dist[v])
dist[v]=dist[u]+weight;
}
}
// Step 3: check for negative-weight cycles. The above
// step guarantees shortest distances if graph doesn't
// contain negative weight cycle. If we get a shorter
// path, then there is a cycle.
for (int j=0; j<E; ++j)
{
int u = graph.edge[j].src;
int v = graph.edge[j].dest;
int weight = graph.edge[j].weight;
if (dist[u]!=Integer.MAX_VALUE &&
dist[u]+weight<dist[v])
System.out.println("Graph contains negative weight cycle");

}
printArr(dist, V);
}
// A utility function used to print the solution
void printArr(int dist[], int V)
{
System.out.println("Vertex Distance from Source");
for (int i=0; i<V; ++i)
System.out.println(i+"\t\t"+dist[i]);
}
// Driver method to test above function
public static void main(String[] args)
{
int V = 5; // Number of vertices in graph
int E = 8; // Number of edges in graph
Graph graph = new Graph(V, E);
// add edge 0-1 (or A-B in above figure)
graph.edge[0].src = 0;
graph.edge[0].dest = 1;
graph.edge[0].weight = -1;
// add edge 0-2 (or A-C in above figure)
graph.edge[1].src = 0;
graph.edge[1].dest = 2;
graph.edge[1].weight = 4;
// add edge 1-2 (or B-C in above figure)
graph.edge[2].src = 1;
graph.edge[2].dest = 2;
graph.edge[2].weight = 3;
// add edge 1-3 (or B-D in above figure)
graph.edge[3].src = 1;
graph.edge[3].dest = 3;
graph.edge[3].weight = 2;
// add edge 1-4 (or B-E in above figure)
graph.edge[4].src = 1;
graph.edge[4].dest = 4;
graph.edge[4].weight = 2;
// add edge 3-2 (or D-C in above figure)
graph.edge[5].src = 3;
graph.edge[5].dest = 2;
graph.edge[5].weight = 5;
// add edge 3-1 (or D-B in above figure)
graph.edge[6].src = 3;
graph.edge[6].dest = 1;
graph.edge[6].weight = 1;
// add edge 4-3 (or E-D in above figure)
graph.edge[7].src = 4;
graph.edge[7].dest = 3;
graph.edge[7].weight = -3;
graph.BellmanFord(graph, 0);
}
}
// Contributed by Aakash Hasija

Output:
Vertex   Distance from Source
0        0
1        -1
2        2
3        -2
4        1

Notes
1) Negative weights are found in various applications of graphs. For example, instead of paying a cost for a path, we may get some advantage if we
follow the path.
2) Bellman-Ford works better than Dijkstra's algorithm for distributed systems. Unlike Dijkstra's, where we need to find the minimum value over all vertices,
in Bellman-Ford edges are considered one by one.

Exercise
1) The standard Bellman-Ford algorithm reports shortest paths only if there are no negative weight cycles. Modify it so that it reports minimum
distances even if there is a negative weight cycle.
2) Can we use Dijkstra's algorithm for shortest paths in graphs with negative weights? One idea is to calculate the minimum weight value, add a
positive value (equal to the absolute value of the minimum weight) to all weights, and run Dijkstra's algorithm on the modified graph. Will this
algorithm work?
References:
http://www.youtube.com/watch?v=Ttezuzs39nk
http://en.wikipedia.org/wiki/Bellman%E2%80%93Ford_algorithm
http://www.cs.arizona.edu/classes/cs445/spring07/ShortestPath2.prn.pdf

Dynamic Programming | Set 24 (Optimal Binary Search Tree)


Given a sorted array keys[0.. n-1] of search keys and an array freq[0.. n-1] of frequency counts, where freq[i] is the number of searches to
keys[i]. Construct a binary search tree of all keys such that the total cost of all the searches is as small as possible.
Let us first define the cost of a BST. The cost of a BST node is level of that node multiplied by its frequency. Level of root is 1.
Example 1
Input: keys[] = {10, 12}, freq[] = {34, 50}
There can be following two possible BSTs
    10                12
      \              /
       12          10
     I               II
Frequency of searches of 10 and 12 are 34 and 50 respectively.
The cost of tree I is 34*1 + 50*2 = 134
The cost of tree II is 50*1 + 34*2 = 118
Example 2
Input: keys[] = {10, 12, 20}, freq[] = {34, 8, 50}
There can be following possible BSTs
    10            12            20        10
      \          /  \          /            \
       12      10    20      12              20
         \                  /               /
          20              10              12
     I           II           III           IV
Among all possible BSTs, cost of the fifth BST is minimum.
Cost of the fifth BST is 1*50 + 2*34 + 3*8 = 142

    20
   /
  10
    \
     12
    V

1) Optimal Substructure:
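The optimal cost for freq[i..j] can be recursively calculated using the following formula.
optCost(i, j) = (sum of freq[k] for k = i to j) + MIN over r = i to j of { optCost(i, r-1) + optCost(r+1, j) }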
We need to calculate optCost(0, n-1) to find the result.


The idea of above formula is simple, we one by one try all nodes as root (r varies from i to j in second term). When we make rth node as root, we
recursively calculate optimal cost from i to r-1 and r+1 to j.
We add sum of frequencies from i to j (see first term in the above formula), this is added because every search will go through root and one
comparison will be done for every search.
2) Overlapping Subproblems
Following is recursive implementation that simply follows the recursive structure mentioned above.
// A naive recursive implementation of optimal binary search tree problem
#include <stdio.h>
#include <limits.h>
// A utility function to get sum of array elements freq[i] to freq[j]
int sum(int freq[], int i, int j);
// A recursive function to calculate cost of optimal binary search tree
int optCost(int freq[], int i, int j)
{
// Base cases
if (j < i)
// If there are no elements in this subarray
return 0;
if (j == i)
// If there is one element in this subarray
return freq[i];
// Get sum of freq[i], freq[i+1], ... freq[j]
int fsum = sum(freq, i, j);
// Initialize minimum value
int min = INT_MAX;
// One by one consider all elements as root and recursively find cost
// of the BST, compare the cost with min and update min if needed
for (int r = i; r <= j; ++r)
{
int cost = optCost(freq, i, r-1) + optCost(freq, r+1, j);
if (cost < min)
min = cost;
}
// Return minimum value

return min + fsum;


}
// The main function that calculates minimum cost of a Binary Search Tree.
// It mainly uses optCost() to find the optimal cost.
int optimalSearchTree(int keys[], int freq[], int n)
{
// Here array keys[] is assumed to be sorted in increasing order.
// If keys[] is not sorted, then add code to sort keys, and rearrange
// freq[] accordingly.
return optCost(freq, 0, n-1);
}
// A utility function to get sum of array elements freq[i] to freq[j]
int sum(int freq[], int i, int j)
{
int s = 0;
for (int k = i; k <=j; k++)
s += freq[k];
return s;
}
// Driver program to test above functions
int main()
{
int keys[] = {10, 12, 20};
int freq[] = {34, 8, 50};
int n = sizeof(keys)/sizeof(keys[0]);
printf("Cost of Optimal BST is %d ", optimalSearchTree(keys, freq, n));
return 0;
}

Output:
Cost of Optimal BST is 142

Time complexity of the above naive recursive approach is exponential. It should be noted that the above function computes the same subproblems
again and again. We can see many subproblems being repeated in the following recursion tree for freq[1..4].

Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the optimal BST problem has both properties (see
this and this) of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems, recomputations of the same subproblems
can be avoided by constructing a temporary array cost[][] in a bottom-up manner.
Dynamic Programming Solution
Following is a C/C++ implementation for the optimal BST problem using Dynamic Programming. We use an auxiliary array cost[n][n] to store the
solutions of subproblems. cost[0][n-1] will hold the final result. The challenge in the implementation is that all diagonal values must be filled first, then the
values which lie on the line just above the diagonal. In other words, we must first fill all cost[i][i] values, then all cost[i][i+1] values, then all
cost[i][i+2] values. So how do we fill the 2D array in such a manner? The idea used in the implementation is the same as in the Matrix Chain Multiplication problem: we
use a variable L for chain length and increment L one by one. We calculate column number j using the values of i and L.
// Dynamic Programming code for Optimal Binary Search Tree Problem
#include <stdio.h>
#include <limits.h>
// A utility function to get sum of array elements freq[i] to freq[j]
int sum(int freq[], int i, int j);
/* A Dynamic Programming based function that calculates minimum cost of
a Binary Search Tree. */
int optimalSearchTree(int keys[], int freq[], int n)
{
/* Create an auxiliary 2D matrix to store results of subproblems */
int cost[n][n];
/* cost[i][j] = Optimal cost of binary search tree that can be
formed from keys[i] to keys[j].

cost[0][n-1] will store the resultant cost */


// For a single key, cost is equal to frequency of the key
for (int i = 0; i < n; i++)
cost[i][i] = freq[i];
// Now we need to consider chains of length 2, 3, ... .
// L is chain length.
for (int L=2; L<=n; L++)
{
// i is row number in cost[][]
for (int i=0; i<=n-L; i++) // i stops at n-L so that j = i+L-1 stays within 0..n-1
{
// Get column number j from row number i and chain length L
int j = i+L-1;
cost[i][j] = INT_MAX;
// Try making all keys in interval keys[i..j] as root
for (int r=i; r<=j; r++)
{
// c = cost when keys[r] becomes root of this subtree
int c = ((r > i)? cost[i][r-1]:0) +
((r < j)? cost[r+1][j]:0) +
sum(freq, i, j);
if (c < cost[i][j])
cost[i][j] = c;
}
}
}
return cost[0][n-1];
}
// A utility function to get sum of array elements freq[i] to freq[j]
int sum(int freq[], int i, int j)
{
int s = 0;
for (int k = i; k <=j; k++)
s += freq[k];
return s;
}
// Driver program to test above functions
int main()
{
int keys[] = {10, 12, 20};
int freq[] = {34, 8, 50};
int n = sizeof(keys)/sizeof(keys[0]);
printf("Cost of Optimal BST is %d ", optimalSearchTree(keys, freq, n));
return 0;
}

Output:
Cost of Optimal BST is 142

Notes
1) The time complexity of the above solution is O(n^4). The time complexity can be easily reduced to O(n^3) by pre-calculating the sums of
frequencies instead of calling sum() again and again.
2) In the above solutions, we have computed the optimal cost only. The solutions can be easily modified to store the structure of the BST as well. We can
create another auxiliary table, indexed by i and j like cost[][], to store the structure of the tree. All we need to do is store the chosen r in the innermost loop.
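
Both notes can be sketched together in Python. This is a minimal sketch under assumptions: the names optimal_bst, prefix and root are illustrative, prefix sums replace the repeated sum() calls, and root[i][j] records the index r chosen for the interval i..j.

```python
def optimal_bst(freq):
    """Return (minimum cost, root table) for keys with the given frequencies.

    root[i][j] is the index of the key chosen as root for keys i..j.
    """
    n = len(freq)
    if n == 0:
        return 0, []
    # prefix[k] = freq[0] + ... + freq[k-1], so sum(freq[i..j]) is O(1).
    prefix = [0] * (n + 1)
    for k, f in enumerate(freq):
        prefix[k + 1] = prefix[k] + f
    cost = [[0] * n for _ in range(n)]
    root = [[-1] * n for _ in range(n)]
    for i in range(n):
        cost[i][i] = freq[i]
        root[i][i] = i
    for length in range(2, n + 1):            # chain length L
        for i in range(n - length + 1):       # i <= n - L keeps j in range
            j = i + length - 1
            fsum = prefix[j + 1] - prefix[i]
            best = None
            for r in range(i, j + 1):
                left = cost[i][r - 1] if r > i else 0
                right = cost[r + 1][j] if r < j else 0
                c = left + right + fsum
                if best is None or c < best:
                    best = c
                    root[i][j] = r            # remember the chosen root
            cost[i][j] = best
    return cost[0][n - 1], root


if __name__ == "__main__":
    total, root = optimal_bst([34, 8, 50])
    print(total, root[0][2])
```

For Example 2, root[0][2] points at key 20 and root[0][1] at key 10, which corresponds to tree V.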

Dynamic Programming | Set 26 (Largest Independent Set Problem)


Given a Binary Tree, find size of the Largest Independent Set(LIS) in it. A subset of all tree nodes is an independent set if there is no edge
between any two nodes of the subset.
For example, consider the following binary tree. The largest independent set(LIS) is {10, 40, 60, 70, 80} and size of the LIS is 5.

A Dynamic Programming solution solves a given problem using solutions of subproblems in bottom up manner. Can the given problem be solved
using solutions to subproblems? If yes, then what are the subproblems? Can we find largest independent set size (LISS) for a node X if we know
LISS for all descendants of X? If a node is considered as part of LIS, then its children cannot be part of LIS, but its grandchildren can be.
Following is optimal substructure property.
1) Optimal Substructure:
Let LISS(X) indicate the size of the largest independent set of a tree with root X.
LISS(X) = MAX { (1 + sum of LISS for all grandchildren of X),
(sum of LISS for all children of X) }

The idea is simple, there are two possibilities for every node X, either X is a member of the set or not a member. If X is a member, then the value
of LISS(X) is 1 plus LISS of all grandchildren. If X is not a member, then the value is sum of LISS of all children.
2) Overlapping Subproblems
Following is recursive implementation that simply follows the recursive structure mentioned above.
// A naive recursive implementation of Largest Independent Set problem
#include <stdio.h>
#include <stdlib.h>
// A utility function to find max of two integers
int max(int x, int y) { return (x > y)? x: y; }
/* A binary tree node has data, pointer to left child and a pointer to
right child */
struct node
{
int data;
struct node *left, *right;
};
// The function returns size of the largest independent set in a given
// binary tree
int LISS(struct node *root)
{
if (root == NULL)
return 0;
// Calculate size excluding the current node
int size_excl = LISS(root->left) + LISS(root->right);
// Calculate size including the current node
int size_incl = 1;
if (root->left)
size_incl += LISS(root->left->left) + LISS(root->left->right);
if (root->right)
size_incl += LISS(root->right->left) + LISS(root->right->right);
// Return the maximum of two sizes
return max(size_incl, size_excl);
}
// A utility function to create a node
struct node* newNode( int data )

{
struct node* temp = (struct node *) malloc( sizeof(struct node) );
temp->data = data;
temp->left = temp->right = NULL;
return temp;
}
// Driver program to test above functions
int main()
{
// Let us construct the tree given in the above diagram
struct node *root = newNode(20);
root->left = newNode(8);
root->left->left = newNode(4);
root->left->right = newNode(12);
root->left->right->left = newNode(10);
root->left->right->right = newNode(14);
root->right = newNode(22);
root->right->right = newNode(25);
printf ("Size of the Largest Independent Set is %d ", LISS(root));
return 0;
}

Output:
Size of the Largest Independent Set is 5

Time complexity of the above naive recursive approach is exponential. It should be noted that the above function computes the same subproblems
again and again. For example, LISS of the node with value 50 is evaluated for the nodes with values 10 and 20, as 50 is a grandchild of 10 and a child of 20.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the LISS problem has both properties (see this and
this) of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems, recomputations of the same subproblems can be
avoided by storing the solutions to subproblems and solving problems in a bottom-up manner.
Following is a C implementation of the Dynamic Programming based solution. In the following solution, an additional field liss is added to tree nodes.
The initial value of liss is set as 0 for all nodes. The recursive function LISS() calculates liss for a node only if it is not already set.
/* Dynamic programming based program for Largest Independent Set problem */
#include <stdio.h>
#include <stdlib.h>
// A utility function to find max of two integers
int max(int x, int y) { return (x > y)? x: y; }
/* A binary tree node has data, pointer to left child and a pointer to
right child */
struct node
{
int data;
int liss;
struct node *left, *right;
};
// A memoization function returns size of the largest independent set in
// a given binary tree
int LISS(struct node *root)
{
if (root == NULL)
return 0;
if (root->liss)
return root->liss;
if (root->left == NULL && root->right == NULL)
return (root->liss = 1);
// Calculate size excluding the current node
int liss_excl = LISS(root->left) + LISS(root->right);
// Calculate size including the current node
int liss_incl = 1;
if (root->left)
liss_incl += LISS(root->left->left) + LISS(root->left->right);
if (root->right)
liss_incl += LISS(root->right->left) + LISS(root->right->right);
// Maximum of two sizes is LISS, store it for future uses.
root->liss = max(liss_incl, liss_excl);

return root->liss;
}
// A utility function to create a node
struct node* newNode(int data)
{
struct node* temp = (struct node *) malloc( sizeof(struct node) );
temp->data = data;
temp->left = temp->right = NULL;
temp->liss = 0;
return temp;
}
// Driver program to test above functions
int main()
{
// Let us construct the tree given in the above diagram
struct node *root = newNode(20);
root->left = newNode(8);
root->left->left = newNode(4);
root->left->right = newNode(12);
root->left->right->left = newNode(10);
root->left->right->right = newNode(14);
root->right = newNode(22);
root->right->right = newNode(25);
printf ("Size of the Largest Independent Set is %d ", LISS(root));
return 0;
}

Output
Size of the Largest Independent Set is 5

Time Complexity: O(n) where n is the number of nodes in given Binary tree.
Following extensions to above solution can be tried as an exercise.
1) Extend the above solution for n-ary tree.
2) The above solution modifies the given tree structure by adding an additional field liss to tree nodes. Extend the solution so that it doesn't modify
the tree structure.
3) The above solution only returns the size of the LIS; it doesn't print the elements of the LIS. Extend the solution to print all nodes that are part of the LIS.
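
As a hint towards extension 2, here is one possible sketch in Python. It is an assumption-laden illustration rather than the intended answer: the Node class and function name are hypothetical, and instead of caching liss on the node, each call returns a pair (size if the node is included, size if it is excluded), so every node is visited once and the tree is left unchanged.

```python
class Node:
    """Hypothetical binary tree node used only for this sketch."""

    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def largest_independent_set_size(root):
    """Size of the largest independent set, computed without extra node fields."""

    def visit(node):
        # Returns (best size with node included, best size with node excluded).
        if node is None:
            return 0, 0
        left_in, left_out = visit(node.left)
        right_in, right_out = visit(node.right)
        included = 1 + left_out + right_out       # children must be excluded
        excluded = max(left_in, left_out) + max(right_in, right_out)
        return included, excluded

    return max(visit(root))


if __name__ == "__main__":
    # Same tree as the one built in the C driver code.
    tree = Node(20,
                Node(8, Node(4), Node(12, Node(10), Node(14))),
                Node(22, None, Node(25)))
    print(largest_independent_set_size(tree))
```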

Dynamic Programming | Set 25 (Subset Sum Problem)


Given a set of non-negative integers, and a value sum, determine if there is a subset of the given set with sum equal to given sum.
Examples: set[] = {3, 34, 4, 12, 5, 2}, sum = 9
Output: True //There is a subset (4, 5) with sum 9.

Let isSubsetSum(int set[], int n, int sum) be the function to find whether there is a subset of set[] with sum equal to sum. n is the number of elements in set[].
The isSubsetSum problem can be divided into two subproblems:
a) Include the last element, recur for n = n-1, sum = sum - set[n-1]
b) Exclude the last element, recur for n = n-1.
If either of the above subproblems returns true, then return true.
Following is the recursive formula for the isSubsetSum() problem.
isSubsetSum(set, n, sum) = isSubsetSum(set, n-1, sum) ||
isSubsetSum(set, n-1, sum-set[n-1])
Base Cases:
isSubsetSum(set, n, sum) = false, if sum > 0 and n == 0
isSubsetSum(set, n, sum) = true, if sum == 0

Following is a naive recursive implementation that simply follows the recursive structure mentioned above.
// A recursive solution for subset sum problem
#include <stdio.h>
#include <stdbool.h> // bool, true and false when compiled as C
// Returns true if there is a subset of set[] with sum equal to given sum
bool isSubsetSum(int set[], int n, int sum)
{
// Base Cases
if (sum == 0)
return true;
if (n == 0 && sum != 0)
return false;
// If last element is greater than sum, then ignore it
if (set[n-1] > sum)
return isSubsetSum(set, n-1, sum);
/* else, check if sum can be obtained by any of the following
(a) including the last element
(b) excluding the last element */
return isSubsetSum(set, n-1, sum) || isSubsetSum(set, n-1, sum-set[n-1]);
}
// Driver program to test above function
int main()
{
int set[] = {3, 34, 4, 12, 5, 2};
int sum = 9;
int n = sizeof(set)/sizeof(set[0]);
if (isSubsetSum(set, n, sum) == true)
printf("Found a subset with given sum");
else
printf("No subset with given sum");
return 0;
}

Output:
Found a subset with given sum

The above solution may try all subsets of the given set in the worst case. Therefore the time complexity of the above solution is exponential. The problem is in fact NP-Complete (there is no known polynomial time solution for this problem).
We can solve the problem in pseudo-polynomial time using dynamic programming. We create a boolean 2D table subset[][] and fill it in a bottom-up manner. The value of subset[i][j] will be true if there is a subset of set[0..j-1] with sum equal to i, otherwise false. Finally, we return subset[sum][n].
// A Dynamic Programming solution for subset sum problem
#include <stdio.h>
#include <stdbool.h> // bool, true and false when compiled as C
// Returns true if there is a subset of set[] with sum equal to given sum
bool isSubsetSum(int set[], int n, int sum)
{
// The value of subset[i][j] will be true if there is a subset of set[0..j-1]
// with sum equal to i
bool subset[sum+1][n+1];
// If sum is 0, then answer is true
for (int i = 0; i <= n; i++)
subset[0][i] = true;
// If sum is not 0 and set is empty, then answer is false
for (int i = 1; i <= sum; i++)
subset[i][0] = false;
// Fill the subset table in bottom up manner
for (int i = 1; i <= sum; i++)
{
for (int j = 1; j <= n; j++)
{
subset[i][j] = subset[i][j-1];
if (i >= set[j-1])
subset[i][j] = subset[i][j] || subset[i - set[j-1]][j-1];
}
}
/* // uncomment this code to print table
for (int i = 0; i <= sum; i++)
{
for (int j = 0; j <= n; j++)
printf ("%4d", subset[i][j]);
printf("\n");
} */
return subset[sum][n];
}
// Driver program to test above function
int main()
{
int set[] = {3, 34, 4, 12, 5, 2};
int sum = 9;
int n = sizeof(set)/sizeof(set[0]);
if (isSubsetSum(set, n, sum) == true)
printf("Found a subset with given sum");
else
printf("No subset with given sum");
return 0;
}

Output:
Found a subset with given sum

Time complexity of the above solution is O(sum*n).
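Since column j of the table above depends only on column j-1, the table can be collapsed to a single row. The following is a minimal sketch of that idea, not part of the original solution: the name isSubsetSum1D and the array reachable[] are hypothetical, and it assumes non-negative elements and sum >= 0, as in the problem statement. Time stays O(sum*n), while auxiliary space drops to O(sum).

```c
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical helper (not from the original article): returns true if
// some subset of set[0..n-1] adds up to sum, using a single 1D table.
bool isSubsetSum1D(const int set[], int n, int sum)
{
    // reachable[s] is true if some subset of the elements processed
    // so far has sum exactly s
    bool *reachable = calloc((size_t)sum + 1, sizeof(bool));
    if (reachable == NULL)
        return false; // allocation failure is treated as "not found" in this sketch

    reachable[0] = true; // the empty subset has sum 0

    for (int j = 0; j < n; j++)
    {
        // Walk s downwards so that set[j] is used at most once
        for (int s = sum; s >= set[j]; s--)
        {
            if (reachable[s - set[j]])
                reachable[s] = true;
        }
    }

    bool answer = reachable[sum];
    free(reachable);
    return answer;
}
```

For set[] = {3, 34, 4, 12, 5, 2}, calling isSubsetSum1D(set, 6, 9) returns true, and calling it with sum = 30 returns false, because the elements other than 34 add up to only 26.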

Dynamic Programming | Set 27 (Maximum sum rectangle in a 2D matrix)


Given a 2D array, find the maximum sum subarray in it. For example, in the following 2D array, the maximum sum subarray is highlighted with blue
rectangle and sum of this subarray is 29.

This problem is mainly an extension of Largest Sum Contiguous Subarray for 1D array.
The naive solution for this problem is to check every possible rectangle in given 2D array. This solution requires 4 nested loops and time
complexity of this solution would be O(n^4).
Kadane's algorithm for 1D array can be used to reduce the time complexity to O(n^3). The idea is to fix the left and right columns one by one and find the maximum sum contiguous rows for every left and right column pair. We basically find top and bottom row numbers (which have maximum sum) for every fixed left and right column pair. To find the top and bottom row numbers, calculate the sum of elements in every row from left to right and store these sums in an array, say temp[]. So temp[i] indicates the sum of elements from left to right in row i. If we apply Kadane's 1D algorithm on temp[] and get the maximum sum subarray of temp, this maximum sum would be the maximum possible sum with left and right as boundary columns. To get the overall maximum sum, we compare this sum with the maximum sum so far.
// Program to find maximum sum subarray in a given 2D array
#include <stdio.h>
#include <string.h>
#include <limits.h>
#define ROW 4
#define COL 5
// Implementation of Kadane's algorithm for 1D array. The function returns the
// maximum sum and stores starting and ending indexes of the maximum sum subarray
// at addresses pointed by start and finish pointers respectively.
int kadane(int* arr, int* start, int* finish, int n)
{
// Initialize the running sum and the maximum sum found so far
int sum = 0, maxSum = INT_MIN, i;
// Just some initial value to check for all negative values case
*finish = -1;
// local variable
int local_start = 0;
for (i = 0; i < n; ++i)
{
sum += arr[i];
if (sum < 0)
{
sum = 0;
local_start = i+1;
}
else if (sum > maxSum)
{
maxSum = sum;
*start = local_start;
*finish = i;
}
}
// There is at-least one non-negative number
if (*finish != -1)
return maxSum;
// Special Case: When all numbers in arr[] are negative
maxSum = arr[0];
*start = *finish = 0;
// Find the maximum element in array
for (i = 1; i < n; i++)
{
if (arr[i] > maxSum)
{
maxSum = arr[i];
*start = *finish = i;
}
}
return maxSum;
}
// The main function that finds maximum sum rectangle in M[][]
void findMaxSum(int M[][COL])
{
// Variables to store the final output
int maxSum = INT_MIN, finalLeft, finalRight, finalTop, finalBottom;
int left, right, i;
int temp[ROW], sum, start, finish;
// Set the left column
for (left = 0; left < COL; ++left)
{
// Initialize all elements of temp as 0
memset(temp, 0, sizeof(temp));
// Set the right column for the left column set by outer loop
for (right = left; right < COL; ++right)
{
// Calculate sum between current left and right for every row 'i'
for (i = 0; i < ROW; ++i)
temp[i] += M[i][right];
// Find the maximum sum subarray in temp[]. The kadane() function
// also sets values of start and finish. So 'sum' is sum of
// rectangle between (start, left) and (finish, right) which is the
// maximum sum with boundary columns strictly as left and right.
sum = kadane(temp, &start, &finish, ROW);
// Compare sum with maximum sum so far. If sum is more, then update
// maxSum and other output values
if (sum > maxSum)
{
maxSum = sum;
finalLeft = left;
finalRight = right;
finalTop = start;
finalBottom = finish;
}
}
}
// Print final values
printf("(Top, Left) (%d, %d)\n", finalTop, finalLeft);
printf("(Bottom, Right) (%d, %d)\n", finalBottom, finalRight);
printf("Max sum is: %d\n", maxSum);
}
// Driver program to test above functions
int main()
{
int M[ROW][COL] = {{1, 2, -1, -4, -20},
{-8, -3, 4, 2, 1},
{3, 8, 10, 1, 3},
{-4, -1, 1, 7, -6}
};
findMaxSum(M);
return 0;
}

Output:
(Top, Left) (1, 1)
(Bottom, Right) (3, 3)
Max sum is: 29

Time Complexity: O(n^3)

Count number of binary strings without consecutive 1s


Given a positive integer N, count all possible distinct binary strings of length N such that there are no consecutive 1s.
Examples:
Input: N = 2
Output: 3
// The 3 strings are 00, 01, 10
Input: N = 3
Output: 5
// The 5 strings are 000, 001, 010, 100, 101

This problem can be solved using Dynamic Programming. Let a[i] be the number of binary strings of length i which do not contain any two
consecutive 1s and which end in 0. Similarly, let b[i] be the number of such strings which end in 1. We can append either 0 or 1 to a string ending
in 0, but we can only append 0 to a string ending in 1. This yields the recurrence relation:
a[i] = a[i - 1] + b[i - 1]
b[i] = a[i - 1]

The base cases of above recurrence are a[1] = b[1] = 1. The total number of strings of length i is just a[i] + b[i].
Following is C++ implementation of above solution. In the following implementation, indexes start from 0. So a[i] represents the number of binary
strings for input length i+1. Similarly, b[i] represents binary strings for input length i+1.
// C++ program to count all distinct binary strings
// without two consecutive 1's
#include <iostream>
using namespace std;
int countStrings(int n)
{
int a[n], b[n];
a[0] = b[0] = 1;
for (int i = 1; i < n; i++)
{
a[i] = a[i-1] + b[i-1];
b[i] = a[i-1];
}
return a[n-1] + b[n-1];
}
// Driver program to test above functions
int main()
{
cout << countStrings(3) << endl;
return 0;
}

Output:
5

Source:
courses.csail.mit.edu/6.006/oldquizzes/solutions/q2-f2009-sol.pdf

Dynamic Programming | Set 37 (Boolean Parenthesization Problem)


Given a boolean expression with following symbols.
Symbols
'T' ---> true
'F' ---> false

And following operators filled between symbols


Operators
& ---> boolean AND
| ---> boolean OR
^ ---> boolean XOR

Count the number of ways we can parenthesize the expression so that the value of the expression evaluates to true.
Let the input be in the form of two arrays: one contains the symbols (T and F) in order and the other contains the operators (&, | and ^).
Examples:
Input: symbol[] = {T, F, T}
operator[] = {^, &}
Output: 2
The given expression is "T ^ F & T", it evaluates true
in two ways "((T ^ F) & T)" and "(T ^ (F & T))"
Input: symbol[] = {T, F, F}
operator[] = {^, |}
Output: 2
The given expression is "T ^ F | F", it evaluates true
in two ways "( (T ^ F) | F )" and "( T ^ (F | F) )".
Input: symbol[] = {T, T, F, T}
operator[] = {|, &, ^}
Output: 4
The given expression is "T | T & F ^ T", it evaluates true
in 4 ways ((T|T)&(F^T)), (T|(T&(F^T))), (((T|T)&F)^T)
and (T|((T&F)^T)).

Solution:
Let T(i, j) represent the number of ways to parenthesize the symbols between i and j (both inclusive) such that the subexpression between i and j evaluates to true.

Let F(i, j) represent the number of ways to parenthesize the symbols between i and j (both inclusive) such that the subexpression between i and j evaluates to false.

Base Cases:
T(i, i) = 1 if symbol[i] = 'T'
T(i, i) = 0 if symbol[i] = 'F'
F(i, i) = 1 if symbol[i] = 'F'
F(i, i) = 0 if symbol[i] = 'T'
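
For j > i, the recurrences can be read off the implementation given further down. The block below is a reconstruction from that code rather than a formula quoted from the original text; Total(i, j) is shorthand introduced here for T(i, j) + F(i, j), and the sum runs over every position k of the operator that is applied last.

```latex
\begin{aligned}
\mathrm{Total}(i,j) &= T(i,j) + F(i,j) \\[4pt]
T(i,j) &= \sum_{k=i}^{j-1}
\begin{cases}
T(i,k)\,T(k+1,j) & \text{if operator}[k] \text{ is AND} \\
\mathrm{Total}(i,k)\,\mathrm{Total}(k+1,j) - F(i,k)\,F(k+1,j) & \text{if operator}[k] \text{ is OR} \\
T(i,k)\,F(k+1,j) + F(i,k)\,T(k+1,j) & \text{if operator}[k] \text{ is XOR}
\end{cases} \\[4pt]
F(i,j) &= \sum_{k=i}^{j-1}
\begin{cases}
\mathrm{Total}(i,k)\,\mathrm{Total}(k+1,j) - T(i,k)\,T(k+1,j) & \text{if operator}[k] \text{ is AND} \\
F(i,k)\,F(k+1,j) & \text{if operator}[k] \text{ is OR} \\
T(i,k)\,T(k+1,j) + F(i,k)\,F(k+1,j) & \text{if operator}[k] \text{ is XOR}
\end{cases}
\end{aligned}
```

For example, with symbols {T, F, T} and operators {^, &}, splitting at k = 0 contributes T(0,0) F(1,2) + F(0,0) T(1,2) = 1*1 + 0*0 = 1 and splitting at k = 1 contributes T(0,1) T(2,2) = 1*1 = 1, giving T(0, 2) = 2 as in the first example.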

If we draw the recursion tree of the above recursive solution, we can observe that it has many overlapping subproblems. Like other dynamic programming problems, it can be solved by filling a table in a bottom-up manner. Following is C++ implementation of the dynamic programming solution.
#include<iostream>
#include<cstring>
using namespace std;
// Returns count of all possible parenthesizations that lead to
// result true for a boolean expression with symbols like true
// and false and operators like &, | and ^ filled between symbols

int countParenth(char symb[], char oper[], int n)
{
int F[n][n], T[n][n];
// Fill diagonal entries first
// All diagonal entries in T[i][i] are 1 if symbol[i]
// is T (true). Similarly, all F[i][i] entries are 1 if
// symbol[i] is F (False)
for (int i = 0; i < n; i++)
{
F[i][i] = (symb[i] == 'F')? 1: 0;
T[i][i] = (symb[i] == 'T')? 1: 0;
}
// Now fill T[i][i+1], T[i][i+2], T[i][i+3]... in order
// And F[i][i+1], F[i][i+2], F[i][i+3]... in order
for (int gap=1; gap<n; ++gap)
{
for (int i=0, j=gap; j<n; ++i, ++j)
{
T[i][j] = F[i][j] = 0;
for (int g=0; g<gap; g++)
{
// Find place of parenthesization using current value
// of gap
int k = i + g;
// Store Total[i][k] and Total[k+1][j]
int tik = T[i][k] + F[i][k];
int tkj = T[k+1][j] + F[k+1][j];
// Follow the recursive formulas according to the current
// operator
if (oper[k] == '&')
{
T[i][j] += T[i][k]*T[k+1][j];
F[i][j] += (tik*tkj - T[i][k]*T[k+1][j]);
}
if (oper[k] == '|')
{
F[i][j] += F[i][k]*F[k+1][j];
T[i][j] += (tik*tkj - F[i][k]*F[k+1][j]);
}
if (oper[k] == '^')
{
T[i][j] += F[i][k]*T[k+1][j] + T[i][k]*F[k+1][j];
F[i][j] += T[i][k]*T[k+1][j] + F[i][k]*F[k+1][j];
}
}
}
}
return T[0][n-1];
}
// Driver program to test above function
int main()
{
char symbols[] = "TTFT";
char operators[] = "|&^";
int n = strlen(symbols);
// There are 4 ways
// ((T|T)&(F^T)), (T|(T&(F^T))), (((T|T)&F)^T) and (T|((T&F)^T))
cout << countParenth(symbols, operators, n);
return 0;
}

Output:
4

Time Complexity: O(n^3)
Auxiliary Space: O(n^2)
References:
http://people.cs.clemson.edu/~bcdean/dp_practice/dp_9.swf

Count ways to reach the nth stair


There are n stairs, a person standing at the bottom wants to reach the top. The person can climb either 1 stair or 2 stairs at a time. Count the
number of ways, the person can reach the top.

Consider the example shown in diagram. The value of n is 3. There are 3 ways to reach the top. The diagram is taken from Easier Fibonacci
puzzles

More Examples:
Input: n = 1
Output: 1
There is only one way to climb 1 stair
Input: n = 2
Output: 2
There are two ways: (1, 1) and (2)
Input: n = 4
Output: 5
(1, 1, 1, 1), (1, 1, 2), (2, 1, 1), (1, 2, 1), (2, 2)

We can easily find the recursive nature of the above problem. The person can reach the nth stair from either the (n-1)th stair or the (n-2)th stair. Let the total number of ways to reach the nth stair be ways(n). The value of ways(n) can be written as follows.
ways(n) = ways(n-1) + ways(n-2)

The above expression is actually the expression for Fibonacci numbers, but there is one thing to notice, the value of ways(n) is equal to
fibonacci(n+1).
ways(1) = fib(2) = 1
ways(2) = fib(3) = 2
ways(3) = fib(4) = 3
So we can use a function for Fibonacci numbers to find the value of ways(n). Following is C implementation of the above idea.
// A C program to count number of ways to reach n'th stair when
// a person can climb either 1 or 2 stairs at a time
#include<stdio.h>
// A simple recursive program to find n'th fibonacci number
int fib(int n)
{
if (n <= 1)
return n;
return fib(n-1) + fib(n-2);
}
// Returns number of ways to reach s'th stair
int countWays(int s)
{
return fib(s + 1);
}
// Driver program to test above functions

int main ()
{
int s = 4;
printf("Number of ways = %d", countWays(s));
getchar();
return 0;
}

Output:
Number of ways = 5

The time complexity of the above implementation is exponential (golden ratio raised to power n). It can be optimized to work in O(log n) time using the previously discussed Fibonacci function optimizations.
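Short of the O(log n) optimizations, a simple loop already removes the exponential blow-up. The following is a minimal sketch of a linear-time, constant-space variant; the function name countWaysIter is hypothetical and it swaps the recursive fib() for an iterative computation of the same recurrence ways(n) = ways(n-1) + ways(n-2), with ways(0) = ways(1) = 1.

```c
// Hypothetical iterative variant (not from the original article):
// returns the number of ways to reach the s'th stair using steps of 1 or 2.
unsigned long long countWaysIter(int s)
{
    if (s < 0)
        return 0; // no way to reach a negative stair

    // prev = ways(i-1) and curr = ways(i); start with ways(0) = ways(1) = 1
    unsigned long long prev = 1, curr = 1;

    for (int i = 2; i <= s; i++)
    {
        unsigned long long next = prev + curr; // ways(i) = ways(i-1) + ways(i-2)
        prev = curr;
        curr = next;
    }
    return curr;
}
```

Calling countWaysIter(4) returns 5, matching the output of the recursive program above. The unsigned long long result still overflows for large s, so this is only a sketch for moderate inputs.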
Generalization of the above problem
How to count number of ways if the person can climb up to m stairs for a given value m? For example if m is 4, the person can climb 1 stair or 2
stairs or 3 stairs or 4 stairs at a time.
We can write the recurrence as following.
ways(n, m) = ways(n-1, m) + ways(n-2, m) + ... ways(n-m, m)

Following is C implementation of the above recurrence.
// A C program to count number of ways to reach n'th stair when
// a person can climb 1, 2, ..m stairs at a time
#include<stdio.h>
// A recursive function used by countWays
int countWaysUtil(int n, int m)
{
if (n <= 1)
return n;
int res = 0;
for (int i = 1; i<=m && i<=n; i++)
res += countWaysUtil(n-i, m);
return res;
}
// Returns number of ways to reach s'th stair
int countWays(int s, int m)
{
return countWaysUtil(s+1, m);
}
// Driver program to test above functions
int main ()
{
int s = 4, m = 2;
printf("Number of ways = %d", countWays(s, m));
return 0;
}

Output:
Number of ways = 5

The time complexity of above solution is exponential. It can be optimized to O(mn) by using dynamic programming. Following is dynamic
programming based solution. We build a table res[] in bottom up manner.
// A C program to count number of ways to reach n't stair when
// a person can climb 1, 2, ..m stairs at a time
#include<stdio.h>
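// A bottom-up helper used by countWays. res[i] holds the value that the
// recursive countWaysUtil(i+1, m) shown earlier would return
int countWaysUtil(int n, int m)
{
// Guard small n so that res[1] below is never out of bounds
if (n <= 1)
return n;
int res[n];
res[0] = 1; res[1] = 1;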
for (int i=2; i<n; i++)
{
res[i] = 0;
for (int j=1; j<=m && j<=i; j++)
res[i] += res[i-j];
}
return res[n-1];
}

// Returns number of ways to reach s'th stair
int countWays(int s, int m)
{
return countWaysUtil(s+1, m);
}
// Driver program to test above functions
int main ()
{
int s = 4, m = 2;
printf("Number of ways = %d", countWays(s, m));
return 0;
}

Output:
Number of ways = 5

Minimum Cost Polygon Triangulation


A triangulation of a convex polygon is formed by drawing diagonals between non-adjacent vertices (corners) such that the diagonals never
intersect. The problem is to find the cost of triangulation with the minimum cost. The cost of a triangulation is sum of the weights of its component
triangles. Weight of each triangle is its perimeter (sum of lengths of all sides)
See following example taken from this source.

Two triangulations of the same convex pentagon. The triangulation on the left has a cost of 8 + 2√2 + 2√5 (approximately 15.30), the one on the right has a cost of 4 + 2√2 + 4√5 (approximately 15.77).
This problem has recursive substructure. The idea is to divide the polygon into three parts: a single triangle, the sub-polygon to the left, and the
sub-polygon to the right. We try all possible divisions like this and find the one that minimizes the cost of the triangle plus the cost of the
triangulation of the two sub-polygons.
Let Minimum Cost of triangulation of vertices from i to j be minCost(i, j)
If j < i + 2 Then
minCost(i, j) = 0
Else
minCost(i, j) = Min { minCost(i, k) + minCost(k, j) + cost(i, k, j) }
Here k varies from 'i+1' to 'j-1'
Cost of a triangle formed by edges (i, j), (j, k) and (k, i) is
cost(i, j, k) = dist(i, j) + dist(j, k) + dist(k, i)

Following is C++ implementation of above naive recursive formula.


// Recursive implementation for minimum cost convex polygon triangulation
#include <iostream>
#include <cmath>
#define MAX 1000000.0
using namespace std;
// Structure of a point in 2D plane
struct Point
{
int x, y;
};
// Utility function to find minimum of two double values
double min(double x, double y)
{
return (x <= y)? x : y;
}
// A utility function to find distance between two points in a plane
double dist(Point p1, Point p2)
{
return sqrt((p1.x - p2.x)*(p1.x - p2.x) +
(p1.y - p2.y)*(p1.y - p2.y));
}
// A utility function to find cost of a triangle. The cost is considered
// as perimeter (sum of lengths of all edges) of the triangle
double cost(Point points[], int i, int j, int k)
{
Point p1 = points[i], p2 = points[j], p3 = points[k];
return dist(p1, p2) + dist(p2, p3) + dist(p3, p1);
}
// A recursive function to find minimum cost of polygon triangulation
// The polygon is represented by points[i..j].
double mTC(Point points[], int i, int j)
{
// There must be at least three points between i and j
// (including i and j)
if (j < i+2)
return 0;
// Initialize result as infinite
double res = MAX;
// Find minimum triangulation by considering all possible
// choices of the third vertex k for the triangle on edge (i, j)
for (int k=i+1; k<j; k++)
res = min(res, (mTC(points, i, k) + mTC(points, k, j) +
cost(points, i, k, j)));
return res;
}
// Driver program to test above functions
int main()
{
Point points[] = {{0, 0}, {1, 0}, {2, 1}, {1, 2}, {0, 2}};
int n = sizeof(points)/sizeof(points[0]);
cout << mTC(points, 0, n-1);
return 0;
}

Output:
15.3006

The above problem is similar to Matrix Chain Multiplication. The following is recursion tree for mTC(points[], 0, 4).

It can be easily seen in the above recursion tree that the problem has many overlapping subproblems. Since the problem has both properties:
Optimal Substructure and Overlapping Subproblems, it can be efficiently solved using dynamic programming.
Following is C++ implementation of dynamic programming solution.
// A Dynamic Programming based program to find minimum cost of convex
// polygon triangulation
#include <iostream>
#include <cmath>
#define MAX 1000000.0
using namespace std;
// Structure of a point in 2D plane
struct Point
{
int x, y;
};
// Utility function to find minimum of two double values
double min(double x, double y)
{
return (x <= y)? x : y;
}
// A utility function to find distance between two points in a plane
double dist(Point p1, Point p2)
{
return sqrt((p1.x - p2.x)*(p1.x - p2.x) +
(p1.y - p2.y)*(p1.y - p2.y));
}
// A utility function to find cost of a triangle. The cost is considered
// as perimeter (sum of lengths of all edges) of the triangle
double cost(Point points[], int i, int j, int k)
{
Point p1 = points[i], p2 = points[j], p3 = points[k];
return dist(p1, p2) + dist(p2, p3) + dist(p3, p1);
}
// A Dynamic programming based function to find minimum cost for convex
// polygon triangulation.
double mTCDP(Point points[], int n)
{
// There must be at least 3 points to form a triangle
if (n < 3)
return 0;
// table to store results of subproblems. table[i][j] stores cost of
// triangulation of points from i to j. The entry table[0][n-1] stores
// the final result.
double table[n][n];
// Fill table using above recursive formula. Note that the table
// is filled in diagonal fashion i.e., from diagonal elements to
// table[0][n-1] which is the result.
for (int gap = 0; gap < n; gap++)
{
for (int i = 0, j = gap; j < n; i++, j++)
{
if (j < i+2)
table[i][j] = 0.0;
else
{
table[i][j] = MAX;
for (int k = i+1; k < j; k++)
{
double val = table[i][k] + table[k][j] + cost(points,i,j,k);
if (table[i][j] > val)
table[i][j] = val;
}
}
}
}
return table[0][n-1];
}
// Driver program to test above functions
int main()
{
Point points[] = {{0, 0}, {1, 0}, {2, 1}, {1, 2}, {0, 2}};
int n = sizeof(points)/sizeof(points[0]);
cout << mTCDP(points, n);
return 0;
}

Output:
15.3006

Time complexity of the above dynamic programming solution is O(n^3).
Please note that the above implementations assume that the points of the convex polygon are given in order (either clockwise or anticlockwise).
Exercise:
Extend the above solution to print triangulation also. For the above example, the optimal triangulation is 0 3 4, 0 1 3, and 1 2 3.
Sources:
http://www.cs.utexas.edu/users/djimenez/utsa/cs3343/lecture12.html
http://www.cs.utoronto.ca/~heap/Courses/270F02/A4/chains/node2.html

Mobile Numeric Keypad Problem

Given the mobile numeric keypad, you can only press the current button again or a button that is directly up, left, right or down from the current button. You are not allowed to press the bottom row corner buttons (i.e. * and #).
Given a number N, find out the number of possible numbers of the given length.
Examples:
For N=1, number of possible numbers would be 10 (0, 1, 2, 3, ..., 9)
For N=2, number of possible numbers would be 36
Possible numbers: 00,08 11,12,14 22,21,23,25 and so on.
If we start with 0, valid numbers will be 00, 08 (count: 2)
If we start with 1, valid numbers will be 11, 12, 14 (count: 3)
If we start with 2, valid numbers will be 22, 21, 23,25 (count: 4)
If we start with 3, valid numbers will be 33, 32, 36 (count: 3)
If we start with 4, valid numbers will be 44,41,45,47 (count: 4)
If we start with 5, valid numbers will be 55,54,52,56,58 (count: 5)
We need to print the count of possible numbers.
N = 1 is the trivial case: the number of possible numbers is 10 (0, 1, 2, 3, ..., 9).
For N > 1, we need to start from some button, then either stay on it or move in any of the four directions (up, left, right or down) to a valid button (never * or #). Keep doing this until a number of length N is obtained (depth first traversal).
Recursive Solution:
Mobile Keypad is a rectangular grid of 4X3 (4 rows and 3 columns)
Let's say Count(i, j, N) represents the count of N length numbers starting from position (i, j)
If N = 1
Count(i, j, N) = 1
Else
Count(i, j, N) = Sum of all Count(r, c, N-1) where (r, c) is the new
position after staying at (i, j) or making a valid move of
length 1 from the current position (i, j)
The final answer is the sum of Count(i, j, N) over all ten digit positions, which gives 10 for N = 1.

Following is C implementation of above recursive formula.


// A Naive Recursive C program to count number of possible numbers
// of given length
#include <stdio.h>
// stay, left, up, right, down moves from current location
int row[] = {0, 0, -1, 0, 1};
int col[] = {0, -1, 0, 1, 0};
// Returns count of numbers of length n starting from key position
// (i, j) in a numeric keyboard.
int getCountUtil(char keypad[][3], int i, int j, int n)
{
if (keypad == NULL || n <= 0)
return 0;
// From a given key, only one number is possible of length 1
if (n == 1)
return 1;
int k=0, move=0, ro=0, co=0, totalCount = 0;
// move left, up, right, down from current location and if
// new location is valid, then get number count of length
// (n-1) from that new position and add in count obtained so far
for (move=0; move<5; move++)
{
ro = i + row[move];
co = j + col[move];
if (ro >= 0 && ro <= 3 && co >=0 && co <= 2 &&
keypad[ro][co] != '*' && keypad[ro][co] != '#')
{
totalCount += getCountUtil(keypad, ro, co, n-1);
}
}
return totalCount;
}
// Return count of all possible numbers of length n
// in a given numeric keyboard
int getCount(char keypad[][3], int n)
{
// Base cases
if (keypad == NULL || n <= 0)
return 0;
if (n == 1)
return 10;
int i=0, j=0, totalCount = 0;
for (i=0; i<4; i++) // Loop on keypad row
{
for (j=0; j<3; j++) // Loop on keypad column
{
// Process for 0 to 9 digits
if (keypad[i][j] != '*' && keypad[i][j] != '#')
{
// Get count when number is starting from key
// position (i, j) and add in count obtained so far
totalCount += getCountUtil(keypad, i, j, n);
}
}
}
return totalCount;
}
// Driver program to test above function
int main(int argc, char *argv[])
{
char keypad[4][3] = {{'1','2','3'},
{'4','5','6'},
{'7','8','9'},
{'*','0','#'}};
printf("Count for numbers of length %d: %d\n", 1, getCount(keypad, 1));
printf("Count for numbers of length %d: %d\n", 2, getCount(keypad, 2));
printf("Count for numbers of length %d: %d\n", 3, getCount(keypad, 3));
printf("Count for numbers of length %d: %d\n", 4, getCount(keypad, 4));
printf("Count for numbers of length %d: %d\n", 5, getCount(keypad, 5));

return 0;
}

Output:
Count for numbers of length 1: 10
Count for numbers of length 2: 36
Count for numbers of length 3: 138
Count for numbers of length 4: 532
Count for numbers of length 5: 2062

Dynamic Programming
There are many repeated traversals of smaller paths (traversals for smaller N) when finding all possible longer paths (traversals for bigger N). See the following two diagrams for an example. In this traversal, for N = 4 from two starting positions (buttons 4 and 8), we can see there are a few repeated traversals for N = 2 (e.g. 4 -> 1, 6 -> 3, 8 -> 9, 8 -> 7 etc).

Since the problem has both properties: Optimal Substructure and Overlapping Subproblems, it can be efficiently solved using dynamic
programming.
Following is C program for dynamic programming implementation.
// A Dynamic Programming based C program to count number of
// possible numbers of given length
#include <stdio.h>
// Return count of all possible numbers of length n
// in a given numeric keyboard
int getCount(char keypad[][3], int n)
{
if(keypad == NULL || n <= 0)
return 0;
if(n == 1)
return 10;
// stay, left, up, right, down moves from current location
int row[] = {0, 0, -1, 0, 1};
int col[] = {0, -1, 0, 1, 0};
// taking n+1 for simplicity - count[i][j] will store
// number count starting with digit i and length j
int count[10][n+1];
int i=0, j=0, k=0, move=0, ro=0, co=0, num = 0;
int nextNum=0, totalCount = 0;
// count numbers starting with digit i and of lengths 0 and 1
for (i=0; i<=9; i++)
{
count[i][0] = 0;
count[i][1] = 1;
}
// Bottom up - Get number count of length 2, 3, 4, ... , n
for (k=2; k<=n; k++)
{
for (i=0; i<4; i++) // Loop on keypad row
{
for (j=0; j<3; j++) // Loop on keypad column
{
// Process for 0 to 9 digits
if (keypad[i][j] != '*' && keypad[i][j] != '#')
{
// Here we are counting the numbers starting with
// digit keypad[i][j] and of length k. keypad[i][j]
// will become 1st digit, and we need to look for
// (k-1) more digits
num = keypad[i][j] - '0';
count[num][k] = 0;
// move left, up, right, down from current location
// and if new location is valid, then get number
// count of length (k-1) from that new digit and
// add in count we found so far
for (move=0; move<5; move++)
{
ro = i + row[move];
co = j + col[move];
if (ro >= 0 && ro <= 3 && co >=0 && co <= 2 &&
keypad[ro][co] != '*' && keypad[ro][co] != '#')
{
nextNum = keypad[ro][co] - '0';
count[num][k] += count[nextNum][k-1];
}
}
}
}
}
}
// Get count of all possible numbers of length "n" starting
// with digit 0, 1, 2, ..., 9
totalCount = 0;
for (i=0; i<=9; i++)
totalCount += count[i][n];
return totalCount;
}
// Driver program to test above function
int main(int argc, char *argv[])
{
char keypad[4][3] = {{'1','2','3'},
{'4','5','6'},
{'7','8','9'},
{'*','0','#'}};
printf("Count for numbers of length %d: %d\n", 1, getCount(keypad, 1));
printf("Count for numbers of length %d: %d\n", 2, getCount(keypad, 2));
printf("Count for numbers of length %d: %d\n", 3, getCount(keypad, 3));
printf("Count for numbers of length %d: %d\n", 4, getCount(keypad, 4));
printf("Count for numbers of length %d: %d\n", 5, getCount(keypad, 5));

return 0;
}

Output:
Count for numbers of length 1: 10
Count for numbers of length 2: 36
Count for numbers of length 3: 138
Count for numbers of length 4: 532
Count for numbers of length 5: 2062

A Space Optimized Solution:


The above dynamic programming approach runs in O(n) time and requires O(n) auxiliary space, as only one for loop runs n times and the other for loops run for constant time. We can see that the nth iteration needs data from the (n-1)th iteration only, so we need not keep the data from older iterations. We can have a space efficient dynamic programming approach with just two arrays of size 10. Thanks to Nik for suggesting this solution.
// A Space Optimized C program to count number of possible numbers
// of given length
#include <stdio.h>
// Return count of all possible numbers of length n
// in a given numeric keyboard
int getCount(char keypad[][3], int n)
{
if(keypad == NULL || n <= 0)
return 0;
if(n == 1)
return 10;
// odd[i], even[i] arrays represent count of numbers starting
// with digit i for any length j
int odd[10], even[10];
int i = 0, j = 0, useOdd = 0, totalCount = 0;
for (i=0; i<=9; i++)
odd[i] = 1; // for j = 1
for (j=2; j<=n; j++) // Bottom Up calculation from j = 2 to n
{
useOdd = 1 - useOdd;
// Here we are explicitly writing lines for each number 0
// to 9. But it can always be written as DFS on 4X3 grid
// using row, column array valid moves
if(useOdd == 1)
{
even[0] = odd[0] + odd[8];
even[1] = odd[1] + odd[2] + odd[4];
even[2] = odd[2] + odd[1] + odd[3] + odd[5];
even[3] = odd[3] + odd[2] + odd[6];
even[4] = odd[4] + odd[1] + odd[5] + odd[7];
even[5] = odd[5] + odd[2] + odd[4] + odd[8] + odd[6];
even[6] = odd[6] + odd[3] + odd[5] + odd[9];
even[7] = odd[7] + odd[4] + odd[8];
even[8] = odd[8] + odd[0] + odd[5] + odd[7] + odd[9];
even[9] = odd[9] + odd[6] + odd[8];
}
else
{
odd[0] = even[0] + even[8];
odd[1] = even[1] + even[2] + even[4];
odd[2] = even[2] + even[1] + even[3] + even[5];
odd[3] = even[3] + even[2] + even[6];
odd[4] = even[4] + even[1] + even[5] + even[7];
odd[5] = even[5] + even[2] + even[4] + even[8] + even[6];
odd[6] = even[6] + even[3] + even[5] + even[9];
odd[7] = even[7] + even[4] + even[8];
odd[8] = even[8] + even[0] + even[5] + even[7] + even[9];
odd[9] = even[9] + even[6] + even[8];
}
}
// Get count of all possible numbers of length "n" starting
// with digit 0, 1, 2, ..., 9
totalCount = 0;
if(useOdd == 1)
{
for (i=0; i<=9; i++)
totalCount += even[i];
}
else
{
for (i=0; i<=9; i++)
totalCount += odd[i];
}
return totalCount;
}
// Driver program to test above function
int main()
{
char keypad[4][3] = {{'1','2','3'},
{'4','5','6'},
{'7','8','9'},
{'*','0','#'}
};
printf("Count for numbers of length %d: %d\n", 1, getCount(keypad, 1));
printf("Count for numbers of length %d: %d\n", 2, getCount(keypad, 2));
printf("Count for numbers of length %d: %d\n", 3, getCount(keypad, 3));
printf("Count for numbers of length %d: %d\n", 4, getCount(keypad, 4));
printf("Count for numbers of length %d: %d\n", 5, getCount(keypad, 5));
return 0;
}

Output:
Count for numbers of length 1: 10
Count for numbers of length 2: 36
Count for numbers of length 3: 138
Count for numbers of length 4: 532
Count for numbers of length 5: 2062
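The comment in the program above notes that the ten hand-written lines per digit can instead be generated from row and column move arrays. A minimal sketch of that variant follows, assuming the same 4x3 keypad layout and the same rule (stay on the key, or move one step up, down, left, or right, never onto '*' or '#'). The names countKeypadNumbers, rowMove, and colMove are illustrative and do not appear in the original program.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper (not from the original program): counts numbers of
// length n that can be typed when each next key is the same key or one of
// its up/down/left/right neighbours on a 4x3 phone keypad.
std::uint64_t countKeypadNumbers(int n) {
    if (n <= 0) return 0;

    const char keypad[4][3] = {{'1', '2', '3'},
                               {'4', '5', '6'},
                               {'7', '8', '9'},
                               {'*', '0', '#'}};
    // Allowed moves: stay, up, down, left, right.
    const int rowMove[5] = {0, -1, 1, 0, 0};
    const int colMove[5] = {0, 0, 0, -1, 1};

    // prev[d] = count of numbers of the previous length that start with digit d.
    std::array<std::uint64_t, 10> prev;
    prev.fill(1);  // length 1: exactly one number per digit

    for (int len = 2; len <= n; ++len) {
        std::array<std::uint64_t, 10> curr{};  // zero-initialised
        for (int r = 0; r < 4; ++r) {
            for (int c = 0; c < 3; ++c) {
                const char key = keypad[r][c];
                if (key == '*' || key == '#') continue;
                for (int k = 0; k < 5; ++k) {
                    const int nr = r + rowMove[k];
                    const int nc = c + colMove[k];
                    if (nr < 0 || nr >= 4 || nc < 0 || nc >= 3) continue;
                    const char next = keypad[nr][nc];
                    if (next == '*' || next == '#') continue;
                    curr[key - '0'] += prev[next - '0'];
                }
            }
        }
        prev = curr;
    }

    std::uint64_t total = 0;
    for (std::uint64_t ways : prev) total += ways;
    return total;
}
```

Calling countKeypadNumbers(n) for n = 1 to 5 should reproduce the counts listed in the output above. Only two arrays of ten counters are kept, so the extra space stays constant, as in the odd[]/even[] version.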

Count of n digit numbers whose sum of digits equals to given sum


Given two integers n and sum, find count of all n digit numbers with sum of digits as sum. Leading 0s are not counted as digits.
1 <= n <= 100 and 1 <= sum <= 50000
Example:
Input: n = 2, sum = 2
Output: 2
Explanation: Numbers are 11 and 20
Input: n = 2, sum = 5
Output: 5
Explanation: Numbers are 14, 23, 32, 41 and 50
Input: n = 3, sum = 6
Output: 21

The idea is simple: we try every digit from 0 to 9, subtract it from the given sum, and recur for the remaining digits with the reduced sum. Below is the recursive formula.
countRec(n, sum) = Σ countRec(n-1, sum-x)
where 0 <= x <= 9 and
sum-x >= 0
One important observation is that leading 0's must be
handled explicitly, as they are not counted as digits.
So our final count can be written as below.
finalCount(n, sum) = Σ countRec(n-1, sum-x)
where 1 <= x <= 9 and
sum-x >= 0

Below is a simple recursive solution based on above recursive formula.


// A recursive program to count numbers with sum
// of digits as given 'sum'
#include<bits/stdc++.h>
using namespace std;
// Recursive function to count 'n' digit numbers
// with sum of digits as 'sum'. This function
// considers leading 0's also as digits, that is
// why not directly called
unsigned long long int countRec(int n, int sum)
{
// Base case
if (n == 0)
return sum == 0;
// Initialize answer
unsigned long long int ans = 0;
// Traverse through every digit and count
// numbers beginning with it using recursion
for (int i=0; i<=9; i++)
if (sum-i >= 0)
ans += countRec(n-1, sum-i);
return ans;
}
// This is mainly a wrapper over countRec. It
// explicitly handles leading digit and calls
// countRec() for remaining digits.
unsigned long long int finalCount(int n, int sum)
{
// Initialize final answer
unsigned long long int ans = 0;
// Traverse through every digit from 1 to
// 9 and count numbers beginning with it
for (int i = 1; i <= 9; i++)
if (sum-i >= 0)
ans += countRec(n-1, sum-i);
return ans;
}
// Driver program
int main()

{
int n = 2, sum = 5;
cout << finalCount(n, sum);
return 0;
}

Output:
5

The time complexity of the above solution is exponential. If we draw the complete recursion tree, we can observe that many subproblems are solved again and again. For example, if we start with n = 3 and sum = 10, we can reach n = 1, sum = 8 by choosing the digit sequence 1, 1 or the digit sequence 2, 0.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the digit sum count problem has both properties of a dynamic programming problem (overlapping subproblems and optimal substructure).
Below is a memoization based C++ implementation.
// A memoization based recursive program to count
// numbers with sum of digits as given 'sum'
#include<bits/stdc++.h>
using namespace std;
// A lookup table used for memoization
unsigned long long int lookup[101][50001];
// Memoization based implementation of recursive
// function
unsigned long long int countRec(int n, int sum)
{
// Base case
if (n == 0)
return sum == 0;
// If this subproblem is already evaluated,
// return the evaluated value
if (lookup[n][sum] != -1)
return lookup[n][sum];
// Initialize answer
unsigned long long int ans = 0;
// Traverse through every digit and
// recursively count numbers beginning
// with it
for (int i=0; i<10; i++)
if (sum-i >= 0)
ans += countRec(n-1, sum-i);
return lookup[n][sum] = ans;
}
// This is mainly a wrapper over countRec. It
// explicitly handles leading digit and calls
// countRec() for remaining digits.
unsigned long long int finalCount(int n, int sum)
{
// Initialize all entries of lookup table
memset(lookup, -1, sizeof lookup);
// Initialize final answer
unsigned long long int ans = 0;
// Traverse through every digit from 1 to
// 9 and count numbers beginning with it
for (int i = 1; i <= 9; i++)
if (sum-i >= 0)
ans += countRec(n-1, sum-i);
return ans;
}
// Driver program
int main()
{
int n = 3, sum = 5;
cout << finalCount(n, sum);
return 0;
}

Output:
15

Thanks to Gaurav Ahirwar for suggesting above solution.
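The same recurrence can also be filled in bottom-up instead of being memoized. The sketch below is one possible tabulated variant, not the author's program: the function name countNDigitNumbersWithSum and the rolling array ways are assumptions made for illustration. Like the programs above, it uses unsigned 64-bit counters, so the result is only meaningful while the true count fits in 64 bits.

```cpp
#include <vector>

// Hypothetical tabulated variant of finalCount(n, sum).
// ways[s] holds the number of digit strings of the current length
// (leading zeros allowed) whose digits add up to s.
unsigned long long countNDigitNumbersWithSum(int n, int sum) {
    if (n <= 0 || sum < 0 || sum > 9 * n) return 0;

    std::vector<unsigned long long> ways(sum + 1, 0);
    ways[0] = 1;  // the empty string has digit sum 0

    // Build the counts for the n-1 trailing digits (0..9 allowed).
    for (int len = 1; len <= n - 1; ++len) {
        std::vector<unsigned long long> next(sum + 1, 0);
        for (int s = 0; s <= sum; ++s)
            for (int digit = 0; digit <= 9 && digit <= s; ++digit)
                next[s] += ways[s - digit];
        ways = next;
    }

    // The leading digit must be 1..9, as in finalCount().
    unsigned long long total = 0;
    for (int first = 1; first <= 9 && first <= sum; ++first)
        total += ways[sum - first];
    return total;
}
```

For the examples given earlier, countNDigitNumbersWithSum(2, 2), (2, 5), and (3, 6) evaluate to 2, 5, and 21. The table needs only O(sum) space because each length depends on the previous length alone.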

Minimum Initial Points to Reach Destination


Given a grid with each cell containing positive, negative, or zero points. We can move across a cell only if we have positive points (> 0). Whenever we pass through a cell, the points in that cell are added to our overall points. We need to find the minimum initial points required to reach cell (m-1, n-1) from (0, 0).
Constraints :

Total number of non-decreasing numbers with n digits


A number is non-decreasing if every digit (except the first one) is greater than or equal to previous digit. For example, 223, 4455567, 899, are
non-decreasing numbers.
So, given the number of digits n, you are required to find the count of total non-decreasing numbers with n digits.
Examples:
Input: n = 1
Output: count = 10
Input: n = 2
Output: count = 55
Input: n = 3
Output: count = 220

One way to look at the problem is that the count of n digit numbers equals the count of n digit numbers ending with 9, plus the count ending with digit 8, plus the count ending with 7, and so on. How do we get the count ending with a particular digit? We can recur for length n-1 and digits smaller than or equal to the last digit. So below is the recursive formula.
Count of n digit numbers = (Count of n digit numbers ending with digit 9) +
(Count of n digit numbers ending with digit 8) +
.............................................+
.............................................+
(Count of n digit numbers ending with digit 0)

Let count ending with digit d and length n be count(n, d)

count(n, d) = Σ count(n-1, i) where i varies from 0 to d
Total count = Σ count(n, d) where d varies from 0 to 9

The above recursive solution is going to have many overlapping subproblems. Therefore, we can use Dynamic Programming to build a table in
bottom up manner. Below is Dynamic programming based C++ program.
// C++ program to count non-decreasing number with n digits
#include<bits/stdc++.h>
using namespace std;
long long int countNonDecreasing(int n)
{
// dp[i][j] contains total count of non decreasing
// numbers ending with digit i and of length j
long long int dp[10][n+1];
memset(dp, 0, sizeof dp);
// Fill table for non decreasing numbers of length 1
// Base cases 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
for (int i = 0; i < 10; i++)
dp[i][1] = 1;
// Fill the table in bottom-up manner
for (int digit = 0; digit <= 9; digit++)
{
// Compute total numbers of non decreasing
// numbers of length 'len'
for (int len = 2; len <= n; len++)
{
// sum of all numbers of length of len-1
// in which last digit x is <= 'digit'
for (int x = 0; x <= digit; x++)
dp[digit][len] += dp[x][len-1];
}
}
long long int count = 0;
// The total count of non-decreasing numbers of length n
// will be dp[0][n] + dp[1][n] ..+ dp[9][n]
for (int i = 0; i < 10; i++)
count += dp[i][n];
return count;
}
// Driver program

int main()
{
int n = 3;
cout << countNonDecreasing(n);
return 0;
}

Output:
220

Thanks to Gaurav Ahirwar for suggesting above method.


Another method is based on below direct formula
Count of non-decreasing numbers with n digits =
N*(N+1)/2*(N+2)/3* ....*(N+n-1)/n
Where N = 10

Below is a C++ program to compute count using above formula.


// C++ program to count non-decreasing number with n digits
#include<bits/stdc++.h>
using namespace std;
long long int countNonDecreasing(int n)
{
int N = 10;
// Compute value of N*(N+1)/2*(N+2)/3* ....*(N+n-1)/n
long long count = 1;
for (int i=1; i<=n; i++)
{
count *= (N+i-1);
count /= i;
}
return count;
}
// Driver program
int main()
{
int n = 3;
cout << countNonDecreasing(n);
return 0;
}

Output:
220

Thanks to Abhishek Somani for suggesting this method.


How does this formula work?
N * (N+1)/2 * (N+2)/3 * .... * (N+n-1)/n
Where N = 10

Let us try for different values of n.


For n = 1, the value is N from formula.
Which is true as for n = 1, we have all single digit
numbers, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
For n = 2, the value is N(N+1)/2 from formula.
We can have N numbers beginning with 0, (N-1) numbers
beginning with 1, and so on.
So sum is N + (N-1) + .... + 1 = N(N+1)/2
For n = 3, the value is N(N+1)/2 * (N+2)/3 from formula.
We can have N(N+1)/2 numbers beginning with 0, (N-1)N/2
numbers beginning with 1 (Note that when we begin with 1,
we have N-1 digits left to consider for remaining places),
(N-2)(N-1)/2 beginning with 2, and so on.
Count = N(N+1)/2 + (N-1)N/2 + (N-2)(N-1)/2 +
(N-3)(N-2)/2 + .... + 3 + 1
[Combining first 2 terms, next 2 terms and so on]
= N^2 + (N-2)^2 + .... + 4
= N*(N+1)*(N+2)/6 [Sum of squares of the first n even
numbers is 2n(n+1)(2n+1)/3; put n = N/2]

For general n digit case, we can apply Mathematical Induction. The count equals the count of n digit numbers beginning with 0, which is the n-1 digit count over N digits, i.e., N*(N+1)/2*(N+2)/3* ... *(N+n-1-1)/(n-1), plus the count of n digit numbers beginning with 1, which is the n-1 digit count over N-1 digits, i.e., (N-1)*(N)/2*(N+1)/3* ... *(N-1+n-1-1)/(n-1) (Note that N is replaced by N-1), and so on.
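A quick way to gain confidence in the formula is to compare it against direct enumeration for small n. The sketch below does that under the same convention as the programs above (leading zeros are allowed, so n = 1 gives 10). The names countFrom, countByEnumeration, and countByFormula are hypothetical helpers introduced only for this check.

```cpp
#include <cstdint>

// Counts non-decreasing digit strings with 'remaining' digits still to
// place, where every remaining digit must be >= minDigit.
std::uint64_t countFrom(int minDigit, int remaining) {
    if (remaining == 0) return 1;  // one complete string
    std::uint64_t total = 0;
    for (int d = minDigit; d <= 9; ++d)
        total += countFrom(d, remaining - 1);
    return total;
}

// Direct enumeration: exponential in n, intended for small n only.
std::uint64_t countByEnumeration(int n) {
    return countFrom(0, n);
}

// The product formula N*(N+1)/2*(N+2)/3*...*(N+n-1)/n with N = 10.
// After step i the running value is the binomial coefficient C(9+i, i),
// so each integer division is exact.
std::uint64_t countByFormula(int n) {
    const std::uint64_t N = 10;
    std::uint64_t count = 1;
    for (int i = 1; i <= n; ++i) {
        count *= (N + i - 1);
        count /= i;
    }
    return count;
}
```

Both functions should agree for every small n that is tried, for example n = 1, 2, 3 giving 10, 55, 220. The agreement reflects that the product is the binomial coefficient C(N+n-1, n), the number of multisets of size n drawn from N digits.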

Find length of the longest consecutive path from a given starting character
Given a matrix of characters. Find length of the longest path from a given character, such that all characters in the path are consecutive to each
other, i.e., every character in path is next to previous in alphabetical order. It is allowed to move in all 8 directions from a cell.

Example
Input: mat[][] = { {a, c, d},
                   {h, b, e},
                   {i, g, f}}
Starting Point = 'e'

Output: 5
If starting point is 'e', then longest path with consecutive
characters is "e f g h i".
Input: mat[R][C] = { {b, e, f},
{h, d, a},
{i, c, a}};
Starting Point = 'b'
Output: 1
'c' is not present in any of the adjacent cells of 'b'

The idea is to first search given starting character in the given matrix. Do Depth First Search (DFS) from all occurrences to find all consecutive
paths. While doing DFS, we may encounter many subproblems again and again. So we use dynamic programming to store results of subproblems.
Below is C++ implementation of above idea.
// C++ program to find the longest consecutive path
#include<bits/stdc++.h>
#define R 3
#define C 3
using namespace std;
// tool matrices to recur for adjacent cells.
int x[] = {0, 1, 1, -1, 1, 0, -1, -1};
int y[] = {1, 0, 1, 1, -1, -1, 0, -1};
// dp[i][j] Stores length of longest consecutive path
// starting at arr[i][j].
int dp[R][C];
// check whether mat[i][j] is a valid cell or not.
bool isvalid(int i, int j)
{
if (i < 0 || j < 0 || i >= R || j >= C)
return false;
return true;
}
// Check whether current character is adjacent to previous
// character (character processed in parent call) or not.
bool isadjacent(char prev, char curr)
{
return ((curr - prev) == 1);
}

// i, j are the indices of the current cell and prev is the
// character processed in the parent call. Also, mat[i][j]
// is our current character.
int getLenUtil(char mat[R][C], int i, int j, char prev)
{
// If this cell is not valid or current character is not
// adjacent to previous one (e.g. d is not adjacent to b),
// then return 0.
if (!isvalid(i, j) || !isadjacent(prev, mat[i][j]))
return 0;
// If this subproblem is already solved , return the answer
if (dp[i][j] != -1)
return dp[i][j];
int ans = 0; // Initialize answer
// recur for paths with different adjacent cells and store
// the length of longest path.
for (int k=0; k<8; k++)
ans = max(ans, 1 + getLenUtil(mat, i + x[k],
j + y[k], mat[i][j]));
// save the answer and return
return dp[i][j] = ans;
}
// Returns length of the longest path with all characters consecutive
// to each other. This function first initializes dp array that
// is used to store results of subproblems, then it calls
// recursive DFS based function getLenUtil() to find max length path
int getLen(char mat[R][C], char s)
{
memset(dp, -1, sizeof dp);
int ans = 0;
for (int i=0; i<R; i++)
{
for (int j=0; j<C; j++)
{
// check for each possible starting point
if (mat[i][j] == s) {
// recur for all eight adjacent cells
for (int k=0; k<8; k++)
ans = max(ans, 1 + getLenUtil(mat,
i + x[k], j + y[k], s));
}
}
}
return ans;
}
// Driver program
int main() {
char mat[R][C] = { {'a','c','d'},
{ 'h','b','a'},
{ 'i','g','f'}};
cout << getLen(mat, 'a') << endl;
cout << getLen(mat, 'e') << endl;
cout << getLen(mat, 'b') << endl;
cout << getLen(mat, 'f') << endl;
return 0;
}

Output:
4
0
3
4

Thanks to Gaurav Ahirwar for above solution.

Find minimum number of coins that make a given value


Given a value V, if we want to make change for V cents, and we have infinite supply of each of C = { C1, C2, .. , Cm} valued coins, what is the
minimum number of coins to make the change?
Examples:
Input: coins[] = {25, 10, 5}, V = 30
Output: Minimum 2 coins required
We can use one coin of 25 cents and one of 5 cents
Input: coins[] = {9, 6, 5, 1}, V = 11
Output: Minimum 2 coins required
We can use one coin of 6 cents and 1 coin of 5 cents

This problem is a variation of the Coin Change Problem. Here, instead of finding the total number of possible solutions, we need to
find the solution with the minimum number of coins.
The minimum number of coins for a value V can be computed using below recursive formula.
If V == 0, then 0 coins required.
If V > 0
minCoins(coins[0..m-1], V) = min {1 + minCoins(coins[0..m-1], V-coins[i])}
where i varies from 0 to m-1
and coins[i] <= V

Below is recursive solution based on above recursive formula.


// A Naive recursive C++ program to find minimum of coins
// to make a given change V
#include<bits/stdc++.h>
using namespace std;
// m is size of coins array (number of different coins)
int minCoins(int coins[], int m, int V)
{
// base case
if (V == 0) return 0;
// Initialize result
int res = INT_MAX;
// Try every coin that has smaller value than V
for (int i=0; i<m; i++)
{
if (coins[i] <= V)
{
int sub_res = minCoins(coins, m, V-coins[i]);
// Check for INT_MAX to avoid overflow and see if
// result can be minimized
if (sub_res != INT_MAX && sub_res + 1 < res)
res = sub_res + 1;
}
}
return res;
}
// Driver program to test above function
int main()
{
int coins[] = {9, 6, 5, 1};
int m = sizeof(coins)/sizeof(coins[0]);
int V = 11;
cout << "Minimum coins required is "
<< minCoins(coins, m, V);
return 0;
}

Output:
Minimum coins required is 2

The time complexity of the above solution is exponential. If we draw the complete recursion tree, we can observe that many subproblems are solved again and again. For example, when we start from V = 11, we can reach 6 by subtracting 1 five times or by subtracting 5 once. So the subproblem for 6 is called twice.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the min coins problem has both properties of a dynamic programming problem (overlapping subproblems and optimal substructure). Like other typical Dynamic Programming (DP) problems, recomputation of the same subproblems can be avoided by constructing a temporary array table[] in a bottom-up manner. Below is a Dynamic Programming based solution.
// A Dynamic Programming based C++ program to find minimum of coins
// to make a given change V
#include<bits/stdc++.h>
using namespace std;
// m is size of coins array (number of different coins)
int minCoins(int coins[], int m, int V)
{
// table[i] will be storing the minimum number of coins
// required for i value. So table[V] will have result
int table[V+1];
// Base case (If given value V is 0)
table[0] = 0;
// Initialize all table values as Infinite
for (int i=1; i<=V; i++)
table[i] = INT_MAX;
// Compute minimum coins required for all
// values from 1 to V
for (int i=1; i<=V; i++)
{
// Go through all coins smaller than i
for (int j=0; j<m; j++)
if (coins[j] <= i)
{
int sub_res = table[i-coins[j]];
if (sub_res != INT_MAX && sub_res + 1 < table[i])
table[i] = sub_res + 1;
}
}
return table[V];
}
// Driver program to test above function
int main()
{
int coins[] = {9, 6, 5, 1};
int m = sizeof(coins)/sizeof(coins[0]);
int V = 11;
cout << "Minimum coins required is "
<< minCoins(coins, m, V);
return 0;
}

Output:
Minimum coins required is 2

Time complexity of the above solution is O(mV).
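When no combination of coins adds up to V, the program above returns INT_MAX, which is easy to misread as a real answer. The sketch below shows one way to make that case explicit. It is an illustrative variant, not the original solution: the name minCoinsOrMinusOne and the choice of -1 as the "impossible" marker are assumptions, and std::vector replaces the variable-length array, which is not standard C++.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical variant of minCoins(): returns the minimum number of coins
// that sum to V, or -1 when V cannot be formed from the given coins.
int minCoinsOrMinusOne(const std::vector<int>& coins, int V) {
    if (V < 0) return -1;

    const int INF = std::numeric_limits<int>::max();
    // table[v] = minimum coins needed for value v, INF if unreachable.
    std::vector<int> table(V + 1, INF);
    table[0] = 0;

    for (int value = 1; value <= V; ++value) {
        for (int coin : coins) {
            // Skip non-positive coins and coins larger than the value;
            // also skip unreachable sub-values so INF + 1 never overflows.
            if (coin > 0 && coin <= value && table[value - coin] != INF)
                table[value] = std::min(table[value], table[value - coin] + 1);
        }
    }
    return table[V] == INF ? -1 : table[V];
}
```

With coins {9, 6, 5, 1} and V = 11 this returns 2, matching the output above, while coins {5, 10} and V = 3 yield -1. The running time is still O(mV).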


Thanks to Goku for suggesting the above solution in a comment and thanks to Vignesh Mohan for suggesting this problem and initial solution.

Collect maximum points in a grid using two traversals


Given a matrix where every cell represents points. How to collect maximum points using two traversals under following conditions?
Let the dimensions of given grid be R x C.
1) The first traversal starts from the top left corner, i.e., (0, 0), and should reach the bottom left corner, i.e., (R-1, 0). The second traversal starts from the top right corner, i.e., (0, C-1), and should reach the bottom right corner, i.e., (R-1, C-1).
2) From a point (i, j), we can move to (i+1, j+1) or (i+1, j-1) or (i+1, j).
3) A traversal gets all points of a particular cell through which it passes. If one traversal has already collected the points of a cell, then the other traversal gets no points if it goes through that cell again.
Input :
int arr[R][C] = {{3, 6, 8, 2},
                 {5, 2, 4, 3},
                 {1, 1, 20, 10},
                 {1, 1, 20, 10},
                 {1, 1, 20, 10},
                };

Output: 73
Explanation :

First traversal collects total points of value 3 + 2 + 20 + 1 + 1 = 27
Second traversal collects total points of value 2 + 4 + 10 + 20 + 10 = 46.
Total Points collected = 27 + 46 = 73.

Source: http://qa.geeksforgeeks.org/1485/running-through-the-grid-to-get-maximum-nutritional-value
Both traversals always move forward along x
Base Cases:
// If destinations reached
if (x == R-1 && y1 == 0 && y2 == C-1)
maxPoints(arr, x, y1, y2) = arr[x][y1] + arr[x][y2];
// If any of the two locations is invalid (going out of grid)
if input is not valid
maxPoints(arr, x, y1, y2) = -INF (minus infinite)
// If both traversals are at same cell, then we count the value of cell
// only once.
If y1 and y2 are same
result = arr[x][y1]
Else
result = arr[x][y1] + arr[x][y2]
result += max { // Max of 9 cases
maxPoints(arr, x+1, y1+1, y2),
maxPoints(arr, x+1, y1+1, y2+1),
maxPoints(arr, x+1, y1+1, y2-1),
maxPoints(arr, x+1, y1-1, y2),
maxPoints(arr, x+1, y1-1, y2+1),
maxPoints(arr, x+1, y1-1, y2-1),
maxPoints(arr, x+1, y1, y2),
maxPoints(arr, x+1, y1, y2+1),
maxPoints(arr, x+1, y1, y2-1)
}

The above recursive solution has many subproblems that are solved again and again. Therefore, we can use Dynamic Programming to solve the
above problem more efficiently. Below is memoization (Memoization is alternative to table based iterative solution in Dynamic Programming) based
implementation. In below implementation, we use a memoization table mem to keep track of already solved problems.
// A Memoization based program to find maximum collection
// using two traversals of a grid

#include<bits/stdc++.h>
using namespace std;
#define R 5
#define C 4
// checks whether a given input is valid or not
bool isValid(int x, int y1, int y2)
{
return (x >= 0 && x < R && y1 >=0 &&
y1 < C && y2 >=0 && y2 < C);
}
// Driver function to collect max value
int getMaxUtil(int arr[R][C], int mem[R][C][C], int x, int y1, int y2)
{
/*---------- BASE CASES -----------*/
// if P1 or P2 is at an invalid cell
if (!isValid(x, y1, y2)) return INT_MIN;
// if both traversals reach their destinations
if (x == R-1 && y1 == 0 && y2 == C-1)
return arr[x][y1] + arr[x][y2];
// If both traversals are at last row but not at their destination
if (x == R-1) return INT_MIN;
// If subproblem is already solved
if (mem[x][y1][y2] != -1) return mem[x][y1][y2];
// Initialize answer for this subproblem
int ans = INT_MIN;
// this variable is used to store gain of current cell(s)
int temp = (y1 == y2)? arr[x][y1]: arr[x][y1] + arr[x][y2];
/* Recur for all possible cases, then store and return the
one with max value */
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1, y2-1));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1, y2+1));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1, y2));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1-1, y2));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1-1, y2-1));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1-1, y2+1));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1+1, y2));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1+1, y2-1));
ans = max(ans, temp + getMaxUtil(arr, mem, x+1, y1+1, y2+1));
return (mem[x][y1][y2] = ans);
}
// This is mainly a wrapper over recursive function getMaxUtil().
// This function creates a table for memoization and calls
// getMaxUtil()
int geMaxCollection(int arr[R][C])
{
// Create a memoization table and initialize all entries as -1
int mem[R][C][C];
memset(mem, -1, sizeof(mem));
// Calculation maximum value using memoization based function
// getMaxUtil()
return getMaxUtil(arr, mem, 0, 0, C-1);
}
// Driver program to test above functions
int main()
{
int arr[R][C] = {{3, 6, 8, 2},
{5, 2, 4, 3},
{1, 1, 20, 10},
{1, 1, 20, 10},
{1, 1, 20, 10},
};
cout << "Maximum collection is " << geMaxCollection(arr);
return 0;
}

Output:
Maximum collection is 73

Thanks to Gaurav Ahirwar for suggesting above problem and solution.

Shortest Common Supersequence


Given two strings str1 and str2, find the shortest string that has both str1 and str2 as subsequences.
Examples:
Input: str1 = "geek", str2 = "eke"
Output: "geeke"
Input: str1 = "AGGTAB", str2 = "GXTXAYB"
Output: "AGXGTXAYB"

This problem is closely related to longest common subsequence problem. Below are steps.
1) Find Longest Common Subsequence (lcs) of two given strings. For example, lcs of geek and eke is ek.
2) Insert non-lcs characters (in their original order in strings) to the lcs found above, and return the result. So ek becomes geeke which is shortest
common supersequence.
Let us consider another example, str1 = AGGTAB and str2 = GXTXAYB. LCS of str1 and str2 is GTAB. Once we find LCS, we insert
characters of both strings in order and we get AGXGTXAYB
How does this work?
We need to find a string that has both strings as subsequences and is the shortest such string. If the two strings have no characters in common, then the length of the result is the sum of the lengths of the two given strings. If there are common characters, then we don't want them multiple times, as the task is to minimize length. Therefore, we first find the longest common subsequence, take one occurrence of this subsequence, and add the extra characters.
Length of the shortest supersequence = (Sum of lengths of given two strings) - (Length of LCS of two given strings)

Below is C implementation of above idea. The below implementation only finds length of the shortest supersequence.
/* C program to find length of the shortest supersequence */
#include<stdio.h>
#include<string.h>
/* Utility function to get max of 2 integers */
int max(int a, int b) { return (a > b)? a : b; }
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n);
// Function to find length of the shortest supersequence
// of X and Y.
int shortestSuperSequence(char *X, char *Y)
{
int m = strlen(X), n = strlen(Y);
int l = lcs(X, Y, m, n); // find lcs
// Result is sum of input string lengths - length of lcs
return (m + n - l);
}
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n)
{
int L[m+1][n+1];
int i, j;
/* Following steps build L[m+1][n+1] in bottom up fashion.
Note that L[i][j] contains length of LCS of X[0..i-1]
and Y[0..j-1] */
for (i=0; i<=m; i++)
{
for (j=0; j<=n; j++)
{
if (i == 0 || j == 0)
L[i][j] = 0;
else if (X[i-1] == Y[j-1])
L[i][j] = L[i-1][j-1] + 1;
else
L[i][j] = max(L[i-1][j], L[i][j-1]);
}
}

/* L[m][n] contains length of LCS for X[0..m-1] and
Y[0..n-1] */
return L[m][n];
}
/* Driver program to test above function */
int main()
{
char X[] = "AGGTAB";
char Y[] = "GXTXAYB";
printf("Length of the shortest supersequence is %d\n",
shortestSuperSequence(X, Y));
return 0;
}

Output:
Length of the shortest supersequence is 9

Below is Another Method to solve the above problem.


A simple analysis yields below simple recursive solution.
Let X[0..m-1] and Y[0..n-1] be two strings and m and n be their respective
lengths.
if (m == 0) return n;
if (n == 0) return m;
// If last characters are same, then add 1 to result and
// recur for X[0..m-2] and Y[0..n-2]
if (X[m-1] == Y[n-1])
return 1 + SCS(X, Y, m-1, n-1);
// Else find shortest of following two
// a) Remove last character from X and recur
// b) Remove last character from Y and recur
else return 1 + min( SCS(X, Y, m-1, n), SCS(X, Y, m, n-1) );

Below is simple naive recursive solution based on above recursive formula.


/* A Naive recursive C++ program to find length
of the shortest supersequence */
#include<bits/stdc++.h>
using namespace std;
int superSeq(char* X, char* Y, int m, int n)
{
if (!m) return n;
if (!n) return m;
if (X[m-1] == Y[n-1])
return 1 + superSeq(X, Y, m-1, n-1);
return 1 + min(superSeq(X, Y, m-1, n),
superSeq(X, Y, m, n-1));
}
// Driver program to test above function
int main()
{
char X[] = "AGGTAB";
char Y[] = "GXTXAYB";
cout << "Length of the shortest supersequence is "
<< superSeq(X, Y, strlen(X), strlen(Y));
return 0;
}

Output:
Length of the shortest supersequence is 9

Time complexity of the above solution is exponential, O(2^min(m, n)). Since there are overlapping subproblems, we can efficiently solve this recursive
problem using Dynamic Programming. Below is a Dynamic Programming based implementation. Time complexity of this solution is O(mn).
/* A dynamic programming based C++ program to find length
of the shortest supersequence */

#include<bits/stdc++.h>
using namespace std;
// Returns length of the shortest supersequence of X and Y
int superSeq(char* X, char* Y, int m, int n)
{
int dp[m+1][n+1];
// Fill table in bottom up manner
for (int i = 0; i <= m; i++)
{
for (int j = 0; j <= n; j++)
{
// Below steps follow above recurrence
if (!i)
dp[i][j] = j;
else if (!j)
dp[i][j] = i;
else if (X[i-1] == Y[j-1])
dp[i][j] = 1 + dp[i-1][j-1];
else
dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1]);
}
}
return dp[m][n];
}
// Driver program to test above function
int main()
{
char X[] = "AGGTAB";
char Y[] = "GXTXAYB";
cout << "Length of the shortest supersequence is "
<< superSeq(X, Y, strlen(X), strlen(Y));
return 0;
}

Output:
Length of the shortest supersequence is 9

Thanks to Gaurav Ahirwar for suggesting this solution.


Exercise:
Extend the above program to print shortest supersequence also using function to print LCS.
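One possible approach to the exercise, given as a sketch rather than a reference solution: fill the same dp table as above, then walk back from dp[m][n] and emit one character per step. The names shortestSupersequence and isSubsequence are hypothetical. When several shortest supersequences exist, the tie-breaking rule in the walk decides which one is produced, so the result may differ from the strings shown in the examples while having the same length.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns one shortest common supersequence of a and b.
std::string shortestSupersequence(const std::string& a, const std::string& b) {
    const std::size_t m = a.size(), n = b.size();
    // dp[i][j] = length of the shortest supersequence of a[0..i-1], b[0..j-1]
    std::vector<std::vector<std::size_t>> dp(m + 1, std::vector<std::size_t>(n + 1, 0));
    for (std::size_t i = 0; i <= m; ++i) {
        for (std::size_t j = 0; j <= n; ++j) {
            if (i == 0) dp[i][j] = j;
            else if (j == 0) dp[i][j] = i;
            else if (a[i - 1] == b[j - 1]) dp[i][j] = 1 + dp[i - 1][j - 1];
            else dp[i][j] = 1 + std::min(dp[i - 1][j], dp[i][j - 1]);
        }
    }

    // Walk back from dp[m][n], collecting characters in reverse order.
    std::string result;
    std::size_t i = m, j = n;
    while (i > 0 && j > 0) {
        if (a[i - 1] == b[j - 1]) {           // shared character: emit once
            result.push_back(a[i - 1]); --i; --j;
        } else if (dp[i - 1][j] <= dp[i][j - 1]) {
            result.push_back(a[i - 1]); --i;  // cheaper to drop from a
        } else {
            result.push_back(b[j - 1]); --j;  // cheaper to drop from b
        }
    }
    while (i > 0) result.push_back(a[--i]);   // leftover prefix of a
    while (j > 0) result.push_back(b[--j]);   // leftover prefix of b
    std::reverse(result.begin(), result.end());
    return result;
}

// Hypothetical checker: true if 'sub' is a subsequence of 'text'.
bool isSubsequence(const std::string& sub, const std::string& text) {
    std::size_t k = 0;
    for (char ch : text)
        if (k < sub.size() && sub[k] == ch) ++k;
    return k == sub.size();
}
```

For "AGGTAB" and "GXTXAYB" the returned string has length 9, and isSubsequence confirms that both inputs are subsequences of it. Time and space are both O(mn).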
References:
https://en.wikipedia.org/wiki/Shortest_common_supersequence

Compute sum of digits in all numbers from 1 to n


Given a number n, find sum of digits in all numbers from 1 to n.
Examples:
Input: n = 5
Output: Sum of digits in numbers from 1 to 5 = 15
Input: n = 12
Output: Sum of digits in numbers from 1 to 12 = 51
Input: n = 328
Output: Sum of digits in numbers from 1 to 328 = 3241

Naive Solution:
A naive solution is to go through every number x from 1 to n, and compute sum in x by traversing all digits of x. Below is C++ implementation of
this idea.
// A Simple C++ program to compute sum of digits in numbers from 1 to n
#include<iostream>
using namespace std;
int sumOfDigits(int );
// Returns sum of all digits in numbers from 1 to n
int sumOfDigitsFrom1ToN(int n)
{
int result = 0; // initialize result
// One by one compute sum of digits in every number from
// 1 to n
for (int x=1; x<=n; x++)
result += sumOfDigits(x);
return result;
}
// A utility function to compute sum of digits in a
// given number x
int sumOfDigits(int x)
{
int sum = 0;
while (x != 0)
{
sum += x %10;
x = x /10;
}
return sum;
}
// Driver Program
int main()
{
int n = 328;
cout << "Sum of digits in numbers from 1 to " << n << " is "
<< sumOfDigitsFrom1ToN(n);
return 0;
}

Output
Sum of digits in numbers from 1 to 328 is 3241

Efficient Solution:
Above is a naive solution. We can do it more efficiently by finding a pattern.
Let us take few examples.
sum(9) = 1 + 2 + 3 + 4 ........... + 9
= 9*10/2
= 45
sum(99) = 45 + (10 + 45) + (20 + 45) + ..... (90 + 45)
        = 45*10 + (10 + 20 + 30 ... 90)
        = 45*10 + 10(1 + 2 + ... 9)
        = 45*10 + 45*10
        = sum(9)*10 + 45*10
sum(999) = sum(99)*10 + 45*100

In general, we can compute sum(10^d - 1) using the formula below:
sum(10^d - 1) = sum(10^(d-1) - 1) * 10 + 45*(10^(d-1))
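The recurrence can be checked in isolation with a few lines of code. The following is a minimal sketch, assuming d is small enough for the result to fit in 64 bits; the function name sumOfDigitsUpToAllNines is an illustrative choice and is not part of the implementation given later.

```cpp
#include <cstdint>

// Hypothetical helper: sum of digits of all numbers from 1 to 10^d - 1,
// computed with sum(10^k - 1) = sum(10^(k-1) - 1) * 10 + 45 * 10^(k-1).
std::uint64_t sumOfDigitsUpToAllNines(int d) {
    std::uint64_t sum = 0;    // sum(10^0 - 1) = sum(0) = 0
    std::uint64_t power = 1;  // holds 10^(k-1) at the start of step k
    for (int k = 1; k <= d; ++k) {
        sum = sum * 10 + 45 * power;
        power *= 10;
    }
    return sum;
}
```

For d = 1, 2, 3 this gives 45, 900, and 13500, in line with sum(9), sum(99), and sum(999) as derived above.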

In below implementation, the above formula is implemented using dynamic programming as there are overlapping subproblems.
The above formula is one core step of the idea. Below is complete algorithm
Algorithm: sum(n)
1) Find number of digits minus one in n. Let this value be 'd'.
For 328, d is 2.
2) Compute sum of digits in numbers from 1 to 10^d - 1.
Let this sum be w. For 328, we compute sum of digits from 1 to
99 using above formula.
3) Find Most significant digit (msd) in n. For 328, msd is 3.
4) Overall sum is sum of following terms
a) Sum of digits in 1 to "msd * 10^d - 1". For 328, sum of
digits in numbers from 1 to 299.
For 328, we compute 3*sum(99) + (1 + 2)*100. Note that
sum(299) is sum(99) + sum of digits from 100 to 199 + sum of digits
from 200 to 299.
Sum of 100 to 199 is sum(99) + 1*100 and sum of 200 to 299 is sum(99) + 2*100.
In general, this sum can be computed as w*msd + (msd*(msd-1)/2)*10^d
b) Sum of digits in msd * 10^d to n. For 328, sum of digits in
300 to 328.
For 328, this sum is computed as 3*29 + recursive call "sum(28)"
In general, this sum can be computed as msd * (n % (msd*10^d) + 1)
+ sum(n % (10^d))

Below is C++ implementation of above algorithm.


// C++ program to compute sum of digits in numbers from 1 to n
#include<bits/stdc++.h>
using namespace std;
// Function to compute sum of digits in numbers from 1 to n
// Comments use example of 328 to explain the code
int sumOfDigitsFrom1ToN(int n)
{
// base case: if n<10 return sum of
// first n natural numbers
if (n<10)
return n*(n+1)/2;
// d = number of digits minus one in n. For 328, d is 2
int d = log10(n);
// a[i] stores sum of digits in numbers from 1 to 10^i - 1
// a[0] = 0
// a[1] = sum of digits from 1 to 9 = 45
// a[2] = sum of digits from 1 to 99 = a[1]*10 + 45*10^1 = 900
// a[3] = sum of digits from 1 to 999 = a[2]*10 + 45*10^2 = 13500
int *a = new int[d+1];
a[0] = 0, a[1] = 45;
for (int i=2; i<=d; i++)
a[i] = a[i-1]*10 + 45*ceil(pow(10,i-1));
// computing 10^d
int p = ceil(pow(10, d));
// Most significant digit (msd) of n,
// For 328, msd is 3 which can be obtained using 328/100
int msd = n/p;
// EXPLANATION FOR FIRST and SECOND TERMS IN BELOW LINE OF CODE
// First two terms compute sum of digits from 1 to 299
// (sum of digits in range 1-99 stored in a[d]) +
// (sum of digits in range 100-199, can be calculated as 1*100 + a[d]) +
// (sum of digits in range 200-299, can be calculated as 2*100 + a[d])
// The above sum can be written as 3*a[d] + (1+2)*100


// EXPLANATION FOR THIRD AND FOURTH TERMS IN BELOW LINE OF CODE
// The last two terms compute sum of digits in number from 300 to 328
// The third term adds 3*29 to sum as digit 3 occurs in all numbers
// from 300 to 328
// The fourth term recursively calls for 28
return msd*a[d] + (msd*(msd-1)/2)*p +
msd*(1+n%p) + sumOfDigitsFrom1ToN(n%p);
}
// Driver Program
int main()
{
int n = 328;
cout << "Sum of digits in numbers from 1 to " << n << " is "
<< sumOfDigitsFrom1ToN(n);
return 0;
}

Output
Sum of digits in numbers from 1 to 328 is 3241

The efficient algorithm has one more advantage that we need to compute the array a[] only once even when we are given multiple inputs.

Count possible ways to construct buildings


Given an input number of sections and each section has 2 plots on either sides of the road. Find all possible ways to construct buildings in the plots
such that there is a space between any 2 buildings.
Example:
N = 1
Output = 4
Place a building on one side.
Place a building on other side
Do not place any building.
Place a building on both sides.

N = 3
Output = 25
3 sections, which means possible ways for one side are
BSS, BSB, SSS, SBS, SSB where B represents a building
and S represents an empty space
Total possible ways are 25, because a way to place on
one side can correspond to any of 5 ways on other side.
N = 4
Output = 64

We can simplify the problem by first calculating the result for one side only. If we know the result for one side, we can square it to get the
result for two sides.
A new building can be placed on a section only if the section just before it has a space. A space can be placed anywhere (it doesn't matter whether the
previous section has a building or not).
Let countB(i) be count of possible ways with i sections
ending with a building.
countS(i) be count of possible ways with i sections
ending with a space.

// A space can be added after a building or after a space.
countS(N) = countB(N-1) + countS(N-1)
// A building can only be added after a space.
countB(N) = countS(N-1)
// Result for one side is sum of the above two counts.
result1(N) = countS(N) + countB(N)
// Result for two sides is square of result1(N)
result2(N) = result1(N) * result1(N)

Below is C++ implementation of above idea.


// C++ program to count all possible way to construct buildings
#include<iostream>
using namespace std;
// Returns count of possible ways for N sections
int countWays(int N)
{
// Base case
if (N == 1)
return 4; // 2 for one side and 4 for two sides
// countB is count of ways with a building at the end
// countS is count of ways with a space at the end
// prev_countB and prev_countS are previous values of
// countB and countS respectively.

// Initialize countB and countS for one side
int countB=1, countS=1, prev_countB, prev_countS;
// Use the above recursive formula for calculating
// countB and countS using previous values
for (int i=2; i<=N; i++)
{
prev_countB = countB;
prev_countS = countS;
countS = prev_countB + prev_countS;
countB = prev_countS;

}
// Result for one side is sum of ways ending with building
// and ending with space
int result = countS + countB;
// Result for 2 sides is square of result for one side
return (result*result);
}
// Driver program
int main()
{
int N = 3;
cout << "Count of ways for " << N
<< " sections is " << countWays(N);
return 0;
}

Output:
Count of ways for 3 sections is 25

Time complexity: O(N)


Auxiliary Space: O(1)
Algorithmic Paradigm: Dynamic Programming
Optimized Solution:
Note that the above solution can be further optimized. If we take closer look at the results, for different values, we can notice that the results for
two sides are squares of Fibonacci Numbers.
N = 1, result = 4 [result for one side = 2]
N = 2, result = 9 [result for one side = 3]
N = 3, result = 25 [result for one side = 5]
N = 4, result = 64 [result for one side = 8]
N = 5, result = 169 [result for one side = 13]
...
In general, we can say
result(N) = fib(N+2)^2
where fib(N) is a function that returns the N'th Fibonacci Number.

Therefore, we can use O(LogN) implementation of Fibonacci Numbers to find number of ways in O(logN) time.
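As a rough illustration of the O(logN) idea, the following C++ sketch computes fib(N+2) with the fast-doubling identities and squares it. The names fibPair and countWaysFast are hypothetical helpers introduced here, not part of the original article, and the sketch assumes N is small enough (roughly N <= 44) for the squared result to fit in 64 bits.

```cpp
#include <cstdint>
#include <utility>

// Returns the pair (F(k), F(k+1)) using the fast-doubling identities
//   F(2m)   = F(m) * (2*F(m+1) - F(m))
//   F(2m+1) = F(m)^2 + F(m+1)^2
// with F(0) = 0 and F(1) = 1. The recursion depth is O(log k).
std::pair<std::uint64_t, std::uint64_t> fibPair(unsigned k)
{
    if (k == 0)
        return {0, 1};
    std::pair<std::uint64_t, std::uint64_t> half = fibPair(k / 2);
    std::uint64_t a = half.first;        // F(m), where m = k/2
    std::uint64_t b = half.second;       // F(m+1)
    std::uint64_t c = a * (2 * b - a);   // F(2m)
    std::uint64_t d = a * a + b * b;     // F(2m+1)
    if (k % 2 == 0)
        return {c, d};
    return {d, c + d};
}

// Hypothetical helper: count of ways for N sections, i.e. fib(N+2)^2.
// Assumes the result fits in 64 bits.
std::uint64_t countWaysFast(unsigned N)
{
    std::uint64_t oneSide = fibPair(N + 2).first;
    return oneSide * oneSide;
}
```

Calling countWaysFast(3) should agree with the iterative countWays(3) above, and larger N would need a wider integer type or modular arithmetic.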

Maximum profit by buying and selling a share at most twice


In daily share trading, a buyer buys shares in the morning and sells them on the same day. Suppose the trader is allowed to make at most 2 transactions in a
day, where the second transaction can only start after the first one is complete (Buy->sell->buy->sell). Given stock prices throughout the day, find the
maximum profit that a share trader could have made.
Examples:
Input: price[] = {10, 22, 5, 75, 65, 80}
Output: 87
Trader earns 87 as sum of 12 and 75
Buy at price 10, sell at 22, buy at 5 and sell at 80
Input: price[] = {2, 30, 15, 10, 8, 25, 80}
Output: 100
Trader earns 100 as sum of 28 and 72
Buy at price 2, sell at 30, buy at 8 and sell at 80
Input: price[] = {100, 30, 15, 10, 8, 25, 80};
Output: 72
Buy at price 8 and sell at 80.
Input: price[] = {90, 80, 70, 60, 50}
Output: 0
Not possible to earn.

A Simple Solution is to consider every index i and do the following

Max profit with at most two transactions =
MAX {max profit with one transaction and subarray price[0..i] +
max profit with one transaction and subarray price[i+1..n-1] }
i varies from 0 to n-1.

The maximum profit using one transaction can be calculated using the following O(n) algorithm:
Maximum difference between two elements such that the larger element appears after the smaller one.
Time complexity of the above simple solution is O(n^2).
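The simple solution can be sketched in C++ as follows. This is only an illustrative sketch: the function names maxOneTransaction and maxProfitSimple are assumptions made here, and the one-transaction helper is the usual "track the minimum so far" scan.

```cpp
#include <algorithm>
#include <vector>

// Maximum profit from at most one buy/sell inside price[lo..hi] (inclusive).
// Returns 0 if the range is empty or no profitable trade exists.
int maxOneTransaction(const std::vector<int>& price, int lo, int hi)
{
    if (lo > hi)
        return 0;
    int best = 0;
    int minSoFar = price[lo];
    for (int i = lo + 1; i <= hi; i++)
    {
        best = std::max(best, price[i] - minSoFar);
        minSoFar = std::min(minSoFar, price[i]);
    }
    return best;
}

// Tries every split point i: one transaction in price[0..i] and one
// in price[i+1..n-1]. Each split costs O(n), so the total is O(n^2).
int maxProfitSimple(const std::vector<int>& price)
{
    int n = static_cast<int>(price.size());
    int best = 0;
    for (int i = 0; i < n; i++)
        best = std::max(best, maxOneTransaction(price, 0, i) +
                              maxOneTransaction(price, i + 1, n - 1));
    return best;
}
```

For the first example above, the split at i = 1 gives 12 from price[0..1] and 75 from price[2..5].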
We can do this in O(n) time using the following Efficient Solution. The idea is to store the maximum possible profit of every subarray and solve the problem in
the following steps.
1) Create a table profit[0..n-1] and initialize all values in it to 0.
2) Traverse price[] from right to left and update profit[i] such that profit[i] stores the maximum profit achievable from one transaction in subarray
price[i..n-1].
3) Traverse price[] from left to right and update profit[i] such that profit[i] stores the maximum profit achievable from at most two transactions in which
the first transaction is completed within price[0..i].
4) Return profit[n-1].
To do step 2, we need to keep track of the maximum price from right to left, and to do step 3, we need to keep track of the minimum price from left
to right. Why do we traverse in opposite directions? The idea is to save space: in step 3, we use the same array for both purposes, maximum with 1
transaction and maximum with 2 transactions. After an iteration i, the array profit[0..i] contains the maximum profit with 2 transactions and
profit[i+1..n-1] contains the maximum profit with one transaction.
Below are implementations of above idea.

C++
// C++ program to find maximum possible profit with at most
// two transactions
#include<iostream>
#include<algorithm> // for max()
using namespace std;
// Returns maximum profit with two transactions on a given
// list of stock prices, price[0..n-1]
int maxProfit(int price[], int n)
{
// Create profit array and initialize it as 0
int *profit = new int[n];
for (int i=0; i<n; i++)
profit[i] = 0;

/* Get the maximum profit with only one transaction
allowed. After this loop, profit[i] contains maximum
profit from price[i..n-1] using at most one trans. */
int max_price = price[n-1];
for (int i=n-2;i>=0;i--)
{
// max_price has maximum of price[i..n-1]
if (price[i] > max_price)
max_price = price[i];
// we can get profit[i] by taking maximum of:
// a) previous maximum, i.e., profit[i+1]
// b) profit by buying at price[i] and selling at
//    max_price
profit[i] = max(profit[i+1], max_price-price[i]);
}
/* Get the maximum profit with two transactions allowed
After this loop, profit[n-1] contains the result */
int min_price = price[0];
for (int i=1; i<n; i++)
{
// min_price is minimum price in price[0..i]
if (price[i] < min_price)
min_price = price[i];
// Maximum profit is maximum of:
// a) previous maximum, i.e., profit[i-1]
// b) (Buy, Sell) at (min_price, price[i]) and add
//    profit of other trans. stored in profit[i]
profit[i] = max(profit[i-1], profit[i] +
(price[i]-min_price) );
}
int result = profit[n-1];
delete [] profit; // To avoid memory leak
return result;
}
// Driver program
int main()
{
int price[] = {2, 30, 15, 10, 8, 25, 80};
int n = sizeof(price)/sizeof(price[0]);
cout << "Maximum Profit = " << maxProfit(price, n);
return 0;
}

Python
# Returns maximum profit with two transactions on a given
# list of stock prices price[0..n-1]
def maxProfit(price, n):
    # Create profit array and initialize it as 0
    profit = [0]*n

    # Get the maximum profit with only one transaction
    # allowed. After this loop, profit[i] contains maximum
    # profit from price[i..n-1] using at most one trans.
    max_price = price[n-1]
    for i in range(n-2, -1, -1):
        if price[i] > max_price:
            max_price = price[i]
        # we can get profit[i] by taking maximum of:
        # a) previous maximum, i.e., profit[i+1]
        # b) profit by buying at price[i] and selling at
        #    max_price
        profit[i] = max(profit[i+1], max_price - price[i])

    # Get the maximum profit with two transactions allowed
    # After this loop, profit[n-1] contains the result
    min_price = price[0]
    for i in range(1, n):
        if price[i] < min_price:
            min_price = price[i]
        # Maximum profit is maximum of:
        # a) previous maximum, i.e., profit[i-1]
        # b) (Buy, Sell) at (min_price, price[i]) and add
        #    profit of other trans. stored in profit[i]
        profit[i] = max(profit[i-1], profit[i] + (price[i] - min_price))
    result = profit[n-1]
    return result

# Driver function
price = [2, 30, 15, 10, 8, 25, 80]
print("Maximum profit is", maxProfit(price, len(price)))
# This code is contributed by __Devesh Agrawal__

Output:
Maximum Profit = 100

Time complexity of the above solution is O(n).


Algorithmic Paradigm: Dynamic Programming

How to print maximum number of As using given four keys


This is a famous interview question asked in Google, Paytm and many other company interviews.
Below is the problem statement.
Imagine you have a special keyboard with the following keys:
Key 1: Prints 'A' on screen
Key 2: (Ctrl-A): Select screen
Key 3: (Ctrl-C): Copy selection to buffer
Key 4: (Ctrl-V): Print buffer on screen appending it
after what has already been printed.
If you can only press the keyboard for N times (with the above four
keys), write a program to produce maximum numbers of A's. That is to
say, the input parameter is N (No. of keys that you can press), the
output is M (No. of As that you can produce).

Examples:
Input: N = 3
Output: 3
We can at most get 3 A's on screen by pressing
following key sequence.
A, A, A
Input: N = 7
Output: 9
We can at most get 9 A's on screen by pressing
following key sequence.
A, A, A, Ctrl A, Ctrl C, Ctrl V, Ctrl V
Input: N = 11
Output: 27
We can at most get 27 A's on screen by pressing
following key sequence.
A, A, A, Ctrl A, Ctrl C, Ctrl V, Ctrl V, Ctrl A,
Ctrl C, Ctrl V, Ctrl V

Below are a few important points to note.
a) For N < 7, the output is N itself.
b) Ctrl V can be used multiple times to print the current buffer (see the last two examples above).
The idea is to compute the optimal string length for N keystrokes by using a simple insight. The sequence of N keystrokes which produces an optimal string
length will end with a suffix of a Ctrl-A, a Ctrl-C, followed by only Ctrl-V's (for N > 6).
The task is to find the break-point after which we get the above suffix of keystrokes. A break-point is the keystroke after which we
press Ctrl-A and Ctrl-C once each and then only Ctrl-V's to generate the optimal length. If the break-point is at keystroke b, the remaining N-b keystrokes
are one Ctrl-A, one Ctrl-C and N-b-2 Ctrl-V's, so the screen ends up with (N-b-1) times the optimal length for b keystrokes. We loop from N-3 down to 1,
choose each of these values as the break-point, and compute the string length it would produce. Once the loop ends, we have the maximum of the
lengths for the various break-points, which is the optimal length for N keystrokes.
Below is C implementation based on above idea.
/* A recursive C program to print maximum number of A's using
following four keys */
#include<stdio.h>
// A recursive function that returns the optimal length string
// for N keystrokes
int findoptimal(int N)
{
// The optimal string length is N when N is smaller than 7
if (N <= 6)
return N;
// Initialize result
int max = 0;
// TRY ALL POSSIBLE BREAK-POINTS
// For any keystroke N, we need to loop from N-3 keystrokes
// back to 1 keystroke to find a breakpoint 'b' after which we
// will have Ctrl-A, Ctrl-C and then only Ctrl-V all the way.
int b;
for (b=N-3; b>=1; b--)
{
// If the breakpoint is at b'th keystroke then
// the optimal string would have length
// (N-b-1)*findoptimal(b);
int curr = (N-b-1)*findoptimal(b);
if (curr > max)

max = curr;
}
return max;
}
// Driver program
int main()
{
int N;
// Print the optimal string length for N = 1 to 20
for (N=1; N<=20; N++)
printf("Maximum Number of A's with %d keystrokes is %d\n",
N, findoptimal(N));
}

Output:
Maximum Number of A's with 1 keystrokes is 1
Maximum Number of A's with 2 keystrokes is 2
Maximum Number of A's with 3 keystrokes is 3
Maximum Number of A's with 4 keystrokes is 4
Maximum Number of A's with 5 keystrokes is 5
Maximum Number of A's with 6 keystrokes is 6
Maximum Number of A's with 7 keystrokes is 9
Maximum Number of A's with 8 keystrokes is 12
Maximum Number of A's with 9 keystrokes is 16
Maximum Number of A's with 10 keystrokes is 20
Maximum Number of A's with 11 keystrokes is 27
Maximum Number of A's with 12 keystrokes is 36
Maximum Number of A's with 13 keystrokes is 48
Maximum Number of A's with 14 keystrokes is 64
Maximum Number of A's with 15 keystrokes is 81
Maximum Number of A's with 16 keystrokes is 108
Maximum Number of A's with 17 keystrokes is 144
Maximum Number of A's with 18 keystrokes is 192
Maximum Number of A's with 19 keystrokes is 256
Maximum Number of A's with 20 keystrokes is 324

The above function computes the same subproblems again and again. Recomputations of same subproblems can be avoided by storing the
solutions to subproblems and solving problems in bottom up manner.
Below is Dynamic Programming based C implementation where an auxiliary array screen[N] is used to store result of subproblems.
/* A Dynamic Programming based C program to find maximum number of A's
that can be printed using four keys */
#include<stdio.h>
// this function returns the optimal length string for N keystrokes
int findoptimal(int N)
{
// The optimal string length is N when N is smaller than 7
if (N <= 6)
return N;
// An array to store result of subproblems
int screen[N];
int b; // To pick a breakpoint
// Initialize the optimal lengths for up to 6 keystrokes
int n;
for (n=1; n<=6; n++)
screen[n-1] = n;
// Solve all subproblems in bottom-up manner
for (n=7; n<=N; n++)
{
// Initialize length of optimal string for n keystrokes
screen[n-1] = 0;
// For any keystroke n, we need to loop from n-3 keystrokes
// back to 1 keystroke to find a breakpoint 'b' after which we
// will have ctrl-a, ctrl-c and then only ctrl-v all the way.
for (b=n-3; b>=1; b--)
{
// if the breakpoint is at b'th keystroke then
// the optimal string would have length
// (n-b-1)*screen[b-1];
int curr = (n-b-1)*screen[b-1];
if (curr > screen[n-1])
screen[n-1] = curr;
}
}
return screen[N-1];
}
// Driver program
int main()
{
int N;
// Print the optimal string length for N = 1 to 20
for (N=1; N<=20; N++)
printf("Maximum Number of A's with %d keystrokes is %d\n",
N, findoptimal(N));
}

Output:
Maximum Number of A's with 1 keystrokes is 1
Maximum Number of A's with 2 keystrokes is 2
Maximum Number of A's with 3 keystrokes is 3
Maximum Number of A's with 4 keystrokes is 4
Maximum Number of A's with 5 keystrokes is 5
Maximum Number of A's with 6 keystrokes is 6
Maximum Number of A's with 7 keystrokes is 9
Maximum Number of A's with 8 keystrokes is 12
Maximum Number of A's with 9 keystrokes is 16
Maximum Number of A's with 10 keystrokes is 20
Maximum Number of A's with 11 keystrokes is 27
Maximum Number of A's with 12 keystrokes is 36
Maximum Number of A's with 13 keystrokes is 48
Maximum Number of A's with 14 keystrokes is 64
Maximum Number of A's with 15 keystrokes is 81
Maximum Number of A's with 16 keystrokes is 108
Maximum Number of A's with 17 keystrokes is 144
Maximum Number of A's with 18 keystrokes is 192
Maximum Number of A's with 19 keystrokes is 256
Maximum Number of A's with 20 keystrokes is 324

Thanks to Gaurav Saxena for providing the above approach to solve this problem.

Find the minimum cost to reach destination using a train


There are N stations on route of a train. The train goes from station 0 to N-1. The ticket cost for all pair of stations (i, j) is given where j is greater
than i. Find the minimum cost to reach the destination.
Consider the following example:
Input:
cost[N][N] = { {0, 15, 80, 90},
{INF, 0, 40, 50},
{INF, INF, 0, 70},
{INF, INF, INF, 0}
};
There are 4 stations and cost[i][j] indicates cost to reach j
from i. The entries where j < i are meaningless.
Output:
The minimum cost is 65
The minimum cost can be obtained by first going to station 1
from 0. Then from station 1 to station 3.

The minimum cost to reach N-1 from 0 can be recursively written as following:
minCost(0, N-1) = MIN { cost[0][N-1],
cost[0][1] + minCost(1, N-1),
minCost(0, 2) + minCost(2, N-1),
........,
minCost(0, N-2) + cost[N-2][N-1] }

The following is C++ implementation of above recursive formula.


// A naive recursive solution to find min cost path from station 0
// to station N-1
#include<iostream>
#include<climits>
using namespace std;
// infinite value
#define INF INT_MAX
// Number of stations
#define N 4
// A recursive function to find the shortest path from
// source 's' to destination 'd'.
int minCostRec(int cost[][N], int s, int d)
{
// If source is same as destination
// or destination is next to source
if (s == d || s+1 == d)
return cost[s][d];
// Initialize min cost as direct ticket from
// source 's' to destination 'd'.
int min = cost[s][d];
// Try every intermediate vertex to find minimum
for (int i = s+1; i<d; i++)
{
int c = minCostRec(cost, s, i) +
minCostRec(cost, i, d);
if (c < min)
min = c;
}
return min;
}
// This function returns the smallest possible cost to
// reach station N-1 from station 0. This function mainly
// uses minCostRec().
int minCost(int cost[][N])
{
return minCostRec(cost, 0, N-1);
}
// Driver program to test above function
int main()
{
int cost[N][N] = { {0, 15, 80, 90},
{INF, 0, 40, 50},
{INF, INF, 0, 70},
{INF, INF, INF, 0}
};
cout << "The Minimum cost to reach station "
<< N << " is " << minCost(cost);
return 0;
}

Output:
The Minimum cost to reach station 4 is 65

Time complexity of the above implementation is exponential as it tries every possible path from 0 to N-1. The above solution solves the same
subproblems multiple times (this can be seen by drawing the recursion tree for minCostRec(0, 5)).
This problem has both properties of dynamic programming problems (optimal substructure and overlapping subproblems). Like other typical Dynamic Programming (DP)
problems, re-computations of the same subproblems can be avoided by storing the solutions to subproblems and solving problems in a bottom-up
manner.
One dynamic programming solution is to create a 2D table and fill the table using the recursive formula given above. The extra space required in this
solution would be O(N^2) and the time complexity would be O(N^3).
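A minimal C++ sketch of the 2D-table approach is given below for illustration. The function name minCostTable and the use of std::vector are assumptions made here, and the sketch assumes every cost[s][d] with s < d is a finite value, so the additions cannot overflow.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Placeholder for the meaningless entries where j < i (never read below)
const int INF = INT_MAX;

// table[s][d] holds the minimum cost to travel from station s to
// station d (s <= d). Entries are filled in order of increasing gap
// d - s, so both halves of every split are already known.
// O(N^2) extra space and O(N^3) time.
int minCostTable(const std::vector<std::vector<int>>& cost)
{
    int n = static_cast<int>(cost.size());
    if (n == 0)
        return 0;
    std::vector<std::vector<int>> table(n, std::vector<int>(n, 0));
    for (int len = 1; len < n; len++)
    {
        for (int s = 0; s + len < n; s++)
        {
            int d = s + len;
            int best = cost[s][d];          // direct ticket from s to d
            for (int k = s + 1; k < d; k++) // change trains at station k
                best = std::min(best, table[s][k] + table[k][d]);
            table[s][d] = best;
        }
    }
    return table[0][n - 1];
}
```

For the 4-station example above, the table gives table[1][3] = 50 and table[0][3] = 15 + 50 = 65.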
We can solve this problem using O(N) extra space and O(N^2) time. The idea is based on the fact that the given input matrix represents a Directed Acyclic
Graph (DAG). The shortest path in a DAG can be calculated using the approach discussed in the post
"Shortest Path in Directed Acyclic Graph".
We need to do less work here compared to the above-mentioned post as we already know a topological sorting of the graph. The topological sorting of vertices
here is 0, 1, ..., N-1. Following is the idea once the topological sorting is known.
The idea in the code below is to first calculate the min cost for station 1, then for station 2, and so on. These costs are stored in an array dist[0...N-1].
1) The min cost for station 0 is 0, i.e., dist[0] = 0
2) The min cost for station 1 is cost[0][1], i.e., dist[1] = cost[0][1]
3) The min cost for station 2 is minimum of following two.
a) dist[0] + cost[0][2]
b) dist[1] + cost[1][2]
4) The min cost for station 3 is minimum of following three.
a) dist[0] + cost[0][3]
b) dist[1] + cost[1][3]
c) dist[2] + cost[2][3]
Similarly, dist[4], dist[5], ... dist[N-1] are calculated.
Below is C++ implementation of above idea.
// A Dynamic Programming based solution to find min cost
// to reach station N-1 from station 0.
#include<iostream>
#include<climits>
using namespace std;
#define INF INT_MAX
#define N 4
// This function returns the smallest possible cost to
// reach station N-1 from station 0.
int minCost(int cost[][N])
{
// dist[i] stores minimum cost to reach station i
// from station 0.
int dist[N];
for (int i=0; i<N; i++)
dist[i] = INF;
dist[0] = 0;
// Go through every station and check if using it
// as an intermediate station gives better path
for (int i=0; i<N; i++)
for (int j=i+1; j<N; j++)
if (dist[j] > dist[i] + cost[i][j])
dist[j] = dist[i] + cost[i][j];

return dist[N-1];
}
// Driver program to test above function
int main()
{
int cost[N][N] = { {0, 15, 80, 90},
{INF, 0, 40, 50},
{INF, INF, 0, 70},
{INF, INF, INF, 0}
};
cout << "The Minimum cost to reach station "
<< N << " is " << minCost(cost);
return 0;
}

Output:
The Minimum cost to reach station 4 is 65

Vertex Cover Problem | Set 2 (Dynamic Programming Solution for Tree)


A vertex cover of an undirected graph is a subset of its vertices such that for every edge (u, v) of the graph, either u or v is in vertex cover.
Although the name is Vertex Cover, the set covers all edges of the given graph.
The problem to find minimum size vertex cover of a graph is NP complete. But it can be solved in polynomial time for trees. In this post a solution
for Binary Tree is discussed. The same solution can be extended for n-ary trees.
For example, consider the following binary tree. The smallest vertex cover is {20, 50, 30} and size of the vertex cover is 3.

The idea is to consider following two possibilities for root and recursively for all nodes down the root.
1) Root is part of vertex cover: In this case root covers all children edges. We recursively calculate size of vertex covers for left and right
subtrees and add 1 to the result (for root).
2) Root is not part of vertex cover: In this case, both children of root must be included in vertex cover to cover all root to children edges. We
recursively calculate size of vertex covers of all grandchildren and add the number of children to the result (for the two children of root).
Below is C implementation of above idea.
// A naive recursive C implementation for vertex cover problem for a tree
#include <stdio.h>
#include <stdlib.h>
// A utility function to find min of two integers
int min(int x, int y) { return (x < y)? x: y; }
/* A binary tree node has data, pointer to left child and a pointer to
right child */
struct node
{
int data;
struct node *left, *right;
};
// The function returns size of the minimum vertex cover
int vCover(struct node *root)
{
// The size of minimum vertex cover is zero if tree is empty or there
// is only one node
if (root == NULL)
return 0;
if (root->left == NULL && root->right == NULL)
return 0;
// Calculate size of vertex cover when root is part of it
int size_incl = 1 + vCover(root->left) + vCover(root->right);
// Calculate size of vertex cover when root is not part of it
int size_excl = 0;
if (root->left)
size_excl += 1 + vCover(root->left->left) + vCover(root->left->right);
if (root->right)
size_excl += 1 + vCover(root->right->left) + vCover(root->right->right);
// Return the minimum of two sizes
return min(size_incl, size_excl);
}
// A utility function to create a node
struct node* newNode( int data )

{
struct node* temp = (struct node *) malloc( sizeof(struct node) );
temp->data = data;
temp->left = temp->right = NULL;
return temp;
}
// Driver program to test above functions
int main()
{
// Let us construct the tree given in the above diagram
struct node *root = newNode(20);
root->left = newNode(8);
root->left->left = newNode(4);
root->left->right = newNode(12);
root->left->right->left = newNode(10);
root->left->right->right = newNode(14);
root->right = newNode(22);
root->right->right = newNode(25);
printf ("Size of the smallest vertex cover is %d ", vCover(root));
return 0;
}

Output:
Size of the smallest vertex cover is 3

Time complexity of the above naive recursive approach is exponential. It should be noted that the above function computes the same subproblems
again and again. For example, vCover of node with value 50 is evaluated twice as 50 is grandchild of 10 and child of 20.
Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Vertex Cover problem has both properties
(optimal substructure and overlapping subproblems) of a dynamic programming problem. Like other typical Dynamic Programming (DP) problems, re-computations of the same subproblems
can be avoided by storing the solutions to subproblems and solving problems in a bottom-up manner.
Following is C implementation of Dynamic Programming based solution. In the following solution, an additional field vc is added to tree nodes. The
initial value of vc is set as 0 for all nodes. The recursive function vCover() calculates vc for a node only if it is not already set.
/* Dynamic programming based program for Vertex Cover problem for
a Binary Tree */
#include <stdio.h>
#include <stdlib.h>
// A utility function to find min of two integers
int min(int x, int y) { return (x < y)? x: y; }
/* A binary tree node has data, pointer to left child and a pointer to
right child */
struct node
{
int data;
int vc;
struct node *left, *right;
};
// A memoization based function that returns size of the minimum vertex cover.
int vCover(struct node *root)
{
// The size of minimum vertex cover is zero if tree is empty or there
// is only one node
if (root == NULL)
return 0;
if (root->left == NULL && root->right == NULL)
return 0;
// If vertex cover for this node is already evaluated, then return it
// to save recomputation of same subproblem again.
if (root->vc != 0)
return root->vc;
// Calculate size of vertex cover when root is part of it
int size_incl = 1 + vCover(root->left) + vCover(root->right);
// Calculate size of vertex cover when root is not part of it
int size_excl = 0;
if (root->left)
size_excl += 1 + vCover(root->left->left) + vCover(root->left->right);
if (root->right)
size_excl += 1 + vCover(root->right->left) + vCover(root->right->right);

// Minimum of two values is vertex cover, store it before returning
root->vc = min(size_incl, size_excl);
return root->vc;
}
// A utility function to create a node
struct node* newNode( int data )
{
struct node* temp = (struct node *) malloc( sizeof(struct node) );
temp->data = data;
temp->left = temp->right = NULL;
temp->vc = 0; // Set the vertex cover as 0
return temp;
}
// Driver program to test above functions
int main()
{
// Let us construct the tree given in the above diagram
struct node *root = newNode(20);
root->left = newNode(8);
root->left->left = newNode(4);
root->left->right = newNode(12);
root->left->right->left = newNode(10);
root->left->right->right = newNode(14);
root->right = newNode(22);
root->right->right = newNode(25);
printf ("Size of the smallest vertex cover is %d ", vCover(root));
return 0;
}

Output:
Size of the smallest vertex cover is 3

References:
http://courses.csail.mit.edu/6.006/spring11/lectures/lec21.pdf
Exercise:
Extend the above solution for n-ary trees.
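One possible way to approach the exercise is sketched below in C++. It is not the article's grandchild-based recursion: it uses the common two-state formulation (node included or excluded), and it assumes the tree is given as a hypothetical children list where children[u] holds the indices of the children of node u. The names coverSizes and minVertexCover are illustrative only.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Returns {size of the smallest cover of u's subtree with u included,
//          size of the smallest cover of u's subtree with u excluded}.
// Every node is visited exactly once, so the running time is O(n).
std::pair<int, int> coverSizes(const std::vector<std::vector<int>>& children, int u)
{
    int incl = 1;   // u itself is in the cover
    int excl = 0;   // u is not in the cover
    for (int c : children[u])
    {
        std::pair<int, int> sub = coverSizes(children, c);
        // If u is included, each child may or may not be included.
        incl += std::min(sub.first, sub.second);
        // If u is excluded, each child must be included to cover edge (u, c).
        excl += sub.first;
    }
    return {incl, excl};
}

// Size of the minimum vertex cover of the tree rooted at 'root'.
int minVertexCover(const std::vector<std::vector<int>>& children, int root)
{
    if (children.empty())
        return 0;
    std::pair<int, int> r = coverSizes(children, root);
    return std::min(r.first, r.second);
}
```

Because a node may have any number of children, the same function handles binary and n-ary trees; for the binary tree built in the driver above it should again report a cover of size 3.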

Count number of ways to reach a given score in a game


Consider a game where a player can score 3 or 5 or 10 points in a move. Given a total score n, find number of ways to reach the given score.
Examples:
Input: n = 20
Output: 4
There are following 4 ways to reach 20
(10, 10)
(5, 5, 10)
(5, 5, 5, 5)
(3, 3, 3, 3, 3, 5)
Input: n = 13
Output: 2
There are following 2 ways to reach 13
(3, 5, 5)
(3, 10)

This problem is a variation of coin change problem and can be solved in O(n) time and O(n) auxiliary space.
The idea is to create a table of size n+1 to store counts of all scores from 0 to n. For every possible move (3, 5 and 10), increment values in table.
// A C program to count the number of possible ways a given score
// can be reached in a game where a move can earn 3 or 5 or 10
#include <stdio.h>
#include <string.h> // for memset()
// Returns number of ways to reach score n
int count(int n)
{
// table[i] will store count of solutions for
// value i.
int table[n+1], i;
// Initialize all table values as 0
memset(table, 0, sizeof(table));
// Base case (If given value is 0)
table[0] = 1;
// One by one consider given 3 moves and update the table[]
// values after the index greater than or equal to the
// value of the picked move
for (i=3; i<=n; i++)
table[i] += table[i-3];
for (i=5; i<=n; i++)
table[i] += table[i-5];
for (i=10; i<=n; i++)
table[i] += table[i-10];
return table[n];
}
// Driver program
int main(void)
{
int n = 20;
printf("Count for %d is %d\n", n, count(n));
n = 13;
printf("Count for %d is %d", n, count(n));
return 0;
}

Output:
Count for 20 is 4
Count for 13 is 2

Exercise: How to count score when (10, 5, 5), (5, 5, 10) and (5, 10, 5) are considered as different sequences of moves. Similarly, (5, 3, 3), (3,
5, 3) and (3, 3, 5) are considered different.
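One way to think about the exercise is sketched below in C++. When the order of moves matters, the loops are swapped: for each score i, every move is considered as the last move of the sequence. The function name countOrdered is an assumption made here for illustration.

```cpp
#include <vector>

// Counts sequences of moves (3, 5 or 10 points each) that sum to n,
// treating different orders of the same moves as different sequences.
long long countOrdered(int n)
{
    if (n < 0)
        return 0;
    std::vector<long long> table(n + 1, 0);
    table[0] = 1;   // one way to reach score 0: make no move
    const int moves[] = {3, 5, 10};
    for (int i = 1; i <= n; i++)
        for (int m : moves)
            if (i >= m)
                table[i] += table[i - m];   // m is the last move played
    return table[n];
}
```

With this ordering of loops, n = 13 gives 5 sequences (three orderings of 3, 5, 5 and two orderings of 3, 10), compared with 2 in the unordered count above.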

Weighted Job Scheduling


Given N jobs where every job is represented by following three elements of it.
1) Start Time
2) Finish Time.
3) Profit or Value Associated.
Find the maximum profit subset of jobs such that no two jobs in the subset overlap.
Example:
Input: Number of Jobs n = 4
Job Details {Start Time, Finish Time, Profit}
Job 1: {1, 2, 50}
Job 2: {3, 5, 20}
Job 3: {6, 19, 100}
Job 4: {2, 100, 200}
Output: The maximum profit is 250.
We can get the maximum profit by scheduling jobs 1 and 4.
Note that a longer schedule is possible (jobs 1, 2 and 3),
but the profit with this schedule is 50+20+100 = 170, which is less than 250.

A simpler version of this problem, where every job has the same profit or value, is the Activity Selection problem. The Greedy Strategy for activity selection doesn't
work here as the longer schedule may have smaller profit or value.
The above problem can be solved using following recursive solution.
1) First sort jobs according to finish time.
2) Now apply following recursive process.
// Here arr[] is array of n jobs
findMaximumProfit(arr[], n)
{
a) if (n == 1) return profit of arr[0];
b) Return the maximum of following two profits.
(i) Maximum profit by excluding current job, i.e.,
findMaximumProfit(arr, n-1)
(ii) Maximum profit by including the current job
}
How to find the profit including current job?
The idea is to find the latest job before the current job (in
sorted array) that doesn't conflict with current job 'arr[n-1]'.
Once we find such a job, we recur for all jobs till that job and
add profit of current job to result.
In the above example, "job 1" is the latest non-conflicting
for "job 4" and "job 2" is the latest non-conflicting for "job 3".

The following is C++ implementation of above naive recursive method.


// C++ program for weighted job scheduling using Naive Recursive Method
#include <iostream>
#include <algorithm>
using namespace std;
// A job has start time, finish time and profit.
struct Job
{
int start, finish, profit;
};
// A utility function that is used for sorting events
// according to finish time
bool myfunction(Job s1, Job s2)
{
return (s1.finish < s2.finish);
}
// Find the latest job (in sorted array) that doesn't
// conflict with the job arr[i-1]. If there is no compatible job,
// then it returns -1.
int latestNonConflict(Job arr[], int i)
{
// Start from the job just before arr[i-1]
for (int j=i-2; j>=0; j--)
{
if (arr[j].finish <= arr[i-1].start)
return j;
}
return -1;
}
// A recursive function that returns the maximum possible
// profit from given array of jobs. The array of jobs must
// be sorted according to finish time.
int findMaxProfitRec(Job arr[], int n)
{
// Base case
if (n == 1) return arr[n-1].profit;
// Find profit when current job is included
int inclProf = arr[n-1].profit;
int i = latestNonConflict(arr, n);
if (i != -1)
inclProf += findMaxProfitRec(arr, i+1);
// Find profit when current job is excluded
int exclProf = findMaxProfitRec(arr, n-1);
return max(inclProf, exclProf);
}
// The main function that returns the maximum possible
// profit from given array of jobs
int findMaxProfit(Job arr[], int n)
{
// Sort jobs according to finish time
sort(arr, arr+n, myfunction);
return findMaxProfitRec(arr, n);
}
// Driver program
int main()
{
Job arr[] = {{3, 10, 20}, {1, 2, 50}, {6, 19, 100}, {2, 100, 200}};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "The optimal profit is " << findMaxProfit(arr, n);
return 0;
}

Output:
The optimal profit is 250

The above solution may contain many overlapping subproblems. For example, if latestNonConflict() always returns the previous job, then
findMaxProfitRec(arr, n-1) is called twice and the time complexity becomes O(n*2^n). As another example, when latestNonConflict() returns
the previous-to-previous job, there are two recursive calls, for n-2 and n-1. In this case, the recursion becomes the same as that of Fibonacci Numbers.
So this problem has both properties of Dynamic Programming, Optimal Substructure and Overlapping Subproblems.
Like other Dynamic Programming problems, we can solve this problem by making a table that stores solutions of subproblems.
Below is C++ implementation based on Dynamic Programming.
// C++ program for weighted job scheduling using Dynamic Programming.
#include <iostream>
#include <algorithm>
using namespace std;
// A job has start time, finish time and profit.
struct Job
{
int start, finish, profit;
};
// A utility function that is used for sorting events
// according to finish time
bool myfunction(Job s1, Job s2)
{
return (s1.finish < s2.finish);
}
// Find the latest job (in sorted array) that doesn't
// conflict with the job[i]
int latestNonConflict(Job arr[], int i)
{
for (int j=i-1; j>=0; j--)
{
if (arr[j].finish <= arr[i].start)
return j;

}
return -1;
}
// The main function that returns the maximum possible
// profit from given array of jobs
int findMaxProfit(Job arr[], int n)
{
// Sort jobs according to finish time
sort(arr, arr+n, myfunction);
// Create an array to store solutions of subproblems. table[i]
// stores the profit for jobs till arr[i] (including arr[i])
int *table = new int[n];
table[0] = arr[0].profit;
// Fill entries in table[] using recursive property
for (int i=1; i<n; i++)
{
// Find profit including the current job
int inclProf = arr[i].profit;
int l = latestNonConflict(arr, i);
if (l != -1)
inclProf += table[l];
// Store maximum of including and excluding
table[i] = max(inclProf, table[i-1]);
}
// Store result and free dynamic memory allocated for table[]
int result = table[n-1];
delete[] table;
return result;
}
// Driver program
int main()
{
Job arr[] = {{3, 10, 20}, {1, 2, 50}, {6, 19, 100}, {2, 100, 200}};
int n = sizeof(arr)/sizeof(arr[0]);
cout << "The optimal profit is " << findMaxProfit(arr, n);
return 0;
}

Output:
The optimal profit is 250

Time Complexity of the above Dynamic Programming Solution is O(n^2). Note that the above solution can be optimized to O(n log n) using Binary
Search in latestNonConflict() instead of linear search. Thanks to Garvit for suggesting this optimization.
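A minimal sketch of the binary-search variant, written in Python and assuming jobs are given as (start, finish, profit) tuples. The function name find_max_profit_fast is hypothetical; the lookup uses bisect_right from the standard library in place of the linear scan in latestNonConflict().

```python
from bisect import bisect_right


def find_max_profit_fast(jobs):
    """Weighted job scheduling in O(n log n); jobs are (start, finish, profit)."""
    if not jobs:
        return 0
    jobs = sorted(jobs, key=lambda job: job[1])  # sort by finish time
    finishes = [job[1] for job in jobs]
    table = [0] * len(jobs)  # table[i]: best profit using jobs[0..i]
    table[0] = jobs[0][2]
    for i in range(1, len(jobs)):
        start, _, profit = jobs[i]
        # Rightmost j < i with finishes[j] <= start, or -1 if there is none.
        # This works because finishes is sorted in non-decreasing order.
        j = bisect_right(finishes, start, 0, i) - 1
        incl = profit + (table[j] if j >= 0 else 0)
        table[i] = max(incl, table[i - 1])
    return table[-1]


print(find_max_profit_fast([(3, 10, 20), (1, 2, 50), (6, 19, 100), (2, 100, 200)]))  # 250
```

Sorting costs O(n log n) and each of the n lookups costs O(log n), so the total stays O(n log n).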
References:
http://courses.cs.washington.edu/courses/cse521/13wi/slides/06dp-sched.pdf

Longest Even Length Substring such that Sum of First and Second Half is same
Given a string str of digits, find length of the longest substring of str, such that the length of the substring is 2k digits and sum of left k digits is equal
to the sum of right k digits.
Examples:
Input: str = "123123"
Output: 6
The complete string is of even length and sum of first and second
half digits is same
Input: str = "1538023"
Output: 4
The longest substring with same first and second half sum is "5380"

Simple Solution [ O(n^3) ]


A Simple Solution is to check every substring of even length. The following is C based implementation of simple approach.
// A simple C based program to find length of longest even length
// substring with same sum of digits in left and right
#include<stdio.h>
#include<string.h>
int findLength(char *str)
{
int n = strlen(str);
int maxlen =0; // Initialize result
// Choose starting point of every substring
for (int i=0; i<n; i++)
{
// Choose ending point of even length substring
for (int j =i+1; j<n; j += 2)
{
int length = j-i+1;//Find length of current substr
// Calculate left & right sums for current substr
int leftsum = 0, rightsum =0;
for (int k =0; k<length/2; k++)
{
leftsum += (str[i+k]-'0');
rightsum += (str[i+k+length/2]-'0');
}
// Update result if needed
if (leftsum == rightsum && maxlen < length)
maxlen = length;
}
}
return maxlen;
}
// Driver program to test above function
int main(void)
{
char str[] = "1538023";
printf("Length of the substring is %d", findLength(str));
return 0;
}

Output:
Length of the substring is 4

Dynamic Programming [ O(n^2) time and O(n^2) extra space ]

The above solution can be optimized to work in O(n^2) using Dynamic Programming. The idea is to build a 2D table that stores sums of
substrings. The following is C based implementation of Dynamic Programming approach.
// A C based program that uses Dynamic Programming to find length of the
// longest even substring with same sum of digits in left and right half
#include <stdio.h>
#include <string.h>
int findLength(char *str)

{
int n = strlen(str);
int maxlen = 0; // Initialize result
// A 2D table where sum[i][j] stores sum of digits
// from str[i] to str[j]. Only filled entries are
// the entries where j >= i
int sum[n][n];
// Fill the diagonal values for substrings of length 1
for (int i =0; i<n; i++)
sum[i][i] = str[i]-'0';
// Fill entries for substrings of length 2 to n
for (int len=2; len<=n; len++)
{
// Pick i and j for current substring
for (int i=0; i<n-len+1; i++)
{
int j = i+len-1;
int k = len/2;
// Calculate value of sum[i][j]
sum[i][j] = sum[i][j-k] + sum[j-k+1][j];
// Update result if 'len' is even, left and right
// sums are same and len is more than maxlen
if (len%2 == 0 && sum[i][j-k] == sum[(j-k+1)][j]
&& len > maxlen)
maxlen = len;
}
}
return maxlen;
}
// Driver program to test above function
int main(void)
{
char str[] = "153803";
printf("Length of the substring is %d", findLength(str));
return 0;
}

Output:
Length of the substring is 4

Time complexity of the above solution is O(n^2), but it requires O(n^2) extra space.

[An O(n^2) time and O(n) extra space solution]


The idea is to use a single dimensional array to store cumulative sum.
// A O(n^2) time and O(n) extra space solution
#include<bits/stdc++.h>
using namespace std;
int findLength(string str, int n)
{
int sum[n+1]; // To store cumulative sum from first digit to nth digit
sum[0] = 0;
/* Store cumulative sum of digits from first to last digit */
for (int i = 1; i <= n; i++)
sum[i] = (sum[i-1] + str[i-1] - '0'); /* convert chars to int */
int ans = 0; // initialize result
/* consider all even length substrings one by one */
for (int len = 2; len <= n; len += 2)
{
for (int i = 0; i <= n-len; i++)
{
/* If the sums of the first and second halves are the same, update ans */
if (sum[i+len/2] - sum[i] == sum[i+len] - sum[i+len/2])
ans = max(ans, len);

}
}
return ans;
}
// Driver program to test above function
int main()
{
string str = "123123";
cout << "Length of the substring is " << findLength(str, str.length());
return 0;
}

Output:
Length of the substring is 6

Thanks to Gaurav Ahirwar for suggesting this method.

[An O(n^2) time and O(1) extra space solution]


The idea is to consider all possible midpoints (of even length substrings) and keep expanding on both sides, updating the optimal length whenever
the sums of the two sides become equal.
Below is C++ implementation of the above idea.
// A O(n^2) time and O(1) extra space solution
#include<bits/stdc++.h>
using namespace std;
int findLength(string str, int n)
{
int ans = 0; // Initialize result
// Consider all possible midpoints one by one
for (int i = 0; i <= n-2; i++)
{
/* For current midpoint 'i', keep expanding substring on
both sides, if sum of both sides becomes equal update
ans */
int l = i, r = i + 1;
/* initialize left and right sum */
int lsum = 0, rsum = 0;
/* move on both sides till indexes go out of bounds */
while (r < n && l >= 0)
{
lsum += str[l] - '0';
rsum += str[r] - '0';
if (lsum == rsum)
ans = max(ans, r-l+1);
l--;
r++;
}
}
return ans;
}
// Driver program to test above function
int main()
{
string str = "123123";
cout << "Length of the substring is " << findLength(str, str.length());
return 0;
}

Output:
Length of the substring is 6

Thanks to Gaurav Ahirwar for suggesting this method.

Minimum Cost Polygon Triangulation


A triangulation of a convex polygon is formed by drawing diagonals between non-adjacent vertices (corners) such that the diagonals never
intersect. The problem is to find the cost of triangulation with the minimum cost. The cost of a triangulation is sum of the weights of its component
triangles. Weight of each triangle is its perimeter (sum of lengths of all sides)
See following example taken from this source.

Two triangulations of the same convex pentagon. The triangulation on the left has a cost of 8 + 2√2 + 2√5 (approximately 15.30), the
one on the right has a cost of 4 + 2√2 + 4√5 (approximately 15.77).
This problem has recursive substructure. The idea is to divide the polygon into three parts: a single triangle, the sub-polygon to the left, and the
sub-polygon to the right. We try all possible divisions like this and find the one that minimizes the cost of the triangle plus the cost of the
triangulation of the two sub-polygons.
Let Minimum Cost of triangulation of vertices from i to j be minCost(i, j)
If j < i + 2 Then
minCost(i, j) = 0
Else
minCost(i, j) = Min { minCost(i, k) + minCost(k, j) + cost(i, k, j) }
Here k varies from 'i+1' to 'j-1'
Cost of a triangle formed by edges (i, j), (j, k) and (k, i) is
cost(i, j, k) = dist(i, j) + dist(j, k) + dist(k, i)
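The recurrence can be sketched compactly in Python with memoization. This is an illustrative sketch rather than the article's implementation: the function name min_triangulation_cost is an assumption, and points are taken to be (x, y) tuples listed in order around the convex polygon.

```python
from functools import lru_cache
from math import hypot


def min_triangulation_cost(points):
    """Minimum total perimeter over all triangulations of a convex polygon."""

    def dist(a, b):
        return hypot(points[a][0] - points[b][0], points[a][1] - points[b][1])

    def cost(i, j, k):
        # Perimeter of the triangle with vertices i, j, k.
        return dist(i, j) + dist(j, k) + dist(k, i)

    @lru_cache(maxsize=None)
    def min_cost(i, j):
        if j < i + 2:  # fewer than three vertices: nothing to triangulate
            return 0.0
        return min(min_cost(i, k) + min_cost(k, j) + cost(i, k, j)
                   for k in range(i + 1, j))

    return min_cost(0, len(points) - 1)


pentagon = [(0, 0), (1, 0), (2, 1), (1, 2), (0, 2)]
print(round(min_triangulation_cost(pentagon), 4))  # 15.3006
```

Because each pair (i, j) is solved once and tries at most n values of k, the memoized version runs in O(n^3) time.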

Following is C++ implementation of above naive recursive formula.


// Recursive implementation for minimum cost convex polygon triangulation
#include <iostream>
#include <cmath>
#define MAX 1000000.0
using namespace std;
// Structure of a point in 2D plane
struct Point
{
int x, y;
};
// Utility function to find minimum of two double values
double min(double x, double y)
{
return (x <= y)? x : y;
}
// A utility function to find distance between two points in a plane
double dist(Point p1, Point p2)
{
    return sqrt((p1.x - p2.x)*(p1.x - p2.x) +
                (p1.y - p2.y)*(p1.y - p2.y));
}
// A utility function to find cost of a triangle. The cost is considered
// as perimeter (sum of lengths of all edges) of the triangle
double cost(Point points[], int i, int j, int k)
{
Point p1 = points[i], p2 = points[j], p3 = points[k];
return dist(p1, p2) + dist(p2, p3) + dist(p3, p1);
}
// A recursive function to find minimum cost of polygon triangulation
// The polygon is represented by points[i..j].
double mTC(Point points[], int i, int j)
{
// There must be at least three points between i and j
// (including i and j)
if (j < i+2)

return 0;
// Initialize result as infinite
double res = MAX;
// Find minimum triangulation by considering all possible values of k
for (int k=i+1; k<j; k++)
res = min(res, (mTC(points, i, k) + mTC(points, k, j) +
cost(points, i, k, j)));
return res;
}
// Driver program to test above functions
int main()
{
Point points[] = {{0, 0}, {1, 0}, {2, 1}, {1, 2}, {0, 2}};
int n = sizeof(points)/sizeof(points[0]);
cout << mTC(points, 0, n-1);
return 0;
}

Output:
15.3006

The above problem is similar to Matrix Chain Multiplication. The following is recursion tree for mTC(points[], 0, 4).

It can be easily seen in the above recursion tree that the problem has many overlapping subproblems. Since the problem has both properties:
Optimal Substructure and Overlapping Subproblems, it can be efficiently solved using dynamic programming.
Following is C++ implementation of dynamic programming solution.
// A Dynamic Programming based program to find minimum cost of convex
// polygon triangulation
#include <iostream>
#include <cmath>
#define MAX 1000000.0
using namespace std;
// Structure of a point in 2D plane
struct Point
{
int x, y;
};
// Utility function to find minimum of two double values
double min(double x, double y)
{
return (x <= y)? x : y;
}
// A utility function to find distance between two points in a plane
double dist(Point p1, Point p2)
{
    return sqrt((p1.x - p2.x)*(p1.x - p2.x) +
                (p1.y - p2.y)*(p1.y - p2.y));
}
// A utility function to find cost of a triangle. The cost is considered
// as perimeter (sum of lengths of all edges) of the triangle
double cost(Point points[], int i, int j, int k)
{
    Point p1 = points[i], p2 = points[j], p3 = points[k];
    return dist(p1, p2) + dist(p2, p3) + dist(p3, p1);
}
// A Dynamic programming based function to find minimum cost for convex
// polygon triangulation.
double mTCDP(Point points[], int n)
{
// There must be at least 3 points to form a triangle
if (n < 3)
return 0;
// table to store results of subproblems. table[i][j] stores cost of
// triangulation of points from i to j. The entry table[0][n-1] stores
// the final result.
double table[n][n];
// Fill table using above recursive formula. Note that the table
// is filled in diagonal fashion i.e., from diagonal elements to
// table[0][n-1] which is the result.
for (int gap = 0; gap < n; gap++)
{
for (int i = 0, j = gap; j < n; i++, j++)
{
if (j < i+2)
table[i][j] = 0.0;
else
{
table[i][j] = MAX;
for (int k = i+1; k < j; k++)
{
double val = table[i][k] + table[k][j] + cost(points,i,j,k);
if (table[i][j] > val)
table[i][j] = val;
}
}
}
}
return table[0][n-1];
}
// Driver program to test above functions
int main()
{
Point points[] = {{0, 0}, {1, 0}, {2, 1}, {1, 2}, {0, 2}};
int n = sizeof(points)/sizeof(points[0]);
cout << mTCDP(points, n);
return 0;
}

Output:
15.3006

Time complexity of the above dynamic programming solution is O(n^3).

Please note that the above implementations assume that the points of the convex polygon are given in order (either clockwise or anticlockwise).
Exercise:
Extend the above solution to print triangulation also. For the above example, the optimal triangulation is 0 3 4, 0 1 3, and 1 2 3.
Sources:
http://www.cs.utexas.edu/users/djimenez/utsa/cs3343/lecture12.html
http://www.cs.utoronto.ca/~heap/Courses/270F02/A4/chains/node2.html

Searching for Patterns | Set 1 (Naive Pattern Searching)


Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(char pat[], char txt[]) that prints all occurrences of pat[] in txt[].
You may assume that n > m.
Examples:
1) Input:
txt[] = "THIS IS A TEST TEXT"
pat[] = "TEST"

Output:
Pattern found at index 10

2) Input:
txt[] = "AABAACAADAABAAABAA"
pat[] = "AABA"

Output:
Pattern found at index 0
Pattern found at index 9
Pattern found at index 13

Pattern searching is an important problem in computer science. When we do search for a string in notepad/word file or browser or database,
pattern searching algorithms are used to show the search results.
Naive Pattern Searching:
Slide the pattern over text one by one and check for a match. If a match is found, then slide by 1 again to check for subsequent matches.

C
// C program for Naive Pattern Searching algorithm
#include<stdio.h>
#include<string.h>
void search(char *pat, char *txt)
{
int M = strlen(pat);
int N = strlen(txt);
/* A loop to slide pat[] one by one */
for (int i = 0; i <= N - M; i++)
{
int j;
/* For current index i, check for pattern match */
for (j = 0; j < M; j++)
if (txt[i+j] != pat[j])
break;
if (j == M) // if pat[0...M-1] = txt[i, i+1, ...i+M-1]
printf("Pattern found at index %d \n", i);
}
}
/* Driver program to test above function */
int main()
{
char txt[] = "AABAACAADAABAAABAA";
char pat[] = "AABA";
search(pat, txt);
return 0;
}

Python
# Python program for Naive Pattern Searching
def search(pat, txt):
    M = len(pat)
    N = len(txt)
    # A loop to slide pat[] one by one
    for i in range(N - M + 1):
        # For current index i, check for pattern match
        for j in range(M):
            if txt[i+j] != pat[j]:
                break
        else:
            # Loop finished without break:
            # pat[0...M-1] = txt[i, i+1, ...i+M-1]
            print("Pattern found at index " + str(i))

# Driver program to test the above function
txt = "AABAACAADAABAAABAA"
pat = "AABA"
search(pat, txt)
# This code is contributed by Bhavya Jain

Output:
Pattern found at index 0
Pattern found at index 9
Pattern found at index 13

What is the best case?


The best case occurs when the first character of the pattern is not present in text at all.
txt[] = "AABCCAADDEE"
pat[] = "FAA"

The number of comparisons in best case is O(n).


What is the worst case?
The worst case of Naive Pattern Searching occurs in following scenarios.
1) When all characters of the text and pattern are same.
txt[] = "AAAAAAAAAAAAAAAAAA"
pat[] = "AAAAA"

2) Worst case also occurs when only the last character is different.
txt[] = "AAAAAAAAAAAAAAAAAB"
pat[] = "AAAAB"

Number of comparisons in worst case is O(m*(n-m+1)). Although strings which have repeated characters are not likely to appear in English text,
they may well occur in other applications (for example, in binary texts). The KMP matching algorithm improves the worst case to O(n). We will be
covering KMP in the next post. Also, we will be writing more posts to cover all pattern searching algorithms and data structures.
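The best-case and worst-case counts above can be checked with a small Python sketch that counts character comparisons made by the naive algorithm. The function name count_comparisons is an assumption for illustration; the worst-case inputs are written with string repetition so their lengths (n = 18, m = 5) are explicit.

```python
def count_comparisons(pat, txt):
    """Count character comparisons made by naive pattern searching."""
    m, n = len(pat), len(txt)
    comparisons = 0
    for i in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if txt[i + j] != pat[j]:
                break
    return comparisons


# Best case: the first pattern character never occurs, one comparison per shift.
print(count_comparisons("FAA", "AABCCAADDEE"))      # 9  = n - m + 1
# Worst cases: every shift compares all m characters.
print(count_comparisons("AAAAA", "A" * 18))         # 70 = m * (n - m + 1)
print(count_comparisons("AAAAB", "A" * 17 + "B"))   # 70 = m * (n - m + 1)
```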

Searching for Patterns | Set 2 (KMP Algorithm)


Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(char pat[], char txt[]) that prints all occurrences of pat[] in txt[].
You may assume that n > m.
Examples:
1) Input:
txt[] = "THIS IS A TEST TEXT"
pat[] = "TEST"

Output:
Pattern found at index 10

2) Input:
txt[] = "AABAACAADAABAAABAA"
pat[] = "AABA"

Output:
Pattern found at index 0
Pattern found at index 9
Pattern found at index 13

Pattern searching is an important problem in computer science. When we do search for a string in notepad/word file or browser or database,
pattern searching algorithms are used to show the search results.
We have discussed Naive pattern searching algorithm in the previous post. The worst case complexity of Naive algorithm is O(m(n-m+1)). Time
complexity of KMP algorithm is O(n) in worst case.
KMP (Knuth Morris Pratt) Pattern Searching
The Naive pattern searching algorithm doesn't work well in cases where we see many matching characters followed by a mismatching character.
Following are some examples.
txt[] = "AAAAAAAAAAAAAAAAAB"
pat[] = "AAAAB"
txt[] = "ABABABCABABABCABABABC"
pat[] = "ABABAC" (not a worst case, but a bad case for Naive)

The KMP matching algorithm uses the degenerating property of the pattern (the same sub-patterns appearing more than once in the pattern)
and improves the worst case complexity to O(n). The basic idea behind KMP's algorithm is: whenever we detect a mismatch (after some
matches), we already know some of the characters in the text (since they matched the pattern characters prior to the mismatch). We take
advantage of this information to avoid matching the characters that we know will anyway match.
KMP algorithm does some preprocessing over the pattern pat[] and constructs an auxiliary array lps[] of size m (same as size of pattern). Here
the name lps indicates longest proper prefix which is also a suffix. A proper prefix of a string is any prefix other than the whole string. For each
sub-pattern pat[0..i] where i = 0 to m-1, lps[i] stores the length of the maximum matching proper prefix which is also a suffix of the sub-pattern pat[0..i].
lps[i] = the length of the longest proper prefix of pat[0..i]
which is also a suffix of pat[0..i].

Examples:
For the pattern AABAACAABAA, lps[] is [0, 1, 0, 1, 2, 0, 1, 2, 3, 4, 5]
For the pattern ABCDE, lps[] is [0, 0, 0, 0, 0]
For the pattern AAAAA, lps[] is [0, 1, 2, 3, 4]
For the pattern AAABAAA, lps[] is [0, 1, 2, 0, 1, 2, 3]
For the pattern AAACAAAAAC, lps[] is [0, 1, 2, 0, 1, 2, 3, 3, 3, 4]
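The example arrays can be reproduced straight from the definition. The Python sketch below is a deliberately slow brute-force check, not the linear-time preprocessing used by KMP; the name lps_by_definition is an assumption for illustration.

```python
def lps_by_definition(pat):
    """lps[i] = length of the longest proper prefix of pat[0..i]
    that is also a suffix of pat[0..i], computed by brute force."""
    lps = []
    for i in range(len(pat)):
        sub = pat[:i + 1]
        best = 0
        # k < len(sub) keeps the prefix proper (never the whole string).
        for k in range(1, len(sub)):
            if sub[:k] == sub[-k:]:
                best = k
        lps.append(best)
    return lps


print(lps_by_definition("AABAACAABAA"))  # [0, 1, 0, 1, 2, 0, 1, 2, 3, 4, 5]
print(lps_by_definition("AAACAAAAAC"))   # [0, 1, 2, 0, 1, 2, 3, 3, 3, 4]
```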
Searching Algorithm:
Unlike the Naive algo where we slide the pattern by one, we use a value from lps[] to decide the next sliding position. Let us see how we do that.
When we compare pat[j] with txt[i] and see a mismatch, we know that characters pat[0..j-1] match with txt[i-j..i-1], and we also know that
the first lps[j-1] characters of pat[0..j-1] are both a proper prefix and a suffix of it, which means we do not need to match these lps[j-1]
characters against the end of txt[i-j..i-1] because we know that these characters will anyway match. See KMPSearch() in the below code for details.
Preprocessing Algorithm:
In the preprocessing part, we calculate values in lps[]. To do that, we keep track of the length of the longest prefix suffix value (we use len variable
for this purpose) for the previous index. We initialize lps[0] and len as 0. If pat[len] and pat[i] match, we increment len by 1 and assign the
incremented value to lps[i]. If pat[i] and pat[len] do not match and len is not 0, we update len to lps[len-1] without advancing i. If they do not
match and len is 0, we set lps[i] to 0 and move on to the next i. See computeLPSArray() in the below code for details.
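The fallback step (updating len to lps[len-1]) is the least obvious part of the preprocessing. The Python sketch below follows the same rules and records every fallback so the behaviour on AAACAAAA can be inspected. The function name compute_lps_traced and the (i, old_len, new_len) trace format are assumptions made for illustration.

```python
def compute_lps_traced(pat):
    """Return (lps, fallbacks); each fallback is (i, old_len, new_len)."""
    lps = [0] * len(pat)
    fallbacks = []
    length = 0  # length of the previous longest prefix suffix
    i = 1
    while i < len(pat):
        if pat[i] == pat[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length != 0:
            # Retry with the next shorter candidate; i is not advanced.
            fallbacks.append((i, length, lps[length - 1]))
            length = lps[length - 1]
        else:
            lps[i] = 0
            i += 1
    return lps, fallbacks


lps, fallbacks = compute_lps_traced("AAACAAAA")
print(lps)        # [0, 1, 2, 0, 1, 2, 3, 3]
print(fallbacks)  # [(3, 2, 1), (3, 1, 0), (7, 3, 2)]
```

At i = 7 the candidate length 3 fails (pat[7] is 'A' but pat[3] is 'C'), so len falls back to lps[2] = 2, after which pat[7] matches pat[2] and lps[7] becomes 3.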

C
// C program for implementation of KMP pattern searching
// algorithm
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
void computeLPSArray(char *pat, int M, int *lps);
void KMPSearch(char *pat, char *txt)
{
int M = strlen(pat);
int N = strlen(txt);
// create lps[] that will hold the longest prefix suffix
// values for pattern
int *lps = (int *)malloc(sizeof(int)*M);
int j = 0; // index for pat[]
// Preprocess the pattern (calculate lps[] array)
computeLPSArray(pat, M, lps);
int i = 0; // index for txt[]
while (i < N)
{
if (pat[j] == txt[i])
{
j++;
i++;
}
if (j == M)
{
printf("Found pattern at index %d \n", i-j);
j = lps[j-1];
}
// mismatch after j matches
else if (i < N && pat[j] != txt[i])
{
// Do not match the first lps[j-1] characters of pat[],
// they will match anyway
if (j != 0)
j = lps[j-1];
else
i = i+1;
}
}
free(lps); // to avoid memory leak
}
void computeLPSArray(char *pat, int M, int *lps)
{
int len = 0; // length of the previous longest prefix suffix
int i;
lps[0] = 0; // lps[0] is always 0
i = 1;
// the loop calculates lps[i] for i = 1 to M-1
while (i < M)
{
if (pat[i] == pat[len])
{
len++;
lps[i] = len;
i++;
}
else // (pat[i] != pat[len])
{
if (len != 0)
{
// This is tricky. Consider the example
// AAACAAAA and i = 7.
len = lps[len-1];
// Also, note that we do not increment i here
}
else // if (len == 0)

{
lps[i] = 0;
i++;
}
}
}
}
// Driver program to test above function
int main()
{
char *txt = "ABABDABACDABABCABAB";
char *pat = "ABABCABAB";
KMPSearch(pat, txt);
return 0;
}

Python
# Python program for KMP Algorithm
def KMPSearch(pat, txt):
    M = len(pat)
    N = len(txt)
    # create lps[] that will hold the longest prefix suffix
    # values for pattern
    lps = [0]*M
    j = 0 # index for pat[]
    # Preprocess the pattern (calculate lps[] array)
    computeLPSArray(pat, M, lps)
    i = 0 # index for txt[]
    while i < N:
        if pat[j] == txt[i]:
            i += 1
            j += 1
        if j == M:
            print("Found pattern at index " + str(i-j))
            j = lps[j-1]
        # mismatch after j matches
        elif i < N and pat[j] != txt[i]:
            # Do not match the first lps[j-1] characters of pat[],
            # they will match anyway
            if j != 0:
                j = lps[j-1]
            else:
                i += 1

def computeLPSArray(pat, M, lps):
    len = 0 # length of the previous longest prefix suffix
    lps[0] = 0 # lps[0] is always 0
    i = 1
    # the loop calculates lps[i] for i = 1 to M-1
    while i < M:
        if pat[i] == pat[len]:
            len += 1
            lps[i] = len
            i += 1
        else:
            if len != 0:
                # This is tricky. Consider the example AAACAAAA
                # and i = 7
                len = lps[len-1]
                # Also, note that we do not increment i here
            else:
                lps[i] = 0
                i += 1

txt = "ABABDABACDABABCABAB"
pat = "ABABCABAB"
KMPSearch(pat, txt)
# This code is contributed by Bhavya Jain

Output:
Found pattern at index 10
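To see the O(n) behaviour on an input that is bad for the naive algorithm, the following Python sketch counts character comparisons made during a KMP search. It is a restructured variant with one comparison per loop iteration, not the listing above; the names compute_lps and kmp_search_counting are assumptions for illustration, and a non-empty pattern is assumed.

```python
def compute_lps(pat):
    """Longest-proper-prefix-which-is-also-suffix table for pat."""
    lps = [0] * len(pat)
    length, i = 0, 1
    while i < len(pat):
        if pat[i] == pat[length]:
            length += 1
            lps[i] = length
            i += 1
        elif length != 0:
            length = lps[length - 1]
        else:
            i += 1  # lps[i] is already 0
    return lps


def kmp_search_counting(pat, txt):
    """Return (match_indices, comparisons) for a non-empty pattern."""
    lps = compute_lps(pat)
    found, comparisons = [], 0
    i = j = 0
    while i < len(txt):
        comparisons += 1
        if pat[j] == txt[i]:
            i += 1
            j += 1
            if j == len(pat):
                found.append(i - j)
                j = lps[j - 1]
        elif j != 0:
            j = lps[j - 1]  # shift the pattern, keep i in place
        else:
            i += 1
    return found, comparisons


txt = "A" * 17 + "B"
found, comparisons = kmp_search_counting("AAAAB", txt)
print(found)                         # [13]
print(comparisons <= 2 * len(txt))   # True
```

Every iteration either advances i or strictly decreases j, and j can only decrease as often as it has increased, so the loop runs at most 2n times. On the same input the naive algorithm compares every character at every shift, giving m*(n-m+1) = 70 comparisons.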

Searching for Patterns | Set 3 (Rabin-Karp Algorithm)


Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(char pat[], char txt[]) that prints all occurrences of pat[] in txt[].
You may assume that n > m.
Examples:
1) Input:
txt[] = "THIS IS A TEST TEXT"
pat[] = "TEST"

Output:
Pattern found at index 10
2) Input:
txt[] = "AABAACAADAABAAABAA"
pat[] = "AABA"

Output:
Pattern found at index 0
Pattern found at index 9
Pattern found at index 13

The Naive String Matching algorithm slides the pattern one by one. After each slide, it one by one checks characters at the current shift and if all
characters match then prints the match.
Like the Naive Algorithm, Rabin-Karp algorithm also slides the pattern one by one. But unlike the Naive algorithm, Rabin Karp algorithm matches
the hash value of the pattern with the hash value of current substring of text, and if the hash values match then only it starts matching individual
characters. So Rabin Karp algorithm needs to calculate hash values for following strings.
1) Pattern itself.
2) All the substrings of text of length m.
Since we need to efficiently calculate hash values for all the substrings of size m of text, we must have a hash function which has following property.
Hash at the next shift must be efficiently computable from the current hash value and next character in text or we can say hash(txt[s+1 .. s+m])
must be efficiently computable from hash(txt[s .. s+m-1]) and txt[s+m] i.e., hash(txt[s+1 .. s+m]) = rehash(txt[s+m], hash(txt[s .. s+m-1])) and rehash must be O(1) operation.
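One way to obtain such an O(1) rehash is a polynomial hash taken modulo a prime. The Python sketch below is illustrative only: the base d = 256, the modulus q = 101, and the function names are assumptions, not values fixed by the text. Note that this rehash also uses the outgoing character txt[s] in addition to the incoming character txt[s+m].

```python
def direct_hash(s, d=256, q=101):
    """Hash of s computed from scratch in O(len(s))."""
    value = 0
    for ch in s:
        value = (d * value + ord(ch)) % q
    return value


def rolling_hashes(txt, m, d=256, q=101):
    """Hashes of all length-m windows of txt, each update in O(1)."""
    n = len(txt)
    if m == 0 or m > n:
        return []
    h = pow(d, m - 1, q)           # weight of the leading character
    cur = direct_hash(txt[:m], d, q)
    hashes = [cur]
    for s in range(n - m):
        # Remove txt[s], shift by one position, append txt[s+m].
        cur = (d * (cur - ord(txt[s]) * h) + ord(txt[s + m])) % q
        hashes.append(cur)
    return hashes


txt = "AABAACAADAABAAABAA"
m = 4
expected = [direct_hash(txt[s:s + m]) for s in range(len(txt) - m + 1)]
print(rolling_hashes(txt, m) == expected)  # True
```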
The hash function suggested by Rabin and Karp calculates an integer value. The integer value for a string is numeric value of a string. For example,
if all possible characters are from 1 to 10, the numeric value of 122 will be 122. The number of possible characters is higher than 10 (256 in
general) and pattern length can be large. So the numeric values cannot be practically stored as an integer. Therefore, the numeric value is calculated
using modular arithmetic to make sure that the hash valu