II CSE III Sem Advanced Data Structures & Algorithm
Unit-I
Introduction to Algorithms: Algorithm, Pseudo code for expressing algorithms, Performance Analysis: Space
Complexity, Time Complexity, Asymptotic notation: Big-oh (O), Omega, Theta notation and little-oh
notation, Polynomial vs Exponential Algorithms, Average, best and worst case complexities, Analysis of
recursive programs.
Algorithm:
An algorithm is a problem-solving technique. It can be defined as a step-by-step procedure to solve a
particular problem. It consists of English-like statements, each of which must be precise and well defined
to perform a specific operation.
Characteristics of an algorithm:
Each and every algorithm is characterized by the following five important characteristics.
Input: It may accept zero or more inputs.
Output: It should produce at least one output.
Definiteness: Each instruction must be clear, well defined and precise. There should not be any ambiguity.
Finiteness: It should be a sequence of finitely many instructions; that is, it should terminate after a finite
number of steps and should not enter an infinite loop.
Effectiveness: Each operation must be basic enough that it can be carried out exactly and in a finite
amount of time.
Advantages of an Algorithm
o Effective Communication: Since it is written in a natural language like English, it becomes easy to
understand the step-by-step delineation of a solution to any particular problem.
o Easy Debugging: A well-designed algorithm facilitates easy debugging to detect the logical errors
that occurred inside the program.
o Easy and Efficient Coding: An algorithm is nothing but a blueprint of a program; it makes
developing the actual program easier.
o Independent of Programming Language: Since it is language-independent, it can easily be coded
in any high-level language.
Disadvantages of an Algorithm
o Developing algorithms for complex problems is time-consuming and difficult.
o It is a challenging task to understand complex logic through algorithms.
Examples:
Write an algorithm to add two numbers entered by user.
Step 1: Start
Step 2: Declare variables num1, num2 and sum.
Step 3: Read values num1 and num2.
Step 4: Add num1 and num2 and assign the result to sum.
sum←num1+num2
Step 5: Display sum
Step 6: Stop
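As a minimal sketch, the same steps can be written in C (the variable names num1, num2 and sum mirror the algorithm above):

#include <stdio.h>

int main(void)
{
    int num1, num2, sum;            /* Step 2: declare variables */

    printf("Enter two numbers: ");
    scanf("%d %d", &num1, &num2);   /* Step 3: read values */

    sum = num1 + num2;              /* Step 4: sum <- num1 + num2 */

    printf("sum = %d\n", sum);      /* Step 5: display sum */
    return 0;                       /* Step 6: stop */
}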
Pseudo code for expressing algorithms:
Pseudo code Conventions:
An algorithm can be described in three ways:
Natural language in English
Graphic representation called flow chart
Pseudo Code method
Pseudo-code refers to an informal high-level description of the operating principle of a computer
program or other algorithm. It uses the structural conventions of a standard programming language but is
intended for human reading rather than machine reading.
Pseudo-code is a notation system for writing algorithms. The pseudo-code notation specifies operations
that a machine can perform in as human-friendly (e.g., easy to read) a way as possible, while avoiding
ambiguity.
In this method we typically represent algorithms as programs that resemble the C language.
Advantages of Pseudo-code:
o Since it is similar to a programming language, it can be quickly transformed into the actual
programming language than a flowchart.
o The layman can easily understand it.
o Easily modifiable as compared to the flowcharts.
o Its implementation is beneficial for structured, designed elements.
o Errors can be detected easily before the pseudo-code is transformed into actual code.
Disadvantages of Pseudo-code:
o Since it does not incorporate any standardized style or format, it can vary from one company to
another.
o The possibility of error is higher while transforming pseudo-code into code.
o It may require a tool to extract the pseudo-code and to draw flowcharts from it.
o It does not depict the design.
The following conventions are used when presenting algorithms in pseudo-code.
1. Comments begin with // and continue until the end of line.
2. Blocks are indicated with matching braces { and }.
3. An identifier begins with a letter. The data types of variables are not explicitly declared.
4. Assignment of values to variables is done using the assignment statement.
‹variable› := ‹expression›;
5. There are two Boolean values, true and false. These values are produced by the logical operators
and the relational operators listed below.
[2]
II CSE III Sem Advanced Data Structures & Algorithm
Logical operators: AND, OR, NOT
Relational operators: <, ≤, =, ≠, >, ≥
6. Elements of multidimensional arrays are accessed using [ and ]. For example, if A is a two-
dimensional array, the (i, j)th element of the array is denoted as A[i, j]. Array indices start at zero.
7. The following looping statements are employed: for, while, and repeat until.
while loop:
    while ‹condition› do
    {
        ‹statement 1›
        .
        .
        ‹statement n›
    }
for loop:
    for variable := value1 to value2 step step-value do
    {
        ‹statement 1›
        .
        .
        ‹statement n›
    }
repeat-until:
    repeat
        ‹statement 1›
        .
        .
        ‹statement n›
    until ‹condition›
(A C sketch of these three loop forms appears after this list.)
8. A conditional statement has the following forms:
if ‹condition› then ‹statement›
if ‹condition› then ‹statement 1› else ‹statement 2›
case statement:
case
{
:‹condition 1›: ‹statement 1›
..
:‹condition n›: ‹statement n›
:else: ‹statement n+1›
}
9. Input and output are done using the instructions read and write.
10. There is only one type of procedure: Algorithm. An algorithm contains two parts, a heading and a
body, and takes the form:
    Algorithm Name(‹parameter list›)    // heading
    {
        ……                              // body
        ……
    }
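For comparison, here is a minimal C sketch of the three loop forms from convention 7 (C's do-while plays the role of repeat-until, except that it continues while the condition is true rather than until it becomes true):

#include <stdio.h>

int main(void)
{
    int i;

    /* while loop: the condition is tested before each iteration */
    i = 1;
    while (i <= 3) {
        printf("while: %d\n", i);
        i++;
    }

    /* for loop: initialization, condition and step in one header */
    for (i = 1; i <= 3; i++)
        printf("for: %d\n", i);

    /* do-while: the body runs at least once, like repeat-until,
       but it repeats WHILE the condition holds, not UNTIL it holds */
    i = 1;
    do {
        printf("do-while: %d\n", i);
        i++;
    } while (i <= 3);

    return 0;
}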
Example Algorithm:
    Algorithm Max(A, n)
    // A is an array of size n.
    {
        Result := A[1];
        for i := 2 to n do
            if A[i] > Result then
                Result := A[i];
        return Result;
    }
Trace for n = 5 with A[1] = 10, A[2] = 87, A[3] = 45, A[4] = 66, A[5] = 99: Result starts at 10, is
updated to 87 when i = 2, and to 99 when i = 5, so the algorithm returns 99.
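A runnable C version of Max is sketched below (note that the array is 0-indexed, as C requires, unlike the 1-indexed pseudo-code):

#include <stdio.h>

/* Returns the largest element of the n-element array A. */
int max(const int A[], int n)
{
    int result = A[0];              /* Result := A[1] in the pseudo-code */
    for (int i = 1; i < n; i++)     /* for i := 2 to n do */
        if (A[i] > result)
            result = A[i];
    return result;
}

int main(void)
{
    int A[] = {10, 87, 45, 66, 99};
    printf("%d\n", max(A, 5));      /* prints 99 */
    return 0;
}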
Performance Analysis:
In computer science, there are multiple algorithms to solve a problem. When we have
more than one algorithm to solve a problem, we need to select the best one. Performance analysis helps us
to select the best algorithm from multiple algorithms to solve a problem.
When there are multiple alternative algorithms to solve a problem, we analyze them and
pick the one best suited to our requirements. The formal definition is as follows.
Performance of an algorithm is a process of making evaluative judgments about algorithms.
Generally, the performance of an algorithm depends on the following elements...
1. Whether that algorithm is providing the exact solution for the problem?
2. Whether it is easy to understand?
3. Whether it is easy to implement?
4. How much space (memory) it requires to solve the problem?
5. How much time does it take to solve the problem?
Based on this information, performance analysis of an algorithm can also be defined as follows.
Performance analysis of an algorithm is the process of calculating space and time required by that
algorithm.
Performance analysis of an algorithm is performed by using the following measures.
1. Space required to complete the task of that algorithm (Space Complexity). It includes program
space and data space
2. Time required to complete the task of that algorithm (Time Complexity)
Space complexity:
When we design an algorithm to solve a problem, it needs some computer memory to complete its
execution. For any algorithm, memory is required for the following purposes...
1. To store program instructions.
2. To store constant values.
3. To store variable values.
4. And for few other things like function calls, jumping statements etc,.
Space complexity of an algorithm can be defined as follows.
Total amount of computer memory required by an algorithm to complete its execution is called
as space complexity of that algorithm.
Space needed by an algorithm is given by
S(P) = C(fixed part) + Sp(variable part)
fixed part: Independent of instance characteristics of the inputs and outputs. This part typically includes
the instruction space, space for simple variables and fixed – size component variables.
Ex: Space for simple variables, constants etc.
variable part: Space for variables whose size is dependent on particular problem instance being solved,
the space needed by referenced variables, and the recursive stack space.
Ex: arrays
Examples for space complexity:
Algorithm Max(A, n)
// A is an array of size n.
{
    Result := A[1];
    for i := 2 to n do
        if A[i] > Result then
            Result := A[i];
    return Result;
}
Variables i, n and Result take 1 unit each, and array A takes n units, so S(P) = n + 3 units.
Algorithm-1:
Algorithm abc(a, b, c)
{
    return a+b+b*c+(a+b-c)/(a+b)+4.0;
}
Variables a, b and c take 1 unit each, so S(P) = 3 units.
Algorithm-2:
Algorithm Sum(a, n)
{
    s := 0.0;
    for i := 1 to n do
        s := s + a[i];
    return s;
}
Variables s, i and n take 1 unit each, and array a takes n units, so S(P) = n + 3 units.
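A C sketch of Sum, with comments marking which storage belongs to the fixed part and which to the variable part of S(P) (the breakdown mirrors the unit counts above):

#include <stdio.h>

/* Sums the n elements of array a.
 * Fixed part:    s, i, n  -> 1 unit each (3 units)
 * Variable part: a        -> n units
 * Total: S(P) = n + 3 units
 */
double sum(const double a[], int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s = s + a[i];
    return s;
}

int main(void)
{
    double a[] = {1.0, 2.0, 3.0, 4.0};
    printf("%.1f\n", sum(a, 4));    /* prints 10.0 */
    return 0;
}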
Time Complexity:
Every algorithm requires some amount of computer time to execute its instruction to perform the task.
This computer time required is called time complexity.
The time complexity of an algorithm can be defined as follows.
The time complexity of an algorithm is the total amount of time required by an algorithm
to complete its execution.
T(P) = compile time + execution time
Since compile time does not depend on the instance characteristics, we take T(P) ≈ Tp, the execution time.
Step count:
For algorithm heading → 0
For braces → 0
For expressions → 1
For any looping statement → the number of times the loop condition is evaluated (n+1 for a loop that
iterates n times); each statement inside the loop is counted once per iteration.
Examples for Time complexity:
Algorithm-1:
Algorithm abc(a,b,c) →0
{ →0
return a+b+b*c+(a+b-c)/(a+b)+4.0; →1
} →0
Total step count →1
Algorithm-2
Algorithm Sum(a, n) →0
{ →0
s:=0.0; →1
for i:=1 to n do → n+1
s := s + a[i]; →n
return s; →1
} →0
Total step count → 2n+3
Asymptotic notation:
In mathematical analysis, asymptotic analysis of an algorithm is a method of describing the mathematical
bound on its run-time performance. Using asymptotic analysis, we can easily determine the average-case,
best-case and worst-case behavior of an algorithm.
It is used to mathematically estimate the running time of any operation inside an algorithm.
Example: Suppose the running time of one operation is f(n) and that of another operation is f(n^2). The
running time of the first operation increases linearly with 'n', while the running time of the second
increases quadratically. The running times of the two operations are nearly the same when n is
significantly small. Usually, the time required by an algorithm comes under three types:
Worst case: It defines the input for which the algorithm takes the maximum time.
Average case: It defines the time taken for program execution averaged over the possible inputs.
Best case: It defines the input for which the algorithm takes the least time.
The commonly used asymptotic notations for describing the running-time complexity of an algorithm
are given below:
o Big oh Notation (O)
o Omega Notation (Ω)
o Theta Notation (θ)
Big oh Notation (O):
o Big O notation is an asymptotic notation that measures the performance of an algorithm by simply
providing the order of growth of the function.
o This notation provides an upper bound on a function, which ensures that the function never grows
faster than this bound.
It is the formal way to express the upper boundary of an algorithm's running time. It measures the worst
case of time complexity, i.e., the longest amount of time the algorithm can take to complete its operation.
For example:
If f(n) and g(n) are two functions defined for positive integers, then f(n) = O(g(n)) (read "f(n) is big-oh
of g(n)", or "f(n) is of the order of g(n)") if there exist positive constants c and n0 such that:
f(n) ≤ c.g(n) for all n ≥ n0
This implies that f(n) does not grow faster than g(n), or g(n) is an upper bound on the function f(n). In this
case, we are calculating the growth rate of the function which eventually calculates the worst time
complexity of a function, i.e., how worst an algorithm can perform.
Let's understand through examples
Example 1: f(n)=2n+3 , g(n)=n
Now we have to determine whether f(n) = O(g(n)).
For f(n) = O(g(n)) to hold, the given condition must be satisfied:
f(n) <= c.g(n)
First, we will replace f(n) by 2n+3 and g(n) by n.
2n+3 <= c.n
Let's assume c=5, n=1 then
2*1+3<=5*1
5<=5
For n=1, the above condition is true.
If n=2
2*2+3<=5*2
7<=10
For n=2, the above condition is true.
For c = 5, the condition 2n+3 <= c.n is satisfied for every value of n starting from 1. Therefore, there
exist constants c and n0 (here c = 5, n0 = 1) such that 2n+3 <= c.n for all n >= n0. As the condition is
satisfied, f(n) is big-oh of g(n); in other words, f(n) grows at most linearly. Therefore, it follows that
c.g(n) is an upper bound of f(n).
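As an illustrative check (not a proof), a few lines of C can verify the inequality 2n+3 <= 5n over a range of n:

#include <stdio.h>

int main(void)
{
    /* Check f(n) = 2n+3 <= c*g(n) = 5n for n = 1..20 (c = 5, n0 = 1). */
    for (int n = 1; n <= 20; n++) {
        int f = 2 * n + 3;
        int cg = 5 * n;
        printf("n=%2d: %3d <= %3d : %s\n", n, f, cg,
               f <= cg ? "true" : "false");
    }
    return 0;
}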
Omega Notation (Ω):
o It basically describes the best-case scenario which is opposite to the big o notation.
o It is the formal way to represent the lower bound of an algorithm's running time. It measures the
best amount of time an algorithm can possibly take to complete or the best-case time complexity.
o It determines what is the fastest time that an algorithm can run.
If we require that an algorithm take at least a certain amount of time (a lower bound rather than an upper
bound), we use big-Ω notation, i.e., the Greek letter "omega". It is used to bound the growth of the
running time from below for large input sizes.
If f(n) and g(n) are two functions defined for positive integers, then f(n) = Ω(g(n)) (read "f(n) is omega
of g(n)") if there exist constants c > 0 and n0 such that:
f(n) >= c.g(n) for all n >= n0
Let's consider a simple example.
If f(n) = 2n+3, g(n) = n,
Is f(n)= Ω (g(n))?
It must satisfy the condition:
f(n)>=c.g(n)
To check the above condition, we first replace f(n) by 2n+3 and
g(n) by n.
2n+3>=c*n
Suppose c=1:
2n+3 >= n (this inequality holds for any value of n starting from 1).
Therefore, it is proved that f(n) = 2n+3 is big omega of g(n) = n.
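The same style of numeric check works for the lower bound; this short C sketch verifies 2n+3 >= 1·n:

#include <stdio.h>

int main(void)
{
    /* Check f(n) = 2n+3 >= c*g(n) = 1*n for n = 1..20 (c = 1, n0 = 1). */
    for (int n = 1; n <= 20; n++)
        printf("n=%2d: %3d >= %3d : %s\n", n, 2 * n + 3, n,
               2 * n + 3 >= n ? "true" : "false");
    return 0;
}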
Theta Notation (θ):
o The theta notation mainly describes the average case scenarios.
o It represents the realistic time complexity of an algorithm. Every time, an algorithm does not
perform worst or best, in real-world problems, algorithms mainly fluctuate between the worst-case
and best-case, and this gives us the average case of the algorithm.
o Big theta is mainly used when the value of worst-case and the best-case is same.
o It is the formal way to express both the upper bound and lower bound of an algorithm running time.
Let's understand the big theta notation mathematically:
Let f(n) and g(n) be the functions of n where n is the steps required to execute the program then:
f(n) = θ(g(n))
The above condition is satisfied only when
c1.g(n)<=f(n)<=c2.g(n)
where the function is bounded by two limits, i.e., an upper and a lower limit, and f(n) lies in between.
The condition f(n) = θ(g(n)) holds if and only if c1.g(n) is less than or equal to f(n) and c2.g(n) is
greater than or equal to f(n).
Let’s consider the same example where
f(n)=2n+3
g(n)=n
As c1.g(n) should be less than or equal to f(n), c1 can be taken as 1, whereas c2.g(n) should be greater
than or equal to f(n), so c2 can be taken as 5. Here c1.g(n) is the lower limit of f(n) while c2.g(n) is the
upper limit of f(n).
c1.g(n)<=f(n)<=c2.g(n)
Replace g(n) by n and f(n) by 2n+3:
c1.n <=2n+3<=c2.n
if c1=1, c2=5, n=1
1*1 <=2*1+3 <=5*1
1 <= 5 <= 5 // for n=1, it satisfies the condition c1.g(n)<=f(n)<=c2.g(n)
If n=2
1*2<=2*2+3<=5*2
2<=7<=10 // for n=2, it satisfies the condition c1.g(n)<=f(n)<=c2.g(n)
Therefore, we can say that for any value of n >= 1, it satisfies the condition c1.g(n)<=f(n)<=c2.g(n).
Hence, it is proved that f(n) is big theta of g(n). So, this is the average-case scenario which provides the
realistic time complexity.
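The constants can also be derived rather than guessed; as a short sketch of the reasoning:
Lower bound: c1.n <= 2n+3 holds with c1 = 1, since n <= 2n+3 for all n >= 1.
Upper bound: 2n+3 <= c2.n requires c2 >= 2 + 3/n; since 3/n <= 3 for n >= 1, c2 = 5 works for all n >= 1.
Hence, with c1 = 1, c2 = 5 and n0 = 1, we have 1.n <= 2n+3 <= 5.n for all n >= n0, so 2n+3 = θ(n).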
Polynomial Vs Exponential Algorithms:
The time complexity (generally referred as running time) of an algorithm is expressed as
the amount of time taken by an algorithm for some size of the input to the problem. Big O notation is
commonly used to express the time complexity of any algorithm as this suppresses the lower order terms
and is described asymptotically. Time complexity is estimated by counting the operations (provided as
instructions in a program) performed in an algorithm. Here each operation takes a fixed amount of time in
execution. Generally time complexities are classified as constant, linear, logarithmic, polynomial,
exponential etc. Among these the polynomial and exponential are the most prominently considered
and defines the complexity of an algorithm. These two parameters for any algorithm are always influenced
by size of input.
Polynomial Running Time:
An algorithm is said to be solvable in polynomial time if the number of steps required to
complete the algorithm for a given input is O(n^k) for some non-negative integer k, where n is the
size of the input. Polynomial-time algorithms are said to be "fast." Most familiar mathematical
operations such as addition, subtraction, multiplication, and division, as well as computing square roots,
powers, and logarithms, can be performed in polynomial time. Computing the digits of most interesting
mathematical constants, including pi and e, can also be done in polynomial time.
All basic arithmetic operations (i.e., addition, subtraction, multiplication, division),
comparison operations and sort operations are considered polynomial-time algorithms.
Exponential Running Time:
Exponential-time problems are those which can be solved by an exponential-time algorithm but for which
no polynomial-time algorithm is known. An algorithm is said to be exponential time if T(n) is upper
bounded by 2^poly(n), where poly(n) is some polynomial in n. More formally, an algorithm is exponential
time if T(n) is bounded by O(2^(n^k)) for some constant k.
As an example of exponential growth: we are used to the idea that multiplication is a process of
repeated addition. Similarly, exponentiation is a process of repeated multiplication.
2^0 = 1
2^1 = 2
2^2 = 2 × 2 = 4
2^3 = 2 × 2 × 2 = 8
2^4 = 2 × 2 × 2 × 2 = 16
Algorithms which have exponential time complexity grow much faster than polynomial algorithms.
The key difference lies in where the variable appears in the expression for the running time. Expressions
with polynomial time complexity have the variable in the bases of their terms.
Example: n^3 + 2n^2 + 1. Notice n is in the base, NOT the exponent. In exponential expressions, the
variable is in the exponent. Example: 2^n. As said before, exponential time grows much faster. If n is
equal to 1000 (a reasonable input for an algorithm), then notice 1000^3 is 1 billion, while 2^1000 is
astronomically large!
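A small C sketch makes the gap concrete by printing n^3 beside 2^n for small n:

#include <stdio.h>

int main(void)
{
    /* Compare polynomial growth (n^3) with exponential growth (2^n). */
    for (int n = 1; n <= 20; n++) {
        unsigned long long poly = (unsigned long long)n * n * n; /* n^3 */
        unsigned long long expo = 1ULL << n;                     /* 2^n */
        printf("n=%2d  n^3=%6llu  2^n=%9llu\n", n, poly, expo);
    }
    return 0;
}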
Analysis of recursive programs:
Some computer programming languages allow a module or function to call itself. This
technique is known as recursion. In recursion, a function α either calls itself directly or calls a
function β that in turn calls the original function α. The function α is called recursive function.
Analysis of Recursion:
One may ask why we use recursion, as the same task can be done with iteration. The first
reason is that recursion makes a program more readable; and on modern, enhanced CPU systems, the
overhead of recursion is often small enough that its performance is comparable to iteration.
Time Complexity:
In the case of iteration, we count the number of iterations to determine the time complexity. Likewise,
in the case of recursion, assuming everything else is constant, we try to figure out the number of times a
recursive call is made. A single function call is O(1); hence, when a recursive call is made n times, the
recursive function is O(n).
Space Complexity:
Space complexity is counted as the amount of extra space required for a module to
execute. In the case of iteration, the compiler hardly requires any extra space; it simply keeps updating
the values of the variables used in the iteration. But in the case of recursion, the system needs to store an
activation record each time a recursive call is made. Hence, the space complexity of a recursive
function may be higher than that of a function with iteration.
Example algorithm:
Algorithm fact(n)
{
    if n = 0 then
        return 1;
    else
        return n * fact(n-1);
}
The running time satisfies the recurrence T(n) = T(n-1) + 3 if n > 0, with T(0) equal to some constant c.
Unrolling the recurrence gives T(n) = T(n-1) + 3 = T(n-2) + 6 = … = T(0) + 3n, so T(n) = O(n).
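A runnable C version of the factorial algorithm is sketched below (unsigned long long limits it to n <= 20 before overflow):

#include <stdio.h>

/* Recursive factorial: each call does O(1) work and makes one
 * recursive call, giving T(n) = T(n-1) + c and hence O(n) time.
 * The call stack holds up to n+1 activation records, so the
 * extra space is also O(n). */
unsigned long long fact(unsigned int n)
{
    if (n == 0)
        return 1;
    return n * fact(n - 1);
}

int main(void)
{
    printf("fact(5) = %llu\n", fact(5));   /* prints 120 */
    return 0;
}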