Chapter One

Introduction to Data Structures
and Basic Algorithm Analysis
What is Data Structure?
• A program is a set of instructions written in order to solve a
problem.
• A solution to a problem actually consists of two things:
A way to organize the data.
Sequence of steps to solve the problem
• The way data are organized in a computer's memory is called a data
structure, and the sequence of computational steps to solve a problem is
called an algorithm. Therefore, a program is data structures plus
algorithms:
program = data structures + algorithms
• A data structure is defined as a particular way of storing and organizing
data in our devices to use the data efficiently and effectively.
• The main idea behind using data structures is to minimize the time and
space complexities.
• An efficient data structure takes minimum memory space and requires
minimum time to process the data.
Classification of Data Structure:

 Linear data structure: Data structure in which data elements are arranged sequentially or
linearly, where each element is attached to its previous and next adjacent elements.
 Example: array, stack, queue, linked list.
 Non-linear data structure: Data structures where data elements are not placed sequentially
or linearly.
 In a non-linear data structure, we cannot traverse all the elements in a single pass.
– Examples: trees and graphs.
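The distinction shows up in the links each element carries. A minimal C++ sketch (the struct and function names are illustrative, not from the text):

```cpp
// Linear: each node links to exactly one successor, so a single
// pointer walk visits every element in one pass.
struct ListNode {
    int value;
    ListNode* next;
};

// Non-linear: each node may branch to two children; no single chain
// of links reaches every element in one run.
struct TreeNode {
    int value;
    TreeNode* left;
    TreeNode* right;
};

// Traverse the linear structure in a single run and sum its values.
int listSum(ListNode* head) {
    int sum = 0;
    for (ListNode* p = head; p != nullptr; p = p->next)
        sum += p->value;
    return sum;
}
```

A list is walked by following `next` pointers one after another; a tree has no such single chain, which is why a non-linear structure cannot be traversed in one pass.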
Cont...
• Once we are given a problem, the first step in solving it is obtaining
one's own abstract view, or model, of the problem. This process
of modeling is called abstraction.
• The model defines an abstract view of the problem. This implies that the
model focuses only on problem-related things, and a programmer tries to
define the properties of the problem.
– These properties include
– The data which are affected and
– The operations that are involved in the problem.
• With abstraction you create a well-defined entity that can be properly
handled. These entities define the data structure of the program.
• An entity with the properties just described is called an abstract data type
(ADT).
Abstract Data Types
• An ADT consists of data to be stored and operations supported on them.
• ADT is a specification that describes a data set and the operations on that data.
– The ADT specifies:
1. What data can be stored
2. What operations can be done on the data. However, it does not
specify how to store the data or how to implement the operations.
• It is language independent
• For example, if we are going to model employees of an organization:
– This ADT stores employees with their relevant attributes, discarding
irrelevant ones.
Relevant: Name, ID, Sex, Age, Salary, Dept, Address
Irrelevant: weight, color, height
– This ADT supports hiring, firing, retiring, … operations.
• A data structure is a language construct that the programmer has defined in order
to implement an abstract data type.
• There are lots of formalized and standard Abstract data types such as Stacks,
Queues, Trees, Lists, Graph etc.
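The employee ADT above can be illustrated with a C++ sketch. The class name, the `hire`/`fire` signatures, and the `std::vector` backing store are assumptions for illustration only, since the ADT itself deliberately does not fix an implementation:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// The ADT's *what* (data stored, operations supported) is the public
// interface; the *how* (here, a std::vector) is a private detail that
// could be swapped for a list or a file without affecting users.
class EmployeeRegistry {
public:
    void hire(const std::string& name, int id, double salary) {
        employees_.push_back(Employee{name, id, salary});
    }
    bool fire(int id) {
        for (std::size_t i = 0; i < employees_.size(); ++i) {
            if (employees_[i].id == id) {
                employees_.erase(employees_.begin()
                                 + static_cast<std::ptrdiff_t>(i));
                return true;
            }
        }
        return false;   // no employee with that id
    }
    int count() const { return static_cast<int>(employees_.size()); }
private:
    struct Employee {              // only the relevant attributes
        std::string name;
        int id;
        double salary;
    };
    std::vector<Employee> employees_;   // one possible implementation
};
```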
Abstraction
• Abstraction is a process of classifying characteristics as relevant and
irrelevant for the particular purpose at hand and ignoring the irrelevant
ones.
• Applying abstraction correctly is the essence of successful programming
E.g.: modeling students of MAU
Algorithms
• An algorithm is a well-defined computational procedure that takes some
value or a set of values as input and produces some value or a set of values
as output.
• From the data structure point of view, following are some important
categories of algorithms:
 Search − Algorithm to search an item in a data structure.
 Sort − Algorithm to sort items in a certain order.
 Insert − Algorithm to insert item in a data structure.
 Update − Algorithm to update an existing item in a data structure.
 Delete − Algorithm to delete an existing item from a data structure.
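The search category, for instance, can be sketched in C++ as a linear search (one possible algorithm among many; binary search is another; the function name is illustrative):

```cpp
#include <cstddef>

// Linear search: scan a[0..n-1] and return the index of key,
// or -1 if the key is absent.
int linearSearch(const int a[], std::size_t n, int key) {
    for (std::size_t i = 0; i < n; ++i)
        if (a[i] == key)
            return static_cast<int>(i);
    return -1;   // not found
}
```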
Cont..
• The correct data structures lead to simple and efficient algorithms and
correct algorithms lead to accurate and efficient data structures.
• An algorithm can be good or bad depending on different properties.
Properties of an algorithm
– Finiteness: Algorithm must complete after a finite number of steps.
– Definiteness: Each step must be clearly defined, having one and only
one interpretation. At each point in computation, one should be able to
tell exactly what happens next.
– Sequence: Each step must have a unique defined preceding and
succeeding step. The first step (start step) and last step (halt step) must
be clearly noted.
Cont..
– Correctness: It must compute correct answer for all possible legal inputs.
– Language Independence: It must not depend on any programming
language.
– Completeness: It must solve the problem completely.
– Effectiveness: It must be possible to perform each step exactly and in a
finite amount of time.
– Efficiency: It must solve with the least amount of computational resources
such as time and space.
– Generality: Algorithm should be valid on all possible inputs.
– Input/Output: There must be a specified number of input values, and one or
more result values.
Algorithm Analysis
• Algorithm analysis refers to the process of determining the amount of
computing time and storage space required by different algorithms. In other
words, it’s a process of predicting the resource requirement of algorithms in
a given environment.
• In order to solve a problem, there are many possible algorithms. So we
have to be able to choose the best algorithm for the problem at hand using
some scientific method.
• To classify some data structures and algorithms as good, we need precise
ways of analyzing them in terms of resource requirement. The main
resources are:
– Running Time: Running time is the dominant standard.
– Memory Usage
Cont..
• Running time is usually treated as the most important since computational
time is the most precious resource in most problem domains.
• There are two approaches to measuring the efficiency of algorithms:
before implementation and after implementation.
1. Empirical/experimental: works based on the total running time of the
program, using actual system clock time, e.g. T(n) = final time - initial time.
– However, it is difficult to use actual clock-time as a consistent measure of
an algorithm’s efficiency, because clock-time can vary based on many
things.
For example,
– Specific processor speed of computer. e.g 1.78GHz , 2.4GHz
– Current processor load
– Specific data for a particular program
• Input Size
• Input Properties
– Operating Environment: multitasking and single tasking.
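An empirical measurement along these lines can be sketched with C++'s `<chrono>` clock; the function name and the workload are illustrative:

```cpp
#include <chrono>

// Empirical measurement sketch: T(n) = final time - initial time.
// Repeated runs can report different times because of processor
// speed, current load, and the operating environment.
long long elapsedMicroseconds(int n) {
    auto initial = std::chrono::steady_clock::now();
    volatile long long k = 0;   // volatile: keep the loop from being optimized away
    for (int i = 0; i < n; ++i)
        k = k + 1;
    auto final_time = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(
               final_time - initial).count();
}
```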
Cont..
2. Theoretical analysis: determining the quantity of resources required
using mathematical concepts.
• Analyze an algorithm according to the number of basic operations(time
units) required, rather than according to an absolute amount of time
involved.
• We use the theoretical approach to determine the efficiency of an
algorithm because:
• The number of operations will not vary under different conditions.
• It gives a meaningful measure that permits comparison of
algorithms independent of operating platform.
• It helps to determine the complexity of an algorithm.
Algorithm Complexity Analysis
• Complexity Analysis is the systematic study of the cost of computation, measured
either in time units or in operations performed, or in the amount of storage space
required.
• The goal is to have a meaningful measure that permits comparison of algorithms
independent of operating platform.
• There are two important ways to measure the efficiency of an algorithm:
1. Time Complexity: Determine the approximate amount of time(number of
operations) required to solve a problem of size n.
2. Space Complexity: Determine the approximate memory required to solve a
problem of size n.
 Complexity analysis involves two distinct phases:
– Algorithm Analysis: Analysis of the algorithm or data structure to produce a
function T (n), that describes the algorithm in terms of the operations performed
in order to measure the complexity of the algorithm.
– Order of Magnitude Analysis: Analysis of the function T (n) to determine the
general complexity category to which it belongs.
– It provides a useful approximation to the actual number of steps in the
computation.
cont...
Space Complexity
• Space complexity of an algorithm represents the amount of memory space required
by the algorithm in its life cycle. The space required by an algorithm is equal to the
sum of the following two components −
– A fixed part that is a space required to store certain data and variables.
• For example, simple variables and constants used, program size, etc.
– A variable part is a space required by variables, whose size depends on the
size of the problem. For example, dynamic memory allocation, recursion stack
space, etc.
• Space complexity S(P) of any algorithm P is S(P) = C + SP(I), where C is the
fixed part and SP(I) is the variable part of the algorithm.
– Example: for an algorithm that uses three variables A, B, and C plus one
constant, S(P) = 1 + 3.
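The example snippet itself does not appear in the text; a fragment matching the stated count (three variables A, B, and C plus one constant, so S(P) = 1 + 3) might look like this reconstruction:

```cpp
// Three variables (A, B, C) give the variable-independent data part;
// the literal 10 is the one constant. Hence S(P) = 1 + 3.
// (This snippet is an assumed reconstruction, not from the text.)
int addWithConstant(int A, int B) {
    int C;
    C = A + B + 10;
    return C;
}
```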
Time Complexity
• Time complexity of an algorithm represents the amount of time required by the
algorithm to run to completion. Time requirements can be defined as a numerical
function T(n),
– example, addition of two n-bit integers takes n steps. Consequently, the total
computational time is T(n) = c ∗ n, where c is the time taken for the addition of
two bits.
• There is no generally accepted set of rules for time complexity analysis.
However, an exact count of operations is commonly used.
Time Complexity Analysis Rules:
1. We assume an arbitrary time unit.
2. Execution of one of the following operations takes time 1:
– Assignment Operation. e.g i=0
– Single Input/Output Operation. e.g cin>>i or cout<<i
– Single Boolean Operations. e.g i>=5
– Single Arithmetic operation. e.g a+b
– Function Return. e.g return sum.
3. Running time of a selection statement (if, switch) is the time for the condition
evaluation + the maximum of the running times for the individual clauses in the
selection.
Cont..
4. Loops: Running time for a loop is equal to the running time of the
statements inside the loop * number of iterations + time for setup (1) + time
for checking (number of iterations + 1) + time for update (number of iterations).
The total running time of a statement inside a group of nested loops is the
running time of the statements multiplied by the product of the sizes of all
the loops.
5. Running time of a function call is 1 for setup + the time for any parameter
calculations + the time required for the execution of the function body.
Note: Running time = summation of the running times of all fragments.
Examples: Count of Basic Operations (Time Units) to Compute
1. int count(){
     int k=0;
     cout<< "Enter an integer";
     cin>>n;
     for (i=0;i<n;i++)
       k=k+1;
     return 0;}

   1 for the assignment statement: int k=0
   1 each for the output and input statements
   In the for loop: 1 assignment, n+1 tests, and n increments
   n loops of 2 units for an assignment and an addition
   1 for the return statement

   T(n) = 1+1+1+(1+n+1+n)+2n+1 = 4n+6


Cont..
Count of Basic Operations (Time Units) to
2. int total(int n){ Compute
int sum=0; 1 for the assignment statement: int sum=0
for (int i=1;i<=n;i++) In the for loop:
sum=sum+1; 1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an
return sum;} addition.
1 for the return statement.
T (n) = 1+ (1+n+1+n)+2n+1 = 4n+4

3. void func(){
     int x=0; int i=0; int j=1;
     cout<< "Enter value";
     cin>>n;
     while (i<n){
       x++;
       i++;}
     while (j<n){
       j++;}}

   1 for each of the three assignment statements: x=0, i=0, j=1
   1 for the output statement
   1 for the input statement
   In the first while loop: n+1 tests; n loops of 2 units for the
   two increment (addition) operations
   In the second while loop: n tests; n-1 increments

   T(n) = 1+1+1+1+1+(n+1)+2n+n+(n-1) = 5n+5
Cont..
4. int sum (int n){
int partial_sum = 0;
for (int i = 1; i <= n; i++)
partial_sum = partial_sum +(i * i * i);
return partial_sum;
}
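Applying the same counting rules to example 4, one possible count (a sketch; assuming each arithmetic operation costs one time unit, so the loop body costs 4 units per iteration):

```cpp
int sum(int n) {
    int partial_sum = 0;             // 1 assignment
    for (int i = 1; i <= n; i++)     // 1 assignment, n+1 tests, n increments
        partial_sum = partial_sum
                    + (i * i * i);   // 2 multiplications, 1 addition,
                                     // 1 assignment -> 4 units x n iterations
    return partial_sum;              // 1 return
}
// T(n) = 1 + (1 + (n+1) + n) + 4n + 1 = 6n + 4
```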

5. sum=0;
   if(test==1)
   {
     for (i=1;i<=n;i++)
       sum=sum+i;
   }
   else
   {
     cout<<sum;
   }

6. int k=0;
   for(int i=1;i<=n;i++)
     for(int j=1;j<=n;j++)
       k++;
Formal Approach to Analysis
• In the above examples we have seen that the analysis is a bit complex.
However, it can be simplified by using a formal approach in which we can
ignore initializations, loop control, and bookkeeping.
• for Loops: Formally
– In general, a for loop translates to a summation. The index and bounds
of the summation are the same as the index and bounds of the for loop.
for (int i = 1; i <= N; i++) {
sum = sum+i; }
• Nested Loops: Formally
– Nested for loops translate into multiple summations, one for each for
loop.
for (int i = 1; i <= N; i++) {
for (int j = 1; j <= M; j++) {
sum = sum+i+j;}}
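Written out, the two fragments above translate to the following summations (a sketch, counting only the loop body as one unit):

```latex
\sum_{i=1}^{N} 1 = N
\qquad\text{and}\qquad
\sum_{i=1}^{N} \sum_{j=1}^{M} 1 = N \cdot M
```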
Simplified Rules to Compute Time Units
• Consecutive Statements: Formally
– Add the running times of the separate blocks of your code
for (int i = 1; i <= N; i++) {
sum = sum+i;}
for (int i = 1; i <= N; i++) {
for (int j = 1; j <= N; j++) {
sum = sum+i+j;}}
• Conditionals: Formally
– If (test) s1 else s2: Compute the maximum of the running time for s1
and s2.
if (test == 1) {
for (int i = 1; i <= N; i++) {
sum = sum+i;}}
else
for (int i = 1; i <= N; i++) {
for (int j = 1; j <= N; j++) {
sum = sum+i+j; }}
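Counting only body operations, the two rules applied to the fragments above give (a sketch):

```latex
\underbrace{\sum_{i=1}^{N} 1}_{\text{first loop}}
+ \underbrace{\sum_{i=1}^{N}\sum_{j=1}^{N} 1}_{\text{nested loops}}
= N + N^2 = O(N^2),
\qquad
\max\!\left(N,\; N^2\right) = N^2 = O(N^2)
```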
Measures of Times
 In order to determine the running time of an algorithm it is possible to
define three functions Tbest(n), Tavg(n) and Tworst(n) as the best, the
average and the worst case running time of the algorithm respectively.
• Worst Case (Tworst): The amount of time the algorithm takes on the worst
possible set of inputs.
• Example: while sorting, if the list is in opposite order.
– while searching , if the desired item is located at the last position or is missing.
• Average Case (Tavg): The amount of time the algorithm takes on an
"average" set of inputs.
– Example:while sorting, considering any arrangement( order of input
data) or while searching, if the desired item is located at any location.
• Best Case (Tbest): The amount of time the algorithm takes on the smallest
possible set of inputs.
– Example: for sorting algorithm,if the list is already sorted.
– for searching algorithm, if the desired item is located at first
accessed position.
 Order of Magnitude refers to the rate at which the storage or time grows as the size
of the input increases. This type of analysis is called Asymptotic Analysis.
 Asymptotic Analysis is concerned with how the running time of an algorithm
increases with the size of the input in the limit, as the size of the input increases
without bound.
• There are five notations used to describe a running time function. These are:
– Big-Oh Notation (O)
– Big-Omega Notation (Ω)
– Theta Notation (Θ)
– Little-o Notation (o)
– Little-Omega Notation (ω)
• Asymptotic notations are mathematical tools to represent the time complexity of
algorithms for asymptotic analysis.
The Big-Oh Notation
• Big-Oh notation is a way of comparing algorithms and is used for
computing the complexity of algorithms, i.e., the amount of time that it
takes for a computer program to run.
• It is only concerned with what happens for very large values of n.
Therefore only the largest term in the expression (function) is needed.
• For example, if the number of operations in an algorithm is n^2 - n, n is
insignificant compared to n^2 for large values of n. Hence the n term is
ignored. Of course, for small values of n, it may be important.
• However, Big-Oh is mainly concerned with large values of n.
• Formal Definition: f(n) = O(g(n)) if there exist positive constants c and k
such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ k.
• Generally, Big-O notation represents the upper bound of the running time
of an algorithm. Therefore, it gives the worst-case complexity of an
algorithm.
Cont..
• Examples: The following points are facts that you can use for Big-Oh problems:
– 1 <= n for all n >= 1
– n <= n^2 for all n >= 1
– 2^n <= n! for all n >= 4
– log2(n) <= n for all n >= 2
– n <= n·log2(n) for all n >= 2
Example 1. f(n)=10n+5 and g(n)=n. Show that f(n) is O(g(n)).
To show that f(n) is O(g(n)), we must find constants c and k such that
• f(n) <= c·g(n) for all n >= k
• i.e., 10n+5 <= c·n for all n >= k
• Try c=15. Then we need to show that 10n+5 <= 15n.
• Solving for n we get: 5 <= 5n, or 1 <= n.
• So f(n) = 10n+5 <= 15·g(n) for all n >= 1.
• (c=15, k=1).
Example 2: f(n)=5n^2+2n+1 and g(n)=n^2. Show that f(n) is O(g(n)).
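One way to work Example 2 (a sketch): for n >= 1 we have 2n <= 2n^2 and 1 <= n^2, so

```latex
5n^2 + 2n + 1 \le 5n^2 + 2n^2 + n^2 = 8n^2 \quad\text{for all } n \ge 1,
```

hence f(n) <= 8·g(n) for all n >= 1, i.e. f(n) = O(n^2) with c = 8 and k = 1.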
Big-Omega Notation(Ω)
• Omega notation represents the lower bound of the running time of an
algorithm. Thus, it provides the best case complexity of an algorithm.
• Formal Definition: A function f(n) is Ω(g(n)) if there exist positive constants
c and k such that f(n) >= c·g(n) for all n >= k.
• f(n)= Ω( g (n)) means that f(n) is greater than or equal to some constant
multiple of g(n) for all values of n greater than or equal to some k.
• Example 1: If f(n) = n^2, then f(n) = Ω(n).
In simple terms, f(n) = Ω(g(n)) means that the growth rate of f(n) is
greater than or equal to that of g(n).
Example 2: f(n)=5n^2+2n+1 and g(n)=n^2. Show that f(n) is Ω(g(n)).
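One way to work Example 2 (a sketch): since 2n + 1 >= 0 for all n >= 1,

```latex
5n^2 + 2n + 1 \ge 5n^2 \ge 1 \cdot n^2 \quad\text{for all } n \ge 1,
```

so f(n) >= c·g(n) holds with c = 1 (in fact c = 5 also works) and k = 1, and f(n) = Ω(n^2).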
Theta Notation(Θ)
• Theta notation encloses the function from above and below, since it
represents both the upper and the lower bound of the running time of an
algorithm.
• It is used for analyzing the average-case complexity of an algorithm.
• A function f (n) belongs to the set of Θ(g(n)) if there exist positive
constants c1 and c2 such that it can be sandwiched between c1.g(n) and
c2.g(n), for sufficiently large values of n.
Formal Definition: A function f (n) is Θ (g(n)) if it is both O( g(n) ) and
Ω ( g(n) ).
• In other words, there exist constants c1, c2, and k > 0 such that
c1·g(n) <= f(n) <= c2·g(n) for all n >= k.
• If f(n)= Θ(g(n)), then g(n) is an asymptotically tight bound for f(n).
• In simple terms, f(n)= Θ (g(n)) means that f(n) and g(n) have the same rate
of growth.
Eg: f(n)=5n^2+2n+1 and g(n)=n^2. Show that f(n) is Θ(g(n)).
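One way to work this (a sketch): bounding f(n) from both sides for n >= 1,

```latex
1 \cdot n^2 \le 5n^2 + 2n + 1 \le 8n^2 \quad\text{for all } n \ge 1,
```

so c1 = 1, c2 = 8, and k = 1 satisfy the definition, and f(n) = Θ(n^2).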
Little-o Notation(o)

• We use o notation to denote an upper bound that is not asymptotically tight.
• f(n)=o(g(n)) means that for all c>0 there exists some k>0 such that
f(n) < c·g(n) for all n >= k.
• Informally, f(n)=o(g(n)) means f(n) becomes insignificant
relative to g(n) as n approaches infinity.
• Example: f(n)=3n+4 is o(n^2).
• In simple terms, f(n) has a lower growth rate than g(n).
Little-Omega (ω Notation)

• We use ω notation to denote a lower bound that is not asymptotically tight.
• Formal Definition: f(n) = ω(g(n)) if for every constant c>0 there exists
a k>0 such that 0 <= c·g(n) < f(n) for all n >= k.
• Example: 2n^2 = ω(n), but 2n^2 is not ω(n^2).
Rules to Estimate Big-Oh of a Given Function
• Take the highest-order term and drop lower-order terms and constant
multipliers.
• Example: T(n) = 5n^2+2n+2 = O(n^2)
Cont....
 Arrangement of common functions by growth rate (typical growth rates, from
slowest- to fastest-growing):
c < log n < n < n log n < n^2 < n^3 < 2^n < n!
 The order of the body statements of a given algorithm is very important in
determining the Big-Oh of the algorithm.
Finding Big O of given Algorithm

Example 1.
for (i=1;i<=n;i++)
  cout<<i;
Soln.
T(n) = 1+(n+1)+n+n = 3n+2
T(n) = O(n)

Example 2.
for (i=1;i<=n;i++)
  for(j=1;j<=n;j++)
    cout<<i;
Soln.
T(n) = 1+(n+1)+n+n(1+(n+1)+n+n) = 3n^2+4n+2
T(n) = O(n^2)
Exercise
Find Big O of the following algorithm:
1. for(i=1;i<=n;i++)
     sum=sum+i;
   for (i=1;i<=n;i++)
     for(j=1;j<=m;j++)
       sum++;

2. if(k==1)
   {
     for (i=1;i<=100;i++)
       for (j=1;j<=1000;j++)
         cout<<i;
   }
   else if(k==2)
   {
     for(i=1;i<=n;i=i*2)
       cout<<i;
   }
   else
   {
     for (i=1;i<=n;i++)
       sum++;
   }
cont...
3. for (i=1; i<=n;i++)
   {
     if (i<10)
       for(j=1;j<=n;j=j*2)
         cout<<j;
     else
       cout<<i;
   }

4. for (int i=1; i<=N; ++i)
     for (int j=i; j>0; --j)
       {} // do nothing

5. for (int i=1; i<N; i=i*2)
     {}

6. for (int i=0; i<N; ++i)
     for (int j=i; j>0; j=j/2)
       {}

7. for (int i=1; i<N; ++i)
     for (int j=0; j<N; j+=i)
       {}

8. int i=0, sum=0;
   while (sum < N)
     sum = sum + i++;

9. for (int i=1; i<N; i=i*2)
     for (int j=0; j<i; ++j)
       {}
cont...
10. i=1;
    while (i<n)
    {
      i=i*2;
    }

11. for (int i = 0; i < n; i++)
      for (int j = 0; j <= n - i; j++)
        int a = i;

12. void Test(int N){
      for(int i = 0; i < N; i = i*4){
        for(int j = N; j > 2; j = j/3){
          cout<<"I have no idea what this does";
        }}}

13. for (int i = 0; i < n; i++)
      sum++;
    for (int j = 0; j < n; j++)
      sum++;
    for (int i = 0; i < n; i++)
      for (int j = 0; j < n*n*n; j++)
        sum++;
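One way to check a Big-O guess for these exercises is to count iterations empirically. For exercise 10, doubling i from 1 until it reaches n runs about log2(n) times, suggesting O(log n); a C++ sketch (function name is illustrative):

```cpp
// Count how many times exercise 10's doubling loop runs:
// i = 1, 2, 4, 8, ... until i >= n, i.e. about log2(n) steps.
int doublingSteps(int n) {
    int i = 1, steps = 0;
    while (i < n) {
        i = i * 2;
        ++steps;
    }
    return steps;
}
```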
