Fundamentals of Data Structures Notes

1.2 Data

A computer system needs input before it can perform any sort of processing. Input can be provided directly by the user, or it can be retrieved from various other sources like the internet, files, records, sheets, etc. This input is often called data. Data is a collection of related data items, and it has some meaning. Data can be historical or current. Today's computing tries to gather as much data as possible; whether immediately useful or not, it is important that you gather as much as you can. Even though data is meaningful, it is still raw, and no direct benefit can be drawn from it.

It is important that we process data so that knowledge can be derived from it. Processing data also involves careful extraction of only the useful data. Because you collect everything that comes your way, it is important that in the end you are able to identify the important data. Once the useful and important data has been identified, it is processed and transformed into information. Information consists of only the useful data, including aggregations and summaries of the data. Information is also stored in the form of charts and graphs.

For example, consider the sales data of a company from across the country. The data includes the sales of each and every product for all the months and from all the respective regions. As the data covers one complete year, you can imagine its size. If the data is presented to the management as such, it is of no use to them. What is needed is to convert this data into information.

Suppose the management wants to discuss the sales of electronic items for two regions. Then only the sales data of electronic items from those two regions is extracted. Next, the aggregate of each item for all months is computed. Graphs and charts can be created to show the increase or decrease in this year's sales as compared to last year's sales on a monthly basis. Sales of newly launched products can be compared with old products. Which model number of a particular item is in demand, and in which region, can be presented in graphical form.

Once you have information with you, knowledge can be derived from it, and it can be used by the management for effective decision making.

1.3 Problem analysis

Data collection is done keeping in mind the problem at hand. Problem definition is the first step. Problems can come from any real-life task or from standard computer science problems. For example, consider software for library management in which you want to search for books based on indexes like book_Id, Author_Name, etc. The problem at hand is searching for a book, and search is one of the most common operations performed. Once the problem is defined, you need to record, on a daily basis, data related to all books issued. Then, on the basis of the problem definition and the data at hand, you design an efficient algorithm for the same.

What is problem analysis? When you set out to find the solution for any defined problem, the solution has to be within the constraints defined and should also be solvable in a finite number of steps. Constraints refer to the availability of resources, in terms of memory, CPU time, etc., and you have to keep your solution within the set constraints.

The solution also has to complete in a finite number of steps. The steps may take any length of time to execute when programmed, but their number cannot be infinite. If for some input of size N the solution is non-finite, the solution is not acceptable. Only a solution that provides an answer in a finite number of steps for all values of N is acceptable.

The solution to any problem can be expressed in the form of text or pictures. An algorithm is used to express the solution in terms of text. An algorithm is a well-defined, step-by-step solution for a defined problem. It is written in simple English or any other human-understandable form. A flow chart, in contrast, is used to represent the solution in pictorial form. It shows the flow of control from beginning to end, so you get a clear idea of where the flow is heading with every step or iteration.
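To make this concrete, here is a minimal sketch in C of a linear search over an array of book IDs, in the spirit of the library example above (the array contents and the function name search_book are illustrative assumptions, not part of the notes):

#include <stdio.h>

/* Linear search: returns the index of key in arr[0..n-1], or -1 if absent.
   The book IDs and the function name are illustrative only. */
int search_book(const int arr[], int n, int key) {
    for (int i = 0; i < n; i++) {
        if (arr[i] == key)
            return i;   /* found: report the position */
    }
    return -1;          /* finite number of steps even when key is absent */
}

int main(void) {
    int book_ids[] = {102, 317, 205, 411, 128};
    int n = sizeof book_ids / sizeof book_ids[0];
    int pos = search_book(book_ids, n, 205);
    printf("book 205 found at index %d\n", pos);   /* prints 2 */
    return 0;
}

The same steps could equally be written as a plain-English algorithm or drawn as a flow chart; the loop corresponds to a single decision box that is revisited on every iteration.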
1.4 Data structures

A data structure is a logical organization of data items in memory. A data structure also defines the relationship between the various data items contained in it. The purpose for which data has been collected, and the relationships between the data items, vary a great deal. A number of data structures are available, and you can choose the one that best represents the kind of relationship that exists.

Types of data structures

Data structures are divided into two groups:

Figure 1.1 Types of data structures

1. Linear data structures: Linear data structures are those in which the elements are stored sequentially, one after the other, and there is only one path between the elements of the data structure.

a. Array: An array is used to represent homogeneous data items, i.e., data items of the same type, like an array of numbers, an array of words, etc. Data items in an array are stored contiguously in memory. An array is static in nature, which means the size and location of the memory allocated to the array cannot be changed later on (see the sketch after this list).

b. Linked list: Data items in a linked list are stored non-contiguously in memory, and each data item is represented as a node. Each node contains the memory location of the next node, and a linked list can be used to represent non-homogeneous data items. A linked list is dynamic in nature, which means the size and location of the memory allocated to the list can be changed at any time.

c. Queue: A queue is a special FIFO (first in, first out) data structure in which the data item that is inserted first is also the first data item to be removed. Insertion and deletion in a queue happen at two different ends: insertion happens at the back, called the REAR of the queue, and deletion happens at the beginning, called the FRONT of the queue.

d. Stack: A stack is a special LIFO (last in, first out) data structure in which the data item that is inserted last is the first data item to be removed. Insertion and deletion in a stack are called PUSH and POP, and they both happen at one end of the stack, called the TOP of the stack.
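As a minimal sketch of the static/dynamic contrast described above (the struct and variable names are illustrative assumptions), the following C fragment declares a fixed-size array and builds a small linked list at run time; stacks and queues are commonly built on top of either representation:

#include <stdio.h>
#include <stdlib.h>

/* Static: size fixed at compile time, elements contiguous in memory. */
int marks[5] = {70, 82, 91, 64, 88};

/* Dynamic: each node is allocated separately and linked to the next,
   so the list can grow or shrink at run time. */
struct node {
    int data;
    struct node *next;   /* address of the next node, or NULL at the end */
};

int main(void) {
    /* Build a two-node list 10 -> 20 at run time. */
    struct node *head = malloc(sizeof *head);
    head->data = 10;
    head->next = malloc(sizeof *head);
    head->next->data = 20;
    head->next->next = NULL;

    printf("array[0] = %d, list head = %d, second = %d\n",
           marks[0], head->data, head->next->data);

    free(head->next);
    free(head);
    return 0;
}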

2. Non-linear data structures: Non-linear data structures are those in which the elements are not stored sequentially and there exist multiple paths from one node or data element.
a. Tree: A tree data structure is used to represent a hierarchical relationship between the data items, called nodes. A binary tree T consists of a ROOT node, a left sub-tree TL, and a right sub-tree TR, where TL and TR are themselves binary trees (see the sketch after this list).

b. Graph: A graph is a collection of a set of vertices (V) that represent the data items and a set of edges (E) that connect the vertices.
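A brief sketch of how these two non-linear structures are commonly declared in C (the type names and the sample edges are illustrative assumptions): a binary tree node carries links to its left and right sub-trees, and a small graph can be stored as an adjacency matrix over its vertex set V and edge set E:

#include <stdio.h>

/* A binary tree node: each node links to its left and right sub-trees. */
struct tnode {
    int data;
    struct tnode *left;   /* root of the left sub-tree, or NULL */
    struct tnode *right;  /* root of the right sub-tree, or NULL */
};

#define V 4   /* number of vertices */

int main(void) {
    /* Adjacency matrix: adj[i][j] = 1 when an edge connects vertex i to j. */
    int adj[V][V] = {0};
    adj[0][1] = adj[1][0] = 1;   /* edge between vertices 0 and 1 */
    adj[1][2] = adj[2][1] = 1;   /* edge between vertices 1 and 2 */
    adj[2][3] = adj[3][2] = 1;   /* edge between vertices 2 and 3 */

    printf("vertices 0 and 1 connected? %d\n", adj[0][1]);   /* prints 1 */
    return 0;
}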

1.5 Data structure operations

Following are the six common operations performed on any data structure:

1. Traversal: The traversal operation applies some process to each element of the data structure. Traversal can be defined as an operation that visits each data item in the data structure exactly once.

2. Insertion: The insertion operation adds a single data item to the data structure.

3. Deletion: The deletion operation removes a single data item from the data structure.

4. Search: The search operation finds the location of a particular data item in the data structure.

5. Merge: The merge operation combines two sorted lists of size m and n respectively into a single sorted list of size m + n (see the sketch after this list).

6. Sort: The sort operation arranges the data items in the data structure based on some ordering rule. Two common ordering rules are ascending order and descending order.
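As a minimal sketch of the merge operation (the array contents and the function name are illustrative assumptions), the following C function combines two sorted arrays of sizes m and n into one sorted array of size m + n:

#include <stdio.h>

/* Merge sorted arrays a[0..m-1] and b[0..n-1] into out[0..m+n-1]. */
void merge(const int a[], int m, const int b[], int n, int out[]) {
    int i = 0, j = 0, k = 0;
    while (i < m && j < n)               /* take the smaller front item */
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < m) out[k++] = a[i++];     /* copy whatever remains of a */
    while (j < n) out[k++] = b[j++];     /* copy whatever remains of b */
}

int main(void) {
    int a[] = {1, 4, 9}, b[] = {2, 3, 7, 10}, out[7];
    merge(a, 3, b, 4, out);
    for (int k = 0; k < 7; k++)
        printf("%d ", out[k]);           /* prints 1 2 3 4 7 9 10 */
    printf("\n");
    return 0;
}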

1.6 Algorithmic complexity and time-space trade-off

The notion of a problem has already been defined earlier in this lesson. The algorithm is a very popular programming tool: before you write a program, it is advisable to begin with an algorithm.

An algorithm is a well-defined, step-by-step solution for a defined problem. An algorithm is not specific to any computer programming language and is generally written in human-understandable form. This means even a layman who has no knowledge of computer programming can understand the solution provided in the algorithm.

When we say that an algorithm is a solution to a defined problem, it is possible that two different people design two different algorithms, or solutions, for the same problem. It then becomes imperative that you are able to compare the solutions and find which one is more effective than the other. For this you can compute the complexity of the two algorithms, compare them, and select the one with the lesser complexity.

Complexity of the algorithm can be defined using two parameters:

1. Time complexity: The time complexity of an algorithm is measured in terms of the number of computational steps or operations it performs.

Worst case: the maximum number of steps needed to execute the algorithm successfully, over all inputs of size n.
Best case: the minimum number of steps needed to execute the algorithm successfully, over all inputs of size n.
Average case: the average number of steps needed to execute the algorithm, taken over random inputs of size n.

For example, a linear search over n items takes one step in the best case (the key is the first item) and n steps in the worst case (the key is last or absent).

2. Space complexity: Space complexity for an algorithm refers to the amount of space needed by
variables, other objects and functions used by it.

The study of algorithm complexity is very important. With the tremendous increase in data over the last few years, it is important that we continue to design algorithms with lesser complexity than the ones in use today.
Time-space trade-off

Algorithm complexity is computed using two parameters: time and space. Algorithms for most of the standard problems have already been designed, yet research is still being conducted to design better ones. It is not easy to design a new algorithm that beats the old algorithms on both parameters.

It is possible to compromise on one parameter to improve the other. You can reduce the time complexity of an algorithm by compromising on space and allocating more space for the objects it uses. Similarly, you can reduce the space complexity by compromising on time.

The best short example for understanding the time-space trade-off is swapping two numbers. There are two possible solutions to this problem: one uses a third variable to swap the two numbers, and the other manages to swap them without a third variable. However, the number of operations performed by the first algorithm, which uses three variables, is half the number performed by the second. This shows that there is a clear trade-off between time and space: improvement in one leads to a decline in the other.
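A minimal C sketch of the two swap methods described above (variable names are illustrative): the first uses a third variable and only plain assignments, while the second saves the extra variable at the cost of extra arithmetic. Note that the arithmetic trick shown here can overflow for large values; it is included only to illustrate the trade-off:

#include <stdio.h>

int main(void) {
    int a = 3, b = 7;

    /* Method 1: extra space (a third variable), fewer operations. */
    int t = a;   /* three plain assignments */
    a = b;
    b = t;
    printf("after method 1: a=%d b=%d\n", a, b);   /* a=7 b=3 */

    /* Method 2: no extra variable, but each step is an
       arithmetic operation plus an assignment. */
    a = a + b;
    b = a - b;
    a = a - b;
    printf("after method 2: a=%d b=%d\n", a, b);   /* back to a=3 b=7 */
    return 0;
}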

1.8 Asymptotic Notations

Asymptotic notation is also known as growth rate. Asymptotic notation is studied to observe the effect of a growing input size N on the time efficiency or complexity of an algorithm.

 Asymptotic notations are mathematical tools used to analyze the performance of algorithms by understanding how their efficiency changes as the input size grows.
 These notations provide a concise way to express the behavior of an algorithm's time or space complexity as the input size approaches infinity.
 Rather than comparing algorithms directly, asymptotic analysis focuses on understanding the relative growth rates of algorithms' complexities.
 It enables comparisons of algorithms' efficiency by abstracting away machine-specific constants and implementation details, focusing instead on fundamental trends.
 Asymptotic analysis allows for the comparison of algorithms' space and time complexities by examining their performance characteristics as the input size varies.
 By using asymptotic notations, such as Big O, Big Omega, and Big Theta, we can categorize algorithms based on their worst-case, best-case, or average-case time or space complexities, providing valuable insights into their efficiency.
There are mainly three asymptotic notations:
1. Big-O Notation (O-notation)
2. Omega Notation (Ω-notation)
3. Theta Notation (Θ-notation)

1. Theta Notation (Θ-notation):

Theta notation encloses the function from above and below. Since it represents both the upper and the lower bound of the running time of an algorithm, it is used for analyzing the average-case complexity of an algorithm. In the average case, you add the running times for each possible input combination and take the average.

Let g and f be functions from the set of natural numbers to itself. The function f is said to be Θ(g) if there are constants c1, c2 > 0 and a natural number n0 such that c1 * g(n) ≤ f(n) ≤ c2 * g(n) for all n ≥ n0.

Figure: Theta notation

Mathematical representation of Theta notation:
Θ(g(n)) = { f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1 * g(n) ≤ f(n) ≤ c2 * g(n) for all n ≥ n0 }
Note: Θ(g) is a set.

The above expression can be read as follows: if f(n) is Theta of g(n), then f(n) always lies between c1 * g(n) and c2 * g(n) for large values of n (n ≥ n0). The definition also requires that f(n) be non-negative for values of n greater than n0. The bound therefore serves as both a lower and an upper limit on the algorithm's running time; it is the tightest bound for a given input value.

A simple way to get the Theta notation of an expression is to drop the low-order terms and ignore the leading constants. For example, consider the expression 3n^3 + 6n^2 + 6000 = Θ(n^3). Dropping the lower-order terms is always fine, because there will always be a value of n after which n^3 has higher values than n^2 irrespective of the constants involved. For a given function g(n), Θ(g(n)) denotes the following set of functions.

Examples:
{ 100, log(2000), 10^4 } belongs to Θ(1)
{ (n/4), (2n+3), (n/100 + log(n)) } belongs to Θ(n)
{ (n^2+n), (2n^2), (n^2+log(n)) } belongs to Θ(n^2)
Note: Θ provides exact bounds.

2. Big-O Notation (O-notation):

Big-O notation represents the upper bound of the running time of an algorithm. Therefore, it gives the worst-case complexity of an algorithm.

 It is the most widely used notation for asymptotic analysis.
 It specifies the upper bound of a function.
 It describes the maximum time required by an algorithm, i.e., the worst-case time complexity.
 Big-O (worst case) is defined as the condition that allows an algorithm to complete statement execution in the longest amount of time possible.

If f(n) describes the running time of an algorithm, f(n) is O(g(n)) if there exist a positive constant c and an n0 such that 0 ≤ f(n) ≤ c * g(n) for all n ≥ n0. The bound serves as an upper limit on the algorithm's time complexity.

Mathematical representation of Big-O notation:
O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c * g(n) for all n ≥ n0 }

For example, consider the case of Insertion Sort. It takes linear time in the best case and quadratic time in the worst case. We can safely say that the time complexity of Insertion Sort is O(n^2).
Note: O(n^2) also covers linear time.

If we use Θ notation to represent the time complexity of Insertion Sort, we have to use two statements for the best and worst cases:
 The worst-case time complexity of Insertion Sort is Θ(n^2).
 The best-case time complexity of Insertion Sort is Θ(n).

The Big-O notation is useful when we only have an upper bound on the time complexity of an algorithm. Many times we can easily find an upper bound simply by looking at the algorithm.
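As a minimal sketch (not taken from the notes), the following C insertion sort makes the two Θ statements above concrete: on an already sorted array the inner loop body never runs, giving the Θ(n) best case, while on a reverse-sorted array it runs i times for element i, giving the Θ(n^2) worst case:

#include <stdio.h>

/* Insertion sort: grows a sorted prefix one element at a time. */
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i], j = i - 1;
        /* Shift larger prefix elements right to make room for key.
           Sorted input: this loop body never executes -> Θ(n).
           Reverse-sorted input: it executes i times   -> Θ(n^2). */
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}

int main(void) {
    int a[] = {5, 2, 9, 1, 7};
    insertion_sort(a, 5);
    for (int i = 0; i < 5; i++)
        printf("%d ", a[i]);   /* prints 1 2 5 7 9 */
    printf("\n");
    return 0;
}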
Examples:
{ 100, log(2000), 10^4 } belongs to O(1)
∪ { (n/4), (2n+3), (n/100 + log(n)) } belongs to O(n)
∪ { (n^2+n), (2n^2), (n^2+log(n)) } belongs to O(n^2)
Note: here ∪ represents set union; we can write it in this manner because O provides exact or upper bounds (every function in O(1) is also in O(n), and so on).

3. Omega Notation (Ω-notation):

Omega notation represents the lower bound of the running time of an algorithm. Thus, it provides the best-case complexity of an algorithm. The bound serves as a lower limit on the algorithm's time complexity. It is defined as the condition that allows an algorithm to complete statement execution in the shortest amount of time.

Let g and f be functions from the set of natural numbers to itself. The function f is said to be Ω(g) if there is a constant c > 0 and a natural number n0 such that c * g(n) ≤ f(n) for all n ≥ n0.

Mathematical representation of Omega notation:
Ω(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ c * g(n) ≤ f(n) for all n ≥ n0 }

Let us consider the same Insertion Sort example. Its time complexity can be written as Ω(n), but this is not very useful information, as we are generally interested in the worst case and sometimes in the average case.

Examples:
{ (n^2+n), (2n^2), (n^2+log(n)) } belongs to Ω(n^2)
∪ { (n/4), (2n+3), (n/100 + log(n)) } belongs to Ω(n)
∪ { 100, log(2000), 10^4 } belongs to Ω(1)
Note: here ∪ represents set union; we can write it in this manner because Ω provides exact or lower bounds (every function in Ω(n^2) is also in Ω(n), and so on).
In total, three asymptotic notations are used to represent the complexity of an algorithm. The Big O notation deserves the closest attention here, as it represents the worst-case complexity and is the accepted norm.

1. Big O notation, represented as O(): It defines only the upper bound of an algorithm's complexity. It states that the complexity of the given algorithm will never be more than its upper bound. It represents the worst-case complexity and is the most widely used way to express the complexity of an algorithm. Let f(n) be the complexity of the given algorithm and g(n) some standard function. Then

it is said that f(n) = Big O(g(n)) if

f(n) <= c * g(n) for some positive value of c, and for all values of n >= n0.

The relation may not hold for values of n < n0 or for some other value of c.

Figure 1.2 Big O Notation

Consider the following:

f(n) = 100n + 5

It can be expressed in terms of Big O as follows. For f(n) = 100n + 5, dropping the constants and selecting the highest power of n gives g(n) = n. Now you need to select some c and n0 such that

100n + 5 <= c * n for all n >= n0.

For n0 = 1 and c = 105:
f(1) = 100 * 1 + 5 = 105 <= c * g(1) = 105 * 1 = 105

The inequality also holds for all values of n > 1. Hence f(n) = 100n + 5 is Big O(g(n)) = Big O(n) for c = 105 and all n >= 1.
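As a minimal sketch (the loop and the step counter are illustrative assumptions, not part of the notes), a function whose loop does a fixed amount of work per iteration has a step count of the form c1*n + c2, which is Big O(n) by exactly the argument above:

#include <stdio.h>

/* Sum an array while counting the executed steps, to connect
   the step-count function f(n) with its Big O bound. */
long sum_with_count(const int a[], int n, long *steps) {
    long total = 0;
    *steps = 1;                 /* initialization */
    for (int i = 0; i < n; i++) {
        total += a[i];
        *steps += 2;            /* one addition, one loop bookkeeping step */
    }
    return total;
}

int main(void) {
    int a[1000];
    for (int i = 0; i < 1000; i++) a[i] = 1;

    long steps;
    sum_with_count(a, 1000, &steps);
    /* steps grows as 2n + 1: a linear function of n, hence O(n). */
    printf("n = 1000, steps = %ld\n", steps);
    return 0;
}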
