Data Structures & Algorithms
A data structure is a specialized format for organizing, processing, retrieving, and storing data so that operations on the data can be performed efficiently. A data structure arranges data elements in terms of some relationship, for better organization and storage.
In computer science and computer programming, the use of a data structure may require the selection or design of algorithms for various purposes; in such cases, the algorithm's basic operations are tightly coupled to the data structure's design. Data structures are widely used in almost every area of computer science, e.g. operating systems, compiler design, artificial intelligence, graphics, and many more.
Data structures, therefore, are programmed structures for storing ordered data, so that various operations can be performed on the data easily. A data structure represents how data is to be organized in memory, and it should be designed and implemented in such a way that it reduces complexity and increases efficiency.
An awareness of object-oriented programming concepts is valuable when studying data structures: a class definition binds methods together with the different types of data they operate on, collecting them under one entity, and data structures likewise provide techniques to access and manipulate data efficiently. Non-object-oriented languages, on the other hand, define functions to work with data structures, and those functions are not technically part of the data structures themselves.
Data structures are needed for reasons such as the following:
a) Processor speed - large amounts of data require high-speed processing. As the data grows, the processor may fail to cope with it, which is why efficient data structures are required.
b) Data search - consider an inventory of 101 items in a store. If an application needs to search for a particular item, it may have to traverse all 101 items on every search, slowing the search process down.
c) Multiple requests - when several users search a web server simultaneously, even a very large server can fail while processing the requests. A suitable data structure, in which the data is well organized, allows it to be searched almost instantly for the purpose it is meant for.
d) For managing large amounts of data, such as in a social network or a search engine.
e) For scheduling tasks, to decide which task a computer should do first.
f) For planning routes, like in a GPS system to find the shortest path from A to B.
g) For optimizing processes, such as arranging tasks so they can be completed as quickly as
possible.
h) For solving complex problems, from finding the best way to pack a truck to making a
computer 'learn' from data.
i) For making programs that run faster or use less memory.
j) For understanding how to approach complex problems and solve them in a systematic way.
Data type is an attribute associated with a piece of data that tells a computer system how to
interpret its value. Mathematically, it is customary to classify variables according to certain
important characteristics. Clear distinctions are made between real, complex, and logical
variables or between variables representing individual values, or sets of values, or sets of sets, or
between functions, functional, sets of functions, and so on. This notion of classification is equally
if not more important in data processing. It is best to adhere to the principle that every constant, variable, expression, or function is of a certain type. The type essentially characterizes the set of values to which a constant belongs, which can be assumed by a variable or expression, or which can be generated by a function. A data type, therefore, is a set of values together with the operations defined on those values: {(values), (operations)}. The operations are performed on the values defined; e.g. for the integer type, -4, -1, 1, 3, 4 are values while +, -, *, / are operations. Data types also allow us to associate meaning with sequences of bits in computer memory.
Basically, data types are of two kinds: (a) system data types, also called primitive (built-in) data types, and (b) user-defined data types.
(a) System data types, or primitive (built-in) data types, are the basic types provided by the language for storing data. Primitive data types provided by many programming languages include INTEGER, FLOAT, DOUBLE, BOOLEAN, CHAR, etc.
Integer type
The INTEGER type comprises the whole numbers, that is, numbers with no fractional component; the size of an integer may vary among individual computer systems. An integer can be positive, negative, or zero.
Character type
The standard type CHAR comprises a set of printable characters. Most modern computer languages store alphabetic characters by representing each one with a small numeric code. Characters are typically stored as 8-bit unsigned integers, giving a range of codes from 0 to 255. To support the wide variety of special symbols found in foreign languages, mathematics, science, character graphics, and elsewhere, Java and some other languages support wide characters (wchars) of 16 bits instead of 8.
Character string
Character strings are collections of characters, such as words or sentences, usually accompanied by a library of methods for manipulating them. For instance, C stores a character string as an array of characters terminated by a null byte, commonly manipulated through pointers to characters, whereas C++ and Java define String classes.
Pointer
These are variables that hold the address at which some other data or code is stored; that is, a pointer represents a storage location in memory (RAM). Pointers are used to indirectly access the item(s) they point to, and a pointer can be changed in order to point to other locations.
Real type
The REAL type denotes a subset of the real numbers. Whereas arithmetic with operands of type INTEGER is assumed to yield exact results, arithmetic on values of type REAL is permitted to be inaccurate within the limits of the round-off errors caused by computation on a finite number of digits.
The number of bits allocated to each primitive data type depends on the programming language, the compiler, and the operating system. For the same primitive data type, different languages may use different sizes; e.g. an int may take 2 bytes or 4 bytes. If it takes 2 bytes (16 bits), the possible values run from -32,768 to +32,767 (-2^15 to 2^15 - 1). If it takes 4 bytes (32 bits), the possible values run from -2,147,483,648 to +2,147,483,647 (-2^31 to 2^31 - 1).
(b) User-defined data types. These are data types defined by the user, for example arrays, classes, structures, unions, and pointers. They allow the user to create a data structure for a particular part of a system, which can then be used throughout the project.
Data Structures are commonly divided into two broad categories namely: (a) Primitive data
structure (b) Non-primitive data structure.
(a)Primitive data structures – Primitive data is classified as basic data and it consists of Boolean,
characters, integers, pointers, and fixed- and floating-point numbers. These data types are the
building blocks of data structures. Data types tell the interpreter or the computer how
the programmer plans on using the data. In addition, data analysts can choose from different data
structure classifications.
(b) Non-primitive data structures are those that are derived from primitive data structures. These
data structures cannot be operated or manipulated directly by the machine level instructions. They
focus on formation of a set of data elements that is either homogeneous (same data type) or
heterogeneous (different data type). These are further divided into linear and non-linear data
structure based on the structure and arrangement of data.
Data Structure
The following are the most frequent operations that may be performed on data structures:
a. Searching - Searching is an operation that may be done on data structures such as arrays, linked
lists, trees, graphs, and so on.
b. Sorting - Sorting is the process of ordering all data elements in a data structure in a certain order,
such as ascending or descending order.
c. Insertion - This refers to adding new data items to the data structure.
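The operations above can be sketched briefly in Python, using a built-in list as the underlying structure (a minimal illustration, not tied to any particular data structure):

```python
# A Python list standing in for a simple data structure.
items = [42, 7, 19, 3]

# Searching: find the position of a given element (linear search).
def search(data, target):
    for i, value in enumerate(data):
        if value == target:
            return i
    return -1          # not found

# Sorting: arrange all elements in ascending order.
items.sort()           # items is now [3, 7, 19, 42]

# Insertion: add a new data item at a chosen position.
items.insert(2, 10)    # items is now [3, 7, 10, 19, 42]

print(search(items, 19))  # 3
```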
1.6 Abstract Data Types (ADT)
These data structures allow us to perform different operations on data, and they are selected based on the type of operation required. By default, all primitive data types support basic operations such as addition and subtraction.
More generally, the data objects which comprise a data structure, together with their fundamental operations, are known as an Abstract Data Type (ADT). Graphs, queues, stacks, and sets, for instance, are ADTs.
Characteristics and their descriptions:
LINEAR - the data items are arranged in a linear sequence, e.g. ARRAY.
NON-LINEAR - the data items are not in sequence, e.g. TREE and GRAPH.
HOMOGENEOUS - all the elements are of the same type, e.g. ARRAY.
NON-HOMOGENEOUS - the elements may or may not be of the same type, e.g. STRUCTURES.
STATIC - data structures whose sizes and associated memory locations are fixed at compile time, e.g. ARRAY.
DYNAMIC - data structures which expand or shrink as the program needs during execution, and whose associated memory locations change, e.g. a linked list created using POINTERS.
1.7 Applications of Data Structures
Data structures are applied in many areas, including:
Operating system
Graphics
Computer Design
Blockchain
Genetics
Image Processing
Simulation
Database Systems
Web Applications
Machine Learning
Video Games
Cryptographic Systems
Data Analysis
Search Engines.
1.8 Algorithms
An algorithm is a set of step-by-step instructions to solve a given problem or achieve a specific goal. A
cooking recipe written on a piece of paper is an example of an algorithm, where the goal is to make a
certain dinner. The steps needed to make a specific dinner are described exactly.
When we talk about algorithms in computer science, the step-by-step instructions are written in a programming language, and instead of food ingredients an algorithm works on data structures. An algorithm is a finite set of instructions, or logic, written in order to accomplish a certain predefined task. An algorithm is not the complete code or program; it is just the core logic (solution) of a problem, which can be expressed either as an informal high-level description, as pseudocode, or as a flowchart. An algorithm is said to be efficient and fast if it takes less time to execute and consumes less memory space.
Algorithms are fundamental to computer programming as they provide step-by-step instructions for
executing tasks. An efficient algorithm can help us to find the solution we are looking for, and to
transform a slow program into a faster one. By studying algorithms, developers can write better
programs.
Algorithm design approaches:
The following approaches are used when designing an algorithm, considering both its
theoretical and practical importance:
Brute force algorithm: This is the most general approach, applying the straightforward logic of the problem directly. It is also known as an exhaustive search algorithm, because it examines all the possibilities in order to find the required solution. Such algorithms are of two types:
o Optimizing: find all the solutions to the problem and then take the best one; if the value of the best solution is known in advance, the search can terminate as soon as a solution of that value is found.
o Satisficing (sometimes written "sacrificing"): the search stops as soon as an acceptable solution is found.
Divide and conquer: This approach designs an algorithm step by step: it breaks the problem down into smaller subproblems, solves each subproblem (often recursively), and then combines the subproblem solutions, so that valid output is produced for valid input. Merge sort and binary search are classic examples.
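Binary search is a compact divide-and-conquer sketch: the search range is halved at every step, so only one half of the problem is "conquered" each time.

```python
# Divide and conquer illustrated by binary search on a sorted list:
# the search range is split in half at each step.
def binary_search(data, target):
    lo, hi = 0, len(data) - 1
    while lo <= hi:
        mid = (lo + hi) // 2          # divide: pick the middle element
        if data[mid] == target:
            return mid                # found at index mid
        elif data[mid] < target:
            lo = mid + 1              # continue in the right half
        else:
            hi = mid - 1              # continue in the left half
    return -1                         # not present

print(binary_search([3, 7, 10, 19, 42], 19))  # 3
```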
Greedy algorithm: This paradigm makes the locally optimal choice on each iteration in the hope of reaching the best overall solution. It is easy to implement and has a fast execution time, but only for certain problems does it actually produce the optimal solution.
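A classic illustration is making change: greedily taking the largest coin that fits happens to be optimal for canonical coin systems such as {25, 10, 5, 1}, though for arbitrary coin sets this greedy choice can fail, which is exactly the caveat above.

```python
# Greedy coin change: at each step take the largest coin that still fits.
# Optimal for canonical coin systems like (25, 10, 5, 1); not in general.
def greedy_change(amount, coins=(25, 10, 5, 1)):
    result = []
    for coin in coins:                 # coins in descending order
        while amount >= coin:
            amount -= coin             # locally optimal choice
            result.append(coin)
    return result

print(greedy_change(63))  # [25, 25, 10, 1, 1, 1]
```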
Dynamic programming: This makes an algorithm more efficient by storing intermediate results. It follows these steps to find the optimal solution to a problem:
i. Break the problem down into subproblems.
ii. Find the optimal solutions to these subproblems.
iii. Store the results of the subproblems; this is known as memoization.
iv. Reuse the stored results so that the same subproblems are not recomputed.
v. Finally, compute the result of the overall problem.
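The steps above can be sketched with the standard memoized Fibonacci example: each subproblem's result is stored once and then reused.

```python
# Dynamic programming via memoization: store each subproblem's result
# so it is computed only once, then reuse it (steps iii and iv above).
def fib(n, memo=None):
    memo = {} if memo is None else memo
    if n in memo:
        return memo[n]              # reuse a stored subproblem result
    if n <= 1:
        return n                    # base cases
    memo[n] = fib(n - 1, memo) + fib(n - 2, memo)  # store (memoize)
    return memo[n]

print(fib(30))  # 832040
```

Without the memo dictionary, the same subproblems would be recomputed exponentially many times; with it, each value of n is computed once.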
Branch and Bound Algorithm: The branch and bound approach is typically applied to integer programming and similar combinatorial optimization problems. It divides the set of feasible solutions into smaller subsets, which are then evaluated (and pruned using bounds) to find the best solution.
Randomized Algorithm: A regular algorithm has predefined inputs and a required output; algorithms that follow a fixed, described sequence of steps from a given input to the required output are known as deterministic algorithms. What happens when randomness is introduced? In a randomized algorithm, some random bits are introduced by the algorithm into its computation, so that its behaviour, and sometimes its output, varies from run to run. Randomized algorithms are often simpler and more efficient than their deterministic counterparts.
Backtracking: Backtracking is an algorithmic technique that builds candidate solutions recursively and abandons (backtracks from) a candidate as soon as it cannot satisfy the constraints of the problem.
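A small sketch of backtracking is the subset-sum problem: elements are added to a candidate subset one at a time, and a choice is undone as soon as the remaining target can no longer be met on that branch.

```python
# Backtracking: build a candidate subset element by element, prune
# branches that overshoot the target, and undo choices that fail.
def subset_sum(nums, target, start=0, chosen=None):
    chosen = [] if chosen is None else chosen
    if target == 0:
        return list(chosen)              # constraint satisfied
    for i in range(start, len(nums)):
        if nums[i] <= target:            # prune: skip overshooting picks
            chosen.append(nums[i])       # tentative choice
            found = subset_sum(nums, target - nums[i], i + 1, chosen)
            if found is not None:
                return found
            chosen.pop()                 # backtrack: undo the choice
    return None                          # no subset works on this branch

print(subset_sum([3, 9, 8, 4, 5, 7], 15))  # [3, 8, 4]
```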
a) Time Complexity
This represents the amount of time required by a program to run to completion. The performance measure in this case is termed time complexity. The time complexity of an algorithm or a program is a function of the running time of the algorithm or program.
b) Space Complexity
This is defined as the amount of memory space required by the algorithm during the course of its execution. Space complexity must be considered seriously for multi-user systems and for systems with limited memory availability. In doing so, the following must be considered:
Instruction space
Data space
Environment space
The performance measure in such a case is called Space Complexity. The space complexity of an
algorithm or a program is a function of the space needed by the algorithm or program to run to its
completion.
Analyzing the complexity of an algorithm in terms of time and space can never yield an exact number for the time and space the computer will need to run the code: the number of memory accesses, the number of comparisons, the temporary variables occupying memory space, and so on all vary.
Asymptotic notations, therefore, are mathematical notations used to describe the running time of an algorithm as the input tends towards a particular or limiting value. They can be regarded as meaningful approximations of the functions that represent the time or space complexity of a program. Asymptotic notation simplifies the analysis and keeps us focused on the most important aspect: the growth rate.
There are three main asymptotic notations: (i) Big O notation (upper bound), (ii) Big Omega notation (Ω) (lower bound), and (iii) Big Theta notation (θ) (tight bound).
(i) Big O notation, the asymptotic upper bound: f(n) = O(g(n)) if and only if there exist positive constants C and K such that f(n) ≤ C*g(n) for all n ≥ K.
Example 1.3
f(n)                    g(n)
16n^3 + 78n^2 + 12n     n^3     f(n) = O(n^3)
34n - 80                n       f(n) = O(n)
37                      1       f(n) = O(1)
Note that g(n) is an upper bound for the function f(n).
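The first row can be checked numerically with explicit witness constants: since 16n^3 + 78n^2 + 12n ≤ (16 + 78 + 12)n^3 for n ≥ 1, the constants C = 106 and K = 1 work.

```python
# Checking the Big O witness constants for f(n) = 16n^3 + 78n^2 + 12n:
# with C = 106 and K = 1 we have f(n) <= C * g(n) for g(n) = n^3.
f = lambda n: 16 * n**3 + 78 * n**2 + 12 * n
g = lambda n: n**3
C, K = 106, 1

assert all(f(n) <= C * g(n) for n in range(K, 10001))
print("f(n) = O(n^3) verified for n in [1, 10000]")
```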
(ii) Big Omega notation (Ω), the asymptotic lower bound, pronounced "big-Omega" or just "Omega": f(n) = Ω(g(n)) if and only if there exist positive constants C and K such that f(n) ≥ C*g(n) for all n ≥ K.
Here g(n) is a lower bound for the function f(n).
Example 1.4
f(n)                  g(n)
16n^3 + 8n^2 + 2n     n^3     f(n) = Ω(n^3)
54n + 8               n       f(n) = Ω(n)
The objective here is to give the largest rate of growth g(n) that is less than or equal to the given algorithm's rate of growth f(n).
(iii) Big Theta notation (θ), the asymptotic tight bound: f(n) = θ(g(n)) if and only if there exist positive constants C1, C2, and K such that C1*g(n) ≤ f(n) ≤ C2*g(n) for all n ≥ K.
Example 1.5
If an algorithm has a time complexity of T(n) = n^2 + 3n + 4, which is a quadratic expression, then for large values of n the 3n + 4 part becomes insignificant compared to the n^2 part;
i.e. for n = 1000, n^2 will be 1,000,000 while 3n + 4 will be only 3,004.
To analyze a given algorithm, we need to know for which inputs the algorithm takes less time (performs well) and for which inputs it takes a long time. We have already seen that an algorithm can be represented in the form of an expression; that means we represent the algorithm with multiple expressions, one for the case where it takes less time and another for the case where it takes more time.
In general, the first case is called the BEST CASE and the second the WORST CASE for the algorithm; there is also the AVERAGE CASE. To analyze an algorithm we need some kind of syntax, and that forms the basis for asymptotic analysis/notation.
• Average case - Provides a prediction about the running time of the algorithm.
- Run the algorithm many times, using many different inputs that come from
some distribution that generates these inputs, compute the total running time
(by adding the individual times), and divide by the number of trials.
- Assumes that the input is random.
2.0 THE ARRAYS
An array is an ADT whose objects are sequences of elements of the same type and whose two operations are store and retrieve. Thus, if a is an array, the operations can be represented as STORE(a, i, e) and RETRIEVE(a, i), where i is termed the index and e is the element to be stored in the array. These functions are equivalent to the programming-language expressions a[i] := e and a[i] respectively, where i is termed the subscript and a is the array variable name in programming-language parlance.
Arrays may be one-dimensional, two-dimensional, three-dimensional, or in general multi-dimensional. Figure 2.1 illustrates one- and two-dimensional arrays. It may be observed that one-dimensional arrays correspond mathematically to vectors, while two-dimensional arrays correspond to matrices; in this regard, two-dimensional arrays also have the terminology of rows and columns associated with them.
Array operations
A[1:5] refers to a one-dimensional array where 1 and 5 are the lower and upper indexes, or the lower and upper bounds of the index range, respectively. Similarly, B[1:3, 1:2] refers to a two-dimensional array with 1,3 and 1,2 being the lower and upper indexes of the rows and columns respectively. Each element of the array, viz. A[i] or B[i,j], resides in a memory location, also called a cell; here a cell refers to a unit of memory.
For example, if A is an array of 5 elements, then Fig 2.2 illustrates the operations performed on A.
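The STORE/RETRIEVE view of the array ADT can be sketched as follows (a minimal Python illustration; the class name and methods are ours, chosen to mirror the notation above):

```python
# A sketch of the array ADT with explicit STORE and RETRIEVE operations,
# using a fixed-size Python list as the underlying memory cells.
class Array:
    def __init__(self, size):
        self.cells = [None] * size     # pre-allocated cells

    def store(self, i, e):             # STORE(a, i, e)  ~  a[i] := e
        self.cells[i] = e

    def retrieve(self, i):             # RETRIEVE(a, i)  ~  a[i]
        return self.cells[i]

a = Array(5)
a.store(2, 99)
print(a.retrieve(2))  # 99
```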
Types of indexing in an array:
0 (zero-based indexing): the first element of the array is indexed by the subscript 0.
1 (one-based indexing): the first element of the array is indexed by the subscript 1.
n (n-based indexing): the base index of the array can be freely chosen.
Usually, programming languages allowing n-based indexing also allow negative index values, and other scalar data types such as enumerations or characters may be used as array indexes.
Advantages of arrays include:
b. Arrays allow random access to elements. This makes accessing elements by position faster.
c. Arrays have better cache locality that can make a pretty big difference in performance.
d. Arrays represent multiple data items of the same type using a single name.
Disadvantages of arrays include:
a. Pre-allocates all needed memory up front and wastes memory space for indices in the array that are empty.
empty.
b. Fixed size: The size of the array is static (specify the array size before using it).
c. One block allocation: To allocate the array itself at the beginning, sometimes it may not be
possible to get the memory for the complete array (if the array size is big).
d. Complex position-based insertion: To insert an element at a given position, we may need to shift the
existing elements. This will create a position for us to insert the new element at the desired position. If
the position at which we want to add an element is at the beginning, then the shifting operation is more
expensive.
Arrays are also used to implement other data structures such as stacks, queues, heaps, and hash tables.
Here are some reasons for using arrays: arrays are best for storing multiple values in a single variable; arrays are better at processing many values easily and quickly; and sorting and searching the values is easier in arrays. An array is particularly useful when we are dealing with a lot of variables of the same type. For example, let's say I need to store
This is important because when arrays are declared in a program, it is essential that the numbers of
memory locations needed by the array are “booked” beforehand.
ONE-DIMENSIONAL ARRAY
The vector form of a one-dimensional array can be represented as A[L:U], and the number of elements is given by (U - L + 1). Take for instance:
Examples
(1) A[1:25]: the number of elements is (25 - 1 + 1) = 25, where the upper bound is 25 and the lower bound is 1.
(2) A[5:54] = (54 - 5 + 1) = 50.
(3) A[-1:26] = (26 - (-1) + 1) = 28.
A two-dimensional array can be represented in matrix form as A[1:U1, 1:U2], where U1 indicates the number of rows and U2 the number of columns; the number of elements in A is then U1 * U2. In general, A[L1:U1, L2:U2] has (U1 - L1 + 1) * (U2 - L2 + 1) elements.
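The element-count formulas can be captured in two small helper functions (names are ours, for illustration):

```python
# Number of elements from the index bounds: a 1-D array A[L:U] has
# (U - L + 1) elements; a 2-D array A[L1:U1, L2:U2] has
# (U1 - L1 + 1) * (U2 - L2 + 1) elements.
def size_1d(L, U):
    return U - L + 1

def size_2d(L1, U1, L2, U2):
    return (U1 - L1 + 1) * (U2 - L2 + 1)

print(size_1d(1, 25))        # 25
print(size_1d(-1, 26))       # 28
print(size_2d(1, 3, 1, 2))   # 6
```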
3.0 LINKED LISTS
A linked list is defined as a collection of nodes that can be traversed starting at the head node. It is important to note that the head is not itself a node, but the address of the first node of the list. The linked list is a traditional approach to implementing lists that makes use of pointers. It uses dynamic memory allocation, that is, it allocates memory for new list elements as needed. A linked list is made up of a series of objects, called the nodes of the list, each of which is a distinct object.
Linked lists are very important where the program needs to manage memory well and a contiguous block is not needed. The linked list is a very flexible, dynamic data structure, that is, a structure which grows or shrinks as the data it holds changes. Lists, stacks and trees are all dynamic structures: items may be added to them or deleted from them at will. A programmer need not worry about how many items a program will
have to accommodate: this allows us to write robust programs which require much less maintenance. A
very common source of problems in program maintenance is the need to increase the capacity of a
program to handle larger collections: even the most generous allowance for growth tends to prove
inadequate over time!
An array of linked list is an important structure because it combines a static structure (an array) and
dynamic structure (linked list) to form a useful data structure.
The abstract data type (ADT) was defined as a set of data objects together with the basic operations that can be performed on that set. In this regard, a linked list is an abstract data type used for storing data collections. The linked list has the following properties.
d. Can be made just as long as required (until systems memory exhausts)
e. Does not waste memory space (but takes some extra memory for pointers).
Fig 3.1
b. Delete: removes and returns the element at the specified position in the list.
There are many other data structures that do the same thing as linked lists. Before discussing linked lists it
is important to understand the difference between linked lists and arrays. Both linked lists and arrays are
used to store collections of data, and since both are used for the same purpose, we need to differentiate
their usage. That means in which cases arrays are suitable and in which cases linked lists are suitable.
Linked lists have both advantages and disadvantages. The advantage of linked lists is that they can be
expanded in constant time. To create an array, we must allocate memory for a certain number of elements.
To add more elements to the array when full, we must create a new array and copy the old array into the
new array. This can take a lot of time.
We can prevent this by allocating lots of space initially but then we might allocate more than we need and
waste memory. With a linked list, we can start with space for just one allocated element and add on new
elements easily without the need to do any copying and reallocating.
There are a number of issues with linked lists. The main disadvantage of linked lists is access time to
individual elements. Array is random-access, which means it takes O(1) to access any element in the
array. Linked lists take O(n) for access to an element in the list in the worst case. Another advantage of
arrays in access time is special locality in memory. Arrays are defined as contiguous blocks of memory,
and so any array element will be physically near its neighbors. This greatly benefits from modern CPU
caching methods.
TYPES OF LINKED LIST
a Singly-linked list
The simplest kind of linked list is a singly-linked list (or list for short), which has one link field (LINK) per node and one or more data fields (DATA). The link points to the next node in the list, or to a null value (empty list) if it is the final node.
Each node of a singly-linked list thus contains two parts: the first part holds or points to information about the node, and the second part holds the address of the next node. A singly linked list can be traversed in one direction only.
Example
A 27 B 34 H 80
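A minimal singly linked list can be sketched in Python as follows (the class and method names are ours; the `link` field plays the role of the LINK field above, with `None` as the null value):

```python
# A minimal singly linked list: each node holds a data field and a link
# to the next node (None marks the end of the list).
class Node:
    def __init__(self, data):
        self.data = data
        self.link = None               # LINK field; None = end of list

class SinglyLinkedList:
    def __init__(self):
        self.head = None               # address of the first node

    def insert_front(self, data):      # grows in constant time
        node = Node(data)
        node.link = self.head
        self.head = node

    def traverse(self):                # one-way traversal from the head
        out, cur = [], self.head
        while cur is not None:
            out.append(cur.data)
            cur = cur.link
        return out

lst = SinglyLinkedList()
for x in ("H", "B", "A"):
    lst.insert_front(x)
print(lst.traverse())  # ['A', 'B', 'H']
```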
b Doubly-linked list
A more sophisticated kind of linked list is a doubly-linked list, or two-way linked list. Each node has two links: one points to the previous node (or to a null value / empty list if it is the first node), and one points to the next node (or to a null value / empty list if it is the final node). A doubly-linked list is a linked linear data structure, each node of which has one or more data fields but exactly two link fields, termed the left link (LLINK) and the right link (RLINK). The LLINK field of a given node points to the node on its left and its RLINK points to the one on its right. A doubly linked list may or may not have a head node, and may or may not be circular.
A B G
Fig 3.3
c Circularly-linked list
In a circularly-linked list, the first and final nodes are linked together. This can be done for both singly
and doubly linked lists. To traverse a circular linked list, you begin at any node and follow the list in
either direction until you return to the original node. Viewed another way, circularly-linked lists can be
seen as having no beginning or end. This type of list is most useful for managing buffers for data ingest,
and in cases where you have one object in a list and wish to iterate through all other objects in the list in
no particular order. The pointer pointing to the whole list may be called the access pointer. A circularly-
linked list containing three integer values.
Fig 3.4
Linked lists are good to use when you have an unknown number of items to store. Using a data structure
like an array would require you to specify the size up front; exceeding that size involves invoking a
resizing algorithm which has a linear run time. You should also use linked lists when you will only
remove nodes at either the head or tail of the list to maintain a constant run time. This requires
maintaining pointers to the nodes at the head and tail of the list but the memory overhead will pay for
itself if this is an operation you will be performing many times. What linked lists are not very good for is
random insertion, accessing nodes by index, and searching. At the expense of a little memory and a few
more read/writes you could maintain a count variable that tracks how many items are contained in the list
so that accessing such a primitive property is a constant operation - you just need to update count during
the insertion and deletion algorithms.
Singly linked lists should be used when you are only performing basic insertions. In general doubly
linked lists are more accommodating for non-trivial operations on a linked list. We recommend the use of
a doubly linked list when you require forwards and backwards traversal. In most cases this
requirement is present. For example, consider a token stream that you want to parse in a recursive descent
fashion. Sometimes you will have to backtrack in order to create the correct parse tree. In this scenario a
doubly linked list is best as its design makes bi-directional traversal much simpler and quicker than that of
a singly linked list.
4.0 GRAPHS
A map is a well-known example of a graph, which is a nonlinear data structure. Different linkages
between the cities are made on a map. The cities are linked by a network of highways, railroads,
and aerial routes.
A graph G = (V, E) is made up of a finite non-empty set V of vertices, also referred to as points or nodes, and a finite set E of unordered pairs of distinct vertices, known as edges, arcs, or links.
EXAMPLE: Figure 2.1 below illustrates a graph. Here V = {a,b,c,d} and E = {(a,b), (a,c), (b,c), (c,d)}. It is often convenient to label the edges, as shown in the figure.
e1
a b
e2 e3
c
e4 d
Fig. 2.1: A graph
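The graph of Fig. 2.1 can be stored as an adjacency list, a common sketch for representing graphs in code; since the graph is undirected, each edge is recorded in both directions.

```python
# The graph of Fig. 2.1 as an adjacency list: V = {a, b, c, d},
# E = {(a,b), (a,c), (b,c), (c,d)}. Undirected, so each edge is
# recorded under both of its endpoints.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]

adj = {v: [] for v in "abcd"}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

print(adj["c"])    # ['a', 'b', 'd']
print(len(edges))  # 4 edges
```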
Multigraph
A multigraph G = (V, E) also consists of a set of vertices and edges, except that E may contain multiple edges (i.e. edges connecting the same pair of vertices) and may contain loops or self-edges (i.e. edges whose end points are the same vertex).
Example: Figure 2.2 illustrates a multigraph. Observe the multiple edges e1, e2 connecting vertices a, b and e5, e6, e7 connecting vertices c, d respectively. Note also the self-edge e4.
It has to be made clear, however, that (simple) graphs do not contain multiple edges or loops, and hence are different from multigraphs.
A graph whose definition makes reference to unordered pairs of vertices as edges is known as an undirected graph. An edge eij of such an undirected graph is represented as (vi, vj), where vi, vj are distinct vertices; thus an undirected edge (vi, vj) is equivalent to (vj, vi).
Directed graphs, or digraphs, on the other hand, have edges which are directed, i.e. which are ordered pairs of vertices. A directed edge eij is written <vi, vj>, which is distinct from <vj, vi>; in <vi, vj>, vi is known as the tail of the edge and vj as the head.
In Fig. 2.3(a), e1 is a directed edge between v1 and v2, i.e. e1 = <v1, v2>, whereas in Fig. 2.3(b) e1 is an undirected edge between v1 and v2, i.e. e1 = (v1, v2).
Edges (G1): {<v1,v2>, <v2,v1>, <v1,v3>, <v3,v4>, <v4,v3>} or {e1, e2, e3, e4, e5}
Edges (G2): {(v1,v2), (v1,v3), (v2,v3), (v3,v4)} or {e1, e2, e3, e4}
In the case of an undirected edge (vi, vj) in a graph, the vertices vi, vj are said to be adjacent, or the edge (vi, vj) is said to be incident on vertices vi, vj. Thus in Fig. 2.3(b) vertices v1, v3 are adjacent to vertex v2, and the edges e1: (v1, v2) and e3: (v2, v3) are incident on vertex v2.
On the other hand, if <vi, vj> is a directed edge, then vi is said to be adjacent to vj and vj is said to be
adjacent from vi. The edge <vi, vj> is incident to both vi and vj. Thus in Fig.2.3(a) vertices v2 and v3 are
adjacent from v1, and v1 is adjacent to vertices v2 and v3. The edges incident to vertex v3 are <v1, v3>,
<v3, v4> and <v4, v3>.
Complete graphs
The number of distinct unordered pairs (vi, vj), vi ≠ vj, in a graph with n vertices is nC2 = n(n-1)/2.
An n-vertex undirected graph with exactly n(n-1)/2 edges is said to be complete.
Example: Figure 2.4 illustrates a complete graph. The undirected graph with 4 vertices has all its
4C2 = 6 edges intact.
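The count nC2 = n(n-1)/2 can be confirmed by enumerating the unordered vertex pairs directly; a minimal sketch using the standard library:

```python
from itertools import combinations

def complete_graph_edges(n):
    """All unordered pairs of n vertices, i.e. the edges of the complete graph."""
    return list(combinations(range(n), 2))

# The 4-vertex complete graph of Fig. 2.4 has 4C2 = 4*3/2 = 6 edges.
print(len(complete_graph_edges(4)))  # 6
print(4 * (4 - 1) // 2)              # 6
```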
PATH
A path from a vertex Vi to vertex Vj in an undirected graph G is a sequence of vertices Vi, Vl1,
Vl2, ..., Vlk, Vj such that (Vi, Vl1), (Vl1, Vl2), ..., (Vlk, Vj) are edges in G. If G is directed,
then a path from Vi to Vj is known as a directed path and consists of the edges <Vi, Vl1>, <Vl1, Vl2>,
..., <Vlk, Vj> in G.
Example: Figure 2.7(a) illustrates a path P1 from vertex V1 to V4 in graph G1 of Fig. 2.3(a), and
Fig. 2.7(b) illustrates a path P2 from vertex V1 to V4 in graph G2 of Fig. 2.3(b).
The length of a path is the number of edges on it.
A simple path is a path in which all the vertices, except possibly the first and the last, are
distinct.
Example: In graph G2 (Fig. 2.3(b)), the path from V1 to V4 given by {(V1,V2), (V2,V3), (V3,V4)} and
written as {V1,V2,V3,V4} is a simple path, whereas the path from V3 to V4 given by {(V3,V1), (V1,V2),
(V2,V3), (V3,V4)} and written as {V3,V1,V2,V3,V4} is a path but not a simple path, owing to the
repetition of vertex V3. Also, in graph G1 (Fig. 2.3(a)), the path from V1 to V3 given by {<V1,V2>,
<V2,V1>, <V1,V3>} and written as {V1,V2,V1,V3} is a mere path, not a simple path, owing to the
repetition of vertices. However, the path from V2 to V4 given by {<V2,V1>, <V1,V3>, <V3,V4>}, written
as {V2,V1,V3,V4}, is a simple path.
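The distinction between a mere path and a simple path can be checked mechanically: a vertex sequence is simple when no vertex repeats, except possibly the first and last (the cycle case defined just below). A small sketch, with the function name my own:

```python
def is_simple_path(vertices):
    """True if no vertex repeats, except possibly first == last (a cycle)."""
    if vertices[0] == vertices[-1]:
        interior = vertices[:-1]  # drop one copy of the shared endpoint
    else:
        interior = vertices
    return len(interior) == len(set(interior))

print(is_simple_path(["V1", "V2", "V3", "V4"]))        # True  (simple path in G2)
print(is_simple_path(["V3", "V1", "V2", "V3", "V4"]))  # False (V3 repeats mid-path)
print(is_simple_path(["V1", "V2", "V1", "V3"]))        # False (V1 repeats in G1)
print(is_simple_path(["V1", "V2", "V3", "V1"]))        # True  (a cycle)
```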
A cycle is a simple path in which the first and last vertices are the same. A cycle is also known as a
circuit, elementary cycle, circular path or polygon.
Example: In graph G2 (Fig. 2.3(b)), the path {V1,V2,V3,V1} is a cycle. Also, in graph G1 (Fig. 2.3(a)),
the path {V1,V2,V1} is a cycle, or more specifically a directed cycle.
Strongly connected: A directed graph is said to be strongly connected if every pair of distinct
vertices Vi, Vj is connected (by means of a directed path). Thus if there exists a directed path from
Vi to Vj, there also exists a directed path from Vj to Vi.
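One way to test strong connectivity (a sketch of my own, not taken from the text) is to run a breadth-first reachability search from every vertex and check that each search visits the whole vertex set; the dict-of-lists input is the adjacency-list idea introduced later in this chapter:

```python
from collections import deque

def is_strongly_connected(adj):
    """adj maps each vertex to the vertices reachable by one directed edge.
    A digraph is strongly connected iff every vertex reaches every other."""
    vertices = set(adj)

    def reachable(start):
        seen = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return seen

    return all(reachable(v) == vertices for v in vertices)

# G1 of Fig. 2.3(a): edges <v1,v2>, <v2,v1>, <v1,v3>, <v3,v4>, <v4,v3>
g1 = {"v1": ["v2", "v3"], "v2": ["v1"], "v3": ["v4"], "v4": ["v3"]}
print(is_strongly_connected(g1))  # False: no directed path from v3 back to v1
```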
Degree: The degree of a vertex in an undirected graph is the number of edges incident to that vertex.
A vertex with degree one is called a pendant vertex or end vertex. A vertex with degree zero, and hence
with no incident edges, is called an isolated vertex.
Example: In graph G2 (Fig. 2.3(b)), the degree of vertex V3 is 3 and that of vertex V2 is 2. In the
case of digraphs, we define the in-degree of a vertex V to be the number of edges with V as the head,
and the out-degree of a vertex V to be the number of edges with V as the tail.
Example: In graph G1 (Fig. 2.3(a)), the in-degree of vertex V3 is 2 and the out-degree of vertex V4 is 1.
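In-degrees and out-degrees fall straight out of the tail/head definition: scanning a directed edge list once, each edge <tail, head> adds one to the tail's out-degree and one to the head's in-degree. A minimal sketch (function and variable names are illustrative):

```python
def degrees(edges, vertices):
    """In-degree and out-degree of each vertex from a directed edge list."""
    indeg = {v: 0 for v in vertices}
    outdeg = {v: 0 for v in vertices}
    for tail, head in edges:   # each edge is <tail, head>
        outdeg[tail] += 1
        indeg[head] += 1
    return indeg, outdeg

# Directed graph G1 of Fig. 2.3(a)
g1_edges = [("v1", "v2"), ("v2", "v1"), ("v1", "v3"), ("v3", "v4"), ("v4", "v3")]
indeg, outdeg = degrees(g1_edges, ["v1", "v2", "v3", "v4"])
print(indeg["v3"], outdeg["v4"])  # 2 1, matching the example above
```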
Isomorphic graphs: Two graphs are said to be isomorphic if there is a one-to-one correspondence
between their vertices that preserves adjacency; in other words, the graphs are identical except for
the labels of their vertices.
REPRESENTATIONS OF GRAPHS
The representation of graphs in a computer can be categorized as (i) sequential representation and
(ii) linked representation. Of the two, the sequential representation has several methods, all of
which follow a matrix representation and hence call for implementation using arrays. The linked
representation of a graph makes use of singly linked lists as its fundamental data structure.
The sequential or matrix representation of graphs includes the following methods:
The adjacency matrix of a graph G with n vertices is an n x n binary matrix A = [aij] (symmetric when
G is undirected) defined as:
aij = 1 if the ith and jth vertices are adjacent, i.e. there is an edge connecting them;
aij = 0 otherwise, i.e. if there is no edge linking the vertices.
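The definition above can be sketched directly: start from an n x n matrix of zeros and set aij = 1 for every edge; for an undirected graph, set aji = 1 as well so the matrix comes out symmetric. The vertex numbering below is an assumption for illustration:

```python
def adjacency_matrix(n, edges, directed=False):
    """n x n binary matrix A with A[i][j] = 1 iff vertices i and j are adjacent."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1
        if not directed:
            a[j][i] = 1   # undirected graph: the matrix is symmetric
    return a

# Undirected graph G2 of Fig. 2.3(b), with v1..v4 numbered 0..3
g2_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
for row in adjacency_matrix(4, g2_edges):
    print(row)
# [0, 1, 1, 0]
# [1, 0, 1, 0]
# [1, 1, 0, 1]
# [0, 0, 1, 0]
```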
LINKED REPRESENTATION OF GRAPHS
The linked representation of graphs is referred to as the adjacency list representation and is
comparatively efficient with regard to the adjacency matrix representation, especially for sparse
graphs.
Given a graph G with n vertices and e edges, the adjacency list representation opens n head nodes
corresponding to the n vertices of G, each of which points to a singly linked list of the vertices
adjacent to the vertex represented by the head node.
Example: Figure 2.18(a) illustrates a graph, (b) its adjacency matrix representation and (c) its
adjacency list representation.
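The head-node structure described above can be sketched in Python, with a dict playing the role of the n head nodes and plain lists standing in for the singly linked lists (an idiomatic substitution, not the pointer-based layout the text describes):

```python
def adjacency_list(vertices, edges, directed=False):
    """One head entry per vertex, each holding the list of adjacent vertices.
    Python lists stand in for the singly linked lists of the text."""
    adj = {v: [] for v in vertices}
    for u, w in edges:
        adj[u].append(w)
        if not directed:
            adj[w].append(u)   # undirected: record the edge at both ends
    return adj

# Undirected graph G2 of Fig. 2.3(b)
g2 = adjacency_list(["v1", "v2", "v3", "v4"],
                    [("v1", "v2"), ("v1", "v3"), ("v2", "v3"), ("v3", "v4")])
print(g2["v3"])  # ['v1', 'v2', 'v4']
```

Note the storage trade-off this makes concrete: the matrix always holds n*n entries, while the adjacency list holds n head entries plus one list node per edge end, which is why it wins on sparse graphs.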
Exercises:
1) From graph G2 above, find: i) an isolated vertex ii) the in-degree and out-degree of each vertex