0 ratings0% found this document useful (0 votes) 223 views21 pagesData Structures
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content,
claim it here.
Available Formats
Download as PDF or read online on Scribd
Data Structure
Data Structure
+ A data structure is a specialized format for organizing an
memory. General data structure types include array, file, recor
tree, graph and so on.
= A data structure can be viewed as consisting of a set of algorithms for
performing operations on the data it stores.
ae |
Tear data suture Non inear a
[ree] renh
storing data in
cd, table,
1, Linear Data Structure
A linear data structure traverses the data elements sequentially in which only
‘one data element can directly be reached eg., arrays, linked list, stack and
queue.
2. Non-Linear Data Structure
‘= Every data item is aitached to several other data items in a way that is
specific for reflecting relationships
= The data items are not arranged in a sequential structure eg, tree and
graph.
Memory Allocation in C
‘There are two types of memory allocations in C.
1. Compile-time of static allocation
2, Run-time or dynamic allocation
Compile-Time Allocation
«= In this type of allocation, the required amount of memory is allocated to
the program element at the start of the program.
syllabus
= Dota, information, Def
data structure
+ Arrays, stacks, Queues, Like
Lists, Trees, Graphs, Priory
Queues and Heaps
* File structure : Fields, Records
and Fil
‘= Sequential, Direct, Index
sequential and Relative fies
‘= Hashing, Inverted lists and
id+ Here, the memory to be allocated to the variable is
fixed and is determined by the compiler at the
compiler time.
‘eg, Consider the following declaration
int x,
float a5);
= When first element is encountered, the compiler
will allocate two bytes to each variables x and y
+ The second statement results into the allocation of
20 byte to the array a
Run-Time Allocation
€ provides the following dynamic allocation and
deallocation functions
(i) malloc ()
(ii) free ()
{)) The Malloc () Function
‘The malloc ( ) function allocates a block of memory in
bytes. The syntax of this function is as follows
malloc (number of elements * size of each element);
46,
(ii) calloc ()
{iv) realloc ()
int *ptr;
ptr = malloc (10*size of (int))
‘where, size represents the size of memory required in
bytes.
{ii) The Calloc ( ) Function
This function works exactly similar to malloc ( )
function except for the fact that it needs two arguments
fs against one argument required by malloc ( ).
£80
int "ptr;
pir = [int *) calloc (10, 2);
(ii) The Free () Function
"The free () function is used to deallocate the previously
allocated memory using malloc ( ) or calloc ( )
"functions.
‘The syntax of this function is
free (ptr-var);
here, ptr-var is the pointer in which the address of the
cd memory is assigned.
The Realloc () Function
function is used to resize the size of memory block
is already allocated (i.e, to modify the size of
allocated memory block).
Syntax ptrvar = realloc (prt_var, new_size);
Data Structures
where, ptr_var is the pointer holding the starting
address of already allocated memory block and
new size is the size in bytes you want the system to
allocate now, it may be smaller or greater than the size
of previously allocated memory block depending upon
the requirement.
Algorithm Performance
= An algorithm is any well defined computational
procedure that take some values as input and
produces some values as output.
= The performance evaluation of an algorithm is
obtained by totalling the number of occurrences
of each operation when running the algorithm.
1. 6-Notation
This notation bounds a function to within constant
factors. We can say f(n)=O(g(n)), if there exists
positive constant np, C, and C,.
2 (0)
Ho)
i e910)
7 i
Kn) = (a1)
Fig. 2
2. O-Notation (Upper Bound)
This notation gives an upper bound for a function to
within a constant factor. We can write f(n) = O(¢(n)), if
there are positive constants ny and C’such that to the
right of mo, the value of /(n) always lies on or below
Clr).
cain)
fo)
a) = O¢@4n))
Fig. 3ats cee ns Tack ana
124
3. Q-Notation (Lower Bound)
This notation gives a lower bound for a function to
within a constant factor. We can write f(n) = O¢(n), if
there are positive constants ny and C such that to the
right of mo, the value of f(n) always lies on or above
Cain).
Analysis of Algorithm
= The analysis of algorithms is the determination of
the amount of resources (such as time and storage)
necessary to execute them.
= The best, worst and average cases of a given
algorithm express what the resource usage is
atleast, atmost and on average respectively.
+ Usually the resource being considered is running
time but it could also be memory or other
resources
1. Best Case Performance
The term best case performance is used in computer
science to describe the way, an algorithm behaves
under optimal conditions
eg, The best case for a simple linear search on a
listoccurs when the desired element is the first element
of the list.
2. Worst Case Performance
‘The running time for any given size input will be lower
than the upper bound except possible for some values
of the input, where the maximum is reached.
3. Average Case Complexity
‘The running time for any given size input will be the
average number of operations over all problem
instances for a given size.
Categories of Algorithms,
Based on Big Oh notation, the algorithms can be
categorized as| :
UGC NET Zz» Computer Science & Applications
1. Constant time (0{1)) algorithms
2, Logrithmic time (O(log n)) algorithms
3, Linear time (O(n)) algorithms
4. Polynomial time (0(n"'), for &> I) algorithms
5. Exponential time (O{A"), for k> 1) algorithms
Array
‘= A collection of items having same data type stor
in contiguous memory allocation.
«Elements are stored contiguously with the fig
clement stored at the smallest memory address,
One-dimensional Arrays
A one-dimensional array is one in which only one
subscript specification in needed to specify
particular element of the array.
+ One-dimensional array can be declared x
follows
data_type var_name (Expression];
Data type is the type of elements to be stored a
the array, var_name specifies the name of array.
eg, int num[10};
Accessing Elements of an Array
Byte adress element [x] = base address
+ size (x-first index)
eg, Let the base address of the first clement of te
array is 200 and each element of array occupies 4 byte
in the memory, then address a fifth element of &
one-dimensional array a{10] will be given as—
Address of element a5] = 200 + 4 *(5 ~ 0)
= 200+4°5
200 +20
220
Stack
= A stack is a list to which items are added or fio®
which items can be deleted at one end only. TH
end is called as top.
+ Stack follows the LIFO policy because the elem!
which is added last will be removed first.
Operations on Stack ’
+ The insertion operation of elements into stack
called push operation.
+ The deletion of elements from stack is called P%
operation.
-Uses of Stacks
1. Function Calls
When a function is called all local storage for the
function is allocated on the system ‘stack’ and the return
address is pushed on the system stack
2. Implementing Recursion
They can be used to implement recursion, if the
programming language does not provide the facility for
the same,
3. Reversing a List
We can also use stack for reversing a list.
4, Parsing
Stacks are used by compilers to check the syntax of the
program and for generating executable code.
5. For Evaluating Expressions
Example of how a stack can be used for checking for
syntax.
{a) Infix expression
This the one where the binary operator comes between
the operand
cg, A+B
{b) Postfix expression
Here, the binary operator comes after the operands.
eg, AB+
(0) Prefix expression
Here, the binary operator proceedes the operands.
eg, + 4B
Queues
= A queue is an ordered collection of items from
which items may be deleted at one end (called the
front to the queue) and into which items may be
inserted at the other end (called the rear to the
queue).
+ The terms fronts and rear are used in describing a
linear list only when it is implemented as queue.
= Queues are also called First In First Out (FIFO)
lists.
static and Dynamic Lists
+ A static data structure is a data structure created for
an input data set which is not supposed to change
within the scope of the problem.
Data Structures 12!
= When a single element is to be added to deleted,
the update of a static data structure incurs
significant costs, often comparable with the
construction of the data structure.
= In real applications, dynamic data structures are
used, which allow for efficient updates when data
elements are inserted or deleted
Circular Queue
+ A circular queue is one in which the insertion of a
new element is done at the very first location of
the queue, if the last location of the queue is full.
210)
a 44)
Co) G21
Fig. 5
+A circular queue overcomes the problem of
‘unutilized space in linear queues implemented as
arrays.
Double Ended Queues (DEQUE)
It is also a homogeneous list of elements in which
insertion and deletion operations are performed from
both the ends.
i.e, We can insert elements from the rear end or from
the front ends. It is called as Double Ended Queue
(DEQUE).
Priority Queues
= Priority queues are useful data structures in
simulations particularly for maintaining a set of
future events ordered by time so that we can
quickly retrieve what the next thing to happen is.
In priority queue, elements are pulled highest
priority first.
Linked List
+ A linked list is a complex data structure, especially
useful in systems or applications programming. A
linked list is comprised of a series of nodes, each
node containing a data element and a pointer to
the next node.
= A linked list or one way list is a linear collection
of data elements called nodes, where the linearfi = applications
126 UGCNETZm Computer Science & Application
order is given by the means of pointers. i¢, Each
node is divided into two parts, the first part =a
contain the information of the element and second parol
part called the linked field or next pointer field, (node 1) (Node 2)
contain the address of the next node in the list. Fig. 6
Doubly Linked Lists
= Linked lists containing integer values or pointers to
directions are called the doubly linked lists
«= Each element in the list contains a piece of data togethe!
elements in the list.
= Using these pointers, it is possible to move through the list in
Stat
data with the ability to iterate over the list in bog
- with pointers which link to the previous and neg
both directions,
ater rs Doo be Dro ot
node 3
«t
node 1 ode 2
Fig. 7 Doubly linked lists
A <= 4p < Ag <2 Ay
# Sorting algorithm is an algorithm that pug
clements of a list in a certain order.
Circular Lists
A linked list whose last node points back to the first
node instead of containing the null pointer, called a
circular linked list. .
Selection Sort
i This type of sorting is called ‘selection sort? because it
Searching works: 5 repeatedly element. It works as follows :
A First find the smallest in the array and exchange it with
1. Linear Search 1 the element in the first position, then find the second
+ Search each record of.the file one at a time unt smallest element and exchange it with the element
finding the given name and hence the he second position and continue in this way untl the
corresponding record Seilesay nacre
= This method of searching for data in an array is )
straight forward and easy to understand. Insertion Sort
= To find a given item, begin your search at the ratect Sawai
a saitello : s -ps making the left side of the
start of the data collection and continue to loop rey sorted uid ie Slee er ea
until you have either found the target or a
ee je values seen so far and repeatedly insets
exaust” the.search gpaes, unseen values in the array into the left sored
a
2. Binary Search
Compare the given value with the value in the middle
of the list list has which half of the list contains value.
+ It has a time complexity on @(n”), thus being
slower than heapsort, merge sort and also shel
sort.
Sorting Merge Sort ig
= Let A be the list of n elements 4,, A,,...,4, in Merge sort has time complexity 0 (n log n) comparisel
memory sorting A refers to the operating of based sorting alggrthand Lataieel eee ee
rearranging the contents of A so that there are stable, means it preserves the input order of et
inerpasing in order that is so that elements in the sorted sequence.
SS hhh ZQuick Sort
Quick sort sorts by employing a divide and conquer
strategy to divide a list into sublists
The steps are as folows
1. Pick an element, called a pivot from the list
2, Reorder the list so that all the elements which are
less than the pivot come before the pivot and so
that all elements greater than the pivot come after
it, After this partitioning, the pivot is in its final
position.
Recursively sort the sublist of lesser elements and
the sublist of greater elements.
Trees
Trees are the types of a non-linear data structure. A
data structure accessed beginning at the root node.
Each node is either a leaf or an intemal node. An
internal node has one or more child nodes and is
called the parent of its child nodes.
* All children of the same node are siblings. The
root is usually depicted at the top of the structure
and leaves are depicted at the bottom.
Height of Tree
"= The maximum distance of any leaf from the root
of a tree. If a tree has only one node, the height is
zero. The height of a tree is also known as the
order.
= The degree of a node is the number of subtrees
associated with that node.
"= A node of degree zero has no subtrees, such a
node is called a leaf.
Binary Tree
A binary tree 7’ is defined as a finite set of
¢ elements, called nodes, such that
1
* 2.
an ordered pair of disjoint binary trees 7, and
7
If T does contain a root R, then the two trees 7,
and 7, are called respectively the left and right
subtrees of R.
= It 7, is non-empty, then its root is called the left
successor of RSimilarly, if T, is non-empty, then
_ its root is called the right successor of R
Search Tree
‘Search Tree (BST) is a binary tree (7') in which
internal node of T' stores an element such that the
T is empty (called the null tree or empty tree).
T contains a distinuguished node R called the
root of T and the remaining nodes of 7 form
Data Structures
element stored in the left subtree of 7 is less than or
equal to that node end elements stored in the right
subtree of T are greater than or equal to that node.
Traversing Binary Trees
There are three standard ways of traversing @ binary
tree T with root R. These three algorithms are called
preorder, inorder and postorder.
(@ Preorder
(a) Process the root R
(b) Traverse the left subtree of R in preorder.
(c) Traverse the right subtree of R in preorder
(ii) Inorder
(a) Traverse the left subtree of R in inorder.
(b) Process the root R
(c) Traverse the right subtree of R inorder.
(ii) Postorder
(a) Traverse the left subtree of R in postorder.
(b) Traverse the right subtree of R in postorder.
(©) Process of the root R
Breadth First Traversal
+ The breadth first traversal of a tree visits the nodes
in the order of their depth in the tree.
= Breadth first traversal first visits all the nodes at
depth zero (ie, root), then all the nodes at depth
fone and so on. At each depth, the nodes are
visited from left to right.
++ Breadth first traversal uses queue for traversing.
Depth First Traversal
Depth first traversal is an algorithm for traversing or
searching a tree, tree structure or graph. One starts at
root and explores as far as possible along each branch
before backtracking.
Height Balanced Tree
= A height balanced tree which is also a binary
search tree. It supports insert and delete
‘operations in time logarithmic in the number of
nodes in the tree.
+ Self balancing binary search tree or height
balanced binary search tree is a binary search tree
that attempts to keep its height or the number of
levels of nodes beneath the root, as small as
possible at all times, automatically,meena
128 UGC NET Zz computer Science & Applications
AVL Tree
An AVL tree is a self balancing binary search tree and
it is the first such data structure to be invented. In an
AVL tree, the height of the two child subtrees of any
node differ by a most one, therefore it is also said to be
height balance
Graphs
+A graph is a set of vertices and edges which
connect them
+ A graph G consists of two things
L.A set of vertices V
2. A set of edges E such that each edge ¢ in E is
identified with a unique pair [u, 2] of vertices in
V, denoted by ¢ = fu, 2}
Directed Graph
A directed graph G, also called digraph, is the same as a
multigraph except that each edge ¢ in G is assigned a
direction or in other words each edge ¢ is identified
th an order pair (u,v) of nodes in G rather than an
d pair |u, v}.
imate
Leal |
‘Undirected graph
Fig. 8
Strongly Connected Graph
A directed graph G is said to be connected or strongly
connected, if for each pair (u, v) of nodes in G there is a
path from u to» and there is also a path from 2 to u.
Spanning Tree
A spanning tree of graph, G is a set of |o| ~1 edges that,
connect all vertices of the graph.
Minimum spanning tree
The Minimum Spanning Tree (MST) of a graph defines
the cheapest subset of edges that keeps the graph in
‘one connected component.
Types of Binary Trees
1, Rooted binary tree is a tree with a root node in
which every node has atmost two children.
2. A full binary tree (strictly binary tree or 2-tree)
is a tree in which every node other than the
leaves has two children.
A perfect binary tree is a full binary
3 A Pty all leaves are at the same depth or 8
Tevel and in which every parent hy,"
children
4. A complete binary tree is a binary
hich every level, except possibly the ge ®
Completely filled and all nodes are as freg®
possible. q
Propeties of Binary Trees
1. The number of nodes m in a perfect bin
am be found using this formula,
aera
where, A is the height of the tree.
2. The number of nodes n in a complete b
tree is aleast m= 2" and atmost n =2!-1.)
where A is the height of the tree.
3. The number of nodes m in a perfect bin
can also be found using this formula n =L-p
where L is the number of leaf nodes inthe tee
4, The number of null links in a complete binay
tree of n nodes in (n +1)
5, The number of internal nodes in a compley
binary tree of n nodes is | n/2}
Heap Tree
|
= A binary tree has the heap property iff
(a) Itis empty or |
(b) The key in the root is larger than that in either | '
child and both subtrees have the hes}
property. /
+ Heap can be used as a priority queue. The hight
priority item is at the root and is trivaly| |
extracted. But if the root is deleted, we are let | -
with two subtrees and we must efficiently recs
a single tree with the heap property. '
Fig. 9(a)Data Structures
1. Sequential File Organization
+ A sequential file contains records organized by the
order in which they were entered. The order of
the records is fixed.
= Records in sequential file can be read or written
only sequentially
2. Indexed File Organization
Maximum heap because maximum
lement isa root node + Each record in the file has one or more embedded
Fig. 9(b) keys (referred as key data items); each key is
= We can create a minimum heap tree when the associated with an index.
An index provides a logical path to the data
records according to the contents of the associated
minimum element is at root node,
ile Structure embedded record key data items.
Fi = Index files must be direct access storage files.
Files = Records can be fixed length or variable length.
Afiile can be seen as = Each record in an indexed file must have
embedded prime key data item. When records are
inserted, updated or deleted, they are identified
solely by the values of their prime keys.
AStream File ‘= Thus, the value in each prime key data item must
le is viewed as a sequ be unique and must not be changed when the
ae record is updated.
er,
a Data semantics is lost : there is no way to get it 3. Relative File Organization
=a stream of bytes (no structure) or
=a collection of records with fields
apart again. ‘= Think of the file as a string of record areas, each
ade of which contains a single record. Each record
Field and Record Organization area is identified by a relative record number; the
access method stores and retrieves a record based
on its relative record number.
eg, The first record area is addressed by relative
Record A collection of related fields.
Field The smallest logically meaningful unit of
formation in a file.
a rd ber 1 and the 10th is
Key A subset of the fields in a record used to identify peinterscatetateateen o addressed by
eee) a recocl ‘SREiative' files tnt be direct acceat fen
lecord Keys 4. Hashing
Primary key A key that uniquely identifies a record, ‘= The search time of each algorithm depend on the
= Secondary key Other keys that may be used for number n of elements of the collection $ of the
Ch. data, A searching technique called hashing or hash
addressing which is essentially independent of the
Organization number n.
‘+ There are a basic types of organization— ‘= Hashing is used to index and retrieve items in a
1. Sequential database because it is faster to find the item using
a the shorter hashed key than to find it using the
_B. Indexed sequential (relative) CONS
4. Hashing F
Hash Function
In all cases, we view a file as a sequence of records
A records is a list of fields. Each field has a data * Hash functions are mostly used in hash tables, to
type quickly locate a data record given its search key.3 UGC NET
= The hash function is used to m
the index of a slot in the ‘able eee
Corresponding record is supposedly
Following are the different hash Reeeors "4
sh functions
(A) Division Method
In division remainder method, key k is divided by a
number m larger than the number n of keys in k and the
remainder of this division is taken as
faeces, tak index into the
Hk) = k mod m
The number m is usually choosen to be a prime
number or a number without small divisions, since this
frequently minimizes the number of collisions.
(8) Mid Square Method
In the mid square method, the key is first squared.
Therefore, the hash function is defined by a
Ak) = p
ee pis obtained by deleting digits from both sides
of k?
Example 1. Consider a hash table with 50 slots ie,
m =50 and key values k = 163, 173, 312
Sol. k: 163 173312
H: 26569 29999 97344
A(k) 34" 92" 84
(C) Folding Method
In folding method, the key & is partitioned into a
number of parts k,, k,,.... 4, where each part, except
possibly the last, has the same number of digits as the
required address. Then, the parts are added together,
ignoring the last carry ie,
Hh) =k, +k ++ hy
where, the leading digits carries, if any, are ignored.
Example 2. Consider a hash table with 10 slots i,
m =100 and key values k = 73, 76, 162, 361
Sol.
eet FT [xe
hia ||
6632-0 mod 10 2
1=10>140=1mod 10 [1
In last key, sum of parts come 10 and if we take mod
10, then it (A) will come 0 but Oth place is filled, then
a
«@* Computer Science & Applications
be broken into 1, 0 and we will
again 10 will
qf oarts 1201 and will do mod of 1, it will
» ome
place
Resolving Collisions
+ Hash collision is the process in which more
‘one key is mapped to the same memory log.
in the table '°Cig
eg, If we are using the division remainder
with following hash function meth
Then, key =4 and key =7 both map;
same location of the table Pot
Ak) = 4mod 3
Hk) =7 mod 3 ~]Hash cotisg
+ The efficiency of a hash function with a
resolution procedure is measured by the avers
umber of probes (key comparisions) needed’®
find the location of the record with a given key
= The efficiency depends on the load factor i,
(2) = average number of probes for a succeny
search
Ul) = average number of probes for
unsuccessful search
+ Load factor (4) is the ratio of the number of:
in the & to the number m of hash address in
This ratio, 2 = n/m is called load factor,
1. Collision Resolution by
Separate Chaining
In this method, all the elements where keys hash tothe
same hash table slot, are put in a one linked list.
Table
o
1
2
3[ Na
Fig. 10,
2. Collision Resolution by Linear
Probing
The linear probing uses the following hash function
Ak, i) = |W’ (k) +i] mod m fori =0, 1, 2...,m~1
where, mis the size of the hash table and fi (A) = kms!
m, the basic function and i is the probe number.Itilist Organization
a multilist organization, records which have
valent values for a given secondary index item are
ked together to form a list. Since, a particular item
lly assumes a number of values, say m, then n lists
created, one for each iter value
yack of Multi-list Organiztion
disadvantage is that on order to respond to a query
th @ conjunctive terms all records corresponding to
term having the shortest list must be individually
+ One way of overcoming the major disadvant
the multilit approach is to remove all linkages
from the file are to place the list in the
secondary index, that is to create an inverted list
= An inverted list can appear as a sequential,
indexed or direct file depending on how quickly
we desire a response to a query
«Ifa list is not extremely long, then it may be
possible to retain it in main memory while
processing user requests.
Tree
= Bure is balanced trees that are optimized for
situations when part or the entire tree must be
maintained secondary storage such as a magnetic
dhl. Since, dk acccues am peutve operations
Bue ues to minimize the number of disk
accesses,
Data Structures
have many children
es, fa Btree node has & keys
children
+
q Ger
= For Brees
Lower bound Every node other than root must
keys and at least ¢ ehildren.
Upper bound Every node can contain af most
2 Lkeys
Every internal node has at most 2¢ children,
B Tree
+ The B-tree is the classic disk based dats structure
for indexing records based on an ordered key set.
© The BY tree is a variant of the Buee
which all records are stored in the and all
leaves are linked sequentially.
+ AB" tree is a type of tree which represents sorted
data in a way that allows for efficient
-tree, in contrast o « Biree, all records are
stored at the lowest level of the tree; only keys are
‘stored in interior blocksData Structure
pefinition Syllabus.
e gzytre can be defined as mathematical or logical model of organization» Definition
Des sfned inthe compuer system, We can classy the data structure into simple and Composite
structures
joi
| Simple data structure
9, Compound data structure
‘arrays, Lists, Stacks, Queves
Prionty queues
* Binary trees, B-trees
1. simple Data Structure
structure are those data structure which are built from
ie simple di
‘na data types such as integers, floating point numbers, characters and
sa values. The simple data structures can be classified into arrays and
srt
2, Compound Data Structure
ple data structure are combined to form more complex data
nature, then i said to be compound data structure. Compound data structure
fa be farther classified into two sub categories—
(@) Linear data structure
Sich data structure are one level or single level data structures. Among these
dia ssuctures there are linked lists, queues and stacks.
{)) Non-linear data structure
Beh vps of data suctures are mulilevel data structures tee graphy
pee
Operations Performed in Data Structures
The main operations performed in the data structures are as follows
()) Traversals
refers visiting or accessing the data items stores in a data structure.
Searching
ing refers finding the location of a particular record in the stored dataTe eee eee a core ee en
596
(©) Insertion
Adding a new record in an existing data structure.
(d) Deletion
Deleting or removing a data or a record from a given
data structure
(e) Sorting
Sorting refers to the operation, which involves
arranging the records of a data structure in sorted
structure,
() Merging
Merging refers combining two or more than two data
structure into a one structure in sorted order.
Arrays
‘An array is a subscripted variable, in which we can
store various values of same data types into a single
identifier. It means in an array two different types of
data values cannot be stored. In an array of integers all
the values will be integers. Similarly in an array of float
all values of array will be of float only.
Arrays are probably the most common data structure
used to store collections of elements. In most
languages, arrays are convenient to declare and
provide the handy [] syntax to access any element by
its index number,
‘An array is a series of elements of the same type placed
in contiguous memory locations that can be
individually referenced by adding an index to a unique
identifier. Instead of that, using an array we can store
multiple values of the same type, with a unique
identifier. There are two types of arrays namely one
dimensional array and multi-dimensional array.
General syntax for declaration of an array will be
type name [elements];
where, type is a valid type (like int, float...), name is a
valid identifier and the elements field (which is always
enclosed in square brackets {)), specifies how many of
these elements the array has to contain.
Initializing Arrays
When declaring a regular array of local scope (within a
function, for example), if we do not specify otherwise,
its elements will not be initialized to any value by
default, so their content will be undetermined until we
store some value in them. The elements of global and
static arrays, on the other hand, are automatically
initialized with their default values, which for all
fundamental types this means they are filled with zeros.
UGC NET Zczr Computer Science & Applications
In both cases, local and global, when
ay, we have the possiblity (o ani iggy
each one of its elements by enclosing iho 't,*
braces { }. For example, va
int num [5] = (16, 2,77, 40, 1207};
This declaration would have created an ar
leg
(od ee
iG 2 7 40 | |
[0 Tr}
‘The amount of values between braces ( ) muy
larger than the number of elements that we deci
the array between square brackets []. For exagtlt
the example of aay we have declared th ste
elements and in the list of initial values within be
we have specified five values, one for each elem
When an initialization of values is, provided
array, C++ allows the possibility of leaving the 1 ®
brackets empty []- In this case, the comps’
assume a size for the array that matches the mings’
values included between braces {}. @
int num (] = ( 16, 2, 77, 40, 1207 };
After this declaration. The size of array num
since we have provided fivelvaldes//saaiiaimmnala
um [5] =
Accessing the Values of an Array
In any point of a program in which an array is vis
we can access the value of any of its lem
individually as if it was a normal variable, thus
able to both read and modify its value. The formatiss
simple as
name [index}
Following the previous examples in which aay ht
five elements and each of those elements was of tye
int, the name which we can use to refer to each element
is as following :
name(0] | name[1] | name[2__ | name(3]
16 2 7 40 wo
For example, to store the value 75 in the third elemett
of num, we could write the following statement :
num[2] = 75;
and for example, to pass the value of the third elem
of num to a variable called a, we could wile *=
num(2]. ee
In C++, it is syntactically correct to exceed the
range of indices for an array. This can create OOS
since accessing out of range elements do not
compilation errors but can cause runtime errors.
— =it is important to be able to clearly
mpewween the two uses that brackets [ ] have
ycith Oe They perform two different tasks, one
Hoa ge of arTays when they are declared
roe Oy one is (0 specify indices for concrete
30 so
BP es
Pen example in C++ to find the sum of
gory an array
ays example
jude
Py = {161 2s 77 40, 12071};
um
so, sees
cout << result;
return 0;
)
pe adres of element [x] =
base + size (x - first index)
qulti-dimensional Arrays
\isdimensional arrays can be described as arrays of
ins, For example, @ two-dimensional array can be
ned as a two-dimensional table made of elements,
‘Guithem of a same uniform data type. Suppose, we
fie declared a two-dimensional array as int mydata
[ptt will ook like as follows
0 L 2 3 4
o l
1
a
‘ai for example, the way to reference the second
‘neat vertically and fourth horizontally in an
‘azesion would be; mydata(1} [3]. We know that in
C+ the array indices always begin with zero.
Jc wodimensional array, array [r] [e] means that r
‘hor number of rows ¢ shows the number of columns.
‘ov, we want to access the element in two-dimensional
‘my then there are two methods.
|) Row major ordering
se all together. For example, consider an array
~-Ui)\|, ... Ub,] and BA is Base Address of the
Data Structure
given array, Cis the size of an element then location of
Ali)[j] be found as follows :
Loc (4,,,) =BA + [(i - &,)(UB, - i, +1)+(j - a)
(i) Column major ordering
Columns are all together. For example, consider an
array fl, ... Ub,] [iby ... Ub] and BA is the base
address of this array, C is the size of an element, the
location of array f[j] can be found as follows
Loc (4, ,)= BA + C[(j - By) (Ub, - , +0)
+(- 4)
Lists
A linked list is a data structure consisting of a group of
nodes which together represent a sequence. Under the
simplest form, each node is composed of a datum and
a reference (in other words, a link) to the next node in
the sequence. This structure allows for efficient
insertion or removal of elements from any position in
the sequence.
A linked list is a data structure which can change
during execution-successive elements, which are
connected by pointers and last element points to null.
It can grow or shrink in size during execution of a
program. It can be made just as long as required. It
does not waste memory space
‘A linked list whose nodes contain two fields, an integer
value and a link to the next node. The last node is,
linked to a terminator used to signify the end of the list
They can be used to implement several other common
abstract data types, including stacks, queues,
associative arrays and symbolic expressions, though it
is not uncommon to implement the other data
structures directly without using a list as the basis of
implementation,
The principal benefit of a linked list over a
conventional array is that the list elements can easily
be inserted or removed without reallocation or
reorganization of the entire structure because the data
items need not be stored contiguously in memory or
on disk. Linked lists allow insertion and removal of
nodes at any point in the list and can do so with a
constant number of operations, if the link previous to
the link being added or removed is maintained during
list traversal.
On the other hand, simple linked lists by themselves
do not allow random access to the data or any form of
efficient indexing. Thus, many basic operations such as
obtaining the last node of the list (assuming that the last
node is not maintained as separate node reference in
the list structure) or finding a node that contains aUGC NET Zz, computer Science
iven datum or locating the place where a new node
should be inserted, may require scanning most or all of
the list elements.
Inserting an Element into a
Linked List
For insertion
+ A record is created holding the new item.
+ The nextpointer of the new record is set to link it
to the item which is to follow it in the list.
+ The nextpointer of the item which is to precede it
must be modified to point to the new item.
Pictorially, it can be shown as follows
4T+—{2 T+ +e [4+-4
feotet a
Gi) ett
Lx]
Fig. 1
Deleting an Element froma
Linked List
To delete an element from a linked list the next pointer
of the item immediately preceding the one to be
deleted is altered and made to point to the item
following the deleted item.
We can show the deletion pictorially as follows :
Item to be deleted
(xT}-* 1+ Tt 5
Selchs a}
Fig. 2
Data Types of Linked Lists
Before writing the code to build the above list, we need
two data types.
(a) Node
The type for the nodes which will make up the body of
the list. These are allocated in the heap. Each node
Applications
contains a single client data element ang
the next node inthe list. Type: struct nog* Pi
struct node {
int data;
struct node* next;
hk
(0) Node Pointer
The type for pointers to nodes. This wil
aoe bed poiaer and the next fields tnaae pe
In C and C++, no separate type gei®
we
required, since the pointer type is just
folowed by a" Type : struct node? = Side
Various Implementation of Linkeg
Lists
(a) Dummy Header
Forbid the case, where the head pointer i
Gioore as a representation of the empty By Seek
dummy node whose data field is mane Tt
sdvaniage of this technique is that the pe
inter (reference parameter) case docs nol cms
For operations such as Push). Also, some a
iterations are now a litle simpler since they can
assume the existence of the dummy heador an
Gisvantage & that allocating an emp be ax
fequires allocating (and wasting, memory,
algorithms have an ugliness (3 tesa aaa rer a
realize that the dummy node does not coust Mast
the dummy header is for programmers to avoid
ugly reference parameter issues in functions
Push. Languages which do not allow seagull
jeters, such as Java, may ies fi
Feeder as a workaround. - oil
:
é
(©) Circular
Instead of setting the .next field of the last n0
set it to point back around to the first node.hich contains a head pointer, a tail pointer
pF yn length to make many operations more
ergy of the reference parameter problems go
anyat functions can deal with pointers to the
nether it is heap allocated or not). This is
ch (0 use ina language withont
such as Java. :
se es PP
y asamelers
yy Linked List
uP sf just a single .next field, each node
both next and previous pointers.
and deletion now requir more
but other operations are. simplified
fr to a node, insertion and deletion
ned directly whereas in the singly
i eve, the iteration typically needs to locate
hed just before the point of change in the list
feinext pointers can be followed downstream,
nk List
Gnifd of storing a single client element in each
batt ore a litle constant size array of client
Menents in each node. Tuning the number of
ae
Memance characteristics, many elements per
Mmenis per node has performance more like a
ied lst The chunk list is a good way to build a
Inked list with good performance.
5 pynamic Array
Insead of using a linked list, elements may be
sed in an array block allocated in the heap. Itis
gossble to grow and shrink the size of the block as
feeded with calls to the system function realloc( ).
Managing a heap block in this way is a fairly
omplex, but can have excellent efficiency for
sore and iteration, especially because modern
memory systems are tuned for the access of
‘contiguous areas of memory. In contrast, linked list
‘an actually be a litte inefficient, since they tend to
‘rate through memory areas that are not adjacent.
eue
ston in which the entities in the collection are kept.
‘ic and the principal (or only) operations on the
ion are the addition of entities to the rear
Portion and removal of entities from the front
‘This makes the queue a First In First
data structure. In a FIFO data structure,
lement added to the queue will be the first one
‘moved. This is equivalent to the requirement
damenh dn sdadlsall glmmais ant were
Data Structure 9
added before have to be removed before the new
element can be invoked. A queue is an example of &
linear data structure
Queues provide services in computer science, transport
and operations research where various entities such as
data, objects, persons or event are stored and held t0
be processed: later. In. these the queue
Performs the function of a buffer.
Queue Implementation
Theoretically, one characteristic of a queue is that it
does not have a specific capacity. Regardless of how
many elements are already contained, a new element
can always be added. It can also be empty, at which
point removing an element will be impossible until a
new element has been added again.
Fixed length arrays are limited in capacity and
inefficient because items need to be copied aia
head of the queue. However conceptually they are
simple and work with early languages such as
FORTRAN and BASIC which did not have pointers
or objects. Most modern languages with objects or
pointers can implement or come with libraries for
dynamic lists. Such data structures may have not
specified fixed capacity limit’ besides memory
constraints. Queue overflow results from trying to add
an element onto a full queue and queue underflow
happens when trying to remove an element from an
empty queue.
A bounded queue is a queue limited to a fixed number
of items.
There are several efficient implementations of FIFO
queues. An efficient implementation is one that can
perform the operations, enqueuing and dequeuing in
O{)) time.
(2) Linked List Implementation
+A doubly linked list has O(l) insertion and
deletion at both ends, so it is a natural choice for
queues.
+A regular singly linked list only has efficient
insertion and deletion at one end. However, a
small modification keeping « pointer to the last
node in addition to the first one will enable it to
implement an efficient queue.
(b) A deque implemented using a modified dynamic
array.
Double Ended Queue
‘A double ended queue (dequeue, often abbreviated to
deque, pronounced deck) is an-abstract data type that
implements a queue for which elements ean only be7 cance & Applications
” 600 UGCNETZ.zr Computer Science Application:
.st_priority_elem
atlded to or removed. from the front(head).or back 9 Pull. bighee™ PE ome stn
(al). It is also often called a head-tail linked list. sony and. return it alg! Se WAly
Deque is sometimes writen dequeue but this use i pop element), geL.maximum elt Wy
generally deprecated in technical literature or technical get front(most)_element; some my NY
writing because dequeue is also a verb meaning (© Consider lower priorities to be higher, or \!
remove from a queue. Nevertheless, several libraries cone known as get_minimum ea? ty
nd some writers, such as Aho, Hopcroft and Ullman Often referred to as get-min in the tee a
in their textbook Data Structures and Algorithms, spell cee ee also. sometimes ple, a
it dequeue. John Mitchell, author of Concepts in peek at_highest_priority_element I
Programming Languages, also uses this terminology. Belt slement functions, which ean beg af
DEQ and DQ are also used. to produce pull highest priority element Nl
Distinctions and Sub Types Stack
This differs from the queue abstract data type or First ei
In First Out (FIFO) list, where elements can only be A stack is a Last In First Out (LIFO) abstr
added to one end and removed from the other. This
general data class has some possible sub types data type as an element, but is characterized
three fundamental operations; push, pop and yey
An Input Restricted Deque The push operation adds a new item tothe yg
It is one where deletion can be made from both ends, stack, or initializes the stack if it is empty. Ifthe,
but insertion can only be made at one end. full and does not contain enough space to
given item, the stack is then considered to het
ap Cuttent Restricted Ogaue overflow state. The pop operation removes a
It is a type of deque where insertion can be made at from the top of the stack. A pop eth
both ends, but deletion can be made from one end — previously concealed items or results in an empy,
only. but if the stack is empty then it goes into
Both the basic and most common list types in state. It means no items are present in stack
computing, queues and stacks can be considered removed. The stack top operation gets the data
specializations of deques and can be implemented the top-most position and returns it to the user
deleting it. The same underflow state can also ocars
stack top operation if stack is empty.
Priority Queue A stack is a restricted data structure, because ea
A pony is an abstract data type which is like a ‘all number of operations are performed oni
regular queue or stack data structure but additionally, "ature of the pop and push operations also mea
stack elements have a natural order, Elements
act di
‘and linear data structure. A stack can have any 4. |
ey
using deques.
each element is associated with a priority. semoved fom ihe Gitta the
= Stack Elements are pulled in las in frst out order der of their addition therefore, the lower
(eg., a stack of papers) are those that have been on the stack the longest
= Queue Elements are pulled in first in first out
order (eg, a line in a cafeteria). Basic Architecture of a Stack
«Priority queue Elements are pulled highest typical stack, storing local data and call
priority first (¢., cutting in line, or VIP service). for nested procedure calls (not necessarily
tis a common misconception that a priority queue isa procedures!). ‘This stack grows downward
heap. A priority queue is an abstract concept like a list origin, The stack pointer points to the current
or @ map, just as a list can be implemented with a datum om the stack. A push operation decrem!
linked list or an array, a priority queue can be pointer and copies the data to the stack and 2
implemented with a heap or a variety of other Operation copies data from the stack and |
methods. increments the pointer. Each procedure called
A priority queue must at least support the following program stores procedure retum information 34
operations— data by pushing them onto the stack. This type o®
= Insert_with priority Add an element to the implementation is extremely common,
queue with an associated priority. vulnerable to buffer overflow attacks. A typi!memory with a fixed origin and a
lly the size of the stack is zero. A
in the form of a hardware register,
inter Tost recently referenced location on the
0 ck has a size of zero, the stack pointer
pet agin ofthe stack
oe erations applicable to all stacks are—
gn operation
‘on, in which a data item is placed at the
id to by the stack pointer and the address
Jjusted by the size of the data
Init
push OP
ine
Pinter i ad
te
* gp or Pull operation
all operation, a data item at the current
p oF P stack pointer is removed and
Foe is adjusted by the sizeof the data item,
ee pany variations on the basic principle of
ge ons, Every stack has a fixed location in
opr ch it begins. As data items are added to
mine sack pointer is displaced to indicate the
5 ff the stack, which expands away from
tent of
ot ex
rg.
{printers may point to the origin of a stack or to a
a range of addresses either above or below the
depending on the direction in which the stack
®) however, the stack pointer cannot cross the
ofthe stack. In other words, if the origin of the
jis at address 1000 and. the stack grows
rivards (towards addresses 999, 998, and so on},
‘ack pointer must never be incremented beyond
io 1001, 1002, etc,). If a pop operation on the
causes the stack pointer to move past the origin of
Sack, a stack underflow occurs. If a push operation
es the stack pointer to increment or decrement
yond the maximum extent of the stack, a stack
most two child nodes, usually distinguished as left
‘ight. Nodes with children are parent nodes and
odes may contain references to their parents.
the tree, there is often a reference to the root
the ancestor of all nodes), if it exists. Any node in
structure can be reached by starting at root
‘nd repeatedly following references to either the
Data Structure €
= A directed edge refers to the link from the parent
to the child.
+ The root node of a tree is the node with no
parents. There is at most one root node in a
rooted tree.
+ A leaf node has no children.
= The depth of a node n is the length of the path
from the root to the node. The set of all nodes at a
given depth is sometimes called a level of the tree.
The root node is at depth zero.
+ The height of a tree is the length of the path
from the root to the deepest node in the tree. A
(rooted) tree with only one node (the root) has @
height of zero,
Siblings are nodes that share the same parent
node.
= A node p is an ancestor of a node q, if it exists
on the path from q to the root. The node q is then
termed a descendant of p.
= The size of a node is the
descendants, it has including itself.
«In degree of a node is the number of edges
arriving at that node.
= Out degree of a node is the number of edges
leaving that node.
+ The root is the only node in the tree with
Incdegree =0.
number of
Types of Binary Trees
(a) Rooted Binary Tree
A rooted binary tree is a tree with a root node in which
every node has at most two children.
(b) Full Binary Tree
A full binary tree (sometimes proper binary tree or
2tee or strictly binary tree) is a tree in which every
node other than the leaves has two children.
Sometimes a full tree is ambiguously defined as a
perfect tree.
(c) Perfect Binary Tree
A perfect binary tree is a full binary tree in which all
leaves are at the same depth or same level and in
which every parent has two children. This is
ambiguously also called a complete binary tree.
(d) Complete Binary Tree
A complete binary tree is a binary tree in which every
level, except possibly the last, is completely filled and
all nodes are as far left as possible.\ UGC NET 7-4,
(e) Infinite Complete Binary Tree
An infinite complete binary tree is a tree with a
countably infinite number of levels, in which every
node has two children, so that there are 24 nodes at
level d. The set of all nodes is countably infinite, but
the set of all infinite paths from the root is uncountable;
it has the cardinality of the continuum. These paths
corresponding by an order preserving bijection to the
points of the Cantor set or (through the example of the
Stern-Brocot tree) to the set of positive irrational
numbers.
(f) Balanced Binary Tree
A balanced binary tree is commonly defined as 4
binary tree in which the height of the two subtrees of
every node never differ by more than one although in
general, itis a binary tree where no leaf is much farther
Seay from the root than any other leaf. Binary trees
that are balanced according to this definition have a
predictable depth (how many nodes are traversed from
Tre root to 2 leaf, root counting as node 0 and
subsequent as 1, 2, depth). This depth is equal to the
integel part of log) (n), where ms the number of nodes
on the balanced tree.
eg. (i) Balanced tree with 1 node, log (1) = 0 (depth
= 0)
eg. (ii) Balanced tree with 3 nodes, log, (3) =1.59
(depth = I.
eg. (iii) Balanced tree with 5 nodes, log ,(5) =2.32
(depth of tree is 2 nodes).
(g) Degenerate Tree
A degenerate tree is a tree, where for each parent node,
there is only one associated child node. This means that
in a performance measurement, the tree will behave
like a linked list data structure.
Properties of Binary Trees
= The number of nodes n in a perfect binary tree
can be found using this formula n = 2h+1-1,
where his the height of the tree
= The number of nodes n in a complete binary
tree is at least n = 2h and at most n= 2h+1~1,
where his the height of the tree.
= The number of leaf nodes L in a perfect binary
tree can be found using this formula L = 2h, where
‘his the height of the tree.
= The number of nodes n in a perfect binary tree
can also be found using this formula n= 22 +1,
where L is the number of leaf nodes in the tree.
= The number of null links (absent children of
nodes) in « complete binary ice, of » nodes i
Computer Science *
Applications
= The number of internal nodes
binary tree of n nodes is ma
= For any non-empty binary tree
odes and a, nodes of degree 2,q, "t
Proof net
Let 1 =the total number of nodes
B = number of branches
ng, n, and n, represent the number of
children, a single child and (0 children reece
B=n-1 (since all nodes except the root
from a single branch.) ode
Ban, +2*n,
n=n,+2*n, +1
nm, +2*n, +1=M +n, +m
Tree Traversal
Tree traversal refers 10 the process of visinge
inva tee, exactly once. Such traversal are casa
the order in which the nodes are visited. yl
1. Preorder Traversal
To traversal a non-empty binary tree in
perform the following operations recursh
er aartng with the TooPnods a
(a) Visit the root
(b) Traverse the left subtree .
(c) Traverse the right subtree
2. Inorder Traveral
To traverse a non-empty binary tree is inorder,
the following operations recursively at each
(a) Traverse the left subtree
(b) Visit the root
(c) Traverse the right subtree
3. Postorder Traversal
To traverse a non-empty tree in depthfint
perform the following operations recursively
hoe :
(a) Traverse the left subtree
(b) Traverse the right subtree
(c) Visit the root
B-Tree ‘
to Knuth's:definition, « Baree of
(the maximum number of children for each
tree which satisfies the following properties
(i) Every node has at most m
anode (except root) has at least m/2
paren
pl" pas atleast two children, if tis not a
ia Ts node
i es appear he sme Teel and carry
oY penation
Si teat node with k children contains (k—1)
Ho ode's elements act as separation values
pet fab tees, For example, if an internal
die es (or sub-tees) then it mus have
Pa aang rer elements a, and a. All values in
be less than @,, all values in
be between a, and a, and all
b-aree will be greater than a,
pepe sab-tree will
ae subaree wil
isp rightmost su
al Nodes
re all nodes except for leaf nodes and
They are usually represented as an
if elements and child pointers, Every
saontains a maximum of U children and a
[¢hildren. Thus, the number of elements
fan the number of child pointers (the
between L - Land U—1). Umust
‘efore each internal node is at
The relationship between Uand L implies
| el ‘nodes can be joined to make a legal
a 0 ll node can be split nto two legal nodes
es foo to push one element up nto the parent)
ier 00T T mnake it possible to delete and insert
tern
In 1} nodes #1
1
aot node:
ays |
a ‘ofelements is
eiher 2L oF 2L-ly ther
Pe
The Root Node
He root node's number of children has the same upper
lar x internal nodes, but has no lower limit. For
femple, when there are fewer than Ze elements in
te enire tee, the root will be the only node in the
ee, with no children at all.
) ems as a Betree of depth n but the cost of search,
and delete operations grows with the depth of the
As with any balanced tree, the cost grows much
slowly than the number of elements.
balanced trees store values only at leaf nodes and
kinds of nodes for leaf nodes and internal
603
Data Structure C
nodes, Brees keep values in every node in, the ee
and may use the same structure for all nodes.
Howeve? since leaf nodes never have children, the
Burees benefit from improved performance, if they use
a specialized structure.
In computer science, a B-tree is a tree data struct
that keeps data sorted and allows searches, sequential
access, insertions and deletions in logarithmic, WE
The Bitree is a generalization of a binary search fer
that anode can have more than two children. Unlike
selfbalancing binary search trees, the Butree | '®
Optimized for systems that read and write large blocks
of data. It is commonly used in databases and file
systems.
Overview
In Btrees, internal (non-leaf) nodes ean have @
number of child nodes within some predefined range,
When data is inserted or removed from a node, its
number of child nodes changes. In order to maintain
the predefined range, internal nodes may be joined or
split. Because a range of child nodes is permitted,
Betrees do not need rebalancing as frequently as other
self balancing search trees, but may waste some space,
since nodes are not entirely full. The lower and upper
bounds on the number of child nodes are typically
fixed for a particular implementation. For example, in
a 2-3 Bttree (often simply referred to as a 2-3 tree),
each internal node may have only 2 or 3 child nodes.
Each internal node of a B-tree will contain a number of
keys. Usually, the number of keys is chosen to vary
between d and 2d. In practice, the keys take up the
most space in a node. The factor of 2 will guarantee
that nodes can be split or combined. If an internal
node has 2d keys, then adding a key to that node can
be accomplished by splitting the 2d key node into 2d
key nodes and adding the key to the parent node. Each
split node has the required minimum number of keys.
Similarly, if an internal node and its neighbour each
have d keys, then a key may be deleted from the
internal node by combining with its neighbour.
Deleting the key would make the internal node have
d- | keys, joining the neighbour would add dkeys plus
one more key brought down from the neighbour's
parent. The result is an entirely full node of 2d keys.
‘A Bure is kept balanced by requiring that all leaf
nodes are at the same depth. This depth will increase
slowly as elements are added to the tree but an
increase in the overall depth is infrequent and results in
all leaf nodes being one more node further away from
the root.
variable604 UGC NET Zezer Computer Science & Applications
Burees have substantial advantages over alternative
implementations when node access times far exceed
access times within nodes because then the cost of
accessing the node may be amortized over multiple
operations within the node. This usually occurs when
the nodes are in secondary storage such as disk drives.
By maximizing the number of child nodes within each
internal node, the height of the tree decreases and the
number of expensive node accesses is reduced. In
addition, rebalancing the tree occurs less often. The
maximum number of child nodes depends on the
information that must be stored for each child node
and the size of a full disk block or an analogous size in,
secondary storage. While 2-3 B-trees are easier to
explain, practical B-trees using secondary storage want
a large number of child nodes to improve performance.
B* and B* Trees
The term B-tree may refer to a specific design or it may
refer to a general class of designs. In the narrow sense,
a B-tree stores keys in its internal nodes but need not
store those keys in the records at the leaves. The
general class includes variations such as the B*-tree
and the B*-tree.
= In the 8*-tree, copies of the keys are stored in the
internal nodes, the keys and records are stored in
leaves; in addition, a leaf node may include a
pointer to the next leaf node to speed sequential
access
« The B*tree balances more neighbouring internal
nodes to keep the internal nodes more densely
packed. This variant requires non-root nodes to be
at least 2/3 full instead of 1/2. To maintain this,
instead of immediately splitting up a node when it
gets full, its keys are shared with a node next to it.
When both nodes are full, then the two nodes are
split into three.
Counted B-trees store, with each pointer within the
tree, the number of nodes in the sub tree below that
pointer. This allows rapid searches for the nth record in
key order or counting the number of records between
any two records and various other related operations.
Graph
A tree is a special case of the more general graph (or
net). Graphs have vertices (or nodes) and edges (which
‘can be one way or undirected). Edges can have a cost
associated with them and the cost may depend on the
direction taken along the edge. Popular graph related
puzzles include:
a
« Finding whether one can traverse al
traversing any twice (an Euler pathy he
« The travelling salesman problem - "
Al the nodes at a minimal cost” POW
«= Finding the shortest (least cost) pay, =a
vertices.
« Finding the minimal spanning tree a
(with the least cost edges) that inca,
More formally, a graph is a pair (7, £) yy 2%
finite set and E is a binary relation on ppt?
vertex set whose elements are called verte Ue
collection of edges, where an edge is a pais
u,0 in V. In a directed graph, edges are orgyt!)
connecting @ source vertex to a target yon!
undirected graph, edges are unordered «* by
connect the two vertices in both directions, pra™ &
undirected graph (u,v) and (0,1) are tye
writing the same edge. wang
If some edge (u, ») is in graph, then vertex
to vertex u. In a directed graph, edge (y <2
out-edge of vertex u and an in-edge of verter, © ©
undirected graph edge (1, ) is incident on ce
and v, In a directed graph, the number of outejut*
vertex is its out-degree and the number of ined!”
in-degree. For an undirected graph, the muthe'®
edges incident to a vertex is its degree. a
A path is a sequence of edges in a graph s
target vertex of each edge isthe seures vegas ail
next edge in the sequence. If there is a path sag
vertex wand ending at vertex x we say th $y
reachable from u. A path is simple if none of
vertices in the sequence are repeated, A path isa
if the first and last vertex in the path are the same §
graph with no cycles is acyclic.
‘
“
Graph Traversal
Graphs can be traversed much as trees can (depth fire
breadth first, etc.), but care must be taken not to gi
stuck in a loop-trees by definition don't have cycles iat
in a tree there’s always only one path from the reoios
node whereas in a graph there may be many pals
between any pair of nodes.
The following directed graph has 6 nodes The
Dijkstra-Moore algorithm can be used to find
shortest path from a given node to each of the ote
long as costs are not negative. A breadth fist se
from the first node radiates out across the grap
reaches an unvisited node, it sets its distance fom
first node to be the distance of the previous node fas
Gepieanadeiplubs l of the edge from
previous node. If a node’s already been visited,
=na distance from the first node
the shortest distance, the current
So, a comparison is done and the
vet to the minimum.
as
“: graph below, suppose that A was
20 Fave shorter
Mode. In the first stage, node B’s distance
feo 7 and node C’s distance to 1. During
2 routes from nodes B and © would be
400% hen node Be reached from node C, the
we whee go node B’s distance from Ais reset to
eis 1* 5.077 As the algorithm progresses, node
a ori iter be reset t© 5, which is the final
3 ‘The algorithm terminates
ples
ample
Data Structure
have
1n all of the routes will
than
after 5 stages because by thei
have more
been traversed routes can't
number_of_vertices 1 edges.
Implementing Graph Algorithms
Graphs provide useful source material [01 practising
Cok, inheritance provides a useful mechanioy for
integrating node, edge and graph structnt™® and the
sacdiard Rbrary takes away much of the chore Some
issues to bear in mind are~
= You could start by writing 50
code to deal with the earlier examp!
forget that trees are a subset of BraP!
your code should be generalisable.
sh information will
+ Think about how the tree/grap!
be setup. If files are to be read in, what format
should the files have?
= As with the standard containers and algorithms,
it might be worth trying to separate the graph
information from the operations 10 be performed
on the graph.
me tree traversal
jles but don’t
yhs and that