DS Unit 1
DS Unit 1
Basics: Abstract Data Type (ADT) – introduction to data structures – representation – implementation.
Stack and list: representing stack – implementation – application – balancing symbols – conversion of
infix to postfix expression – evaluating a postfix expression – recursive function call – Linked list ADT
– implementation using arrays – limitations - linked list using dynamic variables- linked implementation
of stacks – circular list – doubly linked lists.
INTRODUCTION
Data structure is the structural representation of logical relationships between elements of data.
In other words a data structure is a way of organizing data items by considering its relationship
to each other.
Data structure mainly specifies the structured organization of data, by providing accessing
methods with correct degree of associativity. Data structure affects the design of both the
structural and functional aspects of a program.
Data Structures
Definition 1: A way of organizing all data items that considers not only the data stored but also
stores relationship between elements.
Definition 2: A data structure is defined as collection of elements and all possible operations
which are required for those set of elements.
Data Type
A data type is a term which refers to the kind of data that variables may hold in the programming
language. Set of elements that share common set of properties. Example: int, float, char, etc.
Types of data types
1. Built-in data types
2. User defined data types
1
IT T33 / DATA STRUCTURES UNIT 1
Example:
1. Array – int regno[10]; 2. Structure-
float marks[10]; typedef struct Stud
char result[20]; {
int a;
;
;
}s;
Stud s;
Data
A collection of facts, concepts, figures, observations, occurrences in a formalized manner.
Example: blue, 1098, 11.
Information
The meaning that is currently assigned to the data by means of the conversions applied to those
data (processed data).
Example: Uniform color = blue
Regno = 11078
Age = 18
2
IT T33 / DATA STRUCTURES UNIT 1
3
IT T33 / DATA STRUCTURES UNIT 1
Tree
A tree is a non-linear dynamic data structure that may point to one or more nodes at a time.
Graph
A graph is similar to the tree except that it has no hierarchical relationship between its adjacent
elements.
An abstract data type is a data type that is organized in such a way that the specification of object
ADT = Type + Function + Behaviour of function
and the operations on the object is separated from the representation of object and the
implementation of operation. In Abstract Data Type all the implementation details are hidden.
The abstract data type is a triple of
„D‟ - Set of domains
„F‟ - Set of functions
„A‟ – Set of axioms in which only what to be done is mentioned but how it has to be
done is not mentioned
4
IT T33 / DATA STRUCTURES UNIT 1
An abstract data type can be written with the help of instances and operations.
Instances
Instances represent the element on which various operations can be performed.
Operations
Operations represent basic operation that can be performed on those elements
5
IT T33 / DATA STRUCTURES UNIT 1
1.2 STACK
A stack is a collection of elements in an ordered manner in which the insertion and deletion are
restricted at only one end.
The end from which the elements are added and deleted is referred as the “top‟ of the stack.
Stack is also referred as PILES or PUSH down list.
The last element placed in the stack will be at the top of the stack.
The last element added to the stack is the first element to be removed. Hence stack are referred
as LAST IN FIRST OUT.
The primary operations that can be carried out on the stack are:
PUSH (Insertion)
POP (Deletion)
PEEK (Display the top of the stack)
DISPLAY
Representation of Stack:
Stack can be represented using 2 ways:
1. Static implementation of stack using array
2. Dynamic implementation of stack using linked list
6
IT T33 / DATA STRUCTURES UNIT 1
The simplest way of representing a stack is single dimensional array. Both stack and array are
ordered collection of elements but array and stack are 2 different things:
In array, the number of elements in the array is fixed and it is not possible to change the number
of elements through the operation insertion and deletion.
In stack, the size of the stack varies continuously when elements are pushed and popped. Since
the stack is stored in part of array, an array can be declared large enough to hold maximum
number of elements of the stack.
During the execution of the program, the stack size can be varied within the space reserved for it.
One end of the array (bottom of the sack) is fixed and the other end of the array (top of the stack)
is continuously changed depending upon the elements pushed or popped in the stack.
First, we have to allocate a memory block of sufficient size to accommodate the full
capacity.
Start from the first location of the memory block.
For array representation the following two ways, stated
Empty: Top < 1
Full: Top >= u, s = u + 1-1
Where S is the size of the stack.
l and u are the lower and upper value respectively.
Basic Operations:
PUSH
Push is an operation used to add new element into the stack. While implementing the push
operation, overflow condition of the stack is to be checked, to avoid to push an element in the
fully occupied stack.
Push operation involves series of steps –
Step 1 − Check if stack is full.
Step 2 − If stack is full, produce error and exit.
Step 3 − If stack is not full, increment top to point next empty space.
Step 4 − Add data element to the stack location, where top is pointing.
Step 5 − return success.
7
IT T33 / DATA STRUCTURES UNIT 1
Algorithm
1. If Top >= Size, then
2. Print “Stack is full”
3. Else
4. Top = Top + 1
5. A[Top] = Item
6. Endif
7. Stop
POP
Pop is an operation used to remove n element from the top of the stack. While implementing the
pop operation, underflow condition of stack is to be checked to avoid top of an element from the
empty stack.
POP operation may involve the following steps −
Step 1 − Check if stack is empty.
Step 2 − If stack is empty, produce error and exit.
Step 3 − If stack is not empty, access the data element at which top is pointing.
Step 4 − Decrease the value of top by 1.
Step 5 − return success.
8
IT T33 / DATA STRUCTURES UNIT 1
Algorithm POP:
1. If Top<1, then
2. Print “Stack is empty”
3. Else
4. Item = A[Top]
5. Top = Top – 1
6. Endif
7. Stop
When implementing a stack using linked list, it is not necessary to specify the number of
elements to be stored in a stack during its declaration (compile time) because the memory is
allocated dynamically at run-time when an element is to be added to the stack.
The linked list representation of stack can grow and shrink in size without wasting the memory
space depending upon the push and pop operation occurred in the stack. Multiple stack can be
represented efficiently using a chain for each stack.
Even though array representation of stack is very easy but it allows only pre-defined stack size.
Some application size of the stack may vary during execution.
To overcome this problem we go for stack using link list.
In line the DATA field and LINK field as usual to point the next item.
In link list, first nose is the current item (top) of the stack.
Last node containing the bottom most element.
PUSH and POP operations perform the normal function. (Insertion, Deletion)
The size of the stack is not important here because, it follows dynamic stack instead of stack.
In link list, the test for overflow is not applicable in this case.
9
IT T33 / DATA STRUCTURES UNIT 1
Basic operation
1
IT T33 / DATA STRUCTURES UNIT 1
The balanced parenthesis problem is to determine the set of parenthesis in the expression or not.
Steps:
Consider the expression with character and parenthesis (), {}, [] to determine if an expression is
balanced or not.
For each character that is used to read from the expression, follow the next steps.
If the character is an opening parenthesis (, {,[ then push it into the stack.
If the character is closing symbol or parenthesis ),},] then check the last entry in the stack using
pop operation and compare the symbol in opening brace list in corresponding position.
If it is matched, repeat the step from 2 until you reach the end of statement.
If it is not matched, or if the stack is empty which not reaching the end of expression, the given
expression is unbalanced.
If the stack is empty while the end of the expression the given expression is balanced.
Infix Expressions:
1
IT T33 / DATA STRUCTURES UNIT 1
Prefix Expressions:
Postfix Expressions:
In this type of expressions the arrangement of operands and operator is as follow
Postfix expression = Operand1 operand2 Operator
Sample Conversion:
1
IT T33 / DATA STRUCTURES UNIT 1
= ab +e / df +*
The postfix expression is scanned from left to right variable (operand) or constant are pushed on
the stack. When an operator is encountered in postfix expression, the indicated operator is
performed using the top two elements in the stack and the result is push onto the stack.
An expression consists of operators and operands. In an expression if the operator, which
performs an operation, is written in between the operands it is called an infix expression. If the
operator is written before the operands, it is called prefix expression. If the operator is written
after the operands, it is called postfix expression.
1
IT T33 / DATA STRUCTURES UNIT 1
In the above case the bracket (4-1) is evaluated first. Then (2-10/2) will be evaluated in which /,
being higher priority, 10/2 will be evaluated first.
The above sentence is questionable, because as we move from left to right, the first bracket,
which will be evaluated, will be (4-6/2+7).
1
IT T33 / DATA STRUCTURES UNIT 1
When there is a function call, all the important information that needs to be saved, such as
register values (corresponding to variable names) and the return address (which can be obtained
from the program counter, which is typically in a register), is saved "on a piece of paper" in an
abstract way and put at the top of a pile. Then the control is transferred to the new function,
which is free to replace the registers with its values.
If it makes other function calls, it follows the same procedure. When the function wants to
return, it looks at the "paper" at the top of the pile and restores all the registers. It then makes the
return jump.
Clearly, all of this work can be done using a stack, and that is exactly what happens in virtually
every programming language that implements recursion. The information saved is called either
an activation record or stack frame. The stack in a real computer frequently grows from the high
end of your memory partition downwards, and on many systems there is no checking for
overflow. There is always the possibility that you will run out of stack space by having too many
simultaneously active functions. Needless to say, running out of stack space is always a fatal
error.
In languages and systems that do not check for stack overflow, the program will crash without an
explicit explanation. On these systems, strange things may happen when the stack gets too big,
because the stack will run into part of the program. It could be the main program, or it could be
part of your data, especially if you have a big array.
If it runs into your program, the program will be corrupted; If the stack runs into the data, what is
likely to happen is that when something is written into the data, it will destroy stack information
-- probably the return address -- and the program will attempt to return to some weird address
and crash.
In normal events, stack space should not be lost; doing so is usually an indication of runaway
recursion (forgetting a base case). The following routine
void print_list( LIST L )
{
/*1*/ if( L != NULL )
{
/*2*/ print_element( L->element );
/*3*/ print_list( L->next );
}
1
IT T33 / DATA STRUCTURES UNIT 1
}
A bad use of recursion: printing a linked list
void print_list( LIST L )
{
top:
if( L != NULL )
{
print_element( L->element );
L = L->next;
goto top;
}
}
Printing a list without recursion; a compiler might do this (you should not)
Which prints out a linked list, is perfectly legal and actually correct. It properly handles the base
case of an empty list, and the recursion is fine. This program can be proven correct.
Unfortunately, if the list contains 20,000 elements, there will be a stack of 20,000 activation
records representing the nested calls of line 3.
Activation records are typically large because of all the information they contain, so this
program is likely to run out of stack space. (If 20,000 elements are not enough to make the
program crash, replace the number with a larger one.)
This program is an example of an extremely bad use of recursion known as tail recursion. Tail
recursion refers to a recursive call at the last line. Tail recursion can be mechanically eliminated
by changing the recursive call to a goto preceded by one assignment per function argument.
This simulates the recursive call because nothing needs to be saved -- after the recursive call
finishes, there is really no need to know the saved values. Because of this, we can just go to the
top of the function with the values that would have been used in a recursive call. The second
program in shows the improved version.
The goto is used here to show how a compiler might automatically remove the recursion.
Removal of tail recursion is so simple that some compilers do it automatically. Even so, it is best
not to find out that yours does not.
1
IT T33 / DATA STRUCTURES UNIT 1
Linked list is a linear collection of specially designed data elements, called nodes, linked to one
another by means of pointers.
Each node is divided into two parts: the first part contains the information of the element, and the
second part contains the address of the next node in the linked list. Address part of the node is
also called linked or next field.
The example linked list diagram is shown below.
Data Link
Structure of node
In single linked list each node contains only one link which points to the next node in the list
Header is an empty node which is used to store the pointer to the first node.
Starting from the first node we can reach to its last element.
The node contains the link field as null, indicates the end of the list.
Single linked list can move from left to right only.
Single list is also called as one way list
1
IT T33 / DATA STRUCTURES UNIT 1
Two parallel array of equal size are allocated and it would be sufficient to store entire linked list.
In this method there are three important things they are memory bank, Memory, Manager,
Garbage collector.
Memory bank- collection of free memory spaces.
During the node creations, when a node is required, then request is placed to MM (Memory
Manager).
MM searches the Memory Bank (MB) if it’s found than grants the block to the caller.
Memory Manager – Generally a program which manages the memory efficiently.
Garbage collector – Whenever the allocated note no more in use, then it has to be re- Allocated
to the memory bank
The process of Memory Management includes (MB, MM, GB) are said to be dynamic memory
Management
1
IT T33 / DATA STRUCTURES UNIT 1
From the above figure, the list of available memory spaces and its pointer are stored in AVAIL.
For a node request, the list AVAIL is searched for the block of right size.
In case the requested block sizes not available, then MM will return a message accordingly.
Suppose the block is found and let it be xy, the MM will return the points of xy to caller in a
Buffer
The newly availed node xy can be inserted at any position in the linked list by changing the
pointer of the concerned nodes
1
IT T33 / DATA STRUCTURES UNIT 1
Traversing:
Traversing is the process of visiting each and every node in the list, its starts from the first node
and ends with last node.
Insertion:
There are various positions where a node can be inserted.
1) Inserting at the front (first element)
2) Inserting at the end (last element)
3) Inserting at any position.
Before the discussion of above mentioned insertions, Let us assume a procedure get node (NODE).
Get node (NODE) – To get a pointer of memory block which suits the type NODE.
1. If (AVAIL = NULL) // Avail is the pointer, that contains the collection of free storage.
2. Return (NULL)
3. Print “insufficient memory // unable to allocate memory.
4. Else //sufficient memory is available.
5. Ptr = AVAIL // starts from the locations where Avail points.
6. While ( size of (ptr) # size of (NODE) and ptr – LINK # NULL) do
// Till the desired block is found or the search reaches the end of the pool.
7. Ptrl = ptr // To keep track of previous block.
8. Ptr = ptr – LINK // MOVE TO THE NEXT BLOCK.
9. End while.
10.If ( Size of (ptr) = size of (NODE) // memory block of right size is found.
11. Ptrl - LINK == ptr – LINK // update the avail first.
12. Return (ptr)
13. Else.
14. Print “the memory block is too large to fit”
15. Return (NULL)
2
IT T33 / DATA STRUCTURES UNIT 1
16. End if
17. End if
18. Stop
1. New = Get node (NODE). //Get a memory block of type node and store its ptr is new.
2. If (new = NULL) then // MM return NULL on searching MB.
3. Print “Memory underflow”
4. Exit
5. Else // Memory available.
6. new – LINK = Header – LINK // change to ptr 1
7. new – DATA = x // copy the data x the newly availed node.
8. HEADER – LINK = new // change of ptr 2.
9. Endif
10. Stop.
Operations
1. Set the space for new node
2. Assign value to the item field
3. Set the next field of NEWNODE to NULL
4. Set the next field of previous node to NEWNODE
2
IT T33 / DATA STRUCTURES UNIT 1
4. Exit
5. Else
6. Ptr = HEADER // start from header node
7. While (ptr – LINK # NULL ) do
8. Ptr = ptr – LINK // change the pointer to next node.
9. End while
10. Ptr – LINK = New // change the link field of last node.
11. New – DATA = x //copy the cont x in new node.
Deletion:
Operation:
1. If the node X, to be deleted is at first, store next field of x in some other variable y
2. Free the space occupied by X
3. Change the head pointer to point to the address
2
IT T33 / DATA STRUCTURES UNIT 1
1. Ptr = HEADER
2. If *( ptr – LINK = NULL) then
3. print “The list is empty”
4. Exit
5. Else
6. While (ptr – LINK # NULL) do
7. Ptrl = ptr
8. Ptr = ptr – LINK
9. End while
10. Ptrl = LINK = NULL
11. Return Node (ptr)
12. Endif
13. Stop
2
IT T33 / DATA STRUCTURES UNIT 1
Algorithm copy – SL
2
IT T33 / DATA STRUCTURES UNIT 1
Merging of nodes
Algorithm merge- SL
1. Ptr = HEADER 1
2. While (ptr – LINK # NULL) do //move to the lalst node in the list
3. ptr = ptr – LINK
4. End while
5. Ptr – LINK = HEADER 2 – LINK //last node in L1 points to the first node L2.
6. Return Node (HEADER 2)
7. HEADER = HEADER 1
8. Stop
Algorithm search.
2
IT T33 / DATA STRUCTURES UNIT 1
A doubly linked list is one in which all nodes are linked together by multiple links which help in
accessing both the successor (next) and predecessor (previous) node for any arbitrary node
within the list. Every nodes in the doubly linked list has three fields:
Previous Address Field
Data Field
Next Address Field
The previous link field contains the address of its previous node. This field is also referred as
backward link field. The data field stores the information part of the node. The next field
contains address of its next node in the list. This field is also referred to as forward link field.
In most of the real world applications it is necessary to reverse the list both in forward direction
and backward direction. The most appropriate data structure for such an application is a doubly
linked list. The three fields and structure of double linked list is shown below.
Basic operations
1. Insert
2. Delete
3. Traverse
Creating a node:
2
IT T33 / DATA STRUCTURES UNIT 1
Insertion at middle:
Insertion and deletion are two basic operations on such lists. Consider that a new node pointed to
by p is to be inserted after the node pointed to by q in a doubly linked list. Links of two nodes
are altered during insertion.
Algorithm:
Void insertmid()
{
p=(node*)malloc(sizeof(node));
p->data=3;
p- >next=q-
>next; p-
>prev=q;
q- >next=p;
if(p->next!=NULL)
p->next->prev=p;
}
2
IT T33 / DATA STRUCTURES UNIT 1
Insertion at beginning:
Void insertbeg()
{
p=(node*)malloc(sizeof(node));
p->data=x
p->prev=NULL
p->next=start;
if( start!=NULL)
start ->prev=p;
}
Insertion at end:
void insertend()
{
p=(node*)malloc(sizeof(node));
p->data=x
p->next=NULL;
q=start; while(q->next!=NULL)
q=q->next;
p->prev=q;
q->next=p
}
Deletion of node:
When a node, pointed to by P is to be deleted than its predecessor node x and its successor node y
will be affected.
Right link at x should be set to y.
Left link at y should be set to x.
Release the memory allocated to the node pointed to by p.
2
IT T33 / DATA STRUCTURES UNIT 1
P->prev->next=P->next;
p->next->prev=P->prev;
free(P);
A circular linked list is one, which has no beginning and no end. A singly linked list can be made
a circular linked list by simply storing the address of the very first node in the linked field of the
last node.
Circular linked list looks like a cyclic linked list where there is no null pointer to indicate the end
of the list. The example diagram for circular linked list is shown below
2
IT T33 / DATA STRUCTURES UNIT 1
Basic Operations:
1. Insertion
2. Deletion
3. Traversal / Display
Creation of A List
Insertion:
One of the most primitive operations that can be done in a singly linked list is insertion of the
node.
Memory is to be allocated for the new node (in a similar way that is done while creating a list)
before reading the data.
The new node will contain empty data field and empty link field.
The data field of the new node is then stored with the information read from the user.
The link field of the new node is assigned to NULL.
The new node can be inserted in three different places:
1. Insertion of a node in the beginning of the list.
2. Insertion of a node in the middle of the list.
3. Insertion of a node in the end of the list.
3
IT T33 / DATA STRUCTURES UNIT 1
Check whether the list is empty or not. (i.e., check whether the head pointer is pointing (to
NULL or not).
If the list is not empty then follow these steps:
The link field of the new node is made to point
The data field of the first node.
The head pointer is made to point the data field of the new node by assigning the address of the
new node.
ALGORITHM:
INSERT_FIRST (HEAD:NODE)
NEWNODE, LAST :NODE
Step1: Set NEWNODE=GETNODE ()
Step2: CALL READNODE (NEWNODE)
Step 3: If (HEAD==NULL)
Set HEAD=NEWNODE
Return
[End of If Structure]
Step 4: Set LAST=HEAD
Step 5: Repeat While (LAST->LINK!=HEAD)
Assign LAST=LAST->LINK
[End of While Loop]
Step 6: Assign LAST->LINK=NEWNODE
Step 7: Assign NEWNODE->LINK=HEAD
Step 8: Assign HEAD=NEWNODE, END INSERT _FIRST ()
3
IT T33 / DATA STRUCTURES UNIT 1
Deletion:
Another primitive operation that can be done in a circular singly linked list is the deletion of the
node. Memory is to be released for the node to be deleted. A node can be deleted from the list
from three places:
1. Deletion of the first node from the list.
2. Deletion of the intermediate node from the list.
3. Deletion of the last node from the list.
Traversal of A List:
To read the information or to display the information in the list, we must traverse the node one
by one starting from the first node until the last node. Traversing a circular list involves:
Check whether the head pointer is pointing NULL or not.
Display the information in the data field stored in the head pointer.
Traverse the node one by one by advancing the head pointer.
3
IT T33 / DATA STRUCTURES UNIT 1
Algorithm
A matrix is called sparse if it has a relatively number of zero elements. In the following example
out of 36 elements only 8 are non-zero.
This is sparse matrix.
If we store this matrix in an array in memory, there would be much wasted space. To avoid this,
space array can be represented by
Vector representation
Linked representation
3
IT T33 / DATA STRUCTURES UNIT 1
For representing a sparse matrix using linked list we require declaration of three structures: head
node, colnode, rownode.
3
IT T33 / DATA STRUCTURES UNIT 1
The structure headnode that contains 4 members: nrows for containing number of rows, ncols for
number of columns, nvals for number of non zero element and head pointer of row node type for
pointing to first row of the sparse matrix maintained as linked list.
The structure row node that contains 4 members: row is for row number, a pointer down of row
node type of pointing to next row and a pointer right of colnode type for pointing the first node
of the linked list, maintained for all non-zero column for that row. The structure colnode that
contains 3 members: first is the column number col, second is the value val and third is the next
pointer of type colnode that points to next node for this linked list.