A. K. Sharma
Professor and Dean
YMCA University of Science and Technology
Delhi • Chennai
Copyright © 2013 Dorling Kindersley (India) Pvt. Ltd.
Licensees of Pearson Education in South Asia
No part of this eBook may be used or reproduced in any manner whatsoever without the publisher’s
prior written consent.
This eBook may or may not include all assets that were part of the print version. The publisher
reserves the right to remove any material in this eBook at any time.
ISBN 9788131792544
eISBN 9789332514225
Head Office: A-8(A), Sector 62, Knowledge Boulevard, 7th Floor, NOIDA 201 309, India
Registered Office: 11 Local Shopping Centre, Panchsheel Park, New Delhi 110 017, India
To
my parents,
wife Suman and daughter Sagun
Contents
Preface to the Second Edition
Preface
About the Author
Chapter 1: Overview of C
1.1 The History
1.2 Characters Used in C
1.3 Data Types
1.3.1 Integer Data Type (int)
1.3.2 Character Data Type (char)
1.3.3 The Floating Point (float) Data Type
1.4 C Tokens
1.4.1 Identifiers
1.4.2 Keywords
1.4.3 Variables
1.4.4 Constants
1.5 Structure of a C Program
1.5.1 Our First Program
1.6 printf() and scanf() Functions
1.6.1 How to Display Data Using printf() Function
1.6.2 How to Read Data from Keyboard Using scanf()
1.7 Comments
1.8 Escape Sequence (Backslash Character Constants)
1.9 Operators and Expressions
1.9.1 Arithmetic Operators
1.9.2 Relational and Logical Operators
1.9.3 Conditional Operator
1.9.4 Order of Evaluation of Expressions
1.9.5 Some Special Operators
1.9.6 Assignment Operator
1.9.7 Bitwise Shift Operators
1.10 Flow of Control
1.10.1 The Compound Statement
1.10.2 Selective Execution (Conditional Statements)
1.10.3 Repetitive Execution (Iterative Statements)
1.10.4 The exit() Function
1.10.5 Nested Loops
1.10.6 The Goto Statement (Unconditional Branching)
viii Data Structures Using C
Index
Preface to the Second Edition
I have been encouraged by the excellent response given by the readers to the first edition of the book to
work on the second edition. As per the feedback received from the teachers of the subject and the input
provided by the team at Pearson Education, the following topics in various chapters of the book have
been added:
1. Sparse matrices
2. Recursion
3. Hashing
4. Weighted binary trees
a. Huffman algorithm
5. Spanning trees, minimum cost spanning trees
a. Kruskal algorithm
b. Prim’s algorithm
6. Shortest path problems
a. Warshall’s algorithm
b. Floyd’s algorithm
c. Dijkstra’s algorithm
7. Indexed file organization
While revising the book, the text has been thoroughly edited and the errors found have been
corrected. More examples on important topics have been included.
I hope the readers will like this revised edition of the book and, as before, will provide their
much-needed feedback and comments for further improvement.
Acknowledgements
I am thankful to Khushboo Jain and Anuradha Pillai for helping me in preparing the solution manual
of the book.
A. K. Sharma
Preface
As a student, programmer, and teacher of computer engineering, I find ‘Data Structures’ a core course of
computer engineering and particularly central to the programming process.
In fact, in our day-to-day life, we are confronted with similar situations, such as deciding where to keep
a bunch of keys, a pen, coins, two thousand rupees, a chalk, and five hundred thousand rupees.
I would keep the bunch of keys and coins in the left and right pockets of my pants, respectively. The
pen gets clipped to the front pocket of the shirt whereas two thousand rupees would go into my ticket
pocket. I would definitely put the five hundred thousand rupees into a safe, i.e., under the lock and key.
While teaching, I will keep the chalk in hand. The decision of choosing the places for these items is based
on two factors: ease of accessibility and security.
Similarly, given a problem situation, a mature programmer chooses the most appropriate data
structures to organize and store data associated with the problem. The reason is that an intelligent
choice of data structures decides the fate of the software in terms of effectiveness, speed,
and efficiency, the three most important features for the success of a commercial venture.
I have taught ‘Data Structures’ for more than a decade and, therefore, my students and teacher
colleagues have long asked me to write a book on this subject.
The hallmark of this book is that it not only helps students to understand the concepts governing
data structures but also develops in them the ability to choose the right data structures for a given
problem situation. In order to provide hands-on experience to budding software engineers,
implementations of the operations defined on data structures using ‘C’ have been provided. The book
has a balance between the fundamentals and advanced features, supported by solved examples.
This book would not have been possible without the well wishes and contribution of many people
in terms of suggestions and useful remarks provided by them during its production. I record my
thanks to Dr Ashutosh Dixit, Anuradha Pillai, Sandya Dixit, Dr Komal Bhatia, Rosy Bhatia, Harsh, and
Indu Grover.
I am indebted to my teachers and research guides, Professor J. P. Gupta, Professor Padam Kumar,
Professor Moinuddin, and Professor D. P. Agarwal, for their encouragement. I am also thankful to
my friends, Professor Asok De, Professor Qasim Rafiq, Professor N.S. Gill, Rajiv Kapur and Professor
Rajender Sahu, for their continuous support and useful comments.
I am also thankful to various teams at Pearson who made this beautiful book happen.
Finally, I would like to extend special thanks to my parents, wife Suman and daughter Sagun for
saying ‘yes’ for this project when both wanted to say ‘no’. I know that I have stolen some of the quality
time which I ought to have spent with them.
Some errors might have unwittingly crept in. I shall be grateful if they are brought to my notice.
I would also be happy to acknowledge suggestions for further improvement of this book.
A. K. Sharma
About the Author
A. K. Sharma is currently Chairman, Department of Computer Engineering, and
Dean of Faculty, Engineering and Technology at YMCA University of Science and
Technology, Faridabad. He is also a member of the Board of Studies committee of
Maharshi Dayanand University, Rohtak. He has guided ten Ph.D. theses and has
published about 215 research papers in national and international journals of repute.
He heads a group of researchers actively working on the design of different
types of ‘Crawlers’.
Chapter 1: Overview of C
• Loose typing
• Structured language
• Wide use of pointers to access data structures and physical memory of the system
Besides the above characteristics, C programs are small and efficient. A ‘C’ program can be
compiled on a variety of computers.
• Long integer: A long integer is referred to as long int or simply long. It is stored in 32 bits and
The unsigned int is stored in one word of the memory whereas the unsigned short is stored in
16 bits and does not depend upon the word size of the memory.
Examples of invalid integers are:
(i) 9, 24, 173      illegal: comma used
(ii) 5.29           illegal: decimal point used
(iii) 79 248        illegal: blank used
• double: Numbers of the double data type are stored in 64 bits of memory.
A summary of the basic data types of C is given in Table 1.1. From this table, it may be observed that
character and integer type data can also be declared as unsigned. Such data types are called unsigned
data types. In this representation, the data is never negative: its range starts from 0 and extends to a
maximum value. Thus, the largest unsigned value is about twice the largest value of the corresponding
signed type.
1.4 C TOKENS
A token is a group of characters that logically belong together. In fact, a programmer can write a program
by using tokens. C supports the following types of tokens:
• Identifiers
• Keywords
• Constants
• Variables
1.4.1 Identifiers
Symbolic names can be used in C for various data items. For example, if a programmer desires to store a
value 27, then he can choose any symbolic name (say, ROLL) and use it as given below:
ROLL = 27;
where ROLL is a memory location and the symbol ‘=’ is the assignment operator.
The significance of the above statement is that ‘ROLL’ is a symbolic name for a memory location
where the value 27 is being stored. A symbolic name is generally known as an identifier.
The identifier is a sequence of characters taken from the C character set. The number of characters in
an identifier is not fixed, though most C compilers allow 31 characters. The rules for the formation
of an identifier are:
• An identifier can consist of alphabets, digits, and/or underscores.
• C is case sensitive, i.e., upper case and lower case letters are considered different from each other.
• An identifier can start with an underscore character. Some special C names begin with the
underscore.
• Special characters such as blank space, comma, semicolon, colon, period, slash, etc. are not allowed.
• The name of an identifier should be so chosen that its usage and meaning become clear. For
example, total, salary, roll_no, etc. are self-explanatory identifiers.
1.4.2 Keywords
A keyword is a reserved word of C. This cannot be used as an identifier by the user in his program.
The set of C keywords is given in Table 1.2.
Table 1.2 Standard keywords in C
auto        double      int         struct
break       else        long        switch
case        enum        register    typedef
char        extern      return      union
const       float       short       unsigned
continue    for         signed      void
default     goto        sizeof      volatile
do          if          static      while
1.4.3 Variables
A variable is the most fundamental aspect of any computer language. It is a location in the computer
memory which can store data and is given a symbolic name for easy reference. The variables can be
used to hold different values at different times during a program run. To understand this concept,
let us have a look at the following set of statements:
Total = 500.25; ...(i)
Net = Total - 100.00; ...(ii)
In statement (i), value 500.25 has been stored in a memory location called Total. The variable Total
is being used in statement (ii) for the calculation of another variable Net. The point worth noting is that
‘the variable Total is used in statement (ii) by its name not by its value’.
Before a variable is used in a program, it has to be defined. This activity enables the compiler to
make available the appropriate amount of space and location in the memory. The definition of a variable
consists of its type followed by the name of the variable. For example, a variable called Total of type float
can be declared as shown below:
float Total;
Similarly, the variable Net of type int can also be defined as shown below:
int Net;
Examples of valid variable declarations are:
(i) int count;
(ii) int i, j, k;
(iii) char ch, first;
From Figure 1.3, we can see that besides its type a variable has three entities associated with it, i.e.,
the name of the variable (val), its physical address (4715), and its contents (100). The content of a
variable is also called its rvalue whereas the physical address of the variable is called its lvalue. Thus,
the lvalue and rvalue of variable val are 4715 and 100, respectively. The lvalue is of more importance
because an expression appearing on the left hand side of the assignment operator must be an lvalue,
i.e., it must refer to a variable or object.
1.4.4 Constants
A constant is a memory location which can store data in such a manner that its value does not change
during the execution of a program. Any attempt to change the value of a constant will result in an
error message. A constant in C can be of any of the basic data types, i.e., integer constant, floating
point constant, and character constant. The const qualifier is used to declare a constant as shown
below:
const <type> <name> = <val>;
where
const: is a reserved word of C
<type>: is any of the basic data types
<name>: is the identifier name
<val>: is the value to be assigned to the constant.
(1) Integer constant: It is a constant which can be assigned integer values only. For example, if we
desire to have a constant called rate of type integer containing a fixed value 50, then the following
declaration can be used:
const int rate = 50;
The above declaration means that rate is a constant of type integer having a fixed value 50. Consider
the following declaration:
const int rate;
rate = 50;
The above initialization of constant rate is illegal. This is because a constant cannot be given a value
at a place other than where it is declared. It may be further noted that if a program does not change
or mutate a constant or constant object, then the program is said to be const correct.
(2) Floating point constant: It is a constant which can be assigned values of real or floating point
type. For example, if it is desired to have a constant called Pi containing value 3.1415, then the
following declaration can be used.
const float Pi = 3.1415;
The above declaration means that Pi is a constant of type float having a fixed value 3.1415. It may be
noted here that by default a floating point constant is of type double.
(3) Character constant: A character constant can contain a single character of information. Thus,
data such as ‘Y’ or ‘N’ is known as a character constant. Let us assume that it is desired to have a
constant called Akshar containing the character ‘Q’; following declaration can be used to obtain
such a constant.
const char Akshar = 'Q';
The above declaration means that Akshar is a constant of type char having a fixed value ‘Q’.
A sequence of characters enclosed within quotes is known as a string literal. For example, the
character sequence “computer” is a string literal. When a string literal is assigned to an identifier
declared as a constant, then it is known as a string constant. In fact, a string is an array of
characters. Arrays are discussed later in the chapter.
Note:
(i) C is a case sensitive language, i.e., it distinguishes between upper case and lower case characters.
Thus, main() is different from Main(). In fact, most of the characters used in C are lowercase.
Hence, it is safest to type everything in lower case except when a programmer needs to capitalize
some text.
(ii) Every C program has a function called main followed by parentheses. It is from here that
program execution begins. A function is basically a subprogram and is complete in itself.
(iii) The task to be performed by a function is enclosed in curly braces called its body, i.e., {}.
These braces are equivalent to begin and end keywords used in some other languages like
Pascal. The body of function contains a set of statements and each statement must end with
a semicolon.
The above statement would display the following data on the screen:
The age of student = 25
Thus, the text between the quotes has been displayed as such but for %d, the data stored in the
variable age has been displayed.
A float can be displayed on the screen by including a format specifier (%f) within the pair of quotes
as shown below:
float rate = 9.5;
printf ("The rate of provident fund = %f", rate);
The above set of statements declares the variable roll of type int, and reads the value for roll from
the keyboard. It may be noted that ‘&’, the address operator, is prefixed to the variable roll. The reason for
specifying ‘&’ would be discussed later in the book.
Similarly, other format specifiers can be used to input data from the keyboard.
Example 1: Write an interactive program that reads the marks secured by a student for four subjects
Sub1, Sub2, Sub3, and Sub4, the maximum marks of a subject being 100. The program shall compute
the percentage of marks obtained by the student.
Solution: The scanf() and printf() functions would be used to do the required task. The required
program is as follows:
#include <stdio.h>
int main(void)
{
    int sub1, sub2, sub3, sub4;
    float percent;
    printf ("Enter the marks for four subjects:");
    scanf ("%d %d %d %d", &sub1, &sub2, &sub3, &sub4);
    percent = (sub1 + sub2 + sub3 + sub4) / 400.00 * 100;
    printf ("The percentage = %f", percent);
    return 0;
}
For input data 45 56 76 90, the above program computes the percentage and displays the following output:
The percentage = 66.750000
Though the output is correct, it displays four unnecessary trailing zeros. This can be controlled
by using field width specifiers discussed later in the chapter.
1.7 COMMENTS
A comment can be added to the program by enclosing the text between the pair /* ... */, i.e.,
the pair ‘/*’ indicates the beginning of the comment whereas the pair ‘*/’ marks the end of it. It is also
known as a multiple line comment. For example, the following line is a comment:
/* This is my first program */
Everything written within ‘/*’ and ‘*/’ is ignored by the compiler. A comment written in this
fashion can overflow to multiple lines as shown below:
/* This is an illustration of multiple line comment. The C compiler ignores
these lines. A programmer can use comment lines for documentation of his
programs*/
The character constant “\n” has been placed at the beginning of the text of the second printf()
statement and, therefore, the output of the program would be:
A back slash character constant
prints the output on the next line.
The output of the above statement would be on the new line with spacing as shown below:
hello Comp-Boy
(2) ’\b’ (Backspace): This character is also called the backspace character. It is equivalent to the
backspace key available on the computer or typewriter. It moves one column backward and positions
the cursor on the character displayed on that column. An example of usage of character ’\b’ is
given below:
#include <stdio.h>
int main(void)
{
    printf ("\n ASHOKA\b-");
    return 0;
}
The output of the above statement would be the following message on the screen and a sounding of
a bell on the system speaker.
Error in data
(4) ’\n’ (new line): As discussed earlier, this character is called newline character. Wherever it
appears in the output statement, the immediate next output is taken to the beginning of the next
new line on the screen. Consider the statement given below:
#include <stdio.h>
int main(void)
{
    printf ("\n This is \n a test.");
    return 0;
}
C supports many types of operators such as arithmetic, relational, logical, etc. An expression that
involves arithmetic operators is known as an arithmetic expression. The computed result of an arithmetic
expression is always a numerical value. An expression which involves relational and/or logical operators
is called a boolean expression or logical expression. The computed result of such an expression is a
logical value, i.e., either 1 (True) or 0 (False).
The rules of formation of an expression are:
• A signed or unsigned constant or variable is an expression.
1.9.1.1 Unary Arithmetic Operators A unary operator requires only one operand or data item.
The unary arithmetic operators supported by C are unary minus (‘-’), increment (‘++’), and decrement
(‘--’). As compared to binary operators, the unary operators are right associative in the sense that they
evaluate from right to left.
The unary minus operator is written before a numerical value, variable or an expression. Examples
of usage of unary minus operator are:
(i) -57 (ii) -2.923 (iii) -x (iv) -(a*b) (v) 8*(-(a + b))
It may be noted here that the result of application of unary minus on an operand is the negation of
its operand.
The operators ‘++’ and ‘--’ are characteristic of C. These are called increment and decrement operators,
respectively. The increment operator ‘++’ adds 1 to its operand. Therefore, we can say that the following
expressions are equivalent.
i = i + 1 ≡ ++i;
For example, if the initial value of i is 10 then the expression ++i will increment the contents of
i to 11. Similarly, the decrement operator ‘--’ subtracts 1 from its operand. Therefore, we can say that
the following expressions are equivalent:
j = j - 1 ≡ --j;
For example, if the initial value of j is 5 then the expression --j will decrement the contents of j to 4.
The increment and decrement operators can be used both as a prefix and as a postfix to a variable
as shown below:
++ x or x++
--y or y--
As long as the increment or decrement operator is not used as part of an expression, the prefix and
postfix forms of these operators do not make any difference. For example, ++x and x++ would produce
the same result. However, if such an operator is part of an expression then the prefix and postfix forms
would produce entirely different results.
In the prefix form, the operand is incremented or decremented before it is used in the expression,
whereas in the postfix form, the operand is used first and then incremented or decremented.
Example 2: Take two variables x and y with initial values 10 and 15, respectively. Demonstrate the usage
of relational operators by using x and y as operands.
Solution: The usage of relational operators is illustrated below with the help of a table:
x = 10, y = 15
Expression      Result
x > y           False
x + 5 >= y      True
x < y           True
x <= y          True
x == y          False
x + 5 == y      True
x != y          True
A logical operator is used to connect two relational expressions or logical expressions. The result of
such an operation is always logical, i.e., either true or false. The valid logical operators supported by C
are given in Table 1.5.
If x evaluates to false (zero), then the outcome of the above expression is bound to be false
irrespective of y evaluating to any logical value. Therefore, there is no need to evaluate the term y
in the above expression.
Similarly, in the following expression, if x evaluates to true (non-zero) then the outcome is bound to
be true. Thus, y will not be evaluated.
x || y
The expression x > y is evaluated. If x is greater than y, then z is assigned x otherwise z gets y.
Examples of valid conditional expressions are:
(i) y = (x >= 10) ? 0 : 10;
(ii) Res = (i < j) ? sum + i : sum + j;
(iii) q = (a == 0) ? 0 : (x/y);
[Figure: the conditional expression E1 ? E2 : E3, with the branch to E2 labelled true and the branch to E3 labelled false.]
It may be noted here that the conditional operator (? :) is also known as a ternary operator because
it operates on three values.
It may be noted here that in C, false is represented as zero and true as any non-zero value. Thus,
expressions that use relational and logical operators return either 0 (false) or 1 (true).
2. Comma operator: The comma operator is used to string together a number of expressions
which are performed in a sequence from left to right.
For example, the following statement
a = (x = 5, x + 2)
executes in the following order
(i) value 5 is assigned to variable x.
(ii) the expression x + 2 is evaluated; x itself is not modified and remains 5.
(iii) the value of expression x + 2 (i.e., 7) is assigned to the variable a.
The following points may be noted regarding comma operator:
• A list of expressions separated by a comma is always evaluated from left to right.
• The final value and type of a list of expressions separated by a comma is always same as the type
and value of the rightmost expression in the list.
The expression on the right hand side could be a constant, a variable or an arithmetic, relational, or
logical expression. Some examples of assignment statements are given below:
a = 10;
a = b;
a = b*c;
(i) The assignment operator is a kind of a store statement, i.e., the value of the expression on the
right hand side is stored in the variable appearing on the left side of the assignment operator.
The variable on the left side of the assignment operator is also called lvalue and is an accessible
address in the memory. Expressions and constants on the right side of the assignment operator
are called rvalues.
(ii) The assignment statement overwrites the original value contained in the variable on the left
hand side with the new value of the right hand side.
(iii) Also, the same variable name can appear on both sides of the assignment operator as shown
below:
(iv) Multiple assignments in a single statement can be used, especially when same value is to be
assigned to a number of variables.
a = b = c = 30;
These multiple assignment statements work from right to left and at the end, all variables have the
same value. The above statement assigns the value (i.e., 30) to all variables c, b, and a. However, the
variables must be of same type.
(v) A point worth noting is that C converts the type of the value on the right hand side to the data
type on the left.
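[Figure: left shift by three bits. The pattern 0 1 0 1 1 1 0 1 becomes 1 1 1 0 1 0 0 0; the three leftmost bits are lost and three zero bits are appended on the right.]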
Similarly, the following expression would shift the contents of y to right by two bits:
y >> 2
In this case, the rightmost two bits would be lost and the resultant 2 vacant leftmost bits would be
filled by zeros as shown in Figure 1.6.
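[Figure 1.6: right shift by two bits. The pattern 1 1 0 1 1 0 1 1 becomes 0 0 1 1 0 1 1 0; the two rightmost bits are lost and two zero bits are appended on the left.]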
High level languages are designed for computers based on the von Neumann architecture. Since this
architecture supports only sequential processing, the normal flow of execution of statements in a high
level language program is also sequential, i.e., each statement is executed in the order of its appearance
in the program. For example, in the following C program segment, the order of execution is sequential
from top to bottom.
:
x = 10;
y = 20;
z = x + y;
:
The first statement to be executed is ‘x = 10’, and the second statement to be executed is ‘y = 20’.
The execution of statement ‘z = x + y’ will take place only after the execution of the statement
‘y = 20’. Thus, the processing is strictly sequential. Moreover, every statement is executed once and
only once.
Depending upon the requirements of a problem, it is often required to alter the normal sequence
of execution in the program. This means that we may desire to selectively and/or repetitively execute a
program segment. A number of C control structures are available for controlling the flow of processing.
These structures are discussed in the following sections.
In fact, the function of the curly braces ‘{’ and ‘}’ in a C program, i.e., to create a block, is the same
as the function of begin and end, the reserved words in Pascal. C calls these braces delimiters.
(2) The if-else statement: It can be observed from the above examples that the simple if statement
does nothing when the expression is false. An if-else statement takes care of this aspect. The
general form of this construct is given below:
if (expression)
{
statement sequence1
}
else
{
statement sequence2
}
where if: is a reserved word.
expression: is a boolean expression, written within parentheses.
statement sequence1: can be a simple or a compound statement.
else: is a reserved word.
statement sequence2: can be a simple or a compound statement.
Examples of if-else statements are:
(i) if (A > B) C = A;
    else C = B;
(ii) if (x == 100)
         printf ("\n Equal to 100");
     else
         printf ("\n Not Equal to 100");
It may be noted here that both the ‘if’ and ‘else’ parts are terminated by semicolons.
(3) Nested if statements (if-else-if ladder): The statement sequence of if or else may contain another
if statement, i.e., the if-else statements can be nested within one another as shown below:
if (exp1)
    if (exp2)
    {
        :
    }
    else
        if (exp3)
        {
            :
        }
        else
        {
            :
        }
It may be noted here that sometimes the nesting may become complex in the sense that it becomes
difficult to decide “which if does the else match”. This is called the “dangling else problem”. The C
compiler follows the following rule in this regard:
Rule: Each else matches to its nearest unmatched preceding if.
Consider the following nested if:
if (x < 50) if (y > 5) Net = x + y; else Net = x - y;
In the above statement, the else part matches the second if (i.e., if (y > 5)) because it is the nearest
preceding if. It is suggested that nested if(s) should be written with proper indentation. The else(s)
should be lined up with their matching if(s). Nested if(s) written in this fashion are also called an
if-else-if ladder. For example, the nested if given above should be written as:
if (x < 50)
    if (y > 5)
        Net = x + y;
    else
        Net = x - y;
However, if one desires to match the else with the first if, then the braces should be used as shown
below:
if (x < 50) {
    if (y > 5)
        Net = x + y;
}
else
    Net = x - y;
The evaluation of an if-else-if ladder is carried out from top to bottom. Each conditional expression
is tested and, if found true, its corresponding statement is executed. The remaining ladder is,
therefore, bypassed. In a situation where none of the nested conditions is found true, the final else part
is executed.
(4) Switch statement (selection of one of many alternatives): If it is required in a program to
select one of several different courses of action, then the switch statement of C can be used. In
fact, it is a multibranch selection statement that makes the control jump to one of several
statements based on the value of an int variable or expression. The general form of this statement
is given as:
switch (expression)
{
    case constant 1:
        statement;
        break;
    case constant 2:
        statement;
        break;
    :
    default:
        statement;
}
• If a match is found then its corresponding statements are executed and when break is encountered,
the flow of control jumps out of the switch statement. If a break statement is not encountered, then
the control continues into the statements of the following cases. For this reason, the switch statement
is particularly error prone: once the control enters a particular case, it keeps running through all the
subsequent cases in the absence of a proper break statement. This phenomenon is called “fall-through”.
• If no match is found and if a default label is present, then the statement corresponding to default
is executed.
• The values of the various case constants must be unique.
(ii)
switch (code)
{
    case 101 : Rate = 50; break;
    case 102 : Rate = 70; break;
    case 103 : Rate = 100; break;
    default : Rate = 95;
}
In the statement (ii), it can be observed that depending on the value of the code one out of the four
instructions is selected and obeyed. For example, if the code evaluates to 103, the third instruction (i.e.,
Rate = 100) is selected. Similarly if the code evaluates to 101, then the first instruction (i.e., Rate = 50) is
selected.
The variables i and j have been initialized to values 1 and 10, respectively. Please note that these
initialization expressions are separated by a ‘comma’. However, the required semicolon remains
as such. During the execution of the loop, i increases from 1 to 10 whereas j decreases from 10 to 1
simultaneously. Similarly, the increment and decrement operations have been separated by a ‘comma’
in the ‘for statement’.
Though the power of the loop can be increased by including more than one initialization and
increment expression separated with the comma operator, there can be only one test expression,
which can be simple or complex.
1.10.3.1 The ‘break’ and ‘continue’ Statements The break statement can be used in any ‘C’ loop
to terminate execution of the loop. We have already seen that it is used to exit from a switch statement. In
fact, whenever the break statement is encountered in a loop the control is transferred out of the loop. This
is used in a situation where some error is found in the program inside the loop or it becomes unnecessary
to continue with the rest of the execution of the loop.
Consider the following program segment:
:
while (val != 0)
{
    :
    printf ("\n %d", val);
    if (val < 0) {
        printf ("\n Error in input");
        break;
    }
    :
}
Whenever the value contained, in variable val becomes negative, the message `Error in input’
would be displayed and because of the break statement the loop will be terminated.
The continue statement can also be used in any ‘C’ loop to bypass the rest of the code segment of
the current iteration of the loop. The loop, however, is not terminated. Thus, the execution of the loop
resumes with the next iteration. For example, the following loop computes the sum of positive numbers
in a list of 50 numbers:
:
sum = 0;
for (i = 0; i < 50; i++)
{
    scanf ("%d", &val);        /* read the next number of the list */
    if (val <= 0) continue;    /* skip numbers that are not positive */
    sum = sum + val;
}
:
It may be noted here that the continue statement has no relevance as far as the switch statement is
concerned. It can appear inside a switch only when the switch is itself enclosed in a loop, in which
case it acts on that loop.
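The difference between the two statements can be seen side by side in the following sketch. Both function names and the idea of taking the numbers from an array (rather than from the keyboard) are assumptions made so that the segment is self-contained.

```c
/* sum_positive (hypothetical): adds the positive entries of list[0..n-1].
   continue skips an entry but the loop keeps running. */
int sum_positive(const int list[], int n)
{
    int i, sum = 0;

    for (i = 0; i < n; i++)
    {
        if (list[i] <= 0)
            continue;            /* go straight to the next iteration */
        sum = sum + list[i];
    }
    return sum;
}

/* count_before_negative (hypothetical): counts the entries that precede
   the first negative one. break leaves the loop altogether. */
int count_before_negative(const int list[], int n)
{
    int i, count = 0;

    for (i = 0; i < n; i++)
    {
        if (list[i] < 0)
            break;               /* control is transferred out of the loop */
        count++;
    }
    return count;
}
```

For the list 3, -1, 0, 4 the first function still reaches the value 4, whereas the second function stops as soon as it meets -1.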
It may be noted here that the file process.h has to be included as header file because it contains the
function prototype of the library function exit().
-
-
goto rpara;
-
-
rpara: -
-
-
The goto statement will cause the control to be transferred to a statement whose label is rpara. The
normal flow of execution will continue from this statement (i.e., having label rpara) onwards. Consider
the following program segment:
:
k = 50;
back: I++;
:
sum = sum + k*I ;
k--;
goto back;
:
It is clear from the above segment that as and when the control reaches the goto statement, it is
transferred to the statement with label back.
Statement label: A statement label is an identifier, followed by a colon (i.e., :), which can be placed
before any C statement as shown below:
again: C = A + B;
Thus, again is a label attached to the statement C = A + B and the control of execution can be
transferred to this statement by the following goto statement:
goto again;
It may be noted here that the transfer of control out of a control structure is allowed. A goto branch
into a control structure, although accepted by the compiler, should be avoided because it skips whatever
the control structure does on entry (for example, the initialization of a loop).
Some example programs using control structures are given below.
Example 3: Write a program that reads three numbers and prints the largest of them.
Solution: The if-else ladder would be used. The required program is given below:
/* This program displays the largest of three numbers */
#include <stdio.h>
main()
{
    int A, B, C;
    printf ("\n Enter the Numbers A, B, C:");
    scanf ("%d %d %d", &A, &B, &C);
    /* if-else ladder */
    if (A > B)
        if (A > C)
            printf ("\n A is largest");
        else
            printf ("\n C is largest");
    else
        if (B > C)
            printf ("\n B is largest");
        else
            printf ("\n C is largest");
    /* end of ladder */
}
Example 4: Write a program that prints all perfect numbers between 2 and 9999, inclusive. A perfect
number is a positive integer such that the number is equal to the sum of its proper divisors. A proper
divisor is any divisor whose value is less than the number.
Solution: The remainder operator ‘%’ would be used to find whether a number is a divisor of
another or not. The maximum value of a proper divisor cannot be more than half the value of the
number.
The required program is given below:
/* This program finds out all the perfect numbers between 2 and 9999 both
inclusive */
#include <stdio.h>
main()
{
int num, divisor;
int sum;
char ch;
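The listing above breaks off after its declarations. As a minimal sketch of the approach described in the solution (not necessarily the author's program), the test for a perfect number can be written as a function. The names is_perfect and count_perfect are hypothetical.

```c
/* is_perfect (hypothetical): returns 1 if num equals the sum of its
   proper divisors, otherwise 0. */
int is_perfect(int num)
{
    int divisor;
    int sum = 0;

    if (num < 2)
        return 0;

    /* a proper divisor cannot exceed half of the number */
    for (divisor = 1; divisor <= num / 2; divisor++)
        if (num % divisor == 0)
            sum = sum + divisor;

    return sum == num;
}

/* count_perfect (hypothetical): counts the perfect numbers in low..high,
   both inclusive. */
int count_perfect(int low, int high)
{
    int num, count = 0;

    for (num = low; num <= high; num++)
        if (is_perfect(num))
            count++;
    return count;
}
```

A main() function would only need to loop from 2 to 9999 and print every num for which is_perfect(num) is non-zero; the numbers found in that range are 6, 28, 496, and 8128.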
In order to use the above functions, the programmer must include the ‘stdio.h’ header file at the
beginning of his program. The C or ‘C environment’ assumes keyboard as the standard input device and
VDU as standard output device. A console comprises both keyboard and VDU.
By default, all stream I/Os are buffered. At this point, it is better that we understand the term
buffered I/O before we proceed to discuss the stream I/O functions.
[Figure: a stream connected to the I/O devices through an input buffer and an output buffer]
Similarly, data written by an output stream pointer is not directly written onto the device but into an
output buffer. As soon as the output buffer becomes full, a block of data is written on to the I/O device
making output buffer empty.
When the header file stdio.h is included in a program, the following standard stream pointers are
automatically opened:
Name      Meaning
stdin     standard input
stdout    standard output
stderr    standard error
stdaux    auxiliary storage
stdprn    printer
In the computer, ‘stdin’ and ‘stdout’ refer to the user’s console, i.e., input device as keyboard
and output device as VDU (screen). The stream pointers ‘stdprn’ and ‘stdaux’ refer to the printer and
auxiliary storage, respectively. The error messages generated by the stream are sent to standard error
(stderr).
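In order to clear the buffers, a function called ‘fflush()’ can be used. For example, the input buffer of
the standard input device can be cleared by invoking the function: ‘fflush(stdin)’. It may be noted that
the C standard defines fflush() only for output streams; fflush(stdin) clears the input buffer on some
compilers but its effect is undefined on others.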
Since the function getchar() is a stream I/O function, it is buffered in the sense that the character
typed by the user is not passed to the variable ch until the user hits the enter or return key, i.e., ↵. The
enter key is itself a character and it also gets stored in the buffer along with the character typed by the
user. The entry of the ‘enter key’ character into the buffer creates problems in the normal functioning of
the getchar() function. Therefore, it is suggested that after an input operation from the standard input
device (i.e., keyboard) the input buffer should be cleared. This operation is required to avoid interference
with subsequent input operations. The function call for this activity is: fflush (stdin). The usage is
shown below:
:
char ch;
ch = getchar();
fflush (stdin);
:
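Where fflush (stdin) is not supported, the same effect can be obtained by reading and discarding the rest of the input line. The sketch below is one possible way of doing this; the function name discard_line is hypothetical, and the function takes the stream as a parameter so that it can be used with stdin or with a file.

```c
#include <stdio.h>

/* discard_line (hypothetical): reads and throws away characters up to and
   including the next new line character, or up to the end of the input.
   It returns the number of characters discarded. */
int discard_line(FILE *fp)
{
    int c;
    int count = 0;

    while ((c = getc(fp)) != EOF)
    {
        count++;
        if (c == '\n')
            break;      /* the enter key character has been consumed */
    }
    return count;
}
```

With this function, the segment above could be written as ch = getchar(); discard_line (stdin); so that the enter key character left in the buffer does not interfere with the next input operation.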
The function putchar() is used to send a single character to the standard output device. The character
to be displayed on the VDU screen is included as an argument to the putchar() function as shown below:
:
ch = 'A';
putchar (ch);
:
Once the above program segment is executed, the character ‘A’ will be displayed on the screen.
The putchar() function is also buffered. The function call for clearing the output buffer is fflush
(stdout). Otherwise, the output buffer is typically written out on its own when a new line character
‘\n’ is sent to the screen, when the buffer becomes full, or when the program terminates normally.
(2) getc() and putc() Functions: The getc() and putc() functions are also character-
based functions. They have been basically designed to work with files. However, these
functions can also be used to read and write data from standard input and output devices by
specifying stdin and stdout as input and output files, respectively.
For example, the function getc(stdin) is equivalent to the function getchar(). Both will get a
character from the standard input device (keyboard). The function getchar() by default reads from
keyboard whereas the argument stdin of function getc(stdin) makes it read from a file represented
by the standard input device which is nothing but the keyboard.
Similarly, the function putc(ch, stdout) is equivalent to the function putchar(ch). Both the
functions send the character ch to the standard output device, i.e., VDU screen.
(3) getche() and putch() Functions: These two functions are character-based versions of console
I/O functions. Therefore, the header file conio.h has to be included. In fact, these functions
act as an extension to the stream I/O functions. The console I/O functions read from the
keyboard and write to the screen directly.
• getche(): This function directly reads a character from the console as soon as it is typed
without waiting for the enter key (↵) to be pressed. There is another similar function getch()
which also reads a character from the keyboard exactly in the same manner but does not show
it on the screen. This function is useful in menu selections where the programmer does not
want to display the character typed by the user. In fact, the extra ‘e’ in the getche() function
stands for echo, i.e., display the character.
Examples of usage of these functions are:
(i) ch = getche();
(ii) ch = getch();
• putch(): This function directly writes a character on the console. It takes a character as an
argument.
Examples of usage of this function are:
(i) putch ('\n'); //It takes the cursor to next line of the screen.
(ii) putch (ch); //It displays the character stored in the variable ch on the screen.
(iii) putch ('A'); //It displays the character ‘A’ on the screen.
Note: These functions cannot read special keys from the keyboard such as function keys.
(1) gets() and puts() functions: These functions are string-based I/O functions. The function
gets() reads a string from the standard input device (keyboard). It is also buffered and,
therefore, the return or enter key has to be typed to terminate the input. The fflush() function
should also be used after each gets() to avoid interference. For example, the program segment
given below reads a string in a variable name from keyboard. Since gets() does not check the
size of its argument, name must be a character array large enough for the longest expected input.
:
gets (name);
fflush (stdin);
:
Similarly, the function puts() is used to send a string to the standard output device. The
string to be displayed on the VDU screen is included as an argument to the puts function as shown
below:
:
city = "New Delhi";
puts (city);
:
Once this program segment is executed, the following message will be displayed on the screen:
New Delhi
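Because gets() cannot limit the number of characters it stores, many programmers prefer the standard function fgets(), which takes the size of the array as an argument. The following sketch shows one way of reading a line with it; the function name read_line is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* read_line (hypothetical): reads one line from fp into buf, storing at
   most size-1 characters, and removes the trailing new line character.
   It returns 1 on success and 0 if nothing could be read. */
int read_line(FILE *fp, char buf[], int size)
{
    if (fgets(buf, size, fp) == NULL)
        return 0;

    /* strcspn gives the position of the first '\n' (or the string length) */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

A call such as read_line (stdin, name, 20) plays the role of gets (name) for an array declared as char name[20], and the string read can then be displayed with puts (name).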
1.12 ARRAYS
A detailed discussion on arrays as a data structure is given in Chapter 3.
1.13 STRUCTURES
Arrays are very useful for list and table processing. However, the elements of an array must be of the
same data type. In certain situations, we require a construct that can store data items of mixed data
types. C supports structures for this purpose. The basic concept of a structure comes from day-to-day
life. We observe that certain items are made up of components or sub-items of different types. For
example, a date is composed of three parts: day, month, and year as shown in Figure 1.8.
Day    Month    Year
01     02       1998
Fig. 1.8 Composition of ‘date’
Similarly, the information about a student is composed of many components such as name, age, roll
number, and class; each belonging to different types (see Figure 1.9).
Name     SACHIN KUMAR
Age      18
Roll     10196
Class    CE41
Fig. 1.9 Student
The collective information about the student as shown above is called a structure. It is similar to a
record construct supported by other programming languages. Similarly, the date is also a structure.
The term structure can be precisely defined as ‘a group of related data items of arbitrary types’. Each
member of a structure is, in fact, a variable that can be referred to through the name of the structure.
Let us now define the following ‘C’ structure to describe the information about the student given in
Figure 1.9.
struct student
{
char Name [20];
int age;
int roll;
char class[5];
};
The above declaration means that the student is a structure consisting of data members: Name, age,
roll, and class. A variable of this type can be declared as given below:
struct student x;
Once the above declaration is obeyed, we get a variable x of type student in the computer memory
which can be visualized as shown in Figure 1.10.
[Fig. 1.10: the variable x with its members Name, age, roll, and class]
The programmer is also allowed to declare one or more structure variables along with the structure
declaration as shown below:
struct student
{
    char Name [20];
    int age;
    int roll;
    char class[5];
} stud1, stud2;
In the above declaration, the structure declaration and the variable declaration of the structure type
have been clubbed, i.e., the structure declaration is followed by two variable names stud1 and stud2 and
the semicolon.
Please notice that the values are enclosed within curly braces. The values are assigned to the mem-
bers of the structure in the order of their appearance, i.e., first value is assigned to the first member,
second to second, and so on.
This statement will perform the necessary internal assignments. The concept of structure assign-
ment can be better understood by the following example program. In this program, a structure variable
stud1 is initialized with certain values and it is assigned to another structure variable stud2. The con-
tents of the variable stud2 are displayed.
/* This program illustrates copying of complete structures */
#include <stdio.h>
main()
{
    struct student {
        char name[20];
        int roll;
        int age;
        char class[6];
    };
    /* initialize a variable of type student */
    struct student stud1 = {"Sachin Kumar", 101, 16, "CE_IV"};
    struct student stud2;
    /* assign stud1 to stud2 */
    stud2 = stud1;
    /* Display contents of stud2 */
    printf ("\n The copied contents are....");
    printf ("\n Name: %s", stud2.name);
    printf ("\n Roll: %d", stud2.roll);
    printf ("\n Age : %d", stud2.age);
    printf ("\n class: %s", stud2.class);
}
[Figure: nested structure with the date of birth divided into DD, MM, and YY]
The nested structure shown in Figure 16.4 can be declared in a program as shown below:
/* Nested structures */
struct date
{
int dd;
int mm;
int yy;
};
struct student
{
char name[20];
int roll;
struct date dob;
int marks;
};
It may be noted here that the member of a nested structure is referenced from the outermost to
innermost with the help of dot operators. Let us declare a variable stud of type student:
struct student stud;
The following statement illustrates the assignment of value 10 to the member mm of this nested
structure stud:
stud.dob.mm = 10;
The above statement means: store value 10 in the mm member of a structure dob which itself is a
member of the structure variable called stud.
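The outermost-to-innermost rule can be tried out with the following sketch, which repeats the two declarations and adds a hypothetical function birth_month (not part of the text) that reads a nested member.

```c
struct date
{
    int dd;
    int mm;
    int yy;
};

struct student
{
    char name[20];
    int roll;
    struct date dob;    /* a structure nested inside another structure */
    int marks;
};

/* birth_month (hypothetical): returns the month stored in the nested
   member dob, going from the outermost name to the innermost one. */
int birth_month(struct student stud)
{
    return stud.dob.mm;
}
```

A nested structure can also be initialized with nested braces, for example struct student stud = {"Sachin Kumar", 101, {1, 2, 1998}, 0}; after which birth_month (stud) gives 2.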
In the above declaration, the variable wages-of-month has been declared of type salary, where salary
itself is of float type. Thus, we can see that readability of a program can be enhanced by self documenting
the variables and their data types with the help of typedef.
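A declaration of this kind might look like the sketch below. The identifier is spelt wages_of_month here because a hyphen is not allowed in a C identifier, and the function yearly_wages is a hypothetical addition used only to show the new type name in a function heading.

```c
/* salary becomes another name for float */
typedef float salary;

/* yearly_wages (hypothetical): both the parameter and the returned
   value are documented as salaries by their type name. */
salary yearly_wages(salary wages_of_month)
{
    return wages_of_month * 12;
}
```

The compiler treats salary exactly as float; the benefit lies only in the readability of the declarations.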
For example, the declaration:
typedef int age;
can be used to increase the clarity of code as shown below :
age boy, girl, child;
The above statement clearly suggests as to what data the variables boy, girl, and child are going to store.
Similarly, typedef can also be used to define arrays.
typedef float lists [20];
lists BP, SAL, GSAL;
typedef is also extremely useful for shortening the long and inconvenient declarations of structures.
For instance, the structure student can be declared as a typedef as shown below:
typedef struct {
char name [20];
int age;
int roll;
char class[6];
} student;
Now the structure student has become the type and a variable of this type can be defined as:
student stud1, stud2;
The above declaration means that stud1 and stud2 are variables of type student.
Now, the variable A of type color can use the enumerations (blue, yellow, etc.) in a program.
Thus, the following operations on variable A are valid:
A = red;
:
if (A == red)
    x = x + 10;
else
    x = x - 5;
Enumerated data type can be precisely defined as the ordered set of distinct constant values defined
as a data type in a program.
Examples of some valid enumerated data types are:
(i) enum furniture
{
chair,
table,
stool,
desk
};
It may be noted here that the ordering of elements in an enumerated data type is specified by their
position in the enum declaration. Thus for the furniture type, the following relations hold good:
Expression        Value
chair < table     True
stool < desk      True
table == desk     False
chair < desk      True
The enumerated types can also be used in a switch construct. Consider the following program segment.
:
/* Enumerated type declaration */
enum furniture
{
chair,
table,
stool,
desk
};
main()
{
/* Enumerated variable declaration */
enum furniture item;
float cost;
:
switch (item)
{
case chair : cost = 250; break;
case table : cost = 700; break;
case stool : cost = 120; break;
case desk : cost = 1500; break;
}
:
Similarly, enumerated data types can also be used as a control variable in a for loop as shown
below:
:
for (item = chair; item <= desk; item++)
{
:
:
}
The enumerations are also called symbolic constants because an enumeration has a name and
an integer value assigned to it which cannot be changed in the program. Therefore, enumerated
data types cannot be plainly used in I/O statements. As a result, the functions scanf, printf,
gets, puts, etc. cannot be directly applied to these data types: printing an enumerated variable
with %d, for instance, displays its integer value and not its name.
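One common way around this limitation is to keep a table of names indexed by the enumeration value. The sketch below assumes the furniture type declared earlier; the function name furniture_name is hypothetical.

```c
enum furniture
{
    chair,
    table,
    stool,
    desk
};

/* furniture_name (hypothetical): returns the name of an enumeration as a
   string so that it can be displayed with printf or puts. */
const char *furniture_name(enum furniture item)
{
    /* the order of the strings follows the order of the enum declaration */
    static const char *names[] = { "chair", "table", "stool", "desk" };

    if ((int) item < (int) chair || (int) item > (int) desk)
        return "unknown";
    return names[item];
}
```

A statement such as printf ("%s", furniture_name (item)); then displays the name, whereas printf ("%d", item); would display only the integer value of item.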
1.15 UNIONS
Suppose in a fete, every visitor is assured to win any one of the items listed below:
(1) Wall clock
(2) Toaster
(3) Electric Iron
(4) DVD player
Obviously, a visitor to the fete must carry a bag large enough to carry the largest item so that any one
of the items won by him can be held in that bag. With the same philosophy in mind, the designers of C
introduced a special variable called union which may hold objects of different sizes and types but one at
a time. Though a union is a variable and can hold only one item at a time, it looks like a structure. The
general format of a union is given below:
union <name> {
member 1
member 2
:
member n
};
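As an illustration of the general format, the sketch below declares a union with three members of different sizes. The union name item, its member names, and the function item_size are all hypothetical and are used only to show that the union is as large as its largest member and holds one member at a time.

```c
#include <stddef.h>

/* item (hypothetical): the 'bag' that can hold any one of the members */
union item
{
    int   clock_id;
    float iron_watts;
    char  dvd_title[16];
};

/* item_size (hypothetical): the size of the union is at least the size
   of its largest member (dvd_title in this case). */
size_t item_size(void)
{
    return sizeof(union item);
}
```

After union item u; the assignment u.clock_id = 7; stores an int in the union. A later assignment u.iron_watts = 750.0; reuses the same memory, so only the member stored most recently should be read back.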
1.16 FUNCTIONS
A function is a subprogram that can be defined by the user in his program. It is a complete program in
itself in the sense that its structure is similar to main() function except that the name main is replaced
by the name of the function. The general form of a function is given below:
<type> <name> (arguments)
where <type>: is the type of value to be returned by the function. If no value is returned, then
keyword void should be used.
<name>: is a user-defined name of the function. The function can be called from another
function by this name.
arguments: is a list of parameters (a parameter is the data that the function may receive when
called from another function). This list can be omitted by leaving the
parentheses empty.
The program segment enclosed within the opening brace and closing brace is known as the function
body. For example, the main function in C is written as shown below:
main()
{
}
Let us write a function tri_area() which calculates the area of a triangle. Two variables base and
height will be used to read the values from the keyboard. The function is given below:
float tri_area()
{
    float base, height, area;
    printf ("\n Enter Base and Height");
    scanf ("%f %f", &base, &height);
    area = (0.5)*base*height;
    return (area);
}
It may be noted here that the function tri_area() has defined its own variables. These are known
as local variables. The advantage of their usage is that once the execution of the function is over these
variables are disposed of by the system, freeing the memory for other things.
Let us write another function clear() which clears the screen of the VDU:
/* This function clears the screen by writing blank lines on the screen */
void clear()
{
    int i;
    for (i = 0; i <= 30; i++)
        printf ("\n");
}
Similarly, let us write a function add which receives two parameters x and y of type integer from the
calling function and returns the sum of x and y.
int add (int x, int y)
{
int temp;
temp = x + y;
return temp;
}
In the above function, we have used a local variable temp to obtain the sum of x and y. The value
contained in temp is returned through the return statement. In fact, the variable temp has been used
only to enhance the readability of the function; otherwise it is unnecessary. The function can be
optimized as shown below:
int add (int x, int y)
{
return (x + y);
}
It may be noted that a function prototype ends with a semicolon. In fact, the function prototype
is required for those functions which are intended to be called before they are defined. In the absence of
a prototype, the compiler cannot do the type checking of the arguments being sent to the function;
depending on the compiler, it either stops at the function call or assumes that the function returns an
int. Therefore, the main purpose of a function prototype is to help the
compiler in static type checking of the data requirement of the function.
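The role of the prototype can be seen in the following sketch, in which the function add() is called before it is defined. The function twice_sum is a hypothetical caller introduced for this illustration.

```c
/* Function prototype: ends with a semicolon and has no body */
int add (int x, int y);

/* twice_sum (hypothetical): calls add() before add() is defined.
   The prototype above lets the compiler check both arguments. */
int twice_sum (int a, int b)
{
    return 2 * add (a, b);
}

/* Function definition: appears after its first use */
int add (int x, int y)
{
    return (x + y);
}
```

If the prototype were removed, the compiler would meet the call add (a, b) without knowing the types of the parameters or of the returned value.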
main()
{
    char ch;
    int flag;
    float Area;
    clear(); /* Clear the screen */
    flag = 0;
    while (flag == 0)
    {
        Area = tri_area(); /* call tri_area to compute area */
        printf ("\n Area of Triangle = %5.2f", Area);
        :
    }
}
float tri_area()
{
    float base, height, area;
    printf ("\n Enter Base and Height");
    scanf ("%f %f", &base, &height);
    area = (0.5)*base*height;
    return (area);
}
/* This function clears the screen by writing blank lines on the screen */
void clear()
{
    int i;
    for (i = 0; i <= 30; i++)
        printf ("\n");
}
It may be noted here that when this program is executed, the first statement to be executed is the
first executable statement of the function main() of the program (i.e., clear() in this case). A function
will be executed only after it is called from some other function or itself.
The variables base, height, and area are local to function tri_area and they are not available in the
main() function. Similarly, the variable flag declared in function main() is also not available to func-
tions tri_area() and clear().
However, the variables declared outside any function can be accessed by all the functions. Such
variables are called global variables.
Let us consider the following program which declares the variables flag and area outside
the main function instead of declaring them inside the functions main() and tri_area (),
respectively.
#include <stdio.h>
/* Function prototypes */
void tri_area();
void clear();
/* Global Variables */
int flag;
float area;
main()
{
    char ch; /* Local Variable */
    clear(); /* Clear the screen */
    flag = 0;
    while (flag == 0)
    {
        tri_area(); /* call tri_area to compute area */
        printf ("\n Area of Triangle = %5.2f", area);
        printf ("\n Want to calculate again: Enter Y/N");
        fflush(stdin);
        ch = getchar();
        if (ch == 'N' || ch == 'n')
            flag = 1;
        clear();
    }
}
/* Function Definitions */
void tri_area()
{
    float base, height;
    printf ("\n Enter Base and Height");
    scanf ("%f %f", &base, &height);
    area = (0.5)*base*height;
}
/* This function clears the screen by writing blank lines on the screen */
void clear()
{
    int i;
    for (i = 0; i <= 30; i++)
        printf ("\n");
}
It may be noted in the above program that the variable area is available to both the functions
tri_area() and main(). On the other hand, the local variables base and height are available to the
function tri_area() only. They are available neither to the main() function nor to the function clear().
Similarly, variable i is also local to function clear().
In our previous programs, we have used our own function clear() to clear the screen. Some compilers,
such as Turbo C, also provide a function called clrscr() that clears the screen; it is not a part of
standard C. From now onwards, we will use this function. However, in order to use clrscr() in a
program we will have to include conio.h, a header file at the beginning of the program.
Note: We should not use global variables in our programs as they make the program or function
insecure and error prone.
The function big() can be called from the function main() by including the variables a and b as
the actual arguments of the function call as shown below.
main()
{
int a = 30;
int b = 45;
int c;
c = big (a, b);
printf ("\n The bigger = %d ", c);
}
Let us now write a program which reads the values of two variables a and b. The values of a and b
are sent as parameters to a function exchange(). The function exchanges the contents of the parameters
with the help of a variable temp. The changed values of a and b are printed.
#include <stdio.h>
#include <conio.h>
void exchange (int x, int y);
main()
{
    int a, b;
    clrscr(); /* clear screen */
    printf ("\n Enter two values");
    scanf ("%d %d", &a, &b);
    exchange(a,b);
    /* print the exchanged contents */
    printf ("\n The exchanged contents are: %d and %d", a, b);
}
void exchange (int x, int y)
{
    int temp;
    temp = x;
    x = y;
    y = temp;
}
The above program looks correct but it fails badly because variables a and b do not
get any values back from the function exchange(). The reason is that although all values of actual
parameters (i.e., a and b) have been passed on to formal parameters (i.e., x and y) correctly, the changes
done to the values in the formal parameters are not reflected back into the calling function.
Once the above program is executed for input values (say 30 and 45), the following output is
produced:
The exchanged contents are: 30 and 45
The reason for this behaviour is that at the time of function call, the values of actual parameters a and
b get copied into the memory locations of the formal parameters x and y of the function exchange().
It may be noted that the variables x and y have entirely different memory locations from variables a
and b. Therefore, any changes done to the contents of x and y are not reflected back to the variables a and b.
Thus, the original data remains unaltered. In fact, this type of behaviour is desirable from a pure function.
Since in this type of parameter passing, only the values are passed from the calling function to the
called function, the technique is called call by value.
(2) Call by reference: If the programmer desires that the changes made to the formal parameters
be reflected back to their corresponding actual parameters, then he should use call by reference
method of parameter passing. This technique passes the addresses or references of the actual
parameters to the called function. Let us rewrite the function exchange(). In fact, in the called
function, the formal arguments receive the addresses of the actual arguments by using pointers
to them.
For instance, in order to receive the addresses of actual parameters of int type the exchange ()
function can be rewritten as shown below:
void exchange (int *val1, int *val2)
{
int temp;
temp = *val1;
*val1 = *val2;
*val2 = temp;
}
It may be noted that the above function has used two pointers called val1 and val2 to receive the
addresses of two int type actual parameters. However, initially, these are dangling pointers, i.e., they
do not yet point to any variable, as shown below:
[Figure: the pointers val1 and val2, not yet pointing to any variable]
The exchange() function would be called from the main() function by sending the addresses of
the actual arguments as shown below:
exchange (&a, &b);
Note: Kindly refer to Chapter 5 before reading the text of this section.
After the exchange() function is called by the above statement, the addresses of the actual
parameters a and b would be assigned to pointers val1 and val2, respectively as shown below:
[Figure: val1 pointing to the variable a and val2 pointing to the variable b]
Since the function exchanges the contents of variables being pointed by the pointers val1 and
val2, the contents of variables a and b get exchanged which is the goal of this exercise.
After the above program is executed for input values (say 30 and 45), the following output is
produced:
The exchanged contents are: 45 and 30
Now, we can see that the program has worked correctly in the sense that changes made inside the
function exchange() have been available to corresponding actual parameters a and b in the function
main(). Thus, the address operator ‘&’ has done the required task.
The main advantage of call by reference method is that more than one value can be received back
from the called function.
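The following sketch illustrates how two values can be received back from a single function call. The function name min_max and its parameter names are hypothetical; the two inputs are passed by value and the two results are passed back through pointers.

```c
/* min_max (hypothetical): a and b are received by value, whereas the
   two results are stored at the addresses received in smaller and bigger. */
void min_max (int a, int b, int *smaller, int *bigger)
{
    if (a < b)
    {
        *smaller = a;
        *bigger = b;
    }
    else
    {
        *smaller = b;
        *bigger = a;
    }
}
```

The calling function would declare two variables, say int low, high; and make the call min_max (45, 30, &low, &high); after which low contains 30 and high contains 45.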
Note:
(1) As far as possible, we should use the ‘pass by value’ technique of parameter passing. If some value is to
be received back from the function, it should be done through a return statement of the function.
(2) A function call can use both the methods, i.e., call by value and call by reference simultaneously.
For example, the following is a valid function prototype:
void calc (int a, float *b, float *c);
(3) When actual parameters are passed from the calling function to the called function by call by
value method, the names of the corresponding dummy parameters could be the same or
different. However, it is suggested that they be named differently in order to make the program
more elegant.
(3) Array parameters: Arrays can also be passed as parameters to functions. This is always done
through call by reference method. In C, the name of the array represents the address of the first
element of the array and, therefore, during the function call only the name of the array is passed
from the calling function. For example, if an array xyz of size 100 is to be passed to a function
some() then the function call will be as given below:
some (xyz);
The formal parameters can be declared in the called function in three ways. The different ways are
illustrated with the help of a function some() that receives an array xyz of size 100 of integer type.
Method I:
In this method, the array is declared without subscript, i.e., of an unknown size as shown below:
void some (int xyz [ ])
{
}
Method III:
In this method, the operator * is applied on the name of the array indicating that it is a pointer.
void some (int *xyz)
{
}
All the three methods are equivalent and the function some() can be called by the following statement:
some (xyz);
Note:
(1) The arrays are never passed by value and cannot be used as the return type of a function.
(2) In C, the name of the array represents the address of the first element or the zeroth element of the
array. So when an array is used as an argument to a function, only the address of the array gets passed
and not the copy of the entire array. Hence, any changes made to the array inside the function are
automatically reflected in the main function. You do not have to use ampersand (&) with array name
even though it is a call by reference method. The ampersand (&) is implied in the case of array.
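The equivalence of the different declarations can be checked with the sketch below. Only the unsized form and the pointer form are shown in the text above; the sized form int xyz [100] is assumed here to be the remaining method. The body given to some() is also an assumption, chosen to show that changes made inside the function reach the caller's array.

```c
/* Three declarations of the same function; the compiler treats the
   parameter as a pointer to int in every case. */
void some (int xyz [ ]);        /* without subscript          */
void some (int xyz [100]);      /* with subscript (assumed)   */
void some (int *xyz);           /* as a pointer               */

/* Hypothetical body: stores 0, 1, 2, ... 99 in the caller's array */
void some (int xyz [ ])
{
    int i;
    for (i = 0; i < 100; i++)
        xyz[i] = i;
}
```

After int list[100]; and the call some (list); the elements of list in the calling function contain the values stored by the function, although no ampersand was used in the call.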
On the other hand, whenever a function returns non-integer values, the function must be prefixed
with the appropriate type. For example, a function xyz that returns a value of type float can be declared
as shown below:
float xyz (.....)
{ float a;
:
return a;
}
Similarly, a function ABC that returns a value of type char can be declared as shown below:
char ABC ( ..... )
{ char ch;
:
return ch;
}
(2) Call by reference: In this method of passing the structures to functions, the address of the
actual structure variable is passed. The address of the structure variable is obtained with the help
of the address operator ‘&’. Thus, the changes done to the contents of such a variable inside the
function are reflected back to the calling function. The program given below illustrates the usage of
the call by reference method.
This program reads the data of a student in the function main() and passes the structure by
reference to the function called print_data().
/* This program illustrates the passing of the structures by reference to
a function */
#include <stdio.h>
struct student
{
    char name [20];
    int age;
    int roll;
    char class [5];
};
/* Prototype of the function: Sob receives the address of a structure */
void print_data (struct student *Sob);
main()
{
    struct student stud;
    printf ("\n Enter the student data");
    printf ("\nName:"); fflush(stdin);
    gets (stud.name);
    printf ("\nAge:"); scanf("%d",&stud.age);
    printf ("\nRoll:"); scanf("%d",&stud.roll);
    printf ("\nClass:"); fflush(stdin);
    gets (stud.class);
    print_data(&stud); /*The structure is being passed by reference*/
}
void print_data (struct student *Sob)
{
    struct student *ptr;
    ptr = Sob;
    printf ("\n The student data..");
    printf ("\nName:"); fflush(stdout);
    puts (ptr->name);
    printf ("Age: %d",ptr->age);
    printf ("\nRoll: %d",ptr->roll);
    printf ("\nClass:"); fflush(stdout);
    puts (ptr->class);
}
It may be noted here that structures are usually passed by reference in order to prevent the over-
heads associated with the technique of passing structures by value. The overheads are extra memory
space and CPU time used to pass bigger structures.
(3) Returning structures from functions: It is possible to supply a structure as a return value
of a function. This can be done by specifying structure as the return type of the function.
This feature of C is helpful in a situation where a structure has been passed by value to a
function and the changes done to the contents of the structure are needed by the calling
function without disturbing the original contents. This concept is illustrated in the example
given below:
Example 5: Write a program that reads the data of a student whose structure is given in the figure
shown below. The structure is then passed by value to a function called change_data() where the
name of the student is capitalized. The changed contents of the student data are returned by the function
to the function main().
[Figure not recovered; surviving labels: EmpAdd, Desig, Emp_Code, Apt, Colony, State, No, FL]
Solution: We will use the function toupper() in this program to capitalize the name of the student
inside the function change_data(). The required program is given below:
/* This program illustrates the process of returning a structure from a
function */
#include <stdio.h>
#include <ctype.h>
struct student
{
char name [20];
int age;
int roll;
char class [5];
};
/* Prototypes of the functions */
void print_data (struct student *ptr);
struct student change_data (struct student Sob);
main()
{
struct student stud, newstud;
printf ("\n Enter the student data");
printf ("\nName:"); fflush(stdin);
gets (stud.name);
printf ("\nAge:");scanf("%d",&stud.age);
printf ("\nRoll:");scanf("%d",&stud.roll);
printf ("\nClass:"); fflush(stdin);
gets (stud.class);
/* call function to change data; stud is passed by value */
newstud = change_data(stud);
printf ("\n The changed data ..");
print_data(&newstud); /*The structure is being passed by reference*/
}
/* This function changes the data of a student */
struct student change_data (struct student Sob)
{
int i = 0;
while (Sob.name[i] != '\0')
{ /* Capitalize the letters */
Sob.name[i] = toupper ((unsigned char) Sob.name[i]);
i++;
}
return (Sob);
}
void print_data (struct student *ptr)
{
printf ("\n The student data..");
printf ("\nName:"); fflush(stdout);
puts (ptr->name);
printf ("Age: %d", ptr->age);
printf ("\nRoll: %d", ptr->roll);
printf ("\nClass:"); fflush(stdout);
puts (ptr->class);
}
1.17 RECURSION
We have already discussed iteration through the various loop structures of Section 1.10.3. Iteration
can be defined as the act of repeating a computation by the same method, where the result of each
repetition is used as the source of data for the next repetition of the loop. For example, the factorial
of an integer N can be computed by the following iterative loop:
fact = 1;
for (i = 1; i <= N; i++)
fact = fact * i;
printf ("\n The factorial of %d is %d", N, fact);
We can also use an alternative approach wherein we reduce the problem into a smaller instance of the
same problem. For example, factorial of N can be reduced to a product of N and the factorial of N − 1 as
given below:
$$\mathrm{Fact}(N) = N \times \mathrm{Fact}(N - 1)$$
In the above statement, a function Fact(N) has been defined in terms of Fact(N − 1), where
Fact(N − 1) is a smaller problem as compared to Fact(N). This type of definition is known as a recursive
definition. Recursion can be defined as the ability of a concept to be defined in terms of itself. In
programming terms, recursion is the ability of a function to call itself.
Now, by the same definition, the factorial of (N − 1) can be defined as given below:
$$\mathrm{Fact}(N - 1) = (N - 1) \times \mathrm{Fact}(N - 2)$$
We can continue this process till we end up with Fact(0) which is equal to 1, and is also the termi-
nating condition for this reduction process. The terminating condition for a recursive function is also
called the basic solution. For instance, the basic solution for a list of elements is that if the list contains
only one element then this element is largest as well as smallest element. Similarly, if a list contains only
one element then it is already sorted.
The process described above can be implemented by using a recursive function Fact() which is
defined in terms of itself as shown below:
$$
\mathrm{Fact}(N) =
\begin{cases}
1 & \text{if } N = 0 \text{ or } N = 1 \\
N \times \mathrm{Fact}(N - 1) & \text{otherwise}
\end{cases}
$$
In C such recursive functions can be implemented by making the function call itself. For example,
the recursive function Fact(N) can be defined as follows:
int fact (int N)
{
if (N == 0)
return (1);
else
return (N * fact (N-1));
}
It may be further noted that the function Fact is being called by itself but with parameter N being
replaced by N − 1. A trace of Fact(N) for N = 4 is given in Fig. 1.11.
[Fig. 1.11: trace of Fact(4), starting from the calling function; each line shows a call and the value it returns]
Fact (4) = 4 * Fact (3)    returns 24
Fact (3) = 3 * Fact (2)    returns 6
Fact (2) = 2 * Fact (1)    returns 2
Fact (1) = 1 * Fact (0)    returns 1
Fact (0) = 1               returns 1
Example 7: Write a program that uses recursive function fact (N) to compute the factorial of a given
number N.
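#include<stdio.h>
long int fact(int);
int main(void)
{
int num;
long int factorial; /* matches the return type of fact() */
printf ("\n This program computes the factorial of a number");
printf ("\nEnter the number :");
scanf ("%d",&num);
factorial = fact(num);
printf("\nThe factorial of %d is %ld", num, factorial);
return 0;
}
/* The recursive function defined above, with the return type of the prototype */
long int fact(int N)
{
if (N == 0)
return (1);
else
return (N * fact (N-1));
}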
Recursion can be used to write simple, short, and elegant programs. However, a recursive algorithm
requires a basic condition that must terminate the recursive calls. The absence of this condition would
result in infinite recursive calls. For example, in function fact the condition ‘N == 0’ is the required
terminating condition.
Note: Since recursion involves overheads such as CPU time and memory storage, it is suggested that
recursion should be used with care.
Solution: We can use the property that x^y is simply a product of x and x^(y−1). For example, 6^4 = 6 × 6^3.
The recursive definition of x^y is given below:
$$
\mathrm{Power}(x, y) =
\begin{cases}
1 & \text{if } y = 0 \\
x \times \mathrm{Power}(x, y - 1) & \text{otherwise}
\end{cases}
$$
Note: The character ‘^’ has been used to indicate power operator.
Example 9: Write a program that computes GCD (greatest common divisor) of given two numbers.
Solution: The greatest common divisor of two numbers can be computed by the algorithm given
below:
Algorithm gcd()
Steps
1. Find the larger of the two numbers and store larger in x and smaller in y.
2. Divide the x by y and store the remainder in rem.
3. If rem is equal to zero, then the smaller number (y) is the required GCD and stop.
The required program that uses the function gcd() is given below:
/* This program computes GCD of two numbers with the help of a function
gcd() .*/
#include<stdio.h>
int gcd(int,int);
void main()
{
int x,y,ans;
printf("\nEnter the integers whose gcd is to be found :");
scanf ("%d %d",&x, &y);
if (x>y)
ans = gcd(x,y);
else
ans = gcd(y,x);
printf("\nThe GCD is : %d", ans);
}
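The listing above calls gcd() but its definition does not appear in this text, and the algorithm stops at step 3. The following is a minimal sketch of a recursive gcd(), assuming the missing step repeats the process with y and rem in place of x and y (Euclid's method). The guard for y equal to zero is an addition of this sketch.

```c
/* Recursive GCD, a sketch based on steps 1 to 3 of Algorithm gcd().
   Assumption: when rem is non-zero, the process repeats with (y, rem). */
int gcd(int x, int y)
{
    int rem;

    if (y == 0)            /* guard added in this sketch: avoids division by zero */
        return x;

    rem = x % y;           /* step 2: remainder of x divided by y */
    if (rem == 0)          /* step 3: y is the required GCD */
        return y;

    return gcd(y, rem);    /* assumed step: repeat with the smaller pair */
}
```

With this definition, the call gcd(48, 18) made from main() evaluates gcd(18, 12), then gcd(12, 6), and returns 6.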
Example 10: Write a program that generates the first n terms of Fibonacci sequence by recursion. The
sequence is 0, 1, 1, 2, 3, 5, 8, ....
Solution: In a Fibonacci series each term (except the first two) is the sum of its two
immediate predecessors. The recursive definition of this sequence is given below:
$$
\mathrm{Fib}(n) =
\begin{cases}
0 & \text{if } n = 1 \\
1 & \text{if } n = 2 \\
\mathrm{Fib}(n - 1) + \mathrm{Fib}(n - 2) & \text{if } n > 2
\end{cases}
$$
Let us now write a program that uses a function Fib() to compute the first n terms of the series.
/* This program generates Fibonacci series*/
#include<stdio.h>
int fib(int);
void main()
{
int n;
int i,term;
printf( "\nEnter the terms to be generated: ") ;
scanf ("%d", &n);
for(i=1;i<=n;i++)
{
term = fib(i);
printf("%d ", term);
}
}
int fib(int n)
{
if (n==1)
return 0;
else
if (n==2)
return 1;
else
return(fib(n - 1) + fib(n - 2));
}
Direct recursion can be further divided into three categories: linear, binary and tail recursion.
Many more terms are in vogue, such as tree recursion, but we will discuss only the popular
forms of recursion given below:
A. Linear Recursion
When a recursive function has a simple repetitive structure and calls itself only once, it is called linear
recursion. For example, the recursive function power(), given below, is a linear recursive function.
int power ( int x, int y)
{
if ( y == 0 )
return ( 1 );
else
return (x * power ( x, y-1));
}
It may be noted that it checks the terminating condition and thereafter performs the single recursive
call to itself.
B. Binary Recursion
When a recursive function calls itself twice from within the function, it is called binary recursion.
For example, the function fib(), given below, calls itself twice, i.e., for fib(n − 1) and fib(n − 2).
// Function to return a fibonacci term
int fib(int n)
{
if (n==1)
return 0;
else
if (n==2)
return 1;
else
return(fib(n-1)+fib(n-2));
}
Another example of binary recursion is binary tree recursive operations, done for both left child and
right child sub trees.
C. Tail Recursion
When a recursive call in a recursive function is the last call without any pending operation then
the recursion is called tail recursion. Thus, the last result of the recursive call is the final result of the
function.
Consider the function fact(), given below:
long int fact(int n)
{
if (n==0)
return(1);
else
return(n*fact(n-1));
}
This function is not tail recursive because the last operation performed is not the recursive call but
the multiplication n * fact (n − 1), which is still pending when the recursive call returns.
However, the following modified function fact( ) is tail recursive.
long fact(int N, long result) /* result is long so that it matches the return type */
{
if (N <= 1) /* also terminates when N is 0 */
return result;
else
return fact(N-1, N * result); // Tail recursion
}
It may be noted that the last statement in the above function is a recursive call without any pending
operation. Hence the function is tail recursive. Since the argument ‘result’ is acting as an accumulator,
its initial value must be set to 1. Therefore, for computing factorial of a given number (say 5), the above
function must be called with the following statement:
Factorial = fact ( 5, 1);
Example 11: Write a program that computes the factorial of a given number by using the above given
tail recursive function fact().
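The solution to this example does not appear in this text. A minimal sketch is given below. The names fact_tail() and factorial() are hypothetical: fact_tail() plays the role of the tail recursive fact() above, and factorial() is a wrapper that supplies the initial accumulator value 1 so that the caller cannot pass a wrong one. The test N <= 1 is an assumption of this sketch that makes the function terminate for N = 0 as well.

```c
/* Tail recursive helper (hypothetical name): 'result' is the accumulator. */
long fact_tail(int n, long result)
{
    if (n <= 1)                            /* basic solution: 0! = 1! = 1 */
        return result;
    return fact_tail(n - 1, n * result);   /* tail call: nothing is pending */
}

/* Wrapper (hypothetical name) that sets the accumulator to 1. */
long factorial(int n)
{
    return fact_tail(n, 1L);
}
```

A main() function would read the number with scanf("%d", &num) and print the value of factorial(num) with the %ld format. For example, factorial(5) evaluates to 120.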
Example 13: Write a program that tests the tail recursive function of Example 12 for given values of x
and y.
Solution: The required program is given below:
# include <stdio.h>
long power(int x, int y,long result)
{
if (y == 0)
return result;
else
return power(x, y-1, x * result); // Tail recursion
}
int main(void)
{
long pow;
int x, y;
printf("\n Enter the x,y");
scanf ("%d %d", &x,&y);
pow = power(x,y,1);
printf("\n x ^ y =%ld",pow); /* %ld matches the long result */
return 0;
}
Example 14: Write both normal and tail recursive versions of a function sum(N) that adds the first N
integers. For example, sum(6) is computed as follows:
Sum(6) = 1 + 2 + 3 + 4 + 5 + 6
Solution: The normal recursive function is given below:
long sum (int N)
{
if (N == 1)
return 1;
else
return N + sum (N-1);
}
Example 15: Write a complete program that uses the tail recursive version of the function sum() to
compute the sum of the first N integers.
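The tail recursive version of sum() does not appear in this text. A minimal sketch is given below, following the same accumulator pattern as the tail recursive fact(). The helper name sum_tail() and the wrapper are assumptions of this sketch, as is the choice to return 0 for N <= 0.

```c
/* Tail recursive helper (hypothetical name): 'result' accumulates the sum. */
long sum_tail(int n, long result)
{
    if (n <= 0)                          /* nothing left to add */
        return result;
    return sum_tail(n - 1, result + n);  /* tail call: no pending addition */
}

/* Wrapper that starts the accumulator at 0. */
long sum(int n)
{
    return sum_tail(n, 0L);
}
```

Unlike the normal recursive version, no addition is left pending after the recursive call returns. For example, sum(6) evaluates to 21.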
A group of monks or priests is moving the disks from the source tower to the destination tower as per the
following rules:
1. At the beginning of creation, the disks were placed in decreasing size on the source tower.
2. The mandate is to move the disks to one of the other towers (say destination) with the help of a
third tower called middle.
3. Only one disk can be moved at a time, taken from the top of a tower.
4. A disk can never be placed on a smaller disk.
The legend predicts that as soon as all the disks are moved to the destination tower, the world will
come to an end, i.e., the priests will crumble into dust and the world will vanish.
This problem can be solved with the help of recursion. Assume there are N disks on the source
tower. Since on the destination tower, the bottommost disk needs to be the bottom most of the source
tower, it is necessary that we move N – 1 disks to the middle tower and thereafter move the largest (bot-
tommost) disk to the destination tower as shown in Figure 1.14.
Fig. 1.14 Moving N − 1 disks to the middle tower and the bottom most to final tower
Now repeat this exercise with N – 1 disks using source tower as the middle tower, and middle tower
playing the role of the source tower.
The above given steps can be written using a recursive algorithm as given below:
Algorithm towOfHan ( N, source, dest, middle )
{
if ( N == 1)
move a disk from source to dest
else
{ towOfHan ( N-1, source, middle, dest)
move a disk from source to dest
towOfHan ( N-1, middle, dest, source)
}
}
Example 16: Write a program that simulates the moves of Tower of Hanoi for a given number of disks,
say N.
Solution: The above given algorithm towOfHan() has been used to code the required program which
is given below:
// This program simulates the moves of tower of Hanoi
# include <stdio.h>
void movDisk(int source, int dest);
void towOfHan(int numDisk, int source, int dest, int mid);
int main (void)
{
int numDisk;
printf("\nHow many disks do you want to move?");
scanf ("%d", &numDisk);
printf("\n To move %d disks from tower 1 to tower 2 using tower 3 as the middle tower:",
numDisk);
towOfHan(numDisk, 1, 2, 3);
return 0;
}
void movDisk(int source, int dest)
{
printf("\n move a disk from %d to %d", source, dest);
}
void towOfHan(int numDisk, int source, int dest, int mid)
{
if (numDisk == 1)
movDisk(source, dest);
else
{
towOfHan(numDisk-1, source, mid, dest);
movDisk(source, dest);
towOfHan(numDisk-1, mid, dest, source);
}
}
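The recursive structure of towOfHan() also tells us how many moves are made. The sketch below uses a hypothetical function countMoves() that mirrors towOfHan() but counts the moves instead of printing them. For N disks the count works out to 2^N − 1.

```c
/* Counts the moves made by towOfHan() for numDisk disks.
   countMoves() is a hypothetical helper that mirrors the recursion. */
long countMoves(int numDisk)
{
    if (numDisk == 1)
        return 1;                          /* one call to movDisk() */

    return countMoves(numDisk - 1)         /* N-1 disks: source to middle */
         + 1                               /* largest disk: source to dest */
         + countMoves(numDisk - 1);        /* N-1 disks: middle to dest */
}
```

For example, 3 disks need 7 moves and 10 disks need 1023 moves. The count doubles (plus one) with every additional disk, so the simulation is practical only for small values of N.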
EXERCISES
1. Define the terms: token, keyword, identifier, variable, constant and const correct.
2. What is meant by basic data types? Explain in brief.
3. What would be the appropriate data type for the following?
• Length of a cloth
• Age of a student
• An exclamation mark
4. What is meant by a backslash character? What is the utility of these characters in a program?
5. What would be the output of the following program?
#include <stdio.h>
{
int k;
printf (“k = %d”, k );
}
i. an error ii. unpredictable iii. 34216 iv. 0
6. Write a program that inputs a value in inches and prints in centimeters.
7. Write a program that converts degree Celsius temperature into its degree Fahrenheit equivalent
using the following formula:
$$F = \frac{9}{5}\,C + 32$$
8. What is the purpose of ‘else’ clause in an ‘if ’ statement of C?
9. Write a program that computes a^b where a and b are of real and integer types, respectively.
10. Describe the function of break and continue statements in C.
11. Write a program that reverses and sums the digits of an integer number.
12. Write a program that prints the following output on the screen.
A
B B
C C C
D D D D
E E E E E
Menu
Option1 'a'
Option2 'b'
Option3 'c'
Option4 'd'
Enter your choice:
In response to user’s selection, it prints the following messages:
Option Message
1 'One is alone'
2 'Two is a company'
3 'Three is a crowd'
4 'Quitting'
14. Define the terms: arrays, subscript, subscripted variables, and strings.
15. Write a program that removes duplicates from a list.
16. Write a program that computes the sum of diagonal elements of a square matrix.
17. Write a C structure for the record structure given in figure shown below.
[Figure: record structure diagram. Labels recovered from the figure: Title, Author, Publishers, Cost, Edition, Name, Name, Address]
2.1 Overview
A normal desktop computer is based on Von Neumann architecture. It is also called a stored program
computer because the program’s instructions and its associated data are stored in the same memory as
shown in Figure 2.1.
The program instructions tell the system what operation is to be performed and on what data.
The above model is based on the fact that every practical problem that needs to be solved must have
some associated data. Let us consider a situation where a list of numbers is to be sorted in ascending
order. This problem obviously has the following two parts:
(1) The list: say 9, 8, 1, −1, 4, 2, 6, 15, 3
(2) A procedure to sort the given list of numbers.
It may be noted that the elements of the list and the procedure represent the data and the program
of the problem, respectively.
Now, the efficiency of the program would primarily depend upon the organization of data. When the
data is poorly organized, the program cannot access it efficiently. For instance, the time taken to open
the lock of your house depends upon where you have kept the key to it: in your hand, in your front
pocket, in the ticket pocket of your trousers or in the inner pocket of your briefcase. There may be a
variety of motives for selecting one or the other of these places to keep the key. A key in the hand or
in the front pocket is quickly accessible as compared to the other places, whereas a key placed in the
inner pocket of the briefcase is comparatively more secure. For similar reasons, it is also necessary
to choose a suitable representation or structure for data so that a program can efficiently access it.
[Fig. 2.1 The stored program computer: a CPU connected through a communication channel (bus) to a memory that holds the program instructions and the data]
[Fig. 2.2: the array List, locations 0 to 8]
List   9   8   1   −1   4   2   6   15   3
• Ease of retrieval: The data should be stored in such a manner that the program can easily access it.
As an array is an index–value pair, a data item stored in the array (see Figure 2.2) can be very
easily accessed through its index. For example, List [5] will provide the data item stored at
location 5, i.e. ‘2’. From this point of view, the array is a random-access structure.
• Operations allowed: The operations needed to be performed on the data must be allowed by the
representation.
For example, component by component operations can be done on the data items stored in the
array. Consider the data stored in List of Figure 2.2. For the purpose of sorting, the smallest
number must come to the head of the list. This shall require the following two operations:
(1) Visit every element (component) in the list to search the smallest and its position (‘−1’ at
location 3).
(2) Exchange the smallest with the number stored at the head of the list (exchange ‘−1’stored at
location 3 with ‘9’ stored at location 0). This can be easily done by using a temporary location
(say Temp) as shown below:
Temp = List[0];
List[0] = List[3];
List[3] = Temp;
List   −1   8   1   9   4   2   6   15   3
The operations 1 and 2, given above, can be repeated on the unsorted part of the List to obtain the
required sorted List.
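Repeating operations 1 and 2 on the unsorted part of the list can be sketched in C as given below. The function name selectionSort() and its parameters are assumptions of this sketch; the technique itself is commonly known as selection sort.

```c
/* Sorts list[0..n-1] in ascending order by repeating operations 1 and 2.
   selectionSort() is a hypothetical name used for this sketch. */
void selectionSort(int list[], int n)
{
    int head, i, pos, temp;

    for (head = 0; head < n - 1; head++)
    {
        /* Operation 1: visit every element of the unsorted part
           to find the position of the smallest */
        pos = head;
        for (i = head + 1; i < n; i++)
            if (list[i] < list[pos])
                pos = i;

        /* Operation 2: exchange the smallest with the element at the head
           of the unsorted part, using a temporary location */
        temp = list[head];
        list[head] = list[pos];
        list[pos] = temp;
    }
}
```

After each pass the sorted part at the front of the list grows by one element, so n − 1 passes are enough for a list of n elements.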
From above discussions, it can be concluded that an array provides a suitable representation for
the storage of a list of numbers. Such a representation is also called a data structure. Data structure is
a logical organization of a set of data items that collectively describe an object. It can be manipulated
by a program. For example, array is a suitable data structure for a list of data items.
It may be observed that if the array called ‘List’ represents the data structure then the operations
carried out in steps 1 and 2 define the algorithm needed to solve the problem. The overall performance
of a program depends upon the following two factors:
(1) Choice of right data structures for the given problem.
(2) Design of a suitable algorithm to work upon the chosen data structures.
[Figure: five parallel lists named nameList, roll, marks, percent and grade, each with locations 0 to 59]
char nameList[60][15];
int roll[60];
int marks[60];
float percent[60];
char grade[60];
Now in the above declaration, every list has to be considered as individual. The generation of merit
list would require the list to be sorted. The sorting process would involve a large number of exchange
operations. For each list, separate exchange operations would have to be written. For example, the
exchange between data of ith and jth student would require the following operations:
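char tempName[15],chTemp;
int i,j, temp;
float fTemp;
/* names are strings, so they are exchanged with strcpy() from <string.h> */
strcpy (tempName, nameList[i]);
strcpy (nameList[i], nameList[j]);
strcpy (nameList[j], tempName);
temp = roll[i];
roll[i] = roll[j];
roll[j] = temp;
temp = marks[i];
marks[i] = marks[j];
marks[j] = temp;
fTemp = percent[i];
percent[i] = percent[j];
percent[j] = fTemp;
chTemp = grade[i];
grade[i] = grade[j];
grade[j] = chTemp;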
Though the above code may work, it is a clumsy way of representing the data and requires a lot of
typing effort. In fact, as many as fifteen statements are required to exchange the data
of the ith student with that of the jth student.
Choice II: From the built-in data structures, the struct can be selected to represent the data of a
student as shown below:
struct student {
char name[15];
int roll;
int marks;
float percent;
char grade;
};
Similarly, from built-in data structures, an array called studList can be selected to represent a
list. However, in this case each location has to be of type student which itself is a structure. The required
declaration is given below:
struct student studList [60];
It may be noted that the above given array of structures is a user-defined data structure constructed
from two built-in data structures: struct and array. Now in this case, exchange between data of ith and
jth student would require the following operations:
int i,j;
struct student temp;
temp = studList[i];
studList [i] = studList [j];
studList [j] = temp;
Thus, only three statements are required to exchange the data of ith student with that of jth student.
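The three-statement exchange can be wrapped in a small function, as sketched below. The structure is repeated so that the fragment is self-contained; the function name exchange() is an assumption of this sketch.

```c
struct student {
    char name[15];
    int roll;
    int marks;
    float percent;
    char grade;
};

/* Exchanges the records at locations i and j of the list.
   exchange() is a hypothetical name. Structure assignment copies
   every member, including the name array, in one statement. */
void exchange(struct student list[], int i, int j)
{
    struct student temp;

    temp = list[i];
    list[i] = list[j];
    list[j] = temp;
}
```

A call such as exchange(studList, i, j) moves the name, roll, marks, percent and grade of both students together, so a new member added to the structure later needs no change in the exchange code.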
A comparison of the two choices discussed above establishes that Choice II is definitely better than
Choice I as far as the selection of data structures for the given problem is concerned. The number of lines
of code of Choice II is definitely less than the Choice I. Moreover, the code of Choice II is comparatively
easy and elegant.
From the above discussion, it can be concluded that choice of right data structures is of paramount
importance from the point of view of representing the data belonging to a problem.
(1) Linear data structures: These data structures represent a sequence of items. The elements follow a
linear ordering. For instance, one-dimensional arrays and linked lists are linear data structures. These can
be used to model objects like queues, stacks, chains (linked lists), etc., as shown in Figure 2.5.
[Fig. 2.5: linear data structures. Labels recovered from the figure: Top, Front, Rear, Queue, Head]
(2) Non-linear data structures: These data structures represent objects which are not in sequence
but are distributed in a plane. For instance, two-dimensional arrays, graphs, trees, etc. are non-
linear data structures. These can be used to model objects like tables, networks and hierarchy, as
shown in Figure 2.6.
The tree structure does not follow the first property of linear data structures and the graph does
not follow any property of the linear data structures. A detailed discussion on this issue is given later in
the book.
(5) Constant: It is the memory location that does not change its contents during the execution of a
program.
(6) Variable: It is the most fundamental aspect of any computer language and can be defined as a loca-
tion in the memory wherein a value can be manipulated, i.e., stored, accessed, and modified.
• It should be unambiguous in the sense that the logic should be crisp and clear.
previous contents will be lost. However, if we use the value of a memory location then the contents of
the memory do not change, i.e., it remains intact. For example, in the following statement, the value of
variables B and C are being used and their sum is stored in variable A
A = B + C
where the symbol ‘=’ is an assignment operator.
If previous values of A, B, C were 15, 45, 20, respectively then after execution of the above statement,
the new contents of A, B, C would be 65, 45, and 20, respectively.
Consider the following statement:
Total = Total + 100;
The abnormal looking statement is very much normal as far as computer memory is concerned. The
statement reveals the following fact:
“Old value of variable ‘Total’ is being added to an integer 100 and the result is being stored as new
value in the variable ‘Total’, i.e., replacing its old contents with the new contents.”
After the execution of the statement with old contents of ‘Total’ (say 24), the new contents of ‘Total’
will be 124.
2.3.1.1 Identifying Inputs and Outputs
It is very important to identify inputs and outputs of an
algorithm. The inputs to an algorithm mean the data to be provided to the target program through input
devices such as keyboard, mouse, input file, etc. The output of the algorithm means the final data to be
generated for the output devices such as monitor, printer, output file, etc.
The inputs and outputs have to be identified at the beginning of the program development. The
most important decision is to differentiate between major and minor inputs and outputs. The minor
inputs are sometimes omitted at the algorithm development stage. The other important issue is to find at
which part of the program the inputs would be required and the output generated.
Example 1: Let us reconsider the problem of sorting a given list of numbers. In this case, the required
inputs are:
(1) List of numbers
(2) Size of list
(3) Type of numbers to be stored in the list
(4) In which order to be sorted
The outputs are:
(1) The sorted list
(2) Message for asking the size of the list
(3) Message for displaying the output list
The messages are the minor outputs.
After identifying the inputs and the outputs, the additional information must be specified before
writing the final algorithm.
Input specifications
• In what order and format will the input values be read?
• What are the lower and upper limits of the input values? For example, the size of the list should
not be less than zero and its type has to be necessarily an integer.
• How do we know that there is no more input to be read, i.e., what marks the end of the list?
Output specifications
• In what order and format will the output values be produced?
• What types of values are to be produced and with what spacing, i.e., decimal spacing, significant digits, etc.?
• What main headings and column headings are to be printed on the output page?
10 20 30 40 50 60 70 80 90 100
The problem at hand is “How to print the table”. The first version of the algorithm is given below:
Algorithm Print_Table Ver1.
{
Step 1. Print table
}
When we look at the table, we find that the table is nothing but a collection of rows and we have to
print only 10 rows. So, the problem has been reduced to printing a row 10 times. The second refined
version of the algorithm is given below:
Algorithm Print_Table Ver2.
{
Step
1. Row = 1
2. Print the row
3. Row = Row + 1
4. If Row <= 10 repeat steps 2 and 3
5. Stop
}
Now our focus is on “how to print a row”. A closer look at the problem suggests that a row is nothing
but a collection of columns and we have to print only 10 columns for each row. The third refined version
of the algorithm is given below:
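The third version of the algorithm does not appear in this text. The refinement can be sketched directly in C as given below. It is assumed here, from the row 10 20 ... 100 shown above, that the table is the 10 × 10 multiplication table, so that the entry in a given row and column is row * col. The names printRow() and printTable() are hypothetical.

```c
#include <stdio.h>

/* Prints one row of the table: 10 columns, each entry being row * col.
   Assumption: the table is the multiplication table. */
void printRow(int row)
{
    int col;

    for (col = 1; col <= 10; col++)
        printf("%4d", row * col);
    printf("\n");
}

/* Prints the 10 rows of the table and returns the number of rows printed. */
int printTable(void)
{
    int row;

    for (row = 1; row <= 10; row++)
        printRow(row);
    return row - 1;
}
```

Each level of refinement maps onto one function: printTable() corresponds to "print 10 rows" and printRow() to "print 10 columns of a row".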
(1) Sequence structure: The statements or statement blocks written within a pair of curly braces ({,})
in an algorithm are executed by the computer sequentially, in the order in which they are written.
Example 3: Write an algorithm that computes the area of a triangle.
Solution: The solution is trivial. The inputs are base and height. The following formula will be used for
the computation of the area of the triangle:
$$\text{Area} = \tfrac{1}{2} \times \text{base} \times \text{height}$$
where Area is the output to be displayed. The simple sequence structure required for this problem is
given below:
Algorithm AreaTriangle ()
{
Step
1. Read base, height;
2. Area = 0.5 * base * height;
3. Print Area;
4. Stop
}
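Step 2 of the above algorithm can be written in C as sketched below. The function name areaTriangle() is taken from the algorithm; the parameter and return types are assumptions of this sketch.

```c
/* Computes the area of a triangle from its base and height (step 2). */
double areaTriangle(double base, double height)
{
    return 0.5 * base * height;
}
```

Steps 1 and 3 would read base and height with scanf() and print the returned value with printf(). For example, a base of 4 and a height of 3 give an area of 6.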
(3) Iteration: The iteration structure is used to repeatedly execute a statement or a block of statements.
The loop is repeatedly executed until a certain specified condition is satisfied.
Example 5: Write an algorithm that finds the largest number from a given list of numbers.
Solution:
Input: List of numbers, size of list, prompt for input.
Type of numbers: integer.
Output: Display the largest number and a message in this regard.
The most basic solution to this problem is that if there is only one number in the list then that number is
the largest. Therefore, the first number of the list is taken as the largest. The largest is then compared
with the rest of the numbers in the list. If a number is found to be larger than the current largest, it is
stored in the largest. This process is repeated till the end of the list. A For loop can be employed for the iteration.
The concerned algorithm is given below:
Algorithm Largest
{
Step
1. Prompt “Enter the size of the list”;
2. Read N;
3. Prompt “Enter a Number”;
4. Read Num;
5. Largest = Num; /* The first number is the largest*/
6. For (I = 2 to N)
{
6.1 Prompt “Enter a Number”
6.2 Read Num
6.3 If (Num > Largest) Largest = Num
}
7. Prompt “The largest Number =”
8. Print Largest;
9. Stop
}
It may be noted here that if a loop terminates after a number of iterations, then the loop is called a
finite loop. On the other hand if a loop does not terminate at all and keeps on executing endlessly, then
it is called an infinite loop.
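The algorithm Largest can be sketched in C as given below. To keep the fragment short, it is assumed that the numbers are already stored in an array instead of being read one at a time from the keyboard; the function name largest() and its parameters are hypothetical.

```c
/* Returns the largest number in list[0..n-1].
   Assumption: n is at least 1, i.e., the list is not empty. */
int largest(const int list[], int n)
{
    int i;
    int big = list[0];          /* the first number is taken as the largest */

    for (i = 1; i < n; i++)     /* finite loop: ends after n - 1 iterations */
        if (list[i] > big)
            big = list[i];

    return big;
}
```

The loop is finite because the counter i moves towards n in every iteration. Omitting the increment would turn it into an infinite loop.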
(4) Use of accumulators and counters: Accumulator is a variable that is generally used to compute sum
or product of a series. For example, the sum of the following list of numbers can be computed by initial-
izing a variable (say ‘Total’) to zero.
4 8 9 40 32 12 21 1 9
The 1st number of the list is added to Total and then the 2nd, 3rd and so on till the end of the list.
Finally, the variable Total would get the accumulated sum of the list given above.
Example 6: Write an algorithm that reads a list of 50 integer values and prints the total of the list.
Solution: Let us assume that a variable called ‘sum’ would be used as an accumulator. A variable called
Num would be used to read the numbers from the input list one at a time. A For loop will be employed to do
the iteration.
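The algorithm for this example does not appear in this text. The use of an accumulator can be sketched in C as given below, assuming that the values are already stored in an array; the function name sumList() and its parameters are hypothetical.

```c
/* Returns the total of list[0..n-1] using an accumulator. */
long sumList(const int list[], int n)
{
    long sum = 0;               /* accumulator initialized to zero */
    int i;                      /* counter */

    for (i = 0; i < n; i++)
        sum = sum + list[i];    /* add the next number of the list */

    return sum;
}
```

For the list 4, 8, 9, 40, 32, 12, 21, 1, 9 given above, the accumulated sum is 136. For the 50 values of this example the function would be called with n equal to 50.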
Example 8: Write a program that computes the sum of first N multiples of an integer called ‘K’, i.e.
sumSeries = 1*K + 2*K + 3*K + … + N*K
Solution: An accumulator called sumSeries would be used to compute the required sum of the series.
As the number of steps are known (i.e., N), ‘For loop’ is the most appropriate for the required computa-
tion. The index of the loop will act as the counter.
Input: N, K
Output: sumSeries
The program is given below:
/* This program computes the sum of a series */
#include <stdio.h>
int main(void)
{
int sumSeries, i;
int N; /* Number of terms */
int K; /* The integer multiple*/
printf("\n Enter Number of terms and Integer Multiple");
scanf ("%d %d", &N,&K);
sumSeries = 0; /*Accumulator initialized */
for (i=1; i<=N; i++) /* the loop index i acts as the counter */
{
sumSeries = sumSeries + i*K;
}
printf ("\n The sum of first %d terms = %d", N, sumSeries);
return 0;
}
In fact, it can be concluded that irrespective of the programming paradigms, a general algorithm is
designed based on following constructs:
(1) Sequence structure: Statements are written one after another in sequence in a program.
(2) Selection: Based on a decision (i.e., if-else), a section of statements is executed.
(3) Iteration: A group of statements is executed within a loop.
(4) Function call: At a certain point in program, the statement may call a function to perform a
particular task.
It is the responsibility of the programmer to use the right construct at right place in the algorithm so
that an efficient program is produced. A discussion on analysis of algorithms is given in next section.
As a first step, it is understandable that an algorithm would be fast for a small input data. For
example, the search for an item in a list of 10 elements would definitely be faster as compared to
searching the same item within a list of 1,000 elements. Therefore, the execution time T of an algo-
rithm has to be dependent upon N, the size of input data. Now the question is how to measure the
execution time T. Can we exactly compute the execution time? Perhaps not because there are many
factors that govern the actual execution of program in a computer system. Some of the important
factors are listed below:
• The speed of the computer system
Thus, it is better that we compare the algorithms based on their relative time instead of actual execu-
tion time.
The amount of time taken for completion of an algorithm is called the time complexity of the
algorithm. It is a theoretical estimate of how the execution time T(n) grows for large values
of n, the input size. The time complexity is expressed in Big-Oh notation. A brief discussion on
this notation is given in the next section.
(1) Simple statement: Let us assume that a statement takes a unit time to execute, i.e., 1. Thus, T(n) = 1
for a simple statement. To satisfy the Big-Oh condition, the above relation can be rewritten as shown
below:
$$T(n) \le 1 \times 1 \quad \text{where } c = 1 \text{ and } n_0 = 0$$
Comparing the above relation with relation 2.1, we get g(n) = 1. Therefore, the above relation can be
expressed as:
$$T(n) = O(1)$$
(2) Sequence structure: The execution time of sequence structure is equal to the sum of execution time
of individual statements present within the sequence structure.
Consider the following algorithm. It consists of a sequence of statements 1 to 4:
Algorithm AreaTriangle
{
Step
1. Read base, height;
2. Area = 0.5 * base * height;
3. Print Area;
4. Stop
}
The above algorithm takes four steps to complete.
Thus, T(n) = 4.
To satisfy the relation 2.1, the above relation can be rewritten as:
T(n) <= 4*1 where c = 4 and n0 = 0.
Comparing the above relation with relation 2.1, we get g(n) = 1. Therefore, the above relation can be
expressed as:
T(n) = O(1)
We can say that the time complexity of the above algorithm is O(1), i.e., of the order 1.
(3) The loop structure: The loop structure iteratively executes the statements written within its bound.
Let us assume that a loop executes for N iterations and contains only simple statements.
Consider the algorithm Count_zeros of Example 7, which contains a loop. This algorithm takes
the following steps for its completion:
No. of simple steps = 4 (Steps 1, 3, 4, 5)
No. of loops of steps 1 to N = 1
No. of statements within the loop = 3 (1 comparison, 1 addition, 1 assignment)
Thus, T(n) = 3*N + 4
Now for N >= 4, we can say that
3*N + 4 <= 3*N + N = 4*N
Therefore, we can say that
T(n) <= 4*N where c = 4, n0 = 4
T(n) = O(N)
Hence, the given algorithm has time complexity of the order of N, i.e., O(N). This indicates that the term
4 has negligible contribution in the expression 3*N + 4.
Note: In case of nested loops, the time complexity depends upon the counters of both outer and
inner loops. Consider the nested loops given below. This program segment contains ‘S’, a sequence of
statements within nested loops I and J.
For (I = 1; I <= N; I++)
{
For (J = 1; J <= N; J++)
{
S;
}
}
88 Data Structures Using C
It may be noted that for each iteration of the I loop, the J loop executes N times. Therefore, for N iterations
of the I loop, the J loop would execute N*N = N^2 times. Accordingly, the statement S would also execute
N^2 times.
Thus, T(n) = N^2
To satisfy the relation 2.1, the above relation can be rewritten as shown below:
T(n) <= 1 * N^2 where c = 1 and n0 = 0.
Comparing the above relation with relation 2.1, we get g(n) = N^2. Therefore, the above relation can be
expressed as:
T(n) = O(N^2)
Hence, the given nested loop has time complexity of the order of N^2, i.e., O(N^2).
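The N^2 count can also be checked by running the loop. The following is a minimal sketch in C, in which the function name countSquare is an assumption made for illustration and a counter stands in for the statement S:

```c
/* Counts how many times the statement S would execute in the
   nested loop where J runs from 1 to N for every value of I. */
long countSquare(int n)
{
    long count = 0;
    for (int i = 1; i <= n; i++)
    {
        for (int j = 1; j <= n; j++)
        {
            count++;            /* stands in for S */
        }
    }
    return count;
}
```

For N = 10, the function returns 100, i.e., N^2.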
Example 9: Compute the time complexity of the following nested loop:
For (I = 1; I <= N; I++)
{
For (J = 1; J <= I; J++)
{
S;
}
}
Solution: In this nested loop structure, the J loop executes I times during the Ith iteration of the I loop,
i.e., in an increasing order from 1 to N. The total number of executions of S is given below:
1 + 2 + 3 + 4 + ... + (N − 1) + N = N*(N + 1)/2 = N^2/2 + N/2.
Thus, T(n) = N^2/2 + N/2.
Now since N^2 >= N/2 for N >= 1/2, we can say that:
N^2/2 + N/2 <= N^2/2 + N^2 = 3*N^2/2.
Therefore, we can say that
T(n) <= 3*N^2/2 where c = 3/2, n0 = 1/2
T(n) = O(N^2)
Hence, the given algorithm has time complexity of the order of N^2, i.e., O(N^2). This indicates that
the term N/2 has negligible contribution in the expression N^2/2 + N/2.
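The sum 1 + 2 + ... + N can be confirmed in the same way. The sketch below is only an illustration; the name countTriangular is hypothetical, and the counter again stands in for the statement S:

```c
/* Counts the executions of S when the inner loop runs
   from 1 to I for each value of I from 1 to N. */
long countTriangular(int n)
{
    long count = 0;
    for (int i = 1; i <= n; i++)
    {
        for (int j = 1; j <= i; j++)
        {
            count++;            /* stands in for S */
        }
    }
    return count;
}
```

For N = 4, the function returns 10, which agrees with N*(N + 1)/2.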
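(4) If-then-else structure: Let the ‘then’ part have time complexity T1 and the ‘else’ part have time
complexity T2. The time complexity of the if-then-else structure is taken to be the maximum of the two,
i.e., max(T1, T2).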
Example 10: Consider the following algorithm. Compute its time complexity.
If (x > y)
{
x = x+1;
}
Else
{
For (i = 1; i <= N; i++)
{
x = x+i;
}
}
Solution: It may be noted that in the above algorithm, the time complexity of ‘then’ and ‘else’ part are
O(1) and O(N), respectively. The maximum of the two is O(N). Therefore, the time complexity of the
above algorithm is O(N).
Example 11: Compute the time complexity of the following relations:
(1) T(n) = 2527
(2) T(n) = 8*n + 17
(3) T(n) = 15*N^2 + 6
(4) T(n) = 5*N^3 + N^2 + 2*N
Solution:
(1) T(n) = 2527 <= 2527*1, where c = 2527 and n0 = 0.
= O(1)
Ans. The time complexity = O(1)
(2) T(n) = 8*n + 17 <= 8*n + n = 9*n for c = 9 and n0 = 17
= O(N)
Ans. The time complexity = O(N)
(3) T(n) = 15*N^2 + 6 <= 15*N^2 + N for N >= 6
Now since N <= N^2, we can rewrite the above relation as given below:
15*N^2 + N <= 15*N^2 + N^2 = 16*N^2 for c = 16 and n0 = 6
Thus, T(n) = O(N^2)
Ans. The time complexity = O(N^2)
(4) T(n) = 5*N^3 + N^2 + 2*N <= 5*N^3 + N^3 + 2*N^3 = 8*N^3 for c = 8 and n0 = 1
Thus, T(n) = O(N^3)
Ans. The time complexity = O(N^3)
Note: All the relations given in Example 11 were polynomials. From the answers, we can
say that any polynomial has the time complexity of Big-Oh of its leading term with a coefficient of 1.
It is also understandable that Big-Oh does not provide the exact execution time. Rather, it
gives only an asymptotic upper bound of the execution time of an algorithm.
For different types of functions, the growth rates for different values of input size n have been
tabulated in Table 2.1.

Table 2.1 Growth rates of functions
n       lg n    n lg n    n^2
1       0       0         1
2       1       2         4
16      4       64        256
256     8       2048      65536
1024    10      10240     1048576
4096    12      49152     16777216

It can be observed from Table 2.1 that, for a given input size, an algorithm with growth rate O(lg n) is
faster than one with O(n lg n). For the same input size, an algorithm with growth rate O(n^2) is the slowest.
Later in the book, for various algorithms, timing estimates will be included using Big-Oh notation.
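The entries of Table 2.1 can be reproduced with a few lines of C. This is a sketch under the assumption that n is a power of 2, as it is in the table; the helper names ilog2, nLogN and nSquared are hypothetical:

```c
/* Integer base-2 logarithm: the number of times n can be halved
   before it reaches 1 (exact when n is a power of 2). */
unsigned ilog2(unsigned long n)
{
    unsigned k = 0;
    while (n > 1)
    {
        n = n / 2;
        k++;
    }
    return k;
}

/* The n lg n column of the table. */
unsigned long nLogN(unsigned long n)
{
    return n * ilog2(n);
}

/* The n^2 column of the table. */
unsigned long nSquared(unsigned long n)
{
    return n * n;
}
```

Calling these functions with n = 4096 gives 12, 49152 and 16777216, matching the last row of the table.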
EXERCISES
18. Write an algorithm that computes GCD and LCM of two given numbers.
19. Explain the selection (conditional control) structures with the help of examples.
20. Write an algorithm that determines whether a number is even or odd.
21. Write an algorithm that computes profit or loss incurred by a salesperson.
22. What is iteration?
23. Write an algorithm that sorts a given list of numbers in ascending order.
24. Write an algorithm that determines whether a number is prime or not.
25. What is the use of accumulators and counters?
26. Write an algorithm that computes average marks of 60 students of a class.
27. Write an algorithm that finds the number of odds and evens present in a given list of N numbers.
28. What are the parameters on the basis of which an algorithm can be analyzed?
29. Define the term ‘time complexity’. How can the time complexity of a given algorithm be found?
30. Explain Big-Oh notation with the help of examples.
31. Compute the time complexity of the following relations:
i. T(n) = 410
ii. T(n) = 4*n − 12
iii. T(n) = 13*N^2 − 7
iv. T(n) = 3*N^3 + N^2 + 4*N
32. What is the time complexity of the following loop?
for (int i = 0; i < n; i++)
{
statement;
}
33. What is the Big-Oh complexity of the following nested loop?
i. for (int i = 0; i < n; i++)
{
for (int j = 0; j < n; j++)
{
statement;
}
}
ii. for (int i = 0; i < n; i++)
{
for (int j = i + 1; j < n; j++)
{
statement;
}
}
iii. for (int i = 0; i < n; i++)
{
for (int j = n; j > i; j--)
{
statement;
}
}
CHAPTER OUTLINE
• 3.1 Introduction
3.1 INTRODUCTION
An array is a data structure with the help of which a programmer can refer to and perform operations on
a collection of data items of the same type, such as simple lists or tables of information. For example, a list
of names of ‘N’ number of students of a class can be grouped under a common name (say studList). This list
can be easily represented by an array called studList for ‘N = 45’ students as shown in Figure 3.1.
[Figure 3.1: The array studList with locations indexed 0, 1, 2, ..., 43, 44]
In fact, the studList shown in Figure 3.1 can be looked upon from the following two points of view:
(1) It is a linear list of 45 names stored in contiguous locations, i.e., an abstract view of a list having
a finite number of homogeneous elements (i.e., 45 names).
(2) It is a set of memory locations numbered 0 to 44 sharing a common name called studList, i.e., an array
data structure in which 45 names have been stored.
It may be noted that all the elements in the array are of the same type, i.e., string of char in this case.
The individual elements within the array are designated by an integer called the index, and any element
can be accessed directly (randomly) through its index.
For instance, the zeroth element (Sagun) in the list can be referred to as studList[0] and the 43rd
element (Ridhi) as studList[43], where 0 and 43 are the indices. An index has to be of type integer.
An index is also called a subscript. Therefore, individual elements of an array are called subscripted
variables. For instance, studList [0] and studList [43] are subscripted variables.
From the above discussion, we can define an array as a finite ordered collection of items of the same
type. It is a set of (index, value) pairs.
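The idea of subscripted variables can be sketched in C as follows. The array size of 45 and the two names come from the discussion above; the helper names fillStudList and nameAt, and the choice of storing the names as pointers to strings, are assumptions made only for this illustration:

```c
#include <stddef.h>

#define CLASS_SIZE 45

/* One location per student; valid indices are 0 to CLASS_SIZE - 1.
   Locations that have not been filled hold NULL. */
static const char *studList[CLASS_SIZE];

/* Stores the two names mentioned in the text at their indices. */
void fillStudList(void)
{
    studList[0] = "Sagun";      /* zeroth element */
    studList[43] = "Ridhi";     /* element at index 43 */
}

/* Returns the name stored at the given index,
   or NULL if the index lies outside the array. */
const char *nameAt(int index)
{
    if (index < 0 || index >= CLASS_SIZE)
        return NULL;
    return studList[index];
}
```

The range check in nameAt is a design choice of this sketch; C itself does not check array bounds.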
An array is a built-in data structure in most programming languages. Arrays are designed to have
a fixed size. Some languages provide zero-based indexing whereas other languages provide one-based
indexing. ‘C’ is an example of a zero-based indexing language because the index of its arrays starts from 0.
Pascal programs conventionally use one-based indexing because the array bounds are declared explicitly
and usually start from 1.
An array whose elements are specified by a single subscript is known as one-dimensional array. The array
whose elements are specified by two or more than two subscripts is called as multi-dimensional array.
It may be noted that in the above program, component-by-component processing has been done on
all the elements of the array called marksList.
Arrays: Searching and Sorting 95
3.2.1 Traversal
It is an operation in which each element of a list, stored in an array, is visited. The travel proceeds from
the zeroth element to the last element of the list. Processing of the visited element can be done as per the
requirement of the problem.
Example 2: Write an algorithm called ‘listTravel’ that travels a list stored in an array called List of
size N. Each visited element is operated upon by an operator called OP.
Solution: The For-loop can be employed to visit and apply the operator OP on every element of the array.
The required algorithm is given below:
Algorithm listTravel()
{
Step
1. For (i = 0; i < N; i++)
{
1.1 List[i] = OP(List[i]);
}
}
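One way to express this algorithm in C is to pass the operator as a function pointer. The following is a minimal sketch, not the book's program; the parameter op plays the role of OP, and doubleIt is a made-up sample operator:

```c
/* Visits every element of list[0..n-1] in index order and
   replaces it with the result of applying op to it. */
void listTravel(int list[], int n, int (*op)(int))
{
    for (int i = 0; i < n; i++)
    {
        list[i] = op(list[i]);
    }
}

/* A sample operator (hypothetical): doubles its argument. */
int doubleIt(int x)
{
    return 2 * x;
}
```

A call such as listTravel(List, N, doubleIt) would double every element of the list in a single pass.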
Example 3: A list of N integer numbers is given. Write a program that travels the list and determines
how many of its elements are less than zero, zero, and greater than zero.
Solution: In this program, three counters called numNeg, numZero and numPos would be used to keep
track of the number of elements of the list that are less than zero, zero, and greater than zero. The
required program is given below:
/* This program determines the number of less than zero, zero, and greater
   than zero numbers present in a list */
#include <stdio.h>
#define MAX 30                  /* capacity of the list */
int main(void)
{
    int List[MAX];
    int N;
    int i, numNeg, numZero, numPos;
    printf("\n Enter the size of the list (<= %d)", MAX);
    if (scanf("%d", &N) != 1 || N < 0 || N > MAX)
        return 1;               /* reject a size the array cannot hold */
    printf("Enter the elements one by one");
    /* Read the List */
    for (i = 0; i < N; i++)
    {
        printf("\n Enter Number:");
        scanf("%d", &List[i]);
    }
    numNeg = 0;                 /* Initialize counters */
    numZero = 0;
    numPos = 0;
    /* Travel the List */
    for (i = 0; i < N; i++)
    {
        if (List[i] < 0)
            numNeg = numNeg + 1;
        else if (List[i] == 0)
            numZero = numZero + 1;
        else
            numPos = numPos + 1;
    }
    /* Display the counts */
    printf("\n Less than zero: %d, zero: %d, greater than zero: %d",
           numNeg, numZero, numPos);
    return 0;
}
3.2.2 Selection
An array allows selection of an element for a given (random) index. Therefore, the array is also called
a random access data structure. The selection operation is useful in a situation wherein the user wants
to query the contents of a list for a given position.
Example 4: Write a program that stores a merit list in an array called ‘Merit’. The index of the
array denotes the merit position, i.e., 1st, 2nd, 3rd, etc., whereas the contents of the various array
locations represent the percentage of marks obtained by the candidates. On the user’s demand, the program
displays the percentage of marks obtained for a given position in the merit list as per the following
format:
Position: 4
Percentage: 93.25
Solution: An array called ‘Merit’ would be used to store the given merit list. Two variables, pos and
percentage, would be used to store the merit position and the percentage of a candidate, respectively. A
do-while loop would be used to interact iteratively with the user through the following menu:
Menu:
Query----1
Quit------2
Enter your choice
Note: As per the demand of the program, we will leave the zeroth location as unused and fill the array
from index 1.
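The selection step at the heart of this example can be sketched as a small C function. This is an illustrative fragment under stated assumptions, not the complete menu-driven program: the function name queryMerit, its parameters and the convention of returning 1 or 0 are all hypothetical.

```c
/* Looks up the percentage stored at merit position pos (1-based).
   merit[0] is deliberately unused, so positions 1..count are valid.
   Returns 1 and stores the value in *percentage on success,
   or returns 0 if pos is not a valid position. */
int queryMerit(const float merit[], int count, int pos, float *percentage)
{
    if (pos < 1 || pos > count)
        return 0;
    *percentage = merit[pos];   /* direct (random) access by index */
    return 1;
}
```

Inside the do-while loop, the ‘Query’ choice would read pos from the user, call this function, and print the position and percentage in the required format.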
Searching and sorting are very important activities in a data processing environment and they are
discussed in detail in the subsequent sections.
3.2.3 Searching
There are many situations where we want to find out whether a particular item is present in a list or not.
For instance, in a given voter list of a colony a person may search his name to ascertain whether he is a
valid voter or not. For similar reasons, passengers look for their names in the railway reservation lists.
Note: System programs extensively search symbols, literals, mnemonics, ‘compiler and assembler’
directives, etc.
In fact, search is an operation in which a given list is searched for a particular value, and the location
of the searched element, if found, is reported. Search can be precisely defined as the activity of looking
for a value or item in a list.
A list can be searched sequentially wherein the search for the data item starts from the beginning
and continues till the end of the list. This simple method of search is also called linear search. It may be
noted that for a list of size 1,000, the worst case is 1,000 comparisons.
Let us consider a situation wherein we are interested in searching a list of numbers called ‘numList’
for an element having value equal to the contents of a variable called val. It is desired that the location
of the element, if found, be displayed.
The list of numbers can be comfortably stored in an array called numList of type int. To find the
element, the list would be travelled in such a manner that each visited element is compared to the
variable ‘val’. If a match is found, then the position of the matching element would be stored in a variable
called Pos. As the number of elements in the list is known, a For-loop would be used to travel the array.
Algorithm searchList()
{
Step
1. read numList
2. read val
3. Pos = −1 ‘initialize Pos to a non-existing position’
4. for (i = 0; i < N; i++)
{
4.1 if (val == numList[i])
Pos = i;
}
5. if (Pos != −1)
print Pos.
}
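A compact C rendering of this search is sketched below. Note one deliberate difference, which is an assumption of this sketch rather than part of the algorithm above: the function returns as soon as the first match is found, whereas the algorithm scans the whole list and therefore reports the last matching position.

```c
/* Linear search: returns the index of the first element of
   numList[0..n-1] equal to val, or -1 if val is not present. */
int searchList(const int numList[], int n, int val)
{
    for (int i = 0; i < n; i++)
    {
        if (numList[i] == val)
            return i;           /* stop at the first match */
    }
    return -1;                  /* a non-existing position */
}
```

In the worst case (val absent or stored at the end), all n elements are compared, which is why the search is O(N).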
The above discussed search through a list, stored in an array, has the following characteristics:
• The search is linear.
• The search starts from the first element and continues in a sequential fashion from element to element.
Thus, the linear search is slow and to some extent inefficient. In special circumstances, faster
searches can be applied.
For instance, binary search is a faster method as compared to linear search. It mimics the process
of searching a name in a directory wherein one opens a page in the middle of the directory and
examines the page for the required name. If it is found, the search stops; otherwise, the search is
applied either to first half of the directory or to the second half.
3.2.3.1 Binary Search If a list is already sorted, then the search for an entry (say Val) in the list can
be made faster by using ‘divide and conquer’ technique. The list is divided into two halves separated by
the middle element as shown in Figure 3.2.
[Figure 3.2: A list divided into two halves about its middle element, at positions 1, N/2 and N]
Index    0   1   2   3   4   5   6   7   8
Series   3   4   5   7   11  13  14  17  21
Suppose we desire to search the Series for a value (say 14) and its position in it. Binary search begins
by looking at the middle value in the Series. The middle index of the array is approximated by averaging
the first and last indices and truncating the result, i.e., (0 + 8)/2 = 4. Now, the content of the fourth
location in Series happens to be ‘11’ as shown in Figure 3.4. Since the value we are looking for (i.e., 14) is
greater than 11, the middle value, it may be present in the right half (Series[5] to Series[8]).
[Figure 3.4: Series with Middle at index 4]
Index    0   1   2   3   4   5   6   7   8
Series   3   4   5   7   11  13  14  17  21
Middle = 4 (value 11)
Now the middle of the right half is approximated, i.e., (5 + 8)/2 = 6. We find that the desired element
exists at the middle of the right half, i.e., Series[6] = 14 as shown in Figure 3.5.
[Figure 3.5: Right half of Series with Middle at index 6]
Index    5   6   7   8
Series   13  14  17  21
Middle = 6 (value 14)
It may be noted that the desired element has been found in only two steps. Thus, it is a much faster
method as compared to the linear search. For instance, a list of 1,000 sorted elements would require at
most about 10 steps to search the entire list. An algorithm for this method is given below.
In this algorithm, we would employ a Boolean variable called Flag. The Flag will indicate the presence
or absence of the element being searched.
Algorithm binSearch ()
{
Step
1. First = 0;
2. Last = N − 1;
3. Pos = −1;
4. Flag = false;
5. While (First <= Last and Flag == false)
{
5.1 Middle = (First + Last) div 2
if (Series [Middle] == Val)
{ Pos = Middle;
Flag = true;
break from the loop;
}
else
if (Series [Middle] < Val) First = Middle + 1;
else Last = Middle − 1;
}
6. if (Flag == true)
{ prompt “The value found at”;
write Pos;
}
else prompt “The value not found”;
}
It may be noted that the ‘div’ operator has been used to indicate that it is an integer division. The
integer division truncates the result, i.e., it discards the fractional part. If the desired value is found,
then the Flag is set to true and the While loop terminates; otherwise, a stage arrives when First becomes
greater than Last, indicating the failure of the search. Thus, the variables First and Last keep track of
the lower and the upper bounds of the part of the array still being searched, respectively.
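The algorithm can be sketched as a C function as shown below. This is an illustration rather than the book's program: it returns the position directly instead of using a flag, and it computes the middle as first + (last − first)/2, which gives the same value as (first + last)/2 for valid indices but cannot overflow for very large arrays.

```c
/* Binary search on a list sorted in ascending order.
   Returns the index of val in series[0..n-1], or -1 if absent. */
int binSearch(const int series[], int n, int val)
{
    int first = 0;
    int last = n - 1;
    while (first <= last)
    {
        int middle = first + (last - first) / 2;   /* integer division */
        if (series[middle] == val)
            return middle;
        if (series[middle] < val)
            first = middle + 1;     /* continue in the right half */
        else
            last = middle - 1;      /* continue in the left half */
    }
    return -1;                      /* first > last: search failed */
}
```

For the Series discussed above, searching for 14 inspects index 4 and then index 6, where the value is found.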
Example 6: Write a program that uses binary search to search a given value called Val in a list of N
numbers called Series.
Solution: The Algorithm binSearch() discussed above is used to write the required program.
/* This program uses binary search to find a given value called val in a
list of N numbers */
#include <stdio.h>
#define true 1
#define false 0
void main()
{
int First;
int Last;
int Middle;
int Series[20]; /*The list of N sorted numbers*/
int Val; /*The value to be searched */
int flag; /*Indicates whether the value has been found */
int N, Pos, i;
Binary search through a list, stored in an array, has the following characteristics:
• The list must be sorted, i.e., ordered.
• A list with a large number of elements would increase the total execution time, the reason being
3.2.3.2 Analysis of Binary Search As discussed above, we know that the binary search took two
steps to search an element in a list of nine elements. However, in the worst case scenario for a list of 32
elements, we can visualize as follows:
1st step would get a sublist of size 16
2nd step would get a sublist of size 8
3rd step would get a sublist of size 4
4th step would get a sublist of size 2
5th step would get a sublist of size 1
Let us tabulate (Table 3.1) the number of steps taken by binary search for a given list of size n.
From Table 3.1, we can infer that for a given input size n = 2^k, the number of steps = k.
Table 3.1
Size of list (n)    Number of steps
8 (2^3)             3
32 (2^5)            5
256 (2^8)           8
512 (2^9)           9
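The relation between the list size and the number of steps can be checked with a short C function that repeatedly halves the sublist, as the worst case of binary search does. The function name halvingSteps is hypothetical and the sketch assumes n >= 1:

```c
/* Counts how many times a list of size n can be halved
   before a sublist of size 1 remains. */
int halvingSteps(unsigned n)
{
    int steps = 0;
    while (n > 1)
    {
        n = n / 2;      /* each step keeps one half of the sublist */
        steps++;
    }
    return steps;
}
```

For n = 2^k the function returns k, i.e., the number of steps grows as lg n, which is why binary search is said to take logarithmic time.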
0 1 2 3 4 5 6 7 8
charList A B C E F G
Once the above copying operation is over, charList [3] becomes available for the storage of incom-
ing character ‘D’ as shown below:
charList [3] = ‘D’;
0 1 2 3 4 5 6 7 8
charList A B C D E F G
The array charList after the insertion operation is shown in Figure 3.7.
An algorithm for insertion of an element called val at the Ith location in an array called charList of
size N is given below. Let us assume that the last element in the array is at the Mth location. For instance,
N = 8 and M = 5 in case of charList shown in Figure 3.6. A variable called Back is used to point to
the last empty location:
Algorithm insertArray ()
{
Step
/* point to the last vacant position */
1. if (M < N) then Back = M + 1;
else
STOP;
2. while (Back > I)
{ /* Copy elements to next location in the list */
2.1 charList [Back] = charList [Back − 1];
2.2 Back = Back − 1;
}
3. charList [I] = val; /*Insert the element at Ith location */
4. M = M + 1;
}
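The same steps can be sketched as a C function. The signature is an assumption made for illustration: capacity is the number of locations in the array, and last is passed by pointer so that the function can update the position of the last element (M in the algorithm).

```c
/* Inserts val at index loc of list, whose elements occupy
   indices 0..*last. Returns 1 on success, or 0 if the array
   is full or loc is not a valid insertion point. */
int insertArray(char list[], int capacity, int *last, int loc, char val)
{
    if (*last + 1 >= capacity || loc < 0 || loc > *last + 1)
        return 0;
    /* Copy elements one step to the right, starting from the end */
    for (int back = *last + 1; back > loc; back--)
    {
        list[back] = list[back - 1];
    }
    list[loc] = val;            /* insert at the vacated location */
    (*last)++;
    return 1;
}
```

Inserting ‘D’ at index 3 of the list A B C E F G shifts E, F and G one step to the right and yields A B C D E F G.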
Example 7: Write a program that inserts an element called ‘key’ at a location called ‘loc’ in a list of M
numbers called ‘List’.
Solution: The algorithm insertArray() is being used to implement the required program.
/* This program inserts an element called Key into a list of numbers */
#include <stdio.h>
#define N 20
void main()
{
int List[N];
int Key;
int Loc;
int i, M;
int back;
3.2.4.2 Deletion Deletion is the operation that removes an element from a given location of a list.
When the list is represented using an array, the element can be very easily deleted from the end of the
list. However, if it is desired to delete an element from the ith location of the list, then all the elements
from the (i+1)th location onwards will have to be shifted one step towards left to preserve contiguous
locations in the array.
For instance, if it is desired to remove the element at the 4th location of the list given in Figure 3.8,
then all the elements to the right of the 4th location would have to be shifted one step towards left.
0 1 2 3 4 5 6 7 8
charList A B C D E F G
Now to delete the character ‘E’, the characters ‘F’ and ‘G’ will have to be shifted one step towards left
to locations 4th and 5th, respectively as per the operations shown below:
charList [4] = charList[5];
charList [5] = charList[6];
It may be noted that the contents of charList [5] (i.e., ‘F’) overwrites the contents of charList [4]
(i.e., E). The array charList after the deletion operation is shown in Figure 3.9.
0 1 2 3 4 5 6 7 8
charList A B C D F G
An algorithm for deletion of an element from the Ith location in an array called charList of size N
is given below. Let us assume that the last element in the array is at the Mth location. For instance, N = 8
and M = 6 in case of charList shown in Figure 3.8. A variable called Back is used to point to the
location that is currently being overwritten.
Algorithm delElement()
{
Step
1. Back = I; /* point to the location from where deletion is desired */
2. while (Back < M)
{
2.1 charList[Back] = charList[Back + 1]; /* Shift elements one step left */
2.2 Back = Back + 1;
}
3. M = M − 1;
4. Stop
}
It may be noted that in step 3, the contents of M have been decremented by 1 to indicate that after
deletion the last element in the charList would be at the (M − 1)th location.
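A C sketch of the deletion is given below. As with the other sketches, the signature is hypothetical: last is a pointer to the index of the last element (M in the algorithm), so that it can be decremented after the shift.

```c
/* Deletes the element at index loc of list, whose elements
   occupy indices 0..*last. Returns 1 on success, or 0 if loc
   does not refer to an existing element. */
int delElement(char list[], int *last, int loc)
{
    if (loc < 0 || loc > *last)
        return 0;
    /* Shift the elements to the right of loc one step left */
    for (int back = loc; back < *last; back++)
    {
        list[back] = list[back + 1];
    }
    (*last)--;                  /* the list is now one element shorter */
    return 1;
}
```

Deleting index 4 from A B C D E F G overwrites ‘E’ with ‘F’, moves ‘G’ to index 5, and leaves the last element at index 5.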
Example 8: Write a program that deletes an element from a location called loc in a list of M numbers
called ‘List’.
Solution: The algorithm delElement() is used to implement the required program.
/* This program deletes an element from a list of numbers */
#include <stdio.h>
#define N 20
int main(void)
{
    int List[N];
    int Loc;
    int i, M;
    int back;
    printf("\n Enter the size of the list (< 20)");
    scanf("%d", &M);
    printf("\n Enter the list one by one");
    for (i = 0; i < M; i++)
    {
        scanf("%d", &List[i]);
    }
    printf("\n Enter location from where the deletion is required");
    scanf("%d", &Loc);
    /* Delete the element at 'Loc' in 'List' */
    if (Loc < 0 || Loc >= M)    /* valid locations are 0 to M - 1 */
        printf("\n Deletion not possible");
    else
    {
        back = Loc;
        /* Shift elements one step left */
        while (back < M - 1)
        {
            List[back] = List[back + 1];
            back++;
        }
        M = M - 1;
        /* Display the final list */
        printf("\n The Final List is ......");
        for (i = 0; i < M; i++)
        {
            printf("%d ", List[i]);
        }
    }
    return 0;
}
It may be noted that arrays are not suitable data structures for problems requiring frequent insertions
and deletions in a list. The reason is that we need to shift elements to the right or to the left for
insertion and deletion operations, respectively. The problem worsens when the size of the list is very
large, as an equally large number of shift operations would be required to insert and delete the elements.
Thus, insertion and deletion are very slow operations as far as arrays are concerned.
3.2.5 Sorting
It is an operation in which all the elements of a list are arranged in a predetermined order. The elements
can be arranged in a sequence from smallest to largest such that every element is less than or equal to its
next neighbour in the list. Such an arrangement is called ascending order. Assuming an array called List
containing N elements, the ascending order can be defined by the following relation:
List[i] <= List[i + 1], 0 <= i < N − 1
Similarly in descending order, the elements are arranged in a sequence from largest to smallest such
that every element is greater than or equal to its next neighbour in the list. The descending order can be
defined by the following relation:
List[i] >= List[i + 1], 0 <= i < N − 1
It has been estimated that in a data processing environment, 25 per cent of the time is consumed
in sorting of data. Many sorting algorithms have been developed. Some of the most popular sorting
algorithms that can be applied to arrays are in-place sort algorithms. An in-place algorithm
stores the sorted elements of the list in the same array as occupied by the original one; the in-place
algorithms discussed here are comparison-based. A detailed discussion on sorting algorithms is given in
subsequent sections.
3.2.5.1 Selection Sort It is a very simple and natural way of sorting a list. It finds the smallest element
in the list and exchanges it with the element present at the head of the list as shown in Figure 3.10.
[Figure 3.10: First pass of selection sort]
8 20 2 1 4 19 7 11     (smallest element: 1)
1 | 20 2 8 4 19 7 11   (sorted part | unsorted part)
It may be noted from Figure 3.10 that initially, whole of the list was unsorted. After the exchange of
smallest with the element on the head of the list, the list is divided into two parts: sorted and unsorted.
Now the smallest is searched in the unsorted part of the list, i.e., ‘2’ and exchanged with the element
at the head of unsorted part, i.e., ‘20’ as shown in Figure 3.11.
[Figure 3.11: Second pass of selection sort]
8 20 2 1 4 19 7 11     (original list)
1 2 | 20 8 4 19 7 11   (sorted part | unsorted part)
This process of selection and exchange (i.e., a pass) continues in this fashion until all the elements
in the list are sorted (see Figure 3.12). Thus, in selection sort, two steps are important: selection and
exchange.
From Figures 3.11 and 3.12, it may be observed that it is a case of nested loops. The outer loop is
required for passes over the list and the inner loop for searching smallest element within the unsorted
part of the list. In fact, for N number of elements, N − 1 passes are made.
An algorithm for selection sort is given below. In this algorithm, the elements of a list stored in an
array called LIST[N] are sorted in ascending order. Two variables called Small and Pos are used to
locate the smallest element in the unsorted part of the list. Temp is the variable used to interchange the
selected element with the first element of the unsorted part of the list.
Algorithm SelSort()
{
Step
1. For I = 1 to N − 1 /* Outer Loop */
{
1.1 small = List [I];
1.2 Pos = I;
1.3 For J = I + 1 to N /* Inner Loop */
{
1.3.1 if (List [J] < small)
{
small = List[J];
Pos = J; /* Note the position of the smallest*/
}
}
1.4 Temp = List [I]; /*Exchange smallest with the Head */
1.5 List [I] = List [Pos];
1.6 List [Pos] = Temp;
}
2. Print the sorted list
}
[Figure 3.12: Remaining passes of selection sort; in each pass the smallest element of the unsorted part
is exchanged with the element at its head]
1 2 20 8 4 19 7 11
1 2 4 8 20 19 7 11
1 2 4 7 20 19 8 11
1 2 4 7 8 19 20 11
1 2 4 7 8 11 20 19
1 2 4 7 8 11 19 20
Example 9: Given is a list of N randomly ordered numbers. Write a program that sorts the list in
ascending order by using selection sort.
Solution: The required program is given below:
In this program, the elements of a list are stored in an array called list. The elements are sorted
using the above given Algorithm SelSort(). Two variables small and pos have been used to locate the
smallest element in the unsorted part of the list. temp is a variable used to interchange the selected
element with the first element of the unsorted part of the list. With each step, the unsorted part becomes
smaller. The process is repeated till all the elements are sorted.
/* This program sorts a list by using selection sort */
#include <stdio.h>
int main(void)
{
    int list[10];
    int small, pos, N, i, j, temp;
    printf("\n Enter the size of the list (<= 10):");
    scanf("%d", &N);
    /* Read the list */
    printf("\n Enter the list one by one:");
    for (i = 0; i < N; i++)
        scanf("%d", &list[i]);
    /* Make one pass for every position except the last */
    for (i = 0; i < N - 1; i++)
    {
        small = list[i];
        pos = i;
        /* Find the smallest of the unsorted list */
        for (j = i + 1; j < N; j++)
        {
            if (small > list[j])
            {
                small = list[j];
                pos = j;
            }
        }
        /* Exchange the small with the
           first element of unsorted list */
        temp = list[i];
        list[i] = list[pos];
        list[pos] = temp;
    }
    printf("\n The sorted list ...");
    for (i = 0; i < N; i++)
        printf("%d ", list[i]);
    return 0;
}
3.2.5.2 Analysis of Selection Sort In selection sort, there are two major operations: comparison and
exchange. The average number of exchange operations would be difficult to estimate. Therefore, we will
focus on comparison operations.
It may be noted that for every execution of the outer loop, the inner loop executes a decreasing number
of times. For example, in a list of (say) eight numbers, the first number would be compared to the remaining
seven numbers to find out the smallest. After bringing the smallest to the first position, the second number
would be compared to the remaining six, the third number with the remaining five, and so on. Thus, the
total number of comparisons for a list of N elements would be:
(N − 1) + (N − 2) + ... + 2 + 1 = (N − 1)*N/2 = N^2/2 − N/2
Thus, T(n) = N^2/2 − N/2.
Since N/2 >= 0, we can say that
N^2/2 − N/2 <= N^2/2.
Therefore, we can say that
T(n) <= N^2/2 where c = 1/2, n0 = 0, g(n) = N^2
T(n) = O(N^2)
Hence, selection sort has time complexity of the order of N^2, i.e., O(N^2).
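The comparison count can be verified by instrumenting the sort. The following sketch is an illustration only: the function name selSortCount and the comparison counter are assumptions, and the function tracks the position of the smallest element instead of keeping a separate copy of its value.

```c
/* Sorts list[0..n-1] in ascending order by selection sort and
   returns the number of element comparisons that were made. */
long selSortCount(int list[], int n)
{
    long comparisons = 0;
    for (int i = 0; i < n - 1; i++)
    {
        int pos = i;                    /* position of the smallest so far */
        for (int j = i + 1; j < n; j++)
        {
            comparisons++;
            if (list[j] < list[pos])
                pos = j;
        }
        int temp = list[i];             /* exchange smallest with the head */
        list[i] = list[pos];
        list[pos] = temp;
    }
    return comparisons;
}
```

For the eight-element list 8 20 2 1 4 19 7 11, the function makes 7 + 6 + ... + 1 = 28 comparisons, which equals (N − 1)*N/2 for N = 8.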
3.2.5.3 Bubble Sort It is also a very simple sorting algorithm. It proceeds by looking at the list from
left to right. Each adjacent pair of elements is compared. Whenever a pair is found not to be in order, the
elements are exchanged. Therefore after the first pass, the largest element bubbles up to the right end of
the list. A trace of first pass on a list of numbers is shown in Figure 3.13.
It may be noted that after the pass is over, the largest element in the list (i.e., 20) has bubbled up to
the end of the list and six exchanges were made. Now the same process can be repeated for the list for
second pass as shown in Figure 3.14.
We observe that after the second pass is over, the list has become sorted and only two exchanges
were made. Now a point worth noting is how and when it will be decided that the list has become
sorted. A simple criterion is to check whether or not any exchange has been made during the
current pass. If ‘yes’, then the list may not yet be sorted and another pass is needed; if ‘no’, then it
can be decided that the list is sorted.
[Figure 3.13: Trace of the first pass of bubble sort]
8 20 9 10 11 19 12 13
8 9 20 10 11 19 12 13
8 9 10 20 11 19 12 13
8 9 10 11 20 19 12 13
8 9 10 11 19 20 12 13
8 9 10 11 19 12 20 13
8 9 10 11 19 12 13 20
[Figure 3.14: Trace of the second pass of bubble sort]
8 9 10 11 19 12 13 20
8 9 10 11 12 19 13 20
8 9 10 11 12 13 19 20
An algorithm for bubble sort is given below. In this algorithm, the elements of a list, stored in an
array called List[N], are sorted in an ascending order. The algorithm uses two loops: the outer While
loop and the inner For loop. The inner For loop makes a pass on the list. If any exchange is made during
the pass, it is recorded in a variable called Flag, i.e., Flag is set to false. The outer While loop keeps
track of the Flag. As soon as the Flag informs that no exchange took place during the current pass,
indicating that the list is now sorted, the algorithm stops.
Algorithm bubbleSort()
{
Step
1. Flag = false;
2. While (Flag == false)
{
2.1 Flag = true;
2.2 For J = 0 to N − 2
{
2.2.1 if (List [J] > List [J + 1])
{
temp = List[J];
List[J] = List [J + 1];
List [J + 1] = temp;
Flag = false;
}
}
}
3. Print the sorted list
4. Stop
}
It may be noted that the algorithm stops as soon as the list becomes sorted. Thus, ‘bubble sort’ is a
very useful algorithm when the list is almost sorted i.e., only a very small percentage of elements are out
of order.
Example 10: Given is a list of N randomly ordered numbers. Write a program that sorts the list in
ascending order by using bubble sort.
Solution: The required program is given below:
In this program, the elements of a list are stored in an array called List. The elements are sorted
using above given algorithm bubbleSort().
/* This program sorts a given list of numbers in ascending order, using
   bubble sort */
#include <stdio.h>
#define N 20
#define true 1
#define false 0
int main(void)
{
    int List[N];
    int flag;
    int size;
    int i, j, temp;
    int count; /* counts the number of passes */
    printf("\n Enter the size of the list (< 20)");
    scanf("%d", &size);
    printf("\n Enter the list one by one");
    for (i = 0; i < size; i++)
    {
        scanf("%d", &List[i]);
    }
    /* Sort the list by bubble sort */
    flag = false;
    count = 0;
    while (flag == false)
    {
        flag = true; /* Assume no exchange takes place */
        count++;
        for (j = 0; j < size - 1; j++)
        {
            if (List[j] > List[j + 1])
            { /* Exchange the contents */
                temp = List[j];
                List[j] = List[j + 1];
                List[j + 1] = temp;
                flag = false; /* Record the exchange operation */
            }
        }
    }
    /* Print the sorted list */
    printf("\n The sorted list is ....");
    for (i = 0; i < size; i++)
        printf("%d ", List[i]);
    printf("\n The number of passes made = %d", count);
    return 0;
}
It may be noted that the above program has used a variable called count that counts the number
of passes made while sorting the list. The test runs conducted on the program have established that the
program is very efficient in case of almost sorted list of elements. In fact, it takes only one scan to estab-
lish that the supplied list is already sorted.
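The pass-counting behaviour described above can be checked with a small function. This is a minimal sketch, not the book's listing: the name bubbleSortPasses and the variable exchanged (the inverse of flag) are assumptions made for illustration.

```c
/* Sorts list[0..size-1] in ascending order by bubble sort.
   Returns the number of passes made, including the final
   pass that finds no exchange. */
int bubbleSortPasses(int list[], int size)
{
    int passes = 0;
    int exchanged = 1;              /* forces the first pass */
    while (exchanged) {
        exchanged = 0;              /* assume no exchange takes place */
        passes++;
        for (int j = 0; j < size - 1; j++) {
            if (list[j] > list[j + 1]) {
                int temp = list[j];
                list[j] = list[j + 1];
                list[j + 1] = temp;
                exchanged = 1;      /* record the exchange */
            }
        }
    }
    return passes;
}
```

For an already sorted list the function returns 1, since the first pass makes no exchange.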
Note: Bubble sort is also called a sinking sort meaning that the elements sink down in the list to
their proper position.
3.2.5.4 Analysis of Bubble Sort In bubble sort, again, there are two major operations: comparison
and exchange. The average number of exchange operations would be difficult to estimate. Therefore, we
will focus on comparison operations.
A closer look reveals that for a list of N numbers, bubble sort also has the following number of
comparisons, i.e., the same as the selection sort.
(N − 1) + (N − 2) + ... + 2 + 1 = (N − 1)N/2 = N^2/2 − N/2
Therefore, the time complexity of bubble sort is also O(N^2).
However, if the list is already sorted in ascending order, no exchange operations are required
and it becomes the best case. The algorithm makes a single pass of only N − 1 comparisons, i.e., it has
a linear running time.
Example 11: Given is a list of N randomly ordered numbers. Write a program that sorts the list in
ascending order by using insertion sort.
Solution: The required program is given below:
/* This program sorts a given list of numbers in ascending order using
   insertion sort */
#include <stdio.h>
#define N 20
int main(void)
{
    int List[N];
    int size;
    int i, j, temp;
    printf("\n Enter the size of the list (< 20)");
    scanf("%d", &size);
    printf("\n Enter the list one by one");
    for (i = 0; i < size; i++)
    {
        scanf("%d", &List[i]);
    }
    /* Sort the list by insertion sort */
    for (i = 1; i < size; i++)
    {
        temp = List[i]; /* Pick and save the first element of the unsorted part */
        j = i - 1;
        /* Scan for proper place; j >= 0 is tested first so that List[-1] is never read */
        while ((j >= 0) && (temp < List[j]))
        {
            List[j + 1] = List[j];
            j = j - 1;
        }
        List[j + 1] = temp; /* Insert the element at the proper place */
    }
    /* Print the sorted list */
    printf("\n The sorted list is ....");
    for (i = 0; i < size; i++)
    {
        printf("%d ", List[i]);
    }
    return 0;
}
3.2.5.6 Analysis of Insertion Sort A critical look at the algorithm indicates that in the worst case, i.e.,
when the list is sorted in reverse order, the jth iteration requires (j − 1) comparisons and copy operations.
Therefore, the total number of comparison and copy operations for a list of N numbers would be:
1 + 2 + 3 + ... + (N − 2) + (N − 1) = (N − 1)N/2 = N^2/2 − N/2
Therefore, the time complexity of insertion sort is also O(N^2).
However, if the list is already sorted in ascending order, no copy operations are required. It
becomes the best case and the algorithm makes only N − 1 comparisons, i.e., it has a linear running time.
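The two extremes of this analysis can be observed by counting copy operations. The following is a minimal sketch; the function name insertionSortCopies and its counter are assumptions introduced only for illustration.

```c
/* Sorts a[0..n-1] in ascending order by insertion sort.
   Returns the number of copy (shift) operations performed. */
long insertionSortCopies(int a[], int n)
{
    long copies = 0;
    for (int i = 1; i < n; i++) {
        int temp = a[i];            /* first element of the unsorted part */
        int j = i - 1;
        while (j >= 0 && temp < a[j]) {
            a[j + 1] = a[j];        /* shift the larger element right */
            copies++;
            j--;
        }
        a[j + 1] = temp;            /* insert at the proper place */
    }
    return copies;
}
```

For a reverse-ordered list of N = 5 elements the count is (N − 1)N/2 = 10, while an already sorted list needs no copies at all.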
Merging of lists It is an operation in which two ordered lists are merged into a single ordered list.
The merging of two lists PAR1 and PAR2 can be done by examining the elements at the head of the two
lists and selecting the smaller of the two. The smaller element is then stored into a third list
called mergeList. For example, consider the lists PAR1 and PAR2 given in Figure 3.18. Let the
variables Ptr1, Ptr2, and Ptr3 point to the first locations of the lists PAR1, PAR2, and mergeList,
respectively. The comparison of PAR1[Ptr1] and PAR2[Ptr2] shows that the element of
PAR1 (i.e., ‘2’) is smaller. Thus, this element will be placed in the mergeList as per the
following operation:
mergeList[Ptr3] = PAR1[Ptr1];
Ptr1++;
Ptr3++;
PAR1       2  5  6  8     (Ptr1)
PAR2       4  7  9  19    (Ptr2)
mergeList  2              (Ptr3)
Fig. 3.18 Merging of lists (first step)
Since an element from the list PAR1 has been taken to mergeList, the variable Ptr1 is accordingly
incremented to point to the next location in the list. The variable Ptr3 is also incremented to point to
next vacant location in mergeList.
This process of comparing, storing and shifting is repeated till both the lists are merged and stored
in mergeList as shown in Figure 3.19.
Ptr1
PAR1 2 5 6 8
Ptr2
PAR2 4 7 9 19
Ptr3
mergeList 2 4
It may be noted here that during this merging process, a situation may arise when we run out of
elements in one of the lists. We must, therefore, stop the merging process and copy rest of the elements
from unfinished list into the final list.
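The compare, store, and shift process, including the copying of the leftover elements, can be sketched as below. The function name mergeLists and its parameters are assumptions for illustration; here the two lists are separate arrays, as in Figures 3.18 and 3.19.

```c
/* Merges the sorted lists a[0..na-1] and b[0..nb-1] into out[].
   out must have room for na + nb elements.
   Returns the number of elements written. */
int mergeLists(const int a[], int na, const int b[], int nb, int out[])
{
    int p1 = 0, p2 = 0, p3 = 0;
    while (p1 < na && p2 < nb) {        /* both lists still have elements */
        if (a[p1] <= b[p2])
            out[p3++] = a[p1++];        /* element from the first list */
        else
            out[p3++] = b[p2++];        /* element from the second list */
    }
    while (p1 < na)                     /* copy the rest of the first list */
        out[p3++] = a[p1++];
    while (p2 < nb)                     /* copy the rest of the second list */
        out[p3++] = b[p2++];
    return p3;
}
```

Only one of the two trailing loops does any work on a given call, because the main loop ends as soon as one list is exhausted.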
The algorithm for merging of lists is given below. In this algorithm, the two sub-lists are part of the
same array List[N]. The first sub-list is stored in locations List[lb] to List[mid − 1] and the second
sub-list is stored in locations List[mid] to List[ub], where lb and ub mean the lower and upper
bounds of the array, respectively. In other words, the parameter mid is the first location of the
second sub-list.
Algorithm merge (List, lb, mid, ub)
{
Step
1. ptr1 = lb;  /* index of first list */
2. ptr2 = mid; /* index of second list */
3. ptr3 = lb;  /* index of merged list */
4. while ((ptr1 < mid) && (ptr2 <= ub)) /* merge the lists */
   {
   4.1 if (List[ptr1] <= List[ptr2])
       { mergeList[ptr3] = List[ptr1]; /* element from first list is taken */
         ptr1++; /* move to next element in the list */
         ptr3++;
       }
   4.2 else
       { mergeList[ptr3] = List[ptr2]; /* element from second list is taken */
         ptr2++; /* move to next element in the list */
         ptr3++;
       }
   }
5. while (ptr1 < mid) /* copy remaining first list */
   {
   5.1 mergeList[ptr3] = List[ptr1];
   5.2 ptr1++;
   5.3 ptr3++;
   }
6. while (ptr2 <= ub) /* copy remaining second list */
   {
   6.1 mergeList[ptr3] = List[ptr2];
   6.2 ptr2++;
   6.3 ptr3++;
   }
7. for (i = lb; i < ptr3; i++) /* copy merged list back into original list */
   7.1 List[i] = mergeList[i];
8. Stop
}
It may be noted that an extra temporary array called mergeList is required to store the intermediate
merged sub-lists. The contents of the mergeList are finally copied back into the original list.
The algorithm for the merge sort is given below. In this algorithm, the elements of a list stored in
an array called List[N] are sorted in an ascending order. The algorithm has two parts—mergeSort
and merge. The merge algorithm, given above, merges two given sorted lists into a third list, which is also
sorted. The mergeSort algorithm takes a list and stores into an array called List[N]. It uses two variables
lb and ub to keep track of lower and upper bounds of list or sub-lists as the case may be. It recursively
divides the list into almost equal parts till singletons or empty lists are left. The sub-lists are recursively
merged through merge algorithm to produce final sorted list.
Algorithm mergeSort (List, lb, ub)
{
Step
1. if (lb < ub)
   {
   1.1 mid = (lb + ub)/2; /* divide the list into two sub-lists */
   1.2 mergeSort (List, lb, mid); /* sort the left sub-list */
   1.3 mergeSort (List, mid + 1, ub); /* sort the right sub-list */
   1.4 merge (List, lb, mid + 1, ub); /* merge the lists */
   }
2. Stop
}
Example 12: Given is a list of N randomly ordered numbers. Write a program that sorts the list in
ascending order by using merge sort.
Solution: The required program uses both the algorithms—mergeSort() and merge().
/* This program sorts a given list of numbers in ascending order using
   merge sort */
#include <stdio.h>
#define N 20
void mergeSort (int List[], int lb, int ub);
void merge (int List[], int lb, int mid, int ub);
int main(void)
{
    int List[N];
    int i, size;
    printf("\n Enter the size of the list (< 20)");
    scanf("%d", &size);
    printf("\n Enter the list one by one");
    for (i = 0; i < size; i++)
    {
        scanf("%d", &List[i]);
    }
    /* Sort the list by merge sort */
    mergeSort (List, 0, size - 1);
    printf("\n The sorted list is ....");
    for (i = 0; i < size; i++)
    {
        printf("%d ", List[i]);
    }
    return 0;
}
void mergeSort (int List[], int lb, int ub)
{
    int mid;
    if (lb < ub)
    {
        mid = (lb + ub)/2;
        mergeSort (List, lb, mid);      /* sort the left sub-list */
        mergeSort (List, mid + 1, ub);  /* sort the right sub-list */
        merge (List, lb, mid + 1, ub);  /* mid + 1 is the start of the right sub-list */
    }
}
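void merge (int List[], int lb, int mid, int ub)
{
    int mergeList[N];
    int ptr1, ptr2, ptr3;
    int i;
    ptr1 = lb;  /* index of first sub-list */
    ptr2 = mid; /* index of second sub-list; mid is its first location */
    ptr3 = lb;  /* index of merged list */
    while ((ptr1 < mid) && (ptr2 <= ub))
    {
        if (List[ptr1] <= List[ptr2])
        {
            mergeList[ptr3] = List[ptr1];
            ptr1++;
            ptr3++;
        }
        else
        {
            mergeList[ptr3] = List[ptr2];
            ptr2++;
            ptr3++;
        }
    }
    while (ptr1 < mid) /* copy remaining first list */
    {
        mergeList[ptr3] = List[ptr1];
        ptr1++;
        ptr3++;
    }
    while (ptr2 <= ub) /* copy remaining second list (step 6 of algorithm merge) */
    {
        mergeList[ptr3] = List[ptr2];
        ptr2++;
        ptr3++;
    }
    for (i = lb; i < ptr3; i++) /* copy merged list back (step 7 of algorithm merge) */
        List[i] = mergeList[i];
}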
3.2.5.8 Analysis of Merge Sort The merge sort requires the following operations:
(1) Divide the list into two sub-lists.
(2) Sort each sub-list.
(3) Merge the two sub-lists.
(4) If number of elements = 1, then list is sorted.
For a list of N elements, steps 1 and 2 require two times the number of comparisons needed for
sorting a sub-list of size N/2. Step 3 requires N steps to merge the two sub-lists.
Therefore, the number of comparisons in merge sort can be defined by the following recursive
relation:
NumComp(N) = 0 if N = 1,
           = 2*NumComp(N/2) + N for N > 1
Thus, NumComp(1) = 0
NumComp(N) = 2*NumComp(N/2) + N (3.1)
By similarity, we can define the following:
NumComp(N/2) = 2*NumComp(N/4) + N/2
Putting the above expression into Eq. 3.1, we get
NumComp(N) = 4*NumComp(N/4) + 2N = 2^2*NumComp(N/2^2) + 2N (3.2)
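Repeating the same substitution k times gives NumComp(N) = 2^k*NumComp(N/2^k) + kN, so for N = 2^k the relation evaluates to N*log2 N. The short sketch below evaluates the recurrence directly to confirm this; the function name numComp is an assumption, and N is assumed to be a power of two.

```c
/* Evaluates NumComp(N) = 2*NumComp(N/2) + N with NumComp(1) = 0.
   n is assumed to be a power of two. */
long numComp(long n)
{
    if (n <= 1)
        return 0;
    return 2 * numComp(n / 2) + n;
}
```

For instance, numComp(8) is 24, which equals 8*log2 8 = 8*3.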
3.2.5.9 Quick Sort This method also uses the technique of ‘divide and conquer’. On the basis of
a selected element (pivot) from the list, it partitions the rest of the list into two parts: a sub-list
that contains elements less than the pivot and another sub-list containing elements greater than the
pivot. The pivot is inserted between the two sub-lists. The algorithm is recursively applied to the
sub-lists until the size of each sub-list becomes 1, indicating that the whole list has become
sorted.
Consider the list given in Figure 3.20. Let the first element (i.e., 8) be the pivot. Now the rest of the
list can be divided into two parts—a sub-list that contains elements less than ‘8’ and the other sub-
list that contains elements greater than ‘8’ as shown in Figure 3.20.
Pivot
8 5 6 9 4 19 7 2
8 5 6 2 4 7 19 9
8 5 6 2 4 7 19 9
7 5 6 2 4 8 19 9
Now this process can be recursively applied on the two sub-lists to completely sort the whole list.
For instance, ‘7’ becomes the pivot for left sub-list and ‘19’ becomes pivot for the right sub-list.
Note: Two sub-lists can be safely joined when every element in the first sub-list is smaller than
every element in the second sub-list. Since ‘join’ is a faster operation as compared to a ‘merge’
operation, this sort is rightly named as a ‘quick sort’.
The algorithm for the quick sort is given below:
In this algorithm, the elements of a list, stored in an array called List[N], are sorted in an ascend-
ing order. The algorithm has two parts—quickSort and partition. The partition algorithm divides the
list into two sub-lists around a pivot. The quickSort algorithm takes a list and stores it into an array
called List[N]. It uses two variables lb and ub to keep track of lower and upper bounds of list or sub-
lists as the case may be. It employs partition algorithm to sort the sub-lists.
Algorithm quickSort()
{
Step
1. lb = 0; /* set lower bound */
2. ub = N − 1; /* set upper bound */
3. pivot = List[lb];
4. lb++;
5. partition (pivot, List, lb, ub);
}
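The partitioning step of Figure 3.20 can be sketched as a self-contained pair of functions. This is an illustrative variant, not the book's listing: the names swapInt, partitionFirst, and quickSortSketch are assumptions, and the partition function returns the final position of the pivot instead of recursing itself.

```c
static void swapInt(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Partitions a[lb..ub] around the pivot a[lb].
   Returns the final index of the pivot. */
static int partitionFirst(int a[], int lb, int ub)
{
    int pivot = a[lb];
    int i = lb + 1;
    int j = ub;
    while (i <= j) {
        while (i <= ub && a[i] <= pivot) i++;  /* find an element > pivot */
        while (a[j] > pivot) j--;              /* stops at lb at the latest */
        if (i < j)
            swapInt(&a[i], &a[j]);
    }
    swapInt(&a[lb], &a[j]);                    /* place the pivot between the sub-lists */
    return j;
}

/* Sorts a[lb..ub] in ascending order. */
void quickSortSketch(int a[], int lb, int ub)
{
    if (lb < ub) {
        int p = partitionFirst(a, lb, ub);
        quickSortSketch(a, lb, p - 1);         /* elements not greater than the pivot */
        quickSortSketch(a, p + 1, ub);         /* elements greater than the pivot */
    }
}
```

On the list of Figure 3.20, the first call to partitionFirst leaves the pivot 8 at index 5, with 7 5 6 2 4 to its left and 19 9 to its right.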
Example 13: Given is a list of N randomly ordered numbers. Write a program that sorts the list in
ascending order by using quick sort.
Solution: The required program uses both the algorithms—quickSort() and partition(). In this
program, a variable called Key has been used that acts as a pivot.
/* This program sorts a given list of numbers in ascending order, using
quick sort */
#include <stdio.h>
#include <conio.h>
#define N 20
void partition (int Key, int List[], int lb, int ub);
void quicksort (int List[], int lb, int ub);
void main()
{
int List[N];
int i, size, Pos, temp;
int lb, ub;
void partition (int Key, int List[], int lb, int ub)
{
    int i, j, temp;
    i = lb;
    j = ub;
    while (i <= j)
    {
        while ((i <= ub) && (List[i] <= Key)) i++; /* do not scan past the upper bound */
        while (List[j] > Key) j--;                 /* stops at the pivot, List[lb - 1] */
        printf("\ni=%d j=%d", i, j);
        getch();
        if (i <= j)
        {
            temp = List[i];
            List[i] = List[j];
            List[j] = temp;
        }
    }
    /* place the pivot between the two sub-lists */
    temp = List[j];
    List[j] = List[lb - 1];
    List[lb - 1] = temp;
    if (j > lb) quicksort (List, lb, j - 1);
    if (j < ub) quicksort (List, j + 1, ub);
}
3.2.5.10 Analysis of Quick Sort The quick sort requires the following operations:
Step
(1) If number of elements = 1, then list is sorted and exit.
(2) Partition the list into two sub-lists around a pivot.
(3) Place pivot in the middle of two sub-lists.
(4) Repeat steps 1 to 3.
The best case for the quick sort algorithm comes when we split the input as evenly as possible into two
sub-lists. Thus, in the best case, each sub-list would be of size N/2. In any case, the partitioning
operation requires N comparisons to split a list of size N into the required sub-lists.
Now, for a list of size N, the number of comparisons can be defined as follows:
NumComp(N) = 0 when N = 1
           = N + NumComp(sub-list1 of size N/2) + NumComp(sub-list2 of size N/2)
           = N + 2*NumComp(N/2)
The above relation is the same as that defined for merge sort; therefore, it reduces to the following:
T(N) = O(N*log2 N)
Thus, the best case time complexity of quick sort is of the order N log2 N, or simply N log N.
3.2.5.11 Shell Sort It is an improvement on insertion sort. A given list List[N] is divided into
k sets of about N/k items each. k can assume values from a set {1, 3, 5, 19, 41…}. Thus, the ith set
will have the following elements:
Set i = List[i] List[i + k] List[i + 2k] …
8 5 6 2 4 7 19 3
Fig. 3.21 The sets of a list for k = 5
For k = 5, the list is divided into the following sets:
Set 0 = 8, 7
Set 1 = 5, 19
Set 2 = 6, 3
Set 3 = 2
Set 4 = 4
Now on each set, the insertion sort is applied resulting in the arrangement shown in Figure 3.22.
7 5 3 2 4 8 19 6
Fig. 3.22 The list after first pass
For k = 3, the list is divided into the following sets:
Set 0 = 7, 2, 19
Set 1 = 5, 4, 6
Set 2 = 3, 8
Now on each set, the insertion sort is applied resulting in the arrangement shown in Figure 3.23.
2 4 3 7 5 8 19 6
Fig. 3.23 The list after second pass
For k = 1, the list is represented by the following set:
Set 0 = 2, 4, 3, 7, 5, 8, 19, 6
Application of insertion sort on the above set results in the arrangement shown in Figure 3.24.
2 3 4 5 6 7 8 19
Fig. 3.24 The list after third pass
It may be noted that the list is now sorted after the third pass. Since the variable k takes the dimin-
ishing values—5, 3, and 1; the shell sort is also called diminishing step sort.
The algorithm for the shell sort is given below. In this algorithm, a list of elements, stored in an
array called List[N], are sorted in an ascending order. The algorithm divides the list into sets as per the
description given above. The diminishing step values are stored in a list called dimStep. A variable ‘s’ is
used that moves the insertion operation to the next set. The variable k takes the step size from dimStep
and moves the index i within the set from one element to another.
Algorithm shellSort()
{
Step
1. initialize the set dimStep to the diminishing values 5, 3, 1
   /* sort the list by shell sort */
2. for (step = 0; step < 3; step++)
   {
   2.1 k = dimStep[step]; /* set k to diminishing step */
   2.2 for (s = 0; s < k; s++) /* take each set of the list in turn */
       {
       2.2.1 for (i = s + k; i < size; i += k)
             {
               temp = List[i]; /* save the element from the set */
               j = i − k;
               /* find the place for insertion */
               while ((j >= 0) && (temp < List[j]))
               {
                 List[j + k] = List[j];
                 j = j − k;
               }
               List[j + k] = temp; /* insert the saved element at its place */
             }
       }
   }
3. print the sorted list
4. Stop
}
Example 14: Given is a list of N randomly ordered numbers. Write a program that sorts the list in
ascending order by using shell sort.
Solution: The required program uses the algorithm shellSort().
/* This program sorts a given list of numbers in ascending order using
   Shell sort */
#include <stdio.h>
#define N 20
int main(void)
{
    int List[N];
    int size;
    int i, j, k;
    int temp, s, step;
    int dimStep[] = {5, 3, 1}; /* the diminishing steps */
    printf("\n Enter the size of the list (< 20)");
    scanf("%d", &size);
    printf("\n Enter the list one by one");
    for (i = 0; i < size; i++)
    {
        scanf("%d", &List[i]);
    }
    /* sort the list by shell sort */
    for (step = 0; step < 3; step++)
    {
        k = dimStep[step]; /* set k to diminishing step */
        for (s = 0; s < k; s++) /* take each set of the list in turn */
        {
            for (i = s + k; i < size; i += k)
            {
                temp = List[i]; /* save the element from the set */
                j = i - k;
                /* find the place for insertion; j >= 0 is tested first */
                while ((j >= 0) && (temp < List[j]))
                {
                    List[j + k] = List[j];
                    j = j - k;
                }
                List[j + k] = temp; /* insert the saved element at its place */
            }
        }
    }
    /* Print the sorted list */
    printf("\n The sorted list is ....");
    for (i = 0; i < size; i++)
    {
        printf("%d ", List[i]);
    }
    return 0;
}
MAT_A[0][0] → 5  9  7  4
              2  8  3  5
              6  1  0  12 ← MAT_A[2][3]
                 ↑
              MAT_A[2][1]
(i.e., zeroth row and zeroth column) and adding the required value ‘5’ to its contents as shown
below:
MAT_A[0][0] = MAT_A[0][0] + 5;
Now to complete the rest of the operations, there is a choice, i.e., either move along the zeroth row
(MAT_A[0][1], MAT_A[0][2]…) and add ‘5’ to each visited element, or move along the zeroth column
(MAT_A[1][0], MAT_A[2][0]…) and add ‘5’ to each visited element. Once a row or column is over, we
can move to the next row or column in order and continue to travel in the same fashion.
Thus, a two-dimensional array can be travelled ‘row by row’ or ‘column by column’. The former
method is called row major order and the latter column major order.
An algorithm for row major order of travel of the array MAT_A is given below. This algorithm visits
every element of the array MAT_A and adds the value ‘5’ to the contents of the visited element.
Algorithm rowMajor()
{
Step
1. for row = 0 to 2
2.    for col = 0 to 3
         MAT_A[row][col] = MAT_A[row][col] + 5;
3. Stop
}
In the above algorithm, nested loops have been used. The outer loop is employed to travel the rows
and the inner to travel the columns within a given row.
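Both orders of travel can be sketched in C as follows. The function names addRowMajor and addColMajor and the macros ROWS and COLS are assumptions for illustration; the array dimensions match MAT_A (three rows, four columns).

```c
#define ROWS 3
#define COLS 4

/* Visits the array row by row and adds value to every element. */
void addRowMajor(int mat[ROWS][COLS], int value)
{
    for (int row = 0; row < ROWS; row++)        /* outer loop travels the rows */
        for (int col = 0; col < COLS; col++)    /* inner loop travels a row */
            mat[row][col] = mat[row][col] + value;
}

/* Visits the array column by column and adds value to every element. */
void addColMajor(int mat[ROWS][COLS], int value)
{
    for (int col = 0; col < COLS; col++)        /* outer loop travels the columns */
        for (int row = 0; row < ROWS; row++)    /* inner loop travels a column */
            mat[row][col] = mat[row][col] + value;
}
```

Both functions produce the same final contents; only the order in which the elements are visited differs.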
Example 15: Given two matrices A[ ]m × r and B[ ]r × n, compute the product of the two matrices such that:
C[ ]m × n = A[ ]m × r * B[ ]r × n
Solution: The multiplication of two matrices requires that a row of matrix A be multiplied
to a column of matrix B to generate an element of matrix C as shown in Figure 3.27.
| 5 4 |   | 1 7 |   | 5*1 + 4*2   5*7 + 4*9 |
| 3 2 | * | 2 9 | = | 3*1 + 2*2   3*7 + 2*9 |
Fig. 3.27 The matrix multiplication operation
A close observation of the computation shown in Figure 3.27 gives the following relation
for an element of matrix C:
C[I][J] = C[I][J] + A[I][K] × B[K][J] for K varying from 0 to r − 1
We would use the above relation to compute the various elements of the matrix C. The required
program is given below:
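A minimal sketch of the computation, built directly on the relation above, is given here. The function name matMul and the fixed column bound MAXDIM are assumptions made for illustration.

```c
#define MAXDIM 10

/* Computes C (m x n) = A (m x r) * B (r x n).
   Every array is declared with MAXDIM columns. */
void matMul(int m, int r, int n,
            int A[][MAXDIM], int B[][MAXDIM], int C[][MAXDIM])
{
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            C[i][j] = 0;
            for (int k = 0; k < r; k++)          /* k varies from 0 to r - 1 */
                C[i][j] = C[i][j] + A[i][k] * B[k][j];
        }
    }
}
```

Applied to the matrices of Figure 3.27, the result is the matrix with rows 13 71 and 7 39.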
Example 16: A nasty number is defined as a number which has at least two pairs of integer factors such
that the difference of one pair equals the sum of the other pair. For instance, the factors of ‘6’ are 1, 2, 3,
and 6.
Now the difference of factor pair (6, 1) is equal to the sum of factor pair (2, 3), i.e., 6 − 1 = 2 + 3.
Therefore, ‘6’ is a nasty number.
Choose appropriate data structure and write a program that displays all the nasty numbers present
in a list of numbers.
Solution: The following data structures would be used to store the list of numbers and the list of factor-
pair:
(1) A one-dimensional array called List to store the list of numbers.
(2) A two-dimensional array called pair_of_factors to store the list of pair-factors of a given
number.
For example, the list of pair factors of ‘24’ would be stored as shown below:
1 24
2 12
3 8
4 6
Finally, using a nested loop, the difference of pair factors would be compared with the sum of all pair
of factors for equality. The required program is given below:
/* This program finds all the nasty numbers from a given list of integer
numbers */
#include <stdio.h>
#include <math.h>
main()
{
int LIST [20];
int pair_of_factors[50][2];
int i, j, k, size, num, diff, sum, count;
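The test for a single number can also be sketched without the pair_of_factors array, by generating the factor pairs on the fly. This is an illustrative alternative under that assumption, not the listing above; the function name isNasty is hypothetical.

```c
/* Returns 1 if n is a nasty number, i.e., the difference of one
   factor pair equals the sum of another factor pair; 0 otherwise. */
int isNasty(int n)
{
    for (int a = 1; a * a <= n; a++) {
        if (n % a != 0)
            continue;
        int diff = n / a - a;               /* difference of the pair (a, n/a) */
        for (int c = 1; c * c <= n; c++) {
            if (n % c != 0)
                continue;
            if (c + n / c == diff)          /* sum of the pair (c, n/c) */
                return 1;
        }
    }
    return 0;
}
```

For ‘24’, the difference of the pair (2, 12) equals the sum of the pair (4, 6), so the function reports it as nasty.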
Logical view of List and its mapping, by the compiler, to physical memory:
Logical address   Physical address   Element
1                 100                2
2                 100 + 1            6
3                 100 + 2            8
4                 100 + 3            9
5                 100 + 4            10
6                 100 + 5            11
Logical view of TAB and its mapping, by the compiler, to physical memory:
TAB[1][1] → 8  2  6
            5  4  1 ← TAB[2][3]
Logical address   Physical address   Element
1, 1              100                8
1, 2              100 + 1            2
1, 3              100 + 2            6
2, 1              100 + 3            5
2, 2              100 + 4            4
2, 3              100 + 5            1
We know that in the two-dimensional array called TAB, for every row, there are three columns. Thus,
for an element of the Ith row the starting address becomes Base + (I − 1)*3, and if the element in the
row is at the Jth column, the complete address becomes:
Address of TAB[I][J] = Base + (I − 1)*3 + (J − 1) (3.7)
To test the above row major order formula, let us calculate the address of the element represented by
TAB[2][1] with Base = 100.
Given: I = 2, J = 1, Base = 100. Putting these values in Eq. 3.7, we get
Address = 100 + (2 − 1)*3 + (1 − 1) = 100 + 3 + 0 = 103.
From Figure 3.29, we find that the address value is correct.
Now we can generalize the formula of Eq. 3.7 for an array of size [M][N] in the following
form:
Address of the [I][J]th element = Base + (I − 1)*N + (J − 1) (3.8)
The formula given above is a simple case, i.e., it is assumed that every element of TAB would occupy
only one location of the physical memory. However, if an element of an array occupies ‘S’ number of
memory locations, the row major order formula (3.8) can be modified as given below:
Address of the TAB[I][J]th element = Base + ((I − 1)*N + (J − 1))*S (3.9)
where
TAB: is the name of the array.
Base: is starting or base address of array.
I: is the row index of the element.
J: is the column index of the element.
N: is number of columns present in a row.
S: is the number of locations occupied by an element of the array.
Similarly, the formula for column major order is as follows:
Address of the TAB[I][J]th element = Base + ((J − 1)*M + (I − 1))*S (3.10)
where
TAB: is the name of the array.
Base: is starting or base address of array.
I: is the row index of the element.
J: is the column index of the element.
M: is number of rows present in a column.
S: is the number of locations occupied by an element of the array.
Example 17: TAB is a two-dimensional array with five rows and three columns. Each element occupies
one memory location. If TAB[1][1] begins at address 500, find the location of TAB[4][2] for row major
order of storage.
Solution: Given Base = 500, I = 4, J = 2, M = 5, N = 3, S = 1. Applying formula 3.9, we get:
Address = Base + ((I − 1)*N + (J − 1))*S
        = 500 + ((4 − 1)*3 + (2 − 1))*1
        = 500 + (9 + 1)
        = 510.
Example 18: MAT is a two-dimensional array with ten rows and five columns. Each element is stored in
two memory locations. If MAT[1][1] begins at address 200, find the location of MAT[3][4] for row
major order of storage.
Solution: Given Base = 200, I = 3, J = 4, M = 10, N = 5, S = 2. Applying formula 3.9, we get:
Address = Base + ((I − 1)*N + (J − 1))*S
        = 200 + ((3 − 1)*5 + (4 − 1))*2
        = 200 + (10 + 3)*2
        = 226.
Example 19: An array A[10][20] is stored in the memory with each element requiring two memory
locations. If the base address of the array in the memory is 400, determine the location of array element
A[8][3] when the array is stored in column major order.
Solution: Given Base = 400, I = 8, J = 3, M = 10, N = 20, S = 2. Applying formula 3.10, we get:
Address = Base + ((J − 1)*M + (I − 1))*S
        = 400 + ((3 − 1)*10 + (8 − 1))*2
        = 400 + (20 + 7)*2
        = 454.
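Formulas 3.9 and 3.10 translate directly into two small functions, which can be used to check the worked examples. The function names rowMajorAddress and colMajorAddress are assumptions for illustration; indices start at 1, as in the formulas.

```c
/* Formula 3.9: address of element [i][j] in row major order.
   n is the number of columns; s is the number of locations per element. */
long rowMajorAddress(long base, int i, int j, int n, int s)
{
    return base + ((long)(i - 1) * n + (j - 1)) * s;
}

/* Formula 3.10: address of element [i][j] in column major order.
   m is the number of rows; s is the number of locations per element. */
long colMajorAddress(long base, int i, int j, int m, int s)
{
    return base + ((long)(j - 1) * m + (i - 1)) * s;
}
```

For instance, rowMajorAddress(200, 3, 4, 5, 2) gives 226 and colMajorAddress(400, 8, 3, 10, 2) gives 454, in agreement with Examples 18 and 19.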
Pol 3 6 5 8 2 5 0
No. of terms
Fig. 3.31 Array representation of a polynomial with zeroth place reserved for number of terms
Example 20: Write a program that reads two polynomials pol1 and pol2, adds them and gives a third
polynomial pol3 such that pol3 = pol1 + pol2.
Solution: A close look at the array representation of polynomials indicates that two polynomials can
be added by merging them using ‘algorithm merge()’ of Section 3.2.5.7. The only change would be that
when the exponents from both the polynomials are the same, the coefficients of both the terms would
be added.
readPol(pol1,terms);
readPol(pol2,terms);
We can also choose other data structures to represent polynomials. For instance, a term can be
represented by ‘struct’ construct of ‘C’ as shown below:
struct term
{
int coef;
int exp;
};
and the polynomial called ‘pol’ of ten terms can be represented by an array of structures of type ‘term’
as shown below:
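A minimal sketch of this representation, together with the merge-style addition described in Example 20, follows. The function name addPol, its parameters, and the assumption that terms are stored in decreasing order of exponent are illustrative choices; a ten-term polynomial would be declared as struct term pol[10].

```c
struct term
{
    int coef;
    int exp;
};

/* Adds p1 (n1 terms) and p2 (n2 terms) into p3.
   Terms are assumed to be in decreasing order of exponent.
   p3 must have room for n1 + n2 terms.
   Returns the number of terms stored in p3. */
int addPol(const struct term p1[], int n1,
           const struct term p2[], int n2, struct term p3[])
{
    int i = 0, j = 0, k = 0;
    while (i < n1 && j < n2) {
        if (p1[i].exp > p2[j].exp)
            p3[k++] = p1[i++];
        else if (p1[i].exp < p2[j].exp)
            p3[k++] = p2[j++];
        else {                              /* same exponent: add coefficients */
            int c = p1[i].coef + p2[j].coef;
            if (c != 0) {
                p3[k].coef = c;
                p3[k].exp = p1[i].exp;
                k++;
            }
            i++;
            j++;
        }
    }
    while (i < n1) p3[k++] = p1[i++];       /* copy the rest of p1 */
    while (j < n2) p3[k++] = p2[j++];       /* copy the rest of p2 */
    return k;
}
```

Terms whose coefficients cancel are dropped, so the result never contains a zero coefficient.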
Let us now apply the above representation to MatA to obtain the list of ordered pairs as given
below:
0, 4, 2
1, 1, 3
1, 6, 5
3, 3, 7
4, 7, 4
5, 0, 8
The above representation is also called a condensed matrix. It uses comparatively little space,
a saving that becomes even more prominent for large matrices, say of the order of 500 × 500 or more.
The proposed list of ordered pairs can be modeled as a two-dimensional matrix of order (n + 1) × 3,
where n is the number of nonzero elements present in the sparse matrix. The 0th ordered pair
can be suitably used to store the order of the sparse matrix and the number of nonzero elements
present in the matrix, as shown in Figure 3.33.
int MatA [7][3] = { 6, 8, 6,
                    0, 4, 2,
                    1, 1, 3,
                    1, 6, 5,
                    3, 3, 7,
                    4, 6, 4,
                    5, 0, 8 };
Fig. 3.33 The condensed representation
Example 21: Write a program that reads a sparse matrix of order m*n and stores it into the condensed
representation as discussed above, i.e., a two-dimensional array of size (n + 1) × 3, where n is the
number of nonzero elements. Print the contents of the condensed matrix.
Solution: The required program is given below:
/* This program reads a sparse matrix of order m*n and stores it into a
   condensed matrix format */
#include <stdio.h>
int main(void)
{
    int matA[7][3]; /* row 0 is the header; room for six nonzero elements */
    int i, j, k;
    int m, n, element;
    printf("\n Enter the order (m,n) of the sparse matrix");
    scanf("%d %d", &m, &n);
    matA[0][0] = m; /* no. of rows */
    matA[0][1] = n; /* no. of cols */
    k = 1; /* point to 1st position of sparse mat */
    printf("\n Enter the elements of the matrix in row major order");
    for (i = 0; i < m; i++)
        for (j = 0; j < n; j++)
        {
            scanf("%d", &element);
            if (element != 0)
            { /* store non zero element into condensed matrix */
                matA[k][0] = i;
                matA[k][1] = j;
                matA[k][2] = element;
                k++;
            }
        }
    matA[0][2] = k - 1; /* record no. of non zero elements */
    printf("\n The condensed matrix is...");
    for (i = 1; i < k; i++)
        printf("\n %d %d %d", matA[i][0], matA[i][1], matA[i][2]);
    return 0;
}
Note: No space has been provided for the sparse matrix. The elements have been read one by one and
the nonzero elements have been stored into the condensed matrix.
A sample run of the program is given below:
3.5.2.1 Sparse Matrix Addition Two sparse matrices can be added in the same way as the polynomi-
als were added in previous section. Consider the three sparse matrices Mat1, Mat2, and Mat3 given in
Figure 3.34.
where Mat3 = Mat1 + Mat2
The equivalent condensed form of Mat1, Mat2, Mat3 is given in Figure 3.35.
EXERCISES
• A string of 25 characters.
• A matrix of order 5 × 4.
Chapter Outline
• 4.1 Stacks
Similarly, the nested calls to functions and procedures are also handled in LIFO fashion in the sense
that control from last called procedure is handled first.
A stack can be more precisely defined as a linear collection of items which allows addition and dele-
tion operations only from one end called top.
The push operation adds an item onto the top of the stack whereas the pop operation removes
an element from the current top of the stack. The size of the stack depends upon the number of
items present in the stack. With the addition and deletion of items, the size of stack accordingly
increases or decreases. Therefore, stack is also called a dynamic data structure with a capacity to
enlarge or shrink.
Top E
D Top D
C C Top C
B B B Top B
A A A A
It may be noted that if more pop operations are done on the stack then a stage will come when
there will be no items left on the stack. This condition is called stack empty.
The algorithms for push and pop operations are given below. These algorithms use an array called
Stack[N] to represent a stack of N locations. A variable called Top keeps track of the top of the
stack, i.e., the location where additions and deletions are made. Another variable called item is used to
store the item to be pushed or popped from the stack.
Stacks and Queues 153
It may be noted that the item is added to the stack (i.e., push operation) provided the stack is not
already full, i.e., at step1, it is checked whether or not the stack has room for another item.
The algorithm for pop is given below:
Algorithm Pop()
{
Step
1. if (Top < 0) then prompt (“Stack Empty”);
   else
   {
     Item = Stack[Top];
     Top = Top − 1;
   }
2. stop
}
It may be noted that in the above algorithm, before popping an item from the top, it is checked at
step 1 to see if there is at least one item on the stack that can be removed.
The performance of the stack data structure for ‘n’ elements is:
n The space complexity is O(n).
n The size of the stack, implemented using array, must be defined in advance, i.e., a priori. More-
Solution: The above defined algorithms Push() and Pop() would be employed to write the program.
The menu would be created using printf () statements and displayed within a do-while loop. In fact,
a do-while loop is the most appropriate loop for display and manipulations of menu items.
Note:
• The Push() function would return 1 if the operation is successful and 0 otherwise, i.e., 0
indicates that the stack was full.
• The Pop() function would return the popped value if the operation is successful and 9999
otherwise, i.e., the unusual value ‘9999’ indicates that the stack was empty.
The required program is given below:
/* This program simulates stack operations */
#include <stdio.h>
#include <conio.h>
int push (int stak[], int *top, int val, int size);
int pop (int stak[], int *top);
void display (int stack[], int top);
void main()
{
int stack[20]; /* the stack declaration */
int top = -1; /* initially the stack is empty */
int val;
int size;
int choice;
int result;
switch (choice)
{
case 1: printf("\n Enter the value to be pushed");
        scanf("%d", &val);
        result = push(stack, &top, val, size);
        if (result == 0)
            printf("\n The stack full");
        break;
case 2: result = pop(stack, &top);
        if (result == 9999)
            printf("\n The stack is empty");
        else
            printf("\n The popped value = %d", result);
        break;
case 3: display(stack, top);
        break;
}
printf("\n\n Press any key to continue");
getch();
}
while (choice != 4);
}
int push (int stack[], int *top, int val, int size)
{
    if (*top >= size - 1)
        return 0; /* the stack is full: the last valid index is size - 1 */
    else
    {
        *top = *top + 1;
        stack[*top] = val;
        return 1;
    }
}
int pop (int stack[], int *top)
{
    int val;
    if (*top < 0)
        return 9999; /* the stack is empty */
    else
    {
        val = stack[*top];
        *top = *top - 1;
        return val;
    }
}
• Backtracking
• Total − 100
• Total/Rate*Interest
• X^Y − B*C
expression,
A^B^C
Stacks and Queues 157
The sub-expressions are evaluated from right to left as shown in Figure 4.5.
[Fig. 4.5 Evaluation of exponential operator: in A ^ B ^ C, the sub-expression B ^ C is evaluated
first (1) and A is then raised to that result (2).]
The arithmetic expression, as discussed above, is called an infix expression wherein the operator
is placed between two operands. The evaluation takes place according to the priorities assigned to the
operators (see Table 4.2). However, to overrule the priority of an operator, parentheses are used. For
example, the two infix expressions given below are entirely different because in Expression 2, the
priority of the division operator ‘/’ has been overruled by the embedded parentheses.
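(1) X − Y/Z
(2) (X − Y)/Z
The order of evaluation of above infix expressions is shown in Figure 4.6.
[Figure 4.6: in X − Y/Z the division Y/Z is performed first (1) and the subtraction second (2);
in (X − Y)/Z the subtraction is performed first (1) and the division second (2).]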
Thus, parentheses play an important role in the evaluation of an infix expression, and the order
of evaluation depends upon the placement of parentheses and the operator precedence. Therefore, the
infix representation of arithmetic expressions becomes inefficient from the compilation point of view:
to find the sub-expression with the highest priority at each step, the expression has to be scanned
repeatedly from left to right.
The Polish logician Jan Łukasiewicz introduced a notation which permits arithmetic expressions
without parentheses. The absence of embedded parentheses allows simpler compiler interpretation,
translation, and execution of arithmetic expressions. This notation is called Polish notation. It appears
in two forms: prefix and postfix forms.
4.2.1.1 Prefix Expression (Polish Notation) In the prefix form, an operator is placed before its
operands, i.e., the operators are prefixed to their operands. For example, the infix expression A + B is
written as ‘+AB’ in prefix form.
Examples of some valid prefix expressions with their equivalent infix expressions are given in
Table 4.3.
Table 4.3 Some prefix expressions
Prefix expression      Equivalent infix expression
+AB                    A + B
*+ABC                  (A + B)*C
+/A+BC*D−EF            A/(B + C) + D*(E − F)
4.2.1.2 Postfix Expression (Reverse Polish Expression) In the postfix expression, the operator is
placed after its operands, i.e., the expression uses suffix or postfix operators. For example, the infix
expression ‘A + B’ is written as ‘A B +’ in postfix form.
Examples of some valid postfix expressions with their equivalent infix expressions are given in
Table 4.4.
Table 4.4 Some postfix expressions
Postfix expression     Equivalent infix expression
AB+                    A + B
AB+C*                  (A + B)*C
ABC+/DEF−*+            A/(B + C) + D*(E − F)
It may be noted that in Polish and reverse Polish notations, there are no parentheses and, therefore,
the order of evaluation will be determined by the positions of the operators and related operands in the
expression.
A critical look at the postfix expressions indicates that such a representation is excellent from the
execution point of view: during execution, no consideration has to be given to the priority of the
operators; the placement of the operators alone decides the order of execution. Moreover, it is
free from parentheses. The same is true for prefix expressions, but in this book only postfix
expressions are considered.
Thus, there is a need for conversion of infix to postfix expressions. The subsequent section
discusses two methods to convert infix expressions to postfix expressions.
4.2.1.3 Conversion of Infix Expression to Postfix Expression An infix expression can be con-
verted into postfix by two methods: parenthesis method and stack method.
(1) Parenthesis method The following steps are used to convert an infix expression to postfix
expression by parenthesis method:
(1) Fully parenthesize the infix expression.
(2) Replace each right hand parenthesis by the operator of the sub-expression that it closes.
(3) Remove the left hand parentheses. The resultant expression is in postfix notation.
Consider the infix expression A/(B + C) + D*(E − F). Let us apply the above steps to convert this
expression to postfix notation. The operations are shown in Figure 4.7.
Step 1 ((A/(B + C)) + (D*(E − F)))
Step 2 ((A(B C + / (D(E F − * +
Step 3 A B C + / D E F − * +
The method discussed above is rather inefficient because it requires the following two passes over the
expression:
(1) The first pass to fully parenthesize the expression.
(2) The second pass to replace the right hand parentheses by their corresponding embedded
operators.
It may be noted that the infix expression within the innermost parentheses should be the first to be
converted into postfix before the expressions in the outer parentheses are picked. This successive
elimination of parentheses is done until the whole of the expression is converted into postfix. This LIFO
nature of the second pass suggests that a stack can be used for the conversion process.
(2) The stack method In this method, two priorities are assigned to the operators: instack priority and
incoming priority as shown in Table 4.5.
It may be noted that the left hand parenthesis ‘(’ has the highest incoming priority and the least instack
priority.
Let us assume that an infix expression is terminated by the character ‘;’. We shall call the operators
and operands of an expression by a general name, element. The following broad-level algorithm
convert() can be used to convert an infix expression to a postfix expression.
Algorithm convert()
{
Step
1. Scan the expression from left to right.
2. If the element is an operand, then store the element into the target area.
3. If the element is an operator, then push it onto the stack with the following rules:
   3.1 While the incoming priority of the element is less than or equal to the instack
       priority of the operator on the top of the stack, pop the operators and store
       them into the target area. Then push the element.
   3.2 If the element is the right hand parenthesis ‘)’, then pop all the elements from
       the stack till the left hand parenthesis ‘(’ is popped out. The popped elements,
       except ‘(’, are stored in the target area. The ‘)’ itself is not pushed.
4. If the element is ‘;’, then pop all the elements till the stack is empty and store
   them into the target area.
5. Stop
}
Example 2: Write a program that uses a stack to convert an infix expression to a postfix expression.
Solution: We would employ the algorithm convert(), given above. The following data structures
would be used:
The list of allowed binary operators along with their icp (incoming priority) and isp (instack priority)
is stored in an array of structures called ‘opTab[]’. An array called stack[] of type struct operator is the
main data structure. The match() function is used to find out whether a given element is an operator
or an operand, depending upon the result (res) of the match operation. The required program is given below:
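/* This program converts an infix expression to a postfix expression */
#include <stdio.h>
#define MAX 20 /* maximum number of elements in the expression */
struct operator
{
    char opName; /* the operator symbol */
    int isp; /* instack priority */
    int icp; /* incoming priority */
};
int match (struct operator opTab[], char element, struct operator *op);
int push (struct operator stack[], int *top, struct operator val, int size);
struct operator pop (struct operator stack[], int *top);
int main(void)
{
    char infix[MAX];
    char target[MAX + 1];
    struct operator stack[MAX]; /* the stack declaration */
    int top = -1; /* initially the stack is empty */
    int res;
    int size;
    int pos;
    int i;
    struct operator op, opTemp;
    struct operator opTab[8] = { /* The operators information: name, isp, icp */
        {'(', 0, 6},
        {')', 0, 0},
        {'^', 4, 5},
        {'*', 3, 3},
        {'/', 3, 3},
        {'+', 2, 2},
        {'-', 2, 2},
        {';', 0, -1}
    };
    printf ("\n Enter the size of the infix expression");
    if (scanf ("%d", &size) != 1 || size < 1 || size > MAX)
        return 1; /* the expression does not fit */
    size--;
    printf ("\n Enter the terms of infix expression one by one");
    for (i = 0; i <= size; i++)
        scanf (" %c", &infix[i]); /* the leading blank skips white space */
    pos = 0; /* position in target expression */
    for (i = 0; i <= size; i++)
    {
        res = match (opTab, infix[i], &op); /* find whether operator/operand */
        if (res == 0)
        {
            target[pos] = infix[i]; /* store into target expression */
            pos++;
        }
        else if (op.opName == ')')
        {
            opTemp.opName = '#'; /* place any value to opTemp */
            while (top >= 0 && opTemp.opName != '(')
            {
                opTemp = pop (stack, &top);
                if (opTemp.opName != '(') /* omit '(' */
                {
                    target[pos] = opTemp.opName;
                    pos++;
                }
            }
        }
        else
        {
            /* check top >= 0 first so that an empty stack is never read */
            while (top >= 0 && stack[top].isp >= op.icp)
            {
                opTemp = pop (stack, &top);
                target[pos] = opTemp.opName;
                pos++;
            }
            if (op.opName != ';') /* the terminator only empties the stack */
                push (stack, &top, op, size);
        }
    }
    target[pos] = '\0';
    printf ("\n The postfix expression is %s", target);
    return 0;
}
/* returns 1 and copies the table entry into *op if element is an operator, 0 otherwise */
int match (struct operator opTab[], char element, struct operator *op)
{
    int i;
    for (i = 0; i < 8; i++)
        if (opTab[i].opName == element)
        {
            *op = opTab[i];
            return 1;
        }
    return 0;
}
int push (struct operator stack[], int *top, struct operator val, int size)
{
    if (*top >= size)
        return 0; /* the stack is Full */
    *top = *top + 1;
    stack[*top] = val;
    return 1;
}
/* the caller must make sure that the stack is not empty (top >= 0) */
struct operator pop (struct operator stack[], int *top)
{
    struct operator val;
    val = stack[*top];
    *top = *top - 1;
    return val;
}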
For the following infix expressions, the above program produces the correct output as shown
below:
Infix expression             Postfix expression
(1) A ^ B ^ C + D;           ABC^^D+
(2) (A - B)*(C/D) + E;       AB-CD/*E+
Example 3: Use a stack to convert the following infix arithmetic expression into a postfix
expression. Show the changing status of the stack in tabular form.
A + B*C
Solution: We would use the algorithm convert(). The simulation of this algorithm for the above
expression is given in Table 4.6.
The output postfix expression is ABC*+
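The changing status of the stack can also be printed by a small routine. The following is only a sketch of the algorithm convert(): the names convertTrace() and priorities() are illustrative and not part of the book's program, the priority values are taken from opTab[] of Example 2, the ASCII character '-' stands for the minus sign, and the caller is assumed to supply a target buffer at least as long as the infix string.

```c
#include <stdio.h>

#define MAX 20 /* stack capacity, as in the chapter's programs */

/* Returns 1 and fills in the instack (isp) and incoming (icp) priorities
   if c is an operator; returns 0 if c is an operand. */
static int priorities(char c, int *isp, int *icp)
{
    switch (c)
    {
        case '(': *isp = 0; *icp = 6; return 1;
        case ')': *isp = 0; *icp = 0; return 1;
        case '^': *isp = 4; *icp = 5; return 1;
        case '*': case '/': *isp = 3; *icp = 3; return 1;
        case '+': case '-': *isp = 2; *icp = 2; return 1;
        case ';': *isp = 0; *icp = -1; return 1;
        default: return 0;
    }
}

/* Converts infix (terminated by ';') to postfix in target and prints
   one row per scanned element: the element, the stack and the target. */
void convertTrace(const char *infix, char *target)
{
    char stack[MAX + 1]; /* one extra location for the string terminator */
    int top = -1, pos = 0;
    int isp, icp, topIsp, topIcp;
    const char *p;

    target[0] = '\0';
    for (p = infix; *p != '\0'; p++)
    {
        if (*p == ' ')
            continue;                          /* blanks are ignored */
        if (!priorities(*p, &isp, &icp))
            target[pos++] = *p;                /* operand: store in target */
        else if (*p == ')')
        {
            while (top >= 0 && stack[top] != '(')
                target[pos++] = stack[top--];
            if (top >= 0)
                top--;                         /* discard the '(' */
        }
        else
        {
            while (top >= 0 && priorities(stack[top], &topIsp, &topIcp)
                   && topIsp >= icp)
                target[pos++] = stack[top--];
            if (*p != ';' && top < MAX - 1)
                stack[++top] = *p;             /* push the incoming operator */
        }
        target[pos] = '\0';
        stack[top + 1] = '\0';                 /* make the stack printable */
        printf("%c   stack: %-8s target: %s\n", *p, stack, target);
        if (*p == ';')
            break;
    }
}
```

Called as convertTrace("A + B*C;", target), the routine leaves ABC*+ in target, which agrees with the output stated above.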
Example 4: Use a stack to convert the following infix arithmetic expression into a postfix
expression. Show the changing status of the stack in tabular form.
(A − B)*(C/D) + E
Solution: We would use the algorithm convert(). The simulation of this algorithm for the above
expression is given in Table 4.7.
The output postfix expression is AB−CD/*E+
Example 5: Use a stack to convert the following infix arithmetic expression into a postfix
expression. Show the changing status of the stack in tabular form.
X + Y*Z ^ P − (X/Y + Z)
Solution: We would use the algorithm convert(). The simulation of this algorithm for the above
expression is given in Table 4.8.
The output postfix expression is XYZP^*+XY/Z+−
Note: Since a postfix expression has no parentheses, its evaluation becomes a very easy exercise. In
fact, a stack can be suitably used for this purpose.
4.2.1.4 Evaluation of Postfix Expressions A stack can be conveniently used for the evaluation of
postfix expressions. The expression is read from left to right. Each encountered element is examined.
If the element is an operand, then the element is pushed on to the stack. If the element is an operator,
then the two operands are popped from the stack and the desired operation is done. The result of the
operations is again pushed onto the stack. This process is repeated till the end of the expression (i.e., ‘;’)
is encountered. The final result is popped from the stack.
An algorithm for the evaluation of a postfix expression is given below:
Algorithm evalPostfix()
{
Step
1. pick an element from the postfix expression
2. if (element is an operand)
       {push element on the stack}
   else if (element == ‘;’)
       {
       pop result from the stack;
       print the result;
       exit to step 4;
       }
   else
       {
       pop operand2 from stack; /* the right hand operand */
       pop operand1 from stack; /* the left hand operand */
       perform the operation: operand1 operator operand2;
       push result on the stack;
       }
3. repeat steps 1 and 2
4. stop
}
#include <stdio.h>
#include <conio.h>
int push (float stack[], int *top, float val, int size);
float pop (float stack[], int *top);
void main()
{
float stack[20];
struct term postFix[20];
int top = -1;
int i;
int size;
float result;
int temp;
Example 8: Use a stack to evaluate the following postfix arithmetic expression. Show the changing
status of the stack in tabular form.
X Y Z P ^ * + A B / C + − for X = 1, Y = 5, Z = 2, P = 3, A = 15, B = 3, C = 8
Solution: We would use the algorithm evalPostfix(). The simulation of this algorithm for the above
expression is given in Table 4.10.
Table 4.10
Step  Element  Value  Action                   Stack
1     X        1      Push                     1
2     Y        5      Push                     1 5
3     Z        2      Push                     1 5 2
4     P        3      Push                     1 5 2 3
5     ^        0      Pop (3)                  1 5 2
                      Pop (2)                  1 5
                      Result = 2 ^ 3 = 8
                      Push                     1 5 8
6     *        0      Pop (8)                  1 5
                      Pop (5)                  1
                      Result = 5 * 8 = 40
                      Push                     1 40
7     +        0      Pop (40)                 1
                      Pop (1)
                      Result = 1 + 40 = 41
                      Push                     41
8     A        15     Push                     41 15
9     B        3      Push                     41 15 3
10    /        0      Pop (3)                  41 15
                      Pop (15)                 41
                      Result = 15/3 = 5
                      Push                     41 5
11    C        8      Push                     41 5 8
12    +        0      Pop (8)                  41 5
                      Pop (5)                  41
                      Result = 5 + 8 = 13
                      Push                     41 13
13    −        0      Pop (13)                 41
                      Pop (41)
                      Result = 41 − 13 = 28
                      Push                     28
14    ;        0      Pop (28)
                      Print result
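The evaluation traced in Table 4.10 can be cross-checked with a short routine. This is a minimal sketch of the algorithm evalPostfix() under some assumptions of its own: operands are single capital letters whose values are supplied in an array (values[0] for A, values[1] for B, and so on), the ASCII character '-' stands for the minus sign, and exponents are non-negative whole numbers. The names evalPostfix() and power() as written here are illustrative and are not the book's program.

```c
#define MAX 20 /* stack capacity, as in the chapter's programs */

/* Raises base to a non-negative integer exponent by repeated multiplication. */
static double power(double base, int exponent)
{
    double result = 1.0;
    while (exponent-- > 0)
        result *= base;
    return result;
}

/* Evaluates a postfix expression. Blanks are skipped and ';' (or the end
   of the string) ends the expression. Returns 1 and stores the answer in
   *result, or returns 0 if the expression is malformed. */
int evalPostfix(const char *expr, const double values[26], double *result)
{
    double stack[MAX];
    int top = -1;

    for (; *expr != '\0' && *expr != ';'; expr++)
    {
        char c = *expr;
        if (c == ' ')
            continue;
        if (c >= 'A' && c <= 'Z')
        {
            if (top >= MAX - 1)
                return 0;                 /* the stack is full */
            stack[++top] = values[c - 'A'];
        }
        else
        {
            double right, left;
            if (top < 1)
                return 0;                 /* an operator needs two operands */
            right = stack[top--];         /* popped first: right hand operand */
            left = stack[top--];          /* popped second: left hand operand */
            switch (c)
            {
                case '+': left = left + right; break;
                case '-': left = left - right; break;
                case '*': left = left * right; break;
                case '/': if (right == 0) return 0;
                          left = left / right; break;
                case '^': left = power(left, (int) right); break;
                default: return 0;        /* unknown element */
            }
            stack[++top] = left;          /* push the result */
        }
    }
    if (top != 0)
        return 0;                         /* operands left over, or none at all */
    *result = stack[0];
    return 1;
}
```

With X = 1, Y = 5, Z = 2, P = 3, A = 15, B = 3 and C = 8, the call evalPostfix("X Y Z P ^ * + A B / C + - ;", values, &result) stores 28 in result, the same value that is popped at step 14 of the table.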
4.3 QUEUES
A queue is a very common arrangement in everyday life: the first person in the line is the
person to be served first, and the newcomers join at the end. In this kind of arrangement, all additions
are made at one end and all deletions at the other. The end where the additions are made is called the
rear end. The other end, from where the deletions are made, is called the front end as shown in Figure 4.8.
[Fig. 4.8 The queue, with its Front and Rear ends]
The queue is also a linear data structure of varying size in the sense that the size of a queue depends
upon the number of items currently present in it. With additions and/or deletions, the size of the queue
increases or decreases, respectively. Therefore, the queue is also a dynamic data structure with a
capacity to enlarge or shrink.
Consider the queue of processes shown in Figure 4.9. The operating system maintains a queue for
scheduled processes that are ready to run. The dispatcher picks the process from the front end of the
queue and dispatches it to CPU and a new process joins at the rear end of the queue.
[Fig. 4.9: the queue of processes. A new process joins at the rear end; the dispatched process leaves
from the front end for the CPU (ALU, CU).]
It may be noted that when the new processes join in, the size of the queue increases whereas the size
of the queue will reduce when processes finish their job and leave the system.
[Figure: the queue stored in an array with locations 0 to N − 1. It holds the items A B C D E F G;
Front marks the front end and Rear the rear end.]
Let us now add an item ‘H’ into the queue. Before the item is added, Rear is incremented by one to
point to the next vacant location as shown in Figure 4.11. The new item ‘H’ is added at the
location currently pointed to by Rear.
[Fig. 4.11: the queue after the addition, holding A B C D E F G H, with Rear at ‘H’.]
It may be noted that if more additions are made to the queue, then a stage would come when Rear =
N − 1, i.e., the last location of the array. Now, no more additions can be performed on the queue. This
condition (Rear = N − 1) is called Queue_full.
Let us now delete an item from the queue. Before an item is deleted or removed, Front is incremented
by one to point to the next location, i.e., the location containing the item ‘A’. The item (i.e., ‘A’) is
then deleted as shown in Figure 4.12.
[Fig. 4.12 Deletion of the item ‘A’ from the front end: the queue now holds B C D E F G H.]
It may be noted that if more deletions are done on this queue, then a stage would be reached when there
are no items in the queue, i.e., Front = Rear as shown in Figure 4.13. This condition (Front =
Rear) is called Queue_empty.
[Fig. 4.13: the empty queue, with Front and Rear at the same location.]
The algorithms for addition and deletion operations on the queue are given below. These algorithms
use an array called Queue[N] to represent a queue of size N locations. A variable called Front keeps track
of the front of the queue, i.e., the location from where deletions are made. Another variable called Rear
keeps track of the rear end of the queue, i.e., the location where additions are made. Another variable
called item is used to store the item to be added or deleted from the queue.
The algorithm for addition is given below:
Algorithm addQ()
{
Step
1. if (Rear >= N − 1) then {prompt (“Queue Full”); exit}
2. Rear = Rear + 1;
3. Queue [Rear] = item;
4. stop
}
It may be noted that the item is added to the queue (i.e., addition operation) provided the queue is
not already full, i.e., it is checked at step1 whether or not the queue has room for another item.
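The algorithm for deletion is given below:
Algorithm delQ()
{
Step
1. if (Front == Rear) then {prompt (“Queue Empty”); exit}
2. Front = Front + 1;
3. item = Queue [Front];
4. stop
}
Note that in the above algorithm, before removing an item from the queue, it is checked at step 1 to
see if there is at least one item on the queue that can be removed.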
The performance of the queue data structure for ‘n’ elements is given below:
• The space complexity is O(n).
• The size of the queue, implemented using an array, must be defined in advance, i.e., a priori.
otherwise, i.e., the unusual value ‘9999’ indicates that the queue was empty.
#include <stdio.h>
#define MAX 20 /* capacity of the array that holds the queue */
int addQ (int Queue[], int *Rear, int val, int size);
int delQ (int Queue[], int *Front, int *Rear);
void display (int Queue[], int Front, int Rear);
int main(void)
{
    int Queue[MAX]; /* the queue declaration */
    int Front;
    int Rear;
    int val = 0;
    int size;
    int choice = 0;
    int result;
    Front = Rear = 0; /* initially the queue set to be empty */
    printf ("\n Enter the size of the queue");
    if (scanf ("%d", &size) != 1 || size < 1 || size > MAX)
        size = MAX; /* fall back to the capacity of the array */
    size--; /* adjusted for index = 0 */
    /* create menu */
    do
    {
        printf ("\n Menu - Queue Operations");
        printf ("\n Add 1");
        printf ("\n Delete 2");
        printf ("\n Display queue 3");
        printf ("\n Quit 4");
        printf ("\n Enter your choice: ");
        if (scanf ("%d", &choice) != 1)
            break; /* stop on invalid or exhausted input */
        switch (choice)
        {
            case 1: printf ("\n Enter the value to be added");
                    scanf ("%d", &val);
                    result = addQ (Queue, &Rear, val, size);
                    if (result == 0)
                        printf ("\n The queue is full");
                    break;
            case 2: result = delQ (Queue, &Front, &Rear);
                    if (result == 9999)
                        printf ("\n The queue is empty");
                    else
                        printf ("\n The deleted value = %d", result);
                    break;
            case 3: display (Queue, Front, Rear);
                    break;
        }
    }
    while (choice != 4);
    return 0;
}
int addQ (int Queue[], int *Rear, int val, int size)
{
    if (*Rear >= size)
        return 0; /* the queue is Full */
    else
    {
        *Rear = *Rear + 1;
        Queue[*Rear] = val;
        return 1;
    }
}
int delQ (int Queue[], int *Front, int *Rear)
{
    int val;
    if (*Front == *Rear)
        return 9999; /* the queue is empty */
    else
    {
        *Front = *Front + 1;
        val = Queue[*Front];
        return val;
    }
}
void display (int Queue[], int Front, int Rear)
{
    int i;
    printf ("\n The contents of queue are:");
    for (i = Front + 1; i <= Rear; i++)
        printf ("%d ", Queue[i]);
}
Note:
(1) The drawback of a linear queue is that a stage may arrive when both Front and Rear are equal
and point to the last location of the array as shown in Figure 4.14.
[Fig. 4.14: Front and Rear both at location N − 1; the locations from 0 onwards are unutilized space.]
Now, the queue is useless because it is full (Rear >= N − 1) as well as empty (Front =
Rear). Thus, no additions and deletions can be performed. This situation can be handled by resetting
Front and Rear to the 0th location, i.e., Front = Rear = 0.
(2) The major drawback of a linear queue is that at a given time, all the locations to the left of Front
are always vacant and unutilized.
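The situation described in Note (1) can be reproduced with a few calls. In the sketch below, addQ() and delQ() repeat the linear queue routines of this section, where size is the index of the last location (N − 1); resetIfEmpty() is a hypothetical helper, not part of the book's program, that applies the suggested remedy Front = Rear = 0.

```c
/* Linear queue routines as used in this section.
   size is the index of the last array location (N - 1). */
int addQ(int Queue[], int *Rear, int val, int size)
{
    if (*Rear >= size)
        return 0;            /* the queue is full */
    *Rear = *Rear + 1;
    Queue[*Rear] = val;
    return 1;
}

int delQ(int Queue[], int *Front, int *Rear)
{
    if (*Front == *Rear)
        return 9999;         /* the queue is empty */
    *Front = *Front + 1;
    return Queue[*Front];
}

/* Hypothetical helper: once the queue has drained, move both markers
   back to location 0 so that the array can be used again.
   Returns 1 if the reset was done, 0 if items are still present. */
int resetIfEmpty(int *Front, int *Rear)
{
    if (*Front != *Rear)
        return 0;
    *Front = *Rear = 0;
    return 1;
}
```

For an array of four locations (size = 3), three additions followed by three deletions leave Front = Rear = 3: a further addQ() returns 0 although the queue holds nothing, and it succeeds again only after resetIfEmpty() has been called.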
[Figure: the circular queue after the item ‘C’ has been added at location 0; Rear has moved round to
location 0 while Front is at location N − 1.]
It may be noted that after more additions, a stage will come when Rear becomes equal to Front
(Rear = Front), indicating that the queue is full.
[Figure: the full circular queue holding C I R C U L A R in locations 0 to N − 1, with Rear having
caught up with Front.]
Let us now delete an element from the queue. Front is pointing to location N − 1. After incrementing
Front (i.e., Front = Front + 1 = N − 1 + 1 = N), we find that it is pointing to a location
equal to N, which does not exist. Therefore, Front is set to location 0, the beginning of the array, and the
item ‘C’ is deleted from this location pointed to by Front as shown in Figure 4.17.
It may be noted that after more deletions, a stage will come when Front becomes equal to Rear
(Front = Rear), indicating that the queue is empty.
[Fig. 4.17: the queue after the deletion of ‘C’, holding I R C U L A R, with Front at location 0.]
It may be further noted that a very interesting situation has turned up, i.e., for both events—queue
‘full’ and ‘empty’—the condition to be tested is (Rear = Front). This conflict can be resolved by the
following decisions:
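• The queue empty condition remains the same, i.e., ‘Front == Rear’.
• The queue full condition becomes ‘(Rear + 1) % N == Front’.
• In this arrangement one location, being pointed to by Front, shall remain vacant.
• The following formulae would be used for automatic movement of the variables ‘Front’ and
‘Rear’ within the array: Front = (Front + 1) % N and Rear = (Rear + 1) % N.
The algorithm for addition is given below:
Algorithm addQ()
{
Step
1. if ((Rear + 1) % N == Front) then {prompt (“Queue Full”); exit}
2. Rear = (Rear + 1) % N;
3. Queue [Rear] = item;
4. stop
}
It may be noted that the item is added to the queue (i.e., addition operation) provided the queue is
not already full, i.e., at step 1 it is checked whether or not the queue has room for another item.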
The algorithm for deletion is given below:
Algorithm delQ()
{
Step
1. if (Front == Rear) then {prompt (“Queue Empty”); exit}
2. Front = (Front + 1) % N;
3. Item = Queue [Front];
4. return Item;
5. stop
}
Note that in the above algorithm, before removing an item from the queue, it is checked at step 1 to
see if there is at least one item on the queue that can be removed.
Example 10: Write a program in ‘C’ that uses the following menu to simulate the circular queue opera-
tions on items of int type.
Solution: The above defined algorithms addQ() and delQ() for circular queue would be employed to
write the program. The menu would be created using printf () statements and displayed within a
do-while loop.
Note:
• The addQ() function would return 1 if the operation is successful and 0 otherwise, i.e., 0
indicates that the queue was full.
• The delQ() function would return the deleted value if the operation is successful and 9999
otherwise, i.e., the unusual value ‘9999’ indicates that the queue was empty.
#include <stdio.h>
#define MAX 20 /* capacity of the array that holds the queue */
int addQ (int cirQ[], int *Front, int *Rear, int val, int size);
int delQ (int cirQ[], int *Front, int *Rear, int size);
void display (int cirQ[], int Front, int Rear, int size);
int main(void)
{
    int cirQ[MAX]; /* the queue declaration */
    int Front;
    int Rear;
    int val = 0;
    int size;
    int choice = 0;
    int result;
    Front = Rear = 0; /* initially the queue set to be empty */
    printf ("\n Enter the size of the queue");
    if (scanf ("%d", &size) != 1 || size < 2 || size > MAX)
        size = MAX; /* fall back to the capacity of the array */
    /* create menu */
    do
    {
        printf ("\n Menu - Circular queue operations");
        printf ("\n Add 1");
        printf ("\n Delete 2");
        printf ("\n Display queue 3");
        printf ("\n Quit 4");
        printf ("\n Enter your choice: ");
        if (scanf ("%d", &choice) != 1)
            break; /* stop on invalid or exhausted input */
        switch (choice)
        {
            case 1: printf ("\n Enter the value to be added");
                    scanf ("%d", &val);
                    result = addQ (cirQ, &Front, &Rear, val, size);
                    if (result == 0)
                        printf ("\n The queue is full");
                    break;
            case 2: result = delQ (cirQ, &Front, &Rear, size);
                    if (result == 9999)
                        printf ("\n The queue is empty");
                    else
                        printf ("\n The deleted value = %d", result);
                    break;
            case 3: display (cirQ, Front, Rear, size);
                    break;
        }
    }
    while (choice != 4);
    return 0;
}
int addQ (int cirQ[], int *Front, int *Rear, int val, int size)
{
    if (((*Rear + 1) % size) == *Front)
        return 0; /* the queue is Full */
    else
    {
        *Rear = (*Rear + 1) % size;
        cirQ[*Rear] = val;
        return 1;
    }
}
int delQ (int cirQ[], int *Front, int *Rear, int size)
{
    int val;
    if (*Front == *Rear)
        return 9999; /* the queue is empty */
    else
    {
        *Front = (*Front + 1) % size; /* move round to the next location */
        val = cirQ[*Front];
        return val;
    }
}
void display (int cirQ[], int Front, int Rear, int size)
{
    int i;
    printf ("\n The contents of queue are:");
    i = Front;
    while (i != Rear)
    {
        i = (i + 1) % size;
        printf ("%d ", cirQ[i]);
    }
}
It may be noted that the circular queue has been implemented in a one-dimensional array, which is
linear by nature. We only view the array to be a circle in the sense that it is imagined that the last location
of the array is immediately followed by the 0th location as shown in Figure 4.18.
The shaded area shows the queue elements. When the queue is full, then it satisfies the condition:
(Rear + 1) % N == Front. The queue empty condition is indicated by the condition: Front ==
Rear. However, at least one location remains vacant all the time.
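The consequence of the vacant location, namely that an array of N locations holds at most N − 1 items, can be verified with a few calls. The sketch below repeats the circular queue routines addQ() and delQ() in compact form; countQ() is a hypothetical helper, not part of the book's program, that computes the number of items from Front and Rear.

```c
/* Circular queue routines; size is the number of array locations (N). */
int addQ(int cirQ[], int *Front, int *Rear, int val, int size)
{
    if ((*Rear + 1) % size == *Front)
        return 0;                     /* the queue is full */
    *Rear = (*Rear + 1) % size;
    cirQ[*Rear] = val;
    return 1;
}

int delQ(int cirQ[], int *Front, int *Rear, int size)
{
    if (*Front == *Rear)
        return 9999;                  /* the queue is empty */
    *Front = (*Front + 1) % size;
    return cirQ[*Front];
}

/* Hypothetical helper: number of items currently in the queue.
   Adding size before taking the remainder keeps the value
   non-negative when Rear has wrapped round behind Front. */
int countQ(int Front, int Rear, int size)
{
    return (Rear - Front + size) % size;
}
```

With N = 5, four additions succeed and the fifth returns 0, at which point countQ() gives 4. After one deletion, a further addition succeeds by placing the item at location 0, which shows the wrap-around at work.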
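[Fig. 4.18: the array viewed as a circle in which location N − 1 is followed by location 0; Front and
Rear mark the ends of the shaded queue elements.]
[Figure: a priority queue of processes. A new process joins the priority queue and the process with
the highest priority is dispatched to the CPU (ALU, CU).]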
Consider the job pool given in Table 4.11. The processes opened by the operating system have been
assigned priorities.
Table 4.11 Job pool
Process no.   Priority
P8            2
P5            1
P7            9
P4            6
P2            5
Now, the scheduler shall create a priority queue that can store a list of pairs (Process No., priority).
It has two possible choices to store the pairs as shown in Figure 4.20.
Choice (a): The addition operation simply adds the newly arrived job at the rear end, i.e., in the
order of its arrival. This operation is O(1).
The deletion operation requires the process with the highest priority to be searched and removed,
P7 in this case. This operation is O(n) for n elements in the queue as we have to traverse the queue
to find the element.
Choice (b): The addition operation requires the new process to be inserted at its proper position
in the queue according to its priority. This operation is O(n) for n elements in the queue as the
location has to be found where the insertion can take place.
The deletion operation simply removes the process from the front of the queue because the process
with the highest priority is already at the front of the queue. This operation is O(1).
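For comparison, choice (a) can be sketched as follows. This is an illustration only, not the implementation developed in this section: the names addUnsorted() and delHighest(), the capacity N and the use of a simple item count in place of Front and Rear are all assumptions of this sketch.

```c
#include <string.h>

#define N 10 /* assumed capacity of the job pool */

struct proc
{
    char process[3];
    int priority;
};

/* Choice (a), addition: store the job at the rear in arrival order, O(1). */
int addUnsorted(struct proc pQ[], int *count, const char *name, int priority)
{
    if (*count >= N)
        return 0;                              /* the queue is full */
    strncpy(pQ[*count].process, name, 2);      /* names such as "P7" */
    pQ[*count].process[2] = '\0';
    pQ[*count].priority = priority;
    (*count)++;
    return 1;
}

/* Choice (a), deletion: traverse the queue for the highest priority, O(n). */
int delHighest(struct proc pQ[], int *count, struct proc *val)
{
    int i, best = 0;
    if (*count == 0)
        return 0;                              /* the queue is empty */
    for (i = 1; i < *count; i++)
        if (pQ[i].priority > pQ[best].priority)
            best = i;
    *val = pQ[best];
    for (i = best; i < *count - 1; i++)        /* close the gap, keeping arrival order */
        pQ[i] = pQ[i + 1];
    (*count)--;
    return 1;
}
```

When the five jobs of Table 4.11 are added in the order shown, the first call to delHighest() removes P7 (priority 9) and the second removes P4 (priority 6).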
Let us now use the choice (b) to implement a priority queue for processes called ‘proc’ of following
structure type
struct proc
{
char process [3];
int priority;
};
A function called addPq() inserts an ith process ‘pi’ with priority ‘pr’ at its proper position in the
priority queue represented using the array pQ[N] where N is the size of the queue. A variable ‘Rear’
points to rear end of the priority queue, i.e., the least priority process.
A function called delPq() removes a process from the current ‘Front’ of the priority queue.
The required program is given below:
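/* This program simulates priority queue operations */
#include <stdio.h>
#define MAX 20 /* capacity of the array that holds the queue */
struct proc {
    char process[3];
    int priority;
};
int addPq (struct proc pQ[], int Front, int *Rear, struct proc val, int size);
int delPq (struct proc pQ[], int *Front, int *Rear, struct proc *val);
void display (struct proc pQ[], int Front, int Rear);
int main(void)
{
    struct proc pQ[MAX]; /* the priority queue declaration */
    struct proc val = {"", 0};
    int Front;
    int Rear;
    int size;
    int choice = 0;
    int result;
    Front = Rear = 0; /* initially the queue set to be empty */
    printf ("\n Enter the size of the queue");
    if (scanf ("%d", &size) != 1 || size < 2 || size > MAX)
        size = MAX; /* fall back to the capacity of the array */
    /* create menu */
    do
    {
        printf ("\n Menu - Priority queue operations");
        printf ("\n Add 1");
        printf ("\n Delete 2");
        printf ("\n Display queue 3");
        printf ("\n Quit 4");
        printf ("\n Enter your choice: ");
        if (scanf ("%d", &choice) != 1)
            break; /* stop on invalid or exhausted input */
        switch (choice)
        {
            case 1: printf ("\n Enter the process number and its priority to be added");
                    /* %2s keeps the name within process[3] */
                    scanf ("%2s %d", val.process, &val.priority);
                    result = addPq (pQ, Front, &Rear, val, size);
                    if (result == 0)
                        printf ("\n The queue is full");
                    break;
            case 2: result = delPq (pQ, &Front, &Rear, &val);
                    if (result == 0)
                        printf ("\n The queue is empty");
                    else
                        printf ("\n The deleted proc = %s with priority %d",
                                val.process, val.priority);
                    break;
            case 3: display (pQ, Front, Rear);
                    break;
        }
    }
    while (choice != 4);
    return 0;
}
int addPq (struct proc pQ[], int Front, int *Rear, struct proc val, int size)
{
    int i;
    if ((*Rear + 1) >= size)
        return 0; /* the queue is Full */
    else
    {
        *Rear = (*Rear + 1);
        i = *Rear - 1;
        /* shift the lower priority processes towards the rear;
           stop at Front so that deleted locations are never examined */
        while (i > Front && val.priority > pQ[i].priority)
        {
            pQ[i+1] = pQ[i];
            i = i - 1;
        }
        i++;
        pQ[i] = val;
    }
    return 1;
}
int delPq (struct proc pQ[], int *Front, int *Rear, struct proc *val)
{
    if (*Front == *Rear)
        return 0; /* the queue is empty */
    else
    {
        *Front = (*Front + 1);
        *val = pQ[*Front];
        return 1;
    }
}
void display (struct proc pQ[], int Front, int Rear)
{
    int i;
    printf ("\n The contents of queue are:");
    i = Front;
    while (i != Rear)
    {
        i = i + 1;
        printf ("(%s , %d) ", pQ[i].process, pQ[i].priority);
    }
}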
It may be noted that in order to keep it simple, the priority queue has been implemented as a linear
queue. A sample input is given below:
Enter the process number and its priority to be added P4 7
where ‘P4’ is the name of the process and ‘7’ is its priority.
[Figure: a deque. Items can be added to and removed from both the Front end and the Rear end.]
Some programmers view a deque as two stacks connected back to back as shown in Figure 4.22.
For deque arrangement shown in Figure 4.22, the following four operations need to be
designed:
(1) pushRear(): adds an item at the rear end
(2) popRear(): removes an item from the rear end
(3) pushFront(): adds an item at the front end
(4) popFront(): removes an item from the front end
[Fig. 4.22: a deque viewed as two stacks connected back to back, with push and pop operations at both
the Front end and the Rear end.]
Example 11: Write a program in ‘C’ that uses the following menu to simulate the deque operations on
items of int type.
Solution: A one-dimensional array called dQ[] would be used to implement the deque wherein the
following four functions would be used. The algorithms for these functions are same as used in a linear
queue.
(1) addRear(): adds an item at the rear end
(2) delRear(): removes an item from the rear end
(3) addFront(): adds an item at the front end
(4) delFront(): removes an item from the front end
The menu would be created using printf() statements and displayed within a do-while loop.
#include <stdio.h>
#define MAX 20 /* capacity of the array that holds the deque */
int addRear (int dQ[], int *Rear, int val, int size);
int addFront (int dQ[], int *Front, int val);
int delRear (int dQ[], int *Rear, int *Front, int *val);
int delFront (int dQ[], int *Front, int *Rear, int *val);
void display (int dQ[], int Front, int Rear);
int main(void)
{
    int dQ[MAX]; /* the queue declaration */
    int Front;
    int Rear;
    int val = 0;
    int size;
    int choice = 0;
    int result;
    Front = Rear = 0; /* initially the queue set to be empty */
    printf ("\n Enter the size of the queue");
    if (scanf ("%d", &size) != 1 || size < 1 || size > MAX)
        size = MAX; /* fall back to the capacity of the array */
    size--; /* adjusted for index = 0 */
    /* create menu */
    do
    {
        printf ("\n Menu - dQue Operations");
        printf ("\n Add item at Rear 1");
        printf ("\n Add item at Front 2");
        printf ("\n Delete item from Rear 3");
        printf ("\n Delete item from Front 4");
        printf ("\n Display deque 5");
        printf ("\n Quit 6");
        printf ("\n Enter your choice: ");
        if (scanf ("%d", &choice) != 1)
            break; /* stop on invalid or exhausted input */
        switch (choice)
        {
            case 1: printf ("\n Enter the value to be added");
                    scanf ("%d", &val);
                    result = addRear (dQ, &Rear, val, size);
                    if (result == 0)
                        printf ("\n The queue is full");
                    break;
            case 2: printf ("\n Enter the value to be added");
                    scanf ("%d", &val);
                    result = addFront (dQ, &Front, val);
                    if (result == 0)
                        printf ("\n The queue is full");
                    break;
            case 3: result = delRear (dQ, &Rear, &Front, &val);
                    if (result == 0)
                        printf ("\n The queue is empty");
                    else
                        printf ("\n The deleted value = %d", val);
                    break;
            case 4: result = delFront (dQ, &Front, &Rear, &val);
                    if (result == 0)
                        printf ("\n The queue is empty");
                    else
                        printf ("\n The deleted value = %d", val);
                    break;
            case 5: display (dQ, Front, Rear);
                    break;
        }
    }
    while (choice != 6);
    return 0;
}
int addRear (int dQ[], int *Rear, int val, int size)
{
    if (*Rear >= size)
        return 0; /* the queue is full */
    else
    {
        *Rear = *Rear + 1;
        dQ[*Rear] = val;
        return 1;
    }
}
int addFront (int dQ[], int *Front, int val)
{
    if (*Front <= 0)
        return 0; /* the queue is full */
    else
    {
        dQ[*Front] = val; /* Front points to the vacant location */
        *Front = *Front - 1;
        return 1;
    }
}
int delRear (int dQ[], int *Rear, int *Front, int *val)
{
    if (*Rear == *Front)
        return 0; /* the queue is empty */
    else
    {
        *val = dQ[*Rear];
        *Rear = *Rear - 1;
        return 1;
    }
}
int delFront (int dQ[], int *Front, int *Rear, int *val)
{
    if (*Front == *Rear)
        return 0; /* the queue is empty */
    else
    {
        *Front = *Front + 1;
        *val = dQ[*Front];
        return 1;
    }
}
void display (int dQ[], int Front, int Rear)
{
    int i;
    printf ("\n The contents of queue are:");
    for (i = Front + 1; i <= Rear; i++)
        printf ("%d ", dQ[i]);
}
Example 12: The items can be iteratively removed from both ends of a deque and compared with
each other. Therefore, the deque can easily be used to determine whether a given string is a
palindrome or not. Write a program in ‘C’ that uses a deque to determine whether a given string is a
palindrome or not.
Solution: An array called String[N] would be used to store the string. As one location in a queue
necessarily remains vacant, an array called dQ[N+1] would be used to implement the deque. The functions
developed in Example 11 would be used to do the required task, i.e., to determine whether the input
string is a palindrome or not.
#include <stdio.h>
#include <conio.h>
int addRear (char dQ[], int *Rear, char val, int size);
int delRear (char dQ[], int *Rear, int *Front, char *val);
int delFront (char dQ[], int *Front, int *Rear, char *val);
int addFront (char dQ[], int *Front, char val);
i =0;
size=10; /* adjusted for index = 0 */
Front = Rear = 0; /* initially deque is empty */
size = i − 1;
flag = 1; /* Assuming that the string is
while ( flag && result)
{
result = delRear(dQ, &Rear, &Front, &val1);
{
if (*Front <= 0)
return 0; /* the queue is full */
else
{
dQ[*Front] = val;
*Front = *Front - 1;
return 1;
}
}
int delRear (char dQ[], int *Rear, int *Front, char *val)
{
    if (*Rear == *Front)
        return 0; /* the queue is empty */
    else
    {
        *val = dQ[*Rear];
        /* If the size of string is odd then do not decrement Rear */
        if ((*Rear - 1) > *Front) *Rear = *Rear - 1;
        return 1;
    }
}
int delFront (char dQ[], int *Front, int *Rear, char *val)
{
    if (*Front == *Rear)
    {
        return 0; /* the queue is empty */
    }
    else
    {
        *Front = *Front + 1;
        *val = dQ[*Front];
        return 1;
    }
}
void display (char Queue[], int Front, int Rear)
{
    int i;
    printf ("\n The contents of queue are:");
    for (i = Front + 1; i <= Rear; i++)
        printf ("%c ", Queue[i]);
}
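The idea of Example 12, i.e., removing items from both ends of the deque and comparing them, can also be sketched in a compact form. The function below is an illustration under its own assumptions and not the program of Example 12: the name isPalindrome() and the capacity N are hypothetical, the deque operations are written inline rather than as separate functions, and the loop stops as soon as fewer than two items remain, so that the middle character of an odd-sized string needs no special treatment.

```c
#include <string.h>

#define N 80 /* assumed maximum length of the string */

/* Returns 1 if s is a palindrome, 0 if it is not,
   and -1 if s is too long to fit into the deque. */
int isPalindrome(const char *s)
{
    char dQ[N + 1];              /* location 0 remains vacant */
    int Front = 0, Rear = 0;     /* the deque is initially empty */
    size_t len = strlen(s);
    size_t i;

    if (len > N)
        return -1;
    for (i = 0; i < len; i++)
        dQ[++Rear] = s[i];       /* add each character at the rear */

    while (Rear - Front >= 2)    /* at least two items remain */
    {
        char fromFront = dQ[++Front];  /* delete from the front */
        char fromRear = dQ[Rear--];    /* delete from the rear */
        if (fromFront != fromRear)
            return 0;
    }
    return 1;
}
```

For instance, isPalindrome("madam") and isPalindrome("abba") return 1, whereas isPalindrome("abca") returns 0.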
[Fig. 4.23 Two stacks that grow towards middle of the array: Top 1 starts from the left end and
Top 2 from the right end.]
#include <stdio.h>
#include <conio.h>
int val;
int choice;
int result;
/* create menu */
do
{
clrscr();
printf ("\n Menu - twoStack Operations");
printf ("\n PushStack1 1");
printf ("\n PushStack2 2");
printf ("\n PopStack1 3");
printf ("\n PopStack2 4");
printf ("\n DispStack1 5");
printf ("\n DispStack2 6");
printf ("\n Quit 7");
switch (choice)
{
case 1: printf ("\n Enter the value to be pushed");
scanf ("%d", &val);
result = pushStack1(twoStack, &top1, &top2, val);
if (result == 0)
int pushStack1 (int stack[], int *top1, int *top2, int val)
{
if (*top1 + 1 == *top2)
return 0; /* the stack is full */
else
{
*top1= *top1 + 1;
stack[*top1]= val;
return 1;
}
}
int pushStack2 (int stack[],int *top1, int *top2, int val)
{
if (*top1 + 1 == *top2)
return 0; /* the stack is full */
Stacks and Queues 195
else
{
*top2= *top2 − 1;
stack[*top2]= val;
return 1;
}
}
int popStack1 (int stack[], int *top1)
{int val;
if (*top1 < 0)
return 9999; /* the stack is empty */
else
{
val = stack[*top1];
*top1 = *top1 − 1;
return val;
}
}
int popStack2 (int stack[], int *top2)
{ int val;
if (*top2 >10)
return 9999; /* the stack is empty */
else
{
val = stack[*top2];
*top2 = *top2 + 1;
return val;
}
}
void dispStack1 (int stack[], int top1)
{
int i;
printf ("\n The contents of stack1 are:");
for (i = top1; i >= 0; i--)
printf ("%d ", stack[i]);
}
void dispStack2 (int stack[], int top2)
{
int i;
printf ("\n The contents of stack2 are:");
for (i = top2; i <= 10; i++)
printf ("%d ", stack[i]);
}
EXERCISES
1. Give static implementation of stack by writing push and pop routine for it.
2. Explain overflow and underflow conditions of a stack with examples.
3. How can a stack be used in checking the well-formedness of an expression, i.e., the balance of
the left and right parentheses of the expression?
4. Write down an algorithm to implement two stacks using only one array. Your stack routine
should not report an overflow unless every slot in the array is used.
5. Write a non-recursive program to reverse a string using stack.
6. Explain prefix, infix, and postfix expressions with examples.
7. Describe a method to convert an infix expression in to a postfix expression with the help
of a suitable example.
8. Translate the following infix expressions to their equivalent postfix expressions:
• (x + y − z)/(h + k)*s
• j − k/g^h + (n + m)
CHAPTER OUTLINE
• 5.1 Introduction
5.1 INTRODUCTION
Pointers are among the most powerful features of ‘C’. Beginners often find pointers hard to understand and
manipulate. To remove the mystery around them, the basics of pointers, which generally
remain in the background, are discussed in the following sections.
int x = 15;
y = &x;
From Figure 5.3, it is clear that y is the variable that contains the address (i.e., 2712) of another
variable x, whereas the address of y itself is 1232 (assumed value). In other words, we can say that y is a
pointer to variable x. See Figure 5.4.
Fig. 5.4 y is a pointer to x
A pointer variable y can be declared in ‘C’ as shown below:
int *y;
The above declaration means that y is a pointer to a variable of type int. Similarly, consider the
following declaration:
Indicates to be a pointer
#include <stdio.h>
int main(void)
{
    int x = 15;
    int *y;
    y = &x;
    printf ("\n Value of x = %d", x);
    /* %p prints an address; the exact format depends on the system */
    printf ("\n Address of x = %p", (void *) &x);
    printf ("\n Value of x = %d", *y);
    printf ("\n Address of x = %p", (void *) y);
    printf ("\n Address of y = %p", (void *) &y);
    return 0;
}
Value of x = 15
Address of x = 2712
Value of x = 15
Address of x = 2712
Address of y = 1232
Thus, a pointer of float type will point to an address of 4 bytes of location and, therefore, an
increment to this pointer will increment its contents by 4 locations.
Table 5.1 Amount of storage taken by data types
Data type     Amount of storage
character     1 byte
integer       2 bytes
float         4 bytes
long          4 bytes
double        8 bytes
It may be further noted that more than one pointer can point to the same location.
Consider the following program segment:
int x = 37;
int *p1, *p2;
The following statement:
p1 = &x;
makes the pointer p1 point to variable x as shown in Figure 5.8.
Fig. 5.8 Pointer p1 points to variable x
The following statement:
p2 = p1;
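The two points made above, that incrementing a pointer moves it by the size of the type it points to and that two pointers may refer to the same location, can be checked with a minimal sketch. The function names float_step and write_through_alias are illustrative and do not come from the text. The step is compared with sizeof(float) rather than a fixed 4 because the sizes in Table 5.1 depend on the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes a float pointer moves when it is incremented once. */
size_t float_step(void)
{
    float data[2] = {1.5f, 2.5f};
    float *p = &data[0];
    float *q = p;
    q++;                                     /* q now points to data[1] */
    assert(q == &data[1]);
    return (size_t) ((char *) q - (char *) p);
}

/* After p2 = p1, both pointers refer to the same variable x. */
int write_through_alias(void)
{
    int x = 37;
    int *p1, *p2;
    p1 = &x;
    p2 = p1;        /* p2 now points to x as well */
    *p2 = 40;       /* changes x itself */
    return *p1;     /* reads the new value through p1 */
}
```

Calling float_step() should give sizeof(float), and write_through_alias() should return 40, since the write through p2 is visible through p1.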
#include <stdio.h>
main()
{
int x, y;
int *p1, *p2, *p3; /* pointers to integers */
Fig. 5.10 Pointers p1 and p2 with variables x and y: panels (a) and (b)
Note: The output would show the exchanged values because of exchange of pointers (see Figure 5.10)
whereas the contents of x and y have remained unaltered.
unpredictable. The ‘C’ compiler does not consider it as an error, though some compilers may give a
warning.
Therefore, following steps must be followed if pointers are being used:
Steps:
(1) Allocate the pointer.
(2) Allocate the variable or entity to which the pointer is to be pointed.
(3) Point the pointer to the entity.
Let us assume that it is desired that the pointer ptr should point to a variable val of type int. The
correct code in that case would be as given below:
int *ptr;
int val;
ptr = &val;
*ptr = 50;
The result of the above program segment is shown in Figure 5.12.
Fig. 5.12 The correct assignment
This array will be stored in the contiguous locations of the main memory with starting address
equal to 1001(assumed value), as shown in Figure 5.13. Since the array called ‘list’ is of integer type,
each element of this array occupies two bytes. The name of the array contains the starting address of
the array, i.e., 1001.
list 20 30 35 36 39
Fig. 5.13 The contiguous storage allocation with starting address = 1001
Once the above program segment is executed, the output would be as shown below:
Address of zeroth element of array list = 1001
Value of zeroth element of array list = 20
We could have achieved the same output by the following program segment also:
printf ("\n Address of zeroth element of array list = %p", (void *) &list[0]);
printf ("\n Value of zeroth element of array list = %d", list[0]);
Both the above given approaches are equivalent because of the following equivalence relations:
list ≡ &list [0] - both denote address of zeroth element of the array list
*list ≡ list[0] - both denote value of zeroth element of the array list
The left side approach is known as pointer method and right side as array indexing method. Let us
now write a program that prints out a list by array indexing method.
/* Array indexing method */
#include <stdio.h>
int main(void)
{
    static int list[] = {20, 30, 35, 36, 39};
    int i;
    printf ("\n The list is ...");
    for (i = 0; i < 5; i++)
        printf ("\n %d %d ---element", list[i], i);
    return 0;
}
The above program can also be written by pointer method as shown below:
/* Pointer method of processing an array */
#include <stdio.h>
int main(void)
{
    static int list[] = {20, 30, 35, 36, 39};
    int i;
    printf ("\n The list is ...");
    for (i = 0; i < 5; i++)
        printf ("\n %d %d ---element", *(list + i), i);
    return 0;
}
It may be noted in the above program that we have used the term * (list + i) to print an ith
element of the array. Let us analyse this term. Since list designates the address of zeroth element of the array,
we can access its value through value at address operator, i.e., `*´. The following terms are equivalent:
−> *list ≡ *(list + 0) ≡ list [0]
−> *(list + 1) ≡ list [1] and so on
Thus, we can refer to ith element of array list by either of the following ways:
*(list + i) or *(i + list) or list [i]
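As a small check of the equivalence stated above, the following sketch compares the three notations for every index. The function name notations_agree is an illustrative choice, not part of the text.

```c
#include <assert.h>

/* Returns 1 when list[i], *(list + i) and *(i + list) read the same
   element for every index i from 0 to n - 1, otherwise 0. */
int notations_agree(const int list[], int n)
{
    int i;
    assert(list != 0 && n >= 0);
    for (i = 0; i < n; i++) {
        if (list[i] != *(list + i) || list[i] != *(i + list))
            return 0;
    }
    return 1;
}
```

For the array {20, 30, 35, 36, 39} the function returns 1, because all three forms compute the same address before reading the value.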
So far, we have used the name of an array to get its base address and manipulate it. However, there is
a small difference between an ordinary pointer and the name of an array. The difference is that the array
name is a pointer constant and its contents cannot be changed. Thus, the following operations on list are
illegal because they try to change the contents of list, a pointer constant.
list = NULL; /* Not allowed */
list = &val; /* Not allowed */
list++; /* Not allowed */
list--; /* Not allowed */
It may be noted that the above declarations are not at all equivalent though the behaviour may be the same.
In fact, the array declaration ‘text [6]’ asks the compiler for six locations of char type whereas the
pointer declaration char *p asks for a pointer that can point to any of the following:
(i) char – a character
(ii) string – a string of characters
(iii) NULL – nowhere
Consider the following declarations:
int list[] = {20, 30, 35, 36, 39};
int *p; /* pointer variable p */
p = list; /* assign the starting address of array list to pointer p */
Since p is a pointer variable and has been assigned the starting address of array list, the following
operations become perfectly valid on this pointer:
p++, p−− etc.
Let us now write a third version of the program that prints out the array. We will use pointer variable
p in this program.
/* Pointer variable method of processing an array */
#include <stdio.h>
int main(void)
{
    static int list[] = {20, 30, 35, 36, 39};
    int *p;
    int i = 0;
    p = list; /* Assign the starting address of the list */
    printf ("\n The list is ...");
    while (i < 5)
    {
        printf ("\n %d %d ---element", *p, i);
        i++;
        p++; /* increment pointer */
    }
    return 0;
}
The output of the above program would be as shown below:
20 0………element
30 1………element
35 2………element
36 3………element
39 4………element
From our discussion on arrays, we know that the strings are stored and manipulated as array of
characters with last character being a null character (i.e., ‘\0’).
For example, the string ENGINEERING can be stored in an array (say text) as shown in Figure 5.14.
0 1 2 3 4 5 6 7 8 9 10 11
Text E N G I N E E R I N G \0
We can declare this string as a normal array by the following array declaration:
char text [12]; /* 11 characters and the terminating '\0' */
Consider the following declaration:
char *p;
p = text;
In the above set of statements, we have declared a pointer p that can point to a character or string
of characters. The next statement assigns the starting address of character string `text´ to the variable
pointer p (see Figure 5.15).
0 1 2 3 4 5 6 7 8 9 10 11
Text E N G I N E E R I N G \0
Since text is a pointer constant, its contents cannot be changed. On the contrary, p is a pointer variable
and can be manipulated like any other pointer as shown in the program given below:
/* This program illustrates the usage of pointer to a string */
#include <stdio.h>
main()
{
char text[] = “ENGINEERING”; /* The string */
char *p; /* The pointer */
Once the above declarations are made, we can assign the address of the structure variable abc to
the structure pointer ptr by the following statement:
ptr = &abc;
Since ptr is a pointer to the structure variable abc, the members of the structure can also be
accessed through a special operator called arrow operator, i.e., ‘->’ (minus sign followed by greater
than sign). For example, the members a and y of the structure variable abc pointed by ptr can be
assigned values 30 and 50.9, respectively, by the following statements:
ptr->a = 30;
ptr->y = 50.9;
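The arrow operator is shorthand for dereferencing the pointer and then applying the dot operator. A minimal sketch follows. The structure tag sample and the function fill are hypothetical, and the member types (an int a and a float y) are assumed from the values 30 and 50.9 used above.

```c
#include <assert.h>

struct sample {      /* hypothetical layout: int member a, float member y */
    int a;
    float y;
};

/* Assigns both members through a pointer to the structure. */
void fill(struct sample *ptr)
{
    assert(ptr != 0);
    ptr->a = 30;         /* arrow operator */
    (*ptr).y = 50.9f;    /* same kind of access written with * and the dot operator */
}
```

After struct sample abc; fill(&abc); the variable abc holds a equal to 30 and y approximately equal to 50.9.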
Solution: We will use a pointer ptr to point to the structure called ‘item’. The required program is given
below:
/* This program demonstrates the usage of an arrow operator */
#include <stdio.h>
int main(void)
{
    struct item {
        char code[5];
        int Qty;
        float cost;
    };
    struct item item_rec; /* Define a variable of struct type */
    struct item *ptr; /* Define a pointer of type struct */
    /* Read data through dot operator */
    printf ("\n Enter the data for an item");
    /* code is an array, so no & is needed; %4s leaves room for '\0' */
    printf ("\nCode:"); scanf ("%4s", item_rec.code);
    printf ("\nQty:"); scanf ("%d", &item_rec.Qty);
    printf ("\nCost:"); scanf ("%f", &item_rec.cost);
    /* Assign the address of item_rec */
    ptr = &item_rec;
    /* Print data through arrow operator */
    printf ("\n The data for the item...");
    printf ("\nCode : %s", ptr->code);
    printf ("\nQty : %d", ptr->Qty);
    printf ("\nCost : %5.2f", ptr->cost);
    return 0;
}
From the above program, we can see that the members of a static structure can be accessed by both
dot and arrow operators. However, dot operator is used for simple variable whereas the arrow operator
is used for pointer variables.
Example 5: What is the output of the following program?
#include <stdio.h>
main()
{
struct point
{
int x, y;
} polygon[] = {{1,2},{1,4},{2,4},{2,2}};
struct point *ptr = polygon; /* ptr points to the zeroth element of polygon */
ptr++;
ptr->x++;
printf ("\n %d", ptr->x);
}
Solution: From the above program, it can be observed that ‘ptr’ is a pointer to an array of structures
called polygon. The statement ptr++ moves the pointer from zeroth location (i.e., {1, 2}) to first
location (i.e., {1, 4}). Now, the statement ptr->x++ increments the field x of that structure by one (i.e., 1
is incremented to 2).
The output would be 2.
free (p_var);
Let us consider pointers shown in Figure 5.23. If it is desired to return the dynamic variable pointed
by pointer Bptr, we can do so by the following statement:
free (Bptr);
Once the above statement is executed, the dynamic variable is returned back to the system and we
are left with two dangling pointers Bptr and Aptr as shown in Figure 5.24. The reason for this is obvi-
ous because both were pointing to the same location. However, the dynamic variable containing value
15 is anonymous and not available anymore and, hence, a total waste of memory.
Fig. 5.24 The two dangling pointers and a lost dynamic memory variable
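One common defensive habit, offered here only as a sketch and not as a method prescribed by the text, is to reset every pointer that referred to the freed location to NULL. The helper release_shared is a hypothetical name. It covers only the dangling-pointer part of the situation and cannot recover a variable that has already been lost.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees the dynamic variable shared by *Aptr and *Bptr exactly once and
   resets both pointers, so that neither is left dangling. */
void release_shared(int **Aptr, int **Bptr)
{
    assert(Aptr != NULL && Bptr != NULL && *Aptr == *Bptr);
    free(*Bptr);     /* the memory is returned to the system */
    *Bptr = NULL;    /* Bptr no longer dangles */
    *Aptr = NULL;    /* Aptr pointed to the same location, so reset it too */
}
```

A later test such as if (Aptr != NULL) then guards against accidental use of the released memory.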
Example 6: Write a program that dynamically allocates an integer. It initializes the integer with a value,
increments it, and prints the incremented value.
Solution: The required program is given below:
#include <stdio.h>
#include <stdlib.h> /* malloc() and free(); Turbo C also offers them through <alloc.h> */
int main(void)
{
    int *p; /* A pointer to an int */
    int Sz;
    /* Compute the size of the int */
    Sz = sizeof (int);
    /* Allocate a dynamic int pointed by p */
    p = (int *) malloc (Sz);
    if (p == NULL) return 1; /* no memory available */
    printf ("\n Enter a value :");
    scanf ("%d", p);
    *p = *p + 1;
    /* Print the incremented value */
    printf ("\n The Value = %d", *p);
    free (p);
    return 0;
}
Example 7: Write a program that dynamically allocates a structure whose structure diagram is given
below. It reads the various members of the structure and prints them.
Solution: We will use the following steps to write the required program:
(1) Declare a structure called student.
(2) Declare a pointer ptr to the structure student.
(3) Compute the size of the structure.
(4) Ask for dynamic memory of type student pointed by the pointer ptr.
(5) Read the data of the various elements using arrow operator.
(6) Print the data.
(7) Return the dynamic memory pointed by ptr.
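The steps above can be sketched as follows. The structure diagram is not reproduced here, so the members name and roll are assumptions, and the data is passed in as parameters instead of being read from the keyboard so that the sketch is easy to try. The function names make_student and print_student are illustrative.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct student {            /* step 1: assumed members */
    char name[15];
    int roll;
};

/* Steps 2 to 5: allocate one student dynamically and fill it through ->. */
struct student *make_student(const char *name, int roll)
{
    struct student *ptr;                              /* step 2 */
    size_t Sz = sizeof(struct student);               /* step 3 */
    assert(name != NULL);
    ptr = (struct student *) malloc(Sz);              /* step 4 */
    if (ptr == NULL)
        return NULL;                                  /* allocation failed */
    strncpy(ptr->name, name, sizeof ptr->name - 1);   /* step 5 */
    ptr->name[sizeof ptr->name - 1] = '\0';
    ptr->roll = roll;
    return ptr;
}

/* Step 6: print the members through the arrow operator. */
void print_student(const struct student *ptr)
{
    printf("\n Name : %s", ptr->name);
    printf("\n Roll : %d", ptr->roll);
}
```

Step 7 is left to the caller, who returns the memory with free(ptr) once the data is no longer needed.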
Note: An array can also be dynamically allocated as demonstrated in the following example.
Example 8: Write a program that dynamically allocates an array of integers. A list of integers is read
from the keyboard and stored in the array. The program determines the smallest in the list and prints its
location in the list.
Solution: The solution to this problem is trivial and the required program is given below:
/* This program illustrates the usage of dynamically allocated array */
#include <stdio.h>
main()
{
int i, min, pos;
int *list;
int Sz, N;
printf (“\n Enter the size of the list:”);
scanf (“%d”, &N);
Sz = sizeof(int) *N ; /* Compute the size of the list */
From the above example, it can be observed that the size of a dynamically allocated array can be
specified even at run time and the required amount of memory is allocated. This is in sharp contrast
to static arrays, whose size has to be declared at compile time.
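A minimal sketch of the idea follows: an array whose size N is known only at run time is allocated, filled and searched for its smallest element. The name smallest_position is hypothetical, and the values are copied from a parameter in place of keyboard input.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns the position of the smallest of the N values, or -1 on failure.
   The values are first copied into a dynamically allocated array. */
int smallest_position(const int values[], int N)
{
    int i, pos = 0;
    int *list;
    assert(values != NULL);
    if (N <= 0)
        return -1;
    list = (int *) malloc(sizeof(int) * (size_t) N);  /* size fixed at run time */
    if (list == NULL)
        return -1;
    for (i = 0; i < N; i++)
        list[i] = values[i];      /* stands in for reading from the keyboard */
    for (i = 1; i < N; i++)
        if (list[i] < list[pos])
            pos = i;
    free(list);                   /* return the array to the system */
    return pos;
}
```

For the values 42, 17, 99, 5 and 23 the function returns 3, the index of the value 5.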
The structure ‘chain’ consists of two members: val and p. The member val is a variable of type int
whereas the member p is a pointer to a structure of type chain. Thus, the structure chain has a member
that can point to a structure of type chain or may be itself. This type of self referencing structure can be
viewed as shown in Figure 5.25.
Fig. 5.25 Self referential structure chain
Since pointer p can point to a structure variable of type chain, we can connect two such structure
From Figure 5.26 and the above program segment, we observe that the pointer p of structure variable B
is dangling, i.e., it is pointing to nowhere. Such pointer can be assigned to NULL, a constant indicating that
there is no valid address in this pointer. The following statement will do the desired operation:
B.p = NULL;
The data elements in this linked structure can be assigned by the following statements:
A.val = 50;
B.val = 60;
The linked structure now looks as shown in Figure 5.27.
Fig. 5.27 Value assignment to data elements
We can see that the members of structure B can be reached by two methods:
(1) From its variable name B through dot operator.
(2) From the pointer p of variable A because it is also pointing to the structure B. However, in this
case the arrow operator is needed.
Consider the statements given below:
printf ("\n The contents of member val of B = %d", B.val);
printf ("\n The contents of member val of B = %d", A.p->val);
Once the above statements are executed, the output would be:
The contents of member val of B = 60
The contents of member val of B = 60
The linked structures have great potential and can be used in numerous programming situations
such as lists, trees, etc.
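The linking of two structures of type chain can be put together in one small sketch. The statement A.p = &B is assumed to be the linking step, and the function name read_B_through_A is illustrative.

```c
#include <assert.h>
#include <stddef.h>

struct chain {
    int val;
    struct chain *p;
};

/* Links A to B, then reads the member val of B through the pointer in A. */
int read_B_through_A(void)
{
    struct chain A, B;
    A.p = &B;        /* A now points to B */
    B.p = NULL;      /* B is the last structure in the chain */
    A.val = 50;
    B.val = 60;
    assert(A.p->val == B.val);   /* both routes reach the same member */
    return A.p->val;
}
```

The function returns 60, the same value that B.val yields directly.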
Example 9: Write a program that uses self referential structures to create a linked list of nodes of following
structure. While creating the linked list, the student data is read. The list is also travelled to print the data.
Student node: Name, roll, next
Solution: We will use the following self referential structure for the purpose of creating a node of the
linked list.
struct stud{
char name [15];
int roll;
struct stud *next;
};
The required linked list would be created by using three pointers: first, far and back. The following
algorithm would be employed to create the list:
Step
(1) Take a new node in pointer called first.
(2) Read first -> name and first -> roll.
(3) Point back pointer to the same node being pointed by first, i.e., back = first.
(4) Bring a new node in the pointer called far.
(5) Read far -> name and far -> roll.
(6) Connect next of back to far, i.e., back -> next = far.
(7) Take back to far, i.e., back = far.
(8) Repeat steps 4 to 7 till whole of the list is constructed.
(9) Point next of far to NULL, i.e., far -> next = NULL.
(10) Stop.
Fig. 5.28 Creation of a linked list
The required program is given below:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    struct student {
        char name [15];
        int roll;
        struct student *next;
    };
    struct student *First, *Far, *back;
    int N, i;
    int Sz;
    Sz = sizeof (struct student);
    printf ("\n Enter the number of students in the class");
    scanf ("%d", &N);
    /* Take first node */
    First = (struct student *) malloc(Sz);
    /* Read the data of the first student */
(3) Point the pointer called front_half to first and travel the list again starting from the first
and reach to the middle.
(4) Point a pointer called back-half to the node that succeeds the middle of the list.
(5) Attach NULL to the next pointer of the middle node.
(6) Print the two lists, i.e., pointed by front-half and back-half.
p = &x;
p −> Net = 50;
printf (“\n %d”, ++ (x.Net));
}
Ans. The output would be 51, because the pre-increment ++(x.Net) raises Net from 50 to 51 before it is printed.
7. What would be the output of the following code?
#include <stdio.h>
main()
{
struct val {
int Net;
};
struct val x;
struct val *p;
p = &x;
p −> Net = 50;
printf (“\n %d”, (x.Net)++);
}
Ans. The output would be 50, because the post-increment (x.Net)++ yields the old value 50 and only then raises Net to 51.
EXERCISES
1. Explain the `&´ and `*´ operators in detail with suitable examples.
2. Find the errors in the following program segments:
(a)
int val = 10
int * p;
p = val;
(b)
char list [10];
char p;
list = p;
3. (i) What will be the output of the following program:
#include <stdio.h>
main()
{
int val = 10;
int *p, **k;
p = &val;
k = &p;
printf (“\n %d %d %d %d”, p, *p, *k, **k);
}
(ii)
#include <stdio.h>
main()
{
char ch;
char *p;
ch = ’A’;
p = &ch;
printf (“\n %c %c”, ch, (*p)++);
}
CHAPTER OUTLINE
• 6.1 Introduction
6.1 INTRODUCTION
An array is a very simple and extremely useful data structure. A large number of applications based on
stacks and queues can be easily implemented with the help of arrays. Lists and tables have natural similarity
with one- and two-dimensional arrays. Though the allocation of space is sequential in an array, the
access to its elements is random. For a given index the element can be accessed in a time, independent
of its location in the array.
However, there are many problems associated with this data structure. Some of the important
problems are as follows:
(1) An array is a static data structure and, therefore, its size should be known in advance before its
usage.
(2) If a list of data items is to be stored, then an array of approximate size is used, i.e., the array
has to be large enough to hold the maximum amount of data one could logically expect. This
is totally a guess work and can result in overflow or wastage of main memory.
(3) What if the number of elements in the list is not known in advance?
(4) Insertion or deletion operations in an array, except at the end, are costly. These
operations require a large number of elements to be shifted in one direction or the other, which is very
time consuming.
The above problems can be handled with the help of linked structures, discussed in the subsequent sections.
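To make problem (4) concrete, the following sketch inserts a value into an array and reports how many elements had to be shifted. The function name array_insert is hypothetical, and the caller is assumed to provide an array with room for at least one more element.

```c
#include <assert.h>

/* Inserts val at index pos in an array that holds n items and has room
   for at least n + 1. Returns the number of elements that were shifted. */
int array_insert(int a[], int n, int pos, int val)
{
    int i, shifts = 0;
    assert(pos >= 0 && pos <= n);
    for (i = n; i > pos; i--) {   /* move the tail one place to the right */
        a[i] = a[i - 1];
        shifts++;
    }
    a[pos] = val;
    return shifts;
}
```

Inserting at the front of a five-element array shifts all five elements, whereas inserting at the end shifts none.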
An element in the list is called a node. The node is a self-referential structure, having two parts: Data
and Next as shown in Figure 6.1(b). The Data part contains the information about the element and the
Next part contains a pointer to the next element or node in the list. For example, in Figure 6.1(a), a list
consisting four names (“Preksha”, “Sagun”, “Bhawna”, and “Ridhi”) is represented using linked structures.
Such a list is called linked list or a linear linked list or a singly linked list.
Linked list is a series of nodes. Each node has two parts: data and next. The data part contains
the information about the node and the next part is a pointer that points to next node. The next of
last node points to NULL.
Fig. 6.1 (b) A node with two parts: Data and Next
It may be noted that the list is pointed by a pointer called ‘List’. Currently List is pointing
to a node containing the name ‘Preksha’, the next part of which is pointing to ‘Sagun’, the next of
which is pointing to ‘Bhawna’, and so on.
By convention, a node X means that the node is being pointed by a pointer X as shown in Figure 6.2.
In an algorithm, the data part of X is written as DATA(X) and the next part is written as NEXT(X).
Fig. 6.2 A node called X
Let us now insert a new node containing data called ‘Samridhi’ between ‘Preksha’ and ‘Sagun’. This
insertion operation can be carried out by the following steps:
Step
1. Get a node called ptr.
2. DATA (ptr) = ‘Samridhi’
3. NEXT (ptr) = NEXT(List)
4. NEXT(List) = ptr
The third step means that the Next pointer of ptr should point to the same location that is currently
being pointed by Next of List.
The effect of above steps (1–4) is shown in Figure 6.3.
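Steps 1 to 4 can be sketched in C as follows. The node layout (a name of up to 14 characters and a next pointer) and the function name insert_after are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {               /* hypothetical node: a name and a next pointer */
    char data[15];
    struct node *next;
};

/* Inserts a new node holding name right after the node pointed by List.
   Returns the new node, or NULL if no memory is available. */
struct node *insert_after(struct node *List, const char *name)
{
    struct node *ptr;
    assert(List != NULL && name != NULL);
    ptr = (struct node *) malloc(sizeof(struct node));   /* step 1 */
    if (ptr == NULL)
        return NULL;
    strncpy(ptr->data, name, sizeof ptr->data - 1);       /* step 2 */
    ptr->data[sizeof ptr->data - 1] = '\0';
    ptr->next = List->next;                               /* step 3 */
    List->next = ptr;                                     /* step 4 */
    return ptr;
}
```

The order of steps 3 and 4 matters: if NEXT(List) were overwritten first, the address of ‘Sagun’ would be lost.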
Fig. 6.3 Insertion of the node ‘Samridhi’: Step 3, NEXT(ptr) = NEXT(List); Step 4, NEXT(List) = ptr
It may be noted that the insertion operation is a very simple and easy operation in a linked list
because, unlike arrays, no copying or shifting operations are required.
Similarly, it is easy to delete an element from a linked list. To delete Bhawna from the list given in
Figure 6.3, we need only to redirect the next pointer of Sagun to point to Ridhi as shown in Figure 6.4.
This can be done by the following steps:
Step
1. Point a pointer called ptr1 to the node containing the name ‘Bhawna’.
2. Point a pointer called ptr2 to the node containing the name ‘Sagun’.
3. Next (ptr2) = Next(ptr1)
4. Return ptr1
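The deletion steps can be sketched as below, where "Return ptr1" is read as returning the node to the system with free(). The node layout and the function name delete_next are hypothetical, and the node to be deleted is assumed to have been obtained from malloc().

```c
#include <assert.h>
#include <stdlib.h>

struct node {               /* hypothetical node: a name and a next pointer */
    char data[15];
    struct node *next;
};

/* Unlinks and frees the node that follows ptr2.
   Returns 1 on success and 0 if ptr2 has no successor. */
int delete_next(struct node *ptr2)
{
    struct node *ptr1;
    assert(ptr2 != NULL);
    if (ptr2->next == NULL)
        return 0;
    ptr1 = ptr2->next;          /* step 1: the node to be deleted */
    ptr2->next = ptr1->next;    /* step 3: bypass it */
    free(ptr1);                 /* step 4: return it to the system */
    return 1;
}
```

Here ptr2 plays the role of the node ‘Sagun’ and its successor plays the role of ‘Bhawna’.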
Fig. 6.4 Deletion of the node ‘Bhawna’, followed by Free (ptr1)
From the above discussion, it is clear that unlike arrays, there is no longer a direct correspondence
between the logical order of elements present in a list and the way the nodes are arranged physically in
the linked list (see Figure 6.5).
Fig. 6.5 Nodes stored physically in the order I, My, Love, India and pointed by List
The physical order of the nodes shown in Figure 6.5 is “I My Love India” whereas the logical order
is “I Love My India”. It is because of this feature that a linked list offers a very easy way of insertions and
deletions in a list without the physical shifting of elements in the list. While manipulating a linked
list, the programmer keeps in mind only the logical order of the elements in the list. Physically, how
the nodes are stored in the memory is neither known to nor the concern of the programmer.
6.3.1.1 Self-referential Structures
When a member of a structure is declared as a pointer to the structure
itself, then the structure is called a self-referential structure. Consider the following declaration:
struct chain { int val;
struct chain *p;
};
From Figure 6.7 and the above program segment, we observe that the pointer p of structure vari-
able B is dangling, i.e., it is pointing to nowhere. Such pointer can be assigned to NULL, a constant
indicating that there is no valid address in this pointer. The following statement will do the desired
operation:
B.p = NULL;
Once the above statements are executed, the output would be:
The contents of member val of B = 60
The contents of member val of B = 60
The linked structures have great potential and can be used in numerous programming situations
such as lists, trees, etc. The major advantage of linked structures is that the pointers can be used to
allocate dynamic memory. Consider the following declaration:
struct chain *ptr;
The above statement declares a pointer called ptr, which can point to a structure of type chain
[see Figure 6.9 (a)]. Let us use ptr to take one such structure through dynamic allocation with the
help of malloc() function of ‘C’. From Chapter 5, we know that the malloc() function needs the size
of the memory that is desired by the programmer. Therefore, in the first step, given below, the size of
the structure has been obtained and in the second step the memory of that size has been asked through
malloc() function.
Fig. 6.9 (b) The dynamic allocation of a structure of type chain pointed by the pointer ptr
Size = sizeof (struct chain);
ptr = (struct chain *) malloc (Size);
The effect of the above statements is shown in Figure 6.9 (b). It may be noted that the node pointed by
ptr has no name. In fact all dynamic memory allocations are anonymous.
Let us now use self-referential structures to create a linked list of nodes of structure type given in
Figure 6.10. While creating the linked list, the student data is read. The list is also travelled to print
the data.
Fig. 6.10 The self-referential structure ‘student’ (members: Name, roll, next)
We will use the following self-referential structure for the purpose of creating a node of the linked list:
struct stud {
char name [15];
int roll;
struct stud*next;
};
The required linked list would be created by using three pointers—first, far, and back. The
following algorithm would be employed to create the list:
Algorithm createLinkList()
{
Step
1. Take a new node in pointer called first.
2. Read first −> name and first −> roll.
3. Point back pointer to the same node being pointed by first, i.e.,
back = first.
4. Bring a new node in the pointer called far.
5. Read far −> name and far −> roll.
6. Connect next of back to far, i.e., back −> next = far.
7. Take back to far, i.e., back = far.
8. Repeat steps 4 to 7 till whole of the list is constructed.
9. Point next of far to NULL, i.e., far −> next = NULL.
10. Stop.
}
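The algorithm createLinkList() can be sketched in C as shown below. To keep the sketch easy to try, the student data comes from two arrays instead of the keyboard. The helper new_node and the name create_list are illustrative and not from the text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct stud {
    char name[15];
    int roll;
    struct stud *next;
};

/* Takes one new node and fills it; returns NULL if memory runs out. */
static struct stud *new_node(const char *name, int roll)
{
    struct stud *n = (struct stud *) malloc(sizeof(struct stud));
    if (n != NULL) {
        strncpy(n->name, name, sizeof n->name - 1);
        n->name[sizeof n->name - 1] = '\0';
        n->roll = roll;
        n->next = NULL;
    }
    return n;
}

/* Builds a list of N nodes following steps 1 to 10 of createLinkList(). */
struct stud *create_list(const char *names[], const int rolls[], int N)
{
    struct stud *first, *far, *back;
    int i;
    assert(names != NULL && rolls != NULL);
    if (N <= 0)
        return NULL;
    first = new_node(names[0], rolls[0]);     /* steps 1 and 2 */
    if (first == NULL)
        return NULL;
    back = first;                             /* step 3 */
    for (i = 1; i < N; i++) {                 /* step 8: repeat steps 4 to 7 */
        far = new_node(names[i], rolls[i]);   /* steps 4 and 5 */
        if (far == NULL)
            break;                            /* keep the part built so far */
        back->next = far;                     /* step 6 */
        back = far;                           /* step 7 */
    }
    back->next = NULL;                        /* step 9: back equals far here */
    return first;
}
```

The pointer back always trails the most recently attached node, which is why a single assignment to back->next is enough to attach the next one.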
int roll;
struct student *next;
};
void main()
{
struct student *First;
First = create();
dispList(First);
}
scanf(“%d”, &N);
if ( N == 0) return NULL; /* List is empty */
/* Take first node */
First = (struct student *) malloc (Sz);
/* Read the data of the first student */
back->next = Far;
/* point back where Far points */
back = Far;
} /* Repeat the process */
Far->next = NULL;
It may be noted that the next pointer of the last node of a linked list always points to NULL to
indicate the end of the list. The pointer First points to the beginning of the list. The address of an in
between node is not known and, therefore, whenever it is desired to access a particular node, the travel
to that node has to start from the beginning.
Let us use a pointer ptr to travel the list. A variable called count would be used to count the
number of nodes in the list. The travel stops when a NULL is encountered. The following algorithm called
travelList() travels the given list pointed by First.
Algorithm travelList()
{
Step
1. if First == NULL then { print “List Empty”; Stop}
2. ptr = First; /* ptr points to the node being pointed by First */
3. count = 0;
4. while (ptr != NULL)
{
4.1 count = count + 1;
4.2 ptr = NEXT(ptr);
}
5. print “The number of nodes =”, count;
6. End
}
In the above algorithm, step 4.2, given below, is worth noting:
ptr = NEXT(ptr)
The above statement says that let ptr point to the same location which is currently being pointed by
NEXT of ptr. The effect of this statement on the list of Figure 6.12 is shown in Figure 6.13.
It may be noted that the step 4.2 takes the ptr to next node. Since this statement is within the scope
of the while loop, ptr travels from the First to the NULL pointer, i.e., the last node of the list.
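A minimal C rendering of travelList() is given below. The name count_nodes is illustrative. An empty list is handled by the loop condition itself, so the function returns 0 in that case instead of printing a message.

```c
#include <assert.h>
#include <stddef.h>

struct student {
    char name[15];
    int roll;
    struct student *next;
};

/* Travels the list pointed by First and returns the number of nodes. */
int count_nodes(const struct student *First)
{
    const struct student *ptr = First;   /* step 2 */
    int count = 0;                       /* step 3 */
    while (ptr != NULL) {                /* step 4 */
        count = count + 1;               /* step 4.1 */
        ptr = ptr->next;                 /* step 4.2: move to the next node */
    }
    return count;
}
```

The statement ptr = ptr->next is the C form of ptr = NEXT(ptr).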
Example 1: Write a program that travels a linked list consisting of nodes of following struct type. While
traveling, it counts the number of nodes in the list. Finally, the count is printed.
struct student {
char name [15];
int roll;
struct student *next;
};
Solution: A linked list of nodes of student type would be created by using a function called create().
The nodes would be added to the list until the user enters a roll number less than or equal to 0, i.e., (roll
<= 0). A function countNode() based on the algorithm travelList() would be used to travel and
count the nodes in the list.
The required program is given below:
/* This program travels a linked list and counts the number of nodes */
#include <stdio.h>
#include <stdlib.h>
struct student {
    char name [15];
    int roll;
    struct student *next;
};
struct student *create();
int countNode(struct student *First);
int main(void)
{
    int count;
    struct student *First;
    First = create();
    count = countNode(First);
    printf("\n The number of nodes = %d", count);
    return 0;
}
if (First->roll <=0){
printf (“\n Nodes = 0”);
First = NULL;
exit(1);} /* Empty list*/
Example 2: Write a program that travels a linked list consisting of nodes of following struct type. Assume
that the number of students in the list is not known.
struct student {
char name [15];
int roll;
struct student *next;
};
While travelling the list of students, it is split into two sub-lists pointed by two pointers: front_half
and back_half. The front_half points to the front half of the list and the back_half to the back half of the
list. If the number of students is odd, the extra student should go into the front list.
Solution: The list of students would be created in the same fashion as done in Example 1. The number of
nodes would be counted during the creation of the linked list itself. Assume that the linked list is being
pointed by First. The following steps would be followed to split the list in the desired manner:
(1) Compute the middle of the list.
(2) Point front_half to First.
(3) Point another pointer Far to First. Let Far travel the list starting
from the first node and reach the middle.
(4) Point back_half to the node that succeeds the middle of the list.
(5) Attach NULL to the next pointer of the middle node, now pointed by Far.
(6) Print the two lists, i.e., pointed by front_half and back_half.
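The splitting steps above can be sketched as a single function. This version takes the node count by value and is named split_list to keep it apart from the split() declared in the program; both choices are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct student {
    char name[15];
    int roll;
    struct student *next;
};

/* Cuts the list after its middle node and returns the back half.
   With an odd count the extra node stays in the front half. */
struct student *split_list(struct student *First, int count)
{
    struct student *Far, *back_half;
    int middle, i;
    if (First == NULL || count < 2)
        return NULL;                  /* nothing to split off */
    middle = (count + 1) / 2;         /* position of the middle node */
    Far = First;
    for (i = 1; i < middle; i++) {    /* travel to the middle node */
        Far = Far->next;
        assert(Far != NULL);          /* count must not exceed the list length */
    }
    back_half = Far->next;            /* node that succeeds the middle */
    Far->next = NULL;                 /* the front half ends here */
    return back_half;
}
```

For a list of five students the front half keeps three nodes and the back half receives two.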
The required program is given below:
/* This program travels a linked list and splits it into two halves */
#include <stdio.h>
#include <stdlib.h>
struct student {
    char name [15];
    int roll;
    struct student *next;
};
struct student * create (int *count);
struct student * split (struct student *First, int *count);
void dispList(struct student *First);
int main(void)
{
    struct student *First, *front_half, *back_half;
    int count;
    First = create(&count);
    if (count == 1)
    {
        printf ("\n The list cannot be split");
        front_half = First;
        back_half = NULL; /* there is no second half to display */
    }
    else
    {
        front_half = First;
        back_half = split (First, &count);
    }
    printf ("\n The First Half...");
    dispList(front_half);
    printf ("\n The Second Half...");
    dispList(back_half);
    return 0;
}
6. if (flag ==1) then print “Item found” else print “Item not found”
7. Stop
}
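A search of this kind can be sketched in C as below. Instead of a flag, this variation returns a pointer to the matching node, or NULL when the item is not found; that return convention and the name search_list are assumptions, not the book's searchList().

```c
#include <assert.h>
#include <stddef.h>

struct student {
    char name[15];
    int roll;
    struct student *next;
};

/* Returns the first node whose roll equals Roll, or NULL if none exists. */
struct student *search_list(struct student *First, int Roll)
{
    struct student *ptr = First;
    while (ptr != NULL) {
        if (ptr->roll == Roll)
            return ptr;          /* item found */
        ptr = ptr->next;
    }
    return NULL;                 /* item not found */
}
```

The caller can print "Item found" when the result is not NULL and "Item not found" otherwise.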
Example 3: Modify program of Example 1 such that the program searches the record of a student whose
roll number is given by the user.
Solution: Instead of writing the code for creation and display of linked lists again and again, now onwards
we would use the functions create() and dispList() developed in earlier programs. A new function
based on algorithm searchList() would be written and used to search the record of a student.
The required program is given below:
/* This program searches an item in a linked list */
#include <stdio.h>
#include <alloc.h>
#include <process.h>
struct student {
char name [15];
int roll;
struct student *next;
};
void main()
{
struct student *First;
int Roll, result;
First =create();
The list structure before and after the insertion operation is shown in Figure 6.14.
[Figure 6.14: the list before and after the insertion; after the operation, First = ptr]
It may be noted that after the insertion operation, both First and ptr point to the first node
of the list.
If a node is to be inserted at an arbitrary position in a linked list, it can be inserted either
after a node or before a node in the list. An algorithm for insertion of a node after a selected node
in a linked list is given below.
Let First be the pointer to the linked list. In this algorithm, a node is inserted after a node with data
part equal to ‘val’. A pointer ‘ptr’ travels the list in such a way that each visited node is checked for data
part equal to val. If such a node is found then the new node, pointed by a pointer called ‘nptr’, is inserted.
Algorithm insertAfter()
{
Step
1. ptr = First
2. while (ptr != NULL)
{
2.1 if (DATA (ptr) = val)
{ take a node in nptr;
Read DATA(nptr);
Next (nptr) = Next (ptr);
Next (ptr) = nptr;
break;
}
2.2 ptr = Next(ptr);
}
3. Stop
}
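The algorithm insertAfter() can be sketched in C as follows. This is a minimal illustration under stated assumptions: the node type struct node with an int data part, the extra parameter newData (instead of reading the data from the keyboard) and the 0/1 return value are all hypothetical choices made for the sketch.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Inserts a new node holding newData after the first node whose data
   equals val. Returns 1 on success, 0 if val is absent or malloc fails. */
int insertAfter(struct node *First, int val, int newData)
{
    struct node *ptr, *nptr;

    for (ptr = First; ptr != NULL; ptr = ptr->next) {
        if (ptr->data == val) {
            nptr = malloc(sizeof *nptr);   /* take a node in nptr */
            if (nptr == NULL)
                return 0;
            nptr->data = newData;
            nptr->next = ptr->next;        /* Next(nptr) = Next(ptr) */
            ptr->next = nptr;              /* Next(ptr) = nptr */
            return 1;
        }
    }
    return 0;                              /* val was not found */
}
```

The order of the two link assignments matters: if Next(ptr) were overwritten first, the rest of the list would be lost.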
The list structure before and after the insertion operation is shown in Figure 6.15.
[Figure 6.15: the list before and after inserting the node pointed by nptr after the node containing Val]
Example 4: Modify Example 3 such that after creation of a linked list of students, the program asks
the user to enter the data of the student who is to be inserted into the list after a given node.
Solution: The required program is given below. It takes care of both the cases, i.e., it inserts the student
as per his roll number in the list, which can be at the beginning or at any other location in the list.
/* This program inserts an item in a linked list */
#include <stdio.h>
#include <alloc.h>
#include <process.h>
struct student {
char name [15];
int roll;
struct student *next;
};
Note: It is easier to insert a node after a selected node in a linked list as only one pointer is needed
to select the node. For example, in the above program, the Far pointer has been used to select the node
after which the insertion is made.
However, if the insertion is to be made before a selected node, it is cumbersome to do so
with only one pointer. The reason is that when the node before which the insertion is desired is found,
there is no way to reach its previous node and make the insertion. The easier and more elegant solution is to
use two pointers (Far and back) which move in tandem, i.e., the pointer Far is followed by back. When
Far reaches the desired location, back points to its previous location as shown in Figure 6.16.
An algorithm for insertion of data before a selected node in a linked list pointed by First is
given below:
Algorithm insertBefore()
{
Far = First;
If (Far == NULL) Stop; /* Empty list: there is no node to insert before */
If (DATA(Far) == Val) /* Check if the first node is the desired node */
{
take a new node in ptr;
Read DATA(ptr);
Next(ptr) = Far;
First = ptr;
Stop;
}
while (Next(Far) != NULL) /* Stop once Far has reached the last node */
{
back = Far;
Far = Next(Far);
If (DATA(Far) == Val) /* Check if the node is the desired node */
{
take a new node in ptr;
Read DATA(ptr);
Next(ptr) = Far;
Next(back) = ptr;
break;
}
}
}
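A minimal C sketch of the two-pointer idea behind insertBefore() is given below. The node type, the parameter newData and the convention of returning the (possibly changed) head pointer are assumptions made for illustration; the book's own program reads the data from the user.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Inserts newData before the first node holding val and returns the
   (possibly new) head. The list is returned unchanged if val is absent. */
struct node *insertBefore(struct node *First, int val, int newData)
{
    struct node *Far = First, *back = NULL, *ptr;

    while (Far != NULL && Far->data != val) {  /* Far and back move in tandem */
        back = Far;
        Far = Far->next;
    }
    if (Far == NULL)
        return First;              /* val not present, or the list is empty */

    ptr = malloc(sizeof *ptr);
    if (ptr == NULL)
        return First;              /* allocation failed: leave list as it is */
    ptr->data = newData;
    ptr->next = Far;
    if (back == NULL)
        First = ptr;               /* inserted before the first node */
    else
        back->next = ptr;
    return First;
}
```

Initializing back to NULL lets a single test (back == NULL) recognize insertion at the head, so the first node needs no separate search step.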
Example 5: A sorted linked list of students in the order of their roll numbers is given. Write a program
that asks the user to enter the data of a missing student and inserts it into the list
at its proper place.
Solution: The sorted linked list of students in the order of their roll numbers would be created by
using the function create(), developed in the previous programs. The data of the student to be
inserted is read and its proper location is found, i.e., before the node whose roll number is greater than the
roll number of the incoming student. The algorithm insertBefore() would be used for the desired
insertion operation.
The required program is given below:
/* This program inserts a node in a linked list */
#include <stdio.h>
#include <alloc.h>
#include <process.h>
struct student {
char name [15];
int roll;
struct student *next;
};
struct student *create();
struct student * insertList(struct student *First, struct student newStud,
int *flag);
void dispList(struct student *First);
void main()
{
struct student *First, stud;
int Roll, result;
First = create();
printf ("\n Enter the data of the student which is to be inserted");
Algorithm delNode()
{
1. if (DATA (First) = 'VAL') /* Check if the starting node is the desired one */
{
1.1 ptr = First;
1.2 First = Next (First);
1.3 Free(ptr);
1.4 Stop;
}
2. Back = First;
3. ptr = Next (First)
The list structure before and after the deletion operation is shown in Figure 6.17.
[Figure 6.17: the list before and after the deletion; back points to the node preceding the node to be deleted (ptr), which is then freed]
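The listing of delNode() is incomplete in this excerpt. The following is only a hedged sketch of the same idea, with back trailing ptr until the desired node is found. It borrows the prototype declared in Example 6 and assumes that every node was allocated with malloc() and that *flag reports whether a node was deleted; these are assumptions, not details confirmed by the original algorithm.

```c
#include <stdlib.h>

struct student {
    char name[15];
    int roll;
    struct student *next;
};

/* Deletes the first node whose roll number equals roll.
   Sets *flag to 1 if a node was deleted, else 0. Returns the head. */
struct student *delNode(struct student *First, int roll, int *flag)
{
    struct student *ptr = First, *back = NULL;

    *flag = 0;
    while (ptr != NULL && ptr->roll != roll) {  /* back trails ptr */
        back = ptr;
        ptr = ptr->next;
    }
    if (ptr == NULL)
        return First;             /* roll not found: list unchanged */

    if (back == NULL)
        First = ptr->next;        /* the starting node is the desired one */
    else
        back->next = ptr->next;   /* bypass the node */
    free(ptr);
    *flag = 1;
    return First;
}
```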
Example 6: A linked list of students in the order of their roll numbers is given. Write a program that asks
the user to enter the roll number of the student who is to be deleted from the list.
Solution: The linked list of students in the order of their roll numbers would be created by using the
function create(), developed in the previous programs. The roll number of the student to be
deleted is read and the corresponding node is located. The algorithm delNode() would be used for the desired
deletion operation.
The required program is given below:
/* This program deletes a node from a linked list */
#include <stdio.h>
#include <alloc.h>
#include <process.h>
struct student {
char name [15];
int roll;
struct student *next;
};
struct student *create();
struct student *delNode(struct student *First, int roll, int *flag);
void dispList(struct student *First);
void main()
{
struct student *First;
int Roll, result;
First = create();
printf ("\n Enter the Roll of the student which is to be deleted");
The main advantage of the circular linked list is that from any node in the list, one can reach
any other node. This was not possible in a linear linked list, i.e., there was no way of reaching the nodes
preceding the current node.
A modified algorithm for creation of circular linked list is given below:
Algorithm createCirLinkList()
{
Step
1. Take a new node in pointer called First.
2. Read Data (First).
3. Point back pointer to the same node being pointed by First, i.e.,
back = First.
4. Bring a new node in the pointer called Far.
5. Read Data (Far).
6. Connect next of back to Far, i.e., back -> next = Far.
7. Take back to Far, i.e., back = Far.
8. Repeat steps 4 to 7 till whole of the list is constructed.
9. Point next of Far to First, i.e., Far -> next = First.
10. Stop.
}
Example 7: Write a function create() that creates a circular linked list of N students of following
structure type:
struct student {
char name [15];
int roll;
struct student *next;
};
Solution: We have used the algorithm createCirLinkList() to write the required function which
is given below:
struct student *create()
{
struct student * First, *Far, *back;
int N,i;
int Sz;
Sz = sizeof (struct student);
printf("\n Enter the number of students in the class");
scanf("%d", &N);
if ( N==0) return NULL; /* List is empty */
/* Take first node */
First = (struct student *) malloc (Sz);
/* Read the data of the first student */
Example 8: Write a program that travels a circular linked list consisting of nodes of following struct
type. While travelling, it counts the number of nodes in the list. Finally, the count is printed.
struct student {
char name [15];
int roll;
struct student *next;
};
Solution: A circular linked list of nodes of student type would be created by using the function called
create() developed in Example 7. The list would be travelled from the first node till we reach back to the
first node. While travelling, the visited nodes would be counted. The required program is given below:
/* This program travels a circular linked list and counts the number of
nodes */
#include <stdio.h>
#include <alloc.h>
struct student {
char name [15];
int roll;
struct student *next;
};
do
{ count = count +1;
/* Point to the next node */
ptr = ptr->next;
}
while (ptr != First); /* The travel stops at the first node */
return count;
}
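Because a do-while loop visits the first node before testing the condition, the counting loop above must not be entered with an empty list. A minimal sketch of a complete counting function with that guard is shown below; the function name countNodes and the const qualifiers are assumptions made for this illustration.

```c
#include <stddef.h>

struct student {
    char name[15];
    int roll;
    struct student *next;
};

/* Counts the nodes of a circular linked list pointed by First. */
int countNodes(const struct student *First)
{
    const struct student *ptr = First;
    int count = 0;

    if (First == NULL)
        return 0;                 /* empty list: nothing to travel */
    do {
        count = count + 1;
        ptr = ptr->next;          /* point to the next node */
    } while (ptr != First);       /* the travel stops at the first node */
    return count;
}
```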
Note: The above representation of circular linked list (Figure 6.18) has a drawback as far as the
insertion operation is concerned. Even if the node is to be added at the head of the list, the complete
list will have to be travelled so that the last pointer is made to point to the new node. This operation
ensures that the list remains circular after the insertion. It may be noted that this extra travel was not
required in the case of linear linked list.
The above drawback of circular linked lists can be avoided by shifting the First pointer to the
last node in the list as shown in Figure 6.19.
It may be noted that by having this representation, both ends of the list can be accessed through the
First. The First points to one end and the Next (First) points to another.
A node can be added at the head of the list by the following algorithm:
Algorithm insertHead()
{
Step
1. Take a new node in ptr.
2. Read DATA(ptr).
3. Next(ptr) = Next (First).
4. Next(First) = ptr.
}
Fig. 6.20 Insertion of a node at the head of a circular linked list without travel
A node can be added at the tail (i.e., last) of the list by the following algorithm:
Algorithm insertTail()
{ Step
1. Take a new node in ptr.
2. Read DATA(ptr).
3. Next(ptr) = Next (First).
4. Next(First) = ptr.
5. First = ptr.
}
Fig. 6.21 Insertion of a node at the tail of a circular linked list without travel
It may be noted that in both the cases of insertion (at the head and tail), no travel of the circular
linked list was made. Therefore, the new representation has removed the drawback of the ordinary
circular linked list. In fact, both insertions now take constant time, i.e., O(1).
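The algorithms insertHead() and insertTail() can be sketched in C as below. As in the text, the list is represented by a pointer to its last node (called First here). The node type, the data parameter, the convention of returning the updated pointer, and the handling of an empty list (which the algorithms above do not cover) are assumptions of this sketch.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* First points to the LAST node; First->next is the head of the list. */
struct node *insertHead(struct node *First, int data)
{
    struct node *ptr = malloc(sizeof *ptr);
    if (ptr == NULL)
        return First;
    ptr->data = data;
    if (First == NULL) {          /* assumed: empty list becomes one node */
        ptr->next = ptr;
        return ptr;
    }
    ptr->next = First->next;      /* Next(ptr) = Next(First) */
    First->next = ptr;            /* Next(First) = ptr */
    return First;                 /* the last node is unchanged */
}

struct node *insertTail(struct node *First, int data)
{
    struct node *ptr = malloc(sizeof *ptr);
    if (ptr == NULL)
        return First;
    ptr->data = data;
    if (First == NULL) {
        ptr->next = ptr;
        return ptr;
    }
    ptr->next = First->next;
    First->next = ptr;
    return ptr;                   /* First = ptr: new node is now the last */
}
```

The only difference between the two operations is the final step: insertTail() moves the list pointer to the new node, whereas insertHead() leaves it where it was.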
Fig. 6.24 The left of right and right of left is the same node (ptr→rightLink→leftLink)
6.4.2.1 Insertion in Doubly Linked List Insertion of a node X before a node pointed by ptr can be
done by the following program segment:
(1) X->rightLink = ptr->leftLink->rightLink;
(2) ptr->leftLink->rightLink = X;
(3) X->leftLink = ptr->leftLink;
(4) ptr->leftLink = X;
The effect of the above program segment is shown in Figure 6.25. The above four operations have
been labelled as 1–4 in Figure 6.25.
[Figure 6.25: insertion of node X before the node pointed by ptr; the links changed are labelled 1–4]
Similarly, insertion of a node X after a node pointed by ptr can be done by the following program
segment:
(1) X->leftLink = ptr->rightLink->leftLink;
(2) ptr->rightLink->leftLink = X;
(3) X->rightLink = ptr->rightLink;
(4) ptr->rightLink = X;
The effect of the above program segment is shown in Figure 6.26. The above four operations have
been labeled as 1–4 in Figure 6.26.
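The two program segments above assume that ptr has a neighbour on the side where X is inserted. A minimal sketch that also covers insertion before the first node and after the last node is given below; the node type dnode, the function names and the convention of returning the head from insertBefore() are assumptions made for illustration.

```c
#include <stddef.h>

struct dnode {
    int data;
    struct dnode *leftLink, *rightLink;
};

/* Links node X in just before ptr and returns the (possibly new) head. */
struct dnode *insertBefore(struct dnode *First, struct dnode *ptr, struct dnode *X)
{
    X->rightLink = ptr;
    X->leftLink = ptr->leftLink;
    if (ptr->leftLink != NULL)
        ptr->leftLink->rightLink = X;
    else
        First = X;                /* ptr was the first node */
    ptr->leftLink = X;
    return First;
}

/* Links node X in just after ptr. */
void insertAfter(struct dnode *ptr, struct dnode *X)
{
    X->leftLink = ptr;
    X->rightLink = ptr->rightLink;
    if (ptr->rightLink != NULL)   /* ptr may be the last node */
        ptr->rightLink->leftLink = X;
    ptr->rightLink = X;
}
```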
6.4.2.2 Deletion in Doubly Linked List Deletion of a node pointed by ptr can be done by the
following program segment:
(1) ptr->leftLink->rightLink = ptr->rightLink;
(2) ptr->rightLink->leftLink = ptr->leftLink;
(3) free(ptr);
[Figure 6.26: insertion of node X after the node pointed by ptr; the links changed are labelled 1–4]
The effect of the above program segment is shown in Figure 6.27. The above three operations have
been labeled as 1–3 in Figure 6.27.
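The deletion segment above dereferences both neighbours of ptr, so it applies to an interior node. A hedged sketch that also handles the first and the last node is given below; the node type, the function name deleteNode and the assumption that nodes were allocated with malloc() are introduced only for this illustration.

```c
#include <stdlib.h>

struct dnode {
    int data;
    struct dnode *leftLink, *rightLink;
};

/* Unlinks and frees the node pointed by ptr; returns the (possibly new) head. */
struct dnode *deleteNode(struct dnode *First, struct dnode *ptr)
{
    if (ptr == NULL)
        return First;
    if (ptr->leftLink != NULL)
        ptr->leftLink->rightLink = ptr->rightLink;
    else
        First = ptr->rightLink;   /* the first node is being deleted */
    if (ptr->rightLink != NULL)
        ptr->rightLink->leftLink = ptr->leftLink;
    free(ptr);
    return First;
}
```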
[Figure 6.27: deletion of the node pointed by ptr; links 1 and 2 bypass the node and step 3 frees it]
Note: During insertion or deletion of a node, the leftLink and rightLink pointers are appropriately
fixed so that they point to the correct node in the list.
From Figures 6.23 to 6.27, it may be observed that the doubly linked list has the following
advantages over the singly linked list:
(1) With the help of rightLink, the list can be traversed in forward direction.
(2) With the help of leftLink, the list can be traversed in backward direction.
(3) Insertion after or before a node is easy as compared to singly linked list.
(4) Deletion of a node is also very easy as compared to singly linked list.
It is left as an exercise for the readers to implement the insertion and deletion operations using ‘C’.
[Figure 6.28: First → Dummy Node → 25 → 37 → 68 → NULL]
It may be noted that the list shown in Figure 6.28 represents a list of integers (25, 37, 68 …). The
dummy node does not contain any value. It may be further noted that the first node of the list (containing
25) also has the dummy node as its predecessor and, therefore, insertion at the head of the list can be
done in the normal fashion as done in algorithm insertAfter() of Section 6.3.4.
Similarly, dummy nodes can also be used for circular and doubly linked lists as shown in Figure 6.29.
Fig. 6.29 Circular and doubly linked lists with dummy nodes: (a) circular linked list, (b) doubly linked list
Note: With the inclusion of a dummy node, an algorithm does not have to check for special or
exceptional cases such as existence of empty list or insertion/deletion at the head of the list and the like. Thus,
the algorithm becomes easy and applicable to all situations.
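To illustrate the point, the following minimal sketch inserts integers in ascending order into a list that starts with a dummy node. The function names createList and insertSorted, and the choice of sorted insertion as the example, are assumptions made for this illustration. Note that the insertion code contains no test for an empty list or for insertion at the head.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Creates an empty list, i.e., a list consisting of the dummy node only. */
struct node *createList(void)
{
    struct node *dummy = malloc(sizeof *dummy);
    if (dummy != NULL) {
        dummy->data = 0;          /* the value of the dummy node is never used */
        dummy->next = NULL;
    }
    return dummy;
}

/* Inserts data in ascending order. First points to the dummy node.
   Returns 1 on success and 0 if memory could not be allocated. */
int insertSorted(struct node *First, int data)
{
    struct node *back = First, *ptr;

    while (back->next != NULL && back->next->data < data)
        back = back->next;        /* back stops at the predecessor */

    ptr = malloc(sizeof *ptr);
    if (ptr == NULL)
        return 0;
    ptr->data = data;
    ptr->next = back->next;       /* same two steps in every situation */
    back->next = ptr;
    return 1;
}
```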
Some programmers employ the dummy node to store useful information such as the number of
elements in the list or the name of the group or set to which the elements belong. For example, the list
shown in Figure 6.28 can be modified so that the dummy node contains information about number of
elements present in the list (3 in this case). The modified list is given in Figure 6.30.
[Figure 6.30: First → Dummy Node (3) → 25 → 37 → 68 → NULL]
Fig. 6.31 Comparison of empty lists with and without dummy nodes
It may be noted that with the help of dummy nodes, the basic nature of linked lists remains
intact.
Algorithm Push()
{
Step
1. Take a new node in ptr;
2. Read DATA(ptr);
3. Next (ptr) = Top;
4. Top = ptr;
}
The structure of linked stack after a push operation is shown in Figure 6.33.
An algorithm for popping an element from the stack is given below. In this algorithm, a node from
the Top of the stack is removed. The removed node is pointed by ptr.
Algorithm POP()
{
Step
1. if (Top = NULL) then prompt ‘stack empty’; Stop
2. ptr = Top
3. Top = Next (Top);
4. item = DATA(ptr) /* save the data before the node is released */
5. Free ptr
6. return item
}
The structure of linked stack after a POP operation is shown in Figure 6.34.
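The algorithms Push() and POP() can be sketched in C as below, using the node structure of Example 10. The function signatures, in particular returning the new Top and reporting an empty stack through the ok parameter, are assumptions of this sketch rather than the book's own program.

```c
#include <stdlib.h>

struct node {
    int item;
    struct node *next;
};

/* Pushes item on the stack and returns the new Top. */
struct node *push(struct node *Top, int item)
{
    struct node *ptr = malloc(sizeof *ptr);
    if (ptr == NULL)
        return Top;               /* allocation failed: stack unchanged */
    ptr->item = item;
    ptr->next = Top;              /* Next(ptr) = Top */
    return ptr;                   /* Top = ptr */
}

/* Pops the top element into *item and returns the new Top.
   *ok is set to 0 if the stack was empty, else to 1. */
struct node *pop(struct node *Top, int *item, int *ok)
{
    struct node *ptr = Top;

    if (Top == NULL) {
        *ok = 0;                  /* stack empty */
        return NULL;
    }
    *item = ptr->item;            /* save the data before freeing the node */
    Top = Top->next;
    free(ptr);
    *ok = 1;
    return Top;
}
```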
[Figures 6.33 and 6.34: the linked stack after a push operation and after a POP operation]
Example 10: Write a menu driven program that simulates a linked stack for integer data items.
Solution: The algorithms Push() and POP() developed in this section would be used to write the
program that simulates a linked stack for integer data items. Each node of the linked stack would be of
following structure type:
struct node
{
int item;
struct node * next;
};
#include <stdio.h>
#include <alloc.h>
#include <conio.h>
struct node
{
int item;
struct node *next;
};
An algorithm for adding an element on to the queue is given below. In this algorithm, a node pointed
by ptr is added into the queue.
Algorithm AddQueue()
{
Step
1. Take a new node in ptr;
2. Read DATA(ptr);
3. Next (ptr) = Next (Rear);
4. Next(Rear) = ptr;
5. Rear = ptr;
}
The structure of linked queue after addition operation is shown in Figure 6.36.
[Figure 6.36: the linked queue before and after adding the node pointed by ptr at Rear; the links changed are labelled 1–3]
An algorithm for deleting an element from the linked queue is given below. In this algorithm, a node
from the front of the queue is removed.
Algorithm delQueue()
{
Step
1. if (Front == Rear) then prompt ‘Queue empty’; Stop
2. ptr = Front;
3. Front = Next (Front);
4. Free ptr
5. return DATA(Front)
}
The structure of linked queue after a deletion operation is shown in Figure 6.37.
[Figure 6.37: the linked queue before and after deleting the node at Front]
It may be noted that the Front always points to a blank node, i.e., which does not contain any data.
Thus, when the queue is empty then both Front and Rear point to a blank node.
Example 11: Write a menu-driven program that simulates a linked queue for integer data items.
Solution: The algorithms AddQueue() and delQueue() developed in this section would be used to
write the program that simulates a linked queue for integer data items. Each node of the linked queue
would be of following structure type:
struct node
{
int item;
struct node * next;
};
struct node *addQ (struct node *Rear,struct node *Front, int item);
struct node *delQ (struct node *Rear,struct node *Front, int *item);
void dispQ(struct node *Rear,struct node *Front);
void main ()
{
    int item, choice, size;
    struct node *Front, *Rear;
    size = sizeof (struct node);
    Front = (struct node *) malloc (size);
    Front->item = -9999; /* dummy node */
    Front->next = NULL;
    Rear = Front; /* Initialize Queue to empty */
    do
    {
        clrscr();
        printf ("\n Menu ");
        printf ("\n Add 1");
        printf ("\n Delete 2");
        printf ("\n Display 3");
        printf ("\n Quit 4");
        printf ("\n\n Enter Choice : ");
        scanf ("%d", &choice);
        switch (choice)
        {
            case 1: printf ("\n Enter the integer item to be added:");
                    scanf ("%d", &item);
                    Rear = addQ (Rear, Front, item);
                    break;
            case 2: if (Front == Rear)
                    {
                        printf ("\n Queue empty");
                        break;
                    }
                    Front = delQ (Rear, Front, &item);
                    printf ("\n The deleted item = %d", item);
                    break;
            case 3: dispQ(Rear, Front);
                    break;
        }
        printf ("\n press any key"); getch();
    }
    while (choice != 4);
}
struct node * addQ (struct node *Rear, struct node *Front, int item)
{
int size;
struct node *ptr;
size = sizeof (struct node);
ptr = (struct node *) malloc (size);
ptr->item = item;
ptr->next = Rear->next;
Rear->next = ptr;
Rear = ptr;
return ptr;
}
struct node * delQ (struct node *Rear, struct node *Front, int * item)
{
struct node *ptr;
ptr = Front;
Front = Front->next;
*item = Front->item;
Front->item = -9999; /* The dummy node */
free (ptr);
return Front;
}
void dispQ(struct node *Rear, struct node *Front)
{
    if (Front == Rear)
        printf ("\n The queue empty");
    else
    {
        printf ("\n The queue is... ");
        do
        {
            Front = Front->next;
            printf(" %d ", Front->item);
        }
        while (Front != Rear);
    }
}
the elements to the list by getting the memory allocated dynamically. Therefore, the number
of elements present in the linked list is limited only by the memory available to the operating
system. We can say that the limitation of arrays has been effectively handled by the linked list
implementation.
• Linked lists store the elements in non-contiguous locations. The nodes are independent of each
other.
• Linked lists are useful for storing elements in the order of insertion and subsequent traversal in
that order.
• However, the linked lists occupy more space than the arrays because each node contains an
through a system call to operating system making the processing very slow. Moreover, every
element is referred to through an extra reference made through a pointer thereby making the
processing of linked list further slow.
• Access to an element in a linked list is purely sequential whereas an element in array can be
accessed through its index. Thus, arrays are faster in accessing an element.
• Nevertheless, a linked list offers a very easy way of insertions and deletions in a list without the
pointers.
8. Ahead = next(Ahead)
9. next (ptr) = back; /* reverse the link */
10. back = ptr;
11. List = ptr;
12. Stop;
}
Problem 2: Write a program which travels a linked list pointed by List. It travels the list in such a way
that it reverses the links of the visited nodes, i.e., at the end the list becomes reversed and its contents
are displayed.
Solution: The algorithm revLink() developed in Problem 1 is used to write the given program that
reverses a linked list of students pointed by a pointer called List. The required function is given
below:
/* This function reverses the links of a linked list of students */
#include <stdio.h>
#include <alloc.h>
struct student {
char name [15];
int roll;
struct student *next;
};
struct student *revLink( struct student *List) /* The function that reverses
the list */
{
struct student *back, *ptr, *ahead;
if (List == NULL)
{
printf ("\n The list is empty");
return NULL;
}
ptr = List;
back = List;
ahead = ptr->next;
List->next = NULL;
while (ahead != NULL)
{
ptr = ahead;
ahead = ahead->next;
ptr->next = back; /* Connect the pointer to preceding node*/
back = ptr;
}
List = ptr;
return List;
}
Problem 3: Write a program that adds two polynomials Pol1 and Pol2, represented as linked lists of
terms, and gives a third polynomial Pol3 such that Pol3 = Pol1 + Pol2.
Solution: We will use the following structure for the term of a polynomial:
Pol Coef. Exp. Next
As done in Chapter 3, the polynomials will be added by merging the two linked lists. The required
program is given below:
/* This program adds two polynomials represented as linked lists
pointed by two pointers pol1 and pol2 and gives a third polynomial
pointed by a third pointer pol3 */
#include <stdio.h>
#include <stdlib.h> /* for malloc() */
struct term
{
int coef;
int exp;
struct term *next;
};
struct term * create_pol();
void main()
{
struct term *pol1,*pol2,*pol3, *ptr1,*ptr2,*ptr3, *back, *bring, *ptr;
int size;
size = sizeof (struct term);
printf ("\n Polynomial 1");
pol1 = create_pol();
printf ("\n Polynomial 2");
pol2 = create_pol();
ptr1=pol1;
ptr2=pol2;
ptr3 =pol3=NULL;
while (ptr1 != NULL && ptr2 != NULL) /* Add Polynomials by merging */
{ if ( ptr1->exp > ptr2->exp)
{
if (pol3 == NULL)
{
pol3 = (struct term *) malloc(size);
pol3->coef=ptr1->coef;
pol3->exp=ptr1->exp;
ptr3=pol3;
ptr1= ptr1->next;
ptr3->next = NULL;
}
else
{
bring =(struct term *) malloc(size);
bring->coef=ptr1->coef;
bring->exp=ptr1->exp;
ptr3->next =bring;
ptr3=bring;
ptr1=ptr1->next;
ptr3->next =NULL;
}
}
else
{
if ( ptr1->exp < ptr2->exp)
{
if (pol3 == NULL)
{
pol3 = (struct term *) malloc(size);
pol3->coef=ptr2->coef;
pol3->exp=ptr2->exp;
ptr3=pol3;
ptr2= ptr2->next;
ptr3->next = NULL;
}
else
{
bring =(struct term *) malloc(size);
bring->coef=ptr2->coef;
bring->exp=ptr2->exp;
ptr3->next =bring;
ptr3=bring;
ptr2=ptr2->next;
ptr3->next =NULL;
}
}
else
{
if (pol3 == NULL)
{
pol3 = (struct term *) malloc(size);
pol3->coef=ptr1->coef+ptr2->coef;
pol3->exp=ptr2->exp;
ptr3=pol3;
ptr2= ptr2->next;
ptr1=ptr1->next;
ptr3->next = NULL;
}
else
{
bring =(struct term *) malloc(size);
bring->coef=ptr1->coef+ptr2->coef;
bring->exp=ptr2->exp;
ptr3->next =bring;
ptr3=bring;
ptr2=ptr2->next;
ptr1=ptr1->next;
ptr3->next =NULL;
}
}
}
} /* while */
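/* Append rest of the Polynomial. After the loop at least one of ptr1 and
ptr2 is NULL, so the pointers themselves (not their next fields) are tested */
if (ptr1 != NULL)
{ if (pol3 == NULL) pol3 = ptr1; else ptr3->next = ptr1; }
if (ptr2 != NULL)
{ if (pol3 == NULL) pol3 = ptr2; else ptr3->next = ptr2; }
/* Print the Final Polynomial */
ptr = pol3;
printf ("\n Final Polynomial ...");
while (ptr != NULL)
{
printf ("%d ^ %d ", ptr->coef, ptr->exp);
if (ptr->next != NULL) printf ("+"); /* no trailing + after the last term */
ptr = ptr->next;
}
}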
EXERCISES
6. Write a program that takes two ordered linked lists as input and merges them into a single
ordered linked list.
7. What is a doubly linked list? Write program/algorithm for showing the following operations on
a doubly linked list:
• Create
• Insert
• Delete
8. What are the advantages of doubly linked list over singly linked list?
9. Write the program that inserts a node in a linked list after a given node.
10. Write the program that inserts a node in a linked list before a given node.
11. Write a program that deletes a given node from a linked list.
12. Write a function that deletes an element from a doubly linked list.
13. Implement a stack (LIFO) using singly linked list.
14. Write a program to implement a queue (FIFO) using singly linked list.
15. How do you detect a loop in a singly linked list? Write a ‘C’ program for the same.
16. How do you find the middle of a linked list without counting the nodes? Write a ‘C’ program
for the same.
17. Write a ‘C’ program that takes two lists List1 and List2 and compares them. The program
should return the following:
−1 : if List1 is smaller than List2
0 : if List1 is equal to List2
1 : if List1 is greater than List2
18. Write a function that returns the nth node from the end of a linked list.
19. Write a program to create a new linear linked list by selecting alternate element of a given linear
linked list.
20. Write an algorithm to search an element from a given linear linked list.
21. Explain the merits and demerits of static and dynamic memory allocation techniques.
22. Define a header linked list and explain its utility.
Chapter 7
Trees
CHAPTER OUTLINE
• 7.1 Introduction
7.1 INTRODUCTION
There are situations where the nature of data is specific and cannot be represented by linear or
sequential data structures. For example, as shown in Figure 7.1, hierarchical data such as ‘types of computers’ is
non-linear by nature.
[Figure 7.1: Computers are divided into Digital, Analog and Hybrid; further subdivisions include Desktop and Notebook]
Situations such as roles performed by an entity, options available for performing a task, parent–child
relationships demand non-linear representation. A tourist can go from Mumbai to Goa by any of the
options shown in Figure 7.2.
Root: A
Child nodes of A: B, C, D
Sub-trees of A: T1, T2, T3
Root of T1: B
Root of T2: C
Root of T3: D
Leaf nodes: E, G, H, I, J, K, L
Before proceeding to discuss more on trees, let us have a look at the basic terminology
used in the context of trees as given in Section 7.2.
Example: Degree of nodes C and D are 1 and 2, respectively. Degree of tree is 3 as there are two nodes A
and B having maximum degree equal to 3.
It may be noted that the node of a tree can have any number of child nodes, i.e., branches. In the
same tree, any other node may not have any child at all. Now the problem is how to represent such a
general tree where the number of branches varies from zero to any possible number. Or is it possible to put a
restriction on number of children and still manage this important non-linear data structure? The answer
is ‘Yes’. We can restrict the number of child nodes to less than or equal to 2. Such a tree is called a binary
tree. In fact, a general tree can be represented as a binary tree and this solves most of the problems. A
detailed discussion on binary trees is given in the next section.
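[Figure 7.8: binary tree with A at the root; B, C at level 1; D, E, F, G at level 2; H, I, J, K, L, M, N, O at level 3]
[Figure: a complete binary tree with nodes numbered 1 to 12]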
Consider the binary tree given in Figure 7.10 which violates the condition of the complete binary
tree as the node number 11 is missing. Therefore, it is not a complete binary tree because there is a
missing intermediate node before the last node, i.e., node number 12.
[Figure 7.10: a binary tree in which node number 11 is missing]
Level    No. of Nodes
0        1
1        2
2        4
3        8
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
linTree = A B C D E F G H I J K L M N O
Fig. 7.11 Linear representation using linTree (of the tree of Figure 7.8)
In an array of size N, for the given node i, the following relationships are valid in linear
representation:
(1) leftChild (i) = 2*i { When 2*i > N then there is no left child }
(2) rightChild(i) = 2*i + 1 { When 2*i +1 > N then there is no right child }
(3) parent (i) = [i/2] { node 1 has no parent – it is the root node }
The following examples can be verified from the tree shown in Figure 7.8:
• The parent of node at i = 5 (i.e., E) would be at [5/2] = 2, i.e., B.
• The right child of node i = 10 (i.e., J) would be at (2*10) + 1 = 21 which is > 15, indicating that it has no right child.
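The three index relationships can be sketched as small C functions. This is a minimal illustration: the convention of returning 0 when the requested node does not exist is an assumption of this sketch and is valid only because positions are numbered from 1.

```c
/* Index relations for a binary tree stored in positions 1..N of an array.
   A return value of 0 means "no such node" (assumed convention). */
int leftChild(int i, int N)
{
    return (2 * i <= N) ? 2 * i : 0;
}

int rightChild(int i, int N)
{
    return (2 * i + 1 <= N) ? 2 * i + 1 : 0;
}

int parent(int i)
{
    return (i > 1) ? i / 2 : 0;   /* integer division gives [i/2] */
}
```

With N = 15, parent(5) gives 2 (node B) and rightChild(10, 15) gives 0, in agreement with the two examples above.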
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
skewTree = A B C D
Fig. 7.12 Linear representation of the skewed tree of Figure 7.7(d) using skewTree
It may be noted that out of 15 locations, only 4 locations are being used and 11 are unused, i.e.,
about 73 per cent of the storage space is wasted.
The disadvantages of linear representation of binary tree are as follows:
(1) In a skewed or scanty tree, a lot of memory space is wasted.
(2) When binary tree is full then space utilization is complete but insertion and deletion of nodes
becomes a cumbersome process.
(3) In case of a full binary tree, insertion of a new node is not possible because an array is a static
allocation and its size cannot be increased at run time.
It may be noted that the root node is pointed by a pointer called binTree. Both leftChild and
rightChild of leaf nodes are pointing to NULL. Let us now try to create binary tree.
Note: Most of the books create only binary search trees which are very specific trees and easy to build. In
this book a novel algorithm is provided that creates a generalized binary tree with the help of a stack. The
generalized binary tree would be constructed and implemented in ‘C’ so that the reader gets the benefit
of learning the technique for building binary trees.
(3) Repeat steps 1 and 2 for sub-trees till you reach leaf nodes. A missing child in a non-leaf node is
represented as $.
By applying the above steps on binary tree of Figure 7.15, we get the following list:
A ( B, C)
The algorithm that uses the treeList and a stack to create the required binary tree is given below:
Algorithm createBinTree()
{
Step
1. Take an element from treeList;
2. If (element != ‘(‘ && element != ’)’ && element != ‘$’ && element !=
‘#’)
{
Take a new node in ptr;
DATA(ptr) = element;
leftChild (ptr) = NULL;
rightChild (ptr) = NULL;
Push (ptr);
}
3. if (element = ‘)’ )
{
Pop (RChild); /* the right child was pushed last, so it is popped first */
Pop (LChild);
Pop(Parent);
leftChild(Parent) = LChild;
rightChild(Parent) = RChild;
Push (Parent);
}
4. if (element = ‘(’ ) do nothing;
5. if (element = ‘$’ ) Push (NULL);
6. repeat steps 1 to 5 till element = ‘#’;
7. Pop (binTree);
8. stop.
}
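A compact C sketch of the stack-based construction is given below. It is not the book's program: it uses a fixed-size array as the stack (the bound MAXSTACK is an assumption), takes the tree list as a '#'-terminated string without separating commas, and pops the right child before the left child because the right child is pushed last. The helper inorder() is included only so that the result can be checked.

```c
#include <stdlib.h>

struct tnode {
    char ch;
    struct tnode *leftChild, *rightChild;
};

#define MAXSTACK 64   /* assumed upper bound on pending nodes */

/* Builds a binary tree from a '#'-terminated tree list.
   Returns the root, or NULL on malformed input or allocation failure
   (nodes already built are not released in this sketch). */
struct tnode *createBinTree(const char *treeList)
{
    struct tnode *stack[MAXSTACK];
    int top = 0;
    const char *p;

    for (p = treeList; *p != '#' && *p != '\0'; p++) {
        if (*p == '(') {
            continue;                         /* do nothing */
        } else if (*p == ')') {
            struct tnode *r, *l, *par;
            if (top < 3)
                return NULL;
            r = stack[--top];                 /* right child comes off first */
            l = stack[--top];
            par = stack[--top];
            if (par == NULL)
                return NULL;
            par->leftChild = l;
            par->rightChild = r;
            stack[top++] = par;
        } else {
            struct tnode *ptr = NULL;         /* '$' pushes NULL */
            if (top == MAXSTACK)
                return NULL;
            if (*p != '$') {
                ptr = malloc(sizeof *ptr);
                if (ptr == NULL)
                    return NULL;
                ptr->ch = *p;
                ptr->leftChild = ptr->rightChild = NULL;
            }
            stack[top++] = ptr;
        }
    }
    return (top == 1) ? stack[0] : NULL;      /* Pop (binTree) */
}

/* L-V-R travel that appends the visited characters to out. */
void inorder(const struct tnode *t, char *out, int *n)
{
    if (t == NULL)
        return;
    inorder(t->leftChild, out, n);
    out[(*n)++] = t->ch;
    inorder(t->rightChild, out, n);
}
```

For the tree list "A(B(D(HI)E(J$))C(F(KL)G))#", travelling the resulting tree in inorder yields HDIBJEAKFLCG.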
A simulation of the above algorithm for the following treeList is given in Figure 7.16.
treeList[] = {'A', '(', 'B', '(', 'D', '(', 'H', 'I', ')', 'E', '(', 'J', '$', ')',
')', 'C', '(', 'F', '(', 'K', 'L', ')', 'G', ')', ')', '#'};
Fig. 7.16 The creation of binary tree using the algorithm createBinTree()
It may be noted that the algorithm developed by us has created exactly the same tree which is shown
in Figure 7.15.
It may be noted that the travel strategies given above have been named as preorder, inorder, and
postorder according to the placement of operation: ‘V’ –‘process the visited node’. Inorder (L-V-R) travel has
the operation ‘V’ in between the other two operations. In preorder (V-L-R), the operation ‘V’ precedes the
other two operations. Similarly, the postorder traversal has the operation ‘V’ after the other two operations.
7.4.3.1 Inorder Travel (L-V-R) This order of travel, i.e., L-V-R requires that while travelling the binary
tree, the left sub-tree of a node be travelled first before the data of the visited node is processed.
Thereafter, its right sub-tree is travelled.
Consider the binary tree given in Figure 7.17. Its root node is pointed by a pointer called Tree.
The travel starts from root node A and its data is not processed as it has a left child which
needs to be travelled first indicated by the operation L. Therefore, we move to B. As B has
no left child, its data is processed indicated by the operation V. Now the right child of B needs
to be travelled which is NULL, indicated by the operation R. Thus, the travel of B is complete
and the travel of left sub-tree of A is also over. As per the strategy, the data of A is processed (V)
and the travel moves to its right sub-tree, i.e., to node C (R). The process is repeated for this
sub-tree with root node as C. The final trace of the travel is given below:
B A F D G C E
Fig. 7.17 Partial travel of L-V-R order
An algorithm for inorder travel (L-V-R) is given below. It is provided with the pointer called Tree
that points to the root of the binary tree.
Algorithm inorderTravel(Tree)
{
if (Tree == NULL) return
else
{
inorderTravel (leftChild (Tree));
process DATA (Tree);
inorderTravel (rightChild (Tree));
}
}
Example 1: Write a program that creates a generalized binary tree and travels it using an inorder strategy.
While travelling, it should print the data of the visited node.
Solution: We would use the algorithms createBinTree() and inorderTravel(Tree) to write the
program.
The required program is given below:
/* This program creates a binary tree and travels it using inorder fashion */
#include <stdio.h>
#include <stdlib.h> /* malloc() and free(); standard replacement for the Turbo C header <alloc.h> */
#include <conio.h>
struct tnode
{
char ch;
struct tnode *leftChild, *rightChild;
};
struct node
{
struct tnode *item;
struct node *next;
};
It may be noted that a linked stack has been used for the implementation. The program has been
tested for the binary tree given in Figure 7.15, represented using the following tree list:
treeList[] = {'A', '(', 'B', '(', 'D', '(', 'H', 'I', ')', 'E', '(', 'J', '$', ')', ')',
              'C', '(', 'F', '(', 'K', 'L', ')', 'G', ')', ')', '#'};
The output of the program is given below:
HDIBJEAKFLCG
The above output has been verified to be correct.
The program was also tested on the following binary tree (shown as a comment in this program):
char treeList[] = {'1', '(', '2', '(', '$', '4', ')', '3', '(', '5', '6', ')', ')', '#'};
7.4.3.2 Preorder Travel (V-L-R) This order of travel, i.e., V-L-R requires that while travelling the
binary tree, the data of the visited node be processed first before the left sub-tree of a node is travelled.
Thereafter, its right sub-tree is travelled.
Consider the binary tree given in Figure 7.18. Its root node is pointed by a pointer called Tree.
The travel starts from root node A and its data is processed (V). As it has a left child, we
move to B (L) and its data is processed (V). As B has no left child, the right child of B needs
to be travelled, which is also NULL. Thus, the travel of B is complete and the processing of
node A and travel of left sub-tree of A is also over. As per the strategy, the travel moves
to its right sub-tree, i.e., to node C (R). The process is repeated for this sub-tree with root
node as C. The final trace of the travel is given below:
ABCDFGE
[Figure: binary tree with root A; B and C are the children of A, D and E the children of C, and F and G the children of D]
Fig. 7.18 Partial travel of V-L-R order
An algorithm for preorder travel (V-L-R) is given below. It is provided with a pointer called Tree that
points to the root of the binary tree.
Algorithm preorderTravel(Tree)
{
if (Tree == NULL) return
else
{ process DATA (Tree);
preorderTravel (leftChild (Tree));
preorderTravel (rightChild (Tree));
}
}
Example2: Write a program that creates a generalized binary tree and travels it using preorder strategy.
While travelling, it should print the data of the visited node.
Solution: We would use the algorithms createBinTree() and preorderTravel(Tree) to
write the program. However, the code will be similar to Example 1. Therefore, only the function
preOrderTravel() is provided.
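The listing of the function does not appear at this point, so a minimal sketch is given here instead. It follows the algorithm preorderTravel() and the node layout struct tnode of Example 1. As an assumption made for this sketch, the visited data is appended to a buffer (parameters out and len) rather than printed, so that the order can be checked.

```c
#include <stddef.h>

/* Node layout as declared in Example 1. */
struct tnode
{
    char ch;
    struct tnode *leftChild, *rightChild;
};

/* Appends the data of every node to out in V-L-R order.
   out must be large enough for all the nodes plus '\0'. */
void preOrderTravel(const struct tnode *binTree, char *out, size_t *len)
{
    if (binTree == NULL) return;
    out[(*len)++] = binTree->ch;                      /* V */
    out[*len] = '\0';
    preOrderTravel(binTree->leftChild, out, len);     /* L */
    preOrderTravel(binTree->rightChild, out, len);    /* R */
}
```

A printing variant would replace the two buffer statements by printf ("%c ", binTree->ch), in the same way as the postorder function of Example 3.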
The function has been tested for the binary tree given in Figure 7.15 represented using the following
tree list:
treeList[] = {'A', '(', 'B', '(', 'D', '(', 'H', 'I', ')', 'E', '(', 'J', '$', ')', ')',
              'C', '(', 'F', '(', 'K', 'L', ')', 'G', ')', ')', '#'};
7.4.3.3 Postorder Travel (L-R-V) This order of travel, i.e., L-R-V requires that while travelling the
binary tree, the data of the visited node be processed only after both the left and right sub-trees of a node
have been travelled.
Consider the binary tree given in Figure 7.19. Its root node is pointed by a pointer called Tree.
The travel starts from root node A and its data is not processed as it has a left child
which needs to be travelled first, indicated by the operation L. Therefore, we move to B. As
B has no left child, the right child of B needs to be travelled, which is NULL, indicated by the
operation R. Now, data of node B is processed, indicated by the operation V. Thus, the travel
of B is complete and the travel of left sub-tree of A is also over. As per the strategy, the travel
moves to its right sub-tree, i.e., to node C (R). The process is repeated for this sub-tree with
root node as C. The final trace of the travel is given below:
BFGDECA
[Figure: binary tree with root A; B and C are the children of A, D and E the children of C, and F and G the children of D]
Fig. 7.19 Partial travel of L-R-V order
An algorithm for postorder travel (L-R-V) is given below. It is provided with the pointer called Tree
that points to the root of the binary tree.
Algorithm postOrderTravel(Tree)
{
if (Tree == NULL) return
else
{ postOrderTravel (leftChild (Tree));
postOrderTravel (rightChild (Tree));
process DATA (Tree);
}
}
Example 3: Write a program that creates a generalized binary tree and travels it using postorder strategy.
While travelling, it should print the data of the visited node.
Solution: We would use the algorithms createBinTree() and postOrderTravel(Tree)
to write the program. However, the code will be similar to Example 1. Therefore, only the function
postOrderTravel() is provided.
The required function is given below:
void postOrderTravel (struct tnode *binTree)
{
if (binTree == NULL) return;
postOrderTravel (binTree->leftChild);
postOrderTravel (binTree->rightChild);
printf ("%c ", binTree->ch);
}
The function has been tested for the binary tree given in Figure 7.15 represented using the following
tree list:
treeList[] = {'A', '(', 'B', '(', 'D', '(', 'H', 'I', ')', 'E', '(', 'J', '$', ')', ')',
              'C', '(', 'F', '(', 'K', 'L', ')', 'G', ')', ')', '#'};
The output of the program is given below:
HIDJEBKLFGCA
The above output has been verified to be correct.
The program was also tested on the following binary tree:
char treeList[] = {'1', '(', '2', '(', '$', '4', ')', '3', '(', '5', '6', ')', ')', '#'};
operator has two operands. In a binary tree, a single node is also considered as a binary tree. Similarly, a
variable or constant is also the smallest expression.
Consider the preorder arithmetic expression given below:
*+A*BC−DE
The above arithmetic expression can be expressed as an expression tree given in Figure 7.20.
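[Fig. 7.20: expression tree. The root is *; its left child + has the children A and * (the latter with children B and C); its right child − has the children D and E.]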
The utility of an expression tree is that the various tree travel algorithms can be used to convert one
form of arithmetic expression to another, i.e., prefix to infix, infix to postfix, or vice versa.
For example, the inorder travel of the tree given in Figure 7.20 shall produce the following infix
expression.
A+B*C*D−E
It may be noted that a plain inorder travel does not emit parentheses; the expression actually
represented by the tree is (A+B*C)*(D−E).
The postorder travel of the tree given in Figure 7.20 shall produce the following postfix expression:
ABC*+DE−*
Similarly, the preorder travel of the tree given in Figure 7.20 shall produce the following prefix
expression:
*+A*BC−DE
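The conversion described above can be tried out with a short C sketch. It is only an illustration under stated assumptions: the helper names buildFromPrefix(), isOperator() and travel() are hypothetical, tokens are single characters, the ASCII '-' stands for the minus sign, the input is a well-formed prefix expression, and the nodes are not freed.

```c
#include <stdlib.h>

struct tnode
{
    char ch;
    struct tnode *leftChild, *rightChild;
};

/* Returns 1 for the four binary operators handled by this sketch. */
int isOperator(char c)
{
    return c == '+' || c == '-' || c == '*' || c == '/';
}

/* Builds an expression tree from a prefix expression.
   *pos is the index of the next unread token. */
struct tnode *buildFromPrefix(const char *expr, size_t *pos)
{
    struct tnode *node;
    if (expr[*pos] == '\0') return NULL;
    node = malloc(sizeof *node);
    if (node == NULL) return NULL;
    node->ch = expr[(*pos)++];
    node->leftChild = node->rightChild = NULL;
    if (isOperator(node->ch))
    {   /* an operator is followed by its two operand sub-trees */
        node->leftChild = buildFromPrefix(expr, pos);
        node->rightChild = buildFromPrefix(expr, pos);
    }
    return node;
}

/* order 'i' = inorder (L-V-R), 'p' = postorder (L-R-V),
   any other value = preorder (V-L-R). Output is collected in out. */
void travel(const struct tnode *tree, char order, char *out, size_t *len)
{
    if (tree == NULL) return;
    if (order != 'i' && order != 'p') out[(*len)++] = tree->ch;
    travel(tree->leftChild, order, out, len);
    if (order == 'i') out[(*len)++] = tree->ch;
    travel(tree->rightChild, order, out, len);
    if (order == 'p') out[(*len)++] = tree->ch;
    out[*len] = '\0';
}
```

Building the tree from the string "*+A*BC-DE" and travelling it with order 'i' and 'p' would yield the infix and postfix forms shown above.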
From the above discussion, we can appreciate the rationale behind the naming of the travel strategies
VLR, LVR, and LRV as preorder, inorder, and postorder, respectively.
Fig. 7.21 An expression tree
The above list would be used to write the required menu driven program. The various binary tree
travel functions, developed above, would be used to travel the expression tree as per the user choice
entered on the following menu displayed to the user:
Menu
Inorder 1
Preorder 2
Postorder 3
Quit 4
Enter your choice:
The output of the program for choice 1 (inorder travel) is given below:
The Tree is …: P + Q / A − B
The output of the program for choice 2 (preorder travel) is given below:
The Tree is …: − + P / Q A B
The output of the program for choice 3 (postorder travel) is given below:
The Tree is …: P Q A / + B −
It is left as an exercise for the readers to verify the results by manually converting the expression to
the required form.
■ The key values of the nodes of left sub-tree are always less than the key value of the root
node.
■ The key values of the nodes of right sub-tree are always more than the key value of the root
node.
■ The left and right sub-trees are
[Figure: a binary search tree with root 108 whose children are 90 and 110]
Kindly note that the inorder travel of the binary search tree would produce the following node sequence:
Tree is … 83 85 90 104 108 109 110 117 125 128
The above list is sorted in ascending order indicating that a binary search tree is another way to
represent and produce a sorted list. Therefore, a binary search algorithm can be easily applied to a BST
and, hence, the name, i.e., binary search tree.
7.5.2.1 Creation of a Binary Search Tree A binary search tree is created by successively reading
values for the nodes. The first node becomes the root node. The next node’s value is compared with the
value of the root node. If it is less, the node is attached to root’s left sub-tree otherwise to right sub-tree,
and so on. When the sub-tree is NULL, then the node is attached there itself. An algorithm for creation
of binary search tree is given below.
Note: This algorithm gets a pointer called binTree. The creation of the BST stops when invalid data is
input. It uses the algorithm attach() to attach a node at an appropriate place in the BST.
binTree = NULL; /* initially the tree is empty */
Algorithm createBST(binTree)
{
  while (1)
  {
    take a node in ptr
    read DATA (ptr);
    if (DATA (ptr) is not valid) break;
    leftChild (ptr) = NULL;
    rightChild (ptr) = NULL;
    binTree = attach (binTree, ptr); /* attach() returns the root of the updated tree */
  }
  return binTree;
}
Algorithm attach (Tree, node)
{
  if (Tree == NULL)
    Tree = node;
  else
  {
    if (DATA (node) < DATA (Tree))
      leftChild (Tree) = attach (leftChild (Tree), node);
    else
      rightChild (Tree) = attach (rightChild (Tree), node);
  }
  return Tree;
}
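The two algorithms can be sketched in C as shown below. This is a minimal sketch, not the program of Example 5: the function createBSTFromArray() is a hypothetical stand-in that takes its values from an array instead of the keyboard and stops at the first negative value, and inorderToArray() is a hypothetical helper that collects the inorder sequence so that it can be checked.

```c
#include <stdlib.h>

struct binNode
{
    int val;
    struct binNode *leftChild, *rightChild;
};

/* Attaches node at its proper place and returns the root of the updated tree. */
struct binNode *attach(struct binNode *tree, struct binNode *node)
{
    if (tree == NULL) return node;
    if (node->val < tree->val)
        tree->leftChild = attach(tree->leftChild, node);
    else
        tree->rightChild = attach(tree->rightChild, node);
    return tree;
}

/* Creates a BST from vals[0..n-1]; a negative value ends the input. */
struct binNode *createBSTFromArray(const int *vals, size_t n)
{
    struct binNode *binTree = NULL;   /* initially the tree is empty */
    size_t i;
    for (i = 0; i < n && vals[i] >= 0; i++)
    {
        struct binNode *ptr = malloc(sizeof *ptr);
        if (ptr == NULL) break;       /* out of memory: stop creation */
        ptr->val = vals[i];
        ptr->leftChild = ptr->rightChild = NULL;
        binTree = attach(binTree, ptr);
    }
    return binTree;
}

/* Stores the inorder sequence in out starting at index len;
   returns the index following the last value stored. */
size_t inorderToArray(const struct binNode *tree, int *out, size_t len)
{
    if (tree == NULL) return len;
    len = inorderToArray(tree->leftChild, out, len);
    out[len++] = tree->val;
    return inorderToArray(tree->rightChild, out, len);
}
```

Because attach() returns the (possibly new) root, an empty tree needs no special handling in the caller.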
Example 5: Write a program that creates a BST wherein each node of the tree contains non-negative
integer. The process of creation stops as soon as a negative integer is inputted by the user. Travel the tree
using inorder travel to verify the created BST.
Solution: The algorithm createBST() and its associated algorithm attach() would be used to write
the program. The required program is given below:
/* This program creates a BST and verifies it by travelling it using
inorder travel */
#include <stdio.h>
#include <stdlib.h> /* malloc() and free(); standard replacement for the Turbo C header <alloc.h> */
#include <conio.h>
struct binNode
{
int val;
struct binNode *leftChild, *rightChild;
};
7.5.2.2 Searching in a Binary Search Tree The most desirable and important application of a BST
is that it is used for searching an entry in a set of values. Provided the tree is reasonably balanced,
each comparison discards about half of the remaining nodes, so the search is faster than a linear
search of an unsorted array or a linked list. In an array, the list of items needs to be sorted before
a binary search algorithm could be applied for searching an item in the list. On the contrary, the
data automatically becomes sorted during the creation of a BST. The item to be searched is compared
with the root. If it matches with the data of the root, then success is returned; otherwise the left or
right sub-tree is searched depending upon the item being less or more than the root.
An algorithm for searching a BST is given below; it searches Val in a BST pointed by a pointer
called ‘Tree’.
Algorithm searchBST(Tree, Val)
{
  if (Tree == NULL)
    return failure;
  if (DATA (Tree) == Val)
    return success;
  else
    if (Val < DATA (Tree))
      return searchBST (Tree->leftChild, Val);
    else
      return searchBST (Tree->rightChild, Val);
}
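A direct C rendering of the search algorithm is sketched below. It assumes the node layout struct binNode used for BSTs in this chapter and returns 1 for success and 0 for failure.

```c
#include <stddef.h>

struct binNode
{
    int val;
    struct binNode *leftChild, *rightChild;
};

/* Returns 1 if val is present in the BST pointed by tree, otherwise 0. */
int searchBST(const struct binNode *tree, int val)
{
    if (tree == NULL) return 0;           /* dead end: search fails */
    if (tree->val == val) return 1;       /* found */
    if (val < tree->val)
        return searchBST(tree->leftChild, val);
    return searchBST(tree->rightChild, val);
}
```

Each call moves one level down, so the number of comparisons never exceeds the number of levels in the tree.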
Example 6: Write a program that searches a given key in a binary search tree through a function called
searchBST(). The function returns 1 or 0 depending upon the search being successful or a failure. The
program prompts appropriate messages like “Search successful” or “Search unsuccessful”.
Solution: The above algorithm searchBST() is used. The required program is given below:
#include <stdio.h>
#include <stdlib.h> /* malloc() and free(); standard replacement for the Turbo C header <alloc.h> */
#include <conio.h>
struct binNode
{
int val;
struct binNode *leftChild, *rightChild;
};
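struct binNode *createBinTree (void);              /* creates the BST from user input */
int searchBST (struct binNode *binTree, int val);  /* returns 1 if val is found, else 0 */
int main (void)
{ int result, val;
  struct binNode *binTree;
  binTree = createBinTree ();
  printf ("\n Enter the value to be searched");
  scanf ("%d", &val);
  result = searchBST (binTree, val);
  if (result == 1)
    printf ("\n Search Successful");
  else
    printf ("\n Search Un-Successful");
  return 0;
}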
7.5.2.3 Insertion into a Binary Search Tree The insertion in a BST is basically a search operation.
The item to be inserted is searched within the BST. If it is found, then the insertion operation fails
otherwise when a NULL pointer is encountered the item is attached there itself. Consider the BST given
in Figure 7.24.
It may be noted that the number 49 is inserted into the BST shown in Figure 7.24. It is searched in
the BST which comes to a dead end at the node containing 48. Since the right child of 48 is NULL, the
number 49 is attached there itself.
Note: In fact, a system programmer maintains a symbol table as a BST and the symbols are inserted
into the BST in search-insert fashion. The main advantage of this approach is that duplicate entries into
the table are automatically caught. The search engine also stores the information about downloaded
web pages in search-insert fashion so that possible duplicate documents from mirrored sites could be
caught.
An algorithm for insertion of an item into a BST is given below:
Algorithm insertBST (binTree, item)
{
  if (binTree == NULL)
  {
    Take node in ptr;
    DATA (ptr) = item;
    leftChild (ptr) = NULL;
    rightChild (ptr) = NULL;
    binTree = ptr; /* the new node is attached at the NULL position reached */
    return success;
  }
  else
    if (DATA (binTree) == item)
      return failure; /* duplicate value */
    else
      if (DATA (binTree) > item)
        return insertBST (binTree->leftChild, item);
      else
        return insertBST (binTree->rightChild, item);
}
[Fig. 7.24: the BST (root 56; children 45 and 60; 33 and 50 under 45; 58 and 75 under 60; 48 under 50) before and after 49 is inserted as the right child of 48]
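The search-insert idea can be sketched in C as follows. The sketch is not the program of Example 7. As an assumption, the tree is passed as a pointer to a pointer so that the assignment binTree = ptr of the algorithm takes effect in the caller; the function returns 1 for success and 0 for a duplicate value (or if no memory is available).

```c
#include <stdlib.h>

struct binNode
{
    int val;
    struct binNode *leftChild, *rightChild;
};

/* Inserts item into the BST whose root pointer is *binTree.
   Returns 1 on success and 0 if item is already present. */
int insertBST(struct binNode **binTree, int item)
{
    if (*binTree == NULL)
    {   /* dead end reached: attach the new node here */
        struct binNode *ptr = malloc(sizeof *ptr);
        if (ptr == NULL) return 0;    /* no memory: treated as failure */
        ptr->val = item;
        ptr->leftChild = ptr->rightChild = NULL;
        *binTree = ptr;
        return 1;
    }
    if ((*binTree)->val == item) return 0;   /* duplicate value */
    if ((*binTree)->val > item)
        return insertBST(&(*binTree)->leftChild, item);
    return insertBST(&(*binTree)->rightChild, item);
}
```

A typical call is insertBST (&root, 49), where root is initially NULL for an empty tree. The node is created only once, at the point where the search comes to a dead end.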
Example 7: Write a program that inserts a given value in a binary search tree through a function called
insertBST(). The function returns 1 or 0 depending upon the insertion being success or a failure. The
program prompts appropriate message like “Insertion successful” or “Duplicate value”. Verify the inser-
tion by travelling the tree in inorder and displaying its contents containing the inserted node.
Solution: The above algorithm insertBST() is used. The required program is given below:
/* This program inserts a value in a BST */
#include <stdio.h>
#include <stdlib.h> /* malloc() and free(); standard replacement for the Turbo C header <alloc.h> */
#include <conio.h>
struct binNode
{
int val;
struct binNode *leftChild, *rightChild;
};
The output: “Duplicate value” was obtained indicating that the value is already present in the BST.
Similarly, for the following input:
Nodes of BST: 65 78 12 34 89 77 22
Value to be inserted: 44
The output: Insertion successful, the Tree is .. 12 22 34 44 65 77 78 89, indicating that the insertion
at proper place in BST has taken place.
7.5.2.4 Deletion of a Node from a Binary Search Tree Deletion of a node from a BST is not as
simple as insertion of a node. The reason is that the node to be deleted can be at any of the following
positions in the BST:
(1) The node is a leaf node.
(2) The node has only one child.
(3) The node is an internal node, i.e., having both the children.
Case 1: This case can be very easily handled by setting the pointer to the node from its parent equal
to NULL and freeing the node as shown in Figure 7.25.
[Fig. 7.25: deletion of the leaf node 49 (pointed by ptr). The right pointer of its parent 48 is set to NULL and the node is freed.]
The leaf node containing 49 has been deleted. Kindly note that its parent’s pointer has been set to
NULL and the node itself has been freed.
Case 2: This case can also be handled very easily by setting the pointer to the node from its parent
equal to its child node. The node may be freed later on as shown in Figure 7.26.
The node, containing 48, having only one child (i.e., 49) has been deleted. Kindly note that its
parent’s pointer (i.e., 50) has been set to its child (i.e., 49) and the node itself has been freed.
Case 3: From discussions on BST, it is evident that when a BST is travelled by inorder, the displayed
contents of the visited nodes are always in increasing order, i.e., the successor is always greater
than the predecessor. Another point to be noted is that the inorder successor of an internal node (having
both children) will never have a left child and it is always present in the right sub-tree of the internal
node.
[Fig. 7.26: deletion of the node 48 (pointed by ptr), which has only one child. The left pointer of its parent 50 is set to the child 49 and the node is freed.]
Consider the BST given in Figure 7.26. The inorder successor of internal node 45 is 48, and that of
56 is 58. Similarly, in Figure 7.22, the inorder successors of internal nodes 90, 108, and 110 are 104, 109,
and 117, respectively and each one of the successor node has no left child. It may be further noted that
the successor node is always present in the right sub-tree of the internal node.
The successor of a node can be reached by first moving to its right child and then repeatedly moving
to the left child till a node with a NULL left pointer is found, as shown in Figure 7.27.
It may be further noted that if an internal node ptr is to be deleted then the contents of the inorder
successor of ptr should replace the contents of ptr and the successor be deleted by the methods
suggested for Case 1 or Case 2, discussed above. The mechanism is shown in Figure 7.28.
An algorithm to find the inorder successor of a node is given below. It uses a pointer called succPtr
that travels from ptr to its successor and returns it.
Algorithm findSucc(ptr)
{
  succPtr = rightChild (ptr);
  while (leftChild (succPtr) != NULL)
    succPtr = leftChild (succPtr);
  return succPtr;
}
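The three cases of deletion can be put together in a short C sketch. It is one possible arrangement under stated assumptions, not the program of Example 8: the function deleteBST() and the helper newNode() are hypothetical names, and a pointer to the parent's link is carried along during the search so that a separate findParent() step is not needed. findSucc() follows the algorithm given above and must only be called for a node that has a right child.

```c
#include <stdlib.h>

struct binNode
{
    int val;
    struct binNode *leftChild, *rightChild;
};

/* Allocates a leaf node (helper used to build trees by hand). */
struct binNode *newNode(int val)
{
    struct binNode *ptr = malloc(sizeof *ptr);
    if (ptr != NULL)
    {
        ptr->val = val;
        ptr->leftChild = ptr->rightChild = NULL;
    }
    return ptr;
}

/* Inorder successor of a node that has a right child. */
struct binNode *findSucc(struct binNode *ptr)
{
    struct binNode *succPtr = ptr->rightChild;
    while (succPtr->leftChild != NULL)
        succPtr = succPtr->leftChild;
    return succPtr;
}

/* Deletes val from the BST whose root pointer is *link.
   Returns 1 if a node was deleted and 0 if val is not present. */
int deleteBST(struct binNode **link, int val)
{
    struct binNode *ptr;
    /* search: link always addresses the pointer that leads to the current node */
    while (*link != NULL && (*link)->val != val)
        link = (val < (*link)->val) ? &(*link)->leftChild : &(*link)->rightChild;
    ptr = *link;
    if (ptr == NULL) return 0;
    if (ptr->leftChild != NULL && ptr->rightChild != NULL)
    {   /* Case 3: copy the successor's value, delete the successor node */
        int succVal = findSucc(ptr)->val;
        deleteBST(&ptr->rightChild, succVal);   /* falls under Case 1 or Case 2 */
        ptr->val = succVal;
        return 1;
    }
    /* Case 1 and Case 2: the parent's link bypasses the node (NULL for a leaf) */
    *link = (ptr->leftChild != NULL) ? ptr->leftChild : ptr->rightChild;
    free(ptr);
    return 1;
}
```

Since the link also covers the root pointer itself, deleting the root of the tree needs no special treatment in this arrangement.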
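[Fig. 7.27: the inorder successor Succ (ptr) of a node is the leftmost node of its right sub-tree]
[Fig. 7.28: deletion of the internal node 45 (pointed by ptr). The contents of its inorder successor 48 are copied into the node, and the successor node is then freed, its child 49 taking its place under 50.]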
The algorithm that deletes a node from a BST is given below. It takes care of all the three cases given
above.
Algorithm delNodeBST(Tree, Val)
{
  ptr = search (Tree, Val); /* Finds the node to be deleted */
  if (ptr == NULL) report error and return;
  parent = findParent (Tree, ptr); /* Finds the parent of the node */
Example 8: Write a program that deletes a given node from a binary search tree. The program prompts
an appropriate message like “Deletion successful”. Verify the deletion for all the three cases by travelling the
tree in inorder and displaying its contents.
Solution: The above algorithms delNodeBST() and findSucc() are used. The required program is
given below:
/* This program deletes a node from a BST */
#include <stdio.h>
#include <stdlib.h> /* malloc() and free(); standard replacement for the Turbo C header <alloc.h> */
#include <conio.h>
struct binNode
{
int val;
struct binNode *leftChild, *rightChild;
};
struct binNode * createBinTree();
struct binNode *attach(struct binNode *tree, struct binNode *node);
struct binNode *searchBST (struct binNode *binTree, int val, int *flag);
struct binNode *findParent (struct binNode *binTree, struct binNode *ptr);
struct binNode *delNodeBST (struct binNode *binTree, int val, int *flag);
struct binNode *findSucc (struct binNode *ptr);
void inOrderTravel (struct binNode *Tree);
int main (void)
{ int result, val;
  struct binNode *binTree;
  binTree = createBinTree ();
  printf ("\n Enter the value to be deleted");
  scanf ("%d", &val);
  binTree = delNodeBST (binTree, val, &result);
  if (result == 1)
  { printf ("\n Deletion successful, The Tree is..");
    inOrderTravel (binTree);
  }
  else
    printf ("\n Node not present in Tree");
  return 0;
}
struct binNode * delNodeBST (struct binNode *binTree, int val, int *flag)
{ int size, nval;
struct binNode *ptr, *parent,*succPtr;
if (binTree == NULL)
{ *flag =0;
return binTree;
}
ptr = searchBST (binTree, val, flag);
if (*flag == 1)
parent = findParent (binTree, ptr);
else
return binTree;
if (ptr->leftChild == NULL && ptr->rightChild == NULL) /* Case 1*/
{
if (parent->leftChild == ptr)
parent->leftChild = NULL;
else
if (parent->rightChild == ptr)
parent->rightChild = NULL;
free (ptr);
}
A complete binary tree is called a heap tree (say max heap) if it has the following properties:
(1) Each node in the heap has data value more than or equal to its left and right child nodes.
(2) The left and right child nodes themselves are heaps.
Thus, every descendant of a node has data value less than or equal to the data value of the node. The
height of a heap with N nodes = ⌊log₂ N⌋.
7.5.3.1 Representation of a Heap Tree Since a heap tree is a complete binary tree, it can be comfort-
ably and efficiently stored in an array. For example, the heap of Figure 7.30 can be represented in an array
called ‘heap’ as shown in Figure 7.31. The zeroth position contains total number of elements present in
the heap. For example, the zeroth location of heap contains 10 indicating that there are 10 elements in
the heap shown in Figure 7.31.
index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
heap  = 10  70  59  68  36  56  60  62  22  31  45
It may be noted that this representation is simpler than the linked representation as there are
no links. Moreover from any node of the tree, we can move to its parent or to its children, i.e., in
both forward and backward directions. As discussed in Section 7.4.1, the following relationships
hold good:
In an array holding N elements in locations 1 to N, for the given node i, the following relationships
are valid in linear representation:
(1) leftChild (i) = 2*i {When 2*i > N then there is no left child}
(2) rightChild (i) = 2*i + 1 {When 2*i + 1 > N then there is no right child}
(3) parent (i) = ⌊i/2⌋ {node 1 has no parent – it is the root node}
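The three relationships can be written as small C helpers. The names heapLeftChild(), heapRightChild() and heapParent() are hypothetical, and returning 0 for "no such node" is an assumption that works here because location 0 holds the element count and never a node.

```c
/* heap[0] holds the number of elements; the nodes occupy heap[1..heap[0]]. */

/* Index of the left child of node i, or 0 if there is none. */
int heapLeftChild(const int heap[], int i)
{
    int child = 2 * i;
    return (child <= heap[0]) ? child : 0;
}

/* Index of the right child of node i, or 0 if there is none. */
int heapRightChild(const int heap[], int i)
{
    int child = 2 * i + 1;
    return (child <= heap[0]) ? child : 0;
}

/* Index of the parent of node i, or 0 for the root (node 1). */
int heapParent(int i)
{
    return (i > 1) ? i / 2 : 0;   /* integer division gives the floor */
}
```

For the heap of Figure 7.31, node 5 (value 56) has its left child at index 10 (value 45) and no right child.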
The operations that are defined on a heap tree are:
(1) Insertion of a node into a heap tree
(2) Deletion of a node from a heap tree
A discussion on these operations is given in the subsequent sections.
7.5.3.2 Insertion of a Node into a Heap Tree The insertion operation is an important operation on
a heap tree. The node to be inserted is placed on the last available location in the heap, i.e., in the array.
Its data is compared with its parent and if it is more than its parent, then the data of both the parent and
the child are exchanged. The inserted data now occupies the parent's position and is compared with its
new parent, and so on; the process stops only when the data of the child is found to be not more than that
of its parent or we reach the root of the heap. Let us try to insert ‘69’ in the heap given in Figure 7.31.
The operation is illustrated in Figure 7.32.
An algorithm that inserts an element called item into a heap represented in an array of size N called
maxHeap is given below. The number of elements currently present in the heap is stored in the zeroth
location of maxHeap.
[Fig. 7.32: insertion of 69 into the heap of Figure 7.31. The value 69 is placed in the last location (index 11), exchanged with its parent 56, and then exchanged with 59. The resulting heap is:]
index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
heap  = 11  70  69  68  36  59  60  62  22  31  45  56
Algorithm insertHeap(heap, item, size)
{
  if (heap[0] == 0)
  { heap[0] = 1;
    heap[1] = item;
  }
  else
  {
    last = heap[0] + 1;
    if (last > size)
      prompt message "Heap Full";
    else
    {
      heap[last] = item;
      heap[0] = heap[0] + 1;
      I = last;
      Flag = 1;
      while (Flag)
      {
        J = I/2; /* find parent's index (integer division) */
        if (J >= 1)
        {
          if (heap[J] < heap[I])
          {
            temp = heap[J];
            heap[J] = heap[I];
            heap[I] = temp; /* Parent becomes child */
            I = J;
          }
          else
            Flag = 0;
        }
        else
          Flag = 0;
      }
    }
  }
}
Note: The insertion operation is important because the heap itself is created by iteratively inserting the
various elements starting from an empty heap.
Example 9: Write a program that constructs a heap by iteratively inserting elements input by a user into
an empty heap.
Solution: The algorithm insertHeap() would be used. The required program is given below:
/* This program creates a heap by iteratively inserting elements into a
heap Tree */
#include <stdio.h>
#include <conio.h>
void insertHeap(int heap[], int item, int size);
void dispHeap (int heap[]);
int main (void)
{
  int item, i, size;
  int maxHeap [20] = {0}; /* maxHeap[0] holds the element count; it must start at 0 */
  printf ("\n Enter the size(<20) of heap");
  scanf ("%d", &size);
  printf ("\n Enter the elements of heap one by one");
  for (i = 1; i <= size; i++)
  {
    printf ("\n Element:");
    scanf ("%d", &item);
    insertHeap (maxHeap, item, size);
  }
  printf ("\n The heap is..");
  dispHeap (maxHeap);
  return 0;
}
7.5.3.3 Deletion of a Node from a Heap Tree The deletion operation on heap tree is relevant when
its root is deleted. Deletion of any other element has insignificant applications. The root is deleted by the
following steps:
(1) Save the root in temp.
(2) Bring the right most leaf of the heap into root.
(3) Move root down the heap till the heap is ordered.
(4) Return temp containing the deleted node.
The operation given in step 3 above is also called the reheap operation. Consider the heap given
in Figure 7.33.
[Figure: heap tree with level-order contents 70; 69 68; 36 59 60 62; 22 31 45 56]
Fig. 7.33 A heap tree
Let us delete its root (i.e., 70) by bringing the rightmost leaf (i.e., 56) into the root as shown in
Figure 7.34. The root has been saved in temp.
An algorithm that deletes the maximum element (i.e., root) from a heap represented in an array
of size N called maxHeap is given below. The number of elements currently present in the heap is
stored in the zeroth location of maxHeap.
Algorithm delMaxHeap(heap)
{
if (heap[0] == 1) return heap[1]; /* There is only one element in the heap */
last = heap[0];
tempVal = heap[1]; /* Save the item to be deleted and returned */
heap[1]= heap[last]; /* Bring the last leaf node to root */
[Fig. 7.34: deletion of the root 70. The root is saved in temp, the last leaf 56 is brought into the root and moved down past 69 and 59, giving the heap 69; 59 68; 36 56 60 62; 22 31 45. Finally temp (70) is returned.]
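The four deletion steps can be sketched in C as follows. This is a minimal sketch under stated assumptions rather than the listing of Example 10: the element count is kept in heap[0] as described above, and returning -1 for an empty heap is a choice made only for this sketch.

```c
/* Deletes and returns the maximum element (the root) of a max heap.
   heap[0] holds the number of elements; nodes occupy heap[1..heap[0]]. */
int delMaxHeap(int heap[])
{
    int last = heap[0];
    int tempVal, I, pos, temp;
    if (last < 1) return -1;          /* empty heap (assumed sentinel) */
    tempVal = heap[1];                /* step 1: save the root */
    heap[1] = heap[last];             /* step 2: bring the last leaf to the root */
    last = last - 1;
    heap[0] = last;
    I = 1;
    while (2 * I <= last)             /* step 3: reheap, move the root down */
    {
        pos = 2 * I;                  /* left child */
        if (pos + 1 <= last && heap[pos + 1] > heap[pos])
            pos = pos + 1;            /* right child exists and is greater */
        if (heap[pos] <= heap[I])
            break;                    /* heap is ordered again */
        temp = heap[pos];             /* exchange parent and greater child */
        heap[pos] = heap[I];
        heap[I] = temp;
        I = pos;
    }
    return tempVal;                   /* step 4: return the deleted value */
}
```

The test pos + 1 <= last makes sure that the right child is compared only when it actually lies inside the heap.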
Example 10: Write a program that deletes and displays the root of a heap. The remaining elements of the
heap are put to reheap operation.
Solution: The algorithm delMaxHeap() would be used. The required program is given below:
/* This program deletes the maximum element from a heap Tree*/
#include <stdio.h>
#include <conio.h>
int delMaxHeap(int heap[]);
void insertHeap(int heap[],int item, int size);
void dispHeap(int heap[]);
int main (void)
{
  int item, i, size;
  int maxHeap [20] = {0}; /* maxHeap[0] holds the element count; it must start at 0 */
  printf ("\n Enter the size(<20) of heap");
  scanf ("%d", &size);
  printf ("\n Enter the elements of heap one by one");
  for (i = 1; i <= size; i++)
  {
    printf ("\n Element :");
    scanf ("%d", &item);
    insertHeap (maxHeap, item, size);
  }
  printf ("\n The heap is..");
  dispHeap (maxHeap);
  item = delMaxHeap (maxHeap);
  printf ("\n The deleted item is : %d", item);
  printf ("\n The heap after deletion is..");
  dispHeap (maxHeap);
  return 0;
}
7.5.3.4 Heap Sort From discussion on heap trees, it is evident that the root of a maxHeap contains
the largest value. When the root of the max heap is deleted, we get the largest element of the heap.
The remaining elements are put to reheap operation and we get a new heap. If the two operations,
delete and reheap, are repeated till the heap becomes empty and the deleted elements are stored
in the order of their removal, then we get a sorted list of elements in descending order as shown in
Figure 7.35.
[Fig. 7.35: heap sort by repeated delete and reheap. Starting from the heap of Figure 7.33, the roots are removed in the order 70, 69, 68, 62, and so on, and stored in that order.]
Example 11: Write a program that sorts a given list of numbers by heap sort.
Solution: We would modify the program of Example 10 wherein the element deleted from heap would
be stored in a separate array called sortList. The deletion would be carried out iteratively till the heap
becomes empty. Finally, sortList would be displayed. The required program is given below:
/* This program sorts a given list of numbers using a heap tree*/
#include <stdio.h>
#include <conio.h>
int delMaxHeap(int heap[]);
void insertHeap(int heap[],int item, int size);
void dispHeap (int heap[]);
int main (void)
{
  int item, i, size, last;
  int maxHeap [20] = {0}; /* maxHeap[0] holds the element count; it must start at 0 */
  int sortList [20];
  printf ("\n Enter the size(<20) of heap");
  scanf ("%d", &size);
  printf ("\n Enter the elements of heap one by one");
  for (i = 1; i <= size; i++)
  {
    printf ("\n Element :");
    scanf ("%d", &item);
    insertHeap (maxHeap, item, size);
  }
  printf ("\n The heap is..");
  dispHeap (maxHeap);
  getch ();
  last = maxHeap[0];
  for (i = 0; i < last; i++)
  {
    item = delMaxHeap (maxHeap);
    sortList [i] = item;
  }
  printf ("\n The sorted list...");
  for (i = 0; i < last; i++)
    printf ("%d ", sortList[i]);
  return 0;
}
[Figure: in-place heap sort by repeated exchange and reheap. The root is exchanged with the last leaf, which then holds its final value (70, then 69, then 68), and the rest of the heap is put to the reheap operation.]
      I = 1;
      Flag = 1;
      while (Flag)
      {
        J = I*2;
        if (J <= K)
        { /* find the greater of the two children */
          /* assuming K is the last index of the remaining heap, J == K means only a left child exists */
          if (J == K || heap[J] > heap[J+1])
            pos = J;
          else
            pos = J+1;
          if (heap[pos] > heap[I])
          {
            temp = heap[pos]; /* exchange parent and child */
            heap[pos] = heap[I];
            heap[I] = temp;
            I = pos;
          }
          else
            Flag = 0;
        }
        else
          Flag = 0;
      }
    }
    while (last > 1);
  }
}
It may be noted that the element at the root is being exchanged with the last leaf node and the rest
of the heap is being put to reheap operation. This is iteratively being done in the do-while loop till there
is only one element left in the heap.
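Since only the tail of the algorithm is visible above, a self-contained C sketch of the same exchange-and-reheap idea is given here. It is written under assumptions and is not the author's listing: the name sortMaxHeap() is borrowed from the prototype used in Example 12, the array is assumed to already be a max heap with its element count in heap[0], and the count is left unchanged.

```c
/* In-place heap sort of a max heap stored in heap[1..heap[0]].
   On return heap[1..heap[0]] is in ascending order. */
void sortMaxHeap(int heap[])
{
    int last = heap[0];
    int I, pos, temp;
    while (last > 1)
    {
        /* exchange the root (largest) with the last leaf of the heap */
        temp = heap[1];
        heap[1] = heap[last];
        heap[last] = temp;
        last = last - 1;              /* the exchanged value is now in its final place */
        /* reheap the remaining elements heap[1..last] */
        I = 1;
        while (2 * I <= last)
        {
            pos = 2 * I;              /* left child */
            if (pos + 1 <= last && heap[pos + 1] > heap[pos])
                pos = pos + 1;        /* right child exists and is greater */
            if (heap[pos] <= heap[I])
                break;                /* heap is ordered again */
            temp = heap[pos];         /* exchange parent and greater child */
            heap[pos] = heap[I];
            heap[I] = temp;
            I = pos;
        }
    }
}
```

No second array is needed: the part of the array behind index last gradually fills with the largest values in sorted order.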
Example 12: Write a program that in-place sorts a given list of numbers by heap sort.
Solution: We would employ the algorithm IpHeapSort() wherein the element from the root is
exchanged with the last leaf node and rest of the heap is put to reheap operation. The exchange and
reheap is carried out iteratively till the heap contains only 1 element.
The required program is given below:
/* This program in place sorts a given list of numbers using heap tree*/
#include <stdio.h>
#include <conio.h>
void sortMaxHeap(int heap[]);
void insertHeap(int heap[],int item, int size);
void dispHeap (int heap[]);
int main (void)
{
  int item, i, size;
  int maxHeap [20] = {0}; /* maxHeap[0] holds the element count; it must start at 0 */
  printf ("\n Enter the size(<20) of heap");
  scanf ("%d", &size);
  printf ("\n Enter the elements of heap one by one");
  for (i = 1; i <= size; i++)
  {
    printf ("\n Element :");
    scanf ("%d", &item);
    insertHeap (maxHeap, item, size);
  }
  printf ("\n The heap is..");
  dispHeap (maxHeap);
  getch ();
  sortMaxHeap (maxHeap);
  printf ("\n The sorted list...");
  for (i = 1; i <= size; i++)
    printf ("%d ", maxHeap[i]);
  return 0;
}
void insertHeap(int heap[],int item, int size)
{ int last, I, J, temp, Flag;
if (heap [0] == 0)
{ heap[0] = 1;
heap[1] = item;
}
else
{
last = heap[0] +1;
if (last >size)
{
printf ("\n Heap Full");
}
else
{
heap[last] = item;
heap[0]++;
I = last;
Flag = 1;
while (Flag)
{
J = (int) (I/2);
if (J >= 1)
{ /* find parent */
if (heap[J] < heap [I])
{
temp = heap[J];
heap [J]= heap [I];
heap [I] = temp;
I =J;
}
else
Flag = 0;
}
else
Flag=0;
}
}
}
}
7.5.3.5 Merging of Two Heaps Merging of heap trees is an interesting application wherein two heap
trees are merged to produce a tree which is also a heap. The merge operation is carried out by the
following steps:
[Fig. 7.37: merging of two heaps. Heap 1 (level order): 67; 31 34; 8 10 5. Heap 2: 22; 12 17; 4 9 6 15. Merged heap: 67; 31 34; 17 12 6 22; 8 15 10 9 5 4.]
Example 13: Write a program that merges two heap trees to produce a third tree which is also a heap.
Solution: The required program is given below:
/* This program merges two heap trees */
#include <stdio.h>
#include <conio.h>
int delMaxHeap(int heap[]);
void insertHeap(int heap[], int item, int size);
void dispHeap (int heap[]);
int main (void)
{
  int item, i, size1, size2;
  /* location 0 of each heap holds its element count; it must start at 0 */
  int maxHeap1 [20] = {0}, maxHeap2 [20] = {0};
  printf ("\n Enter the size(<20) of heap1");
  scanf ("%d", &size1);
  printf ("\n Enter the elements of heap one by one");
  for (i = 1; i <= size1; i++)
  {
    printf ("\n Element :");
    scanf ("%d", &item);
    insertHeap (maxHeap1, item, size1);
  }
  printf ("\n The heap1 is..");
  dispHeap (maxHeap1);
  printf ("\n Enter the size(<20) of heap2");
  scanf ("%d", &size2);
  printf ("\n Enter the elements of heap one by one");
  for (i = 1; i <= size2; i++)
  {
    printf ("\n Element :");
    scanf ("%d", &item);
    insertHeap (maxHeap2, item, size2);
  }
  printf ("\n The heap2 is..");
  dispHeap (maxHeap2);
  /* move every element of heap2 into heap1; size1 + size2 must stay below 20 */
  for (i = 1; i <= size2; i++)
  {
    item = delMaxHeap (maxHeap2);
    size1 = size1 + 1;
    insertHeap (maxHeap1, item, size1);
  }
  printf ("\n The heap after merge is..");
  dispHeap (maxHeap1);
  return 0;
}
The above program was tested on the heap trees given in Figure 7.37. The input provided is given
below:
Heap1: 67 31 34 8 10 5
Heap2: 22 12 17 4 9 6 15
■ If a node is having an empty left child (i.e., a NULL pointer), then replace it by a special pointer
that points to its predecessor node in the tree.
■ If a node is having an empty right child (i.e., a NULL pointer), then replace it by a special pointer
that points to its successor node in the tree.
Consider the binary tree given in Figure 7.38 wherein a binary tree has been assigned the threads.
It may be noted that the normal pointers have been shown with thick lines and the threads with dotted
lines.
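[Figure: a binary tree with root A; B and C are the children of A, D and E the children of B, F and G the children of C, and H and I the children of D. The tree is shown twice: with NULL pointers, and with the NULL pointers replaced by threads.]
Fig. 7.38 The vacant NULL pointers have been replaced by threads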
The resultant binary tree of Figure 7.38 is called a threaded binary tree (TBT). It may be noted
that the characteristic of this special tree is that a left thread of a node points to its inorder predecessor
and the right thread to its inorder successor. For example, the left thread of ‘E’ is pointing to ‘B’, its
predecessor, and the right thread of ‘E’ is pointing to ‘A’, its successor.
A threaded binary tree is a binary tree in which a node uses its empty left child pointer to
point to its inorder predecessor and the empty right child pointer to its inorder successor.
Now, the problem is how a program would differentiate between a normal pointer and a thread. A simple
solution is to keep variables called LTag and RTag, having value 1 or 0 indicating a normal pointer or a
thread, respectively. Accordingly, a node of a threaded binary tree would have the structure as given in
Figure 7.39.
[Figure 7.39: a node with the fields LTag, DATA and RTag, and the pointers Lchild and Rchild]
Fig. 7.39 A node of a threaded binary tree
Let us now use the structure of Figure 7.39 to represent the threaded binary tree of Figure 7.38.
The arrangement is shown in Figure 7.40.
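The node structure described above can be sketched in C as follows. This is only a minimal sketch: the type name tbtNode and the helper newNode() are assumptions made for illustration, while the field names follow Figure 7.39 and the algorithms of this section.

```c
#include <stdlib.h>

/* One node of a threaded binary tree (tbtNode is a hypothetical name). */
struct tbtNode {
    int LTag;                    /* 1: leftChild is a normal pointer, 0: a thread  */
    struct tbtNode *leftChild;
    char DATA;
    struct tbtNode *rightChild;
    int RTag;                    /* 1: rightChild is a normal pointer, 0: a thread */
};

/* Allocate a node whose two links are threads that are not yet connected. */
struct tbtNode *newNode(char data)
{
    struct tbtNode *p = malloc(sizeof *p);
    if (p != NULL) {
        p->DATA = data;
        p->LTag = p->RTag = 0;
        p->leftChild = p->rightChild = NULL;
    }
    return p;
}
```

A node created this way has both tags set to 0; the insertion algorithms discussed later connect its threads.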
[Figure 7.40: tag bits (LTag, RTag) of the nodes: A (1, 1), B (1, 1), C (1, 1), D (1, 1), E (0, 0),
F (0, 0), G (0, 0), H (0, 0), I (0, 0)]
Fig. 7.40 The threaded binary tree with node structure having the Tag bits
A left thread pointing to NULL indicates that there is no more inorder predecessor. Similarly, a NULL
right thread indicates that there is no more inorder successor. For example, the left thread of ‘H’ is
pointing to NULL and the right thread of ‘G’ is pointing to NULL. This anomalous situation can be handled by
keeping a head node having a left child and a right thread as shown in Figure 7.41. This dummy node
does not contain any value. In fact, it is an empty threaded binary tree because both its left child and
right thread are pointing to itself.
[Figure 7.41: the head node, drawn with tag bits 1 and 1]
Now the dangling threads of a binary tree can be made to point to the head node and the left child
of the head node to the root of the binary threaded tree as shown in Figure 7.42.
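[Figure 7.42: the threaded binary tree of Figure 7.40 attached to the head node; the left child of the
head node points to the root ‘A’ and the dangling threads of ‘H’ and ‘G’ point to the head node]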
Now with the arrangement shown in Figure 7.42, it can be appreciated that the inorder successor and
predecessor of a node pointed by ptr can be very easily found by the following set of rules:
• If RTag of ptr is 0 (i.e., thread), then the rightChild is pointing to its successor.
• If RTag of ptr is 1 (i.e., normal pointer), then the left most node of its right sub-tree is its successor.
• If LTag of ptr is 0 (i.e., thread), then the leftChild is pointing to its predecessor.
• If LTag of ptr is 1 (i.e., normal pointer), then the right most node of its left sub-tree is its predecessor.
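The rules above translate almost directly into C. The sketch below is illustrative only: the structure name tbtNode and the function inOrder() are assumptions, and the head node is assumed to have LTag = 1 with its left child pointing to the root, and RTag = 1 with its right child pointing to itself (the tree is assumed to be non-empty).

```c
struct tbtNode {
    int LTag, RTag;                        /* 1 = normal pointer, 0 = thread */
    char DATA;
    struct tbtNode *leftChild, *rightChild;
};

/* Inorder successor: follow the right thread, or take the leftmost
   node of the right sub-tree when the right link is a normal pointer. */
struct tbtNode *inOrderSucc(struct tbtNode *ptr)
{
    struct tbtNode *s = ptr->rightChild;
    if (ptr->RTag == 1)
        while (s->LTag == 1)
            s = s->leftChild;
    return s;
}

/* Inorder predecessor: the mirror image of inOrderSucc(). */
struct tbtNode *inOrderPred(struct tbtNode *ptr)
{
    struct tbtNode *p = ptr->leftChild;
    if (ptr->LTag == 1)
        while (p->RTag == 1)
            p = p->rightChild;
    return p;
}

/* Non-recursive inorder traversal: copies at most max data items into
   out[] and returns the number of nodes visited. */
int inOrder(struct tbtNode *head, char out[], int max)
{
    int n = 0;
    struct tbtNode *p = inOrderSucc(head);  /* first node in inorder */
    while (p != head && n < max) {
        out[n++] = p->DATA;
        p = inOrderSucc(p);
    }
    return n;
}
```

Note that the traversal needs neither recursion nor a stack, which is the main benefit of the threads.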
As the threaded binary tree has been designed to honour inorder travel, the insertion and deletion
have to be carried out in such a way that after the operation the tree remains the inorder threaded binary
tree. A brief discussion on both the operations is given in the subsequent sections.
7.5.4.1 Insertion into a Threaded Binary Tree
A node ‘X’, having Data part equal to ‘T’, can be inserted into an inorder threaded binary tree after
node ‘Y’ in any of the following four ways:
Case 1: When X is inserted as left child of Y and Y has an empty left child.
Case 2: When X is inserted as right child of Y and Y has an empty right child.
Case 3: When X is inserted as left child of Y and Y has a non-empty left child.
Case 4: When X is inserted as right child of Y and Y has a non-empty right child.
CASE 1:
When Y (Say Data part equal to ‘F’) has an empty left child, then it must be a thread. The insertion can be
made by making the left thread of X (Data part equal to ‘T’) to point where the left thread of Y is currently
pointing to. Thereafter, X becomes the left child of Y for which LTag of Y is reset to 1. The right child of
X is set as thread pointing to the node Y, indicating that Y is the successor of X. The insertion operation is
shown in Figure 7.43.
[Figure 7.43: the tree before and after the insertion; afterwards the tag bits of ‘F’ are (1, 0) and
those of ‘T’ are (0, 0)]
It may be noted that the node containing ‘T’ has become the left child of node containing ‘F’. After the
insertion, the tree is still a threaded binary tree. The predecessor of ‘T’ is ‘A’ and the successor of ‘T’ is ‘F’.
An algorithm for insertion of node as left thread is given below:
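Algorithm insertAtLThread(X,Y)
{
if (LTag(Y) == 0) /* Y has a left thread, i.e., an empty left child */
{
LTag (X) = 0; /* Left thread of X points where the left thread of Y was pointing */
leftChild (X) = leftChild (Y);
LTag (Y) = 1; /* X becomes the left child of Y */
leftChild (Y) = X;
RTag (X) = 0; /* Right thread of X points to Y, its successor */
rightChild (X) = Y;
}
}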
CASE 2:
When Y (Data part equal to ‘F’) has an empty right child, then it must be a thread. The insertion can be made
by making the right thread of X (Data part equal to ‘T’) point where the right thread of Y is currently
pointing to. Thereafter, X becomes the right child of Y for which RTag of Y is reset to 1. The left child of X
is set as a thread pointing to the node Y, indicating that Y is the predecessor of X. The insertion operation
is given in Figure 7.44.
[Figure 7.44: the tree before and after the insertion; afterwards the tag bits of ‘F’ are (0, 1) and
those of ‘T’ are (0, 0)]
It may be noted that the node containing ‘T’ has become the right child of node containing ‘F’. After the
insertion, the tree is still a threaded binary tree. The predecessor of ‘T’ is ‘F’ and the successor of ‘T’ is ‘C’.
An algorithm for insertion of node as right thread is given below:
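Algorithm insertAtRThread(X,Y)
{
if (RTag(Y) == 0) /* Y has a right thread, i.e., an empty right child */
{
RTag (X) = 0; /* Right thread of X points where the right thread of Y was pointing */
rightChild (X) = rightChild (Y);
RTag (Y) = 1; /* X becomes the right child of Y */
rightChild (Y) = X;
LTag (X) = 0; /* Left thread of X points to Y, its predecessor */
leftChild (X) = Y;
}
}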
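The two algorithms for CASE 1 and CASE 2 can be tried out with the following C sketch. The structure name tbtNode and the integer return values (1 on success, 0 when Y already has a child on that side) are assumptions added for illustration.

```c
struct tbtNode {
    int LTag, RTag;                        /* 1 = normal pointer, 0 = thread */
    char DATA;
    struct tbtNode *leftChild, *rightChild;
};

/* CASE 1: insert X as the left child of Y, where Y has a left thread. */
int insertAtLThread(struct tbtNode *X, struct tbtNode *Y)
{
    if (Y->LTag != 0)
        return 0;                          /* Y already has a left child */
    X->LTag = 0;
    X->leftChild = Y->leftChild;           /* X inherits the old predecessor of Y */
    X->RTag = 0;
    X->rightChild = Y;                     /* Y is the successor of X */
    Y->LTag = 1;
    Y->leftChild = X;
    return 1;
}

/* CASE 2: insert X as the right child of Y, where Y has a right thread. */
int insertAtRThread(struct tbtNode *X, struct tbtNode *Y)
{
    if (Y->RTag != 0)
        return 0;                          /* Y already has a right child */
    X->RTag = 0;
    X->rightChild = Y->rightChild;         /* X inherits the old successor of Y */
    X->LTag = 0;
    X->leftChild = Y;                      /* Y is the predecessor of X */
    Y->RTag = 1;
    Y->rightChild = X;
    return 1;
}
```

In both functions the threads of X are set before Y is modified, so the old thread of Y is never lost.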
CASE 3:
When Y (Data part equal to ‘C’) has a non-empty left child, the insertion can be made by making the left
child of X (Data part equal to ‘T’) point where the left child of Y is currently pointing to. Thereafter, X
becomes the left child of Y by pointing the left child of Y to X. The right child of X is set as a thread
pointing to the node Y, indicating that Y is the successor of X.
Now the most important step is to find the node that was the inorder predecessor of Y before the
insertion (pointed by ptr) and make it the inorder predecessor of X by making ptr’s right thread point
to X. The insertion operation is shown in Figure 7.45.
It may be noted that the node containing ‘T’ has become the left child of node containing ‘C’. After the
insertion, the tree is still a threaded binary tree. The predecessor of ‘T’ is ‘F’ and the successor of ‘T’ is ‘C’.
An algorithm for insertion of node as left child is given below:
Algorithm insertAtLchild(X,Y)
{
RTag (X) = 0; /* Connect the right thread of X to Y */
rightChild (X) = Y;
LTag (X) = 1; /* Connect the leftChild of X to the node pointed by left child of Y */
leftChild (X) = leftChild (Y);
leftChild (Y) = X; /* Point the left child of Y to X */
ptr = inOrderPred (X); /* Find the inorder predecessor of X, i.e., the node that preceded Y
before the insertion */
rightChild (ptr) = X; /* Connect the right thread of ptr (predecessor) to X */
}
CASE 4:
When Y (Data part equal to ‘C’) has non-empty right child, the insertion can be made by making the
right child of X (Data part equal to ‘T’) point where the right child of Y is currently pointing to. There-
after, X becomes the right child of Y by pointing the right child of Y to X. The left thread of X is set as a
thread pointing to the node Y, indicating that Y is the predecessor of X.
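[Figure 7.45: the tree before and after the insertion; afterwards X (‘T’) has tag bits (1, 0), its left
child is ‘F’ (pointed by ptr), and the right thread of ‘F’ points to X]
Fig. 7.45 Insertion of a node at non empty left child of a given node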
Now, the most important step is to find the node that was the inorder successor of Y before the
insertion (pointed by ptr) and make it the inorder successor of X by making ptr’s left thread point to X.
The insertion operation is shown in Figure 7.46.
It may be noted that the node containing ‘T’ has become the right child of node containing ‘C’. After the
insertion, the tree is still a threaded binary tree. The predecessor of ‘T’ is ‘C’ and the successor of ‘T’ is ‘G’.
An algorithm for insertion of node as right child is given below:
Algorithm insertAtRchild(X,Y)
{
LTag (X) = 0; /* Connect the left thread of X to Y */
leftChild (X) = Y;
RTag (X) = 1; /* Connect the rightChild of X to the node pointed by right child of Y */
rightChild (X) = rightChild (Y);
rightChild (Y) = X; /* Point the right child of Y to X */
ptr = inOrderSucc (X); /* Find the inorder successor of X, i.e., the node that followed Y
before the insertion */
leftChild (ptr) = X; /* Connect the left thread of ptr (successor) to X */
}
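A C sketch of CASE 3 and CASE 4 is given below. It is an illustration under assumptions: the structure name tbtNode and the 0/1 return values are not from the text, and the predecessor (or successor) is located by walking down from X after X has been linked in, so that the walk does not stop at X itself.

```c
struct tbtNode {
    int LTag, RTag;                        /* 1 = normal pointer, 0 = thread */
    char DATA;
    struct tbtNode *leftChild, *rightChild;
};

/* CASE 3: insert X between Y and the non-empty left sub-tree of Y. */
int insertAtLchild(struct tbtNode *X, struct tbtNode *Y)
{
    struct tbtNode *ptr;
    if (Y->LTag != 1)
        return 0;                          /* Y has no left child: use CASE 1 */
    X->RTag = 0;
    X->rightChild = Y;                     /* Y is the successor of X */
    X->LTag = 1;
    X->leftChild = Y->leftChild;           /* X adopts the old left sub-tree */
    Y->leftChild = X;
    ptr = X->leftChild;                    /* rightmost node of that sub-tree */
    while (ptr->RTag == 1)
        ptr = ptr->rightChild;
    ptr->rightChild = X;                   /* its right thread now points to X */
    return 1;
}

/* CASE 4: insert X between Y and the non-empty right sub-tree of Y. */
int insertAtRchild(struct tbtNode *X, struct tbtNode *Y)
{
    struct tbtNode *ptr;
    if (Y->RTag != 1)
        return 0;                          /* Y has no right child: use CASE 2 */
    X->LTag = 0;
    X->leftChild = Y;                      /* Y is the predecessor of X */
    X->RTag = 1;
    X->rightChild = Y->rightChild;         /* X adopts the old right sub-tree */
    Y->rightChild = X;
    ptr = X->rightChild;                   /* leftmost node of that sub-tree */
    while (ptr->LTag == 1)
        ptr = ptr->leftChild;
    ptr->leftChild = X;                    /* its left thread now points to X */
    return 1;
}
```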
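[Figure 7.46: the tree before and after the insertion; afterwards X (‘T’) has tag bits (0, 1), its right
child is ‘G’ (pointed by ptr), and the left thread of ‘G’ points to X]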
A brief discussion on deletion of a node from a threaded binary tree is given in Section 7.5.4.2.
7.5.4.2 Deletion from a Threaded Binary Tree
A node ‘X’, having Data part equal to ‘T’, can be deleted from an inorder threaded binary tree in any
of the following five ways:
Case 1: When X is a left leaf node.
Case 2: When X is a right leaf node.
Case 3: When X is only having a right sub-tree.
Case 4: When X is only having a left sub-tree.
Case 5: When X is having both sub-trees.
Before an attempt is made to delete a node from any tree, the parent of the node must be reached
first so that the dangling pointers produced after deletion can be properly handled. Therefore, the
following algorithm is given that finds the parent ‘Y’ of a node ‘X’.
Y = Tree;
The above algorithm is recursive. A non-recursive algorithm that uses the threads to reach to the
parent of a node X is given below. It uses the strategy wherein it reaches the successor of the rightmost
node. If the left child of the successor points to X, then the successor is parent. Otherwise, move to the
left child and then keep going to right till a node is found whose right child is equal to X.
Algorithm findParent (X)
{
ptr = X;
while (RTag(ptr) == 1) /* Find the node which has a right thread */
ptr = rightChild (ptr);
succ = rightChild (ptr); /* Move to the successor node */
if (leftChild (succ) == X)
Y = succ; /* Parent found, point it by Y */
else
{ succ = leftChild (succ);
while (rightChild (succ) != X)
succ = rightChild (succ);
Y = succ;
}
return Y;
}
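The non-recursive algorithm can be written in C as sketched below. The structure name tbtNode is an assumption, and the sketch relies on the head node arrangement of Figure 7.42 (the right thread of the rightmost node points to the head node, whose left child is the root), so that the parent of the root is reported as the head node.

```c
struct tbtNode {
    int LTag, RTag;                        /* 1 = normal pointer, 0 = thread */
    char DATA;
    struct tbtNode *leftChild, *rightChild;
};

/* Returns the parent of X, using only the threads. */
struct tbtNode *findParent(struct tbtNode *X)
{
    struct tbtNode *ptr = X, *succ;
    while (ptr->RTag == 1)                 /* rightmost node below X */
        ptr = ptr->rightChild;
    succ = ptr->rightChild;                /* follow its right thread */
    if (succ->leftChild == X)
        return succ;                       /* X is the left child of succ */
    succ = succ->leftChild;                /* otherwise X lies on the right spine */
    while (succ->rightChild != X)
        succ = succ->rightChild;
    return succ;
}
```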
Now the algorithm findParent() can be conveniently used to delete a node from a threaded
binary tree. The various cases of deletion are discussed below:
CASE 1:
When X is a left leaf node, then its parent Y is found (see Figure 7.47). The left child of Y is made to point
where X’s left thread is currently pointing to. The LTag of Y is set to 0, indicating that it has become a
thread now.
An algorithm for deletion of a node X which is a left leaf node is given below:
Algorithm delLeftLeaf (X,Y)
{
Y = findParent (X);
if (leftChild (Y) == X)
{
/* Make left child of parent as thread and let it point where X’s left thread is pointing */
leftChild (Y) = leftChild (X);
LTag (Y) = 0;
}
return X;
}
It may be noted from Figure 7.47 that node X has been deleted and the left child of its parent Y has
been suitably handled.
[Figure 7.47: the tree before and after the deletion of the left leaf node X]
This case is similar to CASE 1 and, therefore, illustration for this case is not provided.
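Deletion of a leaf node (CASE 1 and CASE 2) can be sketched in C as a single function. This is an assumption-laden illustration rather than the text's own routine: the name deleteLeaf() is hypothetical, the parent Y is passed in by the caller (for instance, as found by findParent()), and NULL is returned when X is not a leaf child of Y.

```c
#include <stddef.h>

struct tbtNode {
    int LTag, RTag;                        /* 1 = normal pointer, 0 = thread */
    char DATA;
    struct tbtNode *leftChild, *rightChild;
};

/* Unlinks the leaf X from its parent Y and returns X, or NULL on error. */
struct tbtNode *deleteLeaf(struct tbtNode *X, struct tbtNode *Y)
{
    if (X->LTag != 0 || X->RTag != 0)
        return NULL;                       /* X is not a leaf */
    if (Y->LTag == 1 && Y->leftChild == X) {
        Y->leftChild = X->leftChild;       /* CASE 1: inherit X's left thread */
        Y->LTag = 0;
    } else if (Y->RTag == 1 && Y->rightChild == X) {
        Y->rightChild = X->rightChild;     /* CASE 2: inherit X's right thread */
        Y->RTag = 0;
    } else {
        return NULL;                       /* X is not a child of Y */
    }
    return X;
}
```

The function only unlinks X; releasing its memory is left to the caller.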
CASE 3:
When X is having only a right sub-tree, its parent Y is found. A pointer called ‘grandson’ is made to point
to the right child of X (see Figure 7.48). The right child of Y is pointed to grandson. The successor of X is
found. The left thread of successor is pointed to Y, indicating that the successor of X has now become the
successor of parent as shown in Figure 7.48.
[Figure 7.48: the tree before and after the deletion of X (‘D’), whose right child ‘E’ is the grandson;
afterwards ‘E’ is the right child of Y (‘B’)]
It may be noted that in Figure 7.48, the grandson is also the successor of X.
An algorithm for deletion of X when X is having a right sub-tree is given below:
Algorithm delNonLeafR (X, Y)
{
Y = findParent (X);
if (RTag (X) == 1)
{
grandson = rightChild (X);
rightChild (Y) = grandson; /* X is taken to be the right child of Y, as in Figure 7.48 */
succ = inOrderSucc(X);
leftChild (succ) = Y; /* Left thread of the successor now points to Y */
}
return X;
}
CASE 4:
When X is having only left sub-tree, its parent Y is found. A pointer called ‘grandson’ is made to point to the
left child of X. The left child of Y is pointed to grandson. The predecessor of X is found. The right thread of
predecessor is pointed to Y, indicating that the predecessor of X has now become the predecessor of parent.
An algorithm for deletion of X when X is having a left sub-tree is given below:
Algorithm delNonLeafL (X, Y)
{
Y = findParent (X);
if (LTag (X) == 1)
{
grandson = leftChild (X);
leftChild (Y) = grandson; /* X is taken to be the left child of Y */
pred = inOrderPred(X);
rightChild (pred) = Y; /* Right thread of the predecessor now points to Y */
}
return X;
}
This case is similar to CASE 3 and, therefore, illustration for this case is not provided.
CASE 5:
When X is having both sub-trees, then X is not deleted but its successor is found. The data of successor is
copied in X and the successor is deleted using CASE 1, CASE 2 or CASE 3, whichever is applicable (the
successor has no left sub-tree, but it may have a right sub-tree).
An algorithm for deletion of X when X is having both sub-trees is given below:
Algorithm delBoth (X)
{
succ = inOrderSucc (X);
DATA (X) = DATA (succ);
Delete succ using Case1, Case2 or Case3
}
Note: The applications of threaded binary trees are limited in the sense that the above discussions per-
tain to inorder threaded binary trees only. For postorder, the above algorithms need to be rewritten. The
implementation is rather complex.
A search engine collects information from various Web sites with the help of downloaders called
crawlers. The information, if large, needs to be compressed before sending it over the network so as to
save bandwidth. The huge data collected by the search engine needs to be encoded prior to storing it
in its local repository.
Similarly, fax machines must also compress and encode data before sending it over the telephone
line.
All these applications require efficient encoding techniques. In this section, the Huffman coding
technique is discussed with a view to introducing the reader to the area of data compression, an active
area of research.
The following terms of binary trees are important for discussions related to the topic of this section.
2-Tree or Extended Binary Tree: It is a binary tree with the property that each node has either 0 children
or 2 children. The tree given in Figure 7.49 is an Extended Binary Tree (EBT).
[Figure 7.49: nodes level by level: A; B, C; D, E, F, G; H, I, J, K, L, M]
Fig. 7.49 An extended binary tree
Internal Node: The node with two children is called an internal node. For example, in Figure 7.49, the
nodes A, B, C, D, E, and F are internal nodes.
External Node: A node with no children is called an external node. This is also popularly known as a
leaf node. For example, the nodes G, H, I, J, K, L, and M of Figure 7.49 are external or leaf nodes.
Path Length: The number of edges traversed to reach a node is called the path length of the node. For
example the path length of Node D is 2 and that of Node K is 3. Of course, the path length of root node
A is 0.
As per convention, it may be noted that internal and external nodes have been represented by circle
and rectangle, respectively. This representation of a binary tree is also called as Extended Binary Tree
(EBT). Thus, in EBT, an internal node is represented as a circle and the external node as a rectangle.
External Path Length: Let $L_{EXT}$ be the external path length of a binary tree. It is defined as the sum
of the path lengths of all external nodes. The external path length of the tree given in Figure 7.49 is given
below:
$L_{EXT} = L_H + L_I + L_J + L_K + L_L + L_M + L_G = 3 + 3 + 3 + 3 + 3 + 3 + 2 = 20$
where $L_i$ is the path length of external node $i$.
This can be formally defined as:
$L_{EXT} = \sum_{i=1}^{N} L_i$
where $N$ is the number of external nodes.
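The definition of external path length can be checked with a short recursive C function. The following is a minimal sketch; the structure name node and the function name externalPathLength() are assumptions made for illustration.

```c
#include <stddef.h>

struct node {
    char data;
    struct node *left, *right;
};

/* Sum of the path lengths of all external (leaf) nodes below t,
   where depth is the path length of t itself. */
int externalPathLength(const struct node *t, int depth)
{
    if (t == NULL)
        return 0;
    if (t->left == NULL && t->right == NULL)
        return depth;                      /* external node */
    return externalPathLength(t->left, depth + 1)
         + externalPathLength(t->right, depth + 1);
}
```

The function is called with the root and a depth of 0, e.g., externalPathLength(root, 0).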
Consider the external weights of the tree given in Figure 7.50. The external nodes with their weights are:
Node:   G H I J K L M
Weight: 3 5 7 2 6 1 4
The sorted list is given below:
Node:   L J G M H K I
Weight: 1 2 3 4 5 6 7
The trace of above given steps is provided in Figure 7.51:
[Figure 7.51: the merges performed, in order; a starred value is the weight of a newly created
internal node]
3* = 1 + 2 (L and J)
6* = 3* + 3 (3* and G)
9* = 4 + 5 (M and H)
12* = 6* + 6 (6* and K)
16* = 7 + 9* (I and 9*)
28* = 16* + 12*
[Figure 7.52: the resulting tree with root 28*]
Fig. 7.52 The final Huffman Tree
Lw = 28 + 16 + 9 + 12 + 6 + 3 = 74
It is left as an exercise to the reader to create another weighted binary tree from the list of weights and
verify that the Huffman tree obtained above has the minimum weighted path length.
It may be noted that when duplicate values are available in the list of weights then depending upon
the order of their positions
Using the above codes, we can compress the string ‘madam’ into a much shorter code of 10 bits as
shown below:
1001000110
Thus, we have saved a significant amount of storage space i.e. 10 bits as compared to 40 or 80 bits.
But the coding scheme used by us is still a fixed length coding scheme.
Huffman provided an excellent variable length coding scheme which effectively compresses the
data by assigning short codes to most frequently occurring characters. The less frequent characters get
longer codes. Thus, the technique saves the storage space in terms of 25–90 per cent of the original size
of the text. For a given piece of text, the following steps are used for developing the codes for various
alphabets contained in the text:
1. Count the frequency of occurrence of each alphabet in the text.
2. Based on the frequency, assign weights to the alphabets.
3. Create a Huffman tree using the weights assigned to the alphabets. Thus, the alphabets become
the external nodes of the Huffman tree.
4. Starting from root, label the left edge as 0 and right as 1.
5. Traverse the sequence of edges from root to an external node i.e. an alphabet. The assembly of
labels of the edges from root to the alphabet becomes the code for the alphabet.
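Steps 3 to 5 above can be sketched in C with an array-based tree. This is one possible arrangement and not the only one: the names hNode, buildHuffman(), huffmanCode() and the bound MAXSYM are assumptions, and when two roots are merged the lighter one is placed on the left, so individual codes may differ from a hand-drawn tree although the code lengths agree.

```c
#define MAXSYM 32                          /* assumed upper bound on symbols */

struct hNode {
    int weight;
    int left, right, parent;               /* array indices, -1 if absent */
};

/* Builds the Huffman tree for n weights; nodes[] must have room for
   2*n - 1 entries. Returns the index of the root. */
int buildHuffman(const int w[], int n, struct hNode nodes[])
{
    int count = n, i;
    for (i = 0; i < n; i++) {
        nodes[i].weight = w[i];
        nodes[i].left = nodes[i].right = nodes[i].parent = -1;
    }
    while (count < 2 * n - 1) {
        int a = -1, b = -1;                /* lightest and second lightest roots */
        for (i = 0; i < count; i++) {
            if (nodes[i].parent != -1)
                continue;                  /* already merged */
            if (a == -1 || nodes[i].weight < nodes[a].weight) {
                b = a;
                a = i;
            } else if (b == -1 || nodes[i].weight < nodes[b].weight) {
                b = i;
            }
        }
        nodes[count].weight = nodes[a].weight + nodes[b].weight;
        nodes[count].left = a;
        nodes[count].right = b;
        nodes[count].parent = -1;
        nodes[a].parent = nodes[b].parent = count;
        count++;
    }
    return count - 1;
}

/* Writes the code of external node `leaf` into code[] as a string of
   '0' (left edge) and '1' (right edge); returns the code length. */
int huffmanCode(const struct hNode nodes[], int leaf, char code[])
{
    char tmp[MAXSYM];
    int len = 0, i, k = leaf;
    while (nodes[k].parent != -1) {        /* climb from the leaf to the root */
        int p = nodes[k].parent;
        tmp[len++] = (nodes[p].left == k) ? '0' : '1';
        k = p;
    }
    for (i = 0; i < len; i++)              /* reverse: root-to-leaf order */
        code[i] = tmp[len - 1 - i];
    code[len] = '\0';
    return len;
}
```

For the weights 3, 5, 7, 2, 6, 1, 4 used earlier in this section, the sketch yields code lengths that add up to the weighted path length of 74.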
The Huffman codes applied to the Huffman tree of Figure 7.52 are given in Figure 7.54.
[Figure 7.54: the Huffman tree of Figure 7.52 with every left edge labelled 0 and every right edge
labelled 1]
Fig. 7.54 The assignment of Huffman codes
The codes assigned to various leaf nodes are listed in the table given below:
Character   Weight   Code
I           7        00
K           6        11
H           5        011
M           4        010
G           3        101
J           2        1001
L           1        1000
It may be noted that the heaviest nodes: I and K have got shorter codes than the lighter nodes.
Example 14: Develop Huffman code for this piece of text ‘Nandini’
Solution: The trace of steps followed for assigning the Huffman codes to the various characters of the
given text are given below:
1. The frequency count of each character appearing in the above text is given in the following
table:
Character Frequency
N 1
a 1
d 1
n 2
i 2
4. From the weights develop the Huffman Tree. The trace of the Huffman algorithm is given in
Figure 7.55.
[Figure 7.55: the merges performed, in order; a starred value is the weight of a newly created
internal node]
2* = 1 + 1 (N and a)
3* = 1 + 2* (d and 2*)
4* = 2 + 2 (n and i)
7* = 3* + 4*
Now, the code assigned to the text ‘Nandini’ is: 0100111000111011 = 16 bits
In ASCII coding scheme, we would have used 8*7 = 56 bits. Thus, we have saved 40 bits (56 – 16)
of storage space in this small piece of text comprising only 7 characters.
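The size of the Huffman-coded text can also be computed without drawing the tree, because the total number of bits equals the sum of the weights of all internal nodes created during the merges (2 + 3 + 4 + 7 = 16 for ‘Nandini’). The C sketch below uses this observation; the function name huffmanBits() is an assumption, and a text with a single distinct character is assumed to need one bit per character.

```c
/* Returns the number of bits needed to store text with Huffman codes. */
long huffmanBits(const char *text)
{
    long freq[256] = {0}, w[256], total = 0;
    int n = 0, i;

    for (; *text != '\0'; text++)          /* step 1: frequency count */
        freq[(unsigned char)*text]++;
    for (i = 0; i < 256; i++)              /* step 2: weights */
        if (freq[i] > 0)
            w[n++] = freq[i];
    if (n == 1)
        return w[0];                       /* assumption: 1 bit per character */

    while (n > 1) {
        int k, a;
        for (k = 0; k < 2; k++) {          /* move the two lightest to the end */
            int m = 0;
            long t;
            for (a = 1; a < n - k; a++)
                if (w[a] < w[m])
                    m = a;
            t = w[m];
            w[m] = w[n - 1 - k];
            w[n - 1 - k] = t;
        }
        w[n - 2] += w[n - 1];              /* merge them into an internal node */
        total += w[n - 2];
        n--;
    }
    return total;
}
```

For example, huffmanBits("Nandini") evaluates to 16, in agreement with the code worked out above.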
Example 15: Develop Huffman code for this piece of text given below:
“when you are on the left you are on the right. when you are on the right, you are on the wrong”
Also compute the amount of storage space, in terms of bits, saved by using Huffman coding.
Solution: The text contains 93 characters. We will represent the space with the character: Б. The trace
of steps followed for assigning the Huffman codes to the various characters of the given text are given
below:
Step 1: The frequency count of each character appearing in the above text is given in the following
table:
[Figure: the Huffman tree for the given text, with root 93* whose children are 37* and 56*; every left
edge is labelled 0 and every right edge 1]
The codes assigned to various leaf nodes are listed in the table given below:
As expected, the most frequently occurring character called space (Б) has got the shortest code i.e.
‘00’. Similarly, the character ‘e’ has been assigned a shorter code i.e. 100. Obviously, the least frequently
occurring characters like l & f have been assigned the longest codes 1101000 and 1101001 respectively.
Let us now compute the total space occupied by the text using Huffman coding. The computed bits
per character are given in table below:
Character   Frequency   Code      No of Bits
Б (space)   18          00        36 (18*2)
o           9           010       27
a           4           0110      16
w           3           01110     15
g           3           01111     15
e           11          100       33
n           7           1010      28
r           7           1011      28
t           7           1100      28
l           1           1101000   7
f           1           1101001   7
i           2           110101    12
.           2           110110    12
,           2           110111    12
y           4           11100     20
u           4           11101     20
h           8           1111      32
in the dictionary, and if not found it is inserted in the dictionary and also written on the output.
n If the word is found in the dictionary then its position from the dictionary is written on the
Example: Input Text: “When you are on the left—you are on the right and when you are on the right—
you are on the wrong’’.
The dynamic dictionary obtained from above input text is given below:
Output text: When you are on the left 2 3 4 5 right and 1 2 3 4 5 7 2 3 4 5 wrong.
Example 16. Write a program that reads text into an array of strings and compresses the data using
dynamic dictionary coding technique.
Solution: An array of strings to store 50 words of 20 characters each would be used to represent the text.
The required program is given below:
/* This program implements Dynamic Dictionary Coding */
#include <stdio.h>
#include <string.h>
int main(void)
{
char text[50][20];
char Dictionary[50][20]; /* every word of the text may be distinct */
int i,j, flag, wordCount, DictCount;
i=-1;
printf ("\n Enter the text terminated by a ### : \n") ;
do
{
i++;
scanf ("%19s", text[i]); /* read at most 19 characters per word */
}
while (strcmp (text [i], "###"));
wordCount = --i; /* index of the last word before ### */
strcpy (Dictionary[0], text[0]);
DictCount =0; /* The Dictionary gets the first word */
printf ("\n The Text is : %s ",text[0]); /* print the first word */
for (i = 1;i <= wordCount; i++)
{
flag= 0;
for ( j = 0; j <= DictCount; j++)
{
if (!strcmp (text[i], Dictionary[j]))
{
printf ("%d ", j+1); /* word found: write its position */
flag = 1 ;
break;
}
} /* End of j Loop */
if (flag == 0)
{
DictCount ++; /* new word: add it to the dictionary and write it */
strcpy (Dictionary[DictCount], text[i]);
printf ("%s ", text[i]);
}
} /* End i Loop */
return 0;
}
EXERCISES
1. What is the need of tree data structure? Explain with the help of examples. List out the areas in
which this data structure can be applied extensively.
2. What is a tree? Discuss why definition of tree is recursive. Why it is said to be non-linear?
3. Define the following terms:
(i) Root
(ii) Empty tree
(iii) Leaf nodes
(iv) Sub-tree
4. Discuss the following terms with suitable examples:
(i) Parent
(ii) Child
(iii) Sibling
(iv) Internal node
(v) Edge
(vi) Path
(vii) Depth
(viii) Height
(ix) Degree
5. Explain binary tree with the help of examples. Discuss the properties of binary tree that need to
be considered.
6. Discuss the concept of full binary tree and complete binary tree. Differentiate between the two
types with the help of examples.
7. Discuss various methods of representation of a binary tree along with their advantages and
disadvantages.
8. Present an algorithm for creation of a binary tree. Consider a suitable binary tree to be created
and discuss how the algorithm can be applied to create the required tree.
9. What do you mean by tree traversal? What kinds of operations are possible on a node of a
binary tree? Give an algorithm for inorder traversal of a binary tree. Taking an example, discuss
how the binary tree can be traversed using inorder traversal.
10. What do you mean by preorder traversal of a tree? Discuss an algorithm for preorder traversal
of a binary tree along with an example.
11. Give an algorithm for postorder traversal of a binary tree. Taking an example, discuss how the
binary tree can be traversed using postorder traversal.
12. What are the various types of binary trees?
13. What is an expression tree? Consider the following preorder arithmetic expressions:
(i) + * - A B C D
(ii) + A + B C
(iii) / - A B C
(iv) / * A + B C D
(v) + * A B / C D
(vi) * A + B / C D
Draw expression tree for the above arithmetic expressions. Give the inorder, preorder and
postorder traversal of the expression trees obtained.
14. What is a binary search tree? What are the conditions applied on a binary tree to obtain a binary
search tree? Present an algorithm to construct a binary search tree and discuss it using a suitable
example.
15. Discuss various operations that can be applied on a BST. Present an algorithm that searches a
key in a BST. Take a suitable binary tree and search for a particular key in the tree by applying
the discussed algorithm.
16. Discuss how the insert operation can be performed in a binary tree with the help of an algo-
rithm. Consider a binary tree and insert a particular value in the binary tree.
17. Discuss how the delete operation can be performed in a binary tree. Also, present an algorithm
for the same. Consider a binary tree and delete a particular value in the binary tree.
18. Discuss the process of deletion of a node in a binary tree for the following cases by taking suit-
able examples:
(i) The node is a leaf node.
(ii) The node has only one child.
(iii) The node is an internal node, i.e., has both the children.
19. What are heap trees? Discuss representation of a heap tree. Discuss a minheap tree and a maxheap
tree with the help of examples. What are the operations that can be performed on a heap tree?
20. Discuss insertion and deletion of a node into/from a heap tree using examples. Present an algo-
rithm for both the operations.
[Figure: a heap tree with the nodes, level by level: 100; 19, 36; 17, 3, 25, 1; 2, 7]
CHAPTER OUTLINE
• 8.1 Introduction
8.1 INTRODUCTION
I want to meet Amitabh Bachchan. Do I know somebody who knows him? Or is there a path of connected
people that can lead me to this great man? If yes, then how to represent the people and the path? The
problem becomes challenging if the people involved are spread across various cities. For instance, my
friend Mukesh from Delhi knows Surender at Indore who in turn knows Wadegoanker residing in Pune.
Wadegoanker is a family friend of the great Sachin at Mumbai. The friendship of Sachin and Amitabh
is known world over. There is an alternative given by Dr. Asok De. He says that his friend Basu from
Kolkata is the first cousin of Aishwarya Rai, the daughter in law of Amitabh.
The various cities involved in this set of related people are connected by roads as shown in Figure 8.1.
[Figure 8.1: a road network connecting Delhi, Kota, Indore, Kolkata, Bhopal, Mumbai, Nagpur and Pune]
Now the problem is how to represent the data given in Figure 8.1. Arrays are linear by nature and
every element in an array must have a unique successor and predecessor except the first and last elements.
The arrangement shown in Figure 8.1 is not linear and, therefore, cannot be represented by an array.
Though a tree is a non-linear data structure, it has a specific property that a node cannot have more
than one parent. The arrangement of Figure 8.1 does not follow any hierarchy and, therefore, there is no
parent–child relationship.
A close look at the arrangement suggests that it is neither linear nor hierarchical but a network of
roads connecting the participating cities. This type of structure is known as a graph. In fact, a graph is
a collection of nodes and edges. Nodes are connected by edges. Each node contains data. For example,
in Figure 8.1, the cities have been shown as nodes and the roads as edges between them. Delhi, Bhopal,
Pune, Kolkata, etc. are nodes and the roads connecting them are edges. There are numerous examples
where the graph can be used as a data structure. Some of the popular applications are as follows:
• Model of www: The model of world wide web (www) can be represented by a collection of graphs
(directed) wherein nodes denote the documents, papers, articles, etc. and the edges represent the
outgoing hyperlinks between them.
• Railway system: The cities and towns of a country are connected through railway lines. Similarly,
• Resource allocation graph: In order to detect and avoid deadlocks, the operating system
maintains a resource allocation graph for processes that are active in the system.
• Electric circuits: The components are represented as nodes and the wires/connections as edges.
Graph: A graph has two sets: V and E, i.e., G = <V, E>, where V is the set of vertices or nodes
and E is the set of edges or arcs. Each element of set E is a pair of elements taken from set V. Consider
Figure 8.1 and identify the following:
V = {Delhi, Bhopal, Kolkata, Indore, …}
E = {(Delhi, Kota), (Bhopal, Nagpur), (Pune, Mumbai), …}
Before proceeding to discuss more on graphs, let us have a look at the basic terminology of graphs
discussed in the subsequent section.
[Figure 8.3: four example graphs, panels (a), (b), (c) and (d)]
Adjacent node: A node Y is called adjacent to X if there exists an edge from X to Y. Y is called successor
of X and X predecessor of Y. The set of nodes adjacent to X are called neighbours of X.
Complete graph: A complete graph is a graph G = <V, E> with the property that each node of the graph
is adjacent to all other nodes of the graph, as shown in Figure 8.3(b).
Acyclic graph: A graph without a cycle is called an acyclic graph as shown in Figure 8.3(c). In fact, a tree
is an excellent example of an acyclic graph.
Self loop: If the starting and ending nodes of an edge are same, then the edge is called a self loop, i.e.,
edge (E, E) is a self loop as shown in Figure 8.4(a).
Parallel edges: If there are multiple edges between the same pair of nodes in a graph, then the edges are
called parallel edges. For example, there are two parallel edges between nodes B and C of Figure 8.4(b).
Multigraphs: A graph containing self loop or parallel edges or both is called a multigraph. The graphs
shown in Figure 8.4 are multigraphs.
[Figure 8.4: two graphs on the nodes A, B, C, D and E, panels (a) and (b)]
Fig. 8.4 Multigraphs
Simple graph: A simple graph is a graph which is free from self loops and parallel edges.
Degree of a node or vertex: The number of edges connected to a node is called its degree. The maximum
degree of a node in a graph is called the degree of the graph. For example, the degree of node A in Figure 8.3(a)
is 2 and that of Delhi in Figure 8.1 is 4. The degree of graph in Figure 8.1 is 6.
However, in case of directed graphs, there are two degrees of a node: indegree and outdegree. The
indegree is defined as the number of edges incident upon the node and outdegree as the number of edges
radiating out of the node.
In Figure 8.3(d), the indegree and outdegree of C are 0 and 4, respectively. Similarly, the indegree
and outdegree of D are 3 and 1, respectively.
Pendent node: A pendent node in a graph is a node whose indegree is 1 and outdegree is 0. The node F
in Figure 8.3(d) is a pendent node. In fact, in a tree all leaf nodes are necessarily pendent nodes.
n Linked representation
n Set representation
[Figure: four graphs, panels (a) to (d), with their adjacency matrices]
(a) Directed graph
    A  B  C  D  E
A   0  0  0  1  0
B   1  0  1  0  0
C   0  0  0  1  0
D   0  0  0  0  1
E   0  0  1  0  0
(b) Undirected graph
    A  B  C  D  E
A   0  1  0  1  0
B   1  0  1  0  0
C   0  1  0  1  1
D   1  0  1  0  1
E   0  0  1  1  0
(c) Weighted graph
    A   B   C   D   E
A   0   10  0   11  0
B   10  0   12  0   0
C   0   12  0   12  9
D   11  0   12  0   10
E   0   0   9   10  0
(d) Undirected graph with six nodes
    A  B  C  D  E  F
A   0  1  0  1  0  0
B   1  0  1  0  0  0
C   0  1  0  0  0  1
D   1  0  0  0  1  0
E   0  0  0  1  0  1
F   0  0  1  0  1  0
Property 1: It may further be noted that the adjacency matrix is symmetric for an undirected graph.
This can be easily proved: an edge $(v_i, v_j)$ of an undirected graph gets an entry into an adjacency
matrix A at two locations, $A_{ij}$ and $A_{ji}$. Thus in the matrix A, for all i, j, $A_{ij} = A_{ji}$, which
is the property of a symmetric matrix. Therefore, the following relation also holds good:
$A = A^T$
where $A^T$ is the transpose of adjacency matrix A.
From Figure 8.6(a), it can be confirmed that in case of directed graphs, the adjacency matrix is, in
general, not symmetric and therefore the following relation holds true.
$A \neq A^T$
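Property 1 can be verified mechanically. The C sketch below checks whether an n × n adjacency matrix equals its transpose; the function name isSymmetric() and the row-major pointer interface are assumptions made for illustration.

```c
/* Returns 1 if the n x n matrix a (stored row by row) is symmetric,
   i.e., a[i][j] == a[j][i] for all i and j; otherwise returns 0. */
int isSymmetric(const int *a, int n)
{
    int i, j;
    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (a[i * n + j] != a[j * n + i])
                return 0;
    return 1;
}
```

A two-dimensional array m[5][5] can be passed as isSymmetric(&m[0][0], 5).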
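[Figure: a directed graph on the nodes A, B, C, D and E with its adjacency matrix $A$, the transpose
$A^T$ and the product $A \times A^T$]
$A$ =
    A  B  C  D  E
A   0  0  0  1  0
B   1  0  1  0  0
C   0  0  0  1  0
D   0  0  0  0  1
E   0  0  1  0  0
$A^T$ =
    A  B  C  D  E
A   0  1  0  0  0
B   0  0  0  0  0
C   0  1  0  0  1
D   1  0  1  0  0
E   0  0  0  1  0
$A \times A^T$ =
    A  B  C  D  E
A   1  0  1  0  0
B   0  2  0  0  1
C   1  0  1  0  0
D   0  0  0  1  0
E   0  1  0  0  1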
It may be noted that the diagonal elements of $A \times A^T$ are {1, 2, 1, 1, 1}, which are in fact the
outdegrees of vertices A, B, C, D and E, respectively.
Similarly, for a directed graph G with adjacency matrix A, the diagonal of matrix $A^T \times A$ gives the
indegrees of the corresponding vertices. Consider the graph and its adjacency matrix A given in Figure 8.8.
It may be noted that the diagonal elements of $A^T \times A$ are {1, 0, 2, 2, 1}, which are in fact the indegrees
of vertices A, B, C, D and E, respectively.
A^T =
       A B C D E
   A   0 1 0 0 0
   B   0 0 0 0 0
   C   0 1 0 0 1
   D   1 0 1 0 0
   E   0 0 0 1 0
A =
       A B C D E
   A   0 0 0 1 0
   B   1 0 1 0 0
   C   0 0 0 1 0
   D   0 0 0 0 1
   E   0 0 1 0 0
A^T × A =
       A B C D E
   A   1 0 1 0 0
   B   0 0 0 0 0
   C   1 0 2 0 0
   D   0 0 0 2 0
   E   0 0 0 0 1
Property 3: For a directed graph G with adjacency matrix A, the element (i, j) of the matrix A^k (the kth power of A) gives the number of paths of length k from vertex vi to vj. This means that the element (i, j) of A^2 denotes the number of paths of length 2 from vertex vi to vj. Consider Figure 8.9. A^2 gives the number of paths of length 2 between every pair of vertices. For example, there is one path of length 2 from C to A (i.e., C-B-A). Similarly, A^3 gives the number of paths of length 3 from a vertex vi to vj. For example, there is one path of length 3 from E to D (i.e., E-C-B-D).
The major drawback of adjacency matrix is that it requires N × N space whereas most of the entries
in the matrix are 0, i.e., the matrix is sparse.
A =
       A B C D E F
   A   0 0 0 1 0 0
   B   1 0 0 1 0 0
   C   0 1 0 1 0 1
   D   0 0 0 0 1 0
   E   0 0 1 0 0 0
   F   0 0 0 0 0 0
A^2 =
       A B C D E F
   A   0 0 0 0 1 0
   B   0 0 0 1 1 0
   C   1 0 0 1 1 0
   D   0 0 1 0 0 0
   E   0 1 0 1 0 1
   F   0 0 0 0 0 0
A^3 =
       A B C D E F
   A   0 0 1 0 0 0
   B   0 0 1 0 1 0
   C   0 0 1 1 1 0
   D   0 1 0 1 0 1
   E   1 0 0 1 1 0
   F   0 0 0 0 0 0
A
A D NULL
B A C NULL
D
B
C D E NULL
E D E NULL
C E NULL
D
B
E
C
F
A D B NULL
B A C F G NULL
C B D F G NULL
D A C E F NULL
E D F NULL
F B C D E NULL
G B C NULL
A A D NULL
D B A C NULL
B
E C D E NULL
C
D E NULL
E NULL
NULL
DATA adjNode
nextHead
Note: If a graph G has N vertices and M edges, the adjacency matrix has space complexity O(N^2), whereas the adjacency list needs O(N + M) space: one head node per vertex and one list node per edge entry.
(a) V = {A, B, C, D, E}
    E = {(A, D), (B, A), (B, C), (C, D), (D, E), (E, C)}
(b) V = {A, B, C, D, E, F, G}
    E = {(A, D), (A, B), (B, A), (B, C), (B, F), (B, G), (C, B), (C, D), (C, F), (C, G),
         (D, A), (D, C), (D, E), (D, F), (E, D), (E, F), (F, B), (F, C), (F, D), (F, E),
         (G, C), (G, B)}
In the case of weighted graphs, the weight is also included with the edge information, making each edge a 3-tuple, i.e., (w, vi, vj), where w is the weight of the edge between vertices vi and vj. The set representation for a weighted graph is given in Figure 8.15.
V = {A, B, C, D, E}
E = {(11, A, D), (10, A, B), (12, B, C), (12, C, D), (9, C, E), (10, D, E)}
The set representation is the most efficient as far as storage is concerned, but it is not suitable for the operations that can be defined on graphs. A discussion on operations on graphs is given in Section 8.4.
(2) Deletion: Similarly, a node or an edge or both can be deleted from an existing graph.
(3) Traversal: A graph may be traversed for many purposes—to search a path, to search a goal, to
establish a shortest path between two given nodes, etc.
Adjacency matrix before insertion:
       A B C D E
   A   0 1 0 1 0
   B   1 0 1 0 0
   C   0 1 0 1 1
   D   1 0 1 0 1
   E   0 0 1 1 0
Adjacency matrix after insertion of vertex F:
       A B C D E F
   A   0 1 0 1 0 0
   B   1 0 1 0 0 1
   C   0 1 0 1 1 0
   D   1 0 1 0 1 1
   E   0 0 1 1 0 0
   F   0 1 0 1 0 0
It may be noted that a vertex called F has been added to the graph. It has two edges (F, B) and (F, D).
The adjacency matrix has been accordingly modified to have a new row and a new column for the in-
coming vertex. Appropriate entries for the edges have been made in the adjacency matrix, i.e., the cells
(F, B), (F, D), (B, F) and (D, F) have been set equal to 1.
An algorithm for insertion of a vertex is given below:
/* This algorithm uses a two-dimensional matrix adjMat[][] to store the
adjacency matrix. A new vertex called verTex is added to the matrix by
adding an additional empty row and an additional empty column. */
Algorithm addVertex ()
{
lastRow = lastRow + 1;
lastCol = lastCol + 1;
adjMat[lastRow][0] = verTex;
adjMat[0][lastCol] = verTex;
Set all elements of last row = '0';
Set all elements of last Col = '0';
}
Algorithm addEdge ()
{
Find row corresponding to v1, i.e., rowV1;
Find col corresponding to v2, i.e., colV2;
adjMat[rowV1][colV2] = '1'; /* make symmetric entries */
adjMat[colV2][rowV1] = '1';
}
Example 1: Write a program that implements an adjacency matrix wherein it adds a vertex and its
associated edges. The final adjacency matrix is displayed.
Solution: Two functions called addVertex() and addEdge() would be used for insertion of a new
vertex and its associated edges, respectively. Another function called dispAdMat() would display the
adjacency matrix.
The required program is given below:
/* This program implements an adjacency matrix */
#include <stdio.h>
void addVertex (char adjMat[7][7], int numV, char verTex);
void addEdge (char adjMat[7][7], char v1, char v2, int numV);
void dispAdMat (char adjMat[7][7], int numV);
int main(void)
{ /* Adjacency matrix of Figure 8.17; row 0 and column 0 hold the vertex labels */
char adjMat[7][7] = {
{'-', 'A', 'B', 'C', 'D', 'E', ' '},
{'A', '0', '1', '0', '1', '0', ' '},
{'B', '1', '0', '1', '0', '0', ' '},
{'C', '0', '1', '0', '1', '1', ' '},
{'D', '1', '0', '1', '0', '1', ' '},
{'E', '0', '0', '1', '1', '0', ' '},
{' ', ' ', ' ', ' ', ' ', ' ', ' '}};
int numVertex = 5;
char newVertex, v1, v2;
char choice;
dispAdMat (adjMat, numVertex);
printf ("\n Enter the vertex to be added ");
/* the leading blank in " %c" skips any pending white space in the input */
if (scanf (" %c", &newVertex) != 1)
return 1;
numVertex++;
addVertex (adjMat, numVertex, newVertex);
do
{
printf ("\n Enter Edge: v1 - v2 ");
if (scanf (" %c %c", &v1, &v2) != 2)
break;
addEdge (adjMat, v1, v2, numVertex);
dispAdMat (adjMat, numVertex);
printf ("\n do you want to add another edge Y/N ");
if (scanf (" %c", &choice) != 1)
break;
}
while ((choice != 'N') && (choice != 'n'));
return 0;
}
void addVertex (char adjMat[7][7], int numV, char verTex)
{ int i;
adjMat[numV][0] = verTex; /* label of the new row */
adjMat[0][numV] = verTex; /* label of the new column */
for (i = 1; i <= numV; i++)
{
adjMat[numV][i] = '0';
adjMat[i][numV] = '0';
}
}
void addEdge (char adjMat[7][7], char v1, char v2, int numV)
{ int j, k;
for (j = 1; j <= numV; j++)
{
if (adjMat[0][j] == v1) /* j is the column of v1 */
{
for (k = 1; k <= numV; k++)
{
if (adjMat[k][0] == v2) /* k is the row of v2 */
{
adjMat[k][j] = '1';
adjMat[j][k] = '1'; break; /* making symmetric entries */
}
}
}
}
}
void dispAdMat (char adjMat[7][7], int numV)
{
int i, j;
printf ("\nThe adj Mat is--\n");
for (i = 0; i <= numV; i++)
{
for (j = 0; j <= numV; j++)
{
printf ("%c ", adjMat[i][j]);
}
printf ("\n");
}
}
The above program has been tested for data given in Figure 8.16. The screenshots of the result are
given in Figure 8.17.
Fig. 8.17 The adjacency matrix before and after insertion of vertex F
Note: In the case of a directed graph, only one entry for the directed edge needs to be made in the adjacency matrix. An algorithm for insertion of a vertex in a directed graph is given below:
/* This algorithm uses a two-dimensional matrix adjMat[][] to store the
adjacency matrix. A new vertex called verTex is added to the matrix by
adding an additional empty row and an additional empty column. */
Algorithm addVertex ()
{
lastRow = lastRow + 1;
lastCol = lastCol + 1;
adjMat[lastRow][0] = verTex;
adjMat[0][lastCol] = verTex;
Set all elements of last row = '0';
Set all elements of last Col = '0';
}
Algorithm addEdge ()
{
Find row corresponding to v1, i.e., rowV1;
Find col corresponding to v2, i.e., colV2;
adjMat[rowV1][colV2] = '1';
}
Accordingly, the function addEdge() needs to be modified so that it makes only one entry per edge, leaving the adjacency matrix asymmetric. The modified function is given below:
/* The modified function for addition of edges in directed graphs */
void addEdge (char adjMat[7][7], char v1, char v2, int numV)
{ int i, k;
for (i = 1; i <= numV; i++)
{
if (adjMat[i][0] == v1) /* i is the row of v1 */
{
for (k = 1; k <= numV; k++)
{
if (adjMat[0][k] == v2) /* k is the column of v2 */
{
adjMat[i][k] = '1'; break;
}
}
}
}
}
The deletion of an edge from a graph requires different treatment for undirected and directed graphs. Both cases of deletion of edges are given below.
An algorithm for deletion of an edge from an undirected graph is given below:
/* This algorithm uses a two-dimensional matrix adjMat[][] to store the
adjacency matrix. An edge (v1, v2) is deleted from the matrix by
resetting its entries to '0' in the row and column corresponding to v1
and v2, respectively. */
Example 2: Write a program that deletes a vertex and its associated edges from an adjacency matrix. The
final adjacency matrix is displayed.
Solution: A function called delVertex() would be used for deletion of a vertex and its associated
edges from an adjacency matrix. Another function called dispAdMat() would display the adjacency
matrix.
#include <stdio.h>
void delVertex (char adjMat[7][7], int numV, char verTex);
void dispAdMat (char adjMat[7][7], int numV);
int main(void)
{
char adjMat[7][7] = {
{'-', 'A', 'B', 'C', 'D', 'E', ' '},
{'A', '0', '1', '0', '1', '0', ' '},
{'B', '1', '0', '1', '0', '0', ' '},
{'C', '0', '1', '0', '1', '1', ' '},
{'D', '1', '0', '1', '0', '1', ' '},
{'E', '0', '0', '1', '1', '0', ' '},
{' ', ' ', ' ', ' ', ' ', ' ', ' '}};
int numVertex = 5;
char Vertex;
dispAdMat (adjMat, numVertex);
printf ("\n Enter the vertex to be deleted ");
if (scanf (" %c", &Vertex) != 1)
return 1;
delVertex (adjMat, numVertex, Vertex);
dispAdMat (adjMat, numVertex);
return 0;
}
void delVertex (char adjMat[7][7], int numV, char verTex)
{ int i, j, k;
/* blank out the row of the vertex */
for (i = 1; i <= numV; i++)
{
if (adjMat[i][0] == verTex)
{
adjMat[i][0] = '-';
for (k = 1; k <= numV; k++)
{
adjMat[i][k] = '0';
}
}
}
/* blank out the column of the vertex */
for (j = 1; j <= numV; j++)
{
if (adjMat[0][j] == verTex)
{
adjMat[0][j] = '-';
for (k = 1; k <= numV; k++)
{
adjMat[k][j] = '0';
}
}
}
}
void dispAdMat (char adjMat[7][7], int numV)
{
int i, j;
printf ("\nThe adj Mat is--\n");
for (i = 0; i <= numV; i++)
{
for (j = 0; j <= numV; j++)
{
printf ("%c ", adjMat[i][j]);
}
printf ("\n");
}
}
The screenshots of the output of the program are shown in Figure 8.18.
Fig. 8.18 The adjacency matrix before and after deletion of vertex B
Example 3: Write a program that deletes an edge from an adjacency matrix of an undirected graph. The
final adjacency matrix is displayed.
Solution: A function called delEdge() would be used for deletion of an edge from an adjacency ma-
trix. Another function called dispAdMat() would display the adjacency matrix.
The required program is given below:
/* This program deletes an edge of an undirected graph from an adjacency
matrix */
#include <stdio.h>
void delEdge (char adjMat[7][7], char v1, char v2, int numV);
void dispAdMat (char adjMat[7][7], int numV);
int main(void)
{
char adjMat[7][7] = {
{'-', 'A', 'B', 'C', 'D', 'E', ' '},
{'A', '0', '1', '0', '1', '0', ' '},
{'B', '1', '0', '1', '0', '0', ' '},
{'C', '0', '1', '0', '1', '1', ' '},
{'D', '1', '0', '1', '0', '1', ' '},
{'E', '0', '0', '1', '1', '0', ' '},
{' ', ' ', ' ', ' ', ' ', ' ', ' '}};
int numVertex = 5;
char v1, v2;
dispAdMat (adjMat, numVertex);
printf ("\n Enter the edge to be deleted");
printf ("\n Enter Edge : v1 - v2 ");
if (scanf (" %c %c", &v1, &v2) != 2)
return 1;
delEdge (adjMat, v1, v2, numVertex);
dispAdMat (adjMat, numVertex);
return 0;
}
void delEdge (char adjMat[7][7], char v1, char v2, int numV)
{ int j, k;
for (j = 1; j <= numV; j++)
{
if (adjMat[0][j] == v1) /* j is the column of v1 */
{
for (k = 1; k <= numV; k++)
{
if (adjMat[k][0] == v2) /* k is the row of v2 */
{
adjMat[k][j] = '0';
adjMat[j][k] = '0'; break; /* resetting both symmetric entries */
}
}
}
}
}
void dispAdMat (char adjMat[7][7], int numV)
{
int i, j;
printf ("\nThe adj Mat is--\n");
for (i = 0; i <= numV; i++)
{
for (j = 0; j <= numV; j++)
{
printf ("%c ", adjMat[i][j]);
}
printf ("\n");
}
}
The screenshots of the output of the program are given in Figure 8.19.
Fig. 8.19 The adjacency matrix before and after deletion of edge (D, A)
The function delEdge() needs to be modified for directed graphs so that it resets only the single entry of the directed edge and does not touch the symmetric entry. The modified function is given below:
/* The modified function for deletion of edges of directed graphs */
void delEdge (char adjMat[7][7], char v1, char v2, int numV)
{ int i, k;
for (i = 1; i <= numV; i++)
{
if (adjMat[i][0] == v1) /* i is the row of v1 */
{
for (k = 1; k <= numV; k++)
{
if (adjMat[0][k] == v2) /* k is the column of v2 */
{
adjMat[i][k] = '0'; break;
}
}
}
}
}
8.4.3.1 Depth First Search (DFS) In this method, the travel starts from a vertex and then carries on to its successors, i.e., it follows the outgoing edges of the vertex. Within the successors, it travels their successors, and so on. Thus, this travel is similar to the preorder travel of a tree: the search goes deeper into the search space till no successor is found. Once the search space of a vertex is exhausted, the travel for the next vertex starts. This carries on till all the vertices have been visited. The only problem is that the travel may end up in a cycle. For example, in a graph, the successor of a successor may be the starting vertex itself, and the travel may then never end.
In order to avoid cycles, we may maintain two queues: notVisited and Visited. The vertices that have already been visited are placed on Visited. When the successors of a vertex are generated, they are placed on notVisited only if they are present on neither the Visited nor the notVisited queue, so that no vertex is visited twice.
Consider the graph shown in Figure 8.20. The trace of the DFS travel of the given graph is given in Figures 8.20 and 8.21, through the entries of the two queues Visited and notVisited.
Step   Visited          notVisited
1      (empty)          A
2      A                B D
3      A B              C F G D
4      A B C            F G D
5      A B C F          E G D
6      A B C F E        G D
7      A B C F E G      D
8      A B C F E G D    (empty)
The final Visited queue contains the nodes visited during DFS travel, i.e., A B C F E G D.
Example 4: Write a program that travels a given graph using DFS strategy.
Solution: We would use the following functions:
(1) void addNotVisited (char Q[], int *first, char vertex); :
This function adds a vertex on notVisited queue in front of the queue.
(2) void addVisited (char Q[], int *last, char vertex);
This function adds a vertex on Visited queue.
(3) char removeNode (char Q[], int *first);
This function removes a vertex from notVisited queue.
(4) int findPos (char adjMat[8][8], char vertex);
This function finds the position of a vertex in adjacency matrix, i.e.,
adjMat.
(5) void dispAdMat(char adjMat[8][8], int numV);
This function displays the contents of adjacency matrix, i.e., adjMat.
(6) int ifPresent (char Q[], int last, char vertex);
This function checks whether a vertex is present on a queue or not.
(7) void dispVisited (char Q[], int last);
This function displays the contents of Visited queue.
We would use the adjacency matrix of graph shown in Figures 8.20 and 8.21. The required program
is given below:
/* This program travels a graph using DFS strategy */
#include <stdio.h>
void dispAdMat (char adjMat[8][8], int numV)
{
int i, j;
printf ("\nThe adj Mat is--\n");
for (i = 0; i <= numV; i++)
{
for (j = 0; j <= numV; j++)
{
printf ("%c ", adjMat[i][j]);
}
printf ("\n");
}
}
void dispVisited (char Q[], int last)
{
int i;
printf ("\n The visited nodes are ...");
for (i = 0; i <= last; i++)
printf (" %c ", Q[i]);
}
The output of the program is shown in Figure 8.22. The final list of visited nodes matches the result obtained in Figures 8.20 and 8.21, i.e., the visited nodes are A B C F E G D.
8.4.3.2 Breadth First Search (BFS) In this method, the travel starts from a vertex and then carries on to its adjacent vertices, i.e., it follows the next vertex on the same level. Within a level, it travels all the siblings and then moves to the next level. Once the search space of a level is exhausted, the travel for the next level starts. This carries on till all the vertices have been visited. The only problem is that the travel may end up in a cycle. For example, in a graph, the adjacent of an adjacent may be the starting vertex itself, and the travel may then never end.
In order to avoid cycles, we may maintain two queues: notVisited and Visited. The vertices that have already been visited are placed on Visited. When the successors of a vertex are generated, they are placed on notVisited only if they are present on neither the Visited nor the notVisited queue, so that no vertex is visited twice.
Consider the graph given in Figure 8.23. The trace of the BFS travel of the given graph is given in Figures 8.23 and 8.24, through the entries of the two queues Visited and notVisited.
The Visited queue contains the nodes visited during BFS travel, i.e., A B D C F G E.
The algorithm for BFS travel is given below:
Algorithm BFS (firstVertex)
{
add firstVertex on notVisited;
place NULL on Visited; /* initially Visited is empty */
while (notVisited != NULL)
{
remove vertex from notVisited;
add vertex on Visited;
generate adjacents of vertex;
for each adjacent of vertex do
{
if (adjacent is not present on Visited AND adjacent is not present
on notVisited) Then add adjacent at Rear on notVisited;
}
}
}
Step   Visited          notVisited
1      (empty)          A
2      A                B D
3      A B              D C F G
4      A B D            C F G E
Example 5: Write a program that travels a given graph using breadth first search strategy.
Solution: We would use the following functions:
(1) void addNotVisited (char Q[], int *first, char vertex); :
This function adds a vertex on notVisited queue at Rear of the queue.
Step   Visited          notVisited
5      A B D C          F G E
6      A B D C F        G E
7      A B D C F G      E
8      A B D C F G E    (empty)
The adjacency matrix of graph shown in Figures 8.20 and 8.21 is used. The program of Example 4
has been suitably modified which is given below:
#include <stdio.h>
#include <conio.h>
Note: Implementation of operations on graphs using adjacency matrix is simple as compared to adjacency
list. In fact, adjacency list would require the extra overhead of maintaining pointers. Moreover, the linked lists
connected to vertices are independent and it would be difficult to establish cross-relationship between the
vertices contained on different lists, which is otherwise possible by following a column in adjacency matrix.
For example, the graph shown in Figure 8.26 is a connected undirected graph.
We know that a tree is a special graph which does not contain a cycle. Thus, we can remove some
edges from the graph of Figure 8.26 such that it is still connected but has no cycles. The various sub-
graphs or trees so obtained are shown in Figure 8.27.
Fig. 8.26 A connected undirected graph
Fig. 8.27 The acyclic sub-graphs or trees
8.4.4.1 Minimum Cost Spanning Trees From the previous section, we know that multiple spanning trees can be drawn from a connected undirected graph. If it is a weighted graph, then the cost of each spanning tree can be computed by adding the weights of all the edges of the spanning tree. If the weights of the edges are unique, then there will be exactly one spanning tree whose cost is the minimum of all. The spanning tree with minimum cost is called a minimum spanning tree or Minimum-cost Spanning Tree (MST).
Having introduced spanning trees, let us now see where they can be applied. An excellent situation that comes to mind is a housing colony wherein the welfare association wishes to lay down electric cables, telephone lines and water pipelines. In order to optimize, it would be advisable to connect all the houses with the minimum number of connecting lines. A critical look at the situation indicates that a spanning tree would be an ideal tool to model the connectivity layout. In case the houses are not symmetrically placed or are not uniformly distanced, an MST can be used to model the connectivity layout. Computer network cabling layout is another example where an MST can be usefully applied.
There are many algorithms available for finding an MST. However, the following two algorithms are very popular:
• Kruskal’s algorithm
• Prim’s algorithm
3. Add edge BG
4. Add edge BC
5. Add edge DG. It will create cycle. Discard DG. Add edge ED
6. Add edge EG. It will create cycle. Discard EG. Add edge EF
Example 6: Find the minimum spanning tree for the graph given in Figure 8.31 using Kruskal’s
algorithm.
[Figure 8.31: a weighted undirected graph on the vertices A to G; its edges and their weights are listed in the solution below]
Solution: The sorted list of edges in increasing order of their weight is given below:
Edge Weight
F–G 1
B–C 2
A–G 3
E–F 4
A–F 5
C–G 5
A–B 6
C–D 6
B–G 7
E–G 7
D–G 8
D–E 9
Now add the edges starting from shortest edge in sequence from F–G to D–E. The step by step cre-
ation of the spanning tree is given in Figure 8.32 (Steps 1–6)
Prim’s algorithm
In this algorithm the spanning tree is grown in stages. A vertex is chosen at random and included as
the first vertex of the tree. Thereafter a vertex from remaining vertices is so chosen that it has a smallest
edge to a vertex present on the tree. The selected vertex is added to the tree. This process of addition of
vertices is repeated till all vertices are included in the tree. Thus at given moment, there are two set of
vertices: T—a set of vertices already included in the tree, and E—another set of remaining vertices. The
algorithm for this procedure is given below:
Input: T: An empty spanning tree
E: Empty set of edges
N: Number of nodes in a connected undirected graph
[Figure 8.32: step by step creation of the spanning tree using Kruskal's algorithm]
1. Add edge F-G
2. Add edge B-C
4. Add edge E-F
5. Add edge A-F. It will create cycle. Discard it. Add edge C-G
6. Add edge A-B. It will create cycle. Discard it. Add edge C-D
Algorithm Prim()
{
Step 1. Take random node v add it to T. Add adjacent nodes of v to E
2. while ( number of nodes in T < N)
{ 2.1 if E is empty then report no spanning tree is possible.
Stop.
2.3 for a node x of T choose node y such that the edge (x,y) has
lowest cost
2.4 if node y is in T then discard the edge (x,y) and repeat step 2
2.5 add the node y and the edge (x,y) to T.
2.6 delete (x,y) from E.
2.7 add the adjacent nodes of y to E.
}
3. return T
}
It may be noted that the implementation of this algorithm would require the adjacency matrix rep-
resentation of a graph whose MST is to be generated.
Let us consider the graph of Figure 8.31. The adjacency matrix of the graph is given in Figure 8.33.
Let us now construct the MST from the adjacency matrix given in Figure 8.33. Let us pick A as the starting vertex. The nearest neighbour of A is found by searching for the smallest entry (other than 0) in its row. Since 3 is the smallest entry, G becomes its nearest neighbour. Add G to the sub-graph as edge AG (see Figure 8.33). The nearest neighbour of the sub-graph A–G is F, as it has the smallest entry, i.e., 1. Since no cycle is formed by adding F, it is added to the sub-graph. Now the vertex closest to A–G–F is E. It is added to the sub-graph as it does not form a cycle. Similarly, the edges G–C, C–B and C–D are added to obtain the complete MST. The step by step creation of the MST is shown in Figure 8.33.
       A B C D E F G
   A   0 6 - - - 5 3
   B   6 0 2 - - - 7
   C   - 2 0 6 - - 5
   D   - - 6 0 9 - 8
   E   - - - 9 0 4 7
   F   5 - - - 4 0 1
   G   3 7 5 8 7 1 0
Fig. 8.33 The adjacency matrix and the different steps of obtaining MST
Path[i][j] = 1 if a path from vertex vi to vj exists
           = 0 otherwise
The path matrix is also called the transitive closure of A, i.e., if Path[i][k] and Path[k][j] exist then Path[i][j] also exists.
We have already introduced this concept in Section 8.3.1 vide Property 3 wherein it was stated that “For a given adjacency matrix A, the element Aij of matrix A^k (kth power of A) gives the number of paths of length k from vertex vi to vj”. Therefore, the transitive closure of an adjacency matrix A of order n is given by the following expression, in which every non-zero entry of the sum is taken as 1:
Path = A + A^2 + A^3 + … + A^n
The above method is compute intensive as it requires a series of matrix multiplication operations to obtain a path matrix.
Example 7: Compute the transitive closure of the graph given in Figure 8.34.
Solution: The step wise computation of the transitive closure of the graph is given in Figure 8.35.
Fig. 8.34
Adjacency matrix A:
       1 2 3 4
   1   0 1 0 0
   2   1 0 1 0
   3   0 0 0 1
   4   0 0 0 0
A^2 = A × A:
       1 0 1 0
       0 1 0 1
       0 0 0 0
       0 0 0 0
A^3 = A^2 × A:
       0 1 0 1
       1 0 1 0
       0 0 0 0
       0 0 0 0
A^4 = A^3 × A:
       1 0 1 0
       0 1 0 1
       0 0 0 0
       0 0 0 0
Path = A + A^2 + A^3 + A^4 (non-zero entries taken as 1):
       1 1 1 1
       1 1 1 1
       0 0 0 1
       0 0 0 0
Warshall gave a simpler method to obtain the path (or reachability) matrix of a directed graph (digraph) with the help of only the Boolean operators AND and OR. It determines whether a path from vertex vi to vj through vk exists or not by the following formula:
Path[i][j] = Path[i][j] ∨ (Path[i][k] ∧ Path[k][j])
Thus, Warshall’s algorithm also computes the transitive closure of the adjacency matrix of a digraph.
The algorithm is given below:
Input: Adjacency matrix A[][] of order n*n
Path Matrix Path[][] of order n*n, initially empty
Algorithm Warshall()
{ /* copy A to Path */
for (i = 1; i <= n; i++)
{
for (j = 1; j <= n; j++)
{
Path[i][j] = A[i][j];
}
}
/* Find path from vi to vj through vk */
for (k = 1; k <= n; k++)
{ for (i = 1; i <= n; i++)
{
for (j = 1; j <= n; j++)
{ Path[i][j] = Path[i][j] || (Path[i][k] && Path[k][j]);
}
}
}
return Path;
}
Example 8: Apply Warshall’s algorithm to the graph given in Figure 8.34 to obtain its path matrix.
Solution: The step by step trace of Warshall’s algorithm is given in Figure 8.36.
      A           K=1         K=2         K=3         K=4
   0 1 0 0     0 1 0 0     1 1 1 0     1 1 1 1     1 1 1 1
   1 0 1 0     1 1 1 0     1 1 1 0     1 1 1 1     1 1 1 1
   0 0 0 1     0 0 0 1     0 0 0 1     0 0 0 1     0 0 0 1
   0 0 0 0     0 0 0 0     0 0 0 0     0 0 0 0     0 0 0 0
Path =
   1 1 1 1
   1 1 1 1
   0 0 0 1
   0 0 0 0
Fig. 8.36
8.4.5.2 Floyd’s Algorithm Floyd’s algorithm is almost the same as Warshall’s algorithm. Warshall’s algorithm establishes whether a path between vertices vi and vj exists or not, whereas Floyd’s algorithm finds a shortest path between every pair of vertices of a weighted graph. It uses the following data structures:
(1) A cost matrix, Cost[][], as defined below:
Cost[i][j] = cost of the edge, if there is an edge between vi and vj
           = ∞, if there is no edge between vi and vj
           = 0, if i = j
(2) A distance matrix Dist[][] wherein an entry Dist[i][j] is the distance of the shortest path from vertex vi to vj.
In fact, for given vertices vi and vj, it finds out a path from vi to vj through vk. If the path vi–vk–vj is shorter than the shortest path found so far from vi to vj, then vi–vk–vj is selected. It uses the following formula (similar to Warshall’s):
Dist[i][j] = min (Dist[i][j], Dist[i][k] + Dist[k][j])
The algorithm is given below:
Algorithm Floyd ()
{
for (i = 1; i <= n; i++)
for (j = 1; j <= n; j++)
Dist[i][j] = Cost[i][j]; /* copy cost matrix to distance matrix */
for (k = 1; k <= n; k++)
{ for (i = 1; i <= n; i++)
{ for (j = 1; j <= n; j++)
Dist[i][j] = min (Dist[i][j], Dist[i][k] + Dist[k][j]);
}
}
}
Example 9: For the weighted graph given in Figure 8.37, find all pairs shortest paths using Floyd’s algorithm.
Solution: The cost matrix and the step by step trace of Floyd’s algorithm are given in Figure 8.38.
Fig. 8.37
8.4.5.3 Dijkstra’s Algorithm This algorithm finds the shortest paths from a certain vertex to all the other vertices in the network. This is also known as the single source shortest path problem. For a given weighted graph G = (V, E), one of the vertices, v0, is designated as the source. The source vertex v0 is placed on a set S with its shortest path taken as zero, where the set S contains those vertices whose shortest paths are known. Now, iteratively, a vertex (say vi) from the remaining vertices is added to the set S such that its path from the source is the shortest. In fact, this shortest path from v0 to vi passes only through the vertices present in set S. The process stops as soon as all the vertices have been added to the set S.
          Cost              K = 1             K = 2
1    0  2  ∞  ∞  ∞     0  2  ∞  ∞  ∞     0  2  3  ∞  3
2    4  0  1  ∞  1     4  0  1  ∞  1     4  0  1  ∞  1
3    ∞  7  0  3  2     ∞  7  0  3  2    11  7  0  3  2
4    ∞  ∞  ∞  0  4     ∞  ∞  ∞  0  4     ∞  ∞  ∞  0  4
5    2  ∞  ∞  3  0     2  4  ∞  3  0     2  4  5  3  0

          K = 3             K = 4             K = 5
1    0  2  3  6  3     0  2  3  6  3     0  2  3  6  3
2    4  0  1  4  1     4  0  1  4  1     3  0  1  4  1
3   11  7  0  3  2    11  7  0  3  2     4  6  0  3  2
4    ∞  ∞  ∞  0  4     ∞  ∞  ∞  0  4     6  8  9  0  4
5    2  4  5  3  0     2  4  5  3  0     2  4  5  3  0
Fig. 8.38
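Cost matrix:
       1  2  3  4  5
   1   0  1  6  ∞  ∞
   2   1  0  2  3  5
   3   6  2  0  4  2
   4   ∞  3  4  0  2
   5   ∞  5  2  2  0
Initially: Visited = 1 0 0 0 0, Dist = 0 1 6 ∞ ∞, S = {1}
After adding vertex 2: Visited = 1 1 0 0 0, Dist = 0 1 3 4 6, S = {1, 2}
Subsequent steps: S = {1, 2, 3}, S = {1, 2, 3, 4}, S = {1, 2, 3, 4, 5}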
Fig. 8.41
Example 10: For the graph given in Figure 8.41, find the shortest path from vertex A to all the remaining vertices.
Solution: Let us number the vertices A–G as 1–7. The corresponding cost matrix and the stepwise construction of the shortest cost of all vertices with respect to vertex 1 (i.e., A) are given in Figure 8.42.
Cost matrix:
       1  2  3  4  5  6  7
   1   0  6  ∞  ∞  ∞  5  3
   2   6  0  2  ∞  ∞  ∞  7
   3   ∞  2  0  6  ∞  ∞  5
   4   ∞  ∞  6  0  9  ∞  8
   5   ∞  ∞  ∞  9  0  4  7
   6   5  ∞  ∞  ∞  4  0  1
   7   3  7  5  8  7  1  0
Initially: Visited = 1 0 0 0 0 0 0, Dist = 0 6 ∞ ∞ ∞ 5 3, S = {1}
After adding vertex 7: Visited = 1 0 0 0 0 0 1, Dist = 0 6 8 11 10 4 3, S = {1, 7}
Subsequent steps: S = {1, 7, 6}, S = {1, 7, 6, 2}, S = {1, 7, 6, 2, 3}, S = {1, 7, 6, 2, 3, 5}, S = {1, 7, 6, 2, 3, 5, 4}
[Figure: pages on the sites S1, S2, S3 and S4 connected by hyperlinks]
It may be noted that a page on site S1 has a link to page on site S4. The pages on S4 refer to
pages on S2 and S3, and so on. This hyperlink structure is the basis of the web structure similar
to the web created by a spider and, hence, the name world wide web.
(2) Resource allocation graph: In order to detect and avoid deadlocks, the operating system
maintains a resource allocation graph for processes that are active in the system. In this graph,
the processes are represented as rectangles and resources as circles. Number of small circles
within a resource node indicates the number of instances of that resource available to the
system. An edge from a process P to a resource R, denoted by (P, R), is called a request edge.
An edge from a resource R to a process P, denoted by (R, P), is called an allocation edge as
shown in Figure 8.44.
It may be noted from the resource allocation graph of Figure 8.44 that the process P1 has been
allocated one instance of resource R1 and is requesting for an instance of resource R2. Similarly,
P2 is holding an instance of both R1 and R2. P3 is holding an instance of R2 and is asking for an
instance of R1. The edge (R1, P1) is an allocation edge whereas the edge (P1, R2) is a request edge.
An operating system maintains the resource allocation graph of this kind and monitors it
from time to time for detection and avoidance of deadlock situations in the system.
(3) Colouring of maps: Colouring of maps is an interesting problem wherein a map has to be coloured in such a fashion that no two adjacent countries or regions have the same colour. The constraint is to use the minimum number of colours. A map (see Figure 8.45) can be represented as a graph wherein a node represents a region and an edge between two nodes denotes that the two regions are adjacent.
(4) Scene graphs: The contents of a visual scene are also managed by using graph data structure.
Virtual reality modelling language (VRML) supports scene graph programming model. This
model is used by MPEG-4 to represent multimedia scene composition.
A scene graph comprises nodes and edges wherein a node represents an object of the scene and the edges define the relationships between the objects. The root node, called the parent, is the entry point. Every other node in the scene graph has a parent, leading to an acyclic graph or a tree like structure as shown in Figure 8.46.
For more reading on scene graphs, the reader may refer to the paper “Understanding Scene Graphs” by Aaron E. Walsh, chairman of Mantis development at Boston College.
Besides the above, the other popular applications of graphs cited in books are:
• Shortest path problem
• Spanning trees
EXERCISES
5. Draw the directed graph that corresponds to the following adjacency matrix:
V0 V1 V2 V3
V0 1 0 1 0
V1 1 0 0 0
V2 0 0 0 1
V3 1 0 1 0
Fig. 8.47 (figure not reproduced)
10. Write a function that inserts an edge into an undirected graph represented using an adjacency
matrix.
11. Write a function that inserts an edge into a directed graph represented using an adjacency matrix.
12. Write a function that deletes an edge from a directed graph represented using an adjacency
matrix.
13. Write a function that deletes an edge from an undirected graph represented using an adjacency
matrix.
14. Find the minimum spanning tree for the graph given in Figure 8.39.
15. Write Warshall’s algorithm.
16. Write Dijkstra’s algorithm.
Chapter 9
Files
9.1.1 Data
The word data is a plural of datum, which means fact. Thus, data is a collection of facts and figures. It
can be represented by symbols. For example, in a business house, data can be the name of an employee,
his salary, number of hours worked, etc. Similarly in an educational institute, the data can be the marks
of a student, roll numbers, percentage, etc.
9.1.2 Information
It is the data arranged in an ordered and useful form. Thus, the data can be processed so that it becomes
meaningful and understandable (see Figure 9.1).
For example, in Figure 9.2(a), the collection of characters and digits is meaningless by itself because
it could refer to total marks and names of students, the house numbers and names of house owners, etc.
Once the data is arranged in an order as shown in Figure 9.2(b), anybody can come to know that the data
refers to names of persons and their corresponding ages. Therefore, this meaningful data can be called
information.
[Figure 9.2: (a) unarranged data; (b) the same data arranged under the headings Name and Age; figure not reproduced]
most important activities in an organization. Depending upon its purpose, a file can be called by a
specific name or type. The various types of files are tabulated in Table 9.1.
• The files are handled and maintained by specific programs written in a high level language such
The manual filing system is suitable for small organizations where fast processing and storage of
data are not required. In the EDP filing system, large amounts of data can be managed efficiently, and
data storage and retrieval become very fast. It may be noted that in the manual filing system, the files
are generally arranged in a meaningful sequence, and the records are located manually by the person in
charge of the files. However, when the files are stored on a computer, they have to be kept in such a way
that the records can be located, processed, and selected easily by a computer program. The handling
of files depends on both the input and output requirements and the amount of data to be stored. How best
can the files be arranged for ease of access? This question has led to the development of a number of file
organization techniques, as listed below:
(i) Sequential file organization
(ii) Direct access file organization
(iii) Indexed sequential file organization
9.4 FILES IN C
All the programs that we have written so far have extensively used input and output statements. The
scanf statement is used to input data from the keyboard and the printf statement to output or display
data on the visual display unit (VDU) screen. Thus, whenever some data is required in a program, it
tries to read from the keyboard. The keyboard is an extremely slow device. For a small amount of
data, this method of input works well. But what happens when huge data is needed by a program?
For example, a program which generates the merit list of the joint entrance examination (JEE) may
require data to the tune of 10,000 to 20,000 records. It is not possible to sit down and type such a
large amount of data in one go. Another interesting situation is: what happens when the data or
results produced by one program are required subsequently by another program? On the next day,
the results may even be required by the program that produced them. Therefore, we need to have a
mechanism such as files by virtue of which the program can read data from or write data on a
magnetic storage medium.
When the data of a file is stored in the form of readable and printable characters, then the file is
known as a text file. On the other hand, if a file contains non-readable characters in binary code then
the file is called a binary file. For instance, the program files created by an editor are stored as text files
whereas the executable files generated by a compiler are stored as binary files. An introduction to files
supported by C is given in the following sections.
The standard source and sink are the keyboard and the monitor screen, respectively. The corresponding
standard streams (stdin and stdout) are opened automatically when a program starts; including the
header file <stdio.h> makes them available to the program.
In fact, C supports a very large number of I/O functions capable of performing a wide variety of
tasks. The I/O system of C is categorized into following three categories:
(1) The stream I/O system
(2) The console I/O system
(3) The low-level I/O system
Some of the important stream I/O functions are listed in Table 9.2.
Table 9.2 Some important stream I/O functions in C
Function    Purpose
fclose      close a stream
feof        test for end of file
fflush      flush a stream
fgetc       read a character from a stream
fgetchar    read a character from stdin
fgets       read a string from a stream
fopen       open a stream
fprintf     send formatted data to a stream
fputc       send a character to a stream
fputs       send a string to a stream
fread       read a block of data from a stream
fscanf      read formatted data from a stream
fseek       reposition a stream pointer
fwrite      write a block of data to a stream
getw        read an integer from a stream
putw        send an integer to a stream
Some of the important console I/O functions are listed in Table 9.3.
Table 9.3 Some important console I/O functions in C
Function    Purpose
cgets       read a string from the console
clrscr      clear text window
cprintf     write formatted data to the console
getch       read a character from the console
getche      read a character from the console and echo it
gotoxy      move the cursor to a specified location
putch       write a character to the console
418 Data Structures Using C
The above statement requests the system to open a file called “myfile” in read mode and assigns
its pointer to ptr.
A file can be closed by a function called fclose() which takes one argument of type pointer to FILE.
Thus, the file (“myfile”) can be closed by the following statement:
fclose (ptr);
    if (!ptr)
    {
        printf("\n The file cannot be opened");
        exit(1);
    }
    fputs("When an apple fell, Newton was disturbed\n", ptr);
    fputs("but when he found that all apples fell, \n", ptr);
    fputs("it was gravitation and he was satisfied", ptr);
    fclose(ptr);
}
The above program is simple wherein the first statement of function main() declares a pointer to
a FILE and in the second statement a file called “message.dat” is opened in write mode. The next
compound statement checks whether the operation was successful or not. Rest of the statements write
the text, line by line in the file. The last statement closes the file.
A string can be read from a file by a function called fgets(). This function requires three
arguments as shown below:
fgets (<string>, n, <file-pointer>)
where <string> is the character string, i.e., an array of char into which a group of characters from
the file would be read.
n is the size of the array. The function reads characters until the first new line ('\n') character has
been read (it is stored in the string) or until the number of characters read is equal to n − 1.
<file-pointer> is the pointer to the file from which the text is to be read.
The end of a file can be detected by a function called feof(<file-pointer>). It returns a non-zero value
only after an attempt has been made to read past the end of the file. This function can be used in a
situation where a program needs to read the whole of the file. For example, the following program
uses feof() and fgets() functions to read the “message.dat” file till the end of the file is
encountered. The strings read from the file are displayed on the screen.
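#include <stdio.h>
#include <stdlib.h>   /* exit() */
int main(void)
{
    char text[80];
    FILE *ptr;
    ptr = fopen("message.dat", "r");
    if (!ptr)
    {
        printf("\n the file cannot be opened");
        exit(1);
    }
    while (!feof(ptr))
    {
        /* fgets() returns NULL when nothing more could be read */
        if (fgets(text, 80, ptr) == NULL)
            break;
        printf("%s", text);
    }
    fclose(ptr);
    return 0;
}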
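Since fgets() returns NULL once nothing more can be read, its return value alone can also drive the reading loop. The following sketch counts the lines of a text file in this way. The function name count_lines and the rule that a final line without a new line character still counts as a line are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Counts the lines of a text file by reading it with fgets().
   Returns -1 if the file cannot be opened.
   Assumption: a final line without '\n' still counts as a line. */
long count_lines(const char *filename)
{
    char text[80];
    long lines = 0;
    int at_line_start = 1;      /* 1 when the next chunk begins a new line */
    FILE *ptr = fopen(filename, "r");

    if (!ptr)
        return -1;
    /* fgets() returns NULL when nothing more can be read */
    while (fgets(text, sizeof text, ptr) != NULL) {
        if (at_line_start)
            lines++;
        /* a chunk that ends in '\n' completes the current line */
        at_line_start = (strchr(text, '\n') != NULL);
    }
    fclose(ptr);
    return lines;
}
```

A line longer than 79 characters is delivered by fgets() in several chunks; the flag at_line_start ensures that such a line is counted only once.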
Similarly, the functions fgetc() and fputc() can be used to read a character from or write a
character to a file. The function fputc() takes two arguments, i.e., a character and the file pointer,
whereas the function fgetc() takes only one argument, i.e., the file pointer.
For example, ch=fgetc(ptr) reads a character into ch from the file pointed to by the pointer called ptr.
fputc(ch, ptr) writes the character stored in ch to the file pointed to by the pointer called ptr.
Example 1: Write a program that copies the file called “message.dat” to another file called “new.dat”.
Solution: The above program can be written by using the following steps.
Steps
(1) Open “message.dat” for reading.
(2) Open “new.dat” for writing.
(3) Read a character from “message.dat”. If the character is “eof”, then go to step 5 else step 4.
(4) Write the character in “new.dat”. Go to step 3.
(5) Close “message.dat” and “new.dat”.
The required program is given below:
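#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <conio.h>    /* Turbo C console header */
int main(void)
{
    FILE *ptr1, *ptr2;
    int ch;           /* int, so that EOF can be told apart from a character */
    ptr1 = fopen("message.dat", "r");
    if (!ptr1)
    {
        printf("\n The file %s cannot be opened", "message.dat");
        exit(1);
    }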
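Steps (1) to (5) listed above might be sketched as a single function. This is a minimal illustration under stated assumptions rather than the book's listing: the name copy_file, its parameters and its return values are hypothetical, and the file names are passed in instead of being fixed.

```c
#include <stdio.h>

/* Copies the file src to the file dst character by character.
   Returns 0 on success and -1 if either file cannot be opened. */
int copy_file(const char *src, const char *dst)
{
    FILE *ptr1, *ptr2;
    int ch;                                 /* int, so that EOF is distinguishable */

    ptr1 = fopen(src, "r");                 /* step 1 */
    if (!ptr1)
        return -1;
    ptr2 = fopen(dst, "w");                 /* step 2 */
    if (!ptr2) {
        fclose(ptr1);
        return -1;
    }
    while ((ch = fgetc(ptr1)) != EOF)       /* step 3 */
        fputc(ch, ptr2);                    /* step 4 */
    fclose(ptr1);                           /* step 5 */
    fclose(ptr2);
    return 0;
}
```

A call such as copy_file("message.dat", "new.dat") would then perform the copy asked for in Example 1.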
The efficacy of the above program can be verified by opening the “new.dat” either in notepad or
in the Turbo-C editor.
It may be noted that till now we have used programs which are rigid in nature, i.e., the programs
work on particular files. For instance, the above program works on “message.dat” and “new.dat”
files and will not work with other files. This situation can be easily handled by taking the name of the file
from the user at run time. Consider the program segment given below:
.
.
.
char filename[20];
FILE *ptr;
printf("\n Enter the name of the file to open for reading: ");
scanf("%19s", filename);   /* at most 19 characters plus the terminating '\0' */
ptr = fopen(filename, "r");
.
.
.
The above program asks the user for the name of the file to be opened giving freedom to the user to
opt for any of the available files or to provide altogether a new name for a file.
Example 2: Write a program that asks the user for the name of the file to be opened and then displays
the contents of the file.
Solution: The required program is given below:
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <conio.h>    /* clrscr(); Turbo C only */
int main(void)
{
    char filename[20];
    int ch;           /* int, so that it can hold EOF */
    FILE *ptr;
    printf("\n Enter the name of the file to be opened for reading:");
    scanf("%19s", filename);
    ptr = fopen(filename, "r");
    if (!ptr)
    {
        printf("\n The file %s cannot be opened", filename);
        exit(1);
    }
    clrscr();
    while ((ch = fgetc(ptr)) != EOF)
        printf("%c", ch);
    fclose(ptr);
    return 0;
}
Sample Input: Enter the name of the file to be opened for reading: new.dat
Output: When an apple fell, Newton was disturbed
but when he found that all apples fell,
it was gravitation and he was satisfied
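It may be noted that in the above program, instead of feof(), the constant EOF (a macro defined in
<stdio.h>, not a keyword) has been used to detect the end of file. Since fgetc() returns an int, its
result should be stored in an int variable so that EOF can be distinguished from every valid character.
This is another method of doing the same thing. In fact, it is especially useful where a file has to be
read character by character as was the case in the above program.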
Example 3: Write a program that reads a file of numbers and sorts the numbers in ascending order. The
sorted numbers are stored in a new file.
Solution: Assume that the input file (say “datafile.dat”) contains the numbers which need to
be sorted. The getw() function would be used to read the numbers from the file into an array of integers
called List. The sorted list of numbers would be finally stored in an output file (say “sortedfile.dat”).
The required program is given below:
/* This program reads a file of numbers, sorts them and
writes the sorted list into a new file */
The numbers are written in the file as long as system does not encounter −9999, a number used to
identify the end of the list.
Similarly, the following program can be used to see the contents of a file (say “sortedfile.dat”)
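#include <stdio.h>
#include <stdlib.h>   /* exit() */
int main(void)
{
    FILE *ptr;
    int val;
    /* getw() is not part of standard C; it reads a binary integer */
    ptr = fopen("sortedfile.dat", "rb");
    if (!ptr)
    {
        printf("\n The file cannot be opened");
        exit(1);
    }
    printf("\n");
    while (!feof(ptr))
    {
        val = getw(ptr);
        if (!feof(ptr)) printf("%d ", val); /* Do not print EOF */
    }
    fclose(ptr);
    return 0;
}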
It may be further noted that both the programs take care that the EOF (a number) does not get
included into the list of numbers, being manipulated by them.
Example 5: Use the file created in Example 4. Read the list of numbers contained in the file using
fscanf() function. Print the largest number in the list.
Solution: The numbers from the file would be read one by one and compared for getting the largest of
them. After EOF, the largest would be displayed.
The required program is given below:
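#include <stdio.h>
#include <stdlib.h>   /* exit() */
int main(void)
{
    FILE *ptr;
    int val, large;
    char infile[20];
    printf("\n Enter the name of the file:");
    scanf("%19s", infile);
    ptr = fopen(infile, "r");
    if (!ptr)
    {
        printf("\n The file %s cannot be opened", infile);
        exit(1);
    }
    printf("\n The list is ...");
    large = -9999;
    while (1)
    {
        /* fscanf() returns 1 only when a number has actually been read */
        if (fscanf(ptr, "%d", &val) != 1) break;
        printf("%d ", val);
        if (large < val) large = val;
    }
    printf("\n The largest is: %d", large);
    fclose(ptr);
    return 0;
}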
Sample output:
Enter the name of the file: Mydata.dat
The list is 12 34 56 78 32 41 21 13
The largest is: 78.
where <address of object> is the address of (pointer to) the object into which the data is read from
the file, or from which the data is written to the file.
<size of object> is computed by the sizeof operator and included in the argument list.
<number of items> is the number of data items to be read or written. Generally, it is set as 1.
<file pointer> is the pointer to the file from which data is to be read or to which it is to be written.
Let us now try to write the following structure into a file (say test.dat):
struct stud {
char name [20];
int roll;
};
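Writing this structure to a file and reading it back might look as follows. This is only a sketch: the helper names write_stud and read_stud and their return values are assumptions made for illustration, and the file is opened in binary mode ("wb" and "rb") because fwrite() and fread() transfer the bytes of the object unchanged.

```c
#include <stdio.h>

struct stud {
    char name[20];
    int roll;
};

/* Writes one record to the named file; returns 1 on success, 0 on failure */
int write_stud(const char *filename, const struct stud *s)
{
    size_t written;
    FILE *ptr = fopen(filename, "wb");      /* binary mode for fwrite() */
    if (!ptr)
        return 0;
    written = fwrite(s, sizeof(struct stud), 1, ptr);
    fclose(ptr);
    return written == 1;
}

/* Reads the first record of the named file; returns 1 on success, 0 on failure */
int read_stud(const char *filename, struct stud *s)
{
    size_t got;
    FILE *ptr = fopen(filename, "rb");
    if (!ptr)
        return 0;
    got = fread(s, sizeof(struct stud), 1, ptr);
    fclose(ptr);
    return got == 1;
}
```

Both fwrite() and fread() return the number of items actually transferred, so comparing the result with the requested number of items (here 1) is a simple way to detect a failure or the end of the file.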
Example 6: Write a program that creates a file called “Marks.dat” and writes the records of all the
students studying in a class.
Example 7: A file called “Marks.dat” contains the records of students having the following structure:
struct student
{
char name [20];
int roll;
int marks;
};
Write a program that reads the marks and creates the following two files:
(1) “pass.dat” should contain the roll numbers of students scoring more than or equal to the pass
marks stored in a variable called pass_marks.
(2) “fail.dat” should contain the roll numbers of students scoring below the pass_marks.
Solution: The required program is given below:
#include <stdio.h>
#include <stdlib.h>   /* exit() */
struct student {
    char name[20];
    int roll;
    int marks;
};
int main(void)
{
    FILE *ptr, *p, *f;
    struct student studob;
    int pass_marks;
    int t = sizeof(struct student);
    ptr = fopen("marks.dat", "rb");   /* binary file of records */
    p = fopen("pass.dat", "w");
    f = fopen("fail.dat", "w");
    if (!ptr || !p || !f)
    {
        printf("\n A file cannot be opened");
        exit(1);
    }
    printf("\n Enter the passing marks :");
    scanf("%d", &pass_marks);
    /* fread() returns the number of records actually read */
    while (fread(&studob, t, 1, ptr) == 1)
    {
        if (studob.marks >= pass_marks)
            fprintf(p, "%d ", studob.roll);
        else
            fprintf(f, "%d ", studob.roll);
    }
    fclose(ptr);
    fclose(p);
    fclose(f);
    return 0;
}
Example 8: Write a program that opens the “pass.dat” and “fail.dat” files created in the program in
Example 7 and displays two separate lists of roll numbers who have passed and failed in the examination.
Solution: The required program is given below:
#include <stdio.h>
#include <stdlib.h>   /* exit() */
int main(void)
{
    int roll;
    FILE *p, *f;
    p = fopen("pass.dat", "r");
    if (!p)
    {
        printf("\n The file %s cannot be opened", "pass.dat");
        exit(1);
    }
    f = fopen("fail.dat", "r");
    if (!f)
    {
        printf("\n The file %s cannot be opened", "fail.dat");
        exit(1);
    }
    printf("\n The list of pass students :");
    /* fscanf() returns 1 as long as a roll number has been read */
    while (fscanf(p, "%d", &roll) == 1)
        printf("%d ", roll);
    printf("\n The list of fail students :");
    while (fscanf(f, "%d", &roll) == 1)
        printf("%d ", roll);
    fclose(p);
    fclose(f);
    return 0;
}
Fig. 9.10 Reading sequential file (flowchart not reproduced)
Fig. 9.11 Appending a sequential file (flowchart not reproduced)
Example 9: Write a program that creates a sequential file of records of students who are players of
different games. The structure of student record is given below:
struct student
{
    char name[20];
    int roll;
    char class[5];
    char game[15];
};
Solution: The required program that creates the file (say “Play_Master.dat”) is given below:
/* This program creates a sequential file */
#include <stdio.h>
#include <stdlib.h>   /* exit() */
struct student {
    char name[20];
    int roll;
    char class[5];
    char game[15];
};
int main(void)
{
    FILE *ptr;
    char myfile[15];
    int t = sizeof(struct student);
    struct student studrec;
    printf("\n Enter the name of the file:");
    scanf("%14s", myfile);
    ptr = fopen(myfile, "wb");
    if (!ptr)
    {
        printf("\n The file %s cannot be opened", myfile);
        exit(1);
    }
    printf("\n Enter the records of the students one by one");
    studrec.roll = 0;
    while (studrec.roll != -9999)
    {
        /* " %19[^\n]" skips the pending new line and reads at most 19 characters */
        printf("\nName:"); scanf(" %19[^\n]", studrec.name);
        printf("\nRoll:");
        if (scanf("%d", &studrec.roll) != 1) break;   /* stop on invalid input */
        printf("\nClass:"); scanf(" %4[^\n]", studrec.class);
        printf("\nGame:"); scanf(" %14[^\n]", studrec.game);
        if (studrec.roll != -9999)
            fwrite(&studrec, t, 1, ptr);
    }
    fclose(ptr);
    return 0;
}
Example 10: Write a program that uses “Play_Master.dat” created in Example 9. For a given game,
it displays the data of all the students who play that game.
Solution: The required program is as follows:
/* This program searches a sequential file */
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <string.h>   /* strcmp() */
struct student {
    char name[20];
    int roll;
    char class[5];
    char game[15];
};
int main(void)
{
    FILE *ptr;
    char game[15], myfile[15];
    int t = sizeof(struct student);
    struct student studrec;
    printf("\n Enter the name of the file to be opened");
    scanf("%14s", myfile);
    ptr = fopen(myfile, "rb");
    if (!ptr)
    {
        printf("\n The file %s cannot be opened", myfile);
        exit(1);
    }
    printf("\n Enter the name of the game:");
    scanf(" %14[^\n]", game);   /* skips the pending new line, reads up to 14 characters */
    fread(&studrec, t, 1, ptr);
    while (!feof(ptr))
    {
        if (!strcmp(game, studrec.game))
        {
            printf("\nName =%s", studrec.name);
            printf("\nRoll =%d", studrec.roll);
            printf("\nclass =%s", studrec.class);
            printf("\nGame =%s", studrec.game);
        }
        fread(&studrec, t, 1, ptr);
    }
    fclose(ptr);
    return 0;
}
Example 11: Write a program that opens the “Play_Master.dat” file and appends records in the file
till the roll = −9999 is encountered.
Solution: The required program is given below:
/* This program appends records in a sequential file */
#include <stdio.h>
#include <stdlib.h>   /* exit() */
struct student {
    char name[20];
    int roll;
    char class[5];
    char game[15];
};
int main(void)
{
    FILE *ptr;
    char myfile[15];
    int t = sizeof(struct student);
    struct student studrec;
    printf("\n Enter the name of the file:");
    scanf("%14s", myfile);
    ptr = fopen(myfile, "rb+");
    if (!ptr)
    {
        printf("\n The file %s cannot be opened", myfile);
        exit(1);
    }
    printf("\n Enter the records to be appended ");
    /* read up to the end of the file so that new records follow the old ones */
    while (!feof(ptr))
        fread(&studrec, t, 1, ptr);
    studrec.roll = 0;
    while (studrec.roll != -9999)
    {
        printf("\nName:"); scanf(" %19[^\n]", studrec.name);
        printf("\nRoll:");
        if (scanf("%d", &studrec.roll) != 1) break;   /* stop on invalid input */
        printf("\nClass:"); scanf(" %4[^\n]", studrec.class);
        printf("\nGame:"); scanf(" %14[^\n]", studrec.game);
        if (studrec.roll != -9999)
            fwrite(&studrec, t, 1, ptr);
    }
    fclose(ptr);
    return 0;
}
It may be noted that in the above program, the file was opened in “r+” mode. In fact, the same
job can be done by opening the file in “a”, i.e., append mode. The program that uses this mode is given
below:
/* This program appends records in a sequential file */
#include <stdio.h>
#include <stdlib.h>   /* exit() */
struct student {
    char name[20];
    int roll;
    char class[5];
    char game[15];
};
int main(void)
{
    FILE *ptr;
    char myfile[15];
    int t = sizeof(struct student);
    struct student studrec;
    printf("\n Enter the name of the file:");
    scanf("%14s", myfile);
    ptr = fopen(myfile, "ab");   /* append mode: every write goes to the end */
    if (!ptr)
    {
        printf("\n The file %s cannot be opened", myfile);
        exit(1);
    }
    printf("\n Enter the records to be appended");
    studrec.roll = 0;
    while (studrec.roll != -9999)
    {
        printf("\nName:"); scanf(" %19[^\n]", studrec.name);
        printf("\nRoll:");
        if (scanf("%d", &studrec.roll) != 1) break;   /* stop on invalid input */
        printf("\nClass:"); scanf(" %4[^\n]", studrec.class);
        printf("\nGame:"); scanf(" %14[^\n]", studrec.game);
        if (studrec.roll != -9999)
            fwrite(&studrec, t, 1, ptr);
    }
    fclose(ptr);
    return 0;
}
Example 12: Modify the program written in Example 11 so that it displays the contents of a file before
and after the records are appended to the file.
Solution: A function called show() would be used to display the contents of the file. The required pro-
gram is given below:
/* This program displays the contents before and after it appends records
in a sequential file */
#include <stdio.h>
#include <stdlib.h>   /* exit() */
void show(FILE *p, char x[15]);
struct student {
    char name[20];
    int roll;
    char class[5];
    char game[15];
};
int main(void)
{
    FILE *ptr = NULL;
    char myfile[15];
    int t = sizeof(struct student);
    struct student studrec;
    printf("\n Enter the name of the file:");
    scanf("%14s", myfile);
    printf("\n The Records before append");
    show(ptr, myfile);
    ptr = fopen(myfile, "ab");
    if (!ptr)
    {
        printf("\n The file %s cannot be opened", myfile);
        exit(1);
    }
    printf("\n Enter the records to be appended ");
    studrec.roll = 0;
    while (studrec.roll != -9999)
    {
        printf("\nName:"); scanf(" %19[^\n]", studrec.name);
        printf("\nRoll:");
        if (scanf("%d", &studrec.roll) != 1) break;   /* stop on invalid input */
        printf("\nClass:"); scanf(" %4[^\n]", studrec.class);
        printf("\nGame:"); scanf(" %14[^\n]", studrec.game);
        if (studrec.roll != -9999)
            fwrite(&studrec, t, 1, ptr);
    }
    fclose(ptr);
    printf("\n The Records after append");
    show(ptr, myfile);
    return 0;
}
void show(FILE *p, char myfile[15])
{
    struct student studrec;
    p = fopen(myfile, "rb");
    if (!p)   /* the file may not exist before the first append */
    {
        printf("\n The file %s cannot be opened", myfile);
        return;
    }
    fread(&studrec, sizeof(studrec), 1, p);
    while (!feof(p))
    {
        printf("\nName:"); fflush(stdout); puts(studrec.name);
        printf("\nRoll:"); printf("%d", studrec.roll);
        printf("\nClass:"); fflush(stdout); puts(studrec.class);
        printf("\nGame:"); fflush(stdout); puts(studrec.game);
        fread(&studrec, sizeof(studrec), 1, p);
    }
    fclose(p);
}
Example 13: A file called old master of a bank contains the records of the savings fund accounts (A/C)
of its customers in ascending order of account numbers. The format of the record is given below:
Name
Acnt_No
Balance_Amt
A file called transaction file contains the details of the transactions done on the accounts on a
particular day. It also contains the records in ascending order of account numbers. Each record of
that file has the following format:
Acnt_No
Trans_type
Trans_amt
The field Trans_amt contains the amount withdrawn from or deposited to the account depending upon
the value of Trans_type being 0 or 1, respectively. Write a program that reads the two files and does the
transaction processing, giving a third file called new master.
Solution: In this program, the following menu would be displayed so that the possible actions can be
conveniently carried out by the user:
Menu
Create old master file 0
Create transaction file 1
Generate new master file 2
Show contents of a file 3
Quit 4
beginning of the file because the start position is equal to 0 (zero). The offset (i.e., 4) would move the file
pointer to the fourth record.
Similarly, fseek(ptr, −9, 2) suggests that the file pointer be moved back by 9 positions from the end
of the file; a displacement measured backwards from the end is given as a negative offset.
Thus, it is a relative file organization wherein the records are accessed directly in relation to a
starting point specified within the file. Since fseek() counts the offset in bytes, the offset of a record
is usually computed by the following formula:
offset = record number × size of a record
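The formula can be wrapped in a small helper that positions the file pointer and reads one record. This is a sketch under stated assumptions: the name read_record is hypothetical, the record type struct room is borrowed from Example 14, record numbers are counted from 0, and the file is assumed to have been opened in binary mode.

```c
#include <stdio.h>

struct room {                 /* record type borrowed from Example 14 */
    int room_no;
    char room_type;
    char room_status;
};

/* Reads record number rec_no (counted from 0) from a binary file of
   struct room records. Returns 1 on success, 0 on failure. */
int read_record(FILE *ptr, long rec_no, struct room *out)
{
    long offset = rec_no * (long)sizeof(struct room);  /* record number x record size */

    if (rec_no < 0 || fseek(ptr, offset, SEEK_SET) != 0)
        return 0;
    return fread(out, sizeof(struct room), 1, ptr) == 1;
}
```

SEEK_SET is the symbolic name defined in <stdio.h> for the start position 0 (beginning of the file); SEEK_CUR and SEEK_END name the current position and the end of the file.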
Example 14: Write a program that creates a file of room data of a hotel having the following structure.
The rooms are numbered starting from 1.
struct room
{
    int room_no;
    char room_type;
    char room_status;
};
where room_type may take value S/D, i.e., single bed or double bed. Similarly, room_status may take
value A/N, i.e., available or not available. Use fseek() function to display the information about a
particular room of the hotel.
Solution: The required program is given below:
/* This program uses fseek() function to search
records in a file */
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <ctype.h>    /* toupper() */
#include <conio.h>    /* clrscr(); Turbo C only */
struct room {
    int room_no;
    char room_type;
    char room_status;
};
int main(void)
{
    FILE *ptr;
    char filename[20];
    int t = sizeof(struct room);
    struct room rob;
    int no_of_rooms, i, choice;
    printf("\n Enter the name of the file");
    scanf("%19s", filename);
    printf("\n Enter the number of rooms");
    scanf("%d", &no_of_rooms);
    /* create the file */
    ptr = fopen(filename, "wb");
    if (!ptr)
    {
        printf("\n The file %s cannot be opened", filename);
        exit(1);
    }
    printf("\n Enter the room data one by one");
    for (i = 1; i <= no_of_rooms; i++)
    {
        rob.room_no = i;
        printf("\n Room No. %d:", i);
        printf("\n Room_Type (S/D):");
        scanf(" %c", &rob.room_type);   /* the space skips the pending new line */
        rob.room_type = toupper((unsigned char)rob.room_type);
        printf("\n Room_Status (A/N):");
        scanf(" %c", &rob.room_status);
        rob.room_status = toupper((unsigned char)rob.room_status);
        fwrite(&rob, t, 1, ptr);
    }
    fclose(ptr);
    ptr = fopen(filename, "rb");
    if (!ptr)
    {
        printf("\n The file %s cannot be opened", filename);
        exit(1);
    }
    /* search a record */
    do
    {
        printf("\n Menu");
        printf("\n");
        printf("\n Display info 1");
        printf("\n Quit 2");
        printf("\n Enter your choice");
        if (scanf("%d", &choice) != 1) break;   /* stop on invalid input */
        if (choice == 1)
        {
            printf("\n Enter the room no.");
            if (scanf("%d", &i) != 1) break;
            /* offset = (room number - 1) * size of a record */
            if (i < 1 || fseek(ptr, (long)(i - 1) * t, SEEK_SET) != 0
                || fread(&rob, t, 1, ptr) != 1)
            {
                printf("\n No such room");
                continue;
            }
            clrscr();
            printf("\n Room No.= %d", rob.room_no);
            printf("\n Room Type = %c", rob.room_type);
            printf("\n Room Status= %c", rob.room_status);
        }
        else
            break;
    }
    while (1);
    fclose(ptr);
    return 0;
}
• Unless the record number for a record within the file is known, one cannot gain access to the
• File processing and updation activities can be performed without the creation of a new master
file.
• Speed of record processing in case of large files is very fast as compared to sequential files.
• Concurrent processing of several files is possible. For example, when a sale is recorded in a
company, the account receivable can be posted on one file and simultaneously the inventory
withdrawal can also be written on another file.
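[Figure 9.14: figure not reproduced; the recoverable entries are listed below]
Index (highest key in block, block address): (110, 4110), (240, 4130), (417, 4127), (518, 4140)
Block 2 (address 4130): 215 216 218 240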
main memory. Once the block is brought inside the main memory, the record is searched sequentially
in the block.
It may be noted from Figure 9.14 that each entry in the index contains the highest key value for any
record in the block and the starting address of the block. Now if we want to search a record with key
value 376, then the index is searched first. The first entry (110) of the index is compared with the search
key (376). Since 110 is the highest key value in its corresponding block and it is less than the search key
(376), the comparison with the next entry is done. The comparison continues till it reaches the third
entry, where it is found that the key of the entry (417) is larger than the search key (376). Thus it is
established that the required record, if present, must be in the corresponding block. The block’s
starting address is 4127. Now the access is made to disk address 4127 and the block is brought into the
main memory. The search key is compared with the key of the first record of the block (370). Then it
moves on to the second record and the process is repeated till it reaches the third record (key = 376).
The keys match, establishing the fact that the required record has been found. Now, the desired
operation can be performed on the record, such as displaying its contents, printing certain values, etc.
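The two stages of this search, first the index and then the block, can be sketched as two small functions. The structure index_entry and the names find_block and find_in_block are assumptions made for illustration; the block is represented only by an array of its keys that is assumed to be in main memory already.

```c
#include <stddef.h>

struct index_entry {          /* hypothetical layout of one index entry */
    int highest_key;          /* highest key value stored in the block */
    long block_addr;          /* starting address of the block */
};

/* Returns the address of the block that may hold search_key,
   or -1 if the key is larger than every key in the index. */
long find_block(const struct index_entry index[], size_t n, int search_key)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (search_key <= index[i].highest_key)   /* first entry not smaller than the key */
            return index[i].block_addr;
    return -1;
}

/* Sequential search for search_key among the keys of one block.
   Returns the position of the record within the block, or -1 if absent. */
int find_in_block(const int keys[], size_t n, int search_key)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (keys[i] == search_key)
            return (int)i;
    return -1;
}
```

With the index entries (110, 4110), (240, 4130), (417, 4127) and (518, 4140), find_block() stops at the third entry for the search key 376 and returns the address 4127, as in the walk-through above.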
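[Figure 9.15: figure not reproduced; the recoverable entries are listed below]
Index (highest key in block, block address): (110, 4110), (240, 4130), (417, 4127), (518, 4140)
Block 2 (address 4130): 215 216 217 218 <> 240
Overflow block: 417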
The addition of record (371) will push the record (417) out of Block 3. The reason is that
the block is already full, and to accommodate the incoming record (371) the record at the extreme
end has to leave the block. The overflowed record is then stored in a separate block known as the
overflow block. The deletion of a record from the file is a logical deletion, i.e., the record is not
physically or actually deleted from the file but is marked with special characters to indicate that the
record is no longer accessible. For example, if the records with key values 002 and 219 are to be deleted,
then these records are searched in the file. If they are found, then the records are marked with special
characters (<> in our case) (see Figure 9.15). The presence of ‘<>’ in a record indicates that the record
has been deleted.
The modification of a record involves the following steps:
(1) Search the record
(2) Read the record
(3) Change the record
(4) Rewrite the changed record in its original location.
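The four steps can be sketched for a plain binary file of fixed-size records. This is only an illustration under stated assumptions, not an ISAM implementation: the record type struct account and the function update_balance are hypothetical, the search is sequential, and the file is assumed to be open for update (mode "rb+" or "wb+").

```c
#include <stdio.h>

struct account {              /* hypothetical record type for illustration */
    int acnt_no;
    double balance;
};

/* Adds amount to the balance of the record with the given account number
   in a binary file opened for update. Returns 1 if found, else 0. */
int update_balance(FILE *ptr, int acnt_no, double amount)
{
    struct account rec;
    long pos;

    rewind(ptr);
    for (;;) {
        pos = ftell(ptr);                              /* remember where the record starts */
        if (fread(&rec, sizeof rec, 1, ptr) != 1)      /* steps 1 and 2: search and read */
            return 0;
        if (rec.acnt_no == acnt_no)
            break;
    }
    rec.balance += amount;                             /* step 3: change the record */
    if (fseek(ptr, pos, SEEK_SET) != 0)                /* step 4: back to the original location */
        return 0;
    return fwrite(&rec, sizeof rec, 1, ptr) == 1;
}
```

The call to fseek() before fwrite() serves two purposes: it returns the file pointer to the start of the record, and it satisfies the rule that a read on an update stream must be followed by a positioning call before a write.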
The ISAM files are very flexible in the sense that they can be processed either sequentially or
randomly, depending on the requirements of the user.
[Figure: a disk cylinder showing the index track, the data tracks and the overflow track around the spindle; not reproduced]
• Index area: The first track in each cylinder is reserved for an index. This index describes the
storage of records on the various tracks of the cylinder.
• Prime area (data tracks): The various records of the file are written on this area of a cylinder,
always starting from the 2nd track. The records are recorded sequentially along the various tracks.
The highest key on the track and the track address are entered into the index.
• Overflow area: This area is created to accommodate the overflowed records of the file. The
overflow area is generally the last track on the same cylinder. This area is unoccupied at the time of
creation of the file.
• Extra storage space is required for the index, especially when multilevel indexes are
maintained.
• The overflow areas sometimes overflow to other cylinders and thus cause much read/write head
movement, making the access very slow. Therefore, the file has to be reorganized and the indexes
rearranged accordingly. This is a time-consuming process.
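[Figure: cylinder index and track indexes; figure not reproduced, the recoverable entries are listed below]
Cylinder index (record key, cylinder no.): (518, 50), (1024, 51), (2530, 52), (6015, 81)
Track index of cylinder 50 (key, address): (518, 5009)
Track index of cylinder 81 (key, address): (6015, 8109)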
A sequential file can be used if the activity ratio is high, i.e., more than 60–70 per cent. On the other
hand, if the activity ratio is very low, then the direct file can be chosen. The indexed sequential files are
less efficient than the direct files for low activity. A comparison between the various file organizations
is shown in Figure 9.18.
2. Response time: The response time can be defined as the amount of time elapsed between a demand
for processing and its completion. The response time of a sequential file is always very long because it
involves both the waiting time, i.e., the time until a batch of transactions is run, and the processing
time. The direct and indexed sequential files have a faster response time.
The direct files allow quick retrieval of records. Generally, a faster response time is comparatively
more expensive. Since the direct and indexed files are more up to date, the added cost per transaction
is justified.
3. Volatility: It is the measure of percentage of records added to or deleted from a file during its use over
a period of time. Since each file organization requires different methods to add or delete records, the
volatility becomes a very important factor as far as file processing and its organization is concerned.
In a direct file, the process for adding and deleting records is simple and time saving compared to
a sequential file. The indexed sequential files are also very volatile. The direct or indexed files have a
problem of reorganization and overflow overheads.
4. Size: The amount of processing required for a file is directly proportional to its size. The size of a file
is determined in terms of the total number of records in the file. The size of a record is measured in terms
of the total number of characters in the record.
Small files do not require any painstaking design. On the other hand, large ones require careful
design. A small sequential file may be stored on a direct access device to avoid waiting time, i.e., the
mounting and dismounting of tapes is avoided. Large sequential files are to be stored on a tape.* As far
as direct files are concerned, the size has no relevance.
5. Integration: Generally, a business organization maintains an integrated file system that allows online
updating of interdependent files. Random and indexed organizations allow all the files affected by a
particular transaction to be updated online. Sequential files are not suitable for such a system of files.
6. Security: A backup is taken during every run of a sequential file. Therefore, a stored sequential file
is more secure. In fact, sequential file processing provides automatic data backup. The data stored in
direct and indexed files are comparatively less secure. Special techniques, such as frequent mirroring of
disks, are used to ensure file integrity. On the other hand, magnetic tapes are more prone to physical
damage.
* Nevertheless, the size of available hard disks is so large (more than 500 GB) that tapes have become obsolete and redundant.
Files 451
Sample Input:
#define ce Computer Engineering
#define ymca YMCA Institute of Engineering
#define aks A. K. Sharma
#define fbd Faridabad
#define MDU Maharishi Dayanand University, Rohtak
MEND
ymca is an Institute situated in fbd. It is a Government of Haryana institute. It is affiliated to MDU. The
institute is rated as the best engineering college in the state of Haryana. The ce department is the largest
department of the institute. aks works in ce department as a Professor.
Sample Output:
YMCA Institute of Engineering is an Institute situated in Faridabad. It is a Government of Haryana
institute. It is affiliated to Maharishi Dayanand University, Rohtak. The institute is rated as the best
engineering college in the state of Haryana. The Computer Engineering department is the largest
department of the institute. A. K. Sharma works in Computer Engineering department as a Professor.
Note: The program does not perform syntax checking, i.e., if the terms “define” and “MEND” are
misspelled, the program will go haywire.
Problem 2: Statements in BASIC language are written in such a manner that each statement starts with
an integer number as shown below:
10 REM This is a sample statement
20 input A, B
30 goto 150
:
:
:
100 If (x y) goto 20
Write a program that reads a file of a BASIC program. For a given offset, it adds the offset to all the
statement numbers. The arguments of the goto statements should also be modified to keep the program
consistent. In fact, this type of activity is done by a loader, i.e., when it loads a program into the main
memory for execution, it adds the offset of the address of the main memory to all address modifiable
parts of the program such as goto or jump statements and the like.
Solution: The get_token() and writeout() functions of the previous program are reused in this
program to read a token and to write the tokens to a given file. The following functions are used to
convert a string to an integer and vice versa.
(1) int atoi (string): Declared in stdlib.h, this function takes a string of characters and converts it
into an equivalent integer.
(2) itoa (int, string, radix): This function converts an integer to an equivalent string.
Radix is the radix of the integer, such as 2 or 10, depending
upon the number being binary or decimal. Note that itoa() is not part of
standard C; only some compilers provide it in stdlib.h. Where it is not
available, sprintf (string, "%d", n) does the same job for radix 10.
The required program is given below:
/* This program implements a relative loader */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
/* function to read a word from input text file (defined in the previous program) */
void get_token (FILE *infile, char token[20], char *break_point, int *flag);
/* function to write a word to output text file (defined in the previous program) */
void writeout (FILE *outfile, char token[20], char ch);
char inputfile[15], outputfile[15];
int main (void)
{
FILE *infile, *outfile;
char token[20];
char last_token[20] = "";
int flag, temp;
int offset;
char ch, ch1;
printf ("\n Enter the name of input file:");
scanf ("%14s", inputfile); /* at most 14 characters, so the buffer cannot overflow */
printf ("\n Enter the name of output file:");
scanf ("%14s", outputfile);
printf ("\n Enter the value of offset (int):");
scanf ("%d", &offset);
infile = fopen (inputfile, "r");
outfile = fopen (outputfile, "w");
if (infile == NULL || outfile == NULL)
{ printf ("\n Cannot open the files");
return 1;
}
flag = 0;
ch1 = '\n';
/* flag == 3 indicates EOF encountered */
while (!feof (infile) && flag != 3)
{ flag = 0;
token[0] = '\0';
ch = ' ';
get_token (infile, token, &ch, &flag);
/* Check if it is a number appearing at the
beginning of a line or as argument of
goto statement */
if (isdigit ((unsigned char) token[0]) &&
(ch1 == '\n' || !strcmp (last_token, "goto")))
{ temp = atoi (token);
temp = temp + offset;
sprintf (token, "%d", temp); /* portable equivalent of itoa (temp, token, 10) */
}
writeout (outfile, token, ch);
ch1 = ch;
strcpy (last_token, token);
}
fclose (infile);
fclose (outfile);
return 0;
} /* end of main */
Sample input:
10 Rem this is a loader
20 Input A, B
30 If (A 20) goto 20
40 C = A + 5 * 20;
50 if (C 50) goto 100
60 C = C + B
70 go to 50
100 stop
Value of offset = 50
Sample output:
60 Rem this is a loader
70 Input A, B
80 If (A 20) goto 70
90 C = A + 5 * 20;
100 if (C 50) goto 150
110 C = C + B
120 go to 50
150 stop
EXERCISES
10. Write a program which reads a text file called document and generates a table of frequency count
of alphabets appearing in the text. The output should be displayed in the following form:
Frequency count
Alphabet Count
A
B
C
.
.
.
Z
11. A student record is defined as given in Example 5. Write a program which reads a file of such
records (say class-file) and prints the result in the following format:
S. No. Condition No. of Students
1 Marks < 50% xx
2 Marks >= 50% and < 60% xx
3 Marks >= 60% and < 75% xx
4 Marks >= 75% and < 90% xx
5 Marks >= 90% xx
(Assume Max marks in each subject = 100).
12. What is EOF? Explain its utility.
13. Modify Example 11 such that it generates a list of students who secure more than 80 per cent
marks.
14. Write an interactive menu driven ‘C’ program to create a text file and then display the file.
Create another text file by converting each line of the newly created text file into a lowercase string.
Display the newly created file. In case the number of lines exceeds 22, the file should be displayed one
screen at a time.
15. Write an interactive menu driven ‘C’ program to create a text file and then display the file.
Create another text file by reversing each line of the newly created text file. Display the newly
created file. In case the number of lines exceeds 22, the file should be displayed one screen at a time.
16. Explain relative files in detail.
17. Explain indexed sequential organization.
18. Write an explanatory note on multilevel indexed files.
19. Define file activity ratio, response time, and volatility.
20. How are records edited in or deleted from an indexed file?
Chapter 10
Advanced Data Structures
• 10.2 Sets
• 10.3 Skip Lists
• 10.4 B-Trees
• 10.5 Search by Hashing
[Figure 10.1: three BSTs, (a), (b) and (c), built from the same keys (Tiger, Panther, Puma and others) supplied in different orders]
Fig. 10.1 BSTs produced from same data provided in different combinations
It may be noted that the BST of Figure 10.1 (a) is a skewed binary tree and the search within such a
data structure would amount to a linear search. The BST of Figure 10.1 (b) is heavily lopsided towards
left whereas the BST of Figure 10.1 (c) is somewhat balanced and may lead to a comparatively faster
search. If the data is better planned, then definitely an almost balanced BST can be obtained. But the
planning can only be done when the data is already with us and is suitably arranged before creating the
BST out of it. However, in many applications such as ‘symbol processing’ in compilers, the data is
dynamic and unpredictable and, therefore, creation of a balanced BST is a remote possibility. Hence, the
need to develop a technique that maintains a balance between the heights of left and right subtrees of
a binary tree has always been felt; the major aim being that whatever may be the order of insertion of
nodes, the balance between the subtrees is maintained.
A height balanced binary tree was developed by Adelson-Velskii and Landis, Russian researchers,
in 1962, and is popularly known as an ‘AVL Tree’. The AVL Tree assumes the following basic
subdefinitions:
(1) The height of a tree is defined as the length of the longest path from the root node of the tree to
one of its leaf nodes.
(2) The balance factor (BF) is defined as given below:
BF = height of left subtree (HL) − height of right subtree (HR)
With the above subdefinitions, the AVL Tree is defined as a balanced binary search tree if all its
nodes have balance factor (BF) equal to −1, 0, or 1.
In simple words we can say that in AVL Tree, the height of two subtrees of a node differs by at most
one. Consider the trees given in Figure 10.2.
[Figure 10.2: two binary search trees rooted at 65 with the BF written beside each node; in (a) every BF is −1, 0, or 1, whereas in (b) the root has BF −2]
Fig. 10.2 The binary search trees with BF labelled at each node
It may be noted that in Figure 10.2 (a), all nodes have BF within the range, i.e., −1, 0, or 1. Therefore,
it is an AVL Tree. However, the root of the tree in Figure 10.2 (b) has BF equal to −2, which is a violation
of the rule and hence the tree is not AVL.
Note: A complete binary search tree is always height balanced but a height balanced tree may or may not
be a complete binary tree.
The height of a binary tree can be computed by the following simple steps:
(1) If a child is NULL then its height is 0.
(2) The height of a tree = 1 + max (height of left subtree, height of right subtree)
Advanced Data Structures 461
Based on the above steps, the following algorithm has been developed to compute the height of a
general binary tree:
Algorithm compHeight (Tree)
{
if (Tree == NULL)
{height = 0; return height;}
hl = compHeight (leftChild (Tree));
hr = compHeight (rightChild (Tree));
if (hl >= hr) height = hl + 1;
else
height = hr + 1;
return height;
}
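The algorithm above translates almost directly into C. The following is a minimal sketch, not the book's own listing: the node layout (key, left, right) and the helper names balance_factor() and is_height_balanced() are assumptions introduced for illustration.

```c
#include <stddef.h>

/* Hypothetical node layout; the field names are assumptions. */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

/* Height as computed by compHeight: 0 for an empty tree, 1 for a single node. */
int comp_height(const struct node *tree)
{
    int hl, hr;

    if (tree == NULL)
        return 0;
    hl = comp_height(tree->left);
    hr = comp_height(tree->right);
    return (hl >= hr ? hl : hr) + 1;
}

/* BF = height of left subtree - height of right subtree. */
int balance_factor(const struct node *tree)
{
    if (tree == NULL)
        return 0;
    return comp_height(tree->left) - comp_height(tree->right);
}

/* Returns 1 if every node has BF equal to -1, 0, or 1.
   It checks the balance only, not the search-tree ordering. */
int is_height_balanced(const struct node *tree)
{
    int bf;

    if (tree == NULL)
        return 1;
    bf = balance_factor(tree);
    return bf >= -1 && bf <= 1 &&
           is_height_balanced(tree->left) &&
           is_height_balanced(tree->right);
}
```

Recomputing the height at every node keeps the sketch short but costs O(n) per call; a practical AVL implementation would normally store the height or the BF in each node.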
[Figure: an AVL tree rooted at 65 and the same tree after inserting 50; the insertion leaves node 30 (P, the pivot) with BF −2, with Q = 46 and R = 57 on the path to the new node]
The imbalance can be removed by carefully rearranging the nodes of the tree. For instance, the zone of
imbalance in the tree is the left subtree with root node 30 (say Node P). Within this subtree, the imbalance
is in right subtree with root node 46 (say Node Q). The closest node to the inserted node is 57. Call this
node as R. A point worth noting is that P, Q, and R are numbers and can be rearranged as per binary search
rule, i.e., the smallest becomes the left child, the largest as the right child and the middle one becomes the
root. This rearrangement can eliminate the imbalance in the section as shown in Figure 10.5.
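[Figure 10.5: the chain P (30), Q (46), R (57) before the rearrangement, and the balanced section after it, with Q (46) as the root, P (30) as its left child and R (57) as its right child]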
It may be noted that the imbalanced section of the tree has been rotated to the left to balance the subtree. The
rotation of a subtree in an AVL Tree is called an AVL rotation. Depending upon the side of the
pivot on which a node is inserted, there are four possible types of AVL rotations as discussed in the following sections.
10.1.2.1 Insertion of a Node in the Left Subtree of the Left Child of the Pivot In this case, when
a node in an AVL Tree is inserted in the left subtree of the left child (say LL) of the pivot then the
imbalance occurs as shown in Figure 10.6 (a).
It may be noted that in this case, after the rotation, the pivot P has become the right child, Q has become
the root. QR, the right child of Q has become the left child of P. The QL, the left child of Q remains intact.
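The pointer changes described above can be sketched in C as follows. This is an illustrative sketch only: the node layout and the name rotate_ll() are assumptions, and the function returns the new subtree root so that the caller can re-attach it to the parent of the pivot.

```c
#include <stddef.h>

/* Hypothetical node layout; the field names are assumptions. */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

/* LL case: rotate the subtree rooted at the pivot P to the right.
   Returns Q, the new root of the subtree. */
struct node *rotate_ll(struct node *p)
{
    struct node *q = p->left;   /* Q is the left child of the pivot   */

    p->left = q->right;         /* QR becomes the left child of P     */
    q->right = p;               /* P becomes the right child of Q     */
    return q;                   /* Q is the root; QL remains intact   */
}
```

A caller would typically write parent->left = rotate_ll(parent->left) (or the right-hand equivalent) so that the link from the parent follows the new root.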
[Figure 10.6: LL rotation about the pivot P. (a) Imbalance: P has left child Q and right subtree PR; Q has subtrees QL and QR. (b) After rotation: Q is the root with left subtree QL and right child P; P has subtrees QR and PR]
[Figure 10.7: (a) AVL tree (b) Imbalanced tree after insertion of 12 (c) Imbalanced tree before LL rotation (d) AVL tree (balanced) after rotation]
Figure 10.7 (c) shows the application of pointers on the nodes of the tree as per the AVL rotation.
Figure 10.7 (d) shows the final AVL Tree (height balanced) after the AVL rotation, i.e., after the
rearrangement of nodes.
10.1.2.2 Insertion of a Node in the Right Subtree of the Right Child of the Pivot In this case,
when a node in an AVL Tree is inserted in the right subtree of the right child (say RR) of the pivot then
the imbalance occurs as shown in Figure 10.8 (a).
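[Figure 10.8: RR rotation about the pivot P. (a) Imbalance: P has left subtree PL and right child Q; Q has subtrees QL and QR. (b) After rotation: Q is the root with left child P and right subtree QR; P has subtrees PL and QL]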
It may be noted that in this case, after the rotation, the pivot P has become the left child and Q has
become the root. QL, the left child of Q, has become the right child of P. QR, the right child of
Q, remains intact.
The algorithm for this rotation is given below:
/* The Tree points to the Pivot */
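A minimal C sketch of this rotation, consistent with the description above, might look as follows. The node layout and the name rotate_rr() are assumptions made for illustration; the function returns the new subtree root for the caller to re-attach.

```c
#include <stddef.h>

/* Hypothetical node layout; the field names are assumptions. */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

/* RR case: rotate the subtree rooted at the pivot P to the left.
   Returns Q, the new root of the subtree. */
struct node *rotate_rr(struct node *p)
{
    struct node *q = p->right;  /* Q is the right child of the pivot  */

    p->right = q->left;         /* QL becomes the right child of P    */
    q->left = p;                /* P becomes the left child of Q      */
    return q;                   /* Q is the root; QR remains intact   */
}
```

Applied to the subtree rooted at 82 after the insertion of 100, the call returns node 93, whose left child is then 82 (with children 72 and 89) and whose right child is 98.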
Consider the AVL Tree given in Figure 10.9 (a). The tree has become imbalanced after the
insertion of 100 (see Figure 10.9 (b)).
Figure 10.9 (c) shows the application of pointers on the nodes of the tree as per the AVL rotation.
Figure 10.9 (d) shows the final AVL Tree (height balanced) after the AVL rotation, i.e., after the
rearrangement of nodes.
10.1.2.3 Insertion of a Node in the Right Subtree of the Left Child of the Pivot In this case,
when a node in an AVL Tree is inserted in the right subtree of the left child (say LR) of the pivot then
the imbalance occurs as shown in Figure 10.10 (a).
[Figure 10.9: (a) AVL tree (b) Imbalanced tree after insertion of 100 (RR) (c) Imbalanced tree before RR rotation (d) AVL tree after rotation (RR)]
[Figure 10.10: LR double rotation about the pivot P, with left child Q and grandchild R. (a) Imbalance (b) After rotation: R is the root with children Q and P]
The trace of the above algorithm is shown in Figure 10.11 wherein the AVL Tree has become imbal-
anced because of insertion of node 50 in the right subtree of left child of pivot [see Figures 10.11 (a) and (b)].
[Figure 10.11: (a) AVL tree (b) Imbalance due to insertion of 50 (c) The tree after first AVL rotation (d) The tree after second AVL rotation]
After the double rotation, the tree has again become height balanced as shown in Figures 10.11 (c) and (d).
10.1.2.4 Insertion of a Node in the Left Subtree of the Right Child of the Pivot In this case,
when a node in an AVL Tree is inserted in the left subtree of the right child (say RL) of the pivot then
the imbalance occurs as shown in Figure 10.12 (a).
This case also requires two rotations: rotation 1 and rotation 2.
Rotation 1: In this rotation, the right child of R becomes the left child of Q and R becomes the right child
of P. Q becomes the right child of R. The left child of R remains intact.
Rotation 2: In this rotation, the left child of R becomes the right child of P. R becomes the root.
P becomes the left child of R.
The final balance tree after the double rotation is shown in Figure 10.12 (b).
Following is the algorithm for this double rotation:
/* The Tree points to the Pivot */
Algorithm rotateRL (Tree)
{
ptr = rightChild (Tree); /* point ptr to Q */
rotateLL (ptr); /* perform the first rotation */
ptr = rightChild (Tree); /* point ptr to R */
rotateRR (ptr);
}
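In C, the double rotation can be composed from the two single rotations. The sketch below is an illustration under assumptions: the node layout and the helper names rotate_right(), rotate_left() and rotate_rl() are hypothetical, and each function returns the new subtree root instead of modifying the tree through a global pointer.

```c
#include <stddef.h>

/* Hypothetical node layout; the field names are assumptions. */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

/* Single rotation to the right (the LL rotation). */
static struct node *rotate_right(struct node *p)
{
    struct node *q = p->left;

    p->left = q->right;
    q->right = p;
    return q;
}

/* Single rotation to the left (the RR rotation). */
static struct node *rotate_left(struct node *p)
{
    struct node *q = p->right;

    p->right = q->left;
    q->left = p;
    return q;
}

/* RL case: P is the pivot, Q its right child and R the left child of Q.
   Returns R, the new root of the subtree. */
struct node *rotate_rl(struct node *p)
{
    p->right = rotate_right(p->right);  /* rotation 1: R rises above Q      */
    return rotate_left(p);              /* rotation 2: R becomes the root   */
}
```

After rotation 1, R is the right child of P and Q is the right child of R; after rotation 2, P is the left child of R, exactly as in the two rotation steps described above.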
[Figure 10.12: RL double rotation about the pivot P, with right child Q and grandchild R. (a) Imbalance (b) After rotation: R is the root with children P and Q]
The trace of the above algorithm is shown in Fig. 10.13, wherein the AVL tree has become
imbalanced because of the insertion of node 92 in the left subtree of the right child of the pivot (see Fig. 10.13(a), (b)). After
the double rotation, the tree has again become height balanced as shown in Fig. 10.13(c), (d).
[Figure 10.13: (a) AVL tree (b) Imbalanced tree after insertion of 92 (c) The tree after first AVL rotation (d) The tree after second AVL rotation]
10.2 SETS
A set is an unordered collection of homogeneous elements. Each element occurs at most once with the
following associated properties:
• All elements belong to the Universe. The Universe is defined as “all potential elements of set”.
Note: The LAN is a set of nodes of a local area network of an educational institute spread over various
departments like computer engineering, mechanical engineering, electrical engineering, etc.
The basic terminology associated with sets is given below:
• Null set: If a set does not contain any element, then the set is called an empty or null set. It is
represented by Φ.
• Subset: If all elements of a set S1 are contained in set S2, then S1 is called a subset of S2. It is
denoted by S1 ⊆ S2.
Example: The set big_Cat = {Lion, Tiger, Panther, Cheetah, Jaguar} is a subset of the set
Cat_Family.
• Union of sets: The union of two sets S1 and S2 is obtained by collecting all members of either set
without duplicate entries. This is represented by S1 ∪ S2.
Example: If S1 = {1, 3, 4}, S2 = {4, 5, 6, 7, 8},
then S1 ∪ S2 = {1, 3, 4, 5, 6, 7, 8}.
• Intersection of sets: The intersection of two sets S1 and S2 is obtained by collecting common
elements of S1 and S2. This is represented by S1 ∩ S2.
Example: If S1 = {1, 3, 4}, S2 = {4, 5, 6, 7, 8},
then S1 ∩ S2 = {4}.
• Disjoint sets: If two sets S1 and S2 do not have any common elements, then the sets are called
disjoint sets, i.e., S1 ∩ S2 = Φ.
Example: S1 = {1, 3}, S2 = {4, 5, 6, 7, 8} are disjoint sets.
• Cardinality: Cardinality of a set is defined as the number of unique elements present in the set.
The cardinality of a set S is represented as |S|.
Example: If S1 = {1, 3, 4}, S2 = {4, 5, 6, 7, 8},
then |S1| = 3 and |S2| = 5.
• Equality of sets: If S1 is a subset of S2 and S2 is a subset of S1, then S1 and S2 are equal sets. This
means that all elements of S1 are present in S2 and all elements of S2 are present in S1. The
equality is represented as S1 ≡ S2.
Example: If S1 = {1, 3, 4}, S2 = {3, 1, 4},
then S1 ≡ S2.
• Difference of sets: Given two sets S1 and S2, the difference S1 − S2 is defined as the set having
all the elements which are in S1 but not in S2.
Example: If S1 = {1, 3, 4}, S2 = {3, 2, 5},
then S1 − S2 = {1, 4}.
The graphic representation of various operations related to sets is shown in Figure 10.14.
• Partition: It is a collection of disjoint sets belonging to same Universe. Thus, the union of all the
• Tree representation
List and hash table representations are more popular methods as compared to other methods.
A brief discussion on list and hash table representations is given in following sections.
10.2.1.1 List Representation A set can be very comfortably represented using a linear linked list. For
example, the set of fruits can be represented as shown in Figure 10.15.
[Figure 10.15: the set Fruit as a linked list: mango → orange → banana → grapes → peach → NULL]
Since the set is an unordered collection of elements, linked list is the most suitable representation
for the sets because elements can be dynamically added or deleted from a set.
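As an illustration, the list representation can be sketched in C as below. The type set_node and the functions if_member(), set_add() and set_size() are assumptions made for this sketch; set_add() refuses duplicates so that each element occurs at most once.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type for a set of strings kept as a linked list. */
struct set_node {
    const char *item;
    struct set_node *next;
};

/* Returns 1 if item is a member of the set, 0 otherwise. */
int if_member(const struct set_node *set, const char *item)
{
    for (; set != NULL; set = set->next)
        if (strcmp(set->item, item) == 0)
            return 1;
    return 0;
}

/* Adds item at the end of the list unless it is already present.
   Returns the head of the list, which changes only for an empty set. */
struct set_node *set_add(struct set_node *set, const char *item)
{
    struct set_node *n, *last;

    if (if_member(set, item))
        return set;                 /* no duplicate entries in a set */
    n = malloc(sizeof *n);
    if (n == NULL)
        return set;                 /* out of memory: set unchanged  */
    n->item = item;
    n->next = NULL;
    if (set == NULL)
        return n;
    for (last = set; last->next != NULL; last = last->next)
        ;
    last->next = n;
    return set;
}

/* Cardinality of the set. */
int set_size(const struct set_node *set)
{
    int n = 0;

    for (; set != NULL; set = set->next)
        n++;
    return n;
}
```

Membership testing is a linear scan, so every set_add() costs O(n); this is the price paid for the simplicity of the list representation.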
10.2.1.2 Hash Table Representation Hash table representation of a set is also a version of linked
representation. A hash table consists of storage locations called buckets. The size of a bucket is arbitrary,
i.e., it can hold any number of elements as it stores them into a linked list.
Consider the following set:
Tokens = {126, 235, 100, 317, 68, 129, 39, 423, 561, 222, 986}
Let us store the above set in a hash table of five buckets with hash function as given below:
H(K) = K mod 5
Where, K is an element;
H(K) gives the bucket number in which the element K is to be stored.
The arrangement is shown in Figure 10.16.
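[Figure 10.16: the set Tokens = {126, 235, 100, 317, 68, 129, 39, 423, 561, 222, 986} stored in five buckets with H(K) = K mod 5]
Bucket 0: 235 → 100 → NULL
Bucket 1: 126 → 561 → 986 → NULL
Bucket 2: 317 → 222 → NULL
Bucket 3: 68 → 423 → NULL
Bucket 4: 129 → 39 → NULL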
It may be noted that each bucket is having a linked list consisting of nodes containing a group of
elements from the set.
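The arrangement can be sketched in C as follows. This is a minimal illustration, assuming non-negative integer keys; the names hash_set, bucket_of(), hs_member() and hs_insert() are hypothetical and do not come from the text.

```c
#include <stdlib.h>

#define BUCKETS 5

/* Hypothetical types: each bucket heads a linked list of keys. */
struct hnode {
    int key;
    struct hnode *next;
};

struct hash_set {
    struct hnode *bucket[BUCKETS];
};

/* H(K) = K mod 5; keys are assumed to be non-negative. */
static int bucket_of(int k)
{
    return k % BUCKETS;
}

/* Returns 1 if k is in the set, 0 otherwise. */
int hs_member(const struct hash_set *s, int k)
{
    const struct hnode *p;

    for (p = s->bucket[bucket_of(k)]; p != NULL; p = p->next)
        if (p->key == k)
            return 1;
    return 0;
}

/* Appends k to the list of its bucket.
   Returns 1 if inserted, 0 if k was already present or memory ran out. */
int hs_insert(struct hash_set *s, int k)
{
    struct hnode **pp = &s->bucket[bucket_of(k)];

    while (*pp != NULL) {
        if ((*pp)->key == k)
            return 0;               /* a set holds no duplicates */
        pp = &(*pp)->next;
    }
    *pp = malloc(sizeof **pp);
    if (*pp == NULL)
        return 0;
    (*pp)->key = k;
    (*pp)->next = NULL;
    return 1;
}
```

Only one bucket is scanned per operation, so with an even spread of keys each lookup touches roughly one fifth of the elements.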
10.2.2.1 Union of Sets As per the definition, given two sets S1 and S2, the union of two sets is
obtained by collecting all the members of either set without duplicate entries. Let us assume that the sets
have been represented using lists as shown in Figure 10.17. The union set S3 of S1 and S2 has been
obtained by the following steps:
(1) Copy the list S1 to S3.
(2) For each element of S2, check if it is member of S1 or not and if it is not present in S1 then
attach it at the end of S3.
[Figure 10.17: the union of two sets S1 and S2, represented as linked lists, giving the list S3]
/* This function is used to copy a set to another empty set, i.e., Set1
to Set3 */
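The two union steps listed above can be sketched in C as shown below. The sketch assumes the list representation with string elements; the names set_node, if_member(), attach() and set_union() are illustrative assumptions, and both input lists are assumed to be free of duplicates.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type for a set of strings kept as a linked list. */
struct set_node {
    const char *item;
    struct set_node *next;
};

/* Returns 1 if item occurs in the list, 0 otherwise. */
static int if_member(const struct set_node *set, const char *item)
{
    for (; set != NULL; set = set->next)
        if (strcmp(set->item, item) == 0)
            return 1;
    return 0;
}

/* Attaches a new node holding item at the end of the list *head. */
static void attach(struct set_node **head, struct set_node **last,
                   const char *item)
{
    struct set_node *n = malloc(sizeof *n);

    if (n == NULL)
        return;                     /* out of memory: element is skipped */
    n->item = item;
    n->next = NULL;
    if (*last != NULL)
        (*last)->next = n;
    else
        *head = n;
    *last = n;
}

/* Builds S3 = S1 union S2 as a new list; S1 and S2 are left untouched. */
struct set_node *set_union(const struct set_node *s1,
                           const struct set_node *s2)
{
    struct set_node *s3 = NULL, *last = NULL;
    const struct set_node *p;

    for (p = s1; p != NULL; p = p->next)    /* step 1: copy S1 to S3     */
        attach(&s3, &last, p->item);
    for (p = s2; p != NULL; p = p->next)    /* step 2: add what S1 lacks */
        if (!if_member(s1, p->item))
            attach(&s3, &last, p->item);
    return s3;
}
```

With n elements in S1 and m in S2, the membership tests make this an O(n × m) operation.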
10.2.2.2 Intersection of Sets As per the definition, given two sets S1 and S2, the intersection of two sets
is obtained by collecting all the common members of both the sets. Let us assume that the sets have been
represented using lists as shown in Figure 10.18. The intersection set S3 of S1 and S2 has been obtained by the
following steps:
(1) Initialize S3 to Null;
(2) last = S3;
(3) For each element of S2, check if it is member of S1 or not and if it is present in S1 then attach it
at the end of S3.
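A C sketch of these steps is given below for illustration. It assumes the list representation with string elements; the names set_node, if_member() and set_intersection() are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type for a set of strings kept as a linked list. */
struct set_node {
    const char *item;
    struct set_node *next;
};

/* Returns 1 if item occurs in the list, 0 otherwise. */
static int if_member(const struct set_node *set, const char *item)
{
    for (; set != NULL; set = set->next)
        if (strcmp(set->item, item) == 0)
            return 1;
    return 0;
}

/* Builds S3 = S1 intersection S2 as a new list. */
struct set_node *set_intersection(const struct set_node *s1,
                                  const struct set_node *s2)
{
    struct set_node *s3 = NULL;         /* step 1: initialize S3 to NULL */
    struct set_node *last = NULL;       /* step 2: last node of S3       */
    const struct set_node *p;

    for (p = s2; p != NULL; p = p->next) {
        struct set_node *n;

        if (!if_member(s1, p->item))    /* step 3: keep common members   */
            continue;
        n = malloc(sizeof *n);
        if (n == NULL)
            break;                      /* out of memory: stop early     */
        n->item = p->item;
        n->next = NULL;
        if (last != NULL)
            last->next = n;
        else
            s3 = n;
        last = n;
    }
    return s3;
}
```

Replacing the test in step 3 by its negation, and scanning S1 against S2, gives the difference S1 − S2 in the same way.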
[Figure 10.18: the intersection of two sets S1 and S2, represented as linked lists; S3 holds the single common element Red]
[Figure: the difference of two sets S1 and S2, represented as linked lists, giving the list S3]
10.2.2.4 Equality of Sets As per the definition, given two sets S1 and S2, if S1 is subset of S2 and S2 is
subset of S1 then S1 and S2 are equal sets. Let us assume that the sets have been represented using lists.
The equality S1 ≡ S2 has been obtained by the following steps:
(1) For each element of S1, check if it is a member of S2 or not and if it is not present in S2 then
report failure.
(2) For each element of S2, check if it is a member of S1 or not and if it is not present in S1 then
report failure.
(3) If no failure is reported in step 1 and step 2, then infer that S1 ≡ S2.
The algorithm for equality of two sets S1 and S2 is given in the following:
/* This algorithm uses the function ifMember() which tests the membership
of an element in a given set */
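The equality test can be sketched in C as follows. The sketch uses integer elements to match the earlier example S1 = {1, 3, 4}, S2 = {3, 1, 4}; the names int_node, if_member(), is_subset() and set_equal() are assumptions, and both lists are assumed to contain no duplicates.

```c
#include <stddef.h>

/* Hypothetical node type for a set of integers kept as a linked list. */
struct int_node {
    int item;
    struct int_node *next;
};

/* Returns 1 if item occurs in the list, 0 otherwise. */
static int if_member(const struct int_node *set, int item)
{
    for (; set != NULL; set = set->next)
        if (set->item == item)
            return 1;
    return 0;
}

/* Returns 1 if every element of a is also a member of b. */
static int is_subset(const struct int_node *a, const struct int_node *b)
{
    for (; a != NULL; a = a->next)
        if (!if_member(b, a->item))
            return 0;               /* failure: element missing from b */
    return 1;
}

/* S1 and S2 are equal when each is a subset of the other. */
int set_equal(const struct int_node *s1, const struct int_node *s2)
{
    return is_subset(s1, s2) && is_subset(s2, s1);
}
```

The order of elements in the two lists does not matter, which is consistent with a set being an unordered collection.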
• P6, P12, P4
• P5
Fig. 10.21 The web pages visited by a user
From the above representation, it can be found
as to whether two elements belong to the same set or not. For example, P6 and P12 belong to the same set. A
click on page P11 would produce a list of links to P4, P7 and P9, grouped as “Similar” pages to P11.
(2) Colouring black and white photographs: This is an interesting application in which an old
black and white photograph is converted into a coloured one. The method is given below:
(i) The black and white photograph is digitized, i.e., it is divided into pixels of varying shades
of grey, i.e., ranging from white to black.
(ii) Pixels belonging to the same part or component are called equivalent pixels. For example,
pixels belonging to eyeballs are equivalent. Similarly, pixels belonging to the shirt of a
person are equivalent.
(iii) The equivalence is decided by comparing the grey levels of adjacent pixels.
(iv) In fact, 4 pixels are taken at a time to find the equivalent pixels among the adjacent pixels.
This activity is also called 4-pixel square scan.
(v) The scanning starts from left to right and top to bottom.
(vi) Each pixel is assigned a colour.
(vii) The equivalent pixels are given the same colour.
(viii) If two portions with different coloured pixels meet each other, then they are considered part
of the same component and both the portions are coloured with the same colour,
chosen from either of the colours, as shown in Figure 10.22.
[Figure 10.22: two differently coloured portions of adjacent pixels and their meeting point]
It may be noted that the equivalence relation between the pixels is obtained through set operations
where an equivalence relation R over a set S can be seen as a partitioning of S into disjoint sets.
(3) Spelling checker: A spelling checker for a document editor is another interesting application
of sets, especially the hash table representation. A normal dictionary is maintained using hash
table representation of sets. A simple-most dictionary (Dict.) will have 26 buckets because
there are 26 alphabets in English language. Similarly, the words of a document (Doc) would
also be represented in the same fashion as shown in Figure 10.23. A difference operation Doc
– Dict would produce all those words which are in document but not in dictionary, i.e., the
misspelled words.
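The Doc − Dict operation might be sketched in C as follows. This is only an illustration under stated assumptions: words are lowercase strings beginning with an English letter, the letters 'a' to 'z' are contiguous in the character set (true for ASCII), and the names word_set, ws_add(), ws_member() and ws_difference() are hypothetical. The caller supplies the storage for each node, so no dynamic allocation is needed.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define NBUCKETS 26     /* one bucket per letter of the English alphabet */

struct word_node {
    const char *word;
    struct word_node *next;
};

struct word_set {
    struct word_node *bucket[NBUCKETS];
};

/* The bucket number is derived from the first letter of the word. */
static int bucket_of(const char *w)
{
    return tolower((unsigned char)w[0]) - 'a';
}

/* Links the caller-supplied node n, holding word w, into its bucket. */
void ws_add(struct word_set *s, struct word_node *n, const char *w)
{
    int b = bucket_of(w);

    n->word = w;
    n->next = s->bucket[b];
    s->bucket[b] = n;
}

/* Returns 1 if w is in the set, 0 otherwise. */
int ws_member(const struct word_set *s, const char *w)
{
    const struct word_node *p;

    for (p = s->bucket[bucket_of(w)]; p != NULL; p = p->next)
        if (strcmp(p->word, w) == 0)
            return 1;
    return 0;
}

/* Doc - Dict: stores in out[] up to max words of doc that are not in dict
   and returns how many were stored. */
size_t ws_difference(const struct word_set *doc, const struct word_set *dict,
                     const char *out[], size_t max)
{
    size_t n = 0;
    int b;
    const struct word_node *p;

    for (b = 0; b < NBUCKETS; b++)
        for (p = doc->bucket[b]; p != NULL; p = p->next)
            if (n < max && !ws_member(dict, p->word))
                out[n++] = p->word;
    return n;
}
```

Each word of the document is compared only with the dictionary words in the bucket of its own first letter, which is the saving the hash table representation offers over a single list.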
[Figure 10.23: hash table representations of Dict and Doc, and the result of Doc − Dict holding the misspelled words]
Now, the question is how to make efficient search operation on sorted linked lists? In 1990, Bill
Pugh proposed an enhancement on linked lists and the new data structure was termed as skip list.
A skip list is basically a sorted linked list in ascending order. Now, extra links are added to selected
nodes of the linked list so that some nodes can be skipped while searching within the linked list so that
the overall search time is reduced. Consider the normal linked list shown in Figure 10.24. Let us call this
list as level 0.
[Figure 10.24: a sorted linked list (level 0): 2 → 5 → 8 → 13 → 34 → 53 → 55]
It may be noted that the time complexity of various operations such as search, insertion, deletion, etc.
on a sorted linked list is O(N).
Let us now add extra links in such a manner that every alternate node is skipped. The arrangement
is shown in Figure 10.25. The chain of extra links is called level 1. The head node is said to have height
equal to 1.
[Figure 10.25: the list 2, 5, 8, 13, 34, 53, 55 with level 0 links and a chain of level 1 links starting at the head node]
Now, we carry on to add another level of links, i.e., level 2 wherein every alternate node of level 1 is
skipped as shown in Figure 10.26. The height of head node has become 2.
The levels of links are added till we reach a stage where the link from head node has reached almost
to the middle element of the linked list as has happened in Figure 10.26.
[Figure 10.26: the list 2, 5, 8, 13, 34, 53, 55 with links at level 0, level 1 and level 2 starting at the head node]
It may be noted that in level 1, links point to every second element in the list. In level 2, the links point to
every fourth element in the list. Thus, it can be deduced that a link in the ith level will point to every (2^i)th
element in the list. A point worth noting is that the last node of the skip list is pointed to by links of all levels.
Now the search within the skip list shall take O(log n) time as it will follow binary search pattern.
For example, the search for ‘53’ will follow the route shown in Figure 10.27.
[Figure 10.27: the skip list 2, 5, 8, 13, 34, 53, 55 with the links followed from the head node, through levels 2, 1 and 0, marked]
Fig. 10.27 The path followed for searching ‘53’ in the list
The algorithm that searches a value VAL in the skip list is given below:
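Algorithm skipListSearch (VAL)
{
curPtr = Head;
/* N is the highest level; NEXT (curPtr, I) is the level I link of curPtr */
for (I = N; I >= 0; I−−)
{
while (NEXT (curPtr, I) != NULL && DATA (NEXT (curPtr, I)) < VAL)
curPtr = NEXT (curPtr, I);
if (NEXT (curPtr, I) != NULL && DATA (NEXT (curPtr, I)) == VAL) return success;
}
return failure;
}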
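The search can be sketched in C as follows. The sketch assumes that every node carries an array of forward links, one per level, with unused levels set to NULL, and that the head is a sentinel node holding no data; the names skip_node, MAX_LEVEL and skip_search() are illustrative assumptions.

```c
#include <stddef.h>

#define MAX_LEVEL 3     /* levels 0, 1 and 2, as in the example list */

/* Hypothetical node type: next[i] is the link at level i (NULL if absent). */
struct skip_node {
    int data;
    struct skip_node *next[MAX_LEVEL];
};

/* Searches val starting from the sentinel head node.
   Returns the node holding val, or NULL if val is not in the list. */
const struct skip_node *skip_search(const struct skip_node *head, int val)
{
    const struct skip_node *cur = head;
    int lvl;

    for (lvl = MAX_LEVEL - 1; lvl >= 0; lvl--) {
        /* move right while the next key on this level is still too small */
        while (cur->next[lvl] != NULL && cur->next[lvl]->data < val)
            cur = cur->next[lvl];
        if (cur->next[lvl] != NULL && cur->next[lvl]->data == val)
            return cur->next[lvl];
        /* otherwise drop down one level and continue from cur */
    }
    return NULL;
}
```

For the list 2, 5, 8, 13, 34, 53, 55, a search for 53 moves from the head to 13 on level 2, finds that the next level 2 key (55) is too large, drops to level 1 and reaches 53 directly.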
It may be noted that insertion and deletion operations in the list will disturb the balance of the skip
list and many levels of pointer adjustment would be required to redistribute the levels of pointers and
heights of head and intermediate nodes.
10.4 B-TREES
A binary search tree (BST) is an extremely useful data structure as far as search operations on static
data items is concerned. The reason is that static data can be so arranged that the BST generated out
of the data is almost height balanced. However, for very large data stored in files, this data structure
becomes unsuitable in terms of number of disk accesses required to read the data from the disk.
The remedy is that each node of BST should store more than one record to accommodate the complete
data of the file. Thus, the structure of BST needs to be modified. In fact, there is a need to extend the
concept of BST so that huge amount of data for search operations could be easily handled.
We can look upon a BST as a two-way search tree in which each node (called root) has two subtrees—
left subtree and right subtree, with following properties:
(1) The key values of the nodes of the left subtree are always less than the key value of the root node.
(2) The key values of the nodes of the right subtree are always more than the key value of the root node.
[Figure: a multiway search tree with the keys 45 and 76 in the root and the leaves (12, 25), (55, 72) and (85, 98)]
Consider the B-tree given in Figure 10.30. It is of order 5 because all the internal nodes have at
least ⌈5/2⌉ = 3 children and two keys. In fact, the maximum number of children and keys are 5 and 4,
respectively. As per rule number 3, each leaf node must contain at least ⌈5/2⌉ − 1 = 2 keys.
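[Figure 10.30: a B-tree of order 5. Root: (70). Internal nodes: (10, 25) and (79, 85, 98). Leaves: (4, 7), (12, 23), (34, 47, 69), (72, 75), (82, 84), (87, 92), (110, 125)]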
It may be noted that the node of a B-tree, by its nature, can accommodate more than one key. We
may design the size of a node to be as large as a block on the disk (say a sector). As a block is read at a
time from the disk, fewer disk accesses would be required to read/write data from the disk
compared to a normal BST.
The following operations can be carried out on a B-tree:
(1) Searching a B-tree
(2) Inserting a new key in B-tree
(3) Deleting a key from a B-tree
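As an illustration of the first operation, a search in a B-tree of order 5 might be sketched in C as below. The node layout (btree_node with nkeys, key[] and child[]) and the function name btree_search() are assumptions made for this sketch; the nodes are held in memory here, whereas in a file-based B-tree each step down the tree would read one block from the disk.

```c
#include <stddef.h>

#define ORDER 5     /* at most 5 children and 4 keys per node */

/* Hypothetical node layout; child[i] leads to the keys lying between
   key[i-1] and key[i]. All children are NULL in a leaf node. */
struct btree_node {
    int nkeys;
    int key[ORDER - 1];
    struct btree_node *child[ORDER];
};

/* Searches val in the B-tree rooted at node.
   Returns the node containing val and stores its index in *pos,
   or returns NULL if val is not present. */
const struct btree_node *btree_search(const struct btree_node *node,
                                      int val, int *pos)
{
    while (node != NULL) {
        int i = 0;

        while (i < node->nkeys && val > node->key[i])
            i++;                        /* find the first key >= val   */
        if (i < node->nkeys && val == node->key[i]) {
            if (pos != NULL)
                *pos = i;
            return node;
        }
        node = node->child[i];          /* descend into the ith child  */
    }
    return NULL;
}
```

For the B-tree of Figure 10.30, a search for 82 visits the root (70), then the node (79, 85, 98), and finally the leaf (82, 84), i.e., three nodes in all.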
[Figure: the path followed by Search (82) in the B-tree of Figure 10.30: (70), then (79, 85, 98), then the leaf (82, 84)]
The trace of the above algorithm on the list of numbers is given in Figures 10.32 and 10.33.
[Figures 10.32 and 10.33: trace of successive insertions into a B-tree, beginning with Insert (3), Insert (14), Insert (7), Insert (1) and Insert (8), and continuing up to Insert (26)]
[Figures: traces of deletions from the B-tree, including Delete (20)]
Note: From the above discussion, it is clear that the insertion, and especially deletion, is a complicated
operation in B-trees. Therefore, the size of leaf nodes is kept as large as 256 or more so that insertion and
deletion do not demand rearrangement of nodes.
[Figure: trace of Delete (19) and the rearrangement of nodes that follows it]
• Binary search
On average, the linear search requires O(n) comparisons to search an element in a list. On the other
hand, the binary search takes O(log n) comparisons to search an element in the list but it requires that the
list already be sorted.
Some applications such as direct files, require the search to be of order O(1) in the best case i.e. the
time of search should be independent of the location of the element or key in the storage area. In fact,
in this situation a hash function is used to obtain a unique address for a given key or a token. The term
‘hashing’ means key transformation. The address for a key is derived by applying an arithmetic function
called ‘hash’ function on the key itself. The address so generated is used to store the record of the key as
shown in Fig. 10.38.
[Fig. 10.38: the hash function F maps the key k to the storage address A = F(k), at which the record is stored on the storage device]
It may be observed from the examples given above that there are chances that the records with
different key values may hash to the same address. For example, in the folding technique, the keys 123529164
and 529164123 will generate the same address, i.e., 816. Such mapping of keys to the same address is
known as a collision and the keys are called synonyms.
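The collision quoted above can be reproduced with a small C sketch of folding. The sketch assumes that the key is cut into groups of three decimal digits which are then added, and that any carry beyond three digits is discarded; the function name fold_hash() is hypothetical.

```c
/* Folding: cut the key into 3-digit groups from the right and add them.
   Assumption: a carry beyond three digits is dropped (sum mod 1000). */
unsigned fold_hash(unsigned long key)
{
    unsigned long sum = 0;

    while (key > 0) {
        sum += key % 1000;      /* take the lowest three digits */
        key /= 1000;            /* and discard them             */
    }
    return (unsigned)(sum % 1000);
}
```

For the key 123529164 the groups are 123, 529 and 164, whose sum is 816; the key 529164123 consists of the same three groups in a different order and therefore folds to the same address.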
Now, to manage the collisions, the overflowed keys must be stored in some other storage space
called overflow area. The procedure is described below:
• When a record is to be stored, a suitable hashing function is applied on the key of the record and
an address is generated. The storage area is accessed, and, if it is unused, the record is stored there.
If there is already a record stored, the new record is written in the overflow area.
• When the record is to be retrieved, the same process is repeated. The record is checked at the
generated address. If it is not the desired one, the system looks for the record in the overflow area
and retrieves the record from there.
Thus, synonyms cause loss of time in searching records, as they are not at the expected address.
It is therefore essential to devise hash algorithms which generate minimum number of collisions. The
important features required in a hash algorithm are given in the following section.
afterwards.
• Even distribution: The records of a file should be evenly distributed throughout the allocated
storage space.
• Minimum collisions: It should generate unique addresses for different keys so that number of
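[Figure: chained hash table. Address 0: 0 → 10 → 20 → 30 → NULL; addresses 1 and 2: empty;
address 3: 23 → 33 → 23 → NULL; address 7: 27 → 77 → NULL; address 9: 19 → 9 → 59 → NULL.]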
Assume that (K mod 10) is taken as the hash function. The keys with values 0, 10, 20, 30, etc. then
map to the same address, i.e., 0, and their associated records are stored in a linked list of nodes
as shown in the figure. Similarly, 19, 9 and 59 also map to the same address, i.e., 9, and their associated
records are stored in a linked list of nodes.
It may be noted that when a key K is searched in the above arrangement and is not found in the
head node, it is searched in the attached linked list. Since the linked list is a purely sequential data
structure, the search becomes sequential. A poor hash function will generate many collisions, leading
to the creation of long linked lists. Consequently, the search will become very slow. The remedy for this
drawback of chaining is rehashing.
2. Rehashing: When the address generated by a hash function F1 for a key Kj collides with that of another
key Ki, another hash function F2 is applied to the address to obtain a new address A2. The collided key
Kj is then stored at the new address A2, as shown in Fig. 10.40.
[Fig. 10.40: the key (kj) is passed to hash function 1, F1(kj), giving storage address (A1); hash function 2,
F2(A1), then gives storage address (A2), at which the record is stored on the storage device.]
It may be noted that if the space referred to by A2 is also occupied, then the process of rehashing is
repeated. In fact, rehashing is applied repeatedly to the intermediate address until a free location is
found where the record can be stored.
EXERCISES
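[Fig. 10.41: an AVL tree containing the keys 50, 75, 40, 20, 60, 85 and 90; the layout of the tree is not
recoverable here.]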
5. For the AVL Tree shown in Fig. 10.41, draw the final AVL Tree that would result from performing
the actions indicated from (a) to (e) listed below:
a. Insert the key 10.
b. Insert the key 95.
c. Insert the key 80, and then insert the key 77.
d. Insert the key 80, and then insert the key 83.
e. Insert the key 45.
Note: Always start with the given tree; the questions are not cumulative. Also, show both the
data value and the balance factor for each node.
6. Define the set data structure. How are sets different from arrays? Give examples to show the
applications of the set data structure.
7. Write an algorithm that computes the height of a tree.
8. Write an algorithm that searches a key K in an AVL Tree.
9. What are the various cases of insertion of a key K in an AVL Tree?
10. Define the terms: null set, subset, disjoint set, cardinality of a set, and partition.
11. What are the various methods of representing sets?
12. Write an algorithm that takes two sets S1 and S2 and produces a third set S3 such that
S3 = S1 ∪ S2.
13. Write an algorithm that takes two sets S1 and S2 and produces a third set S3 such that
S3 = S1 ∩ S2.
14. Write an algorithm that takes two sets S1 and S2 and produces a third set S3 such that
S3 = S1 − S2.
15. How are web pages managed by a search engine?
16. Give the steps used for colouring a black and white photograph.
17. Write a short note on spell checker as an application of sets.
18. You are given a set of persons P and their friendship relation R, that is, (a, b) ∈ R if a is a friend
of b. You must find a way to introduce person x to person y through a chain of friends. Model
this problem with a graph and describe a strategy to solve the problem.
19. The subset-sum problem is defined as follows:
Input: a set of numbers A = {a1, a2, . . . , aN} and a number x;
Output: 1 if there is a subset of numbers in A that add up to x.
Write down the algorithm to solve the subset-sum problem.
20. What is a skip list? What is the necessity of adding extra links?
21. Write an algorithm that searches a key K in a skip list.
22. What is an m-way tree?
23. Define a B-tree. What are its properties?
24. Write an algorithm for searching a key K, in a B-tree.
25. Write an algorithm for inserting a key K, in a B-tree.
26. Write an explanatory note on deletion operation in a B-tree.
27. What are the advantages of a B-tree?
Appendix A: ASCII Codes (Character Sets)
A.1 ASCII (American Standard Code for Information Interchange) Character Set
The code of a character is the sum of its row label and its column label; sp denotes the blank (space).
      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
  0
 16
 32  sp   !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
 48   0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
 64   @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
 80   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
 96   `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
112   p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~
Codes 00–31 and 127 are nonprintable control characters.
A.2 EBCDIC (Extended Binary Coded Decimal Interchange Code) Character Set
      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
  0
 16
 32
 48
 64  sp                                       ¢   .   <   (   +   |
 80   &                                       !   $   *   )   ;   ¬
 96   -   /                                       ,   %   _   >   ?
112                                           :   #   @   '   =   "
128       a   b   c   d   e   f   g   h   i
144       j   k   l   m   n   o   p   q   r
160           s   t   u   v   w   x   y   z
176
192       A   B   C   D   E   F   G   H   I
208       J   K   L   M   N   O   P   Q   R
224           S   T   U   V   W   X   Y   Z
240   0   1   2   3   4   5   6   7   8   9
Appendix B: Table of Format Specifiers
Format specifier    Meaning
c                   data item is a single character
d                   data item is a decimal integer
e                   data item is a floating point value
f                   data item is a floating point value
g                   data item is a floating point value
h*                  data item is a short integer
i                   data item is a decimal, hexadecimal or octal integer
o                   data item is an octal integer
s                   data item is a string followed by a white space
u                   data item is an unsigned decimal integer
x                   data item is a hexadecimal integer
[…]*                data item is a string which may include white space characters
*Only applicable for data input, i.e., for the scanf() function.
Appendix C: Escape Sequences
[Figures: a sequence of tree diagrams whose nodes carry characters and numeric weights; the layout of
the diagrams is not recoverable here.]
Index
& operator, 197–198 searching a key in, 482–483 circular queue, 176–181
‘*’ operator, 198 searching in, 307–308 collision handling. See overflow
binary trees, 285–298 management
A binary search tree 303–319 colouring of maps, 408–409
accumulators, 83 complete binary tree, 285 column major order, 131
activity ratio, 448–449 expression tree, 298–303 comma operator, 18
acyclic graph, 368 full binary tree, 285 comments, 10–11
adjacency matrix, 368 heap trees, 319–340 comparison-based algorithm, 109
adjacent node, 368 linked representation of, 289–291 complete binary tree, 285
advanced data structures, 459–492 threaded binary trees, 340–352 complete graph, 368
algorithm, 78–90 traversal of, 291–298 compound statement, 21
analysis of, 85–86 weighted binary trees, 352–360 condensed matrix, 142
Big-Oh notation, 86–90 bitwise shift operator, 19–20 conditional operators, 16–17
design of, 78 breadth first search (BFS), 390–396 conditional statements, 21–25
stepwise refinement, 80–81 break statement, 27 if statement, 21–22
using control structures, 81–85 B-trees. See binary search tree if-else statement, 22
ways to develop, 78–80 (B-trees) nested if statements, 22–23
archival file, 414 chaining, 491–492 switch statement, 23–25
arithmetic operators, 13–14 rehashing, 492 connected graph, 367
unary arithmetic operators, 13–14 bubble sort, 113–116 const correct, 7
array of pointers, 208 analysis of, 113–116 const qualifier, 7
arrays, 34, 93–149 buffered I/O, 31–32 constants, 7
applications of, 138–149 character constant, 7
multi-dimensional arrays, 130–134 C floating point constant, 7
one-dimensional array, 94–130 C, 1–70 integer constant, 7
pointers and, 203–207 characters used in, 2 continue statement, 27
representation in physical memory, comments, 10–11 control structures, 81–85
134–138 console I/O functions in, 417 counters, 83
two-dimensional arrays, 136–137 data types, 2–4 cycle, 367
assignment operator, 18–19 defining a structure, 35–36
AVL trees, 459–467 escape sequence, 11–13 D
inserting a node in, 462–467 flow of control, 20–30 dangling pointers, 202–203
searching of, 461 history of, 1 data, 412
low-level I/O functions in, 418 data element, 77
B operators and expressions, 13–20 data object, 77
back up file, 414 stream I/O functions in, 417 data structures, 73–78
backslash character constants. See structure of, 8 concept of, 73–78
escape sequence tokens, 4–7 terminology related with, 77–78
BFS. See breadth first search (BFS) C tokens, 4–7 types of, 76–77
Big-Oh notation, 86–90 constants, 7 data tracks, 448
binary recursion, 61–65 identifiers, 4–5 data types, 2–4
binary search tree (B-trees), 303–319, keywords, 5 character data type (char), 3
480–489 variables, 5–6 floating data type (float), 3–4
advantages of, 487–489 call by reference, 49–51, 53–55 integer data type (int), 2
creation of, 304–306 call by value, 47–49, 52–53 user-defined data types, 39–42
deleting a key in, 484–487 character constant, 7 degree, 284–285
deletion of a node from, 312–319 character data type (char), 3 deletion operation, graph, 379–384
inserting a key in, 483–484 child, 284 deletion, 107–109
insertion into, 308–312 circular linked list, 254–258 Dennis Ritchie, 1
depth first search (DFS), 384–390 for loop, 26 index area, 448
depth, 284 formatted file I/O operations, indexed files, 448
deque, 185–195 425–426 advantages of, 448
deterministic loops, 25 full binary tree, 285 disadvantages of, 448
DFS. See depth first search (DFS) functions, 43–56 indexed sequential organization,
difference of sets, 474–475 calling, 45–47 445–448
Dijkstra’s algorithm, 404–407 parameter passing in, 47–51 addition/deletion of record,
diminishing step sort, 128 passing structures to, 52–56 446–447
direct file organization, 442–445 prototypes, 44–45 multilevel indexed files, 448
direct file, 444–445 returning values from, 52 searching a record, 445–446
advantages of, 444–445 storage devices for, 447–448
directed graph, 366 G infinite loop, 83
division method, 490 getc ( ) function, 33 infix method, 158–159
doubly linked list, 258–263 getchar ( ) function, 32–33 information, 412–413
do-while loop, 25–26 getche ( ) function, 33 inorder traversal, 292–295
dummy nodes, 264–266 gets ( ) function, 34 input/output functions (I/O), 30–34
concept of, 264–266 goto statement, 28–30 buffered I/O, 31–32
dynamic allocation, 210–220 graph single character functions, 32–33
dynamic data structure, 152 applications of, 408–410 string-based functions, 33–34
dynamic dictionary coding, 360–362 array-based representation of, insertion, 105–107
368–371 insertion operation, graph, 374–378
E linked representation of, 371–373 insertion sort, 117–119
edge, 284 operations of, 373–408 analysis of, 117–119
enumerated data types, 40–42 set representation of, 373 integer, 2
equality of sets, 475–476 integer constant, 7
escape sequence, 11–13 H integer data type (int), 2
exit ( ) function, 27–28 hash table representation, 470 integration, 450
expression tree, 298–303 hashing, 489–492 internal node, 284, 353
expressions, 17 searching by, 489–492 internal path length, 353–354
order of evaluation of, 17 hashing algorithms, 491 intersection of sets, 473–474
extended binary tree, 353 requirements of, 491 iterative statements, 25–27
external node, 353 hashing functions, 490–491 break statement, 27
external path length, 353 types of, 490–491 continue statement, 27
heap trees, 319–340 do-while loop, 25–26
F deletion of a node from, 324–328 for loop, 26
file, 413 heap sort, 329–336 while loop, 25
concepts of, 413–415 insertion of a node into, 320–324
in C, 416 merging of two, 336–340 K
opening of, 418–419 representation of, 320 key field, 415
stream and, 416–418 height, 284 key transformation, 489
types of, 414 height branched binary tree, 460 keywords, 5
working with files using stream Huffman algorithm, 352–360 Kruskal’s algorithm, 397–399
I/O, 418–430 Huffman codes, 356–360
file opening modes, 419 L
file organization, 415–416 I linear data structures, 76–77
finite loop, 83 I/O operations linear linked list, 228
floating data type (float), 3–4 formatted file I/O operations, linear recursion, 61
floating point constant, 7 425–426 linked list, 227–280
flow of control, 20–30 reading or writing blocks of data in circular linked list, 254–258
compound statement, 21 files, 426–430 deleting a node from, 250–253
conditional statements, 21–25 unformatted file I/O operations, doubly linked list, 258–263
exit() function, 27–28 419–425 insertion in, 243–250
goto statement, 28–30 I/O stream, 418–430 operations on, 231
nested loops, 28 identifiers, 4–5 searching, 241–243
Floyd’s algorithm, 404 if statement, 21–22 travelling, 236–241
folding method, 490–491 if-else statement, 22 variations of, 253–263
linked queues, 270–273 pointer variables, 198–203 S
linked stacks, 267–269 dangling pointers, 202–203 scanf ( ) functions, 9–10
linked storage, 274 pointers, 197–223 read data from keyboard using, 10
list representation, 469 & operator, 197–198 scene graphs, 409–410
long integer, 2 ‘*’ operator, 198 searching, 98–104
loop, 368 arrays and, 203–207 selection, 96–98
pointer variables, 198–203 selection sort, 109–113
M structures and, 208–210 analysis of, 113
master file, 414 polish notation. See prefix expression selective execution. See conditional
matrix arrays, 130 postfix expression, 158 statements
memory bleeding, 212 evaluation of, 164–170 self referential structures, 215–220,
merge sort, 119–124 postorder transversal, 297–298 231–236
analysis of, 119–124 post-test loop, 26 sequential file organization, 430–442
midsquare method, 490 prefix expression, 157 appending a sequential file,
minimum cost spanning trees, preorder traversal, 296–297 creating a sequential file, 430
396–401 pre-test loop, 25 creating a sequential file, 430
model of www, 408 Prim’s algorithm, 399–401 reading and searching a sequential
multi-dimensional arrays, primitive data types, 83 file, 431
130–134 printf ( ) functions, 9–10 updating a sequential file, 437–442
multigraphs, 368 display data using, 9–10 sequential storage, 274
multilevel indexed files, 448 priority queue, 181–185 sets, 468–478
putc ( ) function, 33 application of, 476–478
N putch ( ) function, 33 difference of sets, 474–475
nested if statements, 22–23 putchar ( ) functions, 32–33 equality of sets, 475–476
nested loops, 28 puts ( ) function, 34 intersection of sets, 473–474
nested structures, 38–39 operation on, 470–476
non-deterministic loops, 25 Q representation of, 469
non-linear data structures, 76–77 queues, 170–195 union of sets, 471–473
circular queue, 176–181 shell sort, 127–130
O deque, 185–195 analysis of, 127–130
one-dimensional array, 94–130 linked queues, 270–273 short integer, 2
deletion, 107–109 operations, 171–176 shortest path problem, 401–407
insertion, 105–107 priority queue, 181–185 sibling, 284
physical address computation of quick sort, 124–127 simple graph, 368
elements of, 135 analysis of, 124–127 single character functions, 32–33
searching, 98–104 singleton, 119
selection, 96–98 R sinking sort, 116
sorting, 109–113 radix sort, 130 sizeof operator, 18
traversal, 95–96 recursion, 56–68 skip list, 478–480
operators, 13–20 types of, 60–65 sort file, 414
arithmetic operators, 13–14 reference file, 414 sorting, 109–113
assignment operator, 18–19 relational and logical operators, bubble sort, 113–116
bitwise shift operator, 19–20 14–16 insertion sort, 117–119
conditional operators, 16–17 relative file, 444 merge sort, 119–124
relational and logical operators, characteristics of, 444 quick sort, 124–127
14–16 repetitive execution. See iterative selection sort, 109–113
special operators, 18 statements shell sort, 127–130
overflow area, 448 report file, 414 spanning trees, 396–401
overflow management, 491–492 resource allocation graph, 408 sparse graph, 396
chaining, 491–492 response time, 450 sparse matrix addition, 143–147
rehashing, 492 reverse polish notation. See postfix sparse matrix representation,
expression 141–149
P roll field, 415 sparse matrix addition, 143–147
parallel edges, 368 root, 284 sparse matrix transpose, 147–149
parent, 284 row major order, 131 sparse matrix transpose, 147–149
path, 284, 366–367 rvalue, 6 special operators, 18
comma operator, 18 T union of sets, 471–473
sizeof operator, 18 threaded binary trees, 340–352 unions, 42
spelling checker, 478 deletion from, 348–352 unsigned integer, 2
stack method, 159 insertion into, 344–347 user-defined data types, 39–42
stacks, 151–170 time complexity, 86 enumerated data types, 40–42
applications of, 155–170 Tower of Hanoi, 65–67
linked stacks, 266–269 transaction file, 414 V
operations, 152–155 traversal of graph, 384–396 variables, 5–6
statement label, 29 traversal, 95–96 vertex, 368
stored program computer, 72 trees, 282–362 volatility, 450
stream, 416–418 terminology related to, 284–285 Von Neumann architecture, 72
stream I/O, 418–430 two-dimensional arrays, 136–137
string-based functions, 33–34 physical address computation of W
structures, 34–39 elements of, 136–137 Warshall’s algorithm, 401–403
array of, 36 web linked pages, 476–477
assignment of, 37–38 U weighted binary trees, 352–360
initializing, 36 unary arithmetic operators, 13–14 Huffman algorithm and, 352–360
nested structures, 38–39 unconditional branching. See goto weighted graph, 366
pointers and, 208–210 statement while loop, 25
switch statement, 23–25 undirected graph, 366 world wide web, 408
unformatted file I/O operations,
419–425