0% found this document useful (0 votes)
228 views122 pages

ADS Notes

Uploaded by

vahini
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
228 views122 pages

ADS Notes

Uploaded by

vahini
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 122

Advanced Data Structures through C++

II B.Tech I SEM CODE: A63K2

Pre-requisites: Basic knowledge in Programming.

Objectives:
1. To introduce and practice advanced algorithms and programming techniques necessary for developing sophisticated
computer application programs
2. To learn new techniques for solving specific problems more efficiently and for analyzing space and time requirements.
3. To write programs in C to solve problems using data structures such as arrays, linked lists, stacks, queues, Trees, graphs,
hash tables, search Trees.

Unit I: The Fundamentals of Object-Oriented Programming: Necessity for OOP, Data Hiding, Data Abstraction, Encapsulation,
Procedural Abstraction, Class and Object, Scope of Class and Scope Resolution Operator, Member Function of a Class,
private, protected, and public Access Specifier.

Unit II Priority Queues: Definition, realizing a Priority Queue using Heaps, Insertion, Deletion, Heap sort, External Sorting.
Dictionaries: Linear List Representation, Skip List Representation, Operations- Insertion, Deletion and Searching.

Unit III: Hashing: Hash Table Representation, Hash Functions, Collision Resolution Separate
Chaining, Open Addressing-Linear Probing, Quadratic Probing, Double Hashing, Rehashing,
Extendible Hashing, Comparison of Hashing and Skip Lists.
Unit IV: Trees: Basic Terminology, Binary Tree, Array and Linked List Representations, Traversals, Threaded Binary Trees.
Search Trees (Part I): Binary Search Trees- Definition (ADT), Implementation, Operations- Searching, Insertion and Deletion,
AVL Trees, Definition, Operations – Insertion and Searching. Search Trees (Part II): B-Trees- Definition, B-Tree of Order m,
Insertion, Deletion and Searching, Red black Trees and Splay Trees.

Unit V: Graphs: Basic Terminology, Representations of Graphs, Graph Search Methods: DFS, BFS. Text Processing Pattern
Matching Algorithms: Brute Force, The Knuth-Morris-Pratt Algorithm, Tries: Standard Tries, Compressed Tries, Suffix Tries.

Course Outcomes: Upon the successful completion of the course, the student will be able:
1. Analyze performance of algorithms with respect to complexities, trace and code recursive functions, understanding of
various searching algorithms.
2. Understand operations on linear data structures such as stacks, queues to solve various computing problems and
Implement these data structures in more than one manner.
3. Classify different linked lists and their operations, Compare different implementations to recognize the advantages and
disadvantages.
4. Understand the linked implementation use in non-linear data structures such as Binary Tree and Binary Search Tree and
Tree traversals such as in, pre, post.
5. Understand graph types and their representations. Know the usage of graph traversal algorithms.
Textbooks:
1.Data Structures, Algorithms and Applications in C++, S.Sahni, University Press (India) Pvt.Ltd, Second edition, Universities
Press Orient Longman Pvt. Ltd.
2. Data Structures and Algorithm Analysis in C++, Mark Allen Weiss, Pearson Education. Ltd., Second Edition.

Why OOP?
Suppose that you want to assemble your own PC, you go to a hardware store and pick up a motherboard,
a processor, some RAMs, a hard disk, a casing, a power supply, and put them together. You turn on the
power, and the PC runs. You need not worry whether the motherboard is a 4-layer or 6-layer board,
whether the hard disk has 4 or 6 plates; 3 inches or 5 inches in diameter, whether the RAM is made in
DS Using C++ Page 1
Japan or Korea, and so on. You simply put the hardware components together and expect the machine to
run. Of course, you have to make sure that you have the correct interfaces, i.e., you pick an IDE hard
disk rather than a SCSI hard disk, if your motherboard supports only IDE; you have to select RAMs with
the correct speed rating, and so on. Nevertheless, it is not difficult to set up a machine from
hardware components.
Similarly, a car is assembled from parts and components, such as chassis, doors, engine, wheels, brake,
and transmission. The components are reusable, e.g., a wheel can be used in many cars (of the same
specifications).
Hardware, such as computers and cars, are assembled from parts, which are reusable components.
How about software? Can you "assemble" a software application by picking a routine here, a routine
there, and expect the program to run? The answer is obviously no! Unlike hardware, it is very difficult to
"assemble" an application from software components. Since the advent of computer 60 years ago, we
have written tons and tons of programs. However, for each new application, we have to re-invent the
wheels and write the program from scratch.
Why re-invent the wheels?
1.1 Traditional Procedural-Oriented languages

Can we do this in traditional procedural-oriented programming language such as C, Fortran, Cobol, or


Pascal?
Traditional procedural-oriented languages (such as C and Pascal) suffer some notable drawbacks in
creating reusable software components:
1. The programs are made up of functions. Functions are often not reusable. It is very difficult to
copy a function from one program and reuse in another program because the the function is
likely to reference the headers, global variables and other functions. In other words, functions are
not well-encapsulated as a self-contained reusable unit.
2. The procedural languages are not suitable of high-level abstraction for solving real life problems.
For example, C programs uses constructs such as if-else, for-loop, array, function, pointer, which
are low-level and hard to abstract real problems such as a Customer Relationship Management
(CRM) system or a computer soccer game. (Imagine using assembly codes, which is a very low
level code, to write a computer soccer game. C is better but no much better.)
In brief, the traditional procedural-languages separate the data structures and algorithms of the software
entities.

DS Using C++ Page 2


In the early 1970s, the US Department of Defense (DoD) commissioned a task force to investigate why its
IT budget always went out of control; but without much to show for. The findings are:
1. 80% of the budget went to the software (while the remaining 20% to the hardware).
2. More than 80% of the software budget went to maintenance (only the remaining 20% for new
software development).
3. Hardware components could be applied to various products, and their integrity normally did not
affect other products. (Hardware can share and reuse! Hardware faults are isolated!)
4. Software procedures were often non-sharable and not reusable. Software faults could affect other
programs running in computers.
The task force proposed to make software behave like hardware OBJECT. Subsequently, DoD replaces
over 450 computer languages, which were then used to build DoD systems, with an object-oriented
language called Ada.
1.2 Object-Oriented Programming Languages

Object-oriented programming (OOP) languages are designed to overcome these problems.


1. The basic unit of OOP is a class, which encapsulates both the static attributes and dynamic
behaviors within a "box", and specifies the public interface for using these boxes. Since the class is
well-encapsulated (compared with the function), it is easier to reuse these classes. In other words,
OOP combines the data structures and algorithms of a software entity inside the same box.
2. OOP languages permit higher level of abstraction for solving real-life problems. The traditional
procedural language (such as C and Pascal) forces you to think in terms of the structure of the
computer (e.g. memory bits and bytes, array, decision, loop) rather than thinking in terms of the
problem you are trying to solve. The OOP languages (such as Java, C++, C#) let you think in the
problem space, and use software objects to represent and abstract entities of the problem space to
solve the problem.

DS Using C++ Page 3


As an example, suppose you wish to write a computer soccer games (which I consider as a complex
application). It is quite difficult to model the game in procedural-oriented languages. But using OOP
languages, you can easily model the program accordingly to the "real things" appear in the soccer
games.
 Player: attributes include name, number, location in the field, and etc; operations include run, jump,
kick-the-ball, and etc.
 Ball:
 Reference:
 Field:
 Audience:
 Weather:
Most importantly, some of these classes (such as Ball and Audience) can be reused in another
application, e.g., computer basketball game, with little or no modification.
1.3 Benefits of OOP
The procedural-oriented languages focus on procedures, with function as the basic unit. You need to first
figure out all the functions and then think about how to represent data.
The object-oriented languages focus on components that the user perceives, with objects as the basic
unit. You figure out all the objects by putting all the data and operations that describe the user's
interaction with the data.
Object-Oriented technology has many benefits:
 Ease in software design as you could think in the problem space rather than the machine's bits and
bytes. You are dealing with high-level concepts and abstractions. Ease in design leads to more
productive software development.
 Ease in software maintenance: object-oriented software are easier to understand, therefore easier to
test, debug, and maintain.
 Reusable software: you don't need to keep re-inventing the wheels and re-write the same functions
for different situations. The fastest and safest way of developing a new application is to reuse existing
codes - fully tested and proven codes.
2. OOP Basics
2.1 Classes & Instances
Class : A class is a definition of objects of the same kind. In other words, a class is a blueprint, template,
or prototype that defines and describes the static attributes and dynamic behaviors common to all
objects of the same kind.
DS Using C++ Page 4
Instance : An instance is a realization of a particular item of a class. In other words, an instance is
an instantiation of a class. All the instances of a class have similar properties, as described in the class
definition. For example, you can define a class called " Student" and create three instances of the class
"Student" for "Peter", "Paul" and "Pauline".
The term "object" usually refers to instance. But it is often used quite loosely, which may refer to a class
or an instance.
2.2 A Class is a 3-Compartment Box encapsulating Data and Functions

A class can be visualized as a three-compartment box, as illustrated:


1. Classname (or identifier): identifies the class.
2. Data Members or Variables (or attributes, states, fields): contains the static attributes of the
class.
3. Member Functions (or methods, behaviors, operations): contains the dynamic operations of the
class.
In other words, a class encapsulates the static attributes (data) and dynamic behaviors (operations that
operate on the data) in a box.
Class Members : The data members and member functions are collectively called class members.
The followings figure shows a few examples of classes:

DS Using C++ Page 5


The following figure shows two instances of the class Student, identified as "paul" and "peter".

Unified Modeling Language (UML) Class and Instance Diagrams: The above class
diagrams are drawn according to the UML notations. A class is represented as a 3-compartment box,
containing name, data members (variables), and member functions, respectively. classname is shown in
bold and centralized. An instance (object) is also represented as a 3-compartment box, with instance
name shown as instanceName:Classname and underlined.
Brief Summary
1. A class is a programmer-defined, abstract, self-contained, reusable software entity that mimics a
real-world thing.
2. A class is a 3-compartment box containing the name, data members (variables) and the member
functions.
3. A class encapsulates the data structures (in data members) and algorithms (member functions).
The values of the data members constitute its state. The member functions constitute its behaviors.
4. An instance is an instantiation (or realization) of a particular item of a class.
2.3 Class Definition
In C++, we use the keyword class to define a class. There are two sections in the class
declaration: private and public, which will be explained later. For examples,

DS Using C++ Page 6


class Circle { // classname
private:
double radius; // Data members (variables)
string color;
public:
double getRadius(); // Member functions
double getArea();
}
class SoccerPlayer { // classname
private:
int number; // Data members (variables)
string name;
int x, y;
public:
void run(); // Member functions
void kickBall();
}
Class Naming Convention: A classname shall be a noun or a noun phrase made up of several
words. All the words shall be initial-capitalized (camel-case). Use a singular noun for classname. Choose a
meaningful and self-descriptive classname. For
examples, SoccerPlayer, HttpProxyServer, FileInputStream, PrintStream and SocketFactory.
2.4 Creating Instances of a Class
To create an instance of a class, you have to:
1. Declare an instance identifier (name) of a particular class.
2. Invoke a constructor to construct the instance (i.e., allocate storage for the instance and initialize
the variables).
For examples, suppose that we have a class called Circle, we can create instances of Circle as follows:
// Construct 3 instances of the class Circle: c1, c2, and c3
Circle c1(1.2, "red"); // radius, color
Circle c2(3.4); // radius, default color
Circle c3; // default radius and color

Alternatively, you can invoke the constructor explicitly using the following syntax:

Circle c1 = Circle(1.2, "red"); // radius, color


Circle c2 = Circle(3.4); // radius, default color
Circle c3 = Circle(); // default radius and color

2.5 Dot (.) Operator


To reference a member of a object (data member or member function), you must:
1. First identify the instance you are interested in, and then
2. Use the dot operator (.) to reference the member, in the form of instanceName.memberName.
For example, suppose that we have a class called Circle, with two data members (radius and color) and
two functions (getRadius() and getArea()). We have created three instances of the class Circle,
namely, c1, c2 and c3. To invoke the function getArea(), you must first identity the instance of interest,
says c2, then use the dot operator, in the form of c2.getArea(), to invoke the getArea() function of
instance c2.
For example,

// Declare and construct instances c1 and c2 of the class Circle


Circle c1(1.2, "blue");

DS Using C++ Page 7


Circle c2(3.4, "green");
// Invoke member function via dot operator
cout << c1.getArea() << endl;
cout << c2.getArea() << endl;
// Reference data members via dot operator
c1.radius = 5.5;
c2.radius = 6.6;
Calling getArea() without identifying the instance is meaningless, as the radius is unknown (there could
be many instances of Circle - each maintaining its own radius).
In general, suppose there is a class called AClass with a data member called aData and a member
function called aFunction(). An instance called anInstance is constructed for AClass. You
use anInstance.aData and anInstance.aFunction().
2.6 Data Members (Variables)
A data member (variable) has a name (or identifier) and a type; and holds a value of that particular type
(as descried in the earlier chapter). A data member can also be an instance of a certain class (to be
discussed later).
Data Member Naming Convention: A data member name shall be a noun or a noun phrase
made up of several words. The first word is in lowercase and the rest of the words are initial-capitalized
(camel-case), e.g., fontSize, roomNumber, xMax, yMin and xTopLeft. Take note that variable name begins
with an lowercase, while classname begins with an uppercase.
2.7 Member Functions
A member function (as described in the earlier chapter):
1. receives parameters from the caller,
2. performs the operations defined in the function body, and
3. returns a piece of result (or void) to the caller.
Member Function Naming Convention: A function name shall be a verb, or a verb phrase
made up of several words. The first word is in lowercase and the rest of the words are initial-capitalized
(camel-case). For example, getRadius(), getParameterValues().
Take note that data member name is a noun (denoting a static attribute), while function name is a verb
(denoting an action). They have the same naming convention. Nevertheless, you can easily distinguish
them from the context. Functions take arguments in parentheses (possibly zero argument with empty
parentheses), but variables do not. In this writing, functions are denoted with a pair of parentheses,
e.g., println(), getArea() for clarity.

DS Using C++ Page 8


2.8 Putting them Together: An OOP Example

A class called Circle is to be defined as illustrated in the class diagram. It contains two data
members: radius (of type double) and color (of type String); and three member
functions: getRadius(), getColor(), and getArea().
Three instances of Circles called c1, c2, and c3 shall then be constructed with their respective data
members, as shown in the instance diagrams.
In this example, we shall keep all the codes in a single source file called CircleAIO.cpp.
CircleAIO.cpp
1/* The Circle class (All source codes in one file) (CircleAIO.cpp) */
2#include <iostream> // using IO functions
3#include <string> // using string
4using namespace std;
5
6class Circle {
7private:
8 double radius; // Data member (Variable)
9 string color; // Data member (Variable)
10
11public:
12 // Constructor with default values for data members
13 Circle(double r = 1.0, string c = "red") {
14 radius = r;
15 color = c;
16 }
17

DS Using C++ Page 9


18 double getRadius() { // Member function (Getter)
19 return radius;
20 }
21
22 string getColor() { // Member function (Getter)
23 return color;
24 }
25
26 double getArea() { // Member function
27 return radius*radius*3.1416;
28 }
29}; // need to end the class declaration with a semi-colon
30
31// Test driver function
32int main() {
33 // Construct a Circle instance
34 Circle c1(1.2, "blue");
35 cout << "Radius=" << c1.getRadius() << " Area=" << c1.getArea()
36 << " Color=" << c1.getColor() << endl;
37
38 // Construct another Circle instance
39 Circle c2(3.4); // default color
40 cout << "Radius=" << c2.getRadius() << " Area=" << c2.getArea()
41 << " Color=" << c2.getColor() << endl;
42
43 // Construct a Circle instance using default no-arg constructor
44 Circle c3; // default radius and color
45 cout << "Radius=" << c3.getRadius() << " Area=" << c3.getArea()
46 << " Color=" << c3.getColor() << endl;
47 return 0;
48}

To compile and run the program (with GNU GCC under Windows):

> g++ -o CircleAIO.exe CircleAIO.cpp


// -o specifies the output file name

> CircleAIO
Radius=1.2 Area=4.5239 Color=blue
Radius=3.4 Area=36.3169 Color=red
Radius=1 Area=3.1416 Color=red

2.9 Constructors
A constructor is a special function that has the function name same as the classname. In the
above Circle class, we define a constructor as follows:
// Constructor has the same name as the class
Circle(double r = 1.0, string c = "red") {
radius = r;
color = c;
}

DS Using C++ Page 10


A constructor is used to construct and initialize all the data members. To create a new instance of a class,
you need to declare the name of the instance and invoke the constructor. For example,

Circle c1(1.2, "blue");


Circle c2(3.4); // default color
Circle c3; // default radius and color
// Take note that there is no empty bracket ()

A constructor function is different from an ordinary function in the following aspects:


 The name of the constructor is the same as the classname.
 Constructor has no return type (or implicitly returns void). Hence, no return statement is allowed
inside the constructor's body.
 Constructor can only be invoked once to initialize the instance constructed. You cannot call the
constructor afterwards in your program.
 Constructors are not inherited (to be explained later).
2.10 Default Arguments for Functions
In C++, you can specify the default value for the trailing arguments of a function (including constructor)
in the function header. For example,

1/* Test function default arguments (TestFnDefault.cpp) */


2#include <iostream>
3using namespace std;
4
5// Function prototype
6int sum(int n1, int n2, int n3 = 0, int n4 = 0, int n5 = 0);
7
8int main() {
9 cout << sum(1, 1, 1, 1, 1) << endl; // 5
10 cout << sum(1, 1, 1, 1) << endl; // 4
11 cout << sum(1, 1, 1) << endl; // 3
12 cout << sum(1, 1) << endl; // 2
13// cout << sum(1) << endl; // error: too few arguments
14}
15
16// Function definition
17// The default values shall be specified in function prototype,
18// not the function implementation
19int sum(int n1, int n2, int n3, int n4, int n5) {
20 return n1 + n2 + n3 + n4 + n5;
21}
2.11 "public" vs. "private" Access Control Modifiers
An access control modifier can be used to control the visibility of a data member or a member function
within a class. We begin with the following two access control modifiers:
1. public: The member (data or function) is accessible and available to all in the system.
2. private: The member (data or function) is accessible and available within this class only.
For example, in the above Circle definition, the data member radius is declared private. As the
result, radius is accessible inside the Circle class, but NOT outside the class. In other words, you cannot
use "c1.radius" to refer to c1's radius in main(). Try inserting the statement "cout << c1.radius;"
in main() and observe the error message:

DS Using C++ Page 11


CircleAIO.cpp:8:11: error: 'double Circle::radius' is private

Try moving radius to the public section, and re-run the statement.
On the other hand, the function getRadius() is declared public in the Circle class. Hence, it can be
invoked in the main().
UML Notation: In UML notation, public members are denoted with a " +", while private members
with a "-" in the class diagram.
2.12 Information Hiding and Encapsulation
A class encapsulates the static attributes and the dynamic behaviors into a "3-compartment box". Once a
class is defined, you can seal up the "box" and put the "box" on the shelve for others to use and reuse.
Anyone can pick up the "box" and use it in their application. This cannot be done in the traditional
procedural-oriented language like C, as the static attributes (or variables) are scattered over the entire
program and header files. You cannot "cut" out a portion of C program, plug into another program and
expect the program to run without extensive changes.
Data member of a class are typically hidden from the outside word, with private access control modifier.
Access to the private data members are provided via public assessor functions,
e.g., getRadius() and getColor().
This follows the principle of information hiding. That is, objects communicate with each others using well-
defined interfaces (public functions). Objects are not allowed to know the implementation details of
others. The implementation details are hidden or encapsulated within the class. Information hiding
facilitates reuse of the class.
Rule of Thumb: Do not make any data member public, unless you have a good reason.
2.13 Getters and Setters
To allow other to read the value of a private data member says xxx, you shall provide a get
function (or getter or accessor function) called getXxx(). A getter need not expose the data in raw
format. It can process the data and limit the view of the data others will see. Getters shall not modify the
data member.
To allow other classes to modify the value of a private data member says xxx, you shall provide a set
function (or setter or mutator function) called setXxx(). A setter could provide data validation (such as
range checking), and transform the raw data into the internal representation.
For example, in our Circle class, the data members radius and color are declared private. That is to
say, they are only available within the Circle class and not visible outside the Circle class -
including main(). You cannot access the private data members radius and color from
the main() directly - via says c1.radius or c1.color. The Circle class provides two public accessor
functions, namely, getRadius() and getColor(). These functions are declared public. The main() can
invoke these public accessor functions to retrieve the radius and color of a Circle object, via
says c1.getRadius() and c1.getColor().
There is no way you can change the radius or color of a Circle object, after it is constructed in main().
You cannot issue statements such as c1.radius = 5.0 to change the radius of instance c1, as radius is
declared as private in the Circle class and is not visible to other including main().
If the designer of the Circle class permits the change the radius and color after a Circle object is
constructed, he has to provide the appropriate setter, e.g.,
// Setter for color
void setColor(string c) {
color = c;
}

// Setter for radius


void setRadius(double r) {
radius = r;
}
With proper implementation of information hiding, the designer of a class has full control of what the user
of the class can and cannot do.

DS Using C++ Page 12


2.14 Keyword "this"
You can use keyword "this" to refer to this instance inside a class definition.
One of the main usage of keyword this is to resolve ambiguity between the names of data member and
function parameter. For example,

class Circle {
private:
double radius; // Member variable called "radius"
......
public:
void setRadius(double radius) { // Function's argument also called "radius"
this->radius = radius;
// "this.radius" refers to this instance's member variable
// "radius" resolved to the function's argument.
}
......
}
In the above codes, there are two identifiers called radius - a data member and the function parameter.
This causes naming conflict. To resolve the naming conflict, you could name the function
parameter r instead of radius. However, radius is more approximate and meaningful in this context. You
can use keyword this to resolve this naming conflict. " this->radius" refers to the data member; while
"radius" resolves to the function parameter.
"this" is actually a pointer to this object. I will explain pointer and the meaning of " ->" operator later.
Alternatively, you could use a prefix (such as m_) or suffix (such as _) to name the data members to avoid
name crashes. For example,

class Circle {
private:
double m_radius; // or radius_
......
public:
void setRadius(double radius) {
m_radius = radius; // or radius_ = radius
}
......
}
C++ Compiler internally names their data members beginning with a leading underscore ( e.g., _xxx)
and local variables with 2 leading underscores (e.g., __xxx). Hence, avoid name beginning with
underscore in your program.
2.15 "const" Member Functions
A const member function, identified by a const keyword at the end of the member function's header,
cannot modifies any data member of this object. For example,
double getRadius() const { // const member function
radius = 0;
// error: assignment of data-member 'Circle::radius' in read-only structure
return radius;
}

2.16 Convention for Getters/Setters and Constructors


The constructor, getter and setter functions for a private data member called xxx of type T in a
class Aaa have the following conventions:

DS Using C++ Page 13


class Aaa {
private:
// A private variable named xxx of type T
T xxx;
public:
// Constructor
Aaa(T x) { xxx = x; }
// OR
Aaa(T xxx) { this->xxx = xxx; }
// OR using member initializer list (to be explained later)
Aaa(T xxx) : xxx(xxx) { }

// A getter for variable xxx of type T receives no argument and return a value of type T
T getXxx() const { return xxx; }

// A setter for variable xxx of type T receives a parameter of type T and return void
void setXxx(T x) { xxx = x; }
// OR
void setXxx(T xxx) { this->xxx = xxx; }
}
For a bool variable xxx, the getter shall be named isXxx(), instead of getXxx(), as follows:

private:
// Private boolean variable
bool xxx;
public:
// Getter
bool isXxx() const { return xxx; }

// Setter
void setXxx(bool x) { xxx = x; }
// OR
void setXxx(bool xxx) { this->xxx = xxx; }

2.17 Default Constructor


A default constructor is a constructor with no parameters, or having default values for all the parameters.
For example, the above Circle's constructor can be served as default constructor with all the parameters
default.
Circle c1; // Declare c1 as an instance of Circle, and invoke the default constructor
Circle c1(); // Error!
// (This declares c1 as a function that takes no parameter and returns a Circle
instance)

If C++, if you did not provide ANY constructor, the compiler automatically provides a default constructor
that does nothing. That is,

ClassName::ClassName() { } // Take no argument and do nothing


Compiler will not provide a default constructor if you define any constructor(s). If all the constructors you
defined require arguments, invoking no-argument default constructor results in error. This is to allow
class designer to make it impossible to create an uninitialized instance, by NOT providing an explicit
default constructor.

DS Using C++ Page 14


2.18 Constructor's Member Initializer List
Instead of initializing the private data members inside the body of the constructor, as follows:

Circle(double r = 1.0, string c = "red") {


radius = r;
color = c;
}
We can use an alternate syntax called member initializer list as follows:
Circle(double r = 1.0, string c = "red") : radius(r), color(c) { }
Member initializer list is placed after the constructor's header, separated by a colon ( :). Each initializer is
in the form of data_member_name(parameter_name). For fundamental type, it is equivalent
to data_member_name = parameter_name. For object, the constructor will be invoked to construct the
object. The constructor's body (empty in this case) will be run after the completion of member initializer
list.
It is recommended to use member initializer list to initialize all the data members, as it is often more
efficient than doing assignment inside the constructor's body.
2.19 *Destructor
A destructor, similar to constructor, is a special function that has the same name as the classname, with a
prefix ~, e.g., ~Circle(). Destructor is called implicitly when an object is destroyed.
If you do not define a destructor, the compiler provides a default, which does nothing.

class MyClass {
public:
// The default destructor that does nothing
~MyClass() { }
......
}
Advanced Notes
 If your class contains data member which is dynamically allocated (via new or new[] operator), you
need to free the storage via delete or delete[].
2.20 *Copy Constructor
A copy constructor constructs a new object by copying an existing object of the same type. In other
words, a copy constructor takes an argument, which is an object of the same class.
If you do not define a copy constructor, the compiler provides a default which copies all the data
members of the given object. For example,

Circle c4(7.8, "blue");


cout << "Radius=" << c4.getRadius() << " Area=" << c4.getArea()
<< " Color=" << c4.getColor() << endl;
// Radius=7.8 Area=191.135 Color=blue

// Construct a new object by copying an existing object


// via the so-called default copy constructor
Circle c5(c4);
cout << "Radius=" << c5.getRadius() << " Area=" << c5.getArea()
<< " Color=" << c5.getColor() << endl;
// Radius=7.8 Area=191.135 Color=blue
DS Using C++ Page 15
The copy constructor is particularly important. When an object is passed into a function by value, the
copy constructor will be used to make a clone copy of the argument.
Advanced Notes
 Pass-by-value for object means calling the copy constructor. To avoid the overhead of creating a
clone copy, it is usually better to pass-by-reference-to- const, which will not have side effect on
modifying the caller's object.
 The copy constructor has the following signature:

 class MyClass {
 private:
 T1 member1;
 T2 member2;
 public:
 // The default copy constructor which constructs an object via memberwise copy
 MyClass(const MyClass & rhs) {
 member1 = rhs.member1;
 member2 = rhs.member2;
 }
 ......

}
 The default copy constructor performs shadow copy. It does not copy the dynamically allocated data
members created via new or new[] operator.
2.21 *Copy Assignment Operator (=)
The compiler also provides a default assignment operator ( =), which can be used to assign one object to
another object of the same class via memberwise copy. For example, using the Circle class defined
earlier,
Circle c6(5.6, "orange"), c7;
cout << "Radius=" << c6.getRadius() << " Area=" << c6.getArea()
<< " Color=" << c6.getColor() << endl;
// Radius=5.6 Area=98.5206 Color=orange
cout << "Radius=" << c7.getRadius() << " Area=" << c7.getArea()
<< " Color=" << c7.getColor() << endl;
// Radius=1 Area=3.1416 Color=red (default constructor)

c7 = c6; // memberwise copy assignment


cout << "Radius=" << c7.getRadius() << " Area=" << c7.getArea()
<< " Color=" << c7.getColor() << endl;
// Radius=5.6 Area=98.5206 Color=orange
Advanced Notes
 You could overload the assignment opeator to override the default.
 The copy constructor, instead of copy assignment operator, is used in declaration:

 Circle c8 = c6; // Invoke the copy constructor, NOT copy assignment operator

// Same as Circle c8(c6)


 The default copy assignment operator performs shadow copy. It does not copy the dynamically
allocated data members created via new or new[] operator.
 The copy assignment operator has the following signature:

DS Using C++ Page 16


 class MyClass {
 private:
 T1 member1;
 T2 member2;
 public:
 // The default copy assignment operator which assigns an object via memberwise copy
 MyClass & operator=(const MyClass & rhs) {
 member1 = rhs.member1;
 member2 = rhs.member2;
 return *this;
 }
 ......

}
 The copy assignment operator differs from the copy constructor in that it must release the
dynamically allocated contents of the target and prevent self assignment. The assignment operator
shall return a reference of this object to allow chaining operation (such as x = y = z).
 The default constructor, default destructor, default copy constructor, default copy assignment
operators are known as special member functions, in which the compiler will automatically generate
a copy if they are used in the program and not explicitly defined.
3. Separating Header and Implementation
For better software engineering, it is recommended that the class declaration and implementation be
kept in 2 separate files: declaration is a header file " .h"; while implementation in a " .cpp". This is known
as separating the public interface (header declaration) and the implementation. Interface is defined by
the designer, implementation can be supplied by others. While the interface is fixed, different vendors
can provide different implementations. Furthermore, only the header files are exposed to the users, the
implementation can be provided in an object file " .o" (or in a library). The source code needs not given to
the users.
I shall illustrate with the following examples.

4. Example: The Circle Class

Instead of putting all the codes in a single file. We shall "separate the interface and implementation" by
placing the codes in 3 files.
1. Circle.h: defines the public interface of the Circle class.
2. Circle.cpp: provides the implementation of the Circle class.
3. TestCircle.cpp: A test driver program for the Circle class.
DS Using C++ Page 17
Circle.h - Header
1/* The Circle class Header (Circle.h) */
2#include <string> // using string
3using namespace std;
4
5// Circle class declaration
6class Circle {
7private: // Accessible by members of this class only
8 // private data members (variables)
9 double radius;
10 string color;
11
12public: // Accessible by ALL
13 // Declare prototype of member functions
14 // Constructor with default values
15 Circle(double radius = 1.0, string color = "red");
16
17 // Public getters & setters for private data members
18 double getRadius() const;
19 void setRadius(double radius);
20 string getColor() const;
21 void setColor(string color);
22
23 // Public member Function
24 double getArea() const;
25};

Program Notes:
 The header file contains declaration statements, that tell the compiler about the names and types,
and function prototypes without the implementation details.
 C++98/03 does NOT allow you to assign an initial value to a data member
(except const static members). Date members are to be initialized via the constructor. For example,
 double radius = 1.0;

// error: ISO C++ forbids in-class initialization of non-const static member 'radius'
C++11 allows in-class initialization of data members.

 You can provide default value to function's arguments in the header. For example,

Circle(double radius = 1.0, string color = "red");

 Header contains function prototype, the parameter names are ignored by the compiler, but good to
serve as documentation. For example, you can leave out the parameter names in the prototype as
follows:

 Circle(double = 1.0, string = "red"); // without identifiers

// Identifiers not needed in prototype but good to serve as documentation

Header files shall contains constants, function prototypes, class/struct declarations.

DS Using C++ Page 18


Circle.cpp - Implementation
1/* The Circle class Implementation (Circle.cpp) */
2#include "Circle.h" // user-defined header in the same directory
3
4// Constructor
5// default values shall only be specified in the declaration,
6// cannot be repeated in definition
7Circle::Circle(double r, string c) {
8 radius = r;
9 color = c;
10}
11
12// Public getter for private data member radius
13double Circle::getRadius() const {
14 return radius;
15}
16
17// Public setter for private data member radius
18void Circle::setRadius(double r) {
19 radius = r;
20}
21
22// Public getter for private data member color
23string Circle::getColor() const {
24 return color;
25}
26
27// Public setter for private data member color
28void Circle::setColor(string c) {
29 color = c;
30}
31
32// A public member function
33double Circle::getArea() const {
34 return radius*radius*3.14159265;
35}

Program Notes:
 The implementation file provides the definition of the functions, which are omitted from
the declaration in the header file.
 #include "Circle.h"
The compiler searches the headers in double quotes (such as "Circle.h") in the current
directory first, then the system's include directories. For header in angle bracket (such
as <iostream>), the compiler does NOT searches the current directory, but only the system's include
directories. Hence, use double quotes for user-defined headers.
 Circle::Circle(double r, string c) {
You need to include the className:: (called class scope resolution operator) in front of all the
members names, so as to inform the compiler this member belong to a particular class.
(Class Scope: Names defined inside a class have so-called class scope. They are visible within the
class only. Hence, you can use the same name in two different classes. To use these names outside
the class, the class scope resolution operator className:: is needed.)
DS Using C++ Page 19
 You CANNOT place the default arguments in the implementation (they shall be placed in the header).
For example,

Circle::Circle(double r = 1.0, string c = "red") { // error!


Compiling the Circle Class
You can compile the Circle.cpp to an object file called Circle.o, via option -c (compile-only) in GNU
GCC:
> g++ -c Circle.cpp
// option –c for compile-only, output is Circle.o
To use the Circle class, the user needs Circle.h and Circle.o. He does not need Circle.cpp. In other
words, you do not need to give away your source codes, but merely the public declarations and the object
codes.
TestCircle.cpp - Test Driver
Let's write a test program to use the Circle class created.
1/* A test driver for the Circle class (TestCircle.cpp) */
2#include <iostream>
3#include "Circle.h" // using Circle class
4using namespace std;
5
6int main() {
7 // Construct an instance of Circle c1
8 Circle c1(1.2, "red");
9 cout << "Radius=" << c1.getRadius() << " Area=" << c1.getArea()
10 << " Color=" << c1.getColor() << endl;
11
12 c1.setRadius(2.1); // Change radius and color of c1
13 c1.setColor("blue");
14 cout << "Radius=" << c1.getRadius() << " Area=" << c1.getArea()
15 << " Color=" << c1.getColor() << endl;
16
17 // Construct another instance using the default constructor
18 Circle c2;
19 cout << "Radius=" << c2.getRadius() << " Area=" << c2.getArea()
20 << " Color=" << c2.getColor() << endl;
21 return 0;
22}
Compiling the Test Program
To compile TestCircle.cpp with the object code Circle.o (and Circle.h):
> g++ -o TestCircle.exe TestCircle.cpp Circle.o
// option -o specifies the output filename
You can also compile TestCircle.cpp with the source code Circle.cpp (and Circle.h)
> g++ -o TestCircle.exe TestCircle.cpp Circle.cpp

DS Using C++ Page 20


5. Example: The Time Class

Let's write a class called Time, which models a specific instance of time with hour, minute and second
values, as shown in the class diagram.
The class Time contains the following members:
 Three private data members: hour (0-23), minute (0-59) and second (0-59), with default values of
0.
 A public constructor Time(), which initializes the data members hour, minute and second with the
values provided by the caller.
 public getters and setters for private data
members: getHour(), getMinute(), getSecond(), setHour(), setMinute(), and setSecond().
 A public member function setTime() to set the values of hour, minute and second given by the
caller.
 A public member function print() to print this Time instance in the format " hh:mm:ss", zero-filled,
e.g., 01:30:04.
 A public member function nextSecond(), which increase this instance by one
second. nextSecond() of 23:59:59 shall be 00:00:00.
Let's write the code for the Time class, with the header and implementation separated in two
files: Time.h and Time.cpp.
Header - Time.h
1/* Header for the Time class (Time.h) */
2#ifndef TIME_H // Include this "block" only if TIME_H is NOT defined
3#define TIME_H // Upon the first inclusion, define TIME_H so that
4 // this header will not get included more than once
5class Time {
6private: // private section
7 // private data members
8 int hour; // 0 - 23
9 int minute; // 0 - 59
10 int second; // 0 - 59
11
12public: // public section
13 // public member function prototypes

DS Using C++ Page 21


14 Time(int h = 0, int m = 0, int s = 0); // Constructor with default values
15 int getHour() const; // public getter for private data member hour
16 void setHour(int h); // public setter for private data member hour
17 int getMinute() const; // public getter for private data member minute
18 void setMinute(int m); // public setter for private data member minute
19 int getSecond() const; // public getter for private data member second
20 void setSecond(int s); // public setter for private data member second
21 void setTime(int h, int m, int s); // set hour, minute and second
22 void print() const; // Print a description of this instance in "hh:mm:ss"
23 void nextSecond(); // Increase this instance by one second
24}; // need to terminate the class declaration with a semicolon
25
26#endif // end of "#ifndef" block
Dissecting Time.h
#ifndef TIME_H
#define TIME_H
......
#endif
To prevent an header file from included more than once into a source file (which could result in
compilation error if an entity is declared twice, e.g., int i), we wrap the header codes within a pair
of preprocessor directives #ifndef (if not define) and #endif. The codes within the if-block will only be
included if the identifier TIME_H has not been defined. This is true for the first inclusion, which also defines
the identifier TIME_H (the first directive in body of the if-block). No subsequent inclusion is possible,
since TIME_H has been defined during the first inclusion. By convention, use the
identifier XXX_H (or XXX_H_INCLUDED) for header Xxx.h.
class Time {
private:
......
public:
......
};
The header Time.h contains the class declaration for the class Time. It is divided into two
sections: private and public. The private members (data or functions) are accessible by members of
this class only, while public members are visible by all (such as the main() function which is outside the
class). The class declaration must be terminated by a semicolon.
private:
int hour;
int minute;
int second;
public:
......
We declare 3 private data members called hour, minute and second. In C++98/C++03, you are NOT allow
to initialize a data member in the class declaration (except const static int data members). For
example, setting hour = 0 causes a compilation error. Instead, the data members are to be initialized in
the constructor (to be shown later). The newer C++11 allows initialization of data members.
Only member function prototypes are listed in the class declaration. A function prototype consists of the
return-type, function name and parameter types.
Time(int h = 0, int m = 0, int s = 0);
declares the so-called constructor. A constructor is a special function that has the same name as the
class. A constructor has no return type, or implicitly return void. No return statement is allowed inside
the constructor's body. A constructor can only be used during the instance declaration to initialize the
data members of the instance. It cannot be invoked thereafter.

DS Using C++ Page 22


In the function prototypes of the header, we can set the default values of the function's parameters for
any function member using " = default-value". In this case, this constructor can be invoked with 0 to 3
arguments, the omitted trailing arguments will be set to their default values, e.g.,
Time t1(1, 2, 3); // no default used
Time t2(1, 2); // s = 0 (default)
Time t3(1); // m = 0, s = 0 (defaults)
Time t4; // h = 0, m = 0, s = 0 (all defaults) - no empty parentheses ()
The identifiers h, m and s are not needed in the function prototype - you only need to specify the
parameters' types. But they serve as proper documentation, and are strongly recommended.
int getHour() const;
void setHour(int h);
int getHour() const;
void setHour(int h);
int getHour() const;
void setHour(int h);
declare the so-called getter and setter for the private data member hour, minute and second. Since the
data members are private and are not accessible outside the class, public getters and setters are often
provided to read and modify the private data members. By convention, a getter receives nothing ( void)
from the caller and returns a value of the type of the data member; a setter receives a value of the type
of the data member and returns void. Setters may validate the input before setting the value of the data
member.
We declare the getter function constant, by placing the keyword const after the function parameter list.
A const member function cannot modify any data member of this object. Getter does not need to modify
any data member.
void setTime(int h, int m, int s);
declares a public member function to set the hour, minute and second of this instance in one call.
void print() const;
declares a public member function to print this instance in the format HH:MM:SS, zero-filled,
e.g., 01:56:09. The function print() returns void.
void nextSecond();
declares a public member function to increase this instance by one second. For
example, 23:59:59 becomes 00:00:00. The function nextSecond() returns void.
Implementation - Time.cpp
1/* Implementation for the Time Class (Time.cpp) */
2#include <iostream>
3#include <iomanip>
4#include "Time.h" // include header of Time class
5using namespace std;
6
7// Constructor with default values. No input validation
8Time::Time(int h, int m, int s) {
9 hour = h;
10 minute = m;
11 second = s;
12}
13
14// public getter for private data member hour
15int Time::getHour() const {
16 return hour;
17}
18
19// public setter for private data member hour. No input validation
DS Using C++ Page 23
20void Time::setHour(int h) {
21 hour = h;
22}
23
24// public getter for private data member minute
25int Time::getMinute() const {
26 return minute;
27}
28
29// public setter for private data member minute. No input validation
30void Time::setMinute(int m) {
31 minute = m;
32}
33
34// public getter for private data member second
35int Time::getSecond() const {
36 return second;
37}
38
39// public setter for private data member second. No input validation
40void Time::setSecond(int s) {
41 second = s;
42}
43
44// Set hour, minute and second. No input validation
45void Time::setTime(int h, int m, int s) {
46 hour = h;
47 minute = m;
48 second = s;
49}
50
51// Print this Time instance in the format of "hh:mm:ss", zero filled
52void Time::print() const {
53 cout << setfill('0'); // zero-filled, need <iomanip>, sticky
54 cout << setw(2) << hour // set width to 2 spaces, need <iomanip>, non-sticky
55 << ":" << setw(2) << minute
56 << ":" << setw(2) << second << endl;
57}
58
59// Increase this instance by one second
60void Time::nextSecond() {
61 ++second;
62 if (second >= 60) {
63 second = 0;
64 ++minute;
65 }
66 if (minute >= 60) {
67 minute = 0;
68 ++hour;

DS Using C++ Page 24


69 }
70 if (hour >= 24) {
71 hour = 0;
72 }
73}
Dissecting Time.cpp
The implementation file Time.cpp contains member's definitions (whereas the header file contains the
declarations), in particular, member functions.
All member's identifiers in the implementation are preceded by the classname and the scope resolution
operator (::), e.g., Time::Time and Time::getHour, so that the compiler can tell that these identifiers
belong to a particular class, in this case, Time.
Time::Time(int h, int m, int s) {
hour = h;
minute = m;
second = s;
}
In the constructor, we initialize the private data members hour, minute and second based on the inputs
provided by the caller. C++ does NOT initialize fundamental-type (e.g., int, double) data members. It
also does NOT issue an error message if you use an data member before it is initialized. Hence, It is
strongly recommended to initialize all the data members in the constructor, so that the constructed
instance is complete, instead of relying on the user to set the values of the data members after
construction.
The default values of the parameters are specified in the class declaration (in the header), NOT in the
function definition. Placing a default value in function definition (e.g., h = 0) causes a compilation error.
Take note that we have not included input validation (e.g., hour shall be between 0 and 23) in the
constructor (and setters). We shall do that in the later example.
int Time::getHour() const {
return hour;
}
the public getter for private data member hour simply returns the value of the data member hour.
void Time::setHour(int h) {
hour = h;
}
the public setter for private data member hour sets the data member hour to the given value h. Again,
there is no input validation for h (shall be between 0 to 23).
The rest of the function definitions are self-explanatory.
"this" Pointer
Instead of naming the function parameters h, m and s, we would like to name the
parameters hour, minute and second, which are semantically more meaningful. However, these names
crashes with the names of private data members. C++ provides a keyword this (which is a pointer to
this instance - to be discussed later) to differentiate between the data members and function
parameters. this->hour, this->minute and this->second refer to the data members; while hour, minute,
and second refer to the function parameters. We can rewrite the constructor and setter as follows:
Time::Time(int hour, int minute, int second) { // Constructor
this->hour = hour;
this->minute = minute;
this->second = second;
}

Time::setHour(int hour) { // Setter for hour


this->hour = hour;

DS Using C++ Page 25


}

Time::getHour() const { // Getter for hour


return this->hour; // this-> is the default, and hence optional
}
Member Initializer List
C++ provide an alternative syntax to initialize data members in the constructor called member initializer
list. For example,
Time::Time(int h, int m, int s) : hour(h), minute(m), second(s) {
// The body runs after the member initializer list
// empty in this case
}
The member initializer list is placed after the function parameter list, separated by a colon, in the form
of dataMemberName(parameters). For fundamental-type data members (e.g., int, double), hour(h) is the
same as hour = h. For object data members (to be discussed later), the copy constructor will be invoked.
The function body will be executed after the member initializer list, which is empty in this case.
The data members in the initializer list are initialized in the order of their declarations in the class
declaration, not the order in the initializer list.
Test Driver - TestTime.cpp
1/* Test Driver for the Time class (TestTime.cpp) */
2#include <iostream>
3#include "Time.h" // include header of Time class
4using namespace std;
5
6int main() {
7 Time t1(23, 59, 59); // Test constructor
8
9 // Test all public member functions
10 t1.print(); // 23:59:59
11 t1.setHour(12);
12 t1.setMinute(30);
13 t1.setSecond(15);
14 t1.print(); // 12:30:15
15 cout << "Hour is " << t1.getHour() << endl;
16 cout << "Minute is " << t1.getMinute() << endl;
17 cout << "Second is " << t1.getSecond() << endl;
18
19 Time t2; // Test constructor with default values for hour, minute and second
20 t2.print(); // 00:00:00
21 t2.setTime(1, 2, 3);
22 t2.print(); // 01:02:03
23
24 Time t3(12); // Use default values for minute and second
25 t3.print(); // 12:00:00
26
27 // Test nextSecond()
28 Time t4(23, 59, 58);
29 t4.print();

DS Using C++ Page 26


30 t4.nextSecond();
31 t4.print();
32 t4.nextSecond();
33 t4.print();
34
35 // No input validation
36 Time t5(25, 61, 99); // values out of range
37 t5.print(); // 25:61:99
38}
Dissecting TestTime.cpp
The test driver tests the constructor (with and without the default values) and all the public member
functions. Clearly, no input validation is carried out, as reflected in instance t5.
Exercise
Add member
functions previousSecond(), nextMinute(), previousMinute(), nextHour(), previousHour() to
the Time class.
Compiling the Program
You can compile all the source file together to get the executable file as follows:

// Using GCC on Windows


// Compile all source files, -o specifies the output
> g++ -o TestTime.exe Time.cpp TestTime.cpp
// Execute the program
> TestTime
Alternatively, you can compile Time.cpp into an object file Time.o, and then the test driver with the object
file. In this way, you only distribute the object file and header file, not the source file.
// Compile Time.cpp into object file Time.o, with -c option
> g++ -c Time.cpp
// Compile test driver with object file
> g++ -o TestTime.exe TestTime.cpp Time.o
// Execute the test driver
> TestTime

6. Example: The Point Class

DS Using C++ Page 27


The Point class, as shown in the class diagram, models 2D points with x and y co-ordinates.
In the class diagram, "-" denotes private member; "+" denotes public member. "= xxx" specifies the
default value of a data member.
The Point class contains the followings:
 Private data members x and y (of type int), with default values of 0.
 A constructor, getters and setters for private data member x and y.
 A function setXY() to set both x and y coordinates of a Point.
 A function getMagnitude() which returns √(x2+y2). You can use the built-in sqrt() function
in <cmath> to compute the square root.
 A function getArgument() which returns tan-1(y/x). You can use the built-in atan2(y, x) function
in <cmath> to compute the gradient in radians.
 A function print() which prints "(x,y)" of this instance.
Point.h - Header
1/* The Point class Header (Point.h) */
2#ifndef POINT_H
3#define POINT_H
4
5// Point class declaration
6class Point {
7private:
8 // private data members (variables)
9 int x;
10 int y;
11
12public:
13 // Declare member function prototypes
14 Point(int x = 0, int y = 0); // Constructor with default values
15 int getX() const;
16 void setX(int x);
17 int getY() const;
18 void setY(int y);
19 void setXY(int x, int y);
20 double getMagnitude() const;
21 double getArgument() const;
22 void print() const;
23};
24
25#endif
Point.cpp - Implementation
1/* The Point class Implementation (Point.cpp) */
2#include "Point.h" // user-defined header in the same directory
3#include <iostream>
4#include <cmath>
5using namespace std;
6
7// Constructor (default values can only be specified in the declaration)
8Point::Point(int x, int y) : x(x), y(y) { } // Use member initializer list
9
10// Public getter for private data member x
11int Point::getX() const {

DS Using C++ Page 28


12 return x;
13}
14
15// Public setter for private data member x
16void Point::setX(int x) {
17 this->x = x;
18}
19
20// Public getter for private data member y
21int Point::getY() const {
22 return y;
23}
24
25// Public setter for private data member y
26void Point::setY(int y) {
27 this->y = y;
28}
29
30// Public member function to set both x and y
31void Point::setXY(int x, int y) {
32 this->x = x;
33 this->y = y;
34}
35
36// Public member function to return the magitude
37double Point::getMagnitude() const {
38 return sqrt(x*x + y*y); // sqrt in <cmath>
39}
40
41// Public member function to return the argument
42double Point::getArgument() const {
43 return atan2(y, x); // atan2 in <cmath>
44}
45
46// Public member function to print description about this point
47void Point::print() const {
48 cout << "(" << x << "," << y << ")" << endl;
49}
TestPoint.cpp - Test Driver
1/* A test driver for the Point class (TestPoint.cpp) */
2#include <iostream>
3#include <iomanip>
4#include "Point.h" // using Point class
5using namespace std;
6
7int main() {
8 // Construct an instance of Point p1
9 Point p1(3, 4);

DS Using C++ Page 29


10 p1.print();
11 cout << "x = " << p1.getX() << endl;
12 cout << "y = " << p1.getY() << endl;
13 cout << fixed << setprecision(2);
14 cout << "mag = " << p1.getMagnitude() << endl;
15 cout << "arg = " << p1.getArgument() << endl;
16 p1.setX(6);
17 p1.setY(8);
18 p1.print();
19 p1.setXY(1, 2);
20 p1.print();
21
22 // Construct an instance of Point using default constructor
23 Point p2;
24 p2.print();
25}

7. Example: The Account Class

A class called Account, which models a bank account, is designed as shown in the class diagram. It
contains:
 Two private data members: accountNumber (int) and balance (double), which maintains the
current account balance.
 Public functions credit() and debit(), which adds or subtracts the given amount from the
balance, respectively. The debit() function shall print "amount withdrawn exceeds the current
balance!" if amount is more than balance.
 A public function print(), which shall print "A/C no: xxx Balance=xxx" (e.g., A/C no: 991234
Balance=$88.88), with balance rounded to two decimal places.
Header file - Account.h
1/* Header for Account class (Account.h) */
2#ifndef ACCOUNT_H
3#define ACCOUNT_H

DS Using C++ Page 30


4
5class Account {
6private:
7 int accountNumber;
8 double balance;
9
10public:
11 Account(int accountNumber, double balance = 0.0);
12 int getAccountNumber() const;
13 double getBalance() const;
14 void setBalance(double balance);
15 void credit(double amount);
16 void debit(double amount);
17 void print() const;
18};
19
20#endif
Implementation file - Account.cpp
1/* Implementation for the Account class (Account.cpp) */
2#include <iostream>
3#include <iomanip>
4#include "Account.h"
5using namespace std;
6
7// Constructor
8Account::Account(int no, double b) : accountNumber(no), balance(b) { }
9
10// Public getter for private data member accountNumber
11int Account::getAccountNumber() const {
12 return accountNumber;
13}
14
15// Public getter for private data member balance
16double Account::getBalance() const {
17 return balance;
18}
19
20// Public setter for private data member balance
21void Account::setBalance(double b) {
22 balance = b;
23}
24
25// Adds the given amount to the balance
26void Account::credit(double amount) {
27 balance += amount;
28}
29
30// Subtract the given amount from the balance

DS Using C++ Page 31


31void Account::debit(double amount) {
32 if (amount <= balance) {
33 balance -= amount;
34 } else {
35 cout << "Amount withdrawn exceeds the current balance!" << endl;
36 }
37}
38
39// Print description for this Account instance
40void Account::print() const {
41 cout << fixed << setprecision(2);
42 cout << "A/C no: " << accountNumber << " Balance=$" << balance << endl;
43}
Test Driver - TestAccount.cpp
1/* Test Driver for Account class (TestAccount.cpp) */
2#include <iostream>
3#include "Account.h"
4
5using namespace std;
6
7int main() {
8 Account a1(8111, 99.99);
9 a1.print(); // A/C no: 8111 Balance=$99.99
10 a1.credit(20);
11 a1.debit(10);
12 a1.print(); // A/C no: 8111 Balance=$109.99
13
14 Account a2(8222); // default balance
15 a2.print(); // A/C no: 8222 Balance=$0.00
16 a2.setBalance(100);
17 a2.credit(20);
18 a2.debit(200); // Amount withdrawn exceeds the current balance!
19 a2.print(); // A/C no: 8222 Balance=$120.00
20 return 0;
21}

DS Using C++ Page 32


8. Example: The Ball class

A Ball class models a moving ball, designed as shown in the class diagram, contains the following
members:
 Four private data members x, y, xSpeed and ySpeed to maintain the position and speed of the ball.
 A constructor, and public getters and setters for the private data members.
 A function setXY(), which sets the position of the ball and setXYSpeed() to set the speed of the
ball.
 A function move(), which increases x and y by xSpeed and ySpeed, respectively.
 A function print(), which prints "Ball @ (x,y) with speed (xSpeed,ySpeed)", to 2 decimal
places.
Header File - Ball.h
1/* Header for the Ball class (Ball.h) */
2#ifndef BALL_H
3#define BALL_H
4
5class Ball {
6private:
7 double x, y; // Position of the ball
8 double xSpeed, ySpeed; // Speed of the ball
9
10public:
11 Ball(double x = 0.0, double y = 0.0, // Constructor with default values
12 double xSpeed = 0.0, double ySpeed = 0.0);
13 double getX() const;

DS Using C++ Page 33


14 void setX(double x);
15 double getY() const;
16 void setY(double y);
17 double getXSpeed() const;
18 void setXSpeed(double xSpeed);
19 double getYSpeed() const;
20 void setYSpeed(double ySpeed);
21 void setXY(double x, double y);
22 void setXYSpeed(double xSpeed, double ySpeed);
23 void move();
24 void print() const;
25};
26
27#endif
Implementation File - Ball.cpp
1/* Implementation for the Ball Class (Ball.cpp) */
2#include <iostream>
3#include <iomanip>
4#include "Ball.h" // include header of Ball class
5using namespace std;
6
7// Constructor with default values. No input validation
8Ball::Ball(double x, double y, double xSpeed, double ySpeed)
9 : x(x), y(y), xSpeed(xSpeed), ySpeed(ySpeed) { } // use member initializer list
10
11// public getters/setters for private data members
12double Ball::getX() const {
13 return x;
14}
15double Ball::getY() const {
16 return y;
17}
18void Ball::setX(double x) {
19 this->x = x;
20}
21void Ball::setY(double y) {
22 this->y = y;
23}
24double Ball::getXSpeed() const {
25 return xSpeed;
26}
27double Ball::getYSpeed() const {
28 return ySpeed;
29}
30void Ball::setXSpeed(double xSpeed) {
31 this->xSpeed = xSpeed;
32}
33void Ball::setYSpeed(double ySpeed) {

DS Using C++ Page 34


34 this->ySpeed = ySpeed;
35}
36
37// Set position (x,y)
38void Ball::setXY(double x, double y) {
39 this->x = x;
40 this->y = y;
41}
42
43// Set speed (xSpeed,ySpeed)
44void Ball::setXYSpeed(double xSpeed, double ySpeed) {
45 this->xSpeed = xSpeed;
46 this->ySpeed = ySpeed;
47}
48
49// Move the ball by increases x and y by xSpeed and ySpeed
50void Ball::move() {
51 x += xSpeed; // increment x by xSpeed
52 y += ySpeed; // increment y by ySpeed
53}
54
55// Print a description about this Ball instance
56void Ball::print() const {
57 cout << fixed << setprecision(2);
58 cout << "Ball @ (" << x << ',' << y << ") with speed ("
59 << xSpeed << ',' << ySpeed << ')' << endl;
60}
Test Driver - TestBall.cpp
1/* Test Driver for the Ball class (TestBall.cpp) */
2#include <iostream>
3#include "Ball.h" // include header of Ball class
4using namespace std;
5
6int main() {
7 Ball ball;
8 ball.print(); // Ball @ (0.00,0.00) with speed (0.00,0.00)
9 ball.setXY(1.1, 2.2);
10 ball.setXYSpeed(3.3, 4.4);
11 ball.print(); // Ball @ (1.10,2.20) with speed (3.30,4.40)
12 ball.setX(5.5);
13 ball.setY(6.6);
14 cout << "x is " << ball.getX() << endl; // x is 5.50
15 cout << "y is " << ball.getY() << endl; // y is 6.60
16 ball.move();
17 ball.print(); // Ball @ (8.80,11.00) with speed (3.30,4.40)
18}

DS Using C++ Page 35


9. Example: The Author and Book Classes (for a Bookstore)
9.1 Let's start with the Author class

Let's begin with a class called Author, designed as shown in the class diagram. It contains:
 Three private data members: name (string), email (string),
and gender (char of 'm', 'f' or 'u' for unknown).
 A constructor to initialize the name, email and gender with the given values. There are no default
values for data members.
 Getters for name, email and gender, and setter for email. There is no setter for name and gender as
we assume that these attributes cannot be changed.
 A print() member function that prints "name (gender) at email", e.g., "Peter Jones (m) at
[email protected]".
Header File - Author.h
1/* Header for the Author class (Author.h) */
2#ifndef AUTHOR_H
3#define AUTHOR_H
4
5#include <string>
6using namespace std;
7
8class Author {
9private:
10 string name;
11 string email;
12 char gender; // 'm', 'f', or 'u' for unknown
13
14public:
15 Author(string name, string email, char gender);
16 string getName() const;
17 string getEmail() const;
18 void setEmail(string email);
19 char getGender() const;
20 void print() const;
21};
22

DS Using C++ Page 36


23#endif
Implementation File - Author.cpp
1/* Implementation for the Author class (Author.cpp) */
2#include <iostream>
3#include "Author.h"
4using namespace std;
5
6// Constructor, with input validation
7Author::Author(string name, string email, char gender) {
8 this->name = name;
9 setEmail(email); // Call setter to check for valid email
10 if (gender == 'm' || gender == 'f') {
11 this->gender = gender;
12 } else {
13 cout << "Invalid gender! Set to 'u' (unknown)." << endl;
14 this->gender = 'u';
15 }
16}
17
18string Author::getName() const {
19 return name;
20}
21
22string Author::getEmail() const {
23 return email;
24}
25
26void Author::setEmail(string email) {
27 // Check for valid email. Assume that a valid email contains
28 // a '@' that is not the first nor last character.
29 size_t atIndex = email.find('@');
30 if (atIndex != string::npos && atIndex != 0 && atIndex != email.length()-1) {
31 this->email = email;
32 } else {
33 cout << "Invalid email! Set to empty string." << endl;
34 this->email = "";
35 }
36}
37
38char Author::getGender() const {
39 return gender;
40}
41
42// print in the format "name (gender) at email"
43void Author::print() const {
44 cout << name << " (" << gender << ") at " << email << endl;
45}

DS Using C++ Page 37


Dissecting the Author.cpp
Author::Author(string name, string email, char gender) {
this->name = name;
setEmail(email);
In this example, we use identifier name in the function's parameter, which crashes with the data member's
identifier name. To differentiate between the two identifiers, we use the keyword this, which is a pointer
to this instance. this->name refers to the data member; while name refers to the function's parameter.
No input validation is done on the parameter name. On the other hand, for email, we invoke
setter setEmail() which performs input validation.
if (gender == 'm' || gender == 'f') {
this->gender = gender;
} else {
cout << "Invalid gender! Set to 'u' (unknown)." << endl;
this->gender = 'u';
}
}
We validate the input for gender ('m', 'f', or 'u' for unknown). We assign 'u' for any other inputs.
void Author::setEmail(string email) {
size_t found = email.find('@');
if (found != string::npos && found != 0 && found != email.length()-1) {
this->email = email;
} else {
cout << "Invalid email! Set to empty string." << endl;
this->email = "";
}
}
To validate email, we assume that there is an '@' which is not the first or last character (there are other
stricter email validation criteria). We use the string class function find() to find the position of the
character '@', which returns a value of type size_t (typically same as unsigned int). The
function find() returns a special constant string::npos (which is typically set to -1) to indicate "not
found"; 0 for the first character and length()-1 for the last character (where string's
function length() returns the length of the string).
TestAuthor.cpp
1/* Test Driver for the Author class (TestAuthor.cpp) */
2#include "Author.h"
3
4int main() {
5 // Declare and construct an instance of Author
6 Author peter("Peter Jones", "[email protected]", 'm');
7 peter.print();
8 // Peter Jones (m) at [email protected]
9 peter.setEmail("[email protected]");
10 peter.print();
11 // Peter Jones (m) at [email protected]
12
13 Author paul("Paul Jones", "@somewhere.com", 'n');
14 // Invalid email! Set to empty string.
15 // Invalid gender! Set to 'u' (unknown).
16 paul.setEmail("paul@");
17 // Invalid email! Set to empty string.
18 paul.print();
19 // Paul Jones (u) at
DS Using C++ Page 38
20}
9.2 A Book is written by an Author - Using an "Object" Data Member

Let's design a Book class. Assume that a book is written by one and only one author. The Book class (as
shown in the class diagram) contains the following members:
 Four private data members: name (string), author (an instance of the class Author that we have
created earlier), price (double), and qtyInStock (int, with default value of 0). The price shall be
positive and the qtyInStock shall be zero or positive.
Take note that data member author is an instance (object) of the class Author, instead of a
fundamental types (such as int, double). In fact, name is an object of the class string too.
 The public getters and setters for the private data members. Take note that getAuthor() returns
an object (an instance of class Author).
 A public member function print(), which prints "'book-name' by author-name (gender) @ email".
 A public member function getAuthorName(), which returns the name of the author of
this Book instance.
The hallow diamond shape in the class diagram denotes aggregation (or has-a) association relationship.
That is, a Book instance has one (and only one) Author instance as its component.
Header File - Book.h
1/* Header for the class Book (Book.h) */
2#ifndef BOOK_H
3#define BOOK_H
4
5#include <string>
6#include "Author.h" // Use the Author class
7using namespace std;
8
9class Book {
10private:
11 string name;
12 Author author; // data member author is an instance of class Author
13 double price;
14 int qtyInStock;
DS Using C++ Page 39
15
16public:
17 Book(string name, Author author, double price, int qtyInStock = 0);
18 // To recieve an instance of class Author as argument
19 string getName() const;
20 Author getAuthor() const; // Returns an instance of the class Author
21 double getPrice() const;
22 void setPrice(double price);
23 int getQtyInStock() const;
24 void setQtyInStock(int qtyInStock);
25 void print() const;
26 string getAuthorName() const;
27};
28
29#endif
#include "Author.h"
We need to include the "Author.h" header, as we use the Author class in this class Book.
private:
Author author;
We declare a private data member author as an instance of class Author, defined earlier.
Implementation File - Book.cpp
1/* Implementation for the class Book (Book.cpp) */
2#include <iostream>
3#include "Book.h"
4using namespace std;
5
6// Constructor, with member initializer list to initialize the
7// component Author instance
8Book::Book(string name, Author author, double price, int qtyInStock)
9 : name(name), author(author) { // Must use member initializer list to construct ob
10 // Call setters to validate price and qtyInStock
11 setPrice(price);
12 setQtyInStock(qtyInStock);
13}
14
15string Book::getName() const {
16 return name;
17}
18
19Author Book::getAuthor() const {
20 return author;
21}
22
23double Book::getPrice() const {
24 return price;
25}
26
27// Validate price, which shall be positive
28void Book::setPrice(double price) {
DS Using C++ Page 40
29 if (price > 0) {
30 this->price = price;
31 } else {
32 cout << "price should be positive! Set to 0" << endl;
33 this->price = 0;
34 }
35}
36
37int Book::getQtyInStock() const {
38 return qtyInStock;
39}
40
41// Validate qtyInStock, which cannot be negative
42void Book::setQtyInStock(int qtyInStock) {
43 if (qtyInStock >= 0) {
44 this->qtyInStock = qtyInStock;
45 } else {
46 cout << "qtyInStock cannot be negative! Set to 0" << endl;
47 this->qtyInStock = 0;
48 }
49}
50
51// print in the format ""Book-name" by author-name (gender) at email"
52void Book::print() const {
53 cout << "'" << name << "' by ";
54 author.print();
55}
56
57// Return the author' name for this Book
58string Book::getAuthorName() const {
59 return author.getName(); // invoke the getName() on instance author
60}
Book::Book(string name, Author author, double price, int qtyInStock)
: name(name), author(author) {
setPrice(price);
setQtyInStock(qtyInStock);
}
In the constructor, the caller is supposed to create an instance of Author, and pass the instance into the
constructor. We use member initializer list to initialize data members name and author. We call setters in
the body, which perform input validation to set the price and qtyInStock. The body is run after the
member initializer list. The author(author) invokes the default copy constructor of the Author class,
which performs memberwise copy for all the data members. Object data member shall be constructed via
the member initializer list, not in the body. Otherwise, the default constructor will be invoked to construct
the object.
void Book::setPrice(double price) {
if (price > 0) {
this->price = price;
} else {
cout << "price should be positive! Set to 0" << endl;
this->price = 0;
}
DS Using C++ Page 41
}
The setter for price validates the given input.
string Book::getAuthorName() const {
return author.getName();
}
Invoke the getName() of the data member author, which returns the author's name of this Book instance.
TestBook.cpp
1/* Test Driver for the Book class (TestBook.cpp) */
2#include <iostream>
3#include "Book.h"
4using namespace std;
5
6int main() {
7 // Declare and construct an instance of Author
8 Author peter("Peter Jones", "[email protected]", 'm');
9 peter.print(); // Peter Jones (m) at [email protected]
10
11 // Declare and construct an instance of Book
12 Book cppDummy("C++ for Dummies", peter, 19.99);
13 cppDummy.setQtyInStock(88);
14 cppDummy.print();
15 // 'C++ for Dummies' by Peter Jones (m) at [email protected]
16
17 cout << cppDummy.getQtyInStock() << endl; // 88
18 cout << cppDummy.getPrice() << endl; // 19.99
19 cout << cppDummy.getAuthor().getName() << endl; // "Peter Jones"
20 cout << cppDummy.getAuthor().getEmail() << endl; // "[email protected]"
21 cout << cppDummy.getAuthorName() << endl; // "Peter Jones"
22
23 Book moreCpp("More C++ for Dummies", peter, -19.99);
24 // price should be positive! Set to 0
25 cout << moreCpp.getPrice() << endl; // 0
26}
The Default Copy Constructor
The initializer author(author) in the constructor invokes the so-called copy constructor. A copy
constructor creates a new instance by copying the given instance of the same class. If you do not provide
a copy constructor in your class, C++ provides a default copy constructor, which construct a new object
via memberwise copy. For example, for Author class, the default copy constructor provided by the
compiler is as follows:
// Default copy constructor of Author class provided by C++
Author::Author(const Author& other)
: name(other.name), email(other.email), gender(other.gender) { } // memberwise copy

9.3 Pass-by-Reference for Objects Function


Parameters Author and string
By default, objects are pass-by-value into functions. That is, a clone copy is created and pass into the
function, instead of the original copy. Pass-by-value for huge objects depicts performance due to the
overhead of creating a clone copy.

DS Using C++ Page 42


Instead, we could pass an object into function by reference, via the reference (&) declaration in the
parameter list. If we do not intend to modify the object inside the function (with side effect to the original
copy), we set it as const.
In the Book class, data members of string and Author are objects. Author class was defined
earlier; string is a class provided in C++ header <string>, belonging to the namespace std. Instead of
including "using namespace std;" in the header (which is a poor practice as this statement will be
included in all the files using this header), we shall use the scope resolution operator and refer to it
as std::string.
Let's modify our Book class to illustrate pass-by-reference (for performance).
Author.h
1/* Header for Author class (Author.h) */
2#ifndef AUTHOR_H
3#define AUTHOR_H
4
5#include <string>
6
7class Author {
8private:
9 std::string name;
10 std::string email;
11 char gender; // 'm', 'f', or 'u' for unknown
12
13public:
14 Author(const std::string & name, const std::string & email, char gender);
15 // & specifies pass by reference, const for non-mutable
16 std::string getName() const;
17 std::string getEmail() const;
18 void setEmail(const std::string & email);
19 char getGender() const;
20 void print() const;
21};
22
23#endif

Program Notes:
 In C++, string is a class in the standard library (in header <string>, belonging to namespace std),
just like Point, Circle classes that we have defined.
 Instead of including "using namespace std;", which is a poor practice as this statement will be
included in all the files using this header, we use the fully-qualified name std::string.
 Instead of passing string objects by value into function, which affects performance as a clone copy
needs to be made. We pass the string objects by reference (indicated by &).
 However, in pass-by-reference, changes inside the function will affect the caller's copy outside the
function.
 If we do not intend to change the object inside the function, we could use keyword const to indicate
immutability. If the object is inadvertently changed inside the function, compiler would issue an error.
Author.cpp
1/* Implementation for the Author class (Author.cpp) */
2#include <iostream>
3#include "Author.h"
4using namespace std;

DS Using C++ Page 43


5
6// Constructor, with input validation
7Author::Author(const string & name, const string & email, char gender) : name(name) {
8 setEmail(email); // Call setter to check for valid email
9 if (gender == 'm' || gender == 'f') {
10 this->gender = gender;
11 } else {
12 cout << "Invalid gender! Set to 'u' (unknown)." << endl;
13 this->gender = 'u';
14 }
15}
16
17string Author::getName() const {
18 return name;
19}
20
21string Author::getEmail() const {
22 return email;
23}
24
25void Author::setEmail(const string & email) {
26 // Check for valid email. Assume that a valid email contains
27 // a '@' that is not the first nor last character.
28 size_t atIndex = email.find('@');
29 if (atIndex != string::npos && atIndex != 0 && atIndex != email.length()-1) {
30 this->email = email;
31 } else {
32 cout << "Invalid email! Set to empty string." << endl;
33 this->email = "";
34 }
35}
36
37char Author::getGender() const {
38 return gender;
39}
40
41// print in the format "name (gender) at email"
42void Author::print() const {
43 cout << name << " (" << gender << ") at " << email << endl;
44}

Program Notes:
 Author::Author(const string & name, const string & email, char gender)
{ ...... }
In the constructor, the string objects are passed by reference. This improves the performance as it
eliminates the need of creating a temporary (clone) object. The constructor then invokes the copy
constructor of the string class to memberwise copy the arguments into its data
members name and email.
We make the parameters const to prevent them from modifying inside the function (with side effect
to the original copies).

DS Using C++ Page 44


Book.h
1/* Header for the class Book (Book.h) */
2#ifndef BOOK_H
3#define BOOK_H
4
5#include <string>
6#include "Author.h" // Use the Author class
7using namespace std;
8
9class Book {
10private:
11 string name;
12 Author author;
13 double price;
14 int qtyInStock;
15
16public:
17 Book(const string & name, const Author & author, double price, int qtyInStock = 0);
18 string getName() const;
19 Author getAuthor() const;
20 double getPrice() const;
21 void setPrice(double price);
22 int getQtyInStock() const;
23 void setQtyInStock(int qtyInStock);
24 void print() const;
25 string getAuthorName() const;
26};
27
28#endif

Program Notes:
 Book(const string & name, const Author & author, double price, int qtyInStock
= 0);
string and Author objects are passed into the constructor via reference. This improves performance
as it eliminates the creation of a temporary clone copy in pass-by-value. The parameters are
marked const as we do not intend to modify them inside the function (with side effect to the original
copies).
 Author getAuthor() const;
The getter returns a copy of the data member author.
Book.cpp
1/* Implementation for the class Book (Book.cpp) */
2#include <iostream>
3#include "Book.h"
4using namespace std;
5
6// Constructor, with member initializer list to initialize the
7// component Author instance
8Book::Book(const string & name, const Author & author, double price, int qtyInStock)
9 : name(name), author(author) { // Init object reference in member initializer list

DS Using C++ Page 45


10 // Call setters to validate price and qtyInStock
11 setPrice(price);
12 setQtyInStock(qtyInStock);
13}
14
15string Book::getName() const {
16 return name;
17}
18
19Author Book::getAuthor() {
20 return author;
21}
22
23double Book::getPrice() const {
24 return price;
25}
26
27// Validate price, which shall be positive
28void Book::setPrice(double price) {
29 if (price > 0) {
30 this->price = price;
31 } else {
32 cout << "price should be positive! Set to 0" << endl;
33 this->price = 0;
34 }
35}
36
37int Book::getQtyInStock() const {
38 return qtyInStock;
39}
40
41// Validate qtyInStock, which cannot be negative
42void Book::setQtyInStock(int qtyInStock) {
43 if (qtyInStock >= 0) {
44 this->qtyInStock = qtyInStock;
45 } else {
46 cout << "qtyInStock cannnot be negative! Set to 0" << endl;
47 this->qtyInStock = 0;
48 }
49}
50
51// print in the format ""Book-name" by author-name (gender) at email"
52void Book::print() const {
53 cout << "'" << name << "' by ";
54 author.print();
55}
56
57// Return the author' name for this Book
58string Book::getAuthorName() const {

DS Using C++ Page 46


59 return author.getName(); // invoke the getName() on instance author
60}
 Book::Book(const string & name, Author & author, double price, int
qtyInStock)
: name(name), author(author) { ...... }
name(name) and author(author) invoke the default copy constructors to construct new instances
of string and Author by memberwise copy the parameters.
 Author Book::getAuthor() { return author; }
A copy of the data member author is returned to the caller.
You should avoid returning a reference of a private data member to the caller (e.g., Author &
Book::getAuthro() { return author; }), as the caller can change the private data member via the
reference, which breaks the concept of "information hiding and encapsulation".
Test Driver - TestBook.cpp
1/* Test Driver for the Book class (TestBook.cpp) */
2#include <iostream>
3#include "Book.h"
4using namespace std;
5
6int main() {
7 // Declare and construct an instance of Author
8 Author peter("Peter Jones", "[email protected]", 'm');
9 peter.print(); // Peter Jones (m) at [email protected]
10
11 // Declare and construct an instance of Book
12 Book cppDummy("C++ for Dummies", peter, 19.99);
13 cppDummy.print();
14 // 'C++ for Dummies' by Peter Jones (m) at [email protected]
15
16 peter.setEmail("[email protected]");
17 peter.print(); // Peter Jones (m) at [email protected]
18 cppDummy.print();
19 // 'C++ for Dummies' by Peter Jones (m) at [email protected]
20
21 cppDummy.getAuthor().setEmail("[email protected]");
22 cppDummy.print();
23 // 'C++ for Dummies' by Peter Jones (m) at [email protected]
24}
In the above test program, an instance of Author called peter is constructed (in Line 8). This instance is
passed by reference into Book's constructor (Line 12) to create the Book's instance cppDummy.
Summary
All the codes in this version of example (using references) is exactly the same as the previous version
(without using references), except that the object function parameters are marked with
"const classname &" (e.g., const string &, const Author &). This eliminates the creation of temporary
clone object as in the pass-by-value, which improves the performance. Take note that the constructor
actually invokes the copy constructor to make a copy for its data member, instead of referencing the
copy provided by the caller.

Unit II
Priority Queues: Definition, realizing a Priority Queue using Heaps, Insertion, Deletion, Heap sort, External Sorting

DS Using C++ Page 47


Priority Queue
DEFINITION:
A priority queue is a collection of zero or more elements. Each element has a priority or value.
Unlike the queues, which are FIFO structures, the order of deleting from a priority queue is determined
by the element priority.
Elements are removed/deleted either in increasing or decreasing order of priority rather than in the order in
which they arrived in the queue.
There are two types of priority queues:
Min priority queue

Max priority queue

Min priority queue: Collection of elements in which the items can be inserted arbitrarily, but only smallest
element can be removed.

Max priority queue: Collection of elements in which insertion of items can be in any order but only largest
element can be removed.
In priority queue, the elements are arranged in any order and out of which only the smallest or
largest element allowed to delete each time.
The implementation of priority queue can be done using arrays or linked list. The data structure heap
is used to implement the priority queue effectively.
APPLICATIONS:
1. The typical example of priority queue is scheduling the jobs in operating system. Typically OS allocates
priority to jobs. The jobs are placed in the queue and position of the job in priority queue determines
their priority. In OS there are 3 jobs- real time jobs, foreground jobs and background jobs. The OS always
schedules the real time jobs first. If there is no real time jobs pending then it schedules foreground jobs.
Lastly if no real time and foreground jobs are pending then OS schedules the background jobs.
2. In network communication, the manage limited bandwidth for transmission the priority queue is used.
3. In simulation modeling to manage the discrete events the priority queue is
used. Various operations that can be performed on priority queue are-
1. Find an element
2. Insert a new element
3. Remove or delete an element
The abstract data type specification for a max priority queue is given below. The specification for a min
priority queue is the same as ordinary queue except while deletion, find and remove the element with
minimum priority

ABSTRACT DATA TYPE(ADT):


Abstract data type maxPriorityQueue
{
Instances
Finite collection of elements, each has a priority
Operations empty():return true iff the queue is

DS Using C++ Page 48


empty size() :return number of elements in the
queue
top() :return element with maximum priority
del() :remove the element with largest priority from the queue
insert(x): insert the element x into the queue
}
HEAPS

Heap is a tree data structure denoted by either a max heap or a min heap.

A max heap is a tree in which value of each node is greater than or equal to value of its children nodes. A
min heap is a tree in which value of each node is less than or equal to value of its children nodes.
18 4

12 4 12 14

11 10 18 20

Max heap Min heap

Insertion of element in the Heap:

Consider a max heap as given below:

Now if we want to insert 7. We cannot insert 7 as left child of 4. This is because the max heap has a property that
value of any node is always greater than the parent nodes. Hence 7 will bubble up 4 will be left child of 7.

DS Using C++ Page 49


Note: When a new node is to be inserted in complete binary tree we start from bottom and from left child
on the current level. The heap is always a complete binary tree.

18

12 7 inserted!

11
10 4

If we want to insert node 25, then as 25 is greatest element it should be the root. Hence 25 will bubble up and
18 will move down.
25 inserted!

12 18

11
10 4

The insertion strategy just outlined makes a single bubbling pass from a leaf toward the root. At each level
we do (1) work, so we should be able to implement the strategy to have complexity O(height) = O(log n).

void Heap::insert(int item)


{
int temp; //temp node starts at leaf and moves up.
temp=++size;
while(temp!=1 && //moving element down
heap[temp/2]<item)
{
H[temp] = H[temp/2]; temp=temp/2;
//finding the parent
}
H[temp]=item;
}

DS Using C++ Page 50


Deletion of element from the heap:

For deletion operation always the maximum element is deleted from heap. In Max heap the maximum
element is always present at root. And if root element is deleted then we need to reheapify the tree.

Consider a Max heap

25

12 18

11
10 4

Delete root element:25, Now we cannot put either 12 or 18 as root node and that should be greater than all its
children elements.

18

12 4

11 10

Now we cannot put 4 at the root as it will not satisfy the heap property. Hence we will bubble up 18 and place 18 at
root, and 4 at position of 18.

If 18 gets deleted then 12 becomes root and 11 becomes parent node of 10.

Thus deletion operation can be performed. The time complexity of deletion operation is O(log n).
1. Remove the maximum element which is present at the root. Then a hole is created at the root.
2. Now reheapify the tree. Start moving from root to children nodes. If any maximum element is found
then place it at root. Ensure that the tree is satisfying the heap property or not.
3. Repeat the step 1 and 2 if any more elements are to be deleted.

void heap::delet(int item)

int item, temp;


if(size==0)
DS Using C++ Page 50
cout<<‖Heap is empty\n‖; else
{

//remove the last elemnt and


reheapify item=H[size--];
//item is placed at root
temp=1; child=2;
while(child<=size)
{

For deletion operation always the maximum element is deleted from heap. In Max heap the maximum
element is always present at root. And if root element is deleted then we need to reheapify the tree.

Consider a Max heap

25

12 18

11
10 4

Delete root element:25, Now we cannot put either 12 or 18 as root node and that should be greater than all its
children elements.

18

12 4

11 10

Now we cannot put 4 at the root as it will not satisfy the heap property. Hence we will bubble up 18 and place 18 at
root, and 4 at position of 18.

If 18 gets deleted then 12 becomes root and 11 becomes parent node of 10.

DS Using C++ Page 51


Thus deletion operation can be performed. The time complexity of deletion operation is O(log n).
4. Remove the maximum element which is present at the root. Then a hole is created at the root.
5. Now reheapify the tree. Start moving from root to children nodes. If any maximum element is found
then place it at root. Ensure that the tree is satisfying the heap property or not.
6. Repeat the step 1 and 2 if any more elements are to be deleted.

void heap::delet(int item)

int item, temp;


if(size==0)
cout<<‖Heap is empty\n‖; else
{
//remove the last elemnt and
reheapify item=H[size--];
//item is placed at root temp=1;
child=2;
while(child<=size)
{
if(child<size && H[child]<H[child+1]) child++;
if(item>=H[child])
break;
H[temp]=H[child];
temp=child;
child=child*2;
}
//pl;ace the largest item at
root H[temp]=item;
}
Applications Of Heap:
1. Heap is used in sorting algorithms. One such algorithm using heap is known as heap sort.

2. In priority queue implementation the heap is used.

HEAP SORT
Heap sort is a method in which a binary tree is used. In this method first the heap is created using binary tree and
then heap is sorted using priority queue.

DS Using C++ Page 52


Eg:

25 57 48 38 10 91 84 33

In the heap sort method we first take all these elements in the array ―A‖

A[0] A[1] A[2] A[3] A[4] A[5] A[6] A[7]


25 57 48 38 10 91 84 33

Now start building the heap structure. In forming the heap the key point is build heap in such a way
that the highest value in the array will always be a root.

Insert 25

DS Using C++ Page 53


The next element is 84, which 91>84>57 the middle element. So 84 will be the parent of 57. For
making the complete binary tree 57 will be attached as right of 84.

DS Using C++ Page 54


Now the heap is formed. Let us sort it. For sorting the heap remember two main things the first thing is that the
binary tree form of the heap should not be distributed at all. For the complete sorting binary tree should be
remained. And the second thing is that we will start sorting the higher elements at the end of array in sorted
manner i.e..
A[7]=91, A[6]=84 and so on..
Step 1:- Exchange A[0] with A[7]

DS Using C++ Page 55


DS Using C++ Page 52
ep 5:-Exchane A[0] with A[2]
// If largest is not
root if (largest != i)
{

swap(&arr[i], &arr[largest]);

// Recursively heapify the affected


sub- tree heapify(arr, n, largest);
}
}

// function to do heap sort


void heapSort(int arr[], int
n)
{ int i;
// Build heap (rearrange array)
for ( i = n / 2 - 1; i >= 0; i--)

heapify(arr, n, i);

// One by one extract an element


from heap for ( i=n-1; i>=0; i--)
{
// Move current root to
end swap(&arr[0], &arr[i]);

// call max heapify on the reduced


heap heapify(arr, i, 0);
}
}
/* A utility function to print array of size n */
void printArray(int arr[], int n)
{
for (int i=0; i<n; ++i) cout
<< arr[i] << " ";
cout << "\n";
}
int main()
{
int n,i;
int list[30];
cout<<"enter no of elements\
n"; cin>>n;

cout<<"enter "<<n<<" numbers ";


for(i=0;i<n;i++)
cin>>list[i];
heapSort(list,
n);

cout << "Sorted array is \n";


printArray(list, n);
return 0;
}
EXTERNAL SORTING

All the algorithms require that the input fit into main memory. There are, some applications
where the input is much too large to fit into memory.
To do so, external sorting algorithms are designed to handle very large inputs.
Internal sorting deals with the ordering of records of a file in the ascending or descending
order when the whole file or list is compact enough to be accommodate in the internal
memory of the computer.
In many applications and problems it is quite common to encounter huge files comprising
millions of records which need to be sorted for their effective use in the application concerned.

The application domains of e-governance, digital library, search engines, on-line telephone
directory and electoral system, to list a few, deal with voluminous files of records.

Majority of the internal sorting techniques are virtually incapable of sorting large files since they require the whole
file in the internal memory of the computer, which is impossible. Hence the need for external sorting methods
which are exclusive strategies to sort huge files.

External sorting is a term for a class of sorting algorithms that can handle massive amounts of data. External
sorting is required when the data being sorted do not fit into the main memory of a computing device (usually
RAM) and instead they must reside in the slower external memory (usually a hard drive). External sorting typically
uses a hybrid sort-merge strategy. In the sorting phase, chunks of data small enough to fit in main memory are read,
sorted, and written out to a temporary file. In the merge phase, the sorted sub-files are combined into a single
larger file.

One example of external sorting is the external merge sort algorithm, which sorts chunks that each fit
in RAM, then merges the sorted chunks together. We first divide the file into runs such that the size of a run is small
enough to fit into main memory. Then sort each run in main memory using merge sort sorting algorithm. Finally
merge the resulting runs together into successively bigger runs, until the file is sorted.

The principle behind external sorting

Due to their large volume, the files are stored in external storage devices such as tapes, disks
or drums.
The external sorting strategies therefore need to take into consideration the kind of medium on
which the files reside, since these influence their work strategy.

A common principal behind most popular external sorting methods is outlined below:

1. Internally sort batches of records from the source file to generate runs. Write out the runs as
and when they are generated on to the external storage devices.
2. Merge the runs generated in the earlier phase, to obtain larger but fewer runs, and write them
out onto the external storage devices.
3. Repeat the run generated and merge, until in the final phase only one run gets generated, on which

Block of
Intermed
Second
iate fies
ary
Main

DICTIONARIES:
Dictionary is a collection of pairs of key and value where every value is associated with the
corresponding key.
Basic operations that can be performed on dictionary are:
1. Insertion of value in the dictionary
2. Deletion of particular value from dictionary
3. Searching of a specific value with the help of key

Linear List Representation

The dictionary can be represented as a linear list. The linear list is a collection of pair and
value. There are two method of representing linear list.
1. Sorted Array- An array data structure is used to implement the dictionary.
2. Sorted Chain- A linked list data structure is used to implement the dictionary

Structure of linear list for dictionary:

class dictionary

private:

int k,data;
struct
node
{
public: int key;
int value;
public: struct node *next;

} *head;

dictionary();
}; void insert_d( );
void delete_d( );
v id display_d( );
o void length();

Insertion of new node in the dictionary:

Consider that initially dictionary is empty then


head = NULL
We will create a new node with some key and value contained in it.
Now as head is NULL, this new node becomes head. Hence the dictionary contains only one
record. this node will be ‗curr‘ and ‗prev‘ as well. The ‗cuur‘ node will always point to current
visiting node and ‗prev‘ will always point to the node previous to ‗curr‘ node. As now there is
only one node in the list mark as ‗curr‘ node as ‗prev‘ node.

New/head/curr/prev

1 10 NULL

Insert a record, key=4 and value=20,

New

4 20 NULL

Compare the key value of ‗curr‘ and ‗New‘ node. If New->key > Curr->key then attach New
node to ‗curr‘ node.

prev/head New curr->next=New

prev=curr
1 10 20 NULL
4
Add a new node <7,80> then

head/prev curr New


1 10 4 20 80 NULL
7

If we insert <3,15> then we have to search for it proper position by comparing key

value. (curr->key < New->key) is false. Hence else part will get executed.

1 10 4 20
7 80 NULL

3 15
The delete operation:

Case 1: Initially assign ‗head‘ node as ‗curr‘ node.Then ask for a key value of the node which is
to be deleted. Then starting from head node key value of each jode is cked and compared with
the desired node‘s key value. We will get node which is to be deleted in variable ‗curr‘. The
node given by variable ‗prev‘ keeps track of previous node of ‗cuu‘ node. For eg, delete node
with key value 4 then

cur

1 10 3 15 4 20 7 80 NULL

se 2:

If the node to be deleted is head


node i.e.. if(curr==head)
Then, simply make ‗head‘ node as next node and delete ‗curr‘

curr hea
d
1 10 4 20 7 80 NULL
3 15

Hence the list becomes

head

3 15 4 20 7 80 NULL

SKIP LIST REPRESENTATION


Skip list is a variant list for the linked list. Skip lists are made up of a
series of nodes connected one after the other. Each node contains a key and value pair as well
as one or more references, or pointers, to nodes further along in the list. The number of
references each node contains is determined randomly. This gives skip lists their probabilistic
nature, and the number of references a node contains is called its node level.
There are two special nodes in the skip list one is head node which is the starting node of the
list and tail node is the last node of the list

1 2 3 4 5 6 7
head tail
node node
The skip list is an efficient implementation of dictionary using sorted chain. This is because
in skip list each node consists of forward references of more than one node at a time.

Eg:

null

Now to search any node from above given sorted chain we have to search the sorted chain
from head node by visiting each node. But this searching time can be reduced if we add one
level in every alternate node. This extra level contains the forward pointer of some
node. That means in sorted chain come nodes can holds pointers to more than one node.

NULL

If we want to search node 40 from above chain there we will require comparatively less
time. This search again can be made efficient if we add few more pointers forward

NULL

references.
skip list

Node structure of skip list:


template <class K, class E>
struct skipnode
{
typedef pair<const K,E> pair_type;
pair_type element;
skipnode<K,E> **next;
skipnode(const pair_type &New_pair, int MAX):element(New_pair)
{
next=new skipnode<K,E>*[MAX];
}
};

The individual node looks like this:

Key value array of pointer

Element *next

Searching:
The desired node is searched with the help of a key value.

template<class K, class E>


skipnode<K,E>* skipLst<K,E>::search(K& Key_val)
{
skipnode<K,E>* Forward_Node = header;
for(int i=level;i>=0;i--)
{
while (Forward_Node->next[i]->element.key < key_val)
Forward_Node = Forward_Node->next[i];

Searching for a key within a skip list begins with starting at header at the overall list level
and moving forward in the list comparing node keys to the key_val. If the node key is less than
the key_val, the search continues moving forward at the same level. If o the other hand, the
node key is equal to or greater than the key_val, the search drops one level and continues
forward. This process continues until the desired key_val has been found if it is present in the
skip list. If it is not, the search will either continue at the end of the list or until the first key
with a value greater than the search key is found.

Insertion:

There are two tasks that should be done before insertion operation:

1. Before insertion of any node the place for this new node in the skip list is searched.
Hence before any insertion to take place the search routine executes. The last[] array in
the search routine is used to keep track of the references to the nodes where the
search, drops down one level.

2. The level for the new node is retrieved by the routine randomelevel()
UNIT III
HASH TABLE REPRESENTATION
Hash table is a data structure used for storing and retrieving data very quickly. Insertion
of data in the hash table is based on the key value. Hence every entry in the hash table
is associated with some key.

Using the hash key the required piece of data can be searched in the hash table by few
or more key comparisons. The searching time is then dependent upon the size of the
hash table.

The effective representation of dictionary can be done using hash table. We can place
the dictionary entries in the hash table using hash function.

HASH FUNCTION
Hash function is a function which is used to put the data in the hash table. Hence one
can use the same hash function to retrieve the data from the hash table. Thus hash
function is used to implement the hash table.

The integer returned by the hash function is called hash key.

For example: Consider that we want place some employee records in the hash table The
record of employee is placed with the help of key: employee ID. The employee ID is a 7 digit
number for placing the record in the hash table. To place the record 7 digit number is
converted into 3 digits by taking only last three digits of the key.
th
If the key is 496700 it can be stored at 0 position. The second key 8421002, the record of
nd
those key is placed at 2 position in the array.
Hence the hash function will be- H(key) = key%1000
Where key%1000 is a hash function and key obtained by hash function is called hash key.
Bucket and Home bucket: The hash function H(key) is used to map several dictionary
entries in the hash table. Each position of the hash table is called bucket.

The function H(key) is home bucket for the dictionary with pair whose value is key.

TYPES OF HASH FUNCTION

There are various types of hash functions that are used to place the record in the hash table-

1. Division Method: The hash function depends upon the remainder


of division. Typically the divisor is table length.
For eg; If the record 54, 72, 89, 37 is placed in the hash table and if the table size is 10 then
h(key) = record % table size 0

1 72
54%10=4 2 54
72%10=2 3

89%10=9 4

37%10=7 5

8
9
2. Mid Square:
In the mid square method, the key is squared and the middle or mid part of the result is used
as the index. If the key is a string, it has to be preprocessed to produce a number. Consider
that if we want to place a record 3111 then
2
3111 = 9678321
for the hash table of size 1000
H(3111) = 783 (the middle 3 digits)

3. Multiplicative hash function:


The given record is multiplied by some constant value. The formula for computing the hash
key is-

H(key) = floor(p *(fractional part of key*A)) where p is integer constant and A is constant real
number.

Donald Knuth suggested to use constant A = 0.61803398987

If key 107 and p=50 then

H(key) = floor(50*(107*0.61803398987))
= floor(3306.4818458045)
= 3306
At 3306 location in the hash table the record 107 will be placed.

4. Digit Folding:
The key is divided into separate parts and using some simple operation these parts
are combined to produce the hash key.
For eg; consider a record 12365412 then it is divided into separate parts as 123 654 12
and these are added together

H(key) = 123+654+12
= 789
The record will be placed at location 789
5. Digit Analysis:
The digit analysis is used in a situation when all the identifiers are known in advance.
We first transform the identifiers into numbers using some radix, r. Then examine the digits of
each identifier. Some digits having most skewed distributions are deleted. This deleting of
digits is continued until the number of remaining digits is small enough to give an address in
the range of the hash table. Then these digits are used to calculate the hash address.
COLLISION

the hash function is a function that returns the key value using which the record can be placed
in the hash table. Thus this function helps us in placing the record in the hash table at
appropriate position and due to this we can retrieve the record directly from that location. This
function need to be designed very carefully and it should not return the same hash key
address for two different records. This is an undesirable situation in hashing.

Definition: The situation in which the hash function returns the same hash key (home bucket)
for more than one record is called collision and two same hash keys returned for different
records is called synonym.
Similarly when there is no room for a new pair in the hash table then such a situation is
called overflow. Sometimes when we handle collision it may lead to overflow conditions.
Collision and overflow show the poor hash functions.
0
For example,
1 131

Consider a hash function. 2 43


44
3
H(key) = recordkey%10 having the hash table size of 10 4 36
57
5 78
19
The record keys to be placed are 6

131, 44, 43, 78, 19, 36, 57 and 77 8


131%10=1 9
44%10=4
43%10=3
78%10=8
19%10=9
36%10=6
57%10=7
77%10=7
Now if we try to place 77 in the hash table then we get the hash key to be 7 and at index 7
already the record key 57 is placed. This situation is called collision. From the index 7 if we
look for next vacant position at subsequent indices 8.9 then we find that there is no room
to place 77 in the hash table. This situation is called overflow.

COLLISION RESOLUTION TECHNIQUES


If collision occurs then it should be handled by applying some techniques. Such a
technique is called collision handling technique.
1. Chaining
2. Open addressing (linear
probing) 3.Quadratic probing
4. Double hashing
5. Double hashing
6.Rehashing
CHAINING

In collision handling method chaining is a concept which introduces an additional field with
data i.e. chain. A separate chain table is maintained for colliding data. When collision occurs
then a linked list(chain) is maintained at the home bucket.

For eg;

Consider the keys to be placed in their home


buckets are 131, 3, 4, 21, 61, 7, 97, 8, 9

then we will apply a hash function as H(key) = key % D

Where D is the size of table. The hash table will be-

Here D = 10

0
131 21 61 NULL
1

3 NULL

131 61
NULL

7 97 NULL

A chain is maintained for colliding elements. for instance 131 has a home bucket (key) 1.
similarly key 21 and 61 demand for home bucket 1. Hence a chain is maintained at index 1.
OPEN ADDRESSING – LINEAR PROBING

This is the easiest method of handling collision. When collision occurs i.e. when two records
demand for the same home bucket in the hash table then collision can be solved by placing the
second record linearly down whenever the empty bucket is found. When use linear probing
(open addressing), the hash table is represented as a one-dimensional array with indices that
range from 0 to the desired table size-1. Before inserting any elements into this table, we must
initialize the table to represent the situation where all slots are empty. This allows us to detect
overflows and collisions when we inset elements into the table. Then using some suitable hash
function the element can be inserted into the hash table.

For example:

Consider that following keys are to be inserted in the hash table

131, 4, 8, 7, 21, 5, 31, 61, 9, 29

Initially, we will put the following keys in the hash table.

We will use Division hash function. That means the keys are placed using the formula

H(key) = key % tablesize


H(key) = key % 10

For instance the element 131 can be placed

at H(key) = 131 % 10
=1

Index 1 will be the home bucket for 131. Continuing in this fashion we will place 4, 8, 7.

Now the next key to be inserted is 21. According to the hash function

H(key)=21%10

H(key) = 1

But the index 1 location is already occupied by 131 i.e. collision occurs. To resolve this collision
we will linearly move down and at the next empty location we will prob the element.
Therefore 21 will be placed at the index 2. If the next element is 5 then we get the home
bucket for 5 as index 5 and this bucket is empty so we will put the element 5 at index 5.

Index 0
Key Key Key

1 NULL NULL NULL

131 131 131


2
NULL 21 21
3

after placing keys 31, 61


The next record key is 9. According to decision hash function it demands for the home bucket
9. Hence we will place 9 at index 9. Now the next final record key 29 and it hashes a key 9. But
home bucket 9 is already occupied. And there is no next empty bucket as the table size is
limited to index 9. The overflow occurs. To handle it we move back to bucket 0 and is the
th
location over there is empty 29 will be placed at 0 index.
Problem with linear probing:
One major problem with linear probing is primary clustering. Primary clustering is a process in which a block
of data is formed in the hash table when collision is resolved.
Key

39
19%10 = 9 cluster is formed
29
18%10 = 8 8

39%10 = 9
29%10 = 9
8%10 = 8

rest of the table is


18

19
empty this cluster problem can be solved by quadratic probing.

QUADRATIC PROBING:

Quadratic probing operates by taking the original hash value and adding successive values of
an arbitrary quadratic polynomial to the starting value. This method uses following formula.

2
H(key) = (Hash(key) + i ) % m)
where m can be table size or any prime number.

for eg; If we have to insert following elements in the hash table with table size 10: 37, 90, 55, 22,

17, 49, 87 0 90
11
1 22

37 % 10 = 7 2
55
90 % 10 = 0 3

55 % 10 = 5 4 37

22 % 10 = 2 5

11 % 10 = 1 6
7
Now if we want to place 17 a collision will occur as 17%10 = 7 and 8
bucket 7 has already an element 37. Hence we will apply 9
quadratic probing to insert this record in the hash
2
table. Hi (key) = (Hash(key) + i ) % m

Consider i = 0 then
2
(17 + 0 ) % 10 = 7
2
(17 + 1 ) % 10 = 8, when i =1

The bucket 8 is empty hence we will place the element at index 8. 0


90
Then comes 49 which will be placed at index 9. 1 11
22
2

49 % 10 = 9 3
55
4
37
5 49
6
7

9
Now to place 87 we will use quadratic probing.
0 90
(87 + 0) % 10 = 7 1 11
22
(87 + 1) % 10 = 8… but already occupied 2
2
(87 + 2 ) % 10 = 1.. already occupied 3 55
2
(87 + 3 ) % 10 = 6 4 87
37
5 49
It is observed that if we want place all the necessary elements 6
in the hash table the size of divisor (m) should be twice as 7
large as total number of elements. 8
9
DOUBLE HASHING
Double hashing is technique in which a second hash function is applied to the key when a
collision occurs. By applying the second hash function we will get the number of positions from
the point of collision to insert.
There are two important rules to be followed for the second
function: it must never evaluate to zero.
must make sure that all cells can be probed.
The formula to be used for double hashing is

H1(key) = key mod tablesize


H2(key) = M – (key mod M) Key

90

where M is a prime number smaller than the size of the table.


22

Consider the following elements to be placed in the hash table of size 10 37, 90, 45,
22, 17, 49, 55
45
Initially insert the elements using the formula for H1(key). Insert 37, 90,
45, 22

H1(37) = 37 % 10 = 7 37

H1(90) = 90 % 10 = 0
49
H1(45) = 45 % 10 = 5
H1(22) = 22 % 10 = 2
H1(49) = 49 % 10 = 9
Now if 17 to be inserted then
Key
90
H1(17) = 17 % 10 = 7
17
H2(key) = M – (key % M) 22

Here M is prime number smaller than the size of the table. Prime number
smaller than table size 10 is 7
45

Hence M = 7 H2(17)

= 7-(17 % 7) 37
ave to take
=7–3=4
49

That means we have to insert the element 17 at 4 places from 37. In short we
h 4 jumps. Therefore the 17 will be placed at index 1.

Now to insert number 55


Key

H1(55) = 55 % 10 =5 Collision 90

17
H2(55) = 7-(55 % 7) 22

=7–6=1

That means we have to take one jump from index 5 to place 55.
45
Finally the hash table will be - 55
Comparison of Quadratic Probing & Double Hashing

The double hashing requires another hash function whose probing efficiency is same
as some another hash function required when handling random collision.
The double hashing is more complex to implement than quadratic probing. The
quadratic probing is fast technique than double hashing.

REHASHING

Rehashing is a technique in which the table is resized, i.e., the size of table is doubled by
creating a new table. It is preferable is the total size of table is a prime number. There are
situations in which the rehashing is required.

When table is completely full


With quadratic probing when the table is filled
half. When insertions fail due to overflow.
In such situations, we have to transfer entries from old table to the new table by re
computing their positions using hash functions.

Consider we have to insert the elements 37, 90, 55, 22, 17, 49, and 87. the table size is 10
and will use hash function.,

H(key) = key mod tablesize

37 % 10 = 7
90 % 10= 0
55 % 10 = 5
22 % 10 = 2
17 % 10 = 7 Collision solved by linear probing
49 % 10 = 9

Now this table is almost full and if we try to insert more elements collisions will occur and
eventually further insertions will fail. Hence we will rehash by doubling the table size. The old
table size is 10 then we should double this size for new table, that becomes 20. But 20 is not a
prime number, we will prefer to make the table size as 23. And new hash function will be
H(key) key mod 23 0
90
1 11
22
37 % 23 = 14 2
90 % 23 = 21 3 55
55 % 23 = 9 4 87
37
22 % 23 = 22 5 49
17 % 23 = 17 6
49 % 23 = 3 7

87 % 23 = 18 8

10

11

12

13

14

15

16

17

18

19

20

21

22

23

Now the hash table is sufficiently large to accommodate new insertions.


Advantages:

1. This technique provides the programmer a flexibility to enlarge the table size if required.
2. Only the space gets doubled with simple hash function which avoids occurrence
of collisions.

EXTENSIBLE HASHING

Extensible hashing is a technique which handles a large amount of data. The data to
be placed in the hash table is by extracting certain number of bits.
Extensible hashing grow and shrink similar to B-trees.
In extensible hashing referring the size of directory the elements are to be placed
in buckets. The levels are indicated in parenthesis.

For eg: Directory

0
1
Levels
0) (1)
001
111
010 data to be
placed in
bucket

The bucket can hold the data of its global depth. If data in bucket is more than
global depth then, split the bucket and double the directory.

Consider we have to insert 1, 4, 5, 7, 8, 10. Assume each page can hold 2 data entries (2 is
the depth).

Step 1: Insert 1, 4
1 = 001

4 = 100

We will examine last bit


of data and insert the data
in bucket.
001

010
Insert 5. The bucket is full. Hence double the directory.
Thus the data is inserted using extensible hashing.

Deletion Operation:

If we wan tot delete 10 then, simply make the bucket of 10 empty.

00 01 10 11

(1) (2) (2)


100 001 111
1000 101

Delete 7.

00 01 10 11
(1) (1)
100 001 Note that the level was increased
1000 101 when we insert 7. Now on deletion
of 7, the level should get decremented.

Delete 8. Remove entry from directory 00.

00 00 10 11

(1) (1)
100 001

101
Applications of hashing:

1. In compilers to keep track of declared variables.


2. For online spelling checking the hashing functions are used.
3. Hashing helps in Game playing programs to store the moves made.
4. For browser program while caching the web pages, hashing is used.
5. Construct a message authentication code (MAC)
6. Digital signature.
7. Time stamping
8. Key updating: key is hashed at specific intervals resulting in new key

COMPARISON OF HASHING & SKIP LISTS

Hashing Skip List

This method is used to carry out dictionary Skip lists are used to implement dictionary
operations using randomized processes. operations using randomized process.

It is based on hash function It does not require hash function

If the sorted data is given then hashing is The sorted data improves the performance
not an effective method to implement of skip list.
dictionary.

The space requirement in hashing is for The forward pointers are required for every
hash table and a forward pointer is required level of skip list.
per node.

Hashing is an efficient method than skip The skip lists are not that mush efficient.
lists.

Skip lists are more versatile than hash Worst case space requirement is larger
table. for skip list than hashing.
Unit IV

TREES

A Tree is a data structure in which each element is attached to one or more elements directly beneath it.

Level 0

1
B
C D

E F G
H I J
2
K L
3

Terminology

The connections between elements are called branches.


A tree has a single root, called root node, which is shown at the top of the tree. i.e. root is always at
the highest level 0.
Each node has exactly one node above it, called parent. Eg: A is the parent of B,C and D.
The nodes just below a node are called its children. ie. child nodes are one level lower than the
parent node.
A node which does not have any child called leaf or terminal node. Eg: E, F, K, L, H, I and M are leaves.
Nodes with at least one child are called non terminal or internal nodes.
The child nodes of same parent are said to be siblings.
A path in a tree is a list of distinct nodes in which successive nodes are connected by branches
in the tree.
The length of a particular path is the number of branches in that path. The degree of a node
of a tree is the number of children of that node.
The maximum number of children a node can have is often referred to as the order of a
tree. The height or depth of a tree is the length of the longest path from root to any leaf.

1. Root: This is the unique node in the tree to which further sub trees are attached. Eg: A
Degree of the node: The total number of sub-trees attached to the node is called the degree of
the node.Eg: For node A degree is 3. For node K degree is 0
3. Leaves: These are the terminal nodes of the tree. The nodes with degree 0 are always the leaf nodes.
Eg: E, F, K, L,H, I, J
4. Internal nodes: The nodes other than the root node and the leaves are called the internal nodes.
Eg: B, C, D, G
5. Parent nodes: The node which is having further sub-trees(branches) is called the parent node
of those sub-trees. Eg: B is the parent node of E and F.
6. Predecessor: While displaying the tree, if some particular node occurs previous to some other
node then that node is called the predecessor of the other node. Eg: E is the predecessor of the
node B.
7. Successor: The node which occurs next to some other node is a successor node. Eg: B is
the successor of E and F.
8. Level of the tree: The root node is always considered at level 0, then its adjacent children are
supposed to be at level 1 and so on. Eg: A is at level 0, B,C,D are at level 1, E,F,G,H,I,J are at level 2, K,L
are at level 3.
9. Height of the tree: The maximum level is the height of the tree. Here height of the tree is 3.
The height if the tree is also called depth of the tree.
10. Degree of tree: The maximum degree of the node is called the degree of the tree.
BINARY TREES

Binary tree is a tree in which each node has at most two children, a left child and a right child. Thus
the order of binary tree is 2.

A binary tree is either empty or


consists of a) a node called the root
b) left and right sub trees are themselves binary trees.

A binary tree is a finite set of nodes which is either empty or consists of a root and two disjoint
trees called left sub-tree and right sub- tree.
In binary tree each node will have one data field and two pointer fields for representing
the sub-branches. The degree of each node in the binary tree will be at the most two.

Types Of Binary Trees:

There are 3 types of binary trees:

1. Left skewed binary tree: If the right sub-tree is missing in every node of a tree we call it as left skewed
tree.
A

C
2. Right skewed binary tree: If the left sub-tree is missing in every node of a tree we call it is right
sub-tree.

3. Complete binary tree:


The tree in which degree of each node is at the most two is called a complete binary tree.
In a complete binary tree there is exactly one node at level 0, two nodes at level 1 and four nodes at
level
l
2 and so on. So we can say that a complete binary tree depth d will contain exactly 2 nodes at
each level l, where l is from 0 to d.

B C

D E F G
Note:
n
1. A binary tree of depth n will have maximum 2 -1 nodes.
2. A complete binary tree of level l will have maximum 2l nodes at each level, where l starts from 0.
3. Any binary tree with n nodes will have at the most n+1 null branches.
4. The total number of edges in a complete binary tree with n terminal nodes are 2(n-1).

Binary Tree Representation

A binary tree can be represented mainly in 2 ways:

a) Sequential Representation
b) Linked Representation

a) Sequential Representation
The simplest way to represent binary trees in memory is the sequential representation that
uses one-dimensional array.
1) The root of binary tree is stored in the 1 st location of array
th
2) If a node is in the j location of array, then its left child is in the location 2J+1 and its right
child in the location 2J+2
d+1
The maximum size that is required for an array to store a tree is 2 -1, where d is the depth of the tree.

Advantages of sequential representation:


The only advantage with this type of representation is that the
direct access to any node can be possible and finding the parent or left children of any particular
node is fast because of the random access.

Disadvantages of sequential representation:


1. The major disadvantage with this type of representation is wastage of memory. For example
in the skewed tree half of the array is unutilized.
2. In this type of representation the maximum depth of the tree has to be fixed. Because we
have decide the array size. If we choose the array size quite larger than the depth of the tree,
then it will be wastage of the memory. And if we coose array size lesser than the depth of the
tree then we will be unable to represent some part of the tree.
3. The insertions and deletion of any node in the tree will be costlier as other nodes has to
be adjusted at appropriate positions so that the meaning of binary tree can be
preserved.
As these drawbacks are there with this sequential type of representation, we will search for more
flexible representation. So instead of array we will make use of linked list to represent the tree.
b) Linked Representation
Linked representation of trees in memory is implemented using pointers. Since each node in a
binary tree can have maximum two children, a node in a linked representation has two pointers for
both left and right child, and one information field. If a node does not have any child, the
corresponding pointer field is made NULL pointer.

In linked list each node will look like this:

Left Child Data Right Child


Advantages of linked representation:
1. This representation is superior to our array representation as there is no wastage
of memory. And so there is no need to have prior knowledge of depth of the
tree. Using dynamic memory concept one can create as much memory(nodes) as
required. By chance if some nodes are unutilized one can delete the nodes by
making the address free.

2. Insertions and deletions which are the most common operations can be done
without moving the nodes.
Disadvantages of linked representation:

1. This representation does not provide direct access to a node and special algorithms
are required.
2. This representation needs additional space in each node for storing the left and right
sub- trees.

TRAVERSING A BINARY TREE

Traversing a tree means that processing it so that each node is visited exactly once. A
binary tree can be
traversed a number of ways.The most common tree traversals are

In-order
Pre-order
and Post-
order
Pre-order 1.Visit the root Root | Left | Right
2.Traverse the left sub tree in pre-order
3.Traverse the right sub tree in pre-order.
In-order 1.Traverse the left sub tree in in-order Left | Root | Right
2. Visit the root
3. Traverse the right sub tree in in-order.
Post-order 1.Traverse the left sub tree in post-order Left | Right | Root
2.Traverse the right sub tree in post-order.
3.Visit the root

B C

D E F G

H I J

The pre-order traversal is: ABDEHCFGIKJ


The in-order traversal is : DBHEAFCKIGJ
The post-order traversal is:DHEBFKIJGCA
Inorder Traversal:
rd
Print 3
A

nd th
Print 2 Print 4
B D

C Print this
s at the last E
t
Print 1
C-B-A-D-E is the inorder traversal i.e. first we go towards the leftmost node. i.e. C so print that node
C. Then go back to the node B and print B. Then root node A then move towards the right sub-
tree print D and finally E. Thus we are following the tracing sequence of Left|Root|Right. This type of
traversal is called inorder traversal. The basic principle is to traverse left sub-tree then root and then
the right sub-tree.

Pseudo Code:

template <class T>


void inorder(bintree<T> *temp)
{
if(temp!=NULL)
{
inorder(temp->left);
cout<<‖temp->data‖;
inorder(temp->right);
}
}

A-B-C-D-E is the preorder traversal of the above fig. We are following Root|Left|Right path
i.e. data at the root node will be printed first then we move on the left sub-tree and go on
printing the data till we reach to the left most node. Print the data at that node and then move
to the right sub- tree. Follow the same principle at each sub-tree and go on printing the data
accordingly.

template <class T>


void preorder(bintree<T> *temp)
{
if(temp!=NULL)
{
cout<<‖temp->data‖; preorder(temp->left);
preorder(temp->right);
}
}
From figure the postorder traversal is C-D-B-E-A. In the postorder traversal we are following the
Left|Right|Root principle i.e. move to the leftmost node, if right sub-tree is there or not if not then
print the leftmost node, if right sub-tree is there move towards the right most node. The key idea
here is that at each sub-tree we are following the Left|Right|Root principle and print the data
accordingly.
Pseudo Code:

template <class T>


void postorder(bintree<T> *temp)
{
if(temp!=NULL)
{
postorder(temp->left);
postorder(temp->right);
cout<<‖temp->data‖;
}
}

BINARY SEARCH TREE


In the simple binary tree the nodes are arranged in any fashion. Depending on user‘s desire
the new nodes can be attached as a left or right child of any desired node. In such a case finding for
any node is a long cut procedure, because in that case we have to search the entire tree. And thus
the searching time complexity will get increased unnecessarily. So to make the searching
algorithm faster in a binary tree we will go for building the binary search tree. The binary search
tree is based on the binary search algorithm. While creating the binary search tree the data is
systematically arranged. That means values at left sub-tree < root node value < right sub-tree
values.
Operations On Binary Search Tree:
The basic operations which can be performed on binary search tree are.
1. Insertion of a node in binary search tree.
2. Deletion of a node from binary search tree.
3. Searching for a particular node in binary search tree.
Insertion of a node in binary search tree.
While inserting any node in binary search tree, look for its appropriate position in the binary search
tree. We start comparing this new node with each node of the tree. If the value of the node which is
to be inserted is greater than the value of the current node we move on to the right sub-branch
otherwise we move on to the left sub-branch. As soon as the appropriate position is found we
attach this new node as left or right child appropriately.

Before Insertion

In the above fig, if we wan to insert 23. Then we will start comparing 23 with value of root node
i.e. 10. As 23 is greater than 10, we will move on right sub-tree. Now we will compare 23 with 20
and move right, compare 23 with 22 and move right. Now compare 23 with 24 but it is less than
24. We will move on left branch of 24. But as there is node as left child of 24, we can attach 23
as left child of 24.
Deletion of a node from binary search tree.
For deletion of any node from binary search tree there are three which are possible.
i. Deletion of leaf node.
ii. Deletion of a node having one child.
iii. Deletion of a node having two children.
Deletion of leaf node.

This is the simplest deletion, in which we set the left or right pointer of parent node as NULL.

10

7 15

Before deletion

5 9 12 18

From the above fig, we want to delete the node having value 5 then we will set left pointer of its
parent node as NULL. That is left pointer of node having value 7 is set to NULL.
Deletion of a node having one child.

To explain this kind of deletion, consider a tree as given below.

If we want to delete the node 15, then we


will simply copy node 18 at place of 16
and then set the node free

Deletion of a node having two children.

Consider a tree as given below.


Let us consider that we want to delete node having value 7. We will then find out the inorder
successor of node 7. We will then find out the inorder successor of node 7. The inorder successor will
be simply copied at location of node 7.
That means copy 8 at the position where value of node is 7. Set left pointer of 9 as NULL. This
completes the deletion procedure.

Searching for a node in binary search tree.


In searching, the node which we want to search is called a key node. The key node will be
compared with each node starting from root node if value of key node is greater than current node
then we search for it on right sub branch otherwise on left sub branch. If we reach to leaf node and
still we do not get the value of key node then we declare ―node is not present in the tree‖.
In the above tree, if we want to search for value 9. Then we will compare 9 with root node 10. As
9 is less than 10 we will search on left sub branch. Now compare 9 with 5, but 9 is greater than 5.
So we will move on right sub tree. Now compare 9 with 8 but 9 is greater than 8 we will move on
right sub branch. As the node we will get holds the value 9. Thus the desired node can be
searched.

AVL TREES

Adelsion Velski and Lendis in 1962 introduced binary tree structure that is balanced
with respect to height of sub trees. The tree can be made balanced and because of this
retrieval
of any node can be done in Ο(log n) times, where n is total number of nodes. From the name of these
scientists the tree is called AVL tree.

Definition:

An empty tree is height balanced if T is a non empty binary tree with TL and TR as
its left and right sub trees. The T is height balanced if and only if
i. TL and TR are height balanced.
ii. hL-hR <= 1 where hL and hR are heights of TL and TR.
The idea of balancing a tree is obtained by calculating the balance factor of a tree.

Definition of Balance Factor:

The balance factor BF(T) of a node in binary tree is defined to be hL-hR where hL and
hR are heights of left and right sub trees of T.
For any node in AVL tree the balance factor i.e. BF(T) is -1, 0 or +1.
Height of AVL Tree:

Theorem: The height of AVL tree with n elements (nodes) is O(log n).

Proof: Let an AVL tree with n nodes in it. Nh be the minimum number of nodes in an AVL tree
of height h.

In worst case, one sub tree may have height h-1 and other sub tree may have height h-2. And both
these sub trees are AVL trees. Since for every node in AVL tree the height of left and right sub trees
differ by at most 1.

Hence

N =N +N
h h-1
+1
h-2

Where Nh denotes the minimum number of nodes in an AVL tree of height

h. N0=0 N1=2

We can also write it as

N > Nh = Nh-1+Nh-2+1

> 2Nh-2

> 4Nh-4
.
.
> 2iNh-2i

If value of h is even, let i = h/2-1

Then equation becomes

N > 2h/2-1N2

= N > 2(h-1)/2x4 (N2 = 4)

= O(log N)

If value of h is odd, let I = (h-1)/2 then equation becomes


N > 2(h-1)/2 N1
N > 2(h-1)/2 x 1
H = O(log N)

This proves that height of AVL tree is always O(log N). Hence search, insertion and deletion can
be carried out in logarithmic time.
Representation of AVL Tree

The AVL tree follows the property of binary search tree. In fact AVL trees are
basically binary search trees with balance factors as -1, 0, or +1.

After insertion of any node in an AVL tree if the balance factor of any node
becomes other than -1, 0, or +1 then it is said that AVL property is violated. Then
we have to restore the destroyed balance condition. The balance factor is denoted
at

right top corner inside the node.

After insertion of a new node if balance condition gets destroyed, then the nodes on that path(new
node insertion point to root) needs to be readjusted. That means only the affected sub tree is to be
rebalanced.

The rebalancing should be such that entire tree should satisfy AVL property.

In above given example-


Insertion of a node.

There are four different cases when rebalancing is required after insertion of new node.

1. An insertion of new node into left sub tree of left child. (LL).
2. An insertion of new node into right sub tree of left child. (LR).
3. An insertion of new node into left sub tree of right child. (RL).
4. An insertion of new node into right sub tree of right child.(RR).

Some modifications done on AVL tree in order to rebalance it is called rotations of AVL tree

There are two types of rotations:

Single rotation Double rotation

Left-Left(LL rotation) Left-Right(LR rotation)

Right-Right(RR rotation) Right-Left(RL rotation)

Insertion Algorithm:
1. Insert a new node as new leaf just as an ordinary binary search tree.
2. Now trace the path from insertion point(new node inserted as leaf) towards root. For each node
‗n‘ encountered, check if heights of left (n) and right (n) differ by at most 1. a)If yes,
move towards parent (n).
b)Otherwise restructure by doing either a single rotation or a double rotation.
Thus once we perform a rotation at node ‗n‘ we do not require to perform any rotation at
any ancestor on ‗n‘.
When node ‗1‘ gets inserted as a left child of node ‗C‘ then AVL property gets destroyed
i.e. node A has balance factor +2.
The LL rotation has to be applied to rebalance the nodes.

2. RR rotation:

When node ‗4‘ gets attached as right child of node ‗C‘ then node ‗A‘ gets unbalanced. The
rotation which needs to be applied is RR rotation as shown in fig.
When node ‗3‘ is attached as a right child of node ‗C‘ then unbalancing occurs because of LR.
Hence LR rotation needs to be applied.

When node ‗2‘ is attached as a left child of node ‗C‘ then node ‗A‘ gets unbalanced as its
balance factor becomes -2. Then RL rotation needs to be applied to rebalance the AVL tree.

Example:

Insert 1, 25, 28, 12 in the following AVL tree.


Insert 1

To insert node ‗1‘ we have to attach it as a left child of ‗2‘. This will unbalance the tree as
follows. We will apply LL rotation to preserve AVL property of it.

Insert 25

We will attach 25 as a right child of 18. No balancing is required as entire tree preserves the AVL
property
Insert 28

The node ‗28‘ is attached as a right child of 25. RR rotation is required to rebalance.
Insert 12

To rebalance the tree we have to apply LR rotation.


Deletion:

Even after deletion of any particular node from AVL tree, the tree has to be restructured in order
to preserve AVL property. And thereby various rotations need to be applied.

Algorithm for deletion:

The deletion algorithm is more complex than insertion algorithm.

1. Search the node which is to be deleted.

2. a) If the node to be deleted is a leaf node then simply make it NULL to remove.
b) If the node to be deleted is not a leaf node i.e. node may have one or two children, then the
node must be swapped with its in order successor. Once the node is swapped, we can remove
this node.
3. Now we have to traverse back up the path towards root, checking the balance factor of
every node along the path. If we encounter unbalancing in some sub tree
then balance that sub tree using appropriate single or double rotations. The deletion
algorithm takes O(log n) time to delete any node.
The tree becomes
Searching:

The searching of a node in an AVL tree is very simple. As AVL tree is basically binary search tree, the
algorithm used for searching a node from binary search tree is the same one is used to search a node
from AVL tree.

The searching of a node from AVL tree takes O(log n) time.

PART II

BTREES
Multi-way trees are tree data structures with more than two branches at a node. The data
structures of m-way search trees, B trees and Tries belong to this category of tree
structures.
AVL search trees are height balanced versions of binary search trees, provide efficient
retrievals and storage operations. The complexity of insert, delete and search operations on
AVL search trees id O(log n).
Applications such as File indexing where the entries in an index may be very large,
maintaining the index as m-way search trees provides a better option than AVL search trees
which are but only balanced binary search trees.
While binary search trees are two-way search trees, m-way search trees are extended binary
search trees and hence provide efficient retrievals.
B trees are height balanced versions of m-way search trees and they do not recommend
representation of keys with varying sizes.
Tries are tree based data structures that support keys with varying sizes.
Definition:

A B tree of order m is an m-way search tree and hence may be empty. If non empty, then the
following properties are satisfied on its extended tree representation:
i. The root node must have at least two child nodes and at most m child nodes.
ii. All internal nodes other than the root node must have at least |m/2 | non empty child nodes and at
most m non empty child nodes.
iii. The number of keys in each internal node is one less than its number of child nodes and these
keys partition the keys of the tree into sub trees.
iv. All external nodes are at the same
level. v.
Example:
F K O B tree of order 4

Level 1

C D G M N P Q W

S T X Y Z
Level
3

Insertion
For example construct a B-tree of order 5 using following numbers. 3, 14, 7, 1, 8, 5, 11, 17, 13, 6, 23, 12,
20, 26, 4, 16, 18, 24, 25, 19
The order 5 means at the most 4 keys are allowed. The internal node should have at least 3 non empty
children and each leaf node must contain at least 2 keys.

Step 1: Insert 3, 14, 7, 1

1 3 7 14

.
Step 2: Insert 8, Since the node is full split the node at medium 1, 3, 7, 8, 14

1 3 8 14

Step 3: Insert 5, 11, 17 which can be easily inserted in a B-tree.

1 3 5
8 11 14 17

Step 4: Now insert 13. But if we insert 13 then the leaf node will have 5 keys which is not allowed.
Hence 8,
11, 13, 14, 17 is split and medium node 13 is moved up.

7 13

1 3 5 8 11 14 17
Step 5: Now insert 6, 23, 12, 20 without any split.

7 13

1 3 5 6 8 11 12 14 17 20 23

Step 6: The 26 is inserted to the right most leaf node. Hence 14, 17, 20, 23, 26 the node is split and 20
will be moved up.

7 13 20

1 3 5 6 8 11 12 14 17 23 26
Step 7: Insertion of node 4 causes left most node to split. The 1, 3, 4, 5, 6 causes key 4 to move up.

Then insert 16, 18, 24, 25.

4 7 13 20

1 3 5 6 8 11 12 14 16 17 18 23 24 25 26

Step 8: Finally insert 19. Then 4, 7, 13, 19, 20 needs to be split. The median 13 will be moved up
to from a root node.
The tree then will be -
13

4 7 17 20

1 3 5 6 8 11 12 14 16 18 19 23 24 25 26

Thus the B tree is constructed. 13

Deletion

Consider a B-tree

4 7 17 20

1 3 5 6 8 11 12 14 16 18 19 23 24 25 26
Delete 8, then it is very simple.

13

4 7 17 20

1 3 5 6 11 12 14 16 18 19 23 24 25 26

Now we will delete 20, the 20 is not in a leaf node so we will find its successor which is 23, Hence
23 will be moved up to replace 20.
13

4 7 17 23

1 3 5 6 11 12 14 16 18 19 24 25 26

Next we will delete 18. Deletion of 18 from the corresponding node causes the node with only
one key, which is not desired (as per rule 4) in B-tree of order 5. The sibling node to immediate
right has an extra key. In such a case we can borrow a key from parent and move spare key of
sibling up.
4 7 24
17

1 3 5 6 11 12 14 16 19 23 25 26

Now delete 5. But deletion of 5 is not easy. The first thing is 5 is from leaf node. Secondly this leaf
node has no extra keys nor siblings to immediate left or right. In such a situation we can combine
this node with one of the siblings. That means remove 5 and combine 6 with the node 1, 3. To make
the tree balanced we have to move parent‘s key down. Hence we will move 4 down as 4 is between
1, 3, and 6. The tree will be-

13

7 17 24

1 3 4 6 11 12 14 16 19 23 25 26

But again internal node of 7 contains only one key which not allowed in B-tree. We then will try to borrow
a key from sibling. But sibling 17, 24 has no spare key. Hence we can do is that, combine 7 with 13 and
17, 24. Hence the B-tree will be
1 3 4 6 12 16 23 26
11 14 19 25

Searching
The search operation on B-tree is similar to a search to a search on binary search tree. Instead of choosing
between a left and right child as in binary tree, B-tree makes an m-way choice. Consider a B-tree as given
below.

13

4 7
17
20

1 3 5 6 8 11 12 14 16 18 19 23 24 25 26

If we want to search 11 then

i. 11 < 13 ; Hence search left node

ii. 11 > 7 ; Hence right most node

iii. 11 > 8 ; move in second block

iv. node 11 is found

The running time of search operation depends upon the height of the tree. It is O(log n).

Height of B-tree

The maximum height of B-tree gives an upper bound on number of disk access. The maximum
number of keys in a B-tree of order 2m and depth h is
2 h-1
1 + 2m + 2m(m+1) + 2m(m+1) + . . .+ 2m(m+1)
h
i-1
The maximum height of B-tree with n keys

log m+1 n = O(log n)


2m

UNIT V

Terminology of Graph
Graphs:-
A graph G is a discrete structure consisting of nodes (called vertices) and lines joining the
nodes (called edges). Two vertices are adjacent to each other if they are joint by an edge. The
edge joining the two vertices is said to be an edge incident with them. We use V (G) and E(G)
to denote the set of vertices and edges of G respectively.

Euler Circuit and Euler Path


An Euler circuit in a graph G is a simple circuit containing every edge of G. An Euler path in G
is a simple path containing every edge of G.
Graph
Representations
Graph data structure is represented using following representations...
1. Adjacency Matrix
2. Incidence Matrix
3. Adj acency List Adjacency Matrix
In this representation, graph can be represented using a matrix of size total number of vertices
by total number of vertices. That means if a graph with 4 vertices can be represented using a
matrix of 4X4 class. In this matrix, rows and columns both represents vertices. This matrix is
filled with either 1 or 0. Here, 1 represents there is an edge from row vertex to column vertex
and 0 represents there is no edge from row vertex to column vertex.

For example, consider the following undirected graph representation...

Directed graph representation...

Incidence Matrix
In this representation, graph can be represented using a matrix of size total
number of vertices by total number of edges. That means if a graph with 4
vertices and 6 edges can be represented using a matrix of 4X6 class. In this matrix,
rows represents vertices and columns represents edges. This matrix is filled with
either 0 or 1 or -1. Here, 0 represents row edge is not connected to column
vertex, 1 represents row edge is connected as outgoing edge to column vertex
and -1 represents row edge is connected as incoming edge to column vertex.

For example, consider the following directed graph representation...


Adjacency List
In this representation, every vertex of graph contains list of its adjacent
vertices.
For example, consider the following directed graph representation
implemented using linked list...

This representation can also be implemented using array as follows..

Graph traversals
Graph traversal means visiting every vertex and edge exactly once in a well-defined order.
While using certain graph algorithms, you must ensure that each vertex of the graph is visited
exactly once. The order in which the vertices are visited are important and may
depend upon the algorithm or question that you are solving.

During a traversal, it is important that you track which vertices have been visited. The
most common way of tracking vertices is to mark them.

Depth First Search (DFS)


The DFS algorithm is a recursive algorithm that uses the idea of backtracking. It
involves exhaustive searches of all the nodes by going ahead, if possible, else by
backtracking.

This recursive nature of DFS can be implemented using stacks. The basic idea is
as follows: Pick a starting node and push all its adjacent nodes into a stack.
Pop a node from stack to select the next node to visit and push all its adjacent nodes into a
stack.
Repeat this process until the stack is empty. However, ensure that the nodes that are
visited are marked. This will prevent you from visiting the same node more than
once. If you do not mark the nodes that are visited and you visit the same node more
than once, you may end up in an infinite loop.

DFS-iterative (G, s): //Where G is graph and s is


source vertex let S be stack
S.push( s ) //Inserting s in stack
mark s as
visited. while
( S is not
empty):
//Pop a vertex from stack to
visit next v = S.top( ) S.pop(
)
//Push all the neighbours of v in
stack that are not visited for all
neighbours w of v in Graph G:
if w is not visited : S.push( w ) mark w
as visited DFS-recursive(G, s):
mark s as visited
for all neighbours w of s in
Graph G: if w is not
visited:
DFS-recursive(G, w)

Breadth First Search (BFS);


There are many ways to traverse graphs. BFS is the most commonly used
approach.BFS is a traversing algorithm where you should start traversing from a
selected node (source or starting node) and traverse the graph layerwise thus
exploring the neighbour nodes (nodes which are directly connected to source node).
You must then move towards the next-level neighbour nodes.As the name BFS
suggests, you are required to traverse the graph breadthwise as follows:
1.First move horizontally and visit all the nodes of the
current layer 2.Move to the next layer.

You might also like