CONTENTS
Objectives
Introduction
1.2.4 C Enum
1.3.1 Arrays
1.3.2 Stack
1.3.3 Queues
1.3.4 Linked List
1.3.5 Tree
1.3.6 Graph
1.4 Summary
1.5 Keywords
Objectives
Introduction
A data structure is any data representation and its associated operations. Even an integer or
floating point number stored on the computer can be viewed as a simple data structure. More
typically, a data structure is meant to be an organization or structuring for a collection of data
items. A sorted list of integers stored in an array is an example of such a structuring. Given
sufficient space to store a collection of data items, it is always possible to search for specified
items within the collection, print or otherwise process the data items in any desired order, or
modify the value of any particular data item. Thus, it is possible to perform all necessary
operations on any data structure. However, using the proper data structure can make the difference
between a program running in a few seconds and one requiring many days.
A data structure is a scheme for organizing data in the memory of a computer. A data structure
is a particular way of storing and organizing data in a computer so that it can be used efficiently.
Different kinds of data structures are suited to different kinds of applications, and some are
highly specialized to specific tasks.
Data structures are used in almost every program or software system. Specific data structures
are essential ingredients of many efficient algorithms, and make possible the management of
huge amounts of data, such as large databases and internet indexing services. Some formal
design methods and programming languages emphasize data structures, rather than algorithms,
as the key organizing factor in software design.
Some of the more commonly used data structures include lists, arrays, stacks, queues, heaps,
trees, and graphs. The way in which the data is organized affects the performance of a program
for different tasks. Data structures are generally based on the ability of a computer to fetch and
store data at any place in its memory, specified by an address — a bit string that can be itself
stored in memory and manipulated by the program.
Notes The record and array data structures are based on computing the addresses of data
items with arithmetic operations; while the linked data structures are based on storing
addresses of data items within the structure itself.
Data may be organized in many different ways: the logical or mathematical model of a particular
organization of data is called a data structure. The data model depends on two things. First, it must
be rich enough in structure to mirror the actual relationships of the data in the real world. On the
other hand, the structure should be simple enough that one can effectively process the data when necessary.
Data are also organized into more complex types of structures. The study of such data structures,
which forms the subject matter of this text, includes the following three steps:
1. Logical or mathematical description of the structure.
2. Implementation of the structure on a computer.
3. Quantitative analysis of the structure, which includes determining the amount of memory
needed to store the structure and the time required to process the structure.
Computer programmers decide which data structures to use based on the nature of the data and
the processes that need to be performed on that data.
When selecting a data structure to solve a problem, you should follow these steps:
1. Analyze your problem to determine the basic operations that must be supported.
2. Quantify the resource constraints for each operation.
3. Select the data structure that best meets these requirements.
Example: Basic operations include inserting a data item into the data structure, deleting
a data item from the data structure, and finding a specified data item.
This three-step approach to selecting a data structure operationalizes a data-centered view of the
design process. The first concern is for the data and the operations to be performed on them, the
next concern is the representation for those data, and the final concern is the implementation of
that representation.
Resource constraints on certain key operations, such as search, inserting data records, and deleting
data records, normally drive the data structure selection process. Many issues relating to the
relative importance of these operations are addressed by the following three questions, which
you should ask yourself whenever you must choose a data structure:
Are all data items inserted into the data structure at the beginning, or are insertions
interspersed with other operations?
Can data items be deleted?
Are all data items processed in some well-defined order, or is search for specific data items
allowed?
Typically, interspersing insertions with other operations, allowing deletion, and supporting
search for data items all require more complex representations.
2. ........................ on certain key operations normally drive the data structure selection process.
C uses data types to describe various kinds of data such as integers, floating-point numbers,
characters, etc . In C, an object refers to a memory location where its content represents a value.
If you assign an object a name, that object becomes a variable. A data type determines the
number of bytes to be allocated to the variable and the valid operations that can be performed on it.
C provides you with various data types that can be classified into the following groups:
Basic types, which include integer types and floating-point types.
Enumerated types.
Derived types, which include pointer types, array types, structure types, union types and function
types.
Type void.
Did u know? Function type depicts the interface of a function. It specifies types of parameters
and return type of the function.
C data types can also be classified differently into the following groups:
Arithmetic types, which include basic types and enumerated types.
Integer types include signed and unsigned integer types. These are discussed below.
C provides you with five signed integer types. Each integer type has several synonyms.
Table 1.1 illustrates the first five integer types with their corresponding synonyms:
For each signed integer, C also provides the corresponding unsigned integer type that has the
same memory size as the signed integer type.
C defines only the minimum storage size of each integer type, e.g., short takes at least two bytes
and long takes at least 4 bytes. Regardless of the C implementation, the sizes of the integer types
must follow this order:
sizeof(short) ≤ sizeof(int) ≤ sizeof(long) ≤ sizeof(long long)
Table 1.3 gives you the common sizes of the integer types in C:
The value ranges of integer types can be found in the limits.h header file. This header file
contains macros that define the minimum and maximum values of each integer type, e.g.,
INT_MIN and INT_MAX for the minimum and maximum values of an int.
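As a quick illustration (a minimal sketch; the exact sizes and ranges printed depend on the platform), the following C program reports a few integer types using sizeof and the limits.h macros:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* sizeof gives the storage size; limits.h gives the value ranges */
    printf("short: %zu bytes, range %d to %d\n", sizeof(short), SHRT_MIN, SHRT_MAX);
    printf("int:   %zu bytes, range %d to %d\n", sizeof(int), INT_MIN, INT_MAX);
    printf("long:  %zu bytes, range %ld to %ld\n", sizeof(long), LONG_MIN, LONG_MAX);
    return 0;
}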
C provides various floating-point types that represent non-integer numbers with a decimal
point at any position.
Example: With integer types you can only have whole numbers such as 1, 2, 10, 200; however,
with floating-point types you can have 1.0, 2.5, 100.25 and so on.
Table 1.4 illustrates the technical attributes of various floating-point types in C. It is important
to notice that this is only the minimal requirement for storage size defined by C.
Table 1.4: Technical attributes of various floating-point types in C
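As a minimal sketch (the printed sizes are implementation-dependent; only the minimums are fixed by C), the three floating-point types can be declared and inspected like this:

#include <stdio.h>

int main(void)
{
    float f = 1.25f;        /* single precision   */
    double d = 100.25;      /* double precision   */
    long double ld = 2.5L;  /* extended precision */

    printf("float:       %zu bytes (%f)\n", sizeof f, f);
    printf("double:      %zu bytes (%f)\n", sizeof d, d);
    printf("long double: %zu bytes (%Lf)\n", sizeof ld, ld);
    return 0;
}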
C uses the char type to store characters and letters. However, the char type is an integer type
because, underneath, C stores integer numbers instead of characters.
In order to represent characters, the computer has to map each integer with a corresponding
character using a numerical code. The most common numerical code is ASCII, which stands for
American Standard Code for Information Interchange. Table 1.5 illustrates the ASCII code:
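Because a char is stored as a small integer, printing the same value with %c and %d shows both the character and its ASCII code; a minimal sketch:

#include <stdio.h>

int main(void)
{
    char letter = 'A';   /* stored internally as the ASCII code 65 */

    printf("as a character: %c\n", letter);   /* prints: A  */
    printf("as an integer:  %d\n", letter);   /* prints: 65 */
    return 0;
}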
1.2.4 C Enum
C provides developers with special types called enumerated types or enum to declare symbolic
names that represent integer constants. Its main purpose is to enhance the readability of the
code. An enumeration consists of a group of symbolic integers.
For example:
enum color {red, green, blue};
Here, color is the tag name that you can use later as a type name.
Caution The tag name must be unique within its scope. This tag name is also optional so
you can omit it.
Inside the curly brackets {} is a set of named integers. This set is also known as enumeration
constants, enumeration sets, enumerators or just members.
By default, the first enumeration member in the set has value 0. The next member has the value of
the first one plus 1. So in this case red = 0, green = 1 and blue = 2. You can explicitly assign an
enumeration member another value using the assignment operator (=) when you define the
enumeration type.
An enumeration set may have duplicate constant values. For instance, you can assign the value 0
to two enumeration members.
The name of an enumeration member must be unique within its scope. That is, its name
must not be the same as another member or variable name.
If you declare a variable with an enumeration type, the value of that variable must be one of the
values of the enumeration members.
Example: You can declare a variable called favorite_color as a variable of the color type:
enum color favorite_color = green;
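Putting the pieces together, a minimal compilable sketch (following the color example above):

#include <stdio.h>

enum color { red, green, blue };   /* red = 0, green = 1, blue = 2 */

int main(void)
{
    enum color favorite_color = green;

    if (favorite_color == green)
        printf("favorite color code: %d\n", favorite_color);  /* prints 1 */
    return 0;
}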
4. The value ranges of integer types can be found in the ..................... header file.
5. ..................... types represents non-integer number with a decimal point at any position.
8. C provides developers with special types called ..................... to declare symbolic names
that represent integer constants.
9. By default, the first enumeration member in the set has value .....................
Data structures are classified into two types: linear and non-linear.
Linear: A data structure is said to be linear if its elements form a sequence. The elements
of a linear data structure can be represented by means of sequential memory locations. The other
way is to have the linear relationship between the elements represented by means of
pointers or links. Examples: arrays and linked lists.
Non-linear: A data structure is said to be non-linear if its elements do not form a sequence.
Examples: trees and graphs.
In this section, we will discuss the brief description of various data structures such as arrays,
stack, queues, etc.
1.3.1 Arrays
The simplest type of data structure is a linear (or one-dimensional) array. By a linear array, we
mean a list of a finite number n of similar data elements referenced respectively by a set of n
consecutive numbers, usually 1, 2, 3, ..., n. If we choose the name A for the array, then the
elements of A are denoted by the subscript notation:
A[1], A[2], A[3], ..., A[n]
Regardless of the notation, the number K in A[K] is called a subscript and A[K] is called a
subscripted variable.
Linear arrays are called one-dimensional arrays because each element in such an array is
referenced by one subscript. A two-dimensional array is a collection of similar data elements
where each element is referenced by two subscripts.
Did u know? Such arrays are called matrices in mathematics, and tables in business
applications.
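A minimal C sketch of a one-dimensional array and a two-dimensional array (matrix), with elements referenced by one and two subscripts respectively (note that C subscripts start at 0):

#include <stdio.h>

int main(void)
{
    int a[5] = {10, 20, 30, 40, 50};    /* linear array: one subscript */
    int m[2][3] = {{1, 2, 3},
                   {4, 5, 6}};          /* two-dimensional array: two subscripts */

    printf("a[2] = %d\n", a[2]);        /* prints 30 */
    printf("m[1][2] = %d\n", m[1][2]);  /* prints 6  */
    return 0;
}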
1.3.2 Stack
A stack is a linear structure in which items may be added or removed only at one end. The stack
is a common data structure for representing things that need to be maintained in a particular order.
For instance, when a function calls another function, which in turn calls a third function, it is
important that the third function return to the second function rather than the first. One
way to think about this implementation is to think of functions as being stacked on top of each
other; the last one added to the stack is the first one taken off. In this way, the data structure itself
enforces the proper order of calls.
Conceptually, a stack is simple: a data structure that allows adding and removing elements in a
particular order. Every time an element is added, it goes on the top of the stack; the only element
that can be removed is the element that was at the top of the stack. Consequently, a stack is said
to have “first in last out” behavior (or “last in, first out”). The first item added to a stack will be
the last item removed from a stack.
Example: Figure 1.1 pictures three everyday examples of such a structure: a stack of
dishes, a stack of pennies and a stack of folded towels.
Figure 1.1: A stack of dishes, a stack of pennies and a stack of folded towels
Source: http://www.csbdu.in/econtent/DataStructures/Unit1-DS.pdf
Stacks are also called last-in first-out (LIFO) lists. Other names used for stacks are "piles" and
"push-down lists". Stacks have many important applications in computer science.
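A minimal array-based stack sketch in C (the push/pop names are conventional; a fixed capacity and only rudimentary overflow/underflow checks are assumed):

#include <stdio.h>

#define MAX 100

int stack[MAX];
int top = -1;                       /* -1 means the stack is empty */

void push(int item)
{
    if (top < MAX - 1)
        stack[++top] = item;        /* add at the top of the stack */
}

int pop(void)
{
    return (top >= 0) ? stack[top--] : -1;   /* remove from the top */
}

int main(void)
{
    push(1); push(2); push(3);
    printf("%d ", pop());    /* 3 */
    printf("%d ", pop());    /* 2 */
    printf("%d\n", pop());   /* 1: last in, first out */
    return 0;
}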
1.3.3 Queues
A queue, also called a first-in-first-out (FIFO) system, is a linear list in which deletions can take
place only at one end of the list, the “front” of the list, and insertions can take place only at the
other end of the list, the “rear” of the list. The features of a Queue are similar to the features of
any queue of customers at a counter, at a bus stop, at a railway reservation counter, etc. A queue
can be implemented using arrays or linked lists. A queue can be represented as a circular queue.
This representation saves space when compared to the linear queue. Finally, there are special
cases of queues, called dequeues, which allow insertion and deletion of elements at both ends.
Source: http://www.csbdu.in/econtent/DataStructures/Unit1-DS.pdf
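A minimal FIFO queue sketch in C using a circular array (the front/rear names are conventional; capacity checks are kept to a bare minimum):

#include <stdio.h>

#define MAX 100

int queue[MAX];
int front = 0, rear = 0, count = 0;   /* count tracks how many items are stored */

void enqueue(int item)                /* insertion at the rear */
{
    if (count < MAX) {
        queue[rear] = item;
        rear = (rear + 1) % MAX;      /* wrap around: circular queue */
        count++;
    }
}

int dequeue(void)                     /* deletion from the front */
{
    int item = -1;
    if (count > 0) {
        item = queue[front];
        front = (front + 1) % MAX;
        count--;
    }
    return item;
}

int main(void)
{
    enqueue(1); enqueue(2); enqueue(3);
    printf("%d ", dequeue());    /* 1 */
    printf("%d\n", dequeue());   /* 2: first in, first out */
    return 0;
}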
1.3.4 Linked List
A linked list, or one-way list, is a linear collection of data elements, called nodes, where the
linear order is given by means of pointers. That is, each node is divided into two parts: the first
part contains the information of the element, and the second part, called the link field or next
pointer field, contains the address of the next node in the list.
Figure 1.3 is a schematic diagram of a linked list with 6 nodes. Each node is pictured with two
parts.
Source: http://www.csbdu.in/econtent/DataStructures/Unit1-DS.pdf
The left part represents the information part of the node, which may contain an entire record of
data items (e.g., NAME, ADDRESS, ...). The right part represents the next-pointer field of the
node, and there is an arrow drawn from it to the next node in the list. This follows the usual
practice of drawing an arrow from a field to a node when the address of the node appears in the
given field.
Notes The pointer of the last node contains a special value, called the null pointer, which is
any invalid address.
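A minimal C sketch of a singly linked list node and a traversal loop (the Node, info and next names are illustrative):

#include <stdio.h>
#include <stdlib.h>

struct Node {
    int info;            /* information part of the node                 */
    struct Node *next;   /* next-pointer field: address of the next node */
};

int main(void)
{
    /* build a two-node list: 10 -> 20 -> NULL */
    struct Node *second = malloc(sizeof *second);
    struct Node *head = malloc(sizeof *head);
    second->info = 20; second->next = NULL;   /* null pointer ends the list */
    head->info = 10;   head->next = second;

    for (struct Node *p = head; p != NULL; p = p->next)
        printf("%d ", p->info);               /* prints: 10 20 */
    printf("\n");

    free(head); free(second);
    return 0;
}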
1.3.5 Tree
A tree is an acyclic, connected graph. A tree contains no loops or cycles. The concept of tree is one
of the most fundamental and useful concepts in computer science. Trees have many variations,
implementations and applications. Trees find their use in applications such as compiler
construction, database design, windows, operating system programs, etc. A tree structure is
one in which items of data are related by edges.
Example: A very common example is the ancestor tree as given in Figure 1.4. This tree
shows the ancestors of KUMAR. His parents are RAMESH and RADHA. RAMESH’s parents are
PADMA and SURESH who are also grand parents of KUMAR (on father’s side); RADHA’s parents
are JAYASHRI and RAMAN who are also grand parents of KUMAR (on mother’s side).
Figure 1.4: Ancestor tree of KUMAR
Source: http://www.csbdu.in/econtent/DataStructures/Unit1-DS.pdf
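A minimal C sketch of a tree node with two child links, mirroring the two-parents-per-person shape of the ancestor tree above (the field names are illustrative):

#include <stdio.h>

struct TreeNode {
    const char *name;          /* data item stored at this node */
    struct TreeNode *left;     /* edge to one related node      */
    struct TreeNode *right;    /* edge to another related node  */
};

int main(void)
{
    struct TreeNode radha  = {"RADHA",  NULL, NULL};
    struct TreeNode ramesh = {"RAMESH", NULL, NULL};
    struct TreeNode kumar  = {"KUMAR", &radha, &ramesh};  /* edges relate the items */

    printf("%s's parents: %s and %s\n",
           kumar.name, kumar.left->name, kumar.right->name);
    return 0;
}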
1.3.6 Graph
All the data structures (Arrays, Lists, Stacks, and Queues) except Graphs are linear data structures.
Graphs are classified in the non-linear category of data structures. A graph G may be defined as
a finite set V of vertices and a set E of edges (pair of connected vertices). The notation used is as
follows:
Graph G = (V,E)
Figure: A graph with vertices 1, 2, 3, 4 and 5
Source: http://www.csbdu.in/econtent/DataStructures/Unit1-DS.pdf
From the above graph, we may observe that the set of vertices for the graph is V = {1,2,3,4,5}, and
the set of edges for the graph is E = {(1,2), (1,5), (1,3), (5,4), (4,3), (2,3)}. The elements of E are
always a pair of elements.
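A minimal C sketch representing this exact graph with an adjacency matrix, where adj[u][v] = 1 whenever the edge (u, v) is in E (vertices are numbered 1 to 5):

#include <stdio.h>

#define V 5

int main(void)
{
    /* E = {(1,2), (1,5), (1,3), (5,4), (4,3), (2,3)} */
    int edges[6][2] = {{1,2}, {1,5}, {1,3}, {5,4}, {4,3}, {2,3}};
    int adj[V + 1][V + 1] = {0};    /* adjacency matrix, 1-indexed */

    for (int e = 0; e < 6; e++) {
        int u = edges[e][0], v = edges[e][1];
        adj[u][v] = adj[v][u] = 1;  /* undirected: store both directions */
    }

    printf("vertices adjacent to 1: ");
    for (int v = 1; v <= V; v++)
        if (adj[1][v]) printf("%d ", v);   /* prints: 2 3 5 */
    printf("\n");
    return 0;
}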
Caution The relationship between pairs of these elements is not necessarily hierarchical in
nature.
11. A ............................. array is a collection of similar data elements where each element is
referenced by two subscripts.
12. A ............................. is a linear structure in which items may be added or removed only at
one end.
13. A special queue known as ............................. allows insertion and deletion of elements at
both the end.
14. A linked list, or one-way list, is a linear collection of data elements, called .............................,
where the linear order is given by means of pointers.
15. A ............................. structures is one in which items of data are related by edges.
Figure 11.5: A counting puzzle. How many candies of each different shape (round, oval, and
long) are there? How many candies have a pattern? How many candies are dark and how many
are light?
Source: http://statmath.wu.ac.at/courses/data-analysis/itdtHTML/node80.html
Our task is to record the different shapes (round, oval, or long), the different shades (light or
dark), and whether there is a pattern on the candies that we can see in the image. How can this
information be entered into R?
One way to enter data in R is to use the scan() function. This allows us to type data into R
separated by spaces. Once we have entered all of the data, we enter an empty line to indicate to
R that we have finished. We will use this to enter the different shapes:
> shapeNames <- scan(what="character")
1: round oval long
4:
Read 3 items
> shapeNames
[1] "round" "oval" "long"
Another way to enter data is to use the c() function. We will use this to enter the possible patterns
and shades:
> patternNames <- c("pattern", "plain")
> patternNames
[1] "pattern" "plain"
> shadeNames <- c("light", "dark")
> shadeNames
[1] "light" "dark"
All we have done so far is enter the possible values of these symbols.
1.4 Summary
A data structure is a particular way of storing and organizing data in a computer so that it
can be used efficiently.
Computer programmers decide which data structures to use based on the nature of the
data and the processes that need to be performed on that data.
By a linear array, we mean a list of a finite number n of similar data elements referenced
respectively by a set of n consecutive numbers, usually 1, 2, 3, …n.
The stack is a common data structure for representing things that need to be maintained in a
particular order.
A queue, also called a first-in-first-out (FIFO) system, is a linear list in which deletions can
take place only at one end of the list, the “front” of the list, and insertions can take place
only at the other end of the list, the “rear” of the list.
A linked list, or one-way list, is a linear collection of data elements, called nodes, where
the linear order is given by means of pointers.
A graph G may be defined as a finite set V of vertices and a set E of edges (pair of connected
vertices).
1.5 Keywords
Data structure: A data structure is a scheme for organizing data in the memory of a computer.
Array: Array is a list of a finite number n of similar data elements referenced respectively by a
set of n consecutive numbers, usually 1, 2, 3, …n.
Stack: A stack is a linear structure in which items may be added or removed only at one end.
Queue: A queue, also called a first-in-first-out (FIFO) system, is a linear list in which deletions
can take place only at one end of the list, the “front” of the list, and insertions can take place only
at the other end of the list, the “rear” of the list.
Linked list: A linked list, or one-way list, is a linear collection of data elements, called nodes,
where the linear order is given by means of pointers.
Tree: A tree is an acyclic, connected graph which contains no loops or cycles.
Graph: A graph G may be defined as a finite set V of vertices and a set E of edges (pair of
connected vertices).
4. What are the different floating-point types in C? Also discuss the minimal requirement
for storage size.
10. Graphs are classified in the non-linear category of data structures. Comment.
5. Floating-point 6. Char
9. 0 10. Linear
15. tree
Books Sartaj Sahni, 1976, Fundamentals of data structures, Computer Science Press
Samir Kumar Bandyopadhyay, 2009, Data Structures Using C, Pearson
Education India
http://www.lix.polytechnique.fr/~liberti/public/computing/prog/c/C/
CONCEPT/data_types.html
http://www.asic-world.com/scripting/data_types_c.html
CONTENTS
Objectives
Introduction
2.4 Summary
2.5 Keywords
Objectives
Introduction
Data are processed by means of certain operations that appear in the data structure. The
particular data structure chosen for a given situation depends largely on the frequency with
which specific operations are performed. An essential aspect of data structures is algorithms:
data structures are implemented using algorithms. In this unit, we will introduce some of the
most frequently used operations. We will also discuss the concept of algorithm complexity.
We may perform the following operations on any linear structure, whether it is an array or a
linked list.
Traversal: By means of traversal operation, we can process each element in the list.
Search: By means of this operation, we can find the location of the element with a given
value or the record with a given key.
Insertion: Insertion operation is used for adding a new element to the list.
Deletion: Deletion operation is used for removing an element from the list.
Sorting: This operation is used for arranging the elements in some type of order.
Merging: By means of merging operation, we can combine two lists into a single list.
Notes Depending upon the relative frequency of these operations, we may choose any
particular linear structure; traversal and insertion are sketched in C below.
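A minimal C sketch of traversal and insertion on an array-based list (the list and insert_at names are illustrative):

#include <stdio.h>

int list[10] = {10, 20, 40};
int n = 3;                         /* current number of elements */

void traverse(void)                /* process (here: print) each element */
{
    for (int i = 0; i < n; i++)
        printf("%d ", list[i]);
    printf("\n");
}

void insert_at(int pos, int item)  /* insertion at index pos */
{
    for (int i = n; i > pos; i--)  /* shift elements right to make room */
        list[i] = list[i - 1];
    list[pos] = item;
    n++;
}

int main(void)
{
    traverse();          /* 10 20 40    */
    insert_at(2, 30);    /* insert 30   */
    traverse();          /* 10 20 30 40 */
    return 0;
}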
2. ......................... operation is used for arranging the elements in some type of order.
Caution An algorithm that requires a gigabyte of main memory is not (currently) useful.
This model clearly has some weaknesses. Obviously, in real life, not all operations take exactly
the same time. In particular, in our model one disk read counts the same as an addition, even
though the addition is typically several orders of magnitude faster. Also, by assuming infinite
memory, we never worry about page faulting, which can be a real problem, especially for
efficient algorithms. This can be a major problem in many applications.
The most important resource to analyze is generally the running time. Several factors affect the
running time of a program. Some, such as the compiler and computer used, are obviously
beyond the scope of any theoretical model, so, although they are important, we cannot deal with
them here. The other main factors are the algorithm used and the input to the algorithm.
An algorithm states explicitly how the data will be manipulated. Some algorithms are more
efficient than others.
The complexity of an algorithm is a function describing the efficiency of the algorithm in terms
of the amount of data the algorithm must process. Usually there are natural units for the domain
and range of this function. There are two main complexity measures of the efficiency of an
algorithm:
Time complexity is a function describing the amount of time an algorithm takes in terms
of the amount of input to the algorithm. “Time” can mean the number of memory accesses
performed, the number of comparisons between integers, the number of times some
inner loop is executed, or some other natural unit related to the amount of real time the
algorithm will take. We try to keep this idea of time separate from “wall clock” time, since
many factors unrelated to the algorithm itself can affect the real time (like the language
used, type of computing hardware, proficiency of the programmer, optimization in the
compiler, etc.). It turns out that, if we chose the units wisely, all of the other stuff doesn’t
matter and we can get an independent measure of the efficiency of the algorithm.
Space complexity is a function describing the amount of memory (space) an algorithm takes in
terms of the amount of input to the algorithm. We often speak of "extra" memory needed, not
counting the memory needed to store the
input itself. Again, we use natural (but fixed-length) units to measure this. We can use
bytes, but it’s easier to use, say, number of integers used, number of fixed-sized structures,
etc. In the end, the function we come up with will be independent of the actual number of
bytes needed to represent the unit. Space complexity is sometimes ignored because the
space used is minimal and/or obvious, but sometimes it becomes as important an issue as
time.
There is often a time-space tradeoff involved in a problem; that is, it cannot be solved with both
little computing time and low memory consumption. One then has to make a compromise and
exchange computing time for memory consumption, or vice versa, depending on which algorithm
one chooses and how one parameterizes it.
Example: We might say "this algorithm takes n^2 time," where n is the number of items
in the input. Or we might say “this algorithm takes constant extra space,” because the amount of
extra memory needed doesn’t vary with the number of items processed.
The difference between space complexity and time complexity is that space can be reused. Space
complexity is not affected by determinism or nondeterminism. It is the amount of computer
memory required during program execution, as a function of the input size.
When solving a computer science problem there will usually be more than just one solution.
These solutions will often be in the form of different algorithms, and you will generally want to
compare the algorithms to see which one is more efficient. This is where Big O analysis helps:
it gives us some basis for measuring the efficiency of an algorithm.
“Big O” refers to a way of rating the efficiency of an algorithm. It is only a rough estimate of the
actual running time of the algorithm, but it will give you an idea of the performance relative to
the size of the input data.
The motivation behind Big-O notation is that we need some way to measure the efficiency of
programs but there are so many differences between computers that we can’t use real time to
measure program efficiency. Therefore we use a more abstract concept to measure algorithm
efficiency.
Definition: Let f(n) and g(n) be functions, where n is a positive integer. We write f(n) = O(g(n)) if
and only if there exist a positive real number c and a positive integer n0 satisfying
0 ≤ f(n) ≤ c·g(n) for all n ≥ n0.
We say, “f of n is big oh of g of n.” We might also say or write f(n) is in O(g(n)), because we can
think of O as a set of functions all with the same property. But we won’t often do that in Data
Structures.
This means, for example, that functions like n^2 + n, 4n^2 − n log n + 12, n^2/5 − 100n, n log n, 50n,
and so forth are all O(n^2).
Every function f(n) bounded above by some constant multiple of g(n) for all values of n greater
than a certain value is O(g(n)).
Example:
Show 3n^2 + 4n − 2 = O(n^2).
We must find constants c and n0 such that 3n^2 + 4n − 2 ≤ c·n^2 for all n ≥ n0. Dividing both sides
by n^2 gives 3 + 4/n − 2/n^2 ≤ c. Choosing c = 5 and n0 = 1 works, since at n = 1 the left-hand side
equals 3 + 4 − 2 = 5, and it only decreases as n grows.
Now suppose, for contradiction, that n^3 = O(n^2). Then there would be constants c and n0 with
n^3 ≤ c·n^2, i.e., n ≤ c, for all n ≥ n0.
But this is not possible; we can never choose a constant c large enough that n will never exceed
it, since n can grow without bound. Thus, the original assumption that n^3 = O(n^2) must be wrong,
so n^3 ≠ O(n^2).
Big O gives us a formal way of expressing asymptotic upper bounds, a way of bounding from
above the growth of a function.
Notes Knowing where a function falls within the big-O hierarchy allows us to compare it
quickly with other functions and gives us an idea of which algorithm has the best time
performance.
As we know Big O notation is a convenient way of describing the growth rate of a function and
hence the time complexity of an algorithm.
Let n be the size of the input and f (n), g(n) be positive functions of n.
The time efficiency of almost all of the algorithms can be characterized by only a few growth
rate functions:
O(1) (constant time): This means that the algorithm requires the same fixed number of steps
regardless of the size of the task.
O(n) (linear time): This means that the algorithm requires a number of steps proportional to the
size of the task.
O(n^2) (quadratic time): The number of operations is proportional to the size of the task squared.
Example:
Some more simplistic sorting algorithms, for instance a selection sort of n elements;
Finding duplicates in an unsorted list of n elements (implemented with two nested loops).
O(log n) (logarithmic time):
Example:
Binary search in a sorted list of n elements;
Insert and Find operations for a binary search tree with n nodes.
O(a^n) (exponential time):
Example:
Recursive Fibonacci implementation (see the C sketch after this list);
Towers of Hanoi.
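A minimal C sketch of the recursive Fibonacci implementation; each call spawns two further calls, so the number of steps grows exponentially with n:

#include <stdio.h>

/* Exponential time: roughly O(2^n) calls, since each call makes two more */
long fib(int n)
{
    if (n < 2)
        return n;                   /* base cases: fib(0) = 0, fib(1) = 1 */
    return fib(n - 1) + fib(n - 2);
}

int main(void)
{
    printf("fib(10) = %ld\n", fib(10));   /* prints 55 */
    return 0;
}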
The best time in the above list is obviously constant time, and the worst is exponential time
which, as we have seen, quickly overwhelms even the fastest computers even for relatively
small n.
Did u know? Polynomial growth (linear, quadratic, cubic, etc.) is considered manageable
as compared to exponential growth.
O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(n^3) < O(a^n)
The definition of big O is pretty ugly to have to work with all the time, kind of like the “limit”
definition of a derivative in Calculus. Here are some helpful theorems you can use to simplify
big O calculations:
Big O is transitive. That is, if f(n) = O(g(n)) and g(n) = O(h(n)), then f(n) = O(h(n)).
log_a n = O(log_b n) for any a, b > 1. This practically means that we don't care, asymptotically,
what base we take our logarithms to. (We said asymptotically. In a few cases, it does
matter.)
Big O of a sum of functions is big O of the largest function. How do you know which one
is the largest? The one that all the others are big O of. One consequence of this is, if f(n) =
O(h(n)) and g(n) = O(h(n)), then f(n) + g(n) = O(h(n)).
Big O only gives you an upper bound on a function, i.e., if we ignore constant factors and let n get
big enough, we know some function will never exceed some other function. But this can give us
too much freedom.
For instance, the time for selection sort is easily O(n^3), because n^2 is O(n^3). But we know that
O(n^2) is a more meaningful upper bound. What we need is to be able to describe a lower bound,
a function that always grows more slowly than f(n), and a tight bound, a function that grows at
about the same rate as f(n). Now let us look at a different (and probably easier to understand)
way to approach this.
Big Omega is for lower bounds what big O is for upper bounds:
Definition: Let f(n) and g(n) be functions, where n is a positive integer. We write f(n) = Ω(g(n)) if
and only if g(n) = O(f(n)). We say “f of n is omega of g of n.”
This means g is a lower bound for f; after a certain value of n, and without regard to multiplicative
constants, f will never go below g.
Finally, theta notation combines upper bounds with lower bounds to get tight bounds:
Definition: Let f(n) and g(n) be functions, where n is a positive integer. We write f(n) = Θ(g(n)) if
and only if f(n) = O(g(n)) and f(n) = Ω(g(n)). We say "f of n is theta of g of n."
The first four properties listed above for big O are also true for Omega and Theta.
Replace O with Ω and “largest” with “smallest” in the fifth property for big O and it
remains true.
n^k = O((1+ε)^n) for any positive k and ε. That is, any polynomial is bounded from above by
any exponential. So any algorithm that runs in polynomial time is (eventually, for large
enough values of n) preferable to any algorithm that runs in exponential time.
(log n)^k = O(n^ε) for any positive k and ε. That means a logarithm to any power grows
more slowly than a polynomial (even things like square root, 100th root, etc.).
Caution An algorithm that runs in logarithmic time is (eventually) preferable to an
algorithm that runs in polynomial (or indeed exponential, from above) time.
9. Every function f(n) bounded above by some constant multiple g(n) for all values of n
greater than a certain value is ..........................
10. Big O gives us a formal way of expressing asymptotic upper bounds.
11. A function “.........................” means that the algorithm requires the same fixed number of
steps regardless of the size of the task.
12. A function “.........................” means that the algorithm requires a number of steps
proportional to the size of the task.
13. In case of “.........................” function, the number of operations is proportional to the size of
the task squared.
15. Big O is ........................., that is, if f(n) = O(g(n)) and g(n) is O(h(n)), then f(n) = O(h(n)).
Case Study Algorithm Analysis
Here we show how to use the big-Oh notation to analyze three algorithms that solve the
same problem but have different running times. The problem we focus on is one that is
reportedly often used as a job interview question by major software and Internet
companies: the maximum subarray problem. In this problem, we are given an array of
positive and negative integers and asked to find the subarray whose elements have the
largest sum. That is, given A = [a_1, a_2, . . . , a_n], find the indices j and k that maximize the
sum
s_{j,k} = a_j + a_{j+1} + · · · + a_k = Σ_{i=j}^{k} a_i
Or, to put it another way, if we use A[j : k] to denote the subarray of A from index j to index
k, then the maximum subarray problem is to find the subarray A[j : k] that maximizes the
sum of its values. In addition to being considered a good problem for testing the thinking
skills of prospective employees, the maximum subarray problem also has applications in
pattern analysis in digitized images. An example of this problem is shown in Figure (a).
In this case, the maximum subarray is A[3 : 6]; that is, the maximum sum is s_{3,6}.
Our first algorithm for the maximum subarray problem, which we call MaxsubSlow, is
shown below. It computes the maximum of every possible subarray
summation, s_{j,k}, of A separately.
Algorithm MaxsubSlow(A):
    Input: An n-element array A of numbers, indexed from 1 to n.
    Output: The maximum subarray summation value of A.
    m ← 0                      // the maximum found so far
    for j ← 1 to n do
        for k ← j to n do
            s ← 0              // the next partial sum to compute, s_{j,k}
            for i ← j to k do
                s ← s + A[i]
            if s > m then
                m ← s
    return m
It isn't hard to see that the MaxsubSlow algorithm is correct. This algorithm calculates the
partial sum, s_{j,k}, of every possible subarray, by adding up the values in the subarray from
a_j to a_k. Moreover, for every such subarray sum, it compares that sum to a running maximum
and, if the new value is greater than the old, it updates that maximum to the new value. In
the end, this will be the maximum subarray sum.
Incidentally, both the calculating of subarray summations and the computing of the
maximum so far are examples of the accumulator design pattern, where we incrementally
accumulate values into a single variable to compute a sum or maximum (or minimum).
This is a pattern that is used in a lot of algorithms, but in this case it is not being used in the
most efficient way possible.
Analyzing the running time of the MaxsubSlow algorithm is easy. In particular, the outer
loop, for index j, will iterate n times, its inner loop, for index k, will iterate at most n times,
and the inner-most loop, for index i, will iterate at most n times. Thus, the running time of
the MaxsubSlow algorithm is O(n3). Unfortunately, in spite of its use of the accumulator
design pattern, giving the MaxsubSlow algorithm as a solution to the maximum subarray
problem would be a bad idea during a job interview. This is a slow algorithm for the
maximum subarray problem.
We can design an improved algorithm for the maximum subarray problem by observing
that we are wasting a lot of time by recomputing all the subarray summations from
scratch in the inner loop of the MaxsubSlow algorithm. There is a much more efficient way
to calculate these summations. The crucial insight is to consider all the prefix sums, which
are the sums of the first t integers in A for t = 1, 2, . . . , n. That is, consider each prefix sum, S_t,
which is defined as
S_t = a_1 + a_2 + · · · + a_t = Σ_{i=1}^{t} a_i
If we are given all such prefix sums, then we can compute any subarray summation, s_{j,k}, in
constant time using the formula
s_{j,k} = S_k − S_{j−1}
where we use the notational convention that S_0 = 0. To see this, note that
S_k − S_{j−1} = Σ_{i=1}^{k} a_i − Σ_{i=1}^{j−1} a_i = Σ_{i=j}^{k} a_i = s_{j,k}
We can incorporate the above observations into an improved algorithm for the maximum
subarray problem, called MaxsubFaster, which we show below.
Algorithm MaxsubFaster(A):
    Input: An n-element array A of numbers, indexed from 1 to n.
    Output: The maximum subarray summation value of A.
    S_0 ← 0                    // the initial prefix sum
    for i ← 1 to n do
        S_i ← S_{i−1} + A[i]
    m ← 0                      // the maximum found so far
    for j ← 1 to n do
        for k ← j to n do
            if S_k − S_{j−1} > m then
                m ← S_k − S_{j−1}
    return m
The correctness of the MaxsubFaster algorithm follows along the same arguments as for
the MaxsubSlow algorithm, but it is much faster. In particular, the outer loop, for index j,
will iterate n times, its inner loop, for index k, will iterate at most n times, and the steps
inside that loop will only take O(1) time in each iteration. Thus, the total running time of
the MaxsubFaster algorithm is O(n^2), which improves the running time of the MaxsubSlow
algorithm by a linear factor. True story: a former student of one of the authors gave this
very algorithm during a job interview for a major software company, when asked about
the maximum subarray problem, correctly observing that this algorithm beats the running
time of the naive O(n^3)-time algorithm by a linear factor. Sadly, this student did not get a
job offer, however, and one possible reason could have been because there is an even
better solution to the maximum subarray problem, which the student didn’t give.
We can improve the running time for solving the maximum subarray problem further by applying
the intuition behind the prefix summations idea to the computation of the maximum itself.
That is, what if, instead of computing a partial sum, S_t, for t = 1, 2, . . . , n, of the values of the
subarray from a_1 to a_t, we compute a "partial maximum," M_t, which is the maximum
summation of a subarray of A[1 : t] that ends at index t?
Such a definition is an interesting idea, but it is not quite right, because it doesn't include the
boundary case where we wouldn't want any subarray that ends at t, in the event that all
such subarrays sum up to a negative number. So, recalling our notation of letting s_{j,k}
denote the partial sum of the values in A[j : k], let us define
M_t = max{0, max_{j≤t} s_{j,t}}
Note that if we know all the M_t values, for t = 1, 2, . . . , n, then the solution to the maximum
subarray problem would simply be the maximum of all these values. So let us consider
how we could compute these M_t values.
The crucial observation is that, for t ≥ 2, if we have a maximum subarray that ends at t, and
it has a positive sum, then it is either A[t : t] or it is made up of the maximum subarray that
ends at t − 1 plus A[t]. If this were not the case, then we could make an even bigger subarray
by swapping out the one we chose to end at t − 1 with the maximum one that ends at t − 1,
which would contradict the fact that we have the maximum subarray that ends at t. In
addition, if taking the value of the maximum subarray that ends at t − 1 and adding A[t] makes
this sum no longer be positive, then M_t = 0, for there is no subarray that ends at t with a
positive summation. In other words, we can define M_0 = 0 as a boundary condition, and use
the following formula to compute M_t, for t = 1, 2, . . . , n:
M_t = max{0, M_{t−1} + A[t]}
Therefore, we can solve the maximum subarray problem using the algorithm,
MaxsubFastest, shown below.
Algorithm MaxsubFastest(A):
    Input: An n-element array A of numbers, indexed from 1 to n.
    Output: The subarray summation value such that A[j] + · · · + A[k] is maximized.
    M_0 ← 0                    // the initial prefix maximum
    for t ← 1 to n do
        M_t ← max{0, M_{t−1} + A[t]}
    m ← 0                      // the maximum found so far
    for t ← 1 to n do
        m ← max{m, M_t}
    return m
The MaxsubFastest algorithm consists of two loops, which each iterate exactly n times and
take O(1) time in each iteration. Thus, the total running time of the MaxsubFastest algorithm
is O(n). Incidentally, in addition to using the accumulator pattern to calculate the M_t and
m variables based on previous values of these variables, it also can be viewed as a simple
application of the dynamic programming technique. Given all these positive aspects of
this algorithm, even though we can’t guarantee that a prospective employee will get a job
offer by describing the MaxsubFastest algorithm when asked about the maximum subarray
problem, we can at least guarantee that this is the way to nail this question.
Source: http://www.ics.uci.edu/~goodrich/teach/cs161/notes/MaxSubarray.pdf
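For concreteness, here is a C translation of MaxsubFastest (a sketch; the maxsub name is ours, and only the previous partial maximum is kept rather than the whole M array, which preserves the O(n) running time):

#include <stdio.h>

/* Linear-time maximum subarray sum, following MaxsubFastest */
int maxsub(const int A[], int n)
{
    int M = 0;   /* partial maximum ending at the current index */
    int m = 0;   /* best subarray sum seen so far               */

    for (int t = 0; t < n; t++) {
        M = M + A[t];
        if (M < 0) M = 0;   /* M_t = max{0, M_{t-1} + A[t]} */
        if (M > m) m = M;   /* m = max{m, M_t}              */
    }
    return m;
}

int main(void)
{
    int A[] = {-2, 4, -1, 3, -6, 2};
    printf("maximum subarray sum: %d\n", maxsub(A, 6));   /* prints 6 (= 4 - 1 + 3) */
    return 0;
}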
2.4 Summary
We can use various operations on data structures such as traversing, insertion, deletion,
sorting, merging, etc.
The most important resource to analyze is generally the running time. Several factors
affect the running time of a program.
Time complexity is a function describing the amount of time an algorithm takes in terms
of the amount of input to the algorithm.
The difference between space complexity and time complexity is that space can be reused.
Space complexity is not affected by determinism or nondeterminism.
“Big O” refers to a way of rating the efficiency of an algorithm. It is only a rough estimate
of the actual running time of the algorithm, but it will give you an idea of the performance
relative to the size of the input data.
2.5 Keywords
Insertion operation: Insertion operation is used for adding a new element to the list.
Time complexity: Time complexity is a function describing the amount of time an algorithm
takes in terms of the amount of input to the algorithm.
Space complexity: Space complexity is a function describing the amount of memory (space) an
algorithm takes in terms of the amount of input to the algorithm.
Constant time: This means that the algorithm requires the same fixed number of steps regardless
of the size of the task.
Linear time: This means that the algorithm requires a number of steps proportional to the size
of the task.
1. Insertion 2. Sorting
3. algorithm 4. Model
Books Sartaj Sahni, 1976, Fundamentals of data structures, Computer Science Press
Samir Kumar Bandyopadhyay, 2009, Data Structures Using C, Pearson
Education India
http://www.xpode.com/ShowArticle.aspx?Articleid=87
http://www.slideshare.net/NavtarSidhuBrar/data-structure-and-its-types-
7577762
CONTENTS
Objectives
Introduction
3.3 Summary
3.4 Keywords
Objectives
Introduction
Sometimes a problem is too difficult or too complex to solve because it is too big. If the problem
can be broken down into smaller versions of itself, we may be able to find a way to solve one of
these smaller versions and then be able to build up to a solution to the entire problem. This is the
idea behind recursion; recursive algorithms break down a problem into smaller pieces which
you either already know the answer to, or can solve by applying the same algorithm to each
piece, and then combining the results. Recursion turns out to be a wonderful technique for
dealing with many interesting problems. Solutions written recursively are often simple.
Recursive solutions are also often much easier to conceive of and code than their iterative
counterparts. In this unit, we will discuss the concept of recursion and recursive functions.
In computer programming, recursion is programming that is recursive, and recursive has two
related meanings: