1. What are Data Structures?
CITS2200 Data Structures and Algorithms
Data structures are software
artifacts that allow data to
be stored, organized and accessed.
Topic 1
Introduction to Data Structures
They are more high-level than
computer memory (hardware)
and lower-level than databases
and spreadsheets (which associate meta-data and meaning
to the stored data).
Why study data structures?
Collections, abstract data types (ADTs), and algorithm analysis
More on ADTs
Ultimately data structures
have two core functions: put
stuff in, and take stuff out.
Whats ahead?
?
before
window
null
after
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 1
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 2
2. What will we Study?
Why?
software is complex
more than any other man made system
even more so in todays highly interconnected world
2.1 Collections
. . . as name suggests, hold a bunch of things. . .
software is fragile
smallest logical error can cause entire systems to crash
nearly every nontrivial piece of software involves the use of collections
neither you, nor your software, will work in a vacuum
Seen arrays others include queues, stacks, lists, trees, maps, sets, tables. . .
the world is unpredictable
clients are unpredictable!
Software must be correct, efficient, easy to maintain, and reusable.
E
A
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 3
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 4
Why so many?
2.2 Abstract Data Types
Space efficiency
Allow user to abstract away from implementation detail.
Time efficiency:
Consider the statement: I put my lunch in my bag and went to Uni.
store (add to collection)
What is meant by the term bag in this context?
search (find an object)
Most likely it is a backpack, or satchel, but it could also be a hand bag, shopping
bag, sleeping bag, body bag . . . (but probably not a bean bag).
retrieve (read information)
remove or replace
It doesnt actually matter. To parse the statement above, we simply understand
that a bag is something that we can
clone (make a copy)
1. put things in,
2. carry places, and
3. take things out.
Such a specification is an Abstract Data Type.
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 5
2.3 Algorithm Analysis
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 6
Introduction to Data Structures
Slide 8
Time Efficiency
Time performance of algorithms can vary greatly.
We will consider a number of alternative implementations for each ADT.
Which is best?
Example: Finding a word in the dictionary
Algorithm 1:
Simplicity and Clarity
Look through each word in turn until you find a match.
All things being equal we prefer simplicity, but they rarely are. . .
Algorithm 2:
Space Efficiency
go to half way point
space occupied by data overheads
compare your word with the word found
space required by algorithm (eg recursion)
can it blow out?
if < repeat on earlier half
else > repeat on later half
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 7
c Tim French
!
CITS2200
Performance
2.4 ADTs and Java
Algorithm 1 (exhaustive search) proportional to n/2
Object-oriented programming was originally based around the concept of abstract
data types.
Algorithm 2 (binary search) proportional to log n
Java classes are ideal for implementing ADTs.
number of
Algorithm 1
Algorithm 2
words max. comparisons max. comparisons
10
10
4
100
100
7
1000
1000
10
10000
10000
14
100000
100000
17
1000000
1000000
20
c Tim French
!
CITS2200
Introduction to Data Structures
ADTs require:
Some references (variables) for holding the data
(usually hidden from the user)
Some operations that can be performed on the data
(available to the user)
Slide 9
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 10
CITS2200
Introduction to Data Structures
Slide 12
2.5 Information Hiding
A class in Java has the general structure. . .
class declaration
Variables can be made private
no access by users
variable declarations
.
.
.
// data held
method declarations
.
.
.
// operations on the data
Methods can be made public
used to create and manipulate data structure
This encapsulation is good programming practice
can change
the way the data is stored
the way the methods are implemented
without changing the (external) functionality .
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 11
c Tim French
!
Example: A Matrix Class
public void set (int i, int j, int value) {
matrixArray[i][j]=value;
}
public class Matrix {
private int[][] matrixArray;
public int get (int i, int j) {return matrixArray[i][j];}
public Matrix (int rows, int columns) {
matrixArray = new int[rows][columns];
for (int i=0; i<rows; i++)
for (int j=0; j<columns; j++)
matrixArray[i][j] = 0;
}
public void transpose () {
int rows = matrixArray.length;
int columns = matrixArray[0].length;
int[][] temp = new int[columns][rows];
for (int i=0; i<rows; i++)
for (int j=0; j<columns; j++)
temp[j][i] = matrixArray[i][j];
matrixArray = temp;
}
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 13
Q: What is the time performance of transpose()?
c Tim French
!
CITS2200
Introduction to Data Structures
public class MatrixReloaded {
For a matrix with n rows and m columns, how many (array access) operations are
needed?
private int[][] matrixArray;
private boolean isTransposed;
Can you think of a more efficient implementation? One that doesnt move any
data?
public MatrixReloaded (int rows, int columns) {
matrixArray = new int[rows][columns];
for (int i=0; i<rows; i++)
for (int j=0; j<columns; j++)
matrixArray[i][j] = 0;
isTransposed = false;
}
c Tim French
!
Slide 14
CITS2200
Introduction to Data Structures
Slide 15
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 16
What is the time performance of transpose()?
public void set (int i, int j, int value) {
Does it depend on the size of the array?
}
How do the changes affect the users program?
public int get (int i, int j) {
}
public void transpose () {
}
}
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 17
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 18
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 20
2.6 Advantages of ADTs
modularity independent development, re-use, portability, maintainability, upgrading, etc
delay decisions about final implementation
separate concerns of application and data structure design
information hiding (encapsulation) access by well-defined interface
Also other OO benefits like:
polymorphism same operation can be applied to different types
inheritance subclasses adopt from parent classes
c Tim French
!
CITS2200
Introduction to Data Structures
Slide 19