Introduction to
Database Systems
Motivation
Werner Nutt
1
Databases Are Everywhere
• Database = a large (?) collection of related data
• Classically, a DB models a real-world organisation
(e.g., enterprise, university)
– Entities (e.g., students, courses)
– Relationships (e.g., “Martin is taking IDS in 2010/11”)
• Changes in the organisation = changes in the database
• Examples:
– personnel records
– banking
– airline reservations
2
Scientific Databases (Examples)
• Biology:
e.g., DNA sequences of genes, amino-acid sequences
of proteins, genes expressed in tissues
(up to several Gigabytes)
• Astronomy:
e.g., location and spectra of astronomic objects
(up to several Terabytes)
• Physics:
e.g., sensor measurements in particle physics
experiments
(up to several Petabytes)
3
DB Tendencies
• Sensors record data
Î DBs grow in size
Î DBs become more widespread
Î date may be less reliable, i.e., uncertain
• Multimedia data
Î Requirements for larger storage
Î New query operations
(e.g., find a song by humming the melody,
find pictures with a given face)
• Data on the Web
Î Accessed/changed by many people (Facebook,…)
Î Speed up access, loosen consistency (NoSQL)
4
Operations with Databases
• Design
– Define structure and types of data
• Construction
– Create data structures of DB, populate DB with data
• Manipulation of Data
– Insert, delete, update
– Query: “Which department pays the highest salary?”
– Create reports:
“List monthly salaries of employees, organised
by department, with average salary and total
sum of salaries for each dept”
5
An Ideal DB Implementation
Should Support:
• Structure • Performance
– data types – retrieve and store data
– data behaviour quickly
• Persistence • Data Integrity
– store data on • Sharing
secondary storage – concurrency
• Retrieval • Reliability and resilience
– a declarative query • Large data volumes
language
– a procedural database
programming language
6
Database Management System (DBMS)
• A DBMS is a software package designed
to store and manage databases
• A DBMS provides generic functionality (see
previous slide) that otherwise would have to be
implemented over and over again
Î Reduced application development time
• Several brands, e.g.,
– Oracle Xi/Yg (Oracle), DB2 (IBM), SQL Server, Access
(Microsoft), MySQL, PostgreSQL, HSQLDB, SQLite
(open source)
7
Database Actors
Database Application
Designers Programmers “on the scenes”
Database End Users
Administrator • sophisticated
• casual
(DBA) • ‘parametric’ or
‘canned’ transactions
Database
DBMS developers Operators and Maintenance
Tool Developers
Personnel “behind the scenes”
Database Management System 8
File System: A Physical Interface
Student Year
Student Admin Lists
Data
Course Timetable
Scheduler
Data
Lecturer Money
Payroll
Data Transfer
9
Sharing Data: Replication Î Redundancy
Student
Admin
Lab Student Data
Timetable
Course Data
Tutorials
Lecturer Data
Scheduler
Teaching
Schedule
Payroll
10
Sharing Data and Operations
Student Admin
Student Data
Lab Timetable
Course Tutorials
Data
Scheduler
Teaching
Lecturer
Schedule
Data
Payroll
11
DBMS: A Logical Interface
University Database Lab
Data Timetable
Database
Management Teaching
course
System Schedule
student
lecturer
Tutorials
Data Dictionary
or
System Catalog
University
?QUERIES
Database
Metadata
12
File System Approach
• Uncontrolled redundancy
• Inconsistent data
• Inflexibility
• Limited data sharing
• Poor enforcement of standards
• Low programmer productivity
• Excessive program maintenance
• Excessive data maintenance
13
DBMS Approach
• Controlled redundancy • Services & controls
– consistency of data & – security & privacy
integrity constraints controls
• Integration of data – backup & recovery
– self-contained – enforcement of
– represents semantics standards
of application • Flexibility
• Data and operation – data independence
sharing – data accessibility
– multiple interfaces – reduced program
maintenance
• Ease of application 14
development
However.... Summary:
If an application is • In a file system, data is
• simple physically accessed and
• stringent real-time not integrated
• single user
• In a DBMS, data is
• static,
logically accessed and
files are the option of choice integrated:
– query language
DBMS downside: – data dictionary
• more expensive
• more complex
• general 15