Databases 2
Programming with
Web Technologies
Overview
Today we will be exploring SQL in more detail, exploring further query syntax.
We will also be looking into the relational model and how it maps to and from
the SQL tables. Lastly, we will look at entity-relationship (ER) models and how
they can apply to the relational model
For further query syntax, we will be looking at naming expressions and tables,
filtering, joining and unions
Before we talk about this further syntax, we need to quickly introduce a
sample database to use in examples
2
unidb_students
id fname lname country mentor
1661 James Beam US 1715
1668 Jack Daniels US 1715
1713 James Speight NZ 1715
1715 Henry Wagstaff NZ 1970
1717 Johnnie Walker AU 1715
1824 Tia Maria MX 1970
1828 Dom Perignon FR 1970
1970 Scarlett Ohara SC 1970
3
unidb_staff
staff_no fname lname office
123 David Bainbridge G.1.24
456 Geoff Holmes FG.1.01
666 Annika Hinze G.2.26
707 Te Taka Keegan G.2.07
878 Barney Rubble L.2.34
919 SpongeBob Squarepants L.3.13
4
unidb_courses
dept num descrip coord_no rep_id
comp 219 Databases 666 1713
comp 258 OOP 707 1970
comp 425 HCI 123 1717
ling 219 Linguistics 878 1824
teal 202 Film 919 1668
5
unidb_attend
id dept num semester grade mark
1661 comp 219 A A- 91.4
1661 comp 258 A B- 81.2
1668 comp 219 A B+ 88
1668 comp 258 A B+ 89.2
1717 comp 219 A RP 89.1
1828 ling 219 A A- 96
1970 comp 219 A B+ 49.9
1970 teal 202 B NULL NULL
6
unidb_teach
dept num staff_no
comp 219 666
comp 219 707
comp 258 707
ling 219 878
teal 202 919
7
DML: Select - Revision
Recall from the Databases 1 lecture, information can be retrieved from a
database using a SELECT statement. The simplest form of a SELECT
statement is
SELECT <columns>
FROM <table>
WHERE <expression>;
The result is always a table, regardless of how many rows or columns it
returns
8
DML: Select - Extended Form
The SELECT statement can be supplemented with a few extra clauses
SELECT [DISTINCT] <columns>
FROM <table>
WHERE <expression>
GROUP BY <column_name>
HAVING <search_condition>
ORDER BY <column_name> [<direction>]
LIMIT <count>;
The result is still always a table
9
DML: Select - Extended Form
DISTINCT
When present, the resulting table will have duplicate rows removed
LIMIT
Restricts the resulting table to a specified number of rows
ORDER BY
Order the results based on the natural ordering of one or more columns.
Direction can be ASC or DESC for ascending and descending orders
respectively, and will be ASC by default
10
DML: Select - Extended Form
HAVING and GROUP BY are a bit more complicated. These are used when an
aggregate function is used in the query
What is an aggregate function? They are functions that group multiple rows
together to give a single value with more significant meaning. Examples would
be COUNT(), MIN(), MAX(), and SUM(), which operate over multiple values.
SELECT COUNT(*)
FROM unidb_students;
Will count the number of students in the unidb_students table
11
GROUP BY
Aggregate functions are neat, but they operate over the whole table. How can
they be restricted to a group of rows so that we can answer questions like
"How many students are from each country?"
SELECT country, COUNT(id)
FROM unidb_students
GROUP BY country;
The GROUP BY clause splits our result set into groups according to values in
the specified column, then the aggregate function is applied to each of those
groups
12
HAVING
If you want to filter your results based on the value produced by an aggregate
function you need to do it using the HAVING clause rather than the WHERE
clause
SELECT country, COUNT(id)
FROM unidb_students
WHERE country != 'NZ'
GROUP BY country
HAVING COUNT(id) < 2;
13
DML: WHERE - Filtering Data
<expression> [AND | OR] <expression>
Chain two expressions together with a boolean operator
NOT <expression>
Inverts the result of the expression
<column> BETWEEN xxx AND yyy
Evaluates to true if the value found in <column> exists between the specified
values
<column> IN ('aaa', 'bbb', 'ccc')
Evaluates to true if the value found in <column> is in the list of values
14
Naming Expressions and Tables
Often you will encounter column and table names that are long or confusing,
that you don't want to have to retype. SQL uses the AS keyword to allow us to
alias these names to something shorter or more useful
SELECT aReallyLongColumnName AS a
FROM table
WHERE a > 5;
SELECT [Link]
FROM aBadlyNamedTable AS s;
15
Joins & Unions
Joining Tables
Given that we have a students table and an attends table, what would be a
logical way to join them?
SELECT *
FROM unidb_students AS s, unidb_attend AS a
WHERE [Link] = [Link];
This type of join is known as the equi-join or the natural inner join. It will
return all columns from both tables, with each row being made up of the
unidb_students row, with the row with the matching id column from
unidb_attend beside it
17
Joining Tables
We can use joins to answer questions like "what are the names of the
students who are enrolled in COMP219?"
SELECT [Link], [Link]
FROM
unidb_students AS s,
unidb_attend AS a
WHERE [Link] = [Link]
AND [Link] = 'comp'
AND [Link] = '219';
18
Other Joins
There are many types of joins available in SQL, but we will not be talking about
all of them today. Thinking about the tables and joins like circles and overlaps
in Venn diagrams can be useful for deciding which to use
Documentation is plentiful for SQL
joins. Examples for using each can
easily be found in the usual locations
19
Unions
To combine columns of different tables we can use the JOIN keyword (mostly
when using the non-inner joins). To combine rows of different queries we can
use the UNION keyword
For the Union to be successful the output rows of the two (or more) queries
must be compatible. This means that each resulting table needs the same
number of columns, with each having the same types for the UNION to work
By default duplicates are eliminated. If you want duplicates to be preserved,
you can use UNION ALL
20
Unions
Unions can be used to answer questions like "List the names of all students
and all lecturers"
SELECT fname, lname
FROM unidb_students
UNION
SELECT fname, lname
FROM unidb_lecturers;
21
Data Modelling
Data Modelling
23
Data Model: Relational
24
Data Model: Relational
courses(dept, number, description, coord_no, rep_id)
Primary Key Foreign Keys
CREATE TABLE courses (
dept CHAR(4) NOT NULL
, num CHAR(3) NOT NULL
, descrip VARCHAR(24)
, coord_no INT NOT NULL
, rep_id INT NOT NULL
, PRIMARY KEY (dept, num)
, FOREIGN KEY (coord_no) REFERENCES lecturers (staff_no)
, FOREIGN KEY (rep_id) REFERENCES students (id)
);
25
Good Design
Real-world applications lead to databases with hundreds of tables. The
Relational Model helps with the design of a specific table, but not how to form
relationships with tables
This is something that can be developed through the use of an
Entity-Relationship (ER) Model which can then be turned into Relational
Models. These Relational Models can then be turned into SQL Tables
Good design has strong impact on space usage, performance (updating,
indexing, retrieving, enforcing rules, etc.) and application development
26
Entity-Relationship (ER) Model
An ER model is a conceptual tool for database design. They help map a
real-world organization onto a database schema
ER diagrams/models are made up of 3 simple concepts
Entities
attr attr attr attr
Any independent object or concept
Attributes entity relate entity
Properties of entities (columns)
Relationships attr
Associations among entities
27
ER Diagrams
Step 1: Starting Point - Description
"Students take a course and receive a grade for their performance. There are
no limits on how many courses a student can take, or on how many students
complete a particular course. Each course is given by several lecturers
Each student is supervised by exactly one lecturer. Each lecturer is allowed to
be the supervisor of at most 20 students. Courses have a unique course
number and a course title. Students and lecturers have a name and a unique
IRD number"
28
ER Diagrams
Step 2: Identify Entities
Entities are often the important nouns in the description
"Students take a course and receive a grade for their performance. There are no limits on how many
courses a student can take, or on how many students complete a particular course. Each course is
given by several lecturers. Each student is supervised by exactly one lecturer
Each lecturer is allowed to be the supervisor of at most 20 students. Courses have a unique course
number and a course title. Students and lecturers have a name and a unique IRD number"
29
The Diagram so Far
student lecturer
course
30
ER Diagrams
Step 3: Identify Relationships
Relationships are often verbs in the description
"Students take a course and receive a grade for their performance. There are no limits on how many
courses a student can take, or on how many students complete a particular course. Each course is
given by several lecturers. Each student is supervised by exactly one lecturer
Each lecturer is allowed to be the supervisor of at most 20 students. Courses have a unique course
number and a course title. Students and lecturers have a name and a unique IRD number"
31
The Diagram so Far
super-
student lecturer
vise
take course give
32
ER Diagrams
Step 4: Identify Attributes
Attributes describe entities and relationships
"Students take a course and receive a grade for their performance. There are no limits on how many
courses a student can take, or on how many students complete a particular course. Each course is
given by several lecturers. Each student is supervised by exactly one lecturer
Each lecturer is allowed to be the supervisor of at most 20 students. Courses have a unique course
number and a course title. Students and lecturers have a name and a unique IRD number"
33
The Diagram so Far
irdno name irdno name
super-
student lecturer
vise
title num
take course give
grade
34
ER Diagrams: More concepts
Key attributes
Used to identify entities
key attr
Cardinalities
Identify the number of participants in a relationship entity
Minimum and maximum for each entity
(min1, max1) (min2, max2)
entity1 relation entity2
35
Cardinality in ER Diagrams
Many, perhaps none Many, at least one
(0, *) (1, *)
student attend course
Many, perhaps none Exactly one
(0, *) (1, 1)
lecturer coord course
Up to two Many, at least one
(0, 2) (1, *)
student relation course
36
ER Diagrams: Principle
key attr attribute key attr attribute
(n1, m1) (n2, m2)
entity relation entity
attribute
37
ER Diagrams
Step 5: Identify Keys
Keys are attributes that are unique for each entity or relationship
"Students take a course and receive a grade for their performance. There are no limits on how many
courses a student can take, or on how many students complete a particular course. Each course is
given by several lecturers. Each student is supervised by exactly one lecturer
Each lecturer is allowed to be the supervisor of at most 20 students. Courses have a unique course
number and a course title. Students and lecturers have a name and a unique IRD number"
38
The Diagram so Far
irdno name irdno name
super-
student lecturer
vise
title num
take course give
grade
39
ER Diagrams
Step 6: Find Cardinalities
Look for key phrases indicating numbers or quantities
"Students take a course and receive a grade for their performance. There are no limits on how
many courses a student can take, or on how many students complete a particular course. Each
course is given by several lecturers. Each student is supervised by exactly one lecturer
Each lecturer is allowed to be the supervisor of at most 20 students. Courses have a unique course
number and a course title. Students and lecturers have a name and a unique IRD number"
40
The Completed Diagram
irdno name irdno name
(1, 1) super- (0, 20)
student lecturer
vise
(0, *) (0, *)
title num
take course give
(0, *) (1, *)
grade
41
From ER to Relational (Overview)
1. For each entity define a relation (table), including the attributes and keys
of the entity
ename(attributes, ...)
2. For each m to n relationship define a relation (table), including the
attributes and keys of the relation
rname(attributes, ...) the key is formed by the keys of both entities
3. For each 1 to n relationship between entities A and B, the key of A
becomes an attribute of B
4. For each 1 to 1 relationship between entities A and B, either the key of A
becomes an attribute of B, or the key of B becomes an attribute of A
5. Normalization
42
Normalization
"Database normalization, or simply normalization, is the process of organizing
the columns (attributes) and tables (relations) of a relational database to
reduce data redundancy and improve data integrity." -- Wikipedia
Normalization is described in a number of forms, 1NF, 2NF, 3NF, BCNF, ….
Most of these are beyond us at the moment - look into advanced DB courses
First Normal Form (1NF)
Atomic attributes: make sure you take apart compound attributes, such
as full names, addresses, …
43
From ER to Relational
id name semester dept num staff_no name
(0, 1)
(0, *) (1, 20) (1, 3) (0, *)
students attend courses teach lecturers named
country descript office
mark
(1, 1)
coord- alias nickname
is-rep
(0, *) (1, 1) (1, 1) inate (0, *)
Note: this is a different example to that on previous slides
Step 1: For each entity define a table (inherit keys)
students(id, name, country)
courses(dept, number, description)
lecturers(staff_no, name, office)
nickname(alias)
44
From ER to Relational
id name semester dept num staff_no name
(0, 1)
(0, *) (1, 20) (1, 3) (0, *)
students attend courses teach lecturers named
country descript office
mark
(1, 1)
coord- alias nickname
is-rep
(0, *) (1, 1) (1, 1) inate (0, *)
Step 2: For each m:n relationship define a table (inherit keys)
attend(id, dept, number, semester, mark)
teach(dept, number, staff_no)
45
From ER to Relational
id name semester dept num staff_no name
(0, 1)
(0, *) (1, 20) (1, 3) (0, *)
students attend courses teach lecturers named
country descript office
mark
(1, 1)
coord- alias nickname
is-rep
(0, *) (1, 1) (1, 1) inate (0, *)
Step 3: For each 1:n relationship include key of A as foreign key in B
courses(dept, number, description, staff_no, student_id)
46
From ER to Relational
id name semester dept num staff_no name
(0, 1)
(0, *) (1, 20) (1, 3) (0, *)
students attend courses teach lecturers named
country descript office
mark
(1, 1)
coord- alias nickname
is-rep
(0, *) (1, 1) (1, 1) inate (0, *)
Step 4: For each 1:1 relationship include key of A as foreign key in B, or
include key of B in A
lecturers(staff_no, name, office, alias)
OR
nickname(alias, staff_no)
47
From ER to Relational
students(id, name, country)
courses(dept, number, description, coord_no, rep_id)
lecturers(staff_no, name, office)
attend(id, dept, number, semester, mark)
teach(dept, number, staff_no)
nickname(alias, staff_no)
Notice that the cardinalities are not all modeled (min is missing). Also note
that we have made changes to attribute names - this is fine as long as they are
clear, it helps understanding
48
From ER to Relational
Finally, normalization
students(id, fname, lname, country)
courses(dept, number, description, coord_no, rep_id)
lecturers(staff_no, fname, lname, office)
attend(id, dept, number, semester, mark)
teach(dept, number, staff_no)
nickname(alias, staff_no)
49
Summary
● Defining the structure of a database is crucial
● ER modelling facilitates the proper definition of databases
● ER diagrams include
○ Entities
○ Attributes
○ Key attributes
○ Relationships
○ Cardinalities
50
Summary
Translating ER diagrams to relational model in 5 steps
1. Transform entities
2. Transform m:n relationships
3. Transform 1:n relationships
4. Transform 1:1 relationships
5. Normalize
Renaming is allowed when clear
51