Database and Security
CPSC 4270/5270
Tuesday, Aug. 22, 2017
Faculty
Instructor: Li Yang
CPSC 4270/5270,
[email protected] Office hours: T: 9-noon, R: 9:11am
Course Time: T/R 12:15-1:30pm
Midterm: Wednesday, Sept. 28, 12:15-
1:30pm
Final: Thursday, December 7: 10:30 am-
12:30 pm
Content
Database Basics, SQL
Database Transactions
NoSQL
OS security and administration of Users
Access Control Models: DAC, MAC,
RBAC
SQL injection
Virtual Private Databases
Database Auditing Models
Teaching/Learning
Strategies
Lecture
Hands-on projects
Process-oriented Guided Inquiry
Learning (POGIL)
Term paper and presentation (PPT)
CPSC 4670 4
Textbooks
Wont follow any book, but you may want to
consult them if you need more details to
understand a topic
Sam Afyouni, Database Security and Auditing:
Protecting Data Integrity and Accessibility.
Thomson. ISBN: 0-619-21559-3, 2005.
The Complete Book, GarciaMolina, Ullman,
Widom
Ron Ben-Natan, Implementing Database
Security and Auditing, Elsevier digital press.
ISBN: 1-55558-334-2. 2005.
Database Basics
CPSC 4270/5270
Database
What is a database ?
Give examples of databases
Database
What is a database ?
A collection of files storing related data
Give examples of databases
Accounts database; payroll database;
UTCs students database; Amazons
products database; airline reservation
database
Database Systems
The four components of a database system are:
Users
Database Application
Database Management System (DBMS)
Database
9
Users, Database
A user of a database A database is a self-describing
system will collection of related records
Self-describing
Use a database
application to track The database itself
contains the definition of its
things structure
Use forms to enter, Metadata is data describing
read, delete and the structure of the
query data database data
Produce reports Tables within a relational
database are related to each
other
10
Database Management System (DBMS)
A big C program written by someone else that
allows us to manage efficiently a large
database and allows it to persist over long
periods of time
A database management system (DBMS)
serves as an intermediary between database
applications and the database
The DBMS manages and controls database
activities
The DBMS creates, processes and administers
the databases it controls
11
Database Applications
A database application is a set of one or more
computer programs that serves as an
intermediary between the user and the DBMS
A database application
Create and process forms
Process user queries
Create and process reports
Execute application logic
Control database applications
12
Desktop Database Systems
Desktop database systems typically:
Have one application
Have only a few tables
Are simple in design
Involve only one computer
Support one user at a time
13
Organizational Database Systems
Organizational database
systems typically:
Support several users
simultaneously
Include more than one
application
Involve multiple computers
Are complex in design
Have many tables
Have many databases
14
DBMS Products
Example Desktop DBMS Products
Microsoft Access
Example Organizational DBMS Products
MySQL
Oracles Oracle
Microsofts SQL Server
IBMs DB2
InterSystems Cach
http://www.intersystems.com/cache/
15
Functionality
1. Create/store large datasets
2. Search/query/update
3. Change the structure
4. Concurrent access to many user
5. Recover from crashes
6. Security (not here, but in other apps)
Possible Organizations
Files
Spreadsheets
DBMS
1. Create/store Large Datasets
Yes, but
Files
Spreadsheets Not really
Yes
DBMS
2. Search/Query/Update
Simple query:
In what year was Rain man produced ?
Multi-table query:
Find all movies by Coppola
Complex query:
For each actor, count her/his movies
Updating
Insert a new movie; add an actor to a
movie; etc
2. Search/Query/Update
Simple queries
Files
Multi-table queries
Spreadsheets (maybe)
All
DBMS
Updates: generally OK
3. Change the Structure
Add Address to each Actor
Very hard
Files
Yes
Spreadsheets
Yes
DBMS
4. Concurrent Access
Multiple users access/update the data concurrently
What can go wrong ?
How do we protect against that in OS ?
This is insufficient in databases; why ?
4. Concurrent Access
Multiple users access/update the data concurrently
What can go wrong ?
Lost update; resulting in inconsistent data
How do we protect against that in OS ?
Locks
4. Concurrent Access
Transfer $100 from Find total amount
account A to B: in A and B:
X = Read(Accounts, A);
X.amount = X.amount -
100; X = Read(Accounts, A);
Write(Accounts, A, X); Y = Read(Accounts, B);
S = X.amount + Y.amount
return S
Y = Read(Accounts, B);
Y.amount = Y.amount +
100;
Write(Accounts, B, Y);
What can go wrong ? Do locks help ?
5. Recover from crashes
X = Read(Accounts, A);
X.amount = X.amount - 100;
Write(Accounts, A, X);
Y = Read(Accounts, B); CRASH !
Y.amount = Y.amount + 100;
Write(Accounts, B, Y);
What is the problem ?
Enters a DMBS
Two tier system or client-server
connection
(ODBC, JDBC)
Database server
(someone elses
C program)
Data files Applications
DBMS = Collection of Tables
Directors: Movie_Directors:
id fName lName id mid
15901 Francis Ford Coppola 15901 130128
... ...
Movies:
mid Title Year
130128 The Godfather 1972
...
Still implemented as files,
but behind the scenes can be quite complex
data independence
1. Create/store Large Datasets
Use SQL to create and populate tables:
CREATE TABLE Actors (
Name CHAR(30) INSERT INTO Actors
DateOfBirth CHAR(20) VALUES(Tom Hanks, . . .)
) ...
Size and physical organization is handled by DBMS
We focus on modeling the database
Will study data modeling in this course
2.
Searching/Querying/Updating
Find all movies by Coppola
SELECT title
FROM Movies, Directors, Movie_Directors
WHERE Directors.lname = Coppola and
Movies.mid = Movie_Directors.mid and
Movie_Directors.id = Directors.id
What happens behind the scene ?
We will study SQL in gory details in this course
3. Changing the Structure
Add Address to each Actor
ALTER TABLE Actor
ADD address CHAR(50)
DEFAULT unknown
Lots of cleverness goes on behind the scenes
3&4 Concurrency&Recovery:
Transactions
A transaction = sequence of statements
that either all succeed, or all fail
E.g. Transfer $100 BEGIN TRANSACTION;
UPDATE Accounts
SET amount = amount - 100
WHERE number = 4662
UPDATE Accounts
SET amount = amount + 100
WHERE number = 7199
COMMIT
Transactions
Transactions have the ACID properties:
A = atomicity
C = consistency
I = isolation
D = durability
4. Concurrent Access
Serializable execution of transactions
The I (=isolation) in ACID
The concurrent execution of transactions
results in a system state that would be
obtained if transactions were executed
sequentially.
5. Recovery from crashes
Every transaction either executes
completely, or doesnt execute at all
The A (=atomicity) in ACID
Atomicity requires that each transaction be "all or
nothing": if one part of the transaction fails, then
the entire transaction fails, and the database state
is left unchanged.
Durability: once a transaction has been committed,
it will remain so, even in the event of power loss,
crashes, or errors.
The Relational Model
Relational Databases
Relational databases are designed to address
many of the information complexity issues
A relational database stores information in tables.
Each informational topic is stored in its own
table.
In essence, a relational database will break-up a
list into several parts. One part for each theme
in the list.
A Project List would be divided into a CUSTOMER
Table, a PROJECT Table, and a
PROJECT_MANAGER Table
36
Entity
An entity is something of importance to
a user that needs to be represented in
a database
An entity represents one theme or topic
37
Relation
A relation is a two-dimensional table
that has specific characteristics
The table dimensions, like a matrix,
consist of rows and columns
38
Characteristics of a Relation
Rows contain data about an entity
Columns contain data about attributes of the
entity
Cells of the table hold a single value
All entries in a column are of the same kind
Each column has a unique name
The order of the columns is unimportant
The order of the rows is unimportant
No two rows may be identical
39
A Sample Relation
EmployeeNumber FirstName LastName
100 Mary Abermany
101 Jerry Caldera
104 Alea Copley
107 Murugan Jacksoni
40
A Nonrelation Example
Cells of the table hold multiple values
EmployeeNumber Phone LastName
100 335-6421, Abermany
454-9744
101 215-7789 Caldera
104 610-9850 Copley
107 299-9090 Jacksoni
41
A Nonrelation Example
No two rows may be identical
EmployeeNumber Phone LastName
100 335-6421 Abermany
101 215-7789 Caldera
104 610-9850 Copley
100 335-6421 Abermany
107 299-9090 Jacksoni
42
Terminology
Synonyms
Table Row Column
File or Data file Record Field
Relation Tuple Attribute
43
A Key and Uniqueness of Keys
A key is one (or more) columns of a relation
that is (are) used to identify a row
Unique Key Non-unique Key
Data value is unique Data value may be
for each row. shared among several
Consequently, the rows.
key will uniquely Consequently, the
identify a row. key will identify a set
of rows.
44
A Composite Key
A composite key is a key that contains two or
more attributes
For a key to be unique, often it must become a
composite key
For example,
To identify a family member, you need to know a
FamilyID, a FirstName, and a Suffix (e.g., Jr.)
The composite key is:
(FamilyID, FirstName, Suffix)
One needs to know the value of all three columns to
uniquely identify an individual
45
A Candidate and Primary Key
A candidate key is called candidate because it is
a candidate to become the primary key
A candidate key is a unique key
A primary key is a candidate key chosen to be the
main key for the relation
If you know the value of the primary key, you will
be able to uniquely identify a single row
46
Relationships Between Tables
A table may be related to other tables
For example
An Employee works in a Department
A Manager controls a Project
To preserve relationships, you may need to
create a foreign key
A foreign key is a primary key from one table
placed into another table
The key is called a foreign key in the table that
received the key
47
Foreign Key Example
Project Manager
ProjID MgrID Primary Key
ProjName MgrName
MgrID
Foreign Key
48
Foreign Key Example
Department Employee
DeptID Primary Key EmpID
DeptName DeptID
Foreign Key
Location EmpName
49
Referential Integrity
Referential integrity states that every
value of a foreign key must match a
value of an existing primary key
For example (see previous slide)
If EmpID = 4 in EMPLOYEE has a DeptID
= 7 (a foreign key), a Department with
DeptID = 7 must exist in DEPARTMENT
50
Referential Integrity
Another perspective
The value of the Foreign Key EmployeeID
in EQUIPMENT
must exist in
The values of the Primary Key EmployeeID
in EMPLOYEE
51
A Surrogate Key
A Surrogate Key is a unique, numeric
value that is added to a relation to
serve as the Primary Key
Surrogate Key values have no meaning
to users and are usually hidden in
forms, queries and reports
A Surrogate Key is often used in place
of a composite primary key
52
Surrogate Key Example
If the Family Member Primary Key is FamilyID, FirstName,
Suffix, it would be easier to append and use a surrogate
key of FamMemberID
FamilyID, FirstName and Suffix remain in the relation
Referential Integrity:
Use
(FamMemberID) in School must exist in
(FamMemberID) in FamilyMember
Instead of:
(FamilyID, FirstName, Suffix) in School must exist in
(FamilyID, FirstName, Suffix) in FamilyMember
53
Functional Dependency
A relationship between attributes in
which one attribute (or group of
attributes) determines the value of
another attribute in the same table
Illustration
The price of one cookie can determine the
price of a box of 12 cookies
(CookiePrice, Qty) BoxPrice
54
Determinants
The attribute (or attributes) that we use
as the starting point (the variable on
the left side of the equation) is called a
determinant
(CookiePrice, Qty) BoxPrice
Determinant
55
Candidate/Primary Keys and
Functional Dependency
By definition
A candidate key of a relation will functionally
determine all other attributes in the row
Likewise, by definition
A primary key of a relation will functionally
determine all other attributes in the row
56
Primary Key and Functional
Dependency Example
(EmpLastName,
(EmployeeID)
EmpPhone)
(ProjectID) (ProjectName,
StartDate)
57
Normalization
Normalization is a process of analyzing
a relation to ensure that it is well-
formed
More specifically, if a relation is
normalized (well-formed), rows can be
inserted, deleted, or modified without
creating update anomalies
58
Normalization Principles
Relational design principles for
normalized relations:
To be a well-formed relation, every
determinant must be a candidate key
Any relation that is not well formed should
be broken into two or more well-formed
relations.
59
Normalization Example
(StudentID) (StudentName,
DormName, DormCost)
However, if
(DormName) (DormCost)
Then DormCost should be placed into its own relation,
resulting in the relations:
(StudentID) (StudentName,
DormName)
(DormName) (DormCost)
60
Normalization Example
(AttorneyID, (ClientName,
ClientID) MeetingDate, Duration)
However, if
(ClientID) (ClientName)
Then ClientName should be placed into its own relation,
resulting in the relations:
(AttorneyID, (MeetingDate, Duration)
ClientID)
(ClientID) (ClientName)
61