University of South Florida
Digital Commons @ University of
South Florida
FUNDAMENTALS OF INFORMATION The Modernization of Digital Information
TECHNOLOGY: Textbook – English Technology
1-1-2023
Chapter 14 Database Applications
Shambhavi Roy
Clinton Daniel
University of South Florida, [email protected]
Manish Agrawal
University of South Florida, [email protected]
Follow this and additional works at: https://digitalcommons.usf.edu/dit_tb_eng
Scholar Commons Citation
Roy, Shambhavi; Daniel, Clinton; and Agrawal, Manish, "Chapter 14 Database Applications" (2023).
FUNDAMENTALS OF INFORMATION TECHNOLOGY: Textbook – English. 13.
https://digitalcommons.usf.edu/dit_tb_eng/13
This Book Chapter is brought to you for free and open access by the The Modernization of Digital Information
Technology at Digital Commons @ University of South Florida. It has been accepted for inclusion in
FUNDAMENTALS OF INFORMATION TECHNOLOGY: Textbook – English by an authorized administrator of Digital
Commons @ University of South Florida. For more information, please contact [email protected].
Database Applications
CHAPTER CONTENTS
Overview 302
Evolution of Database Technologies 302
Why Study Databases 304
Relational Databases 305
Introduction to SQL 306
SQL Commands Types 307
Querying a Database Using DML Commands 307
Querying from Multiple Tables 308
Updating Data 309
Deleting Data 310
Advanced Querying Options 311
Sorting 311
Grouping 312
Calculated Fields 313
Creating and Managing Tables Using DDL Commands 314
Inserting Data 315
Importing Data 316
Transactions Using TCL Commands 317
Controlling Database Privileges Using DCL Commands 319
Conclusion 319
Chapter Terms and Definitions 320
Chapter Case: Dionne’s Grade Tracking Database 321
Chapter 14—Database Applications 301
Once the business data have been centralized and integrated, the value of the
database is greater than the sum of the preexisting parts
—Larry Ellison, co-founder of Oracle
Overview
Databases are structured collections of data that can be efficiently searched and manipulated by
computers.222 Databases are the underlying information store powering our modern technology-
driven world. These databases are like digital vaults where you can keep data safely, and authorized
users can read or update the data as needed. While you may not yet know what databases are, they
touch your life daily. Your student records, your score history in online video games, your checkout
history at the library are all stored in databases somewhere. In all likelihood, your personal information,
including your home address, your phone number, and your school grades are stored in multiple
databases.
Database applications typically refer to software programs that help users interact with the data
stored in databases. They fall into categories such as customer relationship management (CRM),
enterprise resource planning (ERP), learning management system (LMS), etc. For example, enterprise
resource planning applications help retailers update their stock information when you checkout
items at the store. Learning management systems help schools tracks the performance of all students
at the school. When you buy something online, inventory management applications update the stock
information on the site and create the shipping label for the items to be delivered to your home. We
have discussed these applications throughout this book.
In this chapter, we focus on databases and introduce the technologies and concepts used to
build databases. While there are several types of databases including object-oriented databases,
hierarchical databases, and relational databases, most modern applications use relational databases.
We, therefore, introduce relational database technology in this chapter.
Evolution of Database Technologies223
Commercial databases trace their origins to the Apollo moon program. The Saturn V rocket used
to land Neil Armstrong on the moon used more than 3 million parts.224 Rockwell International, the
company building the rocket, asked IBM if it could build a computer system to track the status of all
these parts. A team of about 25 engineers from IBM, Rockwell, and Caterpillar, led by IBM engineer
Vern Watts, created IMS (Information Management System), the world’s first commercial database in
1968. Since then, IMS has become the backbone of banking, hotel, and airline applications around
222 This definition is adapted from Oracle’s site at https://www.oracle.com/database/what-is-
database/ (accessed June 2023).
223 This section draws heavily on Sinclair Target’s blog post at https://twobithistory.org/2017/12/29/
codd-relational-model.html. All posts on his blog are relevant to DIT, check them out at https://
twobithistory.org/ (accessed June 2023).
224 An overview of the Saturn V is at https://airandspace.si.edu/exhibitions/space-race/online/
sec300/sec384.htm (accessed June 2023).
302 Chapter 14—Database Applications
the world. Any time you use your credit card or make an airline reservation, chances are that the
transaction involves IMS at some point.225
IMS is considered a hierarchical database. Hierarchical databases are very efficient, but also very
inflexible. Searches in hierarchical databases require traversing the data store.226 They are designed
to respond to specific queries and can be very slow in responding to other queries.227 Relational
databases improve upon this limitation.228
The relational model to store data was introduced by Edgar F. Codd of IBM in 1970.229 Codd used
concepts from set theory to organize data. He used the term “relation” to refer to a table in the
database based on set theory, and this reference to relations gives the technology its name.
Following the publication of his landmark paper in 1970, which introduced the relational model, Codd230
described how data storage could be optimized in the relational model (called normalization).231 He
also demonstrated that a small set of operations could extract any information from the database.232,
233
This language is now called Structured Query Language (SQL).234
225 This is from IBM’s page on IMS at https://www.ibm.com/ibm/history/ibm100/us/en/icons/
ibmims/ (accessed June 2023).
226 One of the leaders of hierarchical databases, Charles Bachman, provides an overview in
his 1973 Turing award lecture, “Programmer as Navigator,” at http://people.csail.mit.edu/
tdanford/6830papers/bachman-programmer-as-navigator.pdf. Wikipedia has a page on
Bachman at https://en.wikipedia.org/wiki/Charles_Bachman (accessed June 2023).
227 An overview of hierarchical databases can be seen at https://dataintegrationinfo.com/
hierarchical-vs-relational-database/ and https://en.wikipedia.org/wiki/Hierarchical_database_
model (accessed June 2023).
228 The Wikipedia page on relational databases provides a nice overview of the technology at
https://en.wikipedia.org/wiki/Relational_database (accessed June 2023).
229 The paper is considered a landmark paper in the commercialization of computing technologies.
You can download it from https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf (accessed
June 2023).
230 E. F. Codd, Wikipedia https://en.wikipedia.org/wiki/Edgar_F._Codd (accessed June 2023). Codd
has a Florida connection—he retired to Aventura, Florida, where he died in 2003.
231 E. F. Codd, Normalized database structure: A brief tutorial, available at https://dl.acm.org/doi/
pdf/10.1145/1734714.1734716 (accessed June 2023). Proceedings of the 1971 ACM SIGFIDET
(now SIGMOD) Workshop on Data Description, Access and Control, November 1971, Pages 1–17
232 E. F. Codd, A database sublanguage founded on the relational calculus, https://dl.acm.org/doi/
pdf/10.1145/1734714.1734718 (accessed June 2023). Proceedings of the 1971 ACM SIGFIDET
(now SIGMOD) Workshop on Data Description, Access and Control, November 1971, Pgs 35–68
233 E. F. Codd, “Relational Completeness of Data Base Sublanguages,” Computer Science Research
Report, 1972, available at http://www.geology.cz/personal/j/jan.sedlacek/codd2.pdf (accessed
June 2023).
234 For those interested in more detail on Codd’s relational model, his colleague C. J. Date has
published a critique of the model, “Codd’s First Relational Papers: A Critical Analysis,” available at
https://www.dcs.warwick.ac.uk/~hugh/TTM/CJD-on-EFC%27s-First-Two-Papers.pdf (accessed
June 2023).
Chapter 14—Database Applications 303
While the relational model was conceptually sound, the success of IMS prevented commercial
investments from creating a working technology based on the relational model. IBM had a small
team working on a project called System R to build a relational database, and Professor Michael
Stonebraker235 at UC Berkeley obtained funding from NSF as well as Army, Navy, and Air Force to start
a project called INGRES to create a database using the relational model.236, 237 The viability of INGRES
and wide distribution of its open-source code facilitated the commercialization of the relational
model. Today, the relational model is taught as the standard database model in all relevant academic
programs.
A few popular relational databases include:
· Oracle Database, offered by Oracle corporation, is probably the most popular relational
database system in the world.
· MySQL is a widely used open-source relational database management system. MySQL is also
owned by Oracle since its purchase of Sun Microsystems.
· Microsoft SQL Server is provided by Microsoft and common in Microsoft-based ecosystems.
· PostgreSQL is another popular open-source relational database management system that is
known for its advanced features and reliability. It is the successor to the INGRES project.
· IBM DB2 is provided by IBM and is widely used in enterprise applications.
· SQLite is a lightweight relational database management system that is often used in embedded
systems and mobile applications. For example, your browser uses SQLite to manage bookmarks.
Why Study Databases
Compared to the technologies covered in other chapters in this book, database technologies have
a unique feature—they are a core enterprise technology, there is an entire class of highly paid
professionals—database administrators—who maintain databases. We introduce the relational
database technology in this book because of its importance as well as because the basic concepts
of relational databases should be very interesting to high school students. If you find the contents in
235 A summary of Stonebraker’s work can be seen in his award citation for the John von Neumann
medal at https://par.nsf.gov/servlets/purl/10219488 (accessed June 2023).
236 This paragraph draws from the excellent story in “The rise of relational databases,” Chapter
6 in Funding a revolution: Government support for computing research,” available at http://
worrydream.com/refs/National%20Research%20Council%20-%20Funding%20a%20
Revolution.pdf, pgs 159–168 (accessed June 2023).
237 For a report on INGRES, please see Michael Stonebraker, Eugene Wong, Peter Kreps and Gerald
Held, “The Design and Implementation of INGRES Tandem Computers, Inc.,” available at https://
dl.acm.org/doi/pdf/10.1145/320473.320476. It goes into the details of INGRES’ working.
304 Chapter 14—Database Applications
this chapter interesting, you can learn more online at your own pace at sites such as Khan Academy,238
W3Schools,239 Coursera,240 EdX,241 etc.
Relational Databases
Relational databases are databases that organize data as tables linked to each other. Each row in a
table represents one item in the database. A single database can have any number of tables, with
each table representing all stored items of a type.
Each table (or relation) in a relational database represents an entity or a concept. Each table consists
of rows, also known as records or tuples, and columns, also known as attributes or fields. Each row in a
table represents a specific instance of the entity, and each column represents a specific characteristic
or data related to that entity.
In a relational database, tables are related to each other using keys. Following set theory, every row
in the database is unique. Keys uniquely identify data in each row and also allow data from different
tables to be joined together. Let’s go over these ideas with an example of students enrolled in courses.
Figure 231 shows a relational database model storing information about courses taken by students.
There is a table called Students, which stores information about students such as names (other
information could also be added). A separate table called Courses has information about courses
such as the title of the course. Finally, a table unique to the relational database model, called
StudentCourses, identifies all the courses taken by each student.
StudentCourses
Students Courses
FIGURE 231 — Relational databases allow users to query multiple tables of data.
The tables for students and courses are intuitive, but the way the tables are linked to each other to
list all the courses taken by all the students gives the relational database model its power. First note
that the Students and Courses tables have a column labeled Id, which assigns a unique label (called
identifier, or Id for short) to each row. In the StudentCourses table, both students and courses are
238 “Intro to SQL: Querying and Managing Data,” https://www.khanacademy.org/computing/
computer-programming/sql (accessed June 2023).
239 “SQL Tutorial,” https://www.w3schools.com/sql/ (accessed June 2023).
240 “SQL for Data Science,” https://www.coursera.org/learn/sql-for-data-science (accessed June
2023).
241 “Databases: Relational Databases and SQL,” https://www.edx.org/course/databases-5-sql
(accessed June 2023).
Chapter 14—Database Applications 305
represented by the respective unique identifiers (Ids). We can see from the StudentCourses table that
only Maria (studentId 2) has taken History (courseId 5), but Sam, Maria, and Carter have all taken DIT
(course Id 2).
In the relational model, a column (or combination of columns) that is guaranteed to be unique in a
table can serve as the primary key for the table. For example, your license number is a primary key
that uniquely identifies you in the Department of Motor Vehicles (DMV)’s database. Your student Id
is a primary key that uniquely identifies you in your school’s database. Usually, the unique identifiers
serve as the primary key column(s) of tables. For example, in the Students table, the “Id” column is the
primary key of the Students table. Similarly, even in the Courses table, the “Id” column is the primary
key. Typically, the database software generates a unique value for the primary key in a row.
When a primary key is defined on a table, the relational database management system ensures that
no two rows in the table have the same primary key values. In addition, the primary key column(s)
cannot have null (empty) values since a null value cannot uniquely identify a row.
Organizing data as shown in Figure 231 allows for efficient storage and retrieval of data using various
relational operations and Structured Query Language (SQL) commands. For example, if you regularly
buy goods from Amazon, they can maintain a record of all your purchases and returns made in the
recent past by looking you up using your primary key and pulling all the purchase records associated
with your primary key. Each purchase is a database record, and so is every return. Each record of
purchase has many details, including the item name, the name of the seller, the time of transaction, the
amount, and the location of the transaction. Once you have a way to input your data into a database,
you can track, update, delete, and report on that data easily. In Amazon’s database, a collection of all
the purchases and returns of all their customers comprises the transactions table. Can you imagine
how huge the table must be, how many bytes of data it must have?
In the relational database model of Figure 231, you will only need to interact with the Students table if
you want to get the basic information related to students. Similarly, if you just want to get information
about all the courses offered, you will only need to query the Courses table. However, to get the
course information of every student, you will need to use the StudentCourses table, which joins all
the Students and Courses tables by Ids. Retrieving records by Ids is much faster than that by names,
which makes well-designed relational databases extremely efficient.
In relational databases, the process of keeping data spread across multiple tables is called normalization.
Normalization ensures that no single table becomes too unmanageable in size. Not surprisingly, the
relational database model is widely used across industries for its flexibility, scalability, and robustness.
Introduction to SQL
Storing data in a database is just the first step. Once you have your data in a database, you need to
interact with the database management system (the software that manages the database) to see or
change the data. Structured Query Language (SQL) is a programming language used to interact with
relational databases.
SQL commands can be used to create new databases and tables. There are also SQL commands to
insert, update, and delete data in existing tables. SQL can also be used to search for and retrieve
specific data using queries, which are statements that specify the criteria for selecting data. Most
306 Chapter 14—Database Applications
college programs in computing have a course on databases, which goes into all the details of SQL. In
this chapter, we only provide an overview of SQL.242
SQL Commands Types
There are four main types of SQL commands you should know about:
Data Manipulation Language (DML) commands are used to manipulate the data stored in databases.
The most commonly used DML commands are SELECT, INSERT, UPDATE, and DELETE. SELECT is used
to retrieve data from databases; INSERT is used to add new data; UPDATE is used to modify existing
data; and DELETE is used to remove data from databases.
Data Definition Language (DDL) commands are used to define the structure (schema) of databases.
When you are creating a new database, you will need to create tables, specify the relationship between
tables, and add constraints on the data to make sure the integrity of the data is maintained. CREATE,
ALTER, and DROP are the most common DDL commands you will need. CREATE is used to create new
tables and other database objects; ALTER is used to modify the structure of existing tables; and DROP
is used to delete tables and other database objects.
Transaction Control Language (TCL) commands serve as gatekeepers of changes. COMMIT, ROLLBACK,
and SAVEPOINT are the most common TCL commands. If you COMMIT a transaction, you make the
changes permanent; ROLLBACK undoes the changes you made earlier; and SAVEPOINT creates a
bookmark in a transaction to which you can rollback later.
Data Control Language (DCL) commands are used to control who can access your database. GRANT
and REVOKE are the most common commands. GRANT is used to grant permission to users to access
and modify the database; and REVOKE is used to revoke permissions.
Querying a Database Using DML Commands
If you are a database developer, you will use DML commands very frequently to change previously
added data, delete existing records, or insert new data. Luckily, SQL is easy to learn because it has
English-like commands to select, insert, update, and delete. SELECT allows you to fetch data; INSERT
is used to add data; UPDATE is used to change previously added data; and DELETE is used to remove
records.
Querying data—The first operation you will most likely want to do is view data stored in your database.
Here’s an example of a basic SELECT statement:
SELECT *
FROM Students
WHERE Id= 56745;
242 You can practice SQL at several sites online. A popular site for code practice is jdoodle, https://
www.jdoodle.com/execute-sql-online/ (accessed June 2023).
Chapter 14—Database Applications 307
This statement tells the RDBS to fetch all the details (*) from the Students table about the student
whose Id is 56745. If you wanted to fetch just a few specific columns about the student, you could
alter the query to specify the names of columns as:
SELECT StudentName, StudentAge, StudentAddress
FROM Students
WHERE Id= 56745;
The WHERE command is powerful, offering many possibilities, and can be used to select exactly what
you want to fetch.
Suppose you want to get all the students who are either from California or Florida, you could change
the WHERE clause in the following way:
SELECT *
FROM Students
WHERE State = ‘California’ OR State = ‘Florida’;
This statement will only return those students who have the values California or Florida in the column
“State.” Just like the equals (=) and the OR operations, you can also employ > (greater than), < (less
than), BETWEEN, LIKE, and IN.
SELECT *
FROM Students
WHERE State = ‘California’ AND GPA >= 3.5;
This command only returns the rows where the “state” column is equal to “California” and the “GPA”
column is greater than or equal to “3.5.”
You can also use both “AND” and “OR” operators in the same query and even group conditions using
parentheses. For example:
SELECT *
FROM Students
WHERE (State = ‘California’ AND GPA >= 3.5)
OR (State = ‘Florida’ AND GPA >=3.5);
Querying from Multiple Tables
In relational databases, data is stored in many tables to eliminate data duplication. However, you will
frequently need to join tables to get the data you want. In SQL, JOIN is a special command that links
tables based on shared columns.
308 Chapter 14—Database Applications
Let’s consider two tables “customers” and “orders”
(Figure 232) to see how JOIN works:
Schema refers to the way a database is structured.
To relate the orders to the customers who placed
those orders, the Orders table has the Id of the
customer as the customer_id column. This allows
us to link the two tables (Customers and Orders).
For example, to get the orders placed by customers,
we can link the two tables, as shown below, and get
all orders from all customers:
SELECT customers.first_name, customers.
last_name, orders.order_date, orders.total
FROM customers, orders
WHERE customers.id = orders.customer_id;
With this query, you are fetching the first name and
last names of your customers from the Customers FIGURE 232 — Data is structured in tables with
table and the order_date and total from your Orders rows and columns, similar to spreadsheet files.
table. We can join the two tables because the id
column of the Customer table and the customer_id
column of the Orders table contain the same data:
the id of the customers. The relational model gives
us this convenient method to retrieve the data we
need from databases. SQL queries read like plain
English sentences. (Figure 233.)
Updating Data FIGURE 233 — Separate tables can be joined by
using matching columns or “keys.”
Databases need to be updated with the latest
information all the time. For example, at the DMV,
drivers will come in to register new cars, update their license plates, or change their current residential
addresses. With the help of SQL, you can identify the record in the database that needs to be updated
and set the new values.
The UPDATE command allows you to change the values of one or more columns in one or more rows
of a table. The basic syntax of the SQL UPDATE command is as follows:
UPDATE table_name
SET column1 = new_value1, column2 = new_value2, …
WHERE some_condition;
In this syntax:
· table_name is the name of the table you want to update.
· column1, column2, etc. are the names of the columns you want to update.
Chapter 14—Database Applications 309
· new_value1, new_value2, etc. are the new values you want to set for the corresponding
columns.
· some_condition is a condition that specifies the rows to update. Only the rows that satisfy the
condition will be updated.
Here’s an example that demonstrates how to use the UPDATE command to change the email address
of a customer:
UPDATE customers
SET email = ‘
[email protected]’
WHERE customer_id = 123;
This command will update the email address of the customer with ID 123 to “newemail@example.
com.”
You must be very careful when using the UPDATE command and not change data you don’t want to
change. Always doublecheck your WHERE clause to make sure you are correctly specifying the rows
to update. For example, the command below will update the email addresses of ALL the customers
in the database to “
[email protected].” If your database has information about a million users,
the email addresses of all million users will be updated instantly. Unless all your customers have the
same email address, you don’t ever want to run a command like the one below:
Update customers
SET email =’
[email protected]’;
Deleting Data
If a student drops out of a class, you’ll need to remove the student from the Class table. If a customer
requests you to remove them from your monthly mailers, you’ll need to delete the customer from
your Subscriptions table.
The DELETE statement deletes one or more rows from a table based on the selection criteria in the
WHERE clause. The basic syntax of the DELETE statement is as follows:
DELETE FROM table_name
WHERE condition;
In this syntax:
· table_name is the name of the table from which you want to delete data;
· condition specifies which rows to delete and only the rows that satisfy the condition will be
deleted.
Here’s an example that shows how to use the DELETE statement to remove a particular customer
record:
DELETE FROM customers
WHERE customer_id = 123;
310 Chapter 14—Database Applications
This command will delete the customer record with Id 123 from the “customers” table.
It’s important to be careful when using the DELETE statement to avoid removing data you don’t
want to delete. Always make sure that you have a backup of your data before performing deletions.
Additionally, as mentioned earlier for the UPDATE command, double-check your WHERE clause to
ensure that you are deleting only the data you want to remove.
If you forget to include a WHERE clause, as in the example below, you may end up deleting ALL the
rows in the customers table:
Delete FROM customers;
Defensive Updates and Deletes
Before you run a DELETE command, you may want to test the WHERE clause in
a SELECT statement to confirm you are deleting the right set.
For example, to DELETE all customers who live in a particular zip code, run this
SELECT command first.
Select * from CUSTOMERS where zipcode=’95070’;
If the output has all the rows you want to delete, then use the same WHERE
clause in your delete command to delete records.
Delete FROM customers WHERE zipcode=’95070’;
Advanced Querying Options
Databases can quickly get large and may require special reporting, sorting, and grouping to make
sense of the stored data. If you have a database to keep track of exam results for your school, the
“Students” table may have 1000s of rows. If you assume each student takes 50 courses over their high
school career, the “StudentCourses” table will be 50 times as large as the “Students” table. Now if each
course has 100 grading events, you can imagine how large the “Grading” table will be.
When you report on this data, you need to know how to sort, group, and limit the amount of output
to quickly get to the information you need. For example, your teacher may be interested in identifying
the two students who have not submitted a deliverable, or the four or five students who have excelled
on tests. It is not helpful in these cases to simply print out the 5 million rows for all grading events at
the school. It is much more useful to use the capabilities of the database to list just the four or five
rows with the needed information.
Sorting
The ORDER BY command sorts the result set in ascending or descending order based on one or more
columns in the SELECT statement. The basic syntax of the ORDER BY clause is as follows:
SELECT column1, column2, …
FROM table_name
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC];
Chapter 14—Database Applications 311
You can specify the column or columns you want
to sort by, followed by either “ASC” for ascending
order or “DESC” for descending order. If you want,
you could sort by multiple columns, separated by
commas.
Let’s consider a table of employees with the following
data: id, name, age, and salary (Figure 234).
To sort this table by salary in descending order, you
can use the following SQL query:
SELECT * FROM employees FIGURE 234 — The ORDER BY command allows
users to efficiently sort data in a table.
ORDER BY salary DESC;
This will return the following output (Figure 235):
To sort the same table by age and then by salary,
try the following SQL query:
SELECT * FROM employees
ORDER BY age ASC, salary DESC;
This will return the following output (Figure 236).
FIGURE 235 — Users can choose to sort a table by
Grouping column name in ascending or descending order.
This table is sorted by salary in ascending order.
The GROUP BY command helps you get a summary
of data. If you have a database of students, you may
want to group students by their courses to see how
well the students are doing in different courses. Or,
you may want to group students by their classes
(freshman, sophomore, junior, senior) to see the
average grade across different cohorts.
The GROUP BY command in SQL is used to group
rows that have the same values in one or more
columns. It is typically used with aggregate
functions like AVERAGE, MAX, MIN, and COUNT. The
result set will contain only one row for every value FIGURE 236 — ORDER BY commands are not
limited to one condition. A user can combine
of the column in the GROUP BY clause. For example,
conditions to further refine their request.
if you GROUP BY the department field in a table
that has employee_name, salary, and department,
then the result set will have one row for every department.
As another example, if you want to count the number of employees and the average salary of
employees in every department, you can use GROUP BY, COUNT, and AVG commands.
312 Chapter 14—Database Applications
SELECT department, COUNT(*) as num_employees, AVG(salary) as avg_salary
FROM employees
GROUP BY department;
In this example, the query counts the number of
employees in each department (COUNT(*)) and
gives you the average salary of those employees.
In the original table, we did not have the count of
employees or the average salary of employees. So,
in the output, we have two new fields that did not
exist before. You can use the “as” operator to give
these fields meaningful names in the output as FIGURE 237 — A variety of commands, including
GROUP BY, allow users to count sets in data.
shown in Figure 237.
The general syntax for using the GROUP BY command, is as follows:
SELECT column1, column2, …, aggregate_function(column)
FROM table_name
GROUP BY column1, column2, …;
In the above syntax, column1 and column2 are the columns you wish to GROUP BY. The aggregate_
function(column) is the function that you want to apply to the grouped data, such as COUNT, AVG,
SUM, MAX, MIN, etc.
You can use multiple columns in the GROUP BY clause if you want to group data by more than one
column.
As you can see, the GROUP BY command is a powerful tool in SQL that allows you to summarize data
based on specific criteria, making it easy to extract useful information from large datasets.
* in SQL
The symbol “*” as used in the GROUP BY example above indicates “all.” In the
example, it tells the database to count all rows in each group.243
Calculated Fields
There are times when you don’t want to simply present the data in the database but would like to
apply some calculations before printing the output. For example, if you are looking at a database of
students with their scores, you may also like to calculate and print their letter grades (A, B, C). Or, if
you have a database of employees, you may want to calculate their bonuses as a percentage of their
salaries. SQL lets you do these calculations easily without impacting the underlying data.
A calculated field is a column in a query result set that is created by applying an expression or formula
to one or more columns in a table. The expression can be a combination of arithmetic operators,
243 You can check out this Stackoverflow post for some additional information, https://stackoverflow.
com/questions/38662318/what-does-mean-in-sql (accessed June 2023).
Chapter 14—Database Applications 313
functions, and constants. This can be especially helpful when you need to perform complex
calculations on columns in the table or create custom metrics that are not available in the original
dataset. Consider this query:
SELECT first_name, last_name, salary, salary * 0.1 AS bonus
FROM employees;
In this example, the query calculates a 10% bonus
for all employees.
The result set of this query has four columns: first_
name, last_name, salary, and bonus (Figure 238).
The bonus column is a calculated field that contains
the bonus of employees based on their salaries. As
the field does not exist in the original table, you can FIGURE 238 — SQL queries can also preform
calculations and output new fields, which is crucial
give it a meaningful name with the help of the “AS” in data analysis.
operator.
Creating and Managing Tables Using DDL Commands
To create a table in SQL, you can use the CREATE statement.
CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
…
);
In the above DDL command, the datatype specifies the type of data that can be stored in the column.
Common datatypes are VARCHAR for text, INT for integers, DECIMAL for decimal numbers, and DATE
for dates. An interesting feature of relational databases is that most data in databases is numbers or
strings.
Here’s an example of a CREATE TABLE statement that creates a table Customers with columns for three
items of information (name, email and rent) and an id:
CREATE TABLE boarders (
id INT PRIMARY KEY,
name VARCHAR(50),
email VARCHAR(100),
rent INT
);
314 Chapter 14—Database Applications
If you run this SQL command, you will create a table called boarders with four columns: id, name,
email, and rent. The id column is an integer and is set as the primary key of the table.
There are occasions when you no longer need a table, either because you want to create a new table
or because you have the same data in another table. You can use the DROP command to permanently
delete tables.
DROP TABLE table_name;
For example, to delete the boarders table, run the following DDL command:
DROP TABLE boarders;
Sometimes you may need to change a table to add new columns, change the column name, or change
the data type of a column. You can use the ALTER TABLE command to add, remove, and rename
columns. You can even change the type or the size of data stored in columns. Here are examples for
each of these commands:
ALTER TABLE table_name ADD COLUMN column_name data_type;
__________
ALTER TABLE table_name MODIFY COLUMN column_name new_data_type;
__________
ALTER TABLE table_name
RENAME COLUMN old_column_name TO new_column_name;
__________
ALTER TABLE table_name DROP COLUMN column_name;
The DROP COLUMN command removes an existing column from the table.
Storing Images, Documents, and Other Information in Relational Databases
Relational databases typically store simple data types such as numbers and
strings. However, we often need to store information such as images and
documents. How can you store such information in relational databases?
Most modern relational databases have a datatype called blob (binary large
object) that can be used to store images, documents, videos, and any other
information. However, a better alternative is to store these images, documents,
videos, etc., as files in the file system and just save the name and location of the
file in the database. This keeps the database compact, speeding up searches.
Inserting Data
To add data to tables in your database, you can use the INSERT INTO command.
The INSERT command expects you to know the column names and the type of data the columns can
Chapter 14—Database Applications 315
hold; therefore, you will need to know the structure of the table before you can insert data into it. To
check the structure of a table, use the DESCRIBE command.
DESCRIBE Users;
When you run the above command, you will see the structure of Users, a result set that lists all the
columns in Users, along with their data types and any constraints applied to them.
As you can see in Figure 239, the Users table has
three columns: id, name, and email. The id column
is an integer with a maximum length of 11 digits
and is marked as the primary key. The type of
both the name and email columns is varchar with
a maximum length of 50 characters. They are also
marked as nullable, which means that they can FIGURE 239 — Utilizing the DESCRIBE command is
contain Null values. In a relational database, a Null useful when determining data structure.
value means that the data is unknown or missing.
While this table looks okay in theory, a record that has just an id, but no name or email won’t make
sense. So, ideally, you should change the table to make sure either the name or the email contains a
value—cannot be Null.
Once you know what a table expects, you can use the INSERT INTO statement to begin adding data.
Here’s an example.
INSERT INTO Users (id, name, email) VALUES (1, ‘Cecilia Flores’, ‘
[email protected]’);
INSERT INTO Users (id, name, email) VALUES (2, ‘Ratna Devi’, ‘
[email protected]’);
INSERT INTO Variants
The examples shown are the typical ways the INSERT command is used. We
listed three column names next to the table name (Users (id, name, email)) and
gave the values to be inserted for each of these columns. There is another way
to use the INSERT command by skipping the column names. However, you can
only skip the column names if the data being passed has the same columns in
the same order as the existing columns of the table. For this reason, the usage
shown in the example is more typical.
Importing Data
There are times you already have data that you would like to insert into a table in a file. For example,
you may want to move the data in an Excel file into a database. The general concept to import data
is similar across all relational database systems and uses some common methods including the
following:
· We’ve already used the INSERT INTO command to add data to a table in a database. You can
also use the LOAD DATA INFILE command to insert data from a file into a database table.
Ideally, the file should be a text file with well formatted input data, so you can map the data to
the columns in your target table.
316 Chapter 14—Database Applications
· Many database management systems provide command-line utilities to import data from
different sources. For example, MySQL provides the mysqlimport utility that can be used to
import data from CSV or TSV files.
· Some database management systems provide GUI tools that allow you to import data. For
example, Microsoft SQL Server Management Studio provides the Import Wizard that can be
used to import data from different file formats such as Excel, CSV, and Access.
· You can write a script (software program) in a programming language such as Python or Java
to read data from a file and insert it into a database table using the INSERT INTO command.
In general, here are the steps developers follow to import data into an SQL database:
· Connect to the database using a database client or a programming language.
· Create a table or select an existing table to import data into.
· Prepare the data by cleaning it up and ensuring it is in the correct format.
· Use SQL commands, command-line utilities, or a programming language to import the data
into the table.
· Verify that the imported data is accurate by querying the table.
Transactions Using TCL Commands
As stores of information, databases have stringent requirements for correctness. Therefore, if a
database activity fails for any reason, you typically want to restore the data to its original state. Let’s
demonstrate this by example.
Suppose you manage the database of Expedia.com and need to transfer a few passengers to a new
flight because the original flight is inconvenient. Since the passengers want to be transferred only if
the new flight has seats available, you want to make sure that both the transactions (booking seats on
the new flight and cancelling seats on the original flight) are successful. If you remove the passengers
from their original flight, you must be able to allocate seats for them in the new flight. If your operation
completes the first step but fails the second, you will have some very angry passengers with no flight.
Without the assurance that transactions complete as a set or do not occur at all, databases would not
be very useful.
Banking transactions also need to fail or succeed as a set. If you remove money from one account
but fail to add it to the target account (say if the network connection fails), then you end up in an
inconsistent state as far as the database is concerned. The unfortunate account owners may wonder
what happened to their money, and the bank may receive some frantic calls.
To avoid these problems, databases support the concept of a transaction. You can make sure that
database operations in a transaction are executed as a single unit of work or a single transaction, or
not at all. The Transaction Control Language (TCL) commands—BEGIN TRANSACTION, SAVEPOINT,
ROLLBACK, and COMMIT are useful in ensuring a set of transactions fail or succeed as a set.
The BEGIN TRANSACTION statement starts a transaction and the COMMIT statement commits or
saves the transaction to the database permanently. If any part of the transaction fails, the ROLLBACK
statement is used to undo the changes made by the transaction and to return the database to its
original state. Consider the example:
Chapter 14—Database Applications 317
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 250 WHERE account_id =786543;
UPDATE accounts SET balance = balance + 250 WHERE account_id = 90643;
COMMIT;
In this example, the transaction has two UPDATE statements. One subtracts $250 from account_id
786543 and the other adds $250 to a different account. If either of the UPDATE statements fails, a
ROLLBACK statement would undo the changes made by the transaction to return the database to its
original state. Anytime you are unsure of the changes you are about to do; you may want to create a
Savepoint to be able to Rollback the database changes to the Savepoint. Consider the following set
of transactions:
SAVEPOINT Mark1;
UPDATE accounts SET balance = balance - 250 WHERE account_id =786543;
SAVEPOINT Mark2;
UPDATE accounts SET balance = balance + 250 WHERE account_id = 90643;
ROLLBACK TO Mark2;
At the end of the set of above transactions, the database will have just one change (the first update)
since we have rolled back the second change.
Transactions implement a few concepts unique to databases: atomicity, consistency, isolation, and
durability (these are popular in the industry as the ACID properties of transactions).
· Atomicity: a transaction is atomic, meaning that it is executed as a single unit of work. Either
all the operations in the transaction are completed successfully, or none of them are. If any
part of the transaction fails, the entire transaction is rolled back, and the database is left in
its original state. In our example, when you transfer money between accounts, both activities
(removing from one account and depositing in another) complete as a single transaction.
· Consistency: a transaction must keep the database in a valid state both before and after
the transaction. Any constraints or rules that apply to the data must not be violated by the
transaction. In our example, the total amount of money in accounts must be the same before
and after the transfer of money between accounts.
· Isolation: the data accessed by one transaction must be isolated and not affect other
transactions running in parallel. A transfer of money from one account to another must not
impact other banking activity between other users and their accounts.
· Durability: once a transaction is committed or saved to the database, the changes are
permanent and must survive any subsequent failures or crashes. Even if the database crashes
once a transaction has completed, the data must be recoverable.
318 Chapter 14—Database Applications
Controlling Database Privileges Using DCL Commands
Database administrators manage databases using DCL commands. These commands help database
administrators create roles and users, grant privileges to roles, assign roles to users, and alter and
revoke privileges. Granting specific privileges prevents accidental errors. Removing users when they
leave or change organizations prevent unauthorized access.
The CREATE USER command creates a new user account in the database.
Example: CREATE USER jose_robelo IDENTIFIED BY ‘password’;
The ALTER USER command modifies the properties or privileges of an existing user.
Example: ALTER USER jose_robelo PASSWORD ‘new_password’;
THE CREATE ROLE command is used to create a new role. Once you have a role, you can assign
privileges to the role. You may have roles like developer, marketing, sales, and admin, each with a
specific set of privileges. When Jose Robelo joins the organization, granting the right role to him will
make sure he has all the necessary privileges.
CREATE ROLE marketing;
Once you have a role, you can grant privileges to the role; and then assign the roles to users.
GRANT select, update ON customers TO marketing;
GRANT marketing TO jose_robelo;
Later, if Jose Robelo leaves the marketing organization, you can revoke all his privileges with the
following command:
REVOKE marketing FROM jose_robelo;
Conclusion
There is no modern organization or government that doesn’t own and rely on its huge databases of
information about their people, industries, education, banking systems, and hospitals, among others.
Modern databases are typically built using the relational model. A simple English-like language allows
you to work with the data in databases.
Although relational databases are popular and widely used across industries and governments, newer
technologies such as NoSQL databases and graph databases are also useful in certain scenarios. Every
database has its own unique strengths and weaknesses. So, you should learn about them before you
decide to pick one to store your data. Most computing programs in colleges discuss the development
and maintenance of databases in more detail.
Chapter 14—Database Applications 319
Chapter Terms and Definitions
Data Control Language (DCL): SQL commands Key: Uniquely identified data in each row that
that serve as gatekeepers to changes in the data can be used to join different tables together
within the database; examples include COMMIT,
ROLLBACK, and SAVEPOINT Query: SQL written to search for and retrieve
specific data from a database based specific
Data Definition Language (DDL): SQL criteria
commands used to define the structure of
databases; examples include CREATE, ALTER, and Relational Database: A database that organizes
DROP data as tables linked to each other
Data Manipulation Language (DML): SQL Structured Query Language (SQL): A
commands used to manipulate data stored in programming language designed to execute
databases; examples include SELECT, INSERT, a small set of operations that could extract any
UPDATE, and DELETE information from databases
Database: A structured collection of data that Table: An entity within a relational database that
can be efficiently search and manipulated by consists of rows and columns
computers
Transaction Control Language (TCL): SQL
Database Management System: The software commands that serve as gatekeepers of changes
that manages the database to the database. Examples include: COMMIT,
ROLLBACK, and SAVEPOINT.
320 Chapter 14—Database Applications
Chapter Case
Dionne’s Grade Tracking Database
Dionne is planning to attend the University of South Florida (USF) when he graduates
from High School. USF has a Grade Point Average (GPA) requirement for admissions
and Dionne wants to make sure he keeps a close eye on his grades. Now that he
knows a lot more about the power of databases, he wants to design a database that
will help him keep track of his grades. But he’s not sure which database to choose
and what features the database offers. He’s also not sure if the database is free or if
he will have to pay to use it. He doesn’t have a lot of money, so a cheap price or free
is the best. Dionne also isn’t sure if he should pick a database that he will have to
install on his laptop or if he can just login to a website online and use it. There are
so many questions that Dionne needs to answer before he can design his database.
Question 1: Use the Internet to research the World Wide Web for at least one
database management system that Dionne could use to design his
database. If you have trouble finding any database management
systems, review the “Evolution of Database Technologies” section in
this chapter where you will see a list of popular relational database
management systems. List the database management systems you
have found for Dionne and describe the following: What are some
key features of the database management system? How much does
it cost?
Question 2: Describe in detail what Dionne will have to do to use the database
management system. In other words, will he have to install it on his
laptop, login to a website, or use some other method?
Chapter 14—Database Applications 321
322 Chapter 14—Database Applications