02 Database Systems SetOperators MultitableQueries
02 Database Systems SetOperators MultitableQueries
Set operators
Multitable Queries
Seminar 1 recap
▪ Tables: let's say we have employees and we filled a table with their data.
2
Seminar 1 recap
The SELECT and the FROM statement
▪ Look into the whole table (query all columns and all rows): Select * from Employee
3
Seminar 1 recap
The SELECT and the FROM statement
▪ To query only specific columns: Select lastname, email from employee
4
Seminar 1 recap
SELECT and FROM statement - expressions
▪ Perform operation on columns: our task is to create a new full name column and select it along with the email column.
▪ For creating new columns we use Expressions. In this case, we need text concatenation.
▪ To give a name for our new columns, we use Column Aliases
▪ Select Firstname+' '+Lastname as FullName, Email from Employee
▪ If we need multipart column name, we use aliasing with brackets (or apostrophes):
▪ Select Firstname+' '+Lastname as [Full Name] , Email from Employee
5
Seminar 1 recap
SELECT and FROM statement - functions
▪ Some operations cannot be performed with expressions, in this case, we use functions.
▪ Let’s transform our LastName column to uppercase and query it along with the Email column.
EmployeeId LastName FirstName Title BirthDate Email …
▪ We need a function that performs uppercasing for us. Look into the functions handout, choose UPPER
function. It has one parameter, the string to be uppered. Column alias is needed to name the colum.
▪ Select Upper(lastname) as LastName, Email from Employee
6
Seminar 1 recap
SELECT and FROM statement - functions
▪ Let’s see another example: list the Lastname column and calculate the age of the employees in a new
column
EmployeeId LastName FirstName Age Title BirthDate …
▪ First take the system time (it is a function without parameter, see handout) then take the year from it. Then
subtract the year of birth of the employee to get the age.
▪ select Lastname, year(getdate())-year(birthdate) as Age from employee
7
Seminar 1 recap
SELECT and FROM statement - functions
▪ Functions can be combined with expressions if needed.
▪ Create the Full name column in this form: Adam SMITH and query it along with the Email column.
EmployeeId LastName FirstName Full name Title BirthDate Email …
8
Seminar 1 recap
SELECT and FROM statement – Conditional expressions
▪ Sometimes we need new columns based on logical condition. Let’s say that over sixty years the given employee is
pensioner. Add a new column along with Lastname column and indicate that a given employee is pensioner.
EmployeeId LastName FirstName Title IsPensioner BirthDate Email …
▪ We have to calculate the age of the given employee, then examine if it is greater than 60. Based on the result we
write Yes or No.
select Lastname,
Case
when year(getdate())-year(birthdate)>60 then 'Yes'
else 'No'
End as IsPensioner
from employee
9
Seminar 1 recap
SELECT and FROM statement – Conditional expressions
▪ Sometimes we need new columns based on a logical condition. Let’s say that over sixty years the given employee is
pensioner. In a new column indicate whether the given employee is a pensioner. Select only the lastname column and this
new column.
EmployeeId LastName FirstName Title IsPensioner BirthDate Email …
▪ We have to calculate the age of the given employee, then examine if it is greater than 60. Based on the
result we write Yes or No.
10
Seminar 1 Recap
Comparison operators
• <, >, <=, >=, =
• LIKE – if the given pattern matches to the given expression, returns true.
Format: Expression LIKE Pattern Any number of occurrences of any character is denoted by the % sign.
• BETWEEN – if the given numeric value between the given boundaries, returns true.
Format: Value BETWEEN x and y
• List all data of every employee.
• Indicate in a new column if an employee is born in the 1960’s
• Indicate in a new column if an employee is „Manager”.
EmployeeId LastName FirstName Title BirthDate IsManager IsBorn60s …
1 Adams Andrew General Manager 1962-02-18 Yes Yes …
… … … … … …
Select *, iif(Title like '%Manager', 'Yes', 'No'), iif(Year(Birthdate) between 1960 and 1969, 'Yes', 'No')
From Employee
11
Seminar 1 recap
SELECT statement and FROM statement – NULL value
▪ Sometimes not every field has value. The lack of value = NULL value. NULL value cannot be examined with =, we have to use the
IS operator. List the Lastname and the City in a new column, list the city again, and if it is not present, substitute with „no value”.
… … … …
▪ We can examine the value in a logical expression with IS operator, and if the value is null, write „No value”
instead of Null.
Select Firstname, City, IIF(City IS NULL, 'No value') from employee
▪ We get the same result with the ISNULL function, this is the usual form of null value substitution.
Select Firstname, City, ISNULL(City, 'No value') from employee
12
Seminar 1 recap
SELECT statement and FROM statement – NULL value
▪ You can substitute the NULL value with another column or expression also. List the last name and the Fax
number of the employees, and if the latter is not present, substitute with the phone column. Name the
column „New fax”
… … … … … … …
… … … … … … …
13
Seminar 1 recap
WHERE statement
▪ In Select statements, we list columns or expressions, these will be the columns of the resultset.
▪ In the Where statements we list logical conditions separated by logical operators, these determine which
rows are included in the result
▪ Select all Sales Support Agent data.
EmployeeId LastName FirstName Title BirthDate Email …
14
Seminar 1 recap
WHERE statement
▪ Logical operators: AND, OR, NOT. Precedence: NOT, AND, OR. To overload precedence, use parentheses.
▪ Select employees born before 1960 and the Title contains „Sales” or „Manager”
EmployeeId LastName FirstName Title BirthDate Email …
▪ Select * from Employee where Year(Birthdate)<1960 and (Title like '%sales%' or Title like '%manager%')
15
Seminar 1 recap
ORDER BY statement and TOP keyword
▪ Order by clause is used for ordering the output
▪ In this and only in this clause, column aliases can be used.
▪ You can reference an expression by its formula, its alias name, or its index.
▪ You can retrieve the first N records in order using the "top" keyword after the select keyword.
▪ Select the top 5 oldest employee from the employees table!
EmployeeId LastName FirstName Title BirthDate …
… … … … … …
▪ Calculate the age with an expression, use it in order by, apply the DESC keyword, then the TOP keyword
▪ Select TOP 5 * from employee order by YEAR(GETDATE())-YEAR(Birthdate) DESC
16
Quiz
17
Seminar 2
Set operators
Multi-table queries
Set operators
Now we can write queries and we are able to order the resultset. We can consider the resultset of a given
query as a set. The concept of a set is definitely familiar from your previous maths studies. It is a mathematical
concept that describes a collection or group of unique elements, where the order and repetition of elements do
not matter.
We can do set operations with SQL:
• Union resultsets and drop duplicates: UNION operator
• Union resultsets and keep duplicates: UNION ALL
• Find the common elements: INTERSECT operator
• Subtract the specified elements from the previous query: EXCEPT operator
Rules:
• Set operators remove duplicates by default (except the UNION ALL operator)
• The number and the order of the columns must be the same in the queries
• The data types must be the same or at least compatible
• You should name the columns in the first query
• You can order the resultset at the very end: after the last query
19
Set operators:
UNION
Get the employees with age over 60 OR under 40.
• Age over 60:
select * from employee where datediff(year, birthdate, getdate())>60
UNION
select * from employee where datediff(year, birthdate, getdate())<40
• Similar to:
select * from employee
where datediff(year, birthdate, getdate())>60 or datediff(year, birthdate,
getdate())<40
More useful example: Make a list that contains the name for all employees and all customers!
Select firstname, lastname from employee
UNION
Select firstname, lastname from customer
20
Set operators:
EXCEPT
Get the employees who are under 40 years old and whose position contains „agent”.
select * from employee where datediff(year, birthdate, getdate())<40
except
select * from employee where title not like '%agent%'
• Similar to:
select * from employee
where datediff(year, birthdate, getdate())<40 and title not like '%agent%'
More useful example: Make a list that contains the cities where we have employees but no customers!
Select city from employee
EXCEPT
Select city from customer
21
Set operators:
INTERSECT
Get the employees with age over 60 AND hired in 2003.
• Age over 60:
select * from employee where datediff(year, birthdate, getdate())>60
INTERSECT
select * from employee where year(hiredate)=2003
• Similar to:
select * from employee
where datediff(year, birthdate, getdate())>60 and year(hiredate)=2003
More useful example: Make a list that contains the common cities of employees and customers!
Select city from employee
INTERSECT
Select city from customer
22
UNION and UNION ALL
• When unioning them, the resultset contains 57 rows: the duplicates are dropped
• To avoid this, use the UNION ALL operator
• Intersect and Except remove duplicates, and under MSSQL there is no Intersect ALL or Except ALL
23
Other things to consider
• Name the columns in the first query (you can name columns in every query, but only the first query effects)
• Order the resultset of the set operations after the last query (you cannot do it elsewhere)
• Precedence:
• Operations within parentheses
• INTERSECT
• EXCEPT and UNION
• In case of equal precedence: left to right
24
Multi-table queries
Multi-table queries in SQL refer to the use of SQL statements to retrieve data from more than one table in a
relational database. This is a fundamental and powerful aspect of SQL, as it allows you to work with related
data stored in different tables and combine it to obtain meaningful results.
JOIN Operations: To perform multi-table queries, SQL uses JOIN operations. JOINs combine rows from two
or more tables based on a related column between them. The most common types of JOINs include INNER
JOIN, LEFT JOIN (or LEFT OUTER JOIN), RIGHT JOIN (or RIGHT OUTER JOIN), and FULL JOIN (or FULL
OUTER JOIN
Relationships: Multi-table queries are essential for working with relational databases, where data is
distributed across multiple tables linked by relationships. These relationships are typically established through
primary and foreign keys.
25
Join operators
Join operators are essential components of SQL queries in relational databases. They allow you to combine
data from multiple tables based on specified criteria or relationships. Join operations are crucial for retrieving
meaningful and interconnected information from a relational database.
Types of join:
INNER JOIN: Retrieves rows from both tables that satisfy the join condition.
LEFT JOIN (or LEFT OUTER JOIN): Retrieves all rows from the left table and matching rows from the right
table. If there are no matches in the right table, NULL values are included.
RIGHT JOIN (or RIGHT OUTER JOIN): Similar to LEFT JOIN but retrieves all rows from the right table.
FULL JOIN (or FULL OUTER JOIN): Retrieves all rows when there is a match in either the left or right table. If
there is no match, NULL values are included.
SELF JOIN: Joins a table to itself, which is useful for hierarchical data or when comparing rows within the
same table.
26
Cartesian product
From the Cartesian product we get all possible combinations, but it is usually meaningless (since we want
e.g. a list of tracks with genres), since a track normally defines its genre, and we need only that.
AlbumI
TrackId Name MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice GenreId Name
d
28
Inner join
We have 3503 track in the track table. Why only 3478 returned? NULL equals to nothing!
TrackId Name AlbumId MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice GenreId Name
30
Left join
To select the tracks with or without defined genre, you need to use left join.
Left join means in this case: Tracks with genre + tracks without genre
Select *
Tracks with or
From
without defined
track LEFT JOIN genres
genre on track.genreid=genre.genreid
31
Right join
What if we have genres without tracks, and we would like to return it with
no track defined? In our database, there is no track with „Musical” genre.
Select *
From
Track right join Tracks with defined
Genre on track.Genreid=Genre.genreid genres + genres
without tracks
32
Full join
Select *
From
Track Full join
Tracks with or without genre +
Genre on track.Genreid=Genre.genreid genres with or without tracks
3503+1 = 3504
33
The Chinook model
How can we know which columns to use for joining two tables?
The Track table stores track instances. It has a key defined
(TrackId).
The Genreid stores genre instances. It has a key defined
(GenreId).
The TrackId and the GenreId are so-called primary keys. A
primary key is a column or a combination of columns that
uniquely identifies records within a table.
The track table has also a GenreId column, it holds the key
value for the given track’s genre. This column is a so-called
foreign key. A foreign key is a column or a combination of
columns that references the primary key of another table.
These keys play a central role in the relational data model,
preserving data integrity and facilitating relationships
between tables. Primary keys uniquely identify records, while
foreign keys link data and relationships across tables.
Next time, we will learn about the relational data model,
which provides the theoretical framework for it.
34
The Chinook model
How can we know if left or right join is needed?
If a foreign key value column can hold NULL values we say the relationship is optional.
You can check it:
• In the Table Designer
• On the Entity-Relationship diagram
• Or write a simple query:
• Select * from Track where GenreId is NULL
35
The syntactical considerations of multi-table queries
Sometimes we need a table twice (or more). Let’s see the employee table! The ReportsTo column holds the
manager of the given employee, who is also an Employee.
EmployeeId LastName FirstName Title ReportsTo
1Adams Andrew General Manager NULL
2Edwards Nancy Sales Manager 1
Sales Support
3Peacock Jane Agent 2
Sales Support
4Park Margaret Agent 2
Sales Support
5Johnson Steve Agent 2
6Mitchell Michael IT Manager 1
7King Robert IT Staff 6
So if we want a list with Employee data and the immediate supervisor's name, we need the table twice. Once
as the list of employees and once as the list of supervisors. In this case, we must use table aliases.
For this query we use e1 for employee list, e2 for manager list.
Select e1.*, e2.firstname+' '+e2.lastname as [supervisor name]
from employee e1 join employee e2 on e1.ReportsTo=e2.EmployeeId
37
Self join (cont)
Not every employee has supervisor, we can use left join as well to get our general manager in the list.
38
The DISTINCT keyword
If you want to remove duplicates, you can use the DISTINCT keyword in the SELECT statement. While this
operation is not directly related to JOIN operations, they often lead to the need to remove duplicates, so
discuss it here with an example:
• List the unique values from the Title column in the Employee table!
Select DISTINCT Title from Employee
• Note: The DISTINCT keyword applies to all columns in the SELECT clause.
39
More than two tables
In the FROM clause any number of tables can be listed. The syntax to specify multiple table JOINs:
Select *
from table1 t1
join table2 t2 on t1.t2id=t2.id
left join table3 t3 on t1.t3id=t3.id
40
Example queries
• List the Track names with Album names. Name the columns for clarity.
• What type of Join is needed?
• List customers along with their support representative (if present). If there is no support
representative, the field value should be „<not specified>”.
• List the tracks along with album name, genre name and mediatype name. This list should contain
all tracks.
• List the track names sold in January. Remove duplicates!
• Which genres has no tracks? List them!
41
Individual assignment 1
42
Individual assignment 2
Let's create a list to check if our customers' registration and billing addresses match.
a) Query the unique identifier of customers, their last name and first name (in separate columns), country,
postal code, and address.
b) Join the query with the table containing invoice headers and retrieve the postal code, address, and country
listed on the invoice in three new columns.
c) Display only those rows where either the country, postal code, or address information provided by the
customer does not match the corresponding data on the invoice. The postal code column can be left
empty; take this into account in your solution.
d) Attach the table containing employees and include, in a separate column, the concatenated last name and
first name of the provided contact person (column name: Employee name), as well as their email address.
e) If there is no provided contact person, the customer should still appear in the results!
43
Individual assignment 3
Let's create a list of non-list price sales, including album, genre, and invoice item details.
a) Query the sales quantity, unit price, net amount, and gross amount (the gross amount should be calculated
with a 27% VAT) from the invoice items. Name the two calculated columns appropriately.
b) Join the relevant tables and query the Track title, Album title, and Genre name into new columns.
c) If the genre is not filled out, display "<unknown>" in the corresponding column.
d) Include only those rows where the sale was made at the list price, meaning the unit price of the invoice
item does not match the unit price of the track.
e) Sort the list by the artist's name, then by the album title!
44
Individual assignment 4
To prepare statistics, we would like to examine what genre of tracks each invoice item contains and in which
year they were issued.
a) Query the year of invoice issuance from the invoices table.
b) Join the styles table by interposing the invoice item and track tables. Query the net value of the item
(quantity * unit price) from the invoice item table, the track name from the track table, and the genre name
from the genre table. Name each column appropriately.
c) Include tracks in the query where the genre field is not filled out.
d) Exclude tracks where the length measured in minutes is less than 1 minute.
e) Sort the list by year and, within that, by the name of the genre.
45
Thank you for
your attention!