Data Analysis Made Easy


II | P a g e Ezekiel Aleke

TABLE OF CONTENTS
SQL
CHAPTER ONE
Terms in SQL
CHAPTER TWO
Basic Select Statements
CHAPTER THREE
Filtering Rows
CHAPTER FOUR
Sorting Query Results
CHAPTER FIVE
Joins
CHAPTER SIX
Data Aggregation
CHAPTER SEVEN
Combining and Modifying Data
CHAPTER EIGHT
Subqueries
CHAPTER NINE
String Manipulation
CHAPTER TEN
Numeric Manipulation
CHAPTER ELEVEN
Date and Time Manipulation
CHAPTER TWELVE
If and Switch Statement
CHAPTER THIRTEEN
Combining Queries
CHAPTER FOURTEEN
Inserting, Updating and Deleting Data


CHAPTER FIFTEEN
Using Views
CHAPTER SIXTEEN
Managing Databases
CHAPTER SEVENTEEN
Normalization
EXCEL
CHAPTER ONE
Excel Terminologies
CHAPTER TWO
Manage Data in Cells and Ranges
CHAPTER THREE
If, Ifs and Switch in Excel
CHAPTER FOUR
Excel Functions
CHAPTER FIVE
Create Charts and Graphics
CHAPTER SIX
Using Sparklines and Data Bars
CHAPTER SEVEN
Conditional Formatting and Data Validation
CHAPTER EIGHT
Create Pivot Tables and Pivotcharts
CHAPTER NINE
Working with Excel Tables
CHAPTER TEN
What-If Analysis
POWER BI
CHAPTER ONE
Power BI Key Terms
CHAPTER TWO
Dax Aggregation Functions
CHAPTER THREE
Date Functions
CHAPTER FOUR
Filter Functions
CHAPTER FIVE
If and Switch Statement
CHAPTER SIX
Power Query
CHAPTER SEVEN
Data Modelling in Power Bi
PYTHON
CHAPTER ONE
Introduction
CHAPTER TWO
Variables
CHAPTER THREE
Data Types
CHAPTER FOUR
List Comprehension
CHAPTER FIVE
Operators in Python
CHAPTER SIX
Control Structures
CHAPTER SEVEN
Transfer Statement
CHAPTER EIGHT
Functions and Classes
CHAPTER NINE
Regular Expression
CHAPTER TEN
Numpy Basics: Arrays and Vectorized Computation
CHAPTER ELEVEN
Data Cleaning and Preparation
CHAPTER TWELVE
Data Wrangling: Joining Dataframe
CHAPTER THIRTEEN
Plotting and Visualization using Pandas
CHAPTER FOURTEEN
Plotting and Visualization using Matplotlib
CHAPTER FIFTEEN
Machine Learning Algorithm (Scikit-Learn)
CHAPTER SEVENTEEN
Web Scraping with Python

8 WEBSITES TO GET REMOTE JOBS


SQL
CHAPTER ONE

TERMINOLOGIES IN SQL

SQL (Structured Query Language) is a standard programming language for
managing and manipulating relational databases.

Here are some common SQL terms and their meanings:

1. Database: A collection of related data that is organized in a specific way.

2. Table: A set of related data organized in rows and columns.

3. Column: A vertical division of a table that represents a specific type of
data.

4. Row: A horizontal division of a table that represents a single record or
instance of data.

5. Primary Key: A unique identifier for each row in a table.

6. Foreign Key: A column that references the primary key of another table,
used to establish relationships between tables.

7. Index: A data structure used to speed up searches in a table.

8. Query: A request for data from one or more tables that meets specified
criteria.

9. Join: A way to combine data from two or more tables based on a related
column.

10. Aggregate Function: A function that operates on a set of values and returns
a single value, such as SUM, COUNT, and AVG.

11. Subquery: A query nested within another query, used to retrieve data based
on specific criteria.

12. View: A virtual table that does not exist physically but is created based on
the result of a query.

13. Alias: A temporary name assigned to a table or column in a query to make
it easier to read or to differentiate it from other tables or columns in the query.

The syntax for assigning an alias to a table or column in a query is as follows:

• For a table: SELECT * FROM table_name AS alias_name;

• For a column: SELECT column_name AS alias_name FROM table_name;

Here, "AS" is optional in most database systems and can be omitted. The alias
name can be any valid identifier that follows the rules for naming tables or
columns in SQL, such as not starting with a number or containing spaces.
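
To see both kinds of alias at work, here is a minimal sketch. It assumes SQLite, run through Python's built-in sqlite3 module so that it can be executed as-is; the one-row Employees table and the alias names GivenName and e are made up for illustration.

```python
import sqlite3

# Hypothetical one-row table, created only to demonstrate aliases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeID INTEGER, FirstName TEXT)")
conn.execute("INSERT INTO Employees VALUES (1, 'John')")

cursor = conn.execute("""
    SELECT e.FirstName AS GivenName   -- column alias
    FROM Employees AS e               -- table alias
""")
# The column alias becomes the column name reported for the result set.
column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(column_names, rows)
```

The result column is reported under its alias rather than its original name.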
CHAPTER TWO

BASIC SELECT STATEMENTS

This chapter covers the basics of querying a database with the SELECT
statement. It includes examples of filtering data with WHERE clauses, ordering
data with ORDER BY, and limiting results with LIMIT.

Let's consider the following two tables as examples:

Table1: Employees

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

2 Sarah Johnson 2

3 Alex Brown NULL

4 David Lee 3
Table2: Departments

DepartmentID DepartmentName

1 Sales

2 Marketing

3 Finance

4 IT

SELECT CLAUSE

The SELECT clause in SQL is used to select specific columns from a table. Here's
the basic syntax of a SELECT statement:

SELECT column1, column2 FROM table_name;

Example 1: Let's say we want to get a list of all employees' first and last names.
We can use the SELECT clause to select the "FirstName" and "LastName"
columns:
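
The query for this example is not reproduced in the text, so here is one way it might be written. This is a minimal sketch that assumes SQLite, run through Python's built-in sqlite3 module, with the Employees table rebuilt from the rows listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
""")

# Select only the two name columns from every row.
rows = conn.execute("SELECT FirstName, LastName FROM Employees").fetchall()
for first_name, last_name in rows:
    print(first_name, last_name)
```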
This will return the following result set:

FirstName LastName

John Smith

Sarah Johnson

Alex Brown

David Lee

Example 2: Let's say we want to get a list of all departments' names. We can use
the SELECT clause to select the "DepartmentName" column:

This will return the following result set:

DepartmentName

Sales

Marketing

Finance

IT
WHERE CLAUSE

The WHERE clause in SQL is used to filter rows based on a certain condition.
Here's the basic syntax of a SELECT statement with a WHERE clause:

SELECT column1, column2 FROM table_name WHERE condition;

Example 1: Let's say we want to get a list of all employees who belong to the
"Sales" department. We can use the WHERE clause to filter the rows based on
the "DepartmentID" column:
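
One way this filter might be written is sketched below. It assumes SQLite via Python's built-in sqlite3 module, and it relies on the Departments table above, where "Sales" has DepartmentID 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
""")

# Keep only the rows whose DepartmentID is 1 (the Sales department).
rows = conn.execute("""
    SELECT FirstName, LastName
    FROM Employees
    WHERE DepartmentID = 1
""").fetchall()
print(rows)
```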

This will return the following result set:

FirstName LastName

John Smith

Example 2: Let's say we want to get a list of all departments except for "IT". We
can use the WHERE clause to filter the rows based on the "DepartmentName"
column:
This will return the following result set:

DepartmentName

Sales

Marketing

Finance

LIMIT CLAUSE

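The LIMIT clause in SQL is used to limit the number of rows returned in a query
result set. It is particularly useful when dealing with large tables or when only a
few records are needed. LIMIT is supported by systems such as MySQL,
PostgreSQL, and SQLite; SQL Server uses TOP instead.

Here's the basic syntax of the LIMIT clause:

SELECT column1, column2 FROM table_name LIMIT number_of_rows;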

Let's consider the following table as an example:

Table1: Employees

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

2 Sarah Johnson 2

3 Alex Brown NULL


4 David Lee 3

Example 1: Let's say we want to get the first two employees from the
"Employees" table. We can use the LIMIT clause to limit the result set to the first
two rows:
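
A possible form of this query is sketched below, assuming SQLite via Python's built-in sqlite3 module. The ORDER BY EmployeeID is an addition of this sketch: without an explicit order, which rows count as the "first two" is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
""")

# Sort by EmployeeID, then keep only the first two rows.
rows = conn.execute("""
    SELECT * FROM Employees
    ORDER BY EmployeeID
    LIMIT 2
""").fetchall()
print(rows)
```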

This will return the following result set:

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

2 Sarah Johnson 2

As you can see, only the first two rows are returned in the result set.

Example 2: Let's say we want to get the first employee from each department.
LIMIT applies to a whole result set, so a single LIMIT on the main query cannot
return one row per department. Instead, we can use a subquery with LIMIT 1 to
pick the first employee for each department:
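
The original query is not shown, so the following is only one way it might look. It is a sketch assuming SQLite via Python's built-in sqlite3 module, and it treats the lowest EmployeeID in a department as that department's "first" employee, which is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
    CREATE TABLE Departments (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT);
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'Finance'), (4, 'IT');
""")

# The correlated subquery runs once per department and, thanks to
# LIMIT 1, returns the lowest EmployeeID in that department.
rows = conn.execute("""
    SELECT e.FirstName, e.LastName, d.DepartmentName
    FROM Employees AS e
    JOIN Departments AS d ON d.DepartmentID = e.DepartmentID
    WHERE e.EmployeeID = (
        SELECT e2.EmployeeID
        FROM Employees AS e2
        WHERE e2.DepartmentID = d.DepartmentID
        ORDER BY e2.EmployeeID
        LIMIT 1
    )
    ORDER BY d.DepartmentID
""").fetchall()
print(rows)
```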
This will return the following result set:

FirstName LastName DepartmentName

John Smith Sales

Sarah Johnson Marketing

David Lee Finance

As you can see, the result set includes only the first employee for each
department. Note that the subquery, with its LIMIT 1, selects the first employee
for each department, and the main query joins the employee and department
information. A LIMIT placed on the main query would instead cap the total
number of rows returned.
ORDER BY

The ORDER BY clause is used in SQL to sort the results of a query in ascending
or descending order based on one or more columns. Here's the basic syntax of an
ORDER BY clause:

SELECT column1, column2 FROM table_name ORDER BY column1 [ASC | DESC];

Let's consider the following two tables as examples:

Table1: Employees

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

2 Sarah Johnson 2

3 Alex Brown NULL

4 David Lee 3

Table2: Departments

DepartmentID DepartmentName

1 Sales

2 Marketing

3 Finance

4 IT

Example 1: Let's say we want to get a list of all employees sorted by their first
name in ascending order. We can use the ORDER BY clause on the FirstName
column:

This will return the following result set:

EmployeeID FirstName LastName DepartmentID

3 Alex Brown NULL

4 David Lee 3

1 John Smith 1

2 Sarah Johnson 2

As you can see, the result set is sorted by the FirstName column in ascending
order.
Example 2: Let's say we want to get a list of all departments sorted by their
department name in descending order. We can use the ORDER BY clause on the
DepartmentName column:
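
A sketch of how this sort might be written follows, assuming SQLite via Python's built-in sqlite3 module. Descending alphabetical order places "Sales" first and "Finance" last.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Departments (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT);
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'Finance'), (4, 'IT');
""")

# Sort the department names from Z to A.
rows = conn.execute("""
    SELECT DepartmentID, DepartmentName
    FROM Departments
    ORDER BY DepartmentName DESC
""").fetchall()
print(rows)
```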

This will return the following result set:

DepartmentID DepartmentName

1 Sales

2 Marketing

4 IT

3 Finance

As you can see, the result set is sorted by the DepartmentName column in
descending order.

In summary, SELECT is used to retrieve data from one or more tables, WHERE
is used to filter the rows based on certain conditions, LIMIT is used to limit the
number of rows returned, and ORDER BY is used to sort the rows in ascending
or descending order based on one or more columns.

These clauses are the backbone of the SELECT statement in SQL and are used
extensively in querying databases.
CHAPTER THREE

FILTERING ROWS

This chapter goes into more detail on filtering data with the WHERE clause. It
covers topics like using multiple conditions with AND and OR, filtering on
NULL values, and using the LIKE operator to search for patterns in text.

AND CLAUSE

The AND clause is used to filter rows based on multiple conditions. The result
will contain only those rows that satisfy all the conditions in the WHERE clause.
Here's the syntax:

SELECT column1, column2 FROM table_name WHERE condition1 AND condition2;

Example 1: Suppose we want to find all employees who are in the Sales
department and whose last name is "Smith". We can use the AND clause to
combine the conditions as follows:
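
One way to combine the two conditions is sketched below, assuming SQLite via Python's built-in sqlite3 module. The Sales department is identified by DepartmentID 1, as in the Departments table shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
""")

# Both conditions must hold for a row to be returned.
rows = conn.execute("""
    SELECT * FROM Employees
    WHERE DepartmentID = 1 AND LastName = 'Smith'
""").fetchall()
print(rows)
```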

This will return the following result set:

EmployeeID FirstName LastName DepartmentID

1 John Smith 1
Example 2: Suppose we want to find all employees who are in the Marketing
department and whose first name is "Sarah". We can use the AND clause to
combine the conditions as follows:

This will return the following result set:

EmployeeID FirstName LastName DepartmentID

2 Sarah Johnson 2

OR CLAUSE

The OR clause is used to filter rows based on multiple conditions. The result will
contain any rows that satisfy at least one of the conditions in the WHERE clause.

Here's the syntax:

SELECT column1, column2 FROM table_name WHERE condition1 OR condition2;

Example 1: Suppose we want to find all employees who are in the Sales
department or whose last name is "Brown". We can use the OR clause to combine
the conditions as follows:
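
A possible version of this query is sketched below, assuming SQLite via Python's built-in sqlite3 module. The ORDER BY is added by this sketch only to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
""")

# A row is returned when at least one of the two conditions holds.
rows = conn.execute("""
    SELECT * FROM Employees
    WHERE DepartmentID = 1 OR LastName = 'Brown'
    ORDER BY EmployeeID
""").fetchall()
print(rows)
```

In Python, the SQL NULL in Alex Brown's DepartmentID is returned as None.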
This will return the following result set:

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

3 Alex Brown NULL

Example 2: Suppose we want to find all employees who are in the Marketing
department or whose first name is "David". We can use the OR clause to combine
the conditions as follows:

This will return the following result set:

EmployeeID FirstName LastName DepartmentID

2 Sarah Johnson 2

4 David Lee 3

NULL CLAUSE

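The NULL clause, written IS NULL, is used to filter rows that have NULL values
in a specified column. Because NULL never compares equal to anything, a test
such as column_name = NULL matches no rows; use IS NULL (or IS NOT NULL)
instead.

Here's the syntax:

SELECT column1, column2 FROM table_name WHERE column_name IS NULL;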

Example 1: Suppose we want to find all employees who do not have a department
assigned to them. We can use the NULL clause as follows:
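
The sketch below shows one way to write this filter, and contrasts it with an "= NULL" comparison to show why IS NULL is needed. It assumes SQLite via Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
""")

# IS NULL finds the rows with no department.
rows_is_null = conn.execute(
    "SELECT * FROM Employees WHERE DepartmentID IS NULL"
).fetchall()

# "= NULL" is never true, so this query returns no rows at all.
rows_equals_null = conn.execute(
    "SELECT * FROM Employees WHERE DepartmentID = NULL"
).fetchall()

print(rows_is_null, rows_equals_null)
```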

This will return the following result set:

EmployeeID FirstName LastName DepartmentID

3 Alex Brown NULL

LIKE CLAUSE

The LIKE clause is used in SQL to search for patterns in columns. It is typically
used with the wildcard characters '%' and '_'. The '%' character matches any string
of zero or more characters, while the '_' character matches any single character.

Let's consider the following example using the Employees table from the previous
examples:

Example 1: Get the list of employees whose last name starts with the letter "S",
using the pattern 'S%'

This will return the following result set:

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

As you can see, the result set includes only the employee whose last name starts
with the letter "S". Sarah Johnson is not included, because the pattern is
matched against LastName, not FirstName.

Example using both NULL and LIKE clause: Let's say we want to get a list of
employees whose last name starts with the letter "B" and who do not have a
department assigned. We can combine the LIKE and NULL clauses:
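
One way the combined condition might be written is sketched below, assuming SQLite via Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
""")

# 'B%' matches any last name beginning with B; IS NULL keeps only
# employees without a department.
rows = conn.execute("""
    SELECT * FROM Employees
    WHERE LastName LIKE 'B%' AND DepartmentID IS NULL
""").fetchall()
print(rows)
```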

This will return the following result set:

EmployeeID FirstName LastName DepartmentID

3 Alex Brown NULL

As you can see, the result set includes only the employee "Alex Brown" who has
no department assigned and whose last name starts with the letter "B".
CHAPTER FOUR

SORTING QUERY RESULTS

This chapter covers the ORDER BY clause in more detail. It includes examples
of sorting data by multiple columns, sorting by computed columns, and using the
ASC and DESC keywords to control the sort order.

ORDER BY ASC

The ORDER BY ASC clause is used to sort the result set in ascending order based
on one or more columns. Ascending is the default, so the ASC keyword can be
omitted. Here's the basic syntax of the ORDER BY ASC clause:

SELECT column1, column2 FROM table_name ORDER BY column1 ASC;

Example 1: Let's say we have a table called "employees" with columns
"employee_id", "first_name", "last_name", and "salary". To retrieve all the rows
in the table sorted in ascending order by salary, we can use the following query:

SELECT * FROM employees ORDER BY salary ASC;

This will return all the columns for all the rows in the "employees" table sorted
in ascending order by salary.

Example 2: Let's say we want to retrieve all the rows in the "employees" table
sorted in ascending order by last name and then by first name. We can use the
following query:
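
A sketch of a two-column sort follows, assuming SQLite via Python's built-in sqlite3 module. The text gives no rows for this "employees" table, so the three employees below are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, first_name TEXT,
                            last_name TEXT, salary INTEGER);
    -- Hypothetical sample rows, not taken from the text.
    INSERT INTO employees VALUES
        (1, 'Ann', 'Lee', 5000), (2, 'Bob', 'Adams', 4000), (3, 'Cara', 'Lee', 4500);
""")

# Sort by last name first; ties on last name are broken by first name.
by_name = conn.execute("""
    SELECT last_name, first_name
    FROM employees
    ORDER BY last_name ASC, first_name ASC
""").fetchall()
print(by_name)
```

The two employees named Lee share a last name, so the second sort column decides their order.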
This will return all the columns for all the rows in the "employees" table sorted
in ascending order by last name and then by first name.

ORDER BY DESC

The ORDER BY DESC clause is used to sort the result set in descending order
based on one or more columns. Here's the basic syntax of the ORDER BY DESC
clause:

SELECT column1, column2 FROM table_name ORDER BY column1 DESC;

Example 1: Let's say we have a table called "employees" with columns
"employee_id", "first_name", "last_name", and "salary". To retrieve all the rows
in the table sorted in descending order by salary, we can use the following query:

SELECT * FROM employees ORDER BY salary DESC;

This will return all the columns for all the rows in the "employees" table sorted
in descending order by salary.

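Example 2: Let's say we want to retrieve all the rows in the "employees" table
sorted in descending order by last name and then by first name. DESC applies
only to the column it follows, so it must be repeated for each column. We can use
the following query:

SELECT * FROM employees ORDER BY last_name DESC, first_name DESC;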

This will return all the columns for all the rows in the "employees" table sorted
in descending order by last name and then by first name.
CHAPTER FIVE

JOINS

This chapter covers the different types of joins available in SQL, including
INNER JOIN, LEFT JOIN, and FULL OUTER JOIN. It includes examples of
joining tables on multiple columns and using aliases to simplify the query.

INNER JOIN

An INNER JOIN in SQL is used to return only the rows from both tables being
joined where there is a match on a particular condition.

Here's the basic syntax of an INNER JOIN:

SELECT columns FROM table1 INNER JOIN table2 ON table1.column = table2.column;

Let's consider the following two tables as examples:

Table1: Employees

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

2 Sarah Johnson 2

3 Alex Brown NULL

4 David Lee 3
Table2: Departments

DepartmentID DepartmentName

1 Sales

2 Marketing

3 Finance

4 IT

Example 1: Let's say we want to get a list of all employees who have a department,
together with their department names. We can use an INNER JOIN on the
DepartmentID column:
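
One way this join might be written is sketched below, assuming SQLite via Python's built-in sqlite3 module. The aliases e and d, and the ORDER BY, are choices of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
    CREATE TABLE Departments (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT);
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'Finance'), (4, 'IT');
""")

# Only employee/department pairs with equal DepartmentID values survive.
rows = conn.execute("""
    SELECT e.FirstName, e.LastName, d.DepartmentName
    FROM Employees AS e
    INNER JOIN Departments AS d ON e.DepartmentID = d.DepartmentID
    ORDER BY e.EmployeeID
""").fetchall()
print(rows)
```

Alex Brown (no department) and the IT department (no employees) have no match, so neither appears.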

This will return the following result set:

FirstName LastName DepartmentName

John Smith Sales

Sarah Johnson Marketing

David Lee Finance


As you can see, the result set includes only those employees who have a matching
department in the "Departments" table.

Example 2: Let's say we want to get a list of all departments and their employees.
We can use an INNER JOIN on the DepartmentID column:

This will return the following result set:

FirstName LastName DepartmentName

John Smith Sales

Sarah Johnson Marketing

David Lee Finance

As you can see, the result set includes only those departments that have
employees assigned to them in the "Employees" table, and only those employees
who have a matching department in the "Departments" table. The department
"IT" is not included in the result set because there are no employees assigned to
it in the "Employees" table.

LEFT JOIN

A LEFT JOIN in SQL is used to return all rows from the left table and matching
rows from the right table. If there are no matches in the right table, the result will
contain NULL values for the right table's columns.

Here's the basic syntax of a LEFT JOIN:

SELECT columns FROM table1 LEFT JOIN table2 ON table1.column = table2.column;

Let's use the same two tables as examples:

Table1: Employees

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

2 Sarah Johnson 2

3 Alex Brown NULL

4 David Lee 3

Table2: Departments

DepartmentID DepartmentName

1 Sales

2 Marketing

3 Finance

4 IT
Example 1: Let's say we want to get a list of all employees and their department
names (if any). We can use a LEFT JOIN on the DepartmentID column:
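
A sketch of this LEFT JOIN follows, assuming SQLite via Python's built-in sqlite3 module. Employees is the left table, so every employee is kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
    CREATE TABLE Departments (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT);
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'Finance'), (4, 'IT');
""")

# Every employee is returned; DepartmentName is NULL when nothing matches.
rows = conn.execute("""
    SELECT e.FirstName, e.LastName, d.DepartmentName
    FROM Employees AS e
    LEFT JOIN Departments AS d ON e.DepartmentID = d.DepartmentID
    ORDER BY e.EmployeeID
""").fetchall()
print(rows)
```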

This will return the following result set:

FirstName LastName DepartmentName

John Smith Sales

Sarah Johnson Marketing

Alex Brown NULL

David Lee Finance

As you can see, the result set includes all employees from the "Employees" table,
as well as the department names for those employees who have a matching
department in the "Departments" table. For employees with no department, the
DepartmentName column is NULL.

Example 2: Let's say we want to get a list of all departments and the employees
who work in each department. We can use a LEFT JOIN on the DepartmentID
column:
This will return the following result set:

DepartmentName FirstName LastName

Sales John Smith

Marketing Sarah Johnson

Finance David Lee

IT NULL NULL

As you can see, the result set includes all departments from the "Departments"
table, as well as the employees who work in each department. For the "IT"
department, there are no employees, so the FirstName and LastName columns are
NULL.

RIGHT JOIN

A RIGHT JOIN in SQL is used to return all rows from the right table and the
matching rows from the left table based on the join condition. If there is no match,
it returns NULL values for the columns from the left table.

Here's the basic syntax of a RIGHT JOIN:

SELECT columns FROM table1 RIGHT JOIN table2 ON table1.column = table2.column;

Using the same tables from the previous example, here are two examples of
RIGHT JOIN:

Example 1: Let's say we want to get a list of all departments and their employees
(if any), keeping every department even when no employee belongs to it. With
Employees as the left table and Departments as the right table, we can use a
RIGHT JOIN on the DepartmentID column:
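
The sketch below assumes SQLite via Python's built-in sqlite3 module. SQLite only gained RIGHT JOIN in version 3.39.0, so the RIGHT JOIN is shown as a comment and the query that actually runs is its equivalent LEFT JOIN with the table order swapped. Both keep every row of Departments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
    CREATE TABLE Departments (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT);
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'Finance'), (4, 'IT');
""")

# RIGHT JOIN form (Departments is the right table, so all of it is kept):
#   SELECT e.FirstName, e.LastName, d.DepartmentName
#   FROM Employees AS e
#   RIGHT JOIN Departments AS d ON e.DepartmentID = d.DepartmentID;
#
# Equivalent LEFT JOIN with the tables swapped, which older SQLite accepts:
rows = conn.execute("""
    SELECT e.FirstName, e.LastName, d.DepartmentName
    FROM Departments AS d
    LEFT JOIN Employees AS e ON e.DepartmentID = d.DepartmentID
    ORDER BY d.DepartmentID
""").fetchall()
print(rows)
```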

This will return the following result set:

FirstName LastName DepartmentName

John Smith Sales

Sarah Johnson Marketing

David Lee Finance

NULL NULL IT

As you can see, the result set includes all departments from the "Departments"
table, as well as the employees who work in each department. For the "IT"
department, there are no matching employees in the "Employees" table, so the
FirstName and LastName columns are NULL. Alex Brown, who has no department,
does not appear.

Example 2: Let's say we want to get a list of all departments and their employees
(if any), including those departments with no employees. We can use a RIGHT
JOIN on the DepartmentID column:

This will return the following result set:

FirstName LastName DepartmentName

John Smith Sales

Sarah Johnson Marketing

David Lee Finance

NULL NULL IT

As you can see, the result set includes all departments from the "Departments"
table, as well as the employees who work in each department. For the "IT"
department, there are no matching employees, so the FirstName and LastName
columns are NULL.

FULL OUTER JOIN

A FULL OUTER JOIN in SQL is used to return all rows from both tables being
joined. When there is no match between the two tables on a particular condition,
it returns NULL values for columns from the other table.

Here's the basic syntax of a FULL OUTER JOIN:

SELECT columns FROM table1 FULL OUTER JOIN table2 ON table1.column = table2.column;

Let's consider the following two tables as examples:

Table1: Employees

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

2 Sarah Johnson 2

3 Alex Brown NULL

4 David Lee 3
Table2: Departments

DepartmentID DepartmentName

1 Sales

2 Marketing

3 Finance

4 IT

Example 1: Let's say we want to get a list of all employees and their department
names (if any). We can use a FULL OUTER JOIN on the DepartmentID column:
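
The sketch below assumes SQLite via Python's built-in sqlite3 module. Because SQLite versions before 3.39.0 lack FULL OUTER JOIN, it does not use that keyword; instead it emulates the same result with a LEFT JOIN plus a UNION ALL that adds the departments without employees. On a system that supports it, a single FULL OUTER JOIN on DepartmentID would replace both halves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
    CREATE TABLE Departments (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT);
    INSERT INTO Departments VALUES (1, 'Sales'), (2, 'Marketing'), (3, 'Finance'), (4, 'IT');
""")

rows = conn.execute("""
    -- Part 1: every employee, with a department name where one matches.
    SELECT e.FirstName, e.LastName, d.DepartmentName
    FROM Employees AS e
    LEFT JOIN Departments AS d ON e.DepartmentID = d.DepartmentID
    UNION ALL
    -- Part 2: departments that no employee belongs to.
    SELECT e.FirstName, e.LastName, d.DepartmentName
    FROM Departments AS d
    LEFT JOIN Employees AS e ON e.DepartmentID = d.DepartmentID
    WHERE e.EmployeeID IS NULL
""").fetchall()
for row in rows:
    print(row)
```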
This will return the following result set:

FirstName LastName DepartmentName

John Smith Sales

Sarah Johnson Marketing

Alex Brown NULL

David Lee Finance

NULL NULL IT

As you can see, the result set includes all employees from the "Employees" table,
as well as the department names for those employees who have a matching
department in the "Departments" table. For employees with no department, the
DepartmentName column is NULL. The "IT" department, which has no employees,
also appears, with NULL in the FirstName and LastName columns.

Example 2: Let's say we want to get a list of all departments and the employees
who work in each department. We can use a FULL OUTER JOIN on the
DepartmentID column:
This will return the following result set:

FirstName LastName DepartmentName

John Smith Sales

Sarah Johnson Marketing

David Lee Finance

NULL NULL IT

Alex Brown NULL

As you can see, the result set includes all departments from the "Departments"
table, as well as the employees who work in each department. For departments
with no employees, the FirstName and LastName columns are NULL. Because a
FULL OUTER JOIN keeps unmatched rows from both tables, Alex Brown also
appears, with NULL as the DepartmentName.
CHAPTER SIX

DATA AGGREGATION

This chapter covers the use of aggregate functions like SUM, AVG, and COUNT
to summarize data in a query. It includes examples of grouping data with the
GROUP BY clause and filtering grouped data with the HAVING clause.

SUM, COUNT, and AVG are aggregate functions in SQL used to perform
calculations on a set of values in a column. Here are the explanations of each:

• SUM: It returns the total sum of values in a column.

• COUNT: It returns the number of rows. COUNT(*) counts every row, while
COUNT(column_name) counts only the rows where that column is not NULL.

• AVG: It returns the average value of a column.

Let's consider the following tables as examples for illustrating the usage of these
functions:

Table 1: Orders

OrderID CustomerID OrderDate TotalAmount

1 100 2022-01-01 250.00

2 200 2022-02-01 150.00

3 100 2022-03-01 100.00

4 300 2022-04-01 300.00

5 200 2022-05-01 200.00


Table 2: Customers

CustomerID CustomerName City

100 John Smith New York

200 Sarah Johnson London

300 Alex Brown Paris

Example 1: Using SUM and COUNT

Suppose we want to calculate the total amount of orders and the number of orders
placed. We can use the SUM and COUNT functions in SQL as follows:
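
One way to write this aggregate query is sketched below, assuming SQLite via Python's built-in sqlite3 module. The column aliases Total and NumOrders follow the headings of the result set in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER,
                         OrderDate TEXT, TotalAmount REAL);
    INSERT INTO Orders VALUES
        (1, 100, '2022-01-01', 250.00), (2, 200, '2022-02-01', 150.00),
        (3, 100, '2022-03-01', 100.00), (4, 300, '2022-04-01', 300.00),
        (5, 200, '2022-05-01', 200.00);
""")

# Both aggregates collapse the whole table into a single row.
total, num_orders = conn.execute("""
    SELECT SUM(TotalAmount) AS Total, COUNT(*) AS NumOrders
    FROM Orders
""").fetchone()
print(total, num_orders)
```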

This will return the following result set:

Total NumOrders

1000.00 5

As you can see, the result set includes the total amount of all orders and the
number of orders placed.

Example 2: Using AVG

Suppose we want to calculate the average total amount of orders placed. We can
use the AVG function in SQL as follows:
This will return the following result set:

AvgTotal

200.00

As you can see, the result set includes the average total amount of all orders
placed.

Example 3: Using SUM and GROUP BY

Suppose we want to calculate the total amount of orders placed by each customer.
We can use the SUM function together with the GROUP BY clause as follows:
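
A sketch of the per-customer total follows, assuming SQLite via Python's built-in sqlite3 module. Joining Orders to Customers to obtain the customer name is how this sketch produces the CustomerName column shown in the result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, City TEXT);
    INSERT INTO Customers VALUES
        (100, 'John Smith', 'New York'), (200, 'Sarah Johnson', 'London'),
        (300, 'Alex Brown', 'Paris');
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER,
                         OrderDate TEXT, TotalAmount REAL);
    INSERT INTO Orders VALUES
        (1, 100, '2022-01-01', 250.00), (2, 200, '2022-02-01', 150.00),
        (3, 100, '2022-03-01', 100.00), (4, 300, '2022-04-01', 300.00),
        (5, 200, '2022-05-01', 200.00);
""")

# GROUP BY forms one group per customer; SUM adds up each group's orders.
rows = conn.execute("""
    SELECT c.CustomerName, SUM(o.TotalAmount) AS TotalAmount
    FROM Orders AS o
    JOIN Customers AS c ON c.CustomerID = o.CustomerID
    GROUP BY c.CustomerID, c.CustomerName
    ORDER BY c.CustomerID
""").fetchall()
print(rows)
```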
This will return the following result set:

CustomerName TotalAmount

John Smith 350.00

Sarah Johnson 350.00

Alex Brown 300.00

As you can see, the result set includes the total amount of orders placed by each
customer.

Example 4: Using COUNT and WHERE

Suppose we want to calculate the number of customers in New York who placed
orders. We can use the COUNT function together with the WHERE clause as
follows:

This will return the following result set:

NumCustomers

1
GROUP BY CLAUSE

The GROUP BY clause is used to group the rows returned by a SELECT
statement based on one or more columns. It is often used with aggregate functions
like SUM, AVG, COUNT, etc. to calculate summary statistics for each group.
Here's an example:

Suppose we want to calculate the total revenue generated by each country. We
can use the GROUP BY clause to group the rows by country, and then use the
SUM function to calculate the total revenue for each group:

This query will return the total revenue generated by each country in the
"customer" table.

HAVING

The HAVING clause is used to filter the groups returned by a GROUP BY clause
based on a condition. It is similar to the WHERE clause, but it operates on groups
rather than individual rows. Here's an example:

Suppose we want to find the countries where the average order value is greater
than $50. We can use the GROUP BY clause to group the rows by country, and
then use the AVG function to calculate the average order value for each group.
Finally, we can use the HAVING clause to filter the groups based on the
average order value:
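
The text does not show the table behind this example, so the sketch below invents a small "orders" table with "country" and "order_value" columns; both the schema and the rows are hypothetical. It assumes SQLite via Python's built-in sqlite3 module, and it repeats the AVG expression in HAVING because that form is accepted by every major database system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, used only for illustration.
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, country TEXT, order_value REAL);
    INSERT INTO orders VALUES
        (1, 'USA', 80), (2, 'USA', 40),
        (3, 'France', 30), (4, 'France', 20),
        (5, 'Japan', 90);
""")

# WHERE would filter individual orders; HAVING filters whole groups
# after the average for each country has been computed.
rows = conn.execute("""
    SELECT country, AVG(order_value) AS avg_order_value
    FROM orders
    GROUP BY country
    HAVING AVG(order_value) > 50
    ORDER BY country
""").fetchall()
print(rows)
```

France averages 25 and is filtered out, while Japan (90) and the USA (60) remain.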
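This query will return the countries where the average order value is greater than
$50. Note that we have used the alias "avg_order_value" to refer to the output of
the AVG function in the HAVING clause. MySQL accepts a column alias in
HAVING, but some systems, such as SQL Server and PostgreSQL, do not; there
the AVG expression must be repeated in the HAVING clause.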
CHAPTER SEVEN

COMBINING AND MODIFYING DATA

This chapter covers techniques for combining data from multiple tables and
modifying the results with computed columns. It includes examples of using
UNION and UNION ALL to combine data, as well as using functions like
CONCAT and SUBSTRING to modify column values.

UNION and UNION ALL

UNION and UNION ALL are used to combine the results of two or more
SELECT statements into a single result set.

The difference between the two is that UNION removes duplicates while UNION
ALL retains duplicates.

Let's consider the following two tables as examples:

Table1: Employees

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

2 Sarah Johnson 2

3 Alex Brown NULL

4 David Lee 3
Table2: MoreEmployees

EmployeeID FirstName LastName DepartmentID

5 Mary White 2

6 John Doe 1

7 Alex Smith NULL

8 Jane Lee 3

Example 1:

Let's say we want to get a list of all employees from both tables, without any
duplicates.

We can use UNION as follows:

This will return the following result set:

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

2 Sarah Johnson 2

3 Alex Brown NULL

4 David Lee 3

5 Mary White 2

6 John Doe 1

7 Alex Smith NULL

8 Jane Lee 3

As you can see, the result set includes all employees from both tables, without
any duplicates.

Example 2:

Let's say we want to get a list of all employees from both tables, including
duplicates.

We can use UNION ALL as follows:


This will return the following result set:

EmployeeID FirstName LastName DepartmentID

1 John Smith 1

2 Sarah Johnson 2

3 Alex Brown NULL

4 David Lee 3

5 Mary White 2

6 John Doe 1

7 Alex Smith NULL

8 Jane Lee 3

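As you can see, the result set includes all employees from both tables. Because
the two tables share no identical rows, UNION and UNION ALL return the same
rows here; the results would differ only if a row appeared in both tables.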
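
To make the difference visible with the same data, the sketch below selects only the FirstName column, where "John" and "Alex" occur in both tables. It assumes SQLite via Python's built-in sqlite3 module; narrowing the query to one column is a choice of this sketch, not something the text does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
    CREATE TABLE MoreEmployees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                                LastName TEXT, DepartmentID INTEGER);
    INSERT INTO MoreEmployees VALUES
        (5, 'Mary', 'White', 2), (6, 'John', 'Doe', 1),
        (7, 'Alex', 'Smith', NULL), (8, 'Jane', 'Lee', 3);
""")

# UNION removes the repeated first names.
union_rows = conn.execute("""
    SELECT FirstName FROM Employees
    UNION
    SELECT FirstName FROM MoreEmployees
""").fetchall()

# UNION ALL keeps every row from both SELECT statements.
union_all_rows = conn.execute("""
    SELECT FirstName FROM Employees
    UNION ALL
    SELECT FirstName FROM MoreEmployees
""").fetchall()

print(len(union_rows), len(union_all_rows))
```

UNION yields six distinct names, while UNION ALL yields all eight rows.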

CONCAT

The CONCAT function in SQL is used to concatenate two or more strings.

Here's the basic syntax of the CONCAT function:

CONCAT(string1, string2, ...)

Example 1: Let's say we want to create a list of full names by concatenating the
FirstName and LastName columns in the "Employees" table.
We can use the CONCAT function as follows:

This will return the following result set:

FullName

John Smith

Sarah Johnson

Alex Brown

David Lee

As you can see, the result set includes the full name of each employee by
concatenating the FirstName and LastName columns.

Example 2: Let's say we want to concatenate three columns - FirstName,
MiddleName and LastName - in the "Employees" table (assuming the table also
has a MiddleName column, which is not shown above). If MiddleName is NULL,
we want to skip it and concatenate only the FirstName and LastName columns.

We can use the CONCAT function along with the IFNULL function as follows:

This will return the following result set:

FullName

John Smith

Sarah Johnson

Alex Brown

David Lee

As you can see, the result set includes the full name of each employee by
concatenating the FirstName, MiddleName (if it's not NULL) and LastName
columns.

SUBSTRING

The SUBSTRING function in SQL is used to extract a substring from a string.

Here's the basic syntax of the SUBSTRING function, where positions are counted
from 1:

SUBSTRING(string, start_position, length)

Example 1: Let's say we want to extract the first three characters of the
FirstName column in the "Employees" table.

We can use the SUBSTRING function as follows:


This will return the following result set:

Initials

Joh

Sar

Ale

Dav

As you can see, the result set includes the first three characters of the FirstName
column for each employee.

Example 2: Let's say we want to extract the last two characters of the LastName
column in the "Employees" table. We can use the SUBSTRING function along
with the LEN function as follows:

This will return the following result set:

LastTwoChars

th

on

wn

ee

As you can see, the result set includes the last two characters of the LastName
column for each employee. We used the LEN function to get the length of the
LastName column and subtracted 1 from it to get the position of the
second-to-last character. Then we specified 2 as the length argument to extract
the last two characters. LEN is the SQL Server name for this function; MySQL,
PostgreSQL, and SQLite call it LENGTH.
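
The same calculation can be sketched in SQLite, run here through Python's built-in sqlite3 module. SQLite spells the two functions substr and length, so those names replace SUBSTRING and LEN in this sketch; the arithmetic is unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                            LastName TEXT, DepartmentID INTEGER);
    INSERT INTO Employees VALUES
        (1, 'John', 'Smith', 1), (2, 'Sarah', 'Johnson', 2),
        (3, 'Alex', 'Brown', NULL), (4, 'David', 'Lee', 3);
""")

# length(LastName) - 1 is the position of the second-to-last character;
# taking 2 characters from there gives the last two.
rows = conn.execute("""
    SELECT substr(LastName, length(LastName) - 1, 2) AS LastTwoChars
    FROM Employees
    ORDER BY EmployeeID
""").fetchall()
print(rows)
```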
CHAPTER EIGHT

SUBQUERIES

This chapter covers the use of subqueries to perform complex queries and data
manipulation. It includes examples of using subqueries in WHERE clauses and
SELECT statements.

A subquery, also known as a nested query or inner query, is a query within a
query in SQL. It is used to retrieve data that will be used as a condition for the
main query.

Here's an explanation of subqueries in SQL with two examples:

Example 1: Using a subquery to filter data

Suppose we have two tables: "orders" and "customers". The "orders" table
contains information about orders, such as the order ID, customer ID, and order
date. The "customers" table contains information about customers, such as the
customer ID, name, and address.

Let's say we want to retrieve all the orders made by customers who live in
California. We can use a subquery to retrieve the customer IDs of customers who
live in California, and then use those IDs to filter the orders table.

Here's an example query:
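
The sketch below shows one way such a query might look, assuming SQLite via Python's built-in sqlite3 module. The text describes the two tables only loosely, so the exact columns (including "state") and all of the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows, used only for illustration.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, state TEXT);
    INSERT INTO customers VALUES (1, 'Ana', 'CA'), (2, 'Ben', 'NY'), (3, 'Cy', 'CA');
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);
    INSERT INTO orders VALUES
        (10, 1, '2023-01-05'), (11, 2, '2023-01-06'),
        (12, 3, '2023-01-07'), (13, 2, '2023-01-08');
""")

# The subquery returns the IDs of California customers; the main query
# keeps only the orders whose customer_id is in that list.
rows = conn.execute("""
    SELECT order_id, customer_id
    FROM orders
    WHERE customer_id IN (SELECT customer_id FROM customers WHERE state = 'CA')
    ORDER BY order_id
""").fetchall()
print(rows)
```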


In this query, the subquery is (SELECT customer_id FROM customers WHERE
state = 'CA'), which retrieves the customer IDs of customers who live in
California. The main query then uses the IN keyword to filter the orders table
based on the customer IDs retrieved by the subquery.

Example 2: Using a subquery to perform calculations

Suppose we have a table called "employees" that contains information about
employees, such as their employee ID, name, and salary. Let's say we want to
retrieve all the employees who earn a salary that is higher than the average salary
of all employees. We can use a subquery to calculate the average salary, and then
use that value to filter the employees table.

Here's an example query:
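
One way to write this query is sketched below, assuming SQLite via Python's built-in sqlite3 module. The three sample employees are invented for illustration; their average salary is 5000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows, used only for illustration.
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    INSERT INTO employees VALUES (1, 'Ann', 3000), (2, 'Bob', 5000), (3, 'Cara', 7000);
""")

# The subquery computes one value (the average salary); the main query
# compares every employee's salary against it.
rows = conn.execute("""
    SELECT name, salary
    FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
""").fetchall()
print(rows)
```

Bob earns exactly the average, so only Cara satisfies the strict "greater than" comparison.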

In this query, the subquery is (SELECT AVG(salary) FROM employees), which
calculates the average salary of all employees. The main query then uses the >
operator to filter the employees table based on the average salary retrieved by the
subquery.

In summary, subqueries are a powerful tool in SQL that can be used to filter data,
perform calculations, and more. By nesting queries within other queries, we can
create complex queries that retrieve the exact data we need.

CHAPTER NINE

STRING MANIPULATION

This chapter covers techniques for manipulating text data in SQL. It includes
examples of using functions like UPPER, LOWER, and TRIM to modify text
values, as well as using regular expressions to search for patterns in text.

The UPPER, LOWER, and TRIM functions in SQL are used to manipulate text
data.

Here's a brief explanation of each function, with examples based on the tables
introduced in the earlier chapters.

UPPER FUNCTION

The UPPER function converts all characters in a string to uppercase.
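
A query of this kind might look like the sketch below. The table name Employees and the column name FirstName are assumptions; the alias UppercaseName matches the result shown further down:

```sql
SELECT UPPER(FirstName) AS UppercaseName
FROM Employees;
```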

This query will return the first names of all employees in uppercase.

UppercaseName

JOHN

SARAH

ALEX

DAVID

LOWER FUNCTION

The LOWER function converts all characters in a string to lowercase.
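
A sketch of such a query, assuming a table called Departments with a DepartmentName column (both names are illustrative):

```sql
SELECT LOWER(DepartmentName) AS LowercaseName
FROM Departments;
```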

This query will return the department names in lowercase.

LowercaseName

sales

marketing

finance

it

TRIM FUNCTION

The TRIM function removes any leading or trailing spaces from a string.
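
A sketch of such a query, assuming an Employees table with a LastName column. TRIM is available in MySQL, PostgreSQL, and SQL Server 2017 or later; on older SQL Server versions LTRIM(RTRIM(LastName)) gives the same result:

```sql
SELECT TRIM(LastName) AS TrimmedName
FROM Employees;
```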

This query will return the last names of all employees without any leading or
trailing spaces.

TrimmedName

Smith

Johnson

Brown

Lee

UPPER AND LOWER FUNCTIONS COMBINED

This query will return the first names of all employees in uppercase, but first
converts them to lowercase.
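
The nested call described above can be sketched as follows, again assuming an Employees table with a FirstName column. The inner LOWER runs first and the outer UPPER is applied to its result:

```sql
SELECT UPPER(LOWER(FirstName)) AS Name
FROM Employees;
```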

Name

JOHN

SARAH

ALEX

DAVID

TRIM FUNCTION WITH WHERE CLAUSE

This query will return the first names and trimmed last names of all employees
whose last name contains the letter "o".
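
One way to write this, assuming the same illustrative Employees table, is to combine TRIM in the SELECT list with a LIKE pattern in the WHERE clause:

```sql
SELECT FirstName, TRIM(LastName) AS TrimmedName
FROM Employees
WHERE LastName LIKE '%o%';   -- '%' matches any run of characters
```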

FirstName TrimmedName

Sarah Johnson

Alex Brown

In summary, UPPER, LOWER, and TRIM functions are used to manipulate text
data in SQL. The UPPER function converts all characters in a string to uppercase,
the LOWER function converts all characters in a string to lowercase, and the
TRIM function removes any leading or trailing spaces from a string.

These functions can be used individually or combined with other SQL clauses
such as WHERE to filter data.

CHAPTER TEN

NUMERIC MANIPULATION

This chapter covers techniques for manipulating numeric data in SQL. It includes
examples of using functions like ABS, ROUND, and MOD to modify numeric
values, as well as using mathematical operators to perform calculations.

ABS

The ABS function is used to return the absolute (positive) value of a number.

The syntax of the ABS function is ABS(number), where number is a numeric value,
column, or expression.

Example 1: Let's say we have a table called "numbers" with a column "value"
containing both positive and negative numbers.

To retrieve the absolute value of each number in the "value" column, we can use
the following query:
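
A minimal sketch, using the table and column named above (the alias absolute_value is illustrative):

```sql
SELECT value, ABS(value) AS absolute_value
FROM numbers;
```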

This will return a result set with the absolute value of each number in the "value"
column.

Example 2: Let's say we have a table called "transactions" with a column
"amount" containing both positive and negative amounts.

To retrieve the total sum of all transactions in the "amount" column regardless of
whether they are positive or negative, we can use the following query:
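
A sketch of one reading of this requirement: take the absolute value of each amount first and then add them up, so that negative amounts do not cancel positive ones. The alias total_amount is illustrative:

```sql
SELECT SUM(ABS(amount)) AS total_amount
FROM transactions;
```

Note that ABS(SUM(amount)) would be a different calculation: it nets the amounts first and only then drops the sign.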

This will return the total sum of all transactions in the "amount" column as a
positive number.

ROUND

The ROUND function is used to round a number to a specified number of decimal
places.

The syntax of the ROUND function is ROUND(number, decimal_places).

Example 1: Let's say we have a table called "sales" with a column "amount"
containing sales amounts with decimals. To retrieve the sales amounts rounded
to two decimal places, we can use the following query:
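
A minimal sketch, using the table and column named above (the alias rounded_amount is illustrative):

```sql
SELECT amount, ROUND(amount, 2) AS rounded_amount
FROM sales;
```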

This will return a result set with the sales amounts rounded to two decimal places.

Example 2: Let's say we have a table called "temperature" with a column
"celsius" containing temperatures in Celsius.

To retrieve the temperatures converted to Fahrenheit and rounded to the nearest
whole number, we can use the following query:
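
A sketch of such a query, using the standard conversion F = C × 9/5 + 32. Writing 9.0 rather than 9 is a precaution so that the division is not performed as integer arithmetic when "celsius" is an integer column:

```sql
SELECT celsius,
       ROUND(celsius * 9.0 / 5 + 32, 0) AS fahrenheit
FROM temperature;
```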

This will return a result set with the temperatures in Fahrenheit rounded to the
nearest whole number.

MOD

The MOD function is used to return the remainder of a division operation.

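The syntax of the MOD function is MOD(dividend, divisor). MOD is available in
MySQL, PostgreSQL, and Oracle; SQL Server uses the % operator instead
(dividend % divisor).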

Example: Let's say we have a table called "numbers" with columns "dividend"
and "divisor". To retrieve the remainder of the division operation "dividend
divided by divisor" for each row in the table, we can use the following query:
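
A minimal sketch for databases that provide MOD (the alias remainder is illustrative):

```sql
SELECT dividend, divisor, MOD(dividend, divisor) AS remainder
FROM numbers;
```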

This will return a result set with the remainder of the division operation "dividend
divided by divisor" for each row in the table.

CHAPTER ELEVEN

DATE AND TIME MANIPULATION

This chapter covers techniques for working with date and time data in SQL. It
includes examples of using functions like DATEADD and DATEDIFF to
perform calculations on dates and times, as well as using the DATEPART
function to extract specific components of a date or time value.

DATEADD

The DATEADD function in SQL adds a specified time interval to a given date
and returns a new date.

The basic syntax of the DATEADD function is DATEADD(interval, number, date).

Where:

• interval: the time interval you want to add, such as year, month, day, hour,
minute, etc.

• number: the number of intervals you want to add.

• date: the starting date to which you want to add the interval.

For example, let's say we have a table called "orders" with columns "order_id",
"order_date", and "amount". We want to add 30 days to the "order_date" column
and return the new date.

We can use the following query:
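
A sketch in SQL Server syntax, where DATEADD is a built-in function:

```sql
SELECT order_id,
       order_date,
       DATEADD(day, 30, order_date) AS new_date
FROM orders;
```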



This will return a new column called "new_date" containing each "order_date"
value with 30 days added.

As another example, let's say we want to add 3 months to the "order_date" column
and return the new date.

We can use the following query:
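
The same pattern with a different interval, again as a SQL Server sketch:

```sql
SELECT order_id,
       order_date,
       DATEADD(month, 3, order_date) AS new_date
FROM orders;
```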

This will return a new column called "new_date" containing each "order_date"
value with 3 months added.

DATEDIFF

The DATEDIFF function in SQL calculates the difference between two dates in
a specified time interval.

The basic syntax of the DATEDIFF function is DATEDIFF(interval, start_date,
end_date).

Where:

• interval: the unit in which you want to express the difference, such as year,
month, day, hour, minute, etc.

• start_date: the starting date.

• end_date: the ending date.

Example 1: Let's say we have a table called "orders" with columns "order_id",
"order_date", and "delivery_date". We want to calculate the difference between
the "order_date" and "delivery_date" columns in days.

We can use the following query:
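
A sketch in SQL Server syntax, using the columns named above:

```sql
SELECT order_id,
       DATEDIFF(day, order_date, delivery_date) AS days_to_delivery
FROM orders;
```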

This will return a new column called "days_to_delivery" with the number of days
it took for each order to be delivered.

Example 2: Let's say we want to calculate the difference between the
"order_date" and "delivery_date" columns in months.

We can use the following query:
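
The same query with month as the interval, again as a SQL Server sketch:

```sql
SELECT order_id,
       DATEDIFF(month, order_date, delivery_date) AS months_to_delivery
FROM orders;
```

In SQL Server, DATEDIFF counts the month boundaries crossed rather than full months elapsed, so an order placed on January 31 and delivered on February 1 yields 1.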

This will return a new column called "months_to_delivery" with the number of
months it took for each order to be delivered.

DATEPART

The DATEPART function in SQL returns a specific part of a date, such as year,
month, day, etc. The basic syntax of the DATEPART function is
DATEPART(interval, date).

Where:

• interval: the part of the date you want to return, such as year, month, day,
etc.

• date: the date you want to extract the part from.

For example, let's say we have a table called "orders" with columns "order_id",
"order_date", and "amount". We want to extract the year from the "order_date"
column.

We can use the following query:
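
A sketch in SQL Server syntax, using the columns named above:

```sql
SELECT order_id,
       DATEPART(year, order_date) AS order_year
FROM orders;
```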

This will return a new column called "order_year" with the year value extracted
from the "order_date".

CHAPTER TWELVE

IF AND SWITCH STATEMENT

IF and CASE are conditional constructs in SQL that allow you to execute
different logic based on certain conditions. SQL has no SWITCH keyword; the CASE
expression fills the role that SWITCH plays in other languages.

Here are the explanations and examples of each:

IF STATEMENT

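The IF statement allows you to execute a block of code if a certain condition is
met, and another block of code if the condition is not met. It belongs to the
procedural extensions of SQL (for example, T-SQL in SQL Server) rather than to
plain queries.

The syntax for the IF statement is IF condition followed by the statement to run
when the condition is true, optionally followed by ELSE and the statement to run
when it is false.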

Example 1: Suppose we have a table called "employees" with the columns "id",
"name", "age" and "salary". We want to update the salary of an employee with
id=1000 based on their age. If the age is less than 30, we want to increase their
salary by 10%, otherwise, we want to increase it by 5%.
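
A sketch of this logic in T-SQL (SQL Server). The age threshold and the two rates come from the description above; reading the age with a subquery is just one possible way to obtain it:

```sql
-- Raise the salary of employee 1000 by 10% if younger than 30, else by 5%.
IF (SELECT age FROM employees WHERE id = 1000) < 30
    UPDATE employees SET salary = salary * 1.10 WHERE id = 1000;
ELSE
    UPDATE employees SET salary = salary * 1.05 WHERE id = 1000;
```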

Example 2: Suppose we have a table called "orders" with the columns "id",
"customer_name", "order_date" and "total_amount". We want to display a
message "You have a discount of 10%" if the total amount of an order is greater
than $1000.
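
A T-SQL sketch of this check. The variable @order_id and its value are hypothetical; they only stand in for whichever order is being examined:

```sql
DECLARE @order_id INT = 1;  -- hypothetical order to check

IF (SELECT total_amount FROM orders WHERE id = @order_id) > 1000
    PRINT 'You have a discount of 10%';
```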

SWITCH STATEMENT

SQL provides SWITCH-style logic through the CASE expression, which tests a
series of conditions in order and returns the result for the first condition
that is true.

The syntax is CASE WHEN condition1 THEN result1 WHEN condition2 THEN result2
ELSE default_result END.

Example 1: You can use the CASE statement to classify customers based on their
total purchase amount:
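
A sketch of such a classification. The table name customers, the column names, and both cutoff amounts are assumptions made for illustration; only the three labels come from the text:

```sql
SELECT customer_name,
       total_purchase_amount,
       CASE
           WHEN total_purchase_amount >= 10000 THEN 'Platinum'  -- assumed cutoff
           WHEN total_purchase_amount >= 5000  THEN 'Gold'      -- assumed cutoff
           ELSE 'Regular'
       END AS customer_type
FROM customers;
```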

This query returns the customer name, total purchase amount, and customer type
(Platinum, Gold, or Regular) based on their purchase amount.

Example 2: Suppose we have a table called employees with columns
employee_id, first_name, last_name, and salary. We want to categorize
employees based on their salary range.

We can use the CASE statement to achieve this.
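
A sketch of such a query. Because CASE stops at the first condition that is true, the second branch only needs an upper bound:

```sql
SELECT employee_id, first_name, last_name, salary,
       CASE
           WHEN salary < 50000 THEN 'Low'
           WHEN salary < 80000 THEN 'Medium'   -- 50,000 to 79,999
           ELSE 'High'                         -- 80,000 or more
       END AS "Salary Range"
FROM employees;
```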

In this example, we are categorizing employees into three salary ranges: Low,
Medium, and High. If an employee's salary is less than 50,000, they will be
categorized as Low. If their salary is between 50,000 and 79,999, they will be
categorized as Medium. If their salary is 80,000 or more, they will be categorized
as High.

The result of this query will be a table with the columns employee_id, first_name,
last_name, salary, and Salary Range, where each row represents an employee and
their corresponding salary range.

CHAPTER THIRTEEN

COMBINING QUERIES

This chapter covers how to combine multiple SQL queries into a single query. It
includes examples of how to use the UNION, INTERSECT, and EXCEPT
operators to combine data from multiple tables, as well as how to use subqueries
to combine data from related tables.

UNION

The UNION operator combines the result sets of two or more SELECT
statements into a single result set. The SELECT statements must have the same
number of columns and compatible data types.

The basic syntax of the UNION operator is SELECT columns FROM table1 UNION
SELECT columns FROM table2.

Example 1: Let's say we have two tables: "students1" and "students2". Both
tables have the same columns: "name", "age", and "gender". To retrieve a single
list of all the students from both tables, we can use the following query:
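
A minimal sketch, using the tables and columns named above:

```sql
SELECT name, age, gender FROM students1
UNION
SELECT name, age, gender FROM students2;
```

Using UNION ALL instead of UNION would keep duplicate rows.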

This will return a single list of all the students from both tables, with duplicate
entries removed.

Example 2: Let's say we have two tables: "books1" and "books2". Both tables
have the same columns: "title", "author", and "year". To retrieve a single list of
all the books from both tables, we can use the following query:
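
A minimal sketch, using the tables and columns named above:

```sql
SELECT title, author, year FROM books1
UNION
SELECT title, author, year FROM books2;
```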

This will return a single list of all the books from both tables, with duplicate
entries removed.

INTERSECT

The INTERSECT operator returns only the common rows between the result sets
of two or more SELECT statements. The SELECT statements must have the same
number of columns and compatible data types.

The basic syntax of the INTERSECT operator is SELECT columns FROM table1
INTERSECT SELECT columns FROM table2.

Example 1: Let's say we have two tables: "students1" and "students2". Both
tables have the same columns: "name", "age", and "gender". To retrieve a list of
all the students who are in both tables, we can use the following query:
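
A minimal sketch, using the tables and columns named above. A student counts as being in both tables only if name, age, and gender all match:

```sql
SELECT name, age, gender FROM students1
INTERSECT
SELECT name, age, gender FROM students2;
```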

This will return a list of all the students who are in both tables.

Example 2: Let's say we have two tables: "books1" and "books2". Both tables
have the same columns: "title", "author", and "year". To retrieve a list of all the
books that are in both tables, we can use the following query:
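
A minimal sketch, using the tables and columns named above:

```sql
SELECT title, author, year FROM books1
INTERSECT
SELECT title, author, year FROM books2;
```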

This will return a list of all the books that are in both tables.

EXCEPT

EXCEPT is a set operator in SQL that returns the distinct rows that are present in
the result set of the first query but not in the result set of the second query. In
other words, EXCEPT performs a subtraction operation between two sets of
results.

The basic syntax for using EXCEPT is SELECT columns FROM table1 EXCEPT
SELECT columns FROM table2.



Example 1: Let's say we have two tables called "employees" and "managers".
The "employees" table contains columns "employee_id", "first_name",
"last_name", and "department", while the "managers" table contains columns
"manager_id", "first_name", "last_name", and "department". To retrieve all the
employees who are not also managers, we can use the following query:
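
A sketch of such a query. It compares only the three columns the two tables share, on the assumption that "employee_id" and "manager_id" are unrelated numbering schemes:

```sql
SELECT first_name, last_name, department FROM employees
EXCEPT
SELECT first_name, last_name, department FROM managers;
```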

This will return all the rows in the "employees" table that are not present in the
"managers" table.

Example 2: Let's say we have two tables called "table1" and "table2". The
"table1" table contains columns "column1", "column2", and "column3", while the
"table2" table contains columns "column1", "column2", and "column3".

To retrieve all the distinct rows in "table1" that are not present in "table2", we can
use the following query:
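
A minimal sketch, using the tables and columns named above:

```sql
SELECT column1, column2, column3 FROM table1
EXCEPT
SELECT column1, column2, column3 FROM table2;
```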

This will return all the distinct rows in "table1" that are not present in "table2".

CHAPTER FOURTEEN

INSERTING, UPDATING AND DELETING DATA

This chapter covers how to insert, update, and delete data in a database using
SQL. It includes examples of how to use the INSERT statement to add new data
to a table, how to use the UPDATE statement to modify existing data, and how
to use the DELETE statement to remove data from a table.

INSERTING DATA

The INSERT INTO statement is used to insert new rows into a table.

The basic syntax of the INSERT INTO statement is INSERT INTO table_name
(column1, column2, ...) VALUES (value1, value2, ...).

Example 1: Let's say we have a table called "employees" with columns
"employee_id", "first_name", "last_name", and "salary". To insert a new row into
the table with values "1001", "John", "Doe", and "50000", we can use the
following query:
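
A minimal sketch, assuming "employee_id" and "salary" are numeric columns and the two name columns hold text:

```sql
INSERT INTO employees (employee_id, first_name, last_name, salary)
VALUES (1001, 'John', 'Doe', 50000);
```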

This will insert a new row into the "employees" table with the specified values.

Example 2: Let's say we want to insert multiple rows into the "employees" table
at once.

We can use the following query:
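
A sketch of a multi-row insert. The three rows below are made-up sample values, not data from the original example:

```sql
INSERT INTO employees (employee_id, first_name, last_name, salary)
VALUES (1002, 'Jane',  'Smith',   55000),
       (1003, 'Bob',   'Johnson', 48000),
       (1004, 'Alice', 'Brown',   62000);
```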

This will insert three new rows into the "employees" table with the specified
values.

UPDATING DATA

The UPDATE statement is used to modify existing rows in a table.

The basic syntax of the UPDATE statement is UPDATE table_name SET column1 =
value1, column2 = value2 WHERE condition.

Example 1: Let's say we want to update the salary of the employee with
employee_id 1001 to 60000 in the "employees" table.

We can use the following query:
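
A minimal sketch, using the table and columns named above:

```sql
UPDATE employees
SET salary = 60000
WHERE employee_id = 1001;
```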

This will update the salary of the employee with employee_id 1001 to 60000 in
the "employees" table.

Example 2: Let's say we want to update the first name of all the employees whose
last name is "Smith" to "Samantha".

We can use the following query:
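
A minimal sketch, using the table and columns named above:

```sql
UPDATE employees
SET first_name = 'Samantha'
WHERE last_name = 'Smith';
```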

This will update the first name of all the employees whose last name is "Smith"
to "Samantha" in the "employees" table.

DELETING DATA

The DELETE statement is used to delete existing rows from a table.

The basic syntax of the DELETE statement is DELETE FROM table_name WHERE
condition.

Example 1: Let's say we want to delete the row for the employee with
employee_id 1001 from the "employees" table.

We can use the following query:
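
A minimal sketch, using the table and column named above:

```sql
DELETE FROM employees
WHERE employee_id = 1001;
```

If the WHERE clause is left out, DELETE removes every row in the table.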

This will delete the row for the employee with employee_id 1001 from the
"employees" table.

Example 2: Let's say we want to delete all the rows from the "employees" table
where the salary is less than 50000.

We can use the following query:
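
A minimal sketch, using the table and column named above:

```sql
DELETE FROM employees
WHERE salary < 50000;
```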



This will delete all the rows from the "employees" table where the salary is less
than 50000.

CHAPTER FIFTEEN

USING VIEWS

This chapter covers how to create and use views in SQL. It includes examples of
how to create views using the CREATE VIEW statement, how to use views to
simplify data retrieval, and how to use views to enforce security policies.

In SQL, a view is a virtual table that is based on the result of a SELECT statement.
It is a saved SQL query that can be used to simplify complex queries and make
them more readable.

Views are created, modified, and deleted with the CREATE VIEW, ALTER VIEW, and
DROP VIEW statements. Here's an explanation of each, with examples:

Creating a view using a SELECT statement: A view can be created by using a
SELECT statement with the CREATE VIEW statement. Once a view is created,
it can be used like a table in subsequent SQL queries.

Here's an example:
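
A minimal sketch, using the view name, table, and columns described in the text:

```sql
CREATE VIEW customer_info AS
SELECT customer_id, first_name, last_name, email, phone
FROM customers;
```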

In this example, we create a view called "customer_info" that includes the
columns "customer_id", "first_name", "last_name", "email", and "phone" from
the "customers" table. Now we can use this view to simplify queries that involve
these columns, like this:
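
A minimal sketch of a query against the view:

```sql
SELECT *
FROM customer_info
WHERE email LIKE '%gmail.com';   -- '%' matches any leading characters
```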

In this query, we use the "customer_info" view to retrieve all rows where the
email address ends with "gmail.com".

ALTER VIEW

A view can be modified using the ALTER VIEW statement. This statement
allows you to modify the SELECT statement that defines the view.

Here's an example:
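
A sketch in the form accepted by SQL Server and MySQL. The table aliases c and a are illustrative. PostgreSQL does not redefine a view with ALTER VIEW; there, CREATE OR REPLACE VIEW is used instead:

```sql
ALTER VIEW customer_info AS
SELECT c.customer_id, c.first_name, c.last_name, c.email, c.phone,
       a.address
FROM customers AS c
INNER JOIN addresses AS a
        ON a.customer_id = c.customer_id;
```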

In this example, we modify the "customer_info" view to include the "address"
column from the "addresses" table. We use an INNER JOIN to join the
"customers" and "addresses" tables based on the "customer_id" column. Now we
can use this modified view to simplify queries that involve these columns, like
this:
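
A minimal sketch of a query against the modified view:

```sql
SELECT *
FROM customer_info
WHERE address LIKE '%Main Street%';
```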

In this query, we use the modified "customer_info" view to retrieve all rows
where the address contains the string "Main Street".

DROP VIEW

A view can be deleted using the DROP VIEW statement. This statement removes
the view from the database.

Here's an example:
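
A minimal sketch:

```sql
DROP VIEW customer_info;
```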

In this example, we delete the "customer_info" view from the database. Once the
view is deleted, it can no longer be used in SQL queries.

CHAPTER SIXTEEN

MANAGING DATABASES

This chapter covers how to manage databases in SQL, including creating,
modifying, and dropping databases. It includes examples of how to use the
CREATE DATABASE statement to create a new database, how to use the
ALTER DATABASE statement to modify an existing database, and how to use
the DROP DATABASE statement to delete a database.

CREATE STATEMENT

The CREATE statement is used to create a new database, table, view, or other
database object.

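The basic syntax is CREATE followed by the type and name of the object, such as
CREATE DATABASE database_name or CREATE TABLE table_name (column definitions).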

CREATE DATABASE:

For example, let's say we want to create a new database called "sales".

We can use the following query:
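
A minimal sketch:

```sql
CREATE DATABASE sales;
```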

CREATE TABLE:

For example, let's say we want to create a new table called "customers" with
columns "customer_id", "customer_name", and "customer_email".

We can use the following query:
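
A sketch of such a statement. The data types and the primary key are assumptions; VARCHAR(50) for the email column is chosen to agree with the ALTER COLUMN example later in this chapter:

```sql
CREATE TABLE customers (
    customer_id    INT PRIMARY KEY,   -- assumed key
    customer_name  VARCHAR(100),      -- assumed length
    customer_email VARCHAR(50)
);
```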

ALTER STATEMENT

The ALTER statement is used to modify the structure of an existing database,
table, or other database objects.

For a table, the basic syntax is ALTER TABLE table_name followed by the change
to make, such as adding a column or changing a column's data type.

For example, let's say we want to add a new column called "customer_phone" to
the "customers" table.

We can use the following query:
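
A sketch of such a statement. The data type VARCHAR(20) is an assumption:

```sql
ALTER TABLE customers
ADD customer_phone VARCHAR(20);
```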

ALTER COLUMN

For example, let's say we want to change the data type of the "customer_email"
column from VARCHAR(50) to VARCHAR(100) in the "customers" table.

We can use the following query:
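
A sketch in SQL Server syntax. Other systems spell this differently: MySQL uses MODIFY COLUMN, and PostgreSQL uses ALTER COLUMN customer_email TYPE VARCHAR(100):

```sql
ALTER TABLE customers
ALTER COLUMN customer_email VARCHAR(100);
```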

DROP

The DROP statement is used to delete an existing database, table, view, or other
database object.

The basic syntax is DROP followed by the type and name of the object, such as
DROP DATABASE database_name or DROP TABLE table_name.

DROP DATABASE

For example, let's say we want to delete the "sales" database.

We can use the following query:
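
A minimal sketch. Dropping a database permanently removes it together with all of its tables and data:

```sql
DROP DATABASE sales;
```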

DROP TABLE

For example, let's say we want to delete the "customers" table.

We can use the following query:
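
A minimal sketch:

```sql
DROP TABLE customers;
```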



CHAPTER SEVENTEEN

NORMALIZATION

Normalization is used to organize data in a database to reduce data redundancy
and increase data integrity. This process involves breaking down larger tables
into smaller ones and creating relationships between them based on their
attributes.

There are several levels of normalization, with each level building upon the
previous one to achieve a more efficient and organized database. Here are the
main levels of normalization:

1. First Normal Form (1NF): In 1NF, every column in a table must have
atomic values, meaning that each value should be indivisible. For example,
a phone number should be stored in separate columns for area code, prefix,
and line number, rather than as a single column.

2. Second Normal Form (2NF): In 2NF, a table must be in 1NF and each non-
key column must be fully dependent on the primary key. For example, if
you have a table that contains information about orders and order details,
the order details table should have a separate primary key and only contain
information related to the order.

3. Third Normal Form (3NF): In 3NF, a table must be in 2NF and there should
be no transitive dependencies. A transitive dependency is when a non-key
column depends on another non-key column rather than directly on the
primary key. For example, if you have a table that contains information
about customers and their addresses, the state column should be moved to a
separate table keyed by zip code, since the state depends on the zip code,
which is itself a non-key column of the customers table.

Let's consider an example to illustrate normalization in SQL. Suppose we have a
table called "employees" with the following columns: employee_id, first_name,
last_name, address, city, state, zip_code, phone_number, department_id, and
salary.

The table is not in 1NF since the address and phone_number columns are not
atomic values. We can fix this by creating separate tables for address and
phone_number:
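
One possible decomposition is sketched below. The names employee_addresses and employee_phones, the surrogate keys, and all data types are assumptions made for illustration:

```sql
CREATE TABLE employees (
    employee_id   INT PRIMARY KEY,
    first_name    VARCHAR(50),
    last_name     VARCHAR(50),
    department_id INT,
    salary        DECIMAL(10, 2)
);

-- One row per address, linked back to the employee.
CREATE TABLE employee_addresses (
    address_id  INT PRIMARY KEY,
    employee_id INT REFERENCES employees (employee_id),
    address     VARCHAR(100),
    city        VARCHAR(50),
    state       VARCHAR(50),
    zip_code    VARCHAR(10)
);

-- One row per phone number, so an employee can have several.
CREATE TABLE employee_phones (
    phone_id     INT PRIMARY KEY,
    employee_id  INT REFERENCES employees (employee_id),
    phone_number VARCHAR(20)
);
```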

Now the employees table is in 1NF, but it still has some issues with 2NF and
3NF. The department_id column is not fully dependent on the primary key, so we
can create a separate table for departments:
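
A sketch of this step. The department_name column and the constraint name are illustrative assumptions:

```sql
CREATE TABLE departments (
    department_id   INT PRIMARY KEY,
    department_name VARCHAR(100)
);

-- employees keeps department_id, now as a reference to departments.
ALTER TABLE employees
ADD CONSTRAINT fk_employees_department
    FOREIGN KEY (department_id) REFERENCES departments (department_id);
```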

Finally, the state column is dependent on the zip_code column, so we can create
a separate table for zip codes:
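
A sketch of this step, with assumed data types. The table that holds the address would then keep only zip_code and look up the state through this table:

```sql
CREATE TABLE zip_codes (
    zip_code VARCHAR(10) PRIMARY KEY,
    state    VARCHAR(50)
);
```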

EXCEL

CHAPTER ONE

EXCEL TERMINOLOGY

Here are some terms commonly used in Excel:

1. Workbook: A workbook is a file that contains one or more worksheets.

2. Worksheet: A worksheet is a single page or sheet in an Excel workbook
where data is entered, analyzed and manipulated.

3. Cell: A cell is a single rectangular box on a worksheet where data can be
entered or displayed.

4. Range: A range is a group of two or more adjacent cells in a worksheet.

5. Formula: A formula is an equation that performs calculations on values in
one or more cells and returns a result.

6. Function: A function is a predefined formula that performs a specific
calculation. Examples of functions in Excel include SUM, AVERAGE,
MAX, MIN, etc.

7. Chart: A chart is a graphical representation of data from a worksheet. It is
used to visualize and communicate data in a clear and concise manner.

8. Pivot table: A pivot table is a powerful tool in Excel used for summarizing,
analyzing, and presenting large amounts of data in a concise and
meaningful way.

9. Conditional formatting: Conditional formatting is a feature in Excel that
allows you to apply formatting to cells based on certain conditions or
criteria.

10. Data validation: Data validation is a feature in Excel used to restrict or
control the type of data that can be entered into a cell.

CHAPTER TWO

MANAGE DATA IN CELLS AND RANGES

This chapter covers how to manipulate data in cells and ranges, including sorting,
filtering, and calculating with formulas.

Sorting, filtering, and calculating with formulas are some of the most powerful
features in Microsoft Excel.

Here are some examples of how to use these features:

SORTING

Sorting allows you to arrange data in a specific order based on certain criteria. To
sort data in Excel, select the range of cells you want to sort and click the "Sort &
Filter" button on the "Home" tab. You can then select "Sort A to Z" or "Sort Z to
A" to sort the data in ascending or descending order based on the first column.
Alternatively, you can select "Custom Sort" to sort by multiple columns or to sort
by a specific criterion, such as numerical or alphabetical order.

Example 1: Sort a list of names alphabetically.

1. Select the range of cells containing the names you want to sort.

2. Click the "Sort & Filter" button on the "Home" tab.

3. Select "Sort A to Z" to sort the names in alphabetical order.

Example 2: Sort a list of sales figures by month.

1. Select the range of cells containing the sales figures you want to sort.

2. Click the "Sort & Filter" button on the "Home" tab.

3. Select "Custom Sort" and select "Month" as the first sorting column.

4. Select "Sort A to Z" to sort the sales figures by month in ascending order.

FILTERING

Filtering allows you to selectively display only certain data based on specific
criteria. To filter data in Excel, select the range of cells you want to filter and
click the "Filter" button on the "Data" tab. You can then select the criteria you
want to filter by, such as text, numbers, or dates.

Here are two examples:

Example 1: Filter a list of orders by customer name.

1. Select the range of cells containing the orders you want to filter.

2. Click the "Filter" button on the "Data" tab.

3. Click the drop-down arrow in the "Customer Name" column header.

4. Select the customer name you want to filter by.

Example 2: Filter a list of sales figures by date range.

1. Select the range of cells containing the sales figures you want to filter.

2. Click the "Filter" button on the "Data" tab.

3. Click the drop-down arrow in the "Date" column header.

4. Select "Date Filters" and select "Between."

5. Enter the start and end dates of the range you want to filter by.

CALCULATING WITH FORMULAS

Formulas allow you to perform calculations on data in Excel. To create a formula
in Excel, start by typing an equals sign (=) followed by the formula you want to
use. For example, to add two cells together, type "=A1+B1". Here are two
examples:

Example 1: Calculate the average of a list of numbers.



1. Select the cell where you want to display the average.

2. Type "=AVERAGE(A1:A10)" to calculate the average of cells A1 through
A10.

Example 2: Calculate the total sales for a specific product.

1. Select the cell where you want to display the total sales.

2. Type "=SUMIF(B:B,"Product A",C:C)" to calculate the sum of all sales
for Product A in column C.

CHAPTER THREE

IF, IFS AND SWITCH IN EXCEL

In Excel, the IF, IFS and SWITCH functions are used to perform conditional
operations.

The IF function allows you to specify a logical test and provide a value to be
returned if the test is true, and another value to be returned if the test is false.

The SWITCH function is similar to the IF function but allows you to specify
multiple conditions and corresponding values to return based on those conditions.

IF FUNCTION

The IF function is a logical function that tests a specified condition and returns
one value if the condition is TRUE, and another value if the condition is FALSE.

Example 1: Suppose we want to calculate a bonus for employees based on their
sales. If an employee's sales are greater than $10,000, they will receive a 5%
bonus, otherwise they will receive no bonus.

We can use the IF function to accomplish this with the following formula:
=IF(B2>10000, B2*0.05, 0)

In this example, B2 is the cell containing the sales amount for the employee. If
the sales amount is greater than 10,000, the formula will return 5% of the sales
amount, otherwise it will return 0.

Example 2: Suppose we want to categorize students' grades based on their test
scores. If a student's test score is greater than or equal to 90, they will receive an
"A" grade, if it is between 80 and 89 they will receive a "B" grade, and if it is less
than 80 they will receive a "C" grade.

We can use the IF function to accomplish this by nesting one IF inside another:
=IF(A2>=90, "A", IF(A2>=80, "B", "C"))

In this example, A2 is the cell containing the test score for the student. If the test
score is greater than or equal to 90, the formula will return "A", otherwise it will
check if it is greater than or equal to 80, and if so it will return "B", otherwise it
will return "C".

IFS

The IFS function in Excel checks multiple conditions in order and returns the
value that corresponds to the first condition that is TRUE. Its arguments are
pairs of conditions and values. If no condition is true, it returns the #N/A
error.

Syntax: =IFS(logical_test1, value_if_true1, [logical_test2, value_if_true2], ...)

Example 1: Suppose we have a list of scores of students and we want to assign
grades based on the score. If the score is greater than or equal to 90, the grade
will be A, if it is between 80 and 89, the grade will be B, and so on. We can use
the IFS function to assign the grades based on the scores.

In this example, we use the IFS function to check the conditions for each grade
and assign a corresponding value.

Example 2: Suppose we have a list of products and their prices, and we want to
apply a discount based on the price range. If the price is less than $10, there is no
discount, if it is between $10 and $20, the discount will be 5%, and if it is greater
than $20, the discount will be 10%. We can use the IFS function to calculate the
discounted price based on the price range.

In this example, we use the IFS function to check the price range and apply the
corresponding discount to calculate the discounted price.

Note: In Excel 2016 and later versions, we can use the SWITCH function as an
alternative to the IFS function.

SWITCH FUNCTION

The SWITCH function is used to evaluate an expression against a list of cases
and return a corresponding result based on the first match.

Example 1: Suppose we want to calculate a commission for salespeople based
on their sales. If a salesperson's sales are less than $5,000, they will receive a 1%
commission, if it is between $5,000 and $10,000 they will receive a 2%
commission, and if it is greater than $10,000 they will receive a 3% commission.

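We can use the SWITCH function to accomplish this. Because SWITCH compares its
first argument with each case for an exact match, range tests are written by
switching on TRUE, for example:
=SWITCH(TRUE, B2<5000, B2*0.01, B2<=10000, B2*0.02, B2*0.03)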

In this example, B2 is the cell containing the sales amount for the salesperson.
The SWITCH function checks the sales amount against each condition in order
and returns the corresponding commission rate multiplied by the sales amount.

Example 2: Suppose we want to categorize a list of countries based on their
continents.

We can use the SWITCH function to accomplish this as follows:

In this example, A2 is the cell containing the country name. The SWITCH
function checks the country name against each condition in order, and returns the
corresponding continent name. If the country is not recognized, it will return
"Unknown".

CHAPTER FOUR

EXCEL FUNCTIONS

This chapter covers creating and using the major functions you need to master in
Excel, including using built-in functions and creating custom functions.

SUM FUNCTION

The SUM function adds up a range of cells and returns the total. It's useful for
quickly calculating the total of a list of numbers.

Example 1: If you have a list of numbers in cells A1 to A5 and you want to find
the sum of those numbers, you can use the formula "=SUM(A1:A5)".

Example 2: You can also use the SUM function to add up multiple ranges of
cells. For example, if you have a list of numbers in cells A1 to A5 and another
list of numbers in cells B1 to B5, you can find the total of both lists by using the
formula "=SUM(A1:A5, B1:B5)".

AVERAGE FUNCTION

The AVERAGE function calculates the average (arithmetic mean) of a range of
cells. It's useful for finding the average value of a list of numbers.

Example 1: If you have a list of numbers in cells A1 to A5 and you want to find
the average of those numbers, you can use the formula "=AVERAGE(A1:A5)".

Example 2: You can also use the AVERAGE function to find the average of
multiple ranges of cells. For example, if you have a list of numbers in cells A1 to
A5 and another list of numbers in cells B1 to B5, you can find the average of both
lists by using the following formula:
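
Assuming the ranges described above, these formulas would typically be written
=AVERAGE(A1:A5) and =AVERAGE(A1:A5, B1:B5). Note that the second form takes one mean
over all ten values rather than averaging two averages. A small Python sketch with
hypothetical sample values:

```python
from statistics import mean

a = [10, 20, 30, 40, 50]   # hypothetical values in A1:A5
b = [1, 2, 3, 4, 5]        # hypothetical values in B1:B5

average_a = mean(a)          # like =AVERAGE(A1:A5)
average_both = mean(a + b)   # like =AVERAGE(A1:A5, B1:B5): one mean over all ten values
```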

COUNT FUNCTION

The COUNT function counts the number of cells in a range that contain numbers.
It's useful for quickly determining the size of a list of numbers.

Example 1: If you have a list of numbers in cells A1 to A5 and you want to find
out how many numbers are in the list, you can use the COUNT function as
follows:

Example 2: You can also use the COUNT function to count the number of cells
in multiple ranges of cells. For example, if you have a list of numbers in cells A1
to A5 and another list of numbers in cells B1 to B5, you can find the total number
of numbers in both lists by using the following formula:
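
Assuming the ranges described above, these formulas would typically be written
=COUNT(A1:A5) and =COUNT(A1:A5, B1:B5). COUNT skips text and empty cells, which the
following Python sketch imitates; the helper name count_numbers and the sample values
are hypothetical.

```python
def count_numbers(*ranges):
    """Count the numeric cells across one or more ranges, ignoring text and blanks."""
    return sum(
        1
        for cells in ranges
        for cell in cells
        if isinstance(cell, (int, float)) and not isinstance(cell, bool)
    )

a = [10, "n/a", 30, None, 50]   # hypothetical A1:A5, with one text cell and one blank
b = [1, 2, "", 4, 5]            # hypothetical B1:B5, with one empty string

count_a = count_numbers(a)
count_both = count_numbers(a, b)
```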

MAX FUNCTION

The MAX function returns the highest value in a range of cells. It's useful for
quickly finding the maximum value in a list of numbers.

Example 1: If you have a list of numbers in cells A1 to A5 and you want to find
the highest number in the list, you can use the MAX function as follows:

Example 2: You can also use the MAX function to find the highest value in
multiple ranges of cells. For example, if you have a list of numbers in cells A1 to
A5 and another list of numbers in cells B1 to B5, you can find the highest value
in both lists by using the following formula:
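
Assuming the ranges described above, these formulas would typically be written
=MAX(A1:A5) and =MAX(A1:A5, B1:B5). The equivalent in Python, with hypothetical sample
values:

```python
a = [10, 20, 30, 40, 50]   # hypothetical values in A1:A5
b = [1, 2, 75, 4, 5]       # hypothetical values in B1:B5

highest_a = max(a)          # like =MAX(A1:A5)
highest_both = max(a + b)   # like =MAX(A1:A5, B1:B5)
```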

MIN FUNCTION

The MIN function in Excel is used to find the minimum value in a range of cells.
It is a very useful function when dealing with large amounts of data.

The syntax of the MIN function is:

Here, the range refers to the range of cells from which you want to find the
minimum value.

Example 1: Suppose you have a list of numbers in cells A1 to A10 and you want
to find the minimum value from this range.

You can use the MIN function as follows:

Example 2: Let's say you have a list of sales data in cells B2 to B20 and you
want to find the minimum sales amount.

You can use the MIN function as follows:
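
In the terms used above, the syntax can be written as =MIN(range), and the two examples
would typically be =MIN(A1:A10) and =MIN(B2:B20). A short Python sketch of the same
idea, using hypothetical sales figures:

```python
sales = [420.0, 310.5, 990.0, 125.25, 640.0]   # hypothetical values from B2:B6

lowest_sale = min(sales)   # like =MIN(B2:B6)
```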


CONCATENATE FUNCTION

The CONCATENATE function in Excel is used to join two or more text strings
into a single string.

The syntax of the CONCATENATE function is:

Here, text1 and text2 are the text strings you want to join.

Example 1: Suppose you have two cells, A1 and B1, containing the text "Hello"
and "World" respectively, and you want to join them into a single string.

You can use the CONCATENATE function as follows:

Example 2: Let's say you have a list of names in cells A1 to A10 and you want
to create a full name column by joining the first name and last name.

You can use the CONCATENATE function as follows:
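
The syntax is generally written =CONCATENATE(text1, text2, ...). CONCATENATE adds no
separator of its own, so a formula such as =CONCATENATE(A1, " ", B1) is one way to get
"Hello World" in Example 1. For Example 2, the same formula filled down would work if
first names are in column A and last names in column B, which is an assumption about the
layout. A Python sketch under those assumptions:

```python
def concatenate(*parts):
    """Join text values end to end, with no separator added, like CONCATENATE."""
    return "".join(str(part) for part in parts)

greeting = concatenate("Hello", " ", "World")

first_names = ["Ada", "Alan"]          # hypothetical column A
last_names = ["Lovelace", "Turing"]    # hypothetical column B
full_names = [
    concatenate(first, " ", last) for first, last in zip(first_names, last_names)
]
```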

LEFT FUNCTION

The LEFT function in Excel is used to extract a specified number of characters from the
beginning of a text string.

The syntax of the LEFT function is:


Here, text is the text string from which you want to extract characters and
num_chars is the number of characters you want to extract.

Example 1: Suppose you have a cell A1 containing the text "Excel is awesome"
and you want to extract the first 5 characters from this text string.

You can use the LEFT function as follows:

Example 2: Let's say you have a list of email addresses in cells B2 to B20 and
you want to extract the username from these email addresses.

You can use the LEFT function as follows:
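
The syntax is =LEFT(text, num_chars), so Example 1 would typically be =LEFT(A1, 5),
which returns "Excel". For Example 2 the username length varies, so one common approach
(an assumption here) is to combine LEFT with FIND: =LEFT(B2, FIND("@", B2) - 1). A Python
sketch of both, with illustrative helper names:

```python
def left(text, num_chars):
    """Return the first num_chars characters of text, like LEFT."""
    return text[:num_chars]

def username(email):
    """Return everything before the first "@", like LEFT(B2, FIND("@", B2) - 1)."""
    return left(email, email.index("@"))

first_five = left("Excel is awesome", 5)
user = username("ada@example.com")   # hypothetical email address
```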

RIGHT FUNCTION

The RIGHT function in Excel is used to extract a specified number of characters from the
end of a text string.

The syntax of the RIGHT function is:

Here, text is the text string from which you want to extract characters and
num_chars is the number of characters you want to extract.

Example 1: Suppose you have a cell A1 containing the text "Excel is awesome"
and you want to extract the last 6 characters from this text string.

You can use the RIGHT function as follows:


Example 2: Let's say you have a list of phone numbers in cells C2 to C20 and
you want to extract the last 4 digits from these phone numbers.

You can use the RIGHT function as follows:
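
The syntax is =RIGHT(text, num_chars), so the two examples would typically be written
=RIGHT(A1, 6) and =RIGHT(C2, 4), with the second filled down through C20. A Python sketch
of the same behavior; the phone number is a made-up sample:

```python
def right(text, num_chars):
    """Return the last num_chars characters of text, like RIGHT."""
    # text[-0:] would return the whole string, so zero is handled separately.
    return text[-num_chars:] if num_chars > 0 else ""

last_six = right("Excel is awesome", 6)
last_four = right("555-867-5309", 4)   # hypothetical phone number
```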

XLOOKUP FUNCTION

The XLOOKUP function is a newer alternative to the VLOOKUP function that allows you to
perform more flexible searches and lookups. It searches a range or array for a value and
returns the corresponding value from a second range or array, working along either a row
or a column.

Here's the syntax for the XLOOKUP function:

=XLOOKUP(lookup_value, lookup_array, return_array, [if_not_found], [match_mode],
[search_mode])

Let's break down each of these arguments:

1. lookup_value - This is the value that you want to look up. It can be a cell
reference or a value enclosed in quotation marks.

2. lookup_array - This is the range of cells or array where you want to search
for the lookup value.

3. return_array - This is the range of cells or array from which you want to
return a value that corresponds to the lookup value.

4. [if_not_found] - This is an optional argument that specifies the value to return
when no match is found. If it is omitted and no match is found, the function returns
the #N/A error.

5. [match_mode] - This is an optional argument that specifies the match mode. There
are four match modes you can use:

• 0 or omitted: Exact match. This is the default mode. If there is no exact match,
the function returns #N/A (or the if_not_found value, if one is supplied).

• -1: Exact match or, if none is found, the next smaller item.

• 1: Exact match or, if none is found, the next larger item.

• 2: Wildcard match, in which the characters *, ? and ~ have a special meaning in
the lookup value.

6. [search_mode] - This is an optional argument that specifies the search mode.
There are four search modes you can use:

• 1 or omitted: Search from the first item to the last. This is the default mode.

• -1: Search from the last item to the first.

• 2: Binary search that relies on the lookup array being sorted in ascending order.

• -2: Binary search that relies on the lookup array being sorted in descending
order.

Example 1: Suppose you have a table with a list of products and their
corresponding prices, and you want to find the price for a specific product.

You can use the XLOOKUP function to search for the product name and return
the price from the same row.

In this example, "Product A" is the value being searched for, A2:A10 is the range
containing the product names, and B2:B10 is the range containing the
corresponding prices. The function will return the price for "Product A" if it is
found in the range.

Example 2: Suppose you have a table with a list of countries and their
corresponding GDPs, and you want to find the GDP for a specific country in a
different currency. You can use the XLOOKUP function to search for the country
name and return the GDP from the same row, then convert the currency using a
separate formula.

In this example, "USA" is the value being searched for, A2:A10 is the range
containing the country names, and B2:B10 is the range containing the
corresponding GDPs. The function will return the GDP for "USA" if it is found
in the range, and then the value will be multiplied by 0.85 to convert it to a
different currency.
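
Going by the ranges named in the two explanations, the formulas would typically be
written =XLOOKUP("Product A", A2:A10, B2:B10) and =XLOOKUP("USA", A2:A10, B2:B10)*0.85.
The exact-match behavior can be sketched in Python as follows. The helper name xlookup,
the sample data, and the use of None in place of Excel's #N/A error are all assumptions
made for illustration.

```python
def xlookup(lookup_value, lookup_array, return_array, if_not_found=None):
    """Exact-match lookup: return the item of return_array aligned with the first match."""
    for key, value in zip(lookup_array, return_array):
        if key == lookup_value:
            return value
    return if_not_found   # Excel would show #N/A when no fallback is given

products = ["Product A", "Product B", "Product C"]   # hypothetical A2:A4
prices = [9.99, 14.50, 3.25]                          # hypothetical B2:B4
price_a = xlookup("Product A", products, prices)

countries = ["Canada", "USA"]   # hypothetical A2:A3
gdps = [200.0, 100.0]           # hypothetical B2:B3, made-up figures
gdp_converted = xlookup("USA", countries, gdps) * 0.85
```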

VLOOKUP FUNCTION

The VLOOKUP function is an older lookup function, which XLOOKUP is designed to replace.
It allows you to search for a specific value in the first column of a range and return
a corresponding value from a specified column in the same row.

The syntax for the VLOOKUP function is:

Here's what each of the parameters means:

1. lookup_value: This is the value that you want to search for in the leftmost
column of the table or range. This parameter can be a value, a cell
reference, or a named range.

2. table_array: This is the table or range of cells that you want to search in.
The lookup_value must be in the first column of this table, and the table
must contain at least the column that you want to return a value from.

3. col_index_num: This is the column number in the table_array that you want to
return a value from. For example, if you want to return a value from the third
column of the table_array, the col_index_num would be 3.

4. [range_lookup]: This is an optional parameter that specifies whether the VLOOKUP
function should perform an approximate match or an exact match. If you set this
parameter to TRUE or leave it blank, the function will perform an approximate match
(also known as a "range lookup"): it expects the first column to be sorted in
ascending order and returns the row with the largest value that is less than or
equal to the lookup_value. If you set this parameter to FALSE, the function will
perform an exact match.

Example 1: Suppose you have a table with a list of employees and their
corresponding salaries, and you want to find the salary for a specific employee.
You can use the VLOOKUP function to search for the employee’s name in the
first column of the range and return the salary from a specified column in the
same row.

In this example, "John Doe" is the value being searched for, A2:B10 is the range
containing the employee names and salaries, 2 is the column index of the salary
column, and FALSE is used to indicate that an exact match is required. The
function will return the salary for "John Doe" if it is found in the range.

Example 2: Suppose you have a table with a list of products and their
corresponding categories, and you want to find the category for a specific
product. You can use the VLOOKUP function to search for the product name in
the first column of the range and return the category from a specified column in
the same row.

In this example, "Product A" is the value being searched for, A2:B10 is the range
containing the product names and categories, 2 is the column index of the category
column, and FALSE is used to indicate that an exact match is required. The function
will return the category for "Product A" if it is found in the range.
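
The syntax is =VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup]). Going
by the explanations above, the two examples would typically be written
=VLOOKUP("John Doe", A2:B10, 2, FALSE) and =VLOOKUP("Product A", A2:B10, 2, FALSE). A
minimal Python sketch of the exact-match case follows; the helper name vlookup, the
sample rows, and the use of None in place of #N/A are assumptions.

```python
def vlookup(lookup_value, table, col_index_num):
    """Exact-match VLOOKUP (range_lookup = FALSE) over a list of rows."""
    for row in table:
        if row[0] == lookup_value:          # the search always uses the first column
            return row[col_index_num - 1]   # col_index_num is 1-based, as in Excel
    return None   # Excel would show #N/A here

employees = [("Jane Roe", 52000), ("John Doe", 61000)]   # hypothetical A2:B3
salary = vlookup("John Doe", employees, 2)
```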

AND FUNCTION

The AND function returns TRUE if all the conditions you specify are TRUE, and
returns FALSE if any one of them is FALSE.

The syntax for the AND function is:

where condition1, condition2, etc. are the conditions you want to test.

Example 1: Suppose you have a data set of students and their exam scores in
three subjects - Math, English, and Science. You want to find out which students
have scored above 80 in all three subjects. You can use the AND function to
create a formula that checks if each score is greater than 80, and returns TRUE if
all conditions are met, otherwise FALSE.

Formula:

Explanation: This formula checks if the score in cell B2 is greater than 80, and if
the score in cell C2 is greater than 80, and if the score in cell D2 is greater than
80. If all three conditions are TRUE, the function returns TRUE, otherwise
FALSE.

Example 2: Suppose you have a data set of employees and their salaries. You
want to find out which employees have a salary greater than $50,000 and work in
the Sales or Marketing department. You can use the AND function to create a
formula that checks if each condition is met, and returns TRUE if both conditions
are TRUE, otherwise FALSE.

Formula:

Explanation: This formula checks if the salary in cell B2 is greater than $50,000,
and if the department in cell C2 is either Sales or Marketing. If both conditions
are TRUE, the function returns TRUE, otherwise FALSE.
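
The syntax is =AND(condition1, condition2, ...). Based on the two explanations, the
formulas would typically be written =AND(B2>80, C2>80, D2>80) and =AND(B2>50000,
OR(C2="Sales", C2="Marketing")). The same tests can be sketched in Python; the function
names are illustrative only.

```python
def passed_all(math, english, science):
    """TRUE only if every score is above 80, like AND(B2>80, C2>80, D2>80)."""
    return all(score > 80 for score in (math, english, science))

def well_paid_in_sales_or_marketing(salary, department):
    """Like AND(B2>50000, OR(C2="Sales", C2="Marketing"))."""
    return salary > 50000 and department in ("Sales", "Marketing")
```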

OR FUNCTION

The OR function returns TRUE if any one of the conditions you specify is TRUE,
and returns FALSE if all of them are FALSE.

The syntax for the OR function is:

where:

• logical1, [logical2], ... are one or more conditions that are to be tested.
These can be logical expressions, cell references, or values.

Example 1: Suppose you have a data set of employees and their performance
ratings. You want to find out which employees have a rating of either 4 or 5. You
can use the OR function to create a formula that checks if the rating in cell B2 is
either 4 or 5, and returns TRUE if any one of them is TRUE, otherwise FALSE.

Formula:

Explanation: This formula checks if the rating in cell B2 is either 4 or 5. If either
one of them is TRUE, the function returns TRUE, otherwise FALSE.

Example 2: Suppose you have a data set of products and their prices. You want
to find out which products have a price greater than $100 or less than $50. You
can use the OR function to create a formula that checks if the price in cell B2 is
either greater than $100 or less than $50, and returns TRUE if any one of them is
TRUE, otherwise FALSE.

Formula:

Explanation: This formula checks if the price in cell B2 is either greater than
$100 or less than $50. If either one of them is TRUE, the function returns TRUE,
otherwise FALSE.
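
The syntax is =OR(logical1, [logical2], ...). Based on the two explanations, the
formulas would typically be written =OR(B2=4, B2=5) and =OR(B2>100, B2<50). A Python
sketch of the same checks, with illustrative function names:

```python
def is_top_rating(rating):
    """Like OR(B2=4, B2=5)."""
    return rating == 4 or rating == 5

def is_outside_price_band(price):
    """Like OR(B2>100, B2<50)."""
    return price > 100 or price < 50
```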

COUNTIF FUNCTION

The COUNTIF function counts the number of cells in a range that meet a certain
condition.

The syntax for the COUNTIF function is:

where range is the range of cells you want to count, and criteria is the condition
that must be met for a cell to be counted.

Example 1: Count the number of cells that have a value greater than 5 in a range
A1:A10.

Example 2: Count the number of cells that contain the text "apple" in a range
B1:B10.
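
The syntax is =COUNTIF(range, criteria), so the two examples would typically be written
=COUNTIF(A1:A10, ">5") and =COUNTIF(B1:B10, "apple"). The criterion "apple" matches
cells whose entire content is apple, ignoring case; to count cells that merely contain
the word, a wildcard criterion such as "*apple*" can be used. A Python sketch with
hypothetical sample values:

```python
values = [3, 8, 5, 12, 6, 1, 9, 4, 7, 10]                 # hypothetical A1:A10
fruits = ["apple", "pear", "Apple", "fig", "apple pie"]   # hypothetical B1:B5

# Like =COUNTIF(A1:A10, ">5")
greater_than_5 = sum(1 for v in values if v > 5)

# Like =COUNTIF(B1:B5, "apple"): whole-cell match, case-insensitive
apples = sum(1 for f in fruits if f.lower() == "apple")

# Like =COUNTIF(B1:B5, "*apple*"): the cell only has to contain the word
contains_apple = sum(1 for f in fruits if "apple" in f.lower())
```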

COUNTIFS FUNCTION

The COUNTIFS function counts the cells that meet several criteria at once. Each
criterion is paired with its own range, and a position is counted only when every
criterion is met.

The syntax for the COUNTIFS function is:

Example 1: Count the number of cells in a range A1:A10 that have a value
greater than 5 and less than 10.

Example 2: Count the number of rows in which a cell in the range B1:B10 contains the
text "apple" and the corresponding cell in a second range, such as C1:C10, has a value
greater than 5.

SUMIF FUNCTION

The SUMIF function adds up the values in a range that meet a certain condition.

Example 1: Add up the values in a range A1:A10 that have a value greater than
5.

Example 2: Add up the values that correspond to the cells in a range B1:B10 that have
the text "apple", where the values themselves sit in a second range such as C1:C10.
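
The syntax is =SUMIF(range, criteria, [sum_range]). Example 1 would typically be written
=SUMIF(A1:A10, ">5"). Because text cells have no numeric value to add, Example 2 needs a
separate sum range; assuming the amounts sit in C1:C10, it could be written
=SUMIF(B1:B10, "apple", C1:C10). A Python sketch with hypothetical sample values:

```python
values = [3, 8, 5, 12, 6, 1, 9, 4, 7, 10]    # hypothetical A1:A10
fruits = ["apple", "pear", "apple", "fig"]   # hypothetical B1:B4
quantities = [7, 9, 2, 8]                    # hypothetical C1:C4 (assumed sum range)

# Like =SUMIF(A1:A10, ">5")
total_over_5 = sum(v for v in values if v > 5)

# Like =SUMIF(B1:B4, "apple", C1:C4)
apple_total = sum(q for f, q in zip(fruits, quantities) if f == "apple")
```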

SUMIFS FUNCTION

The SUMIFS function adds up the values in one range for the positions at which the
cells in one or more criteria ranges meet all of the given criteria.

Example 1: Add up the values in a range A1:A10 that have a value greater than
5 and less than 10.

Example 2: Add up the values in a second range, such as C1:C10, for the rows in which
the range B1:B10 has the text "apple" and the value itself is greater than 5.
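
The syntax is =SUMIFS(sum_range, criteria_range1, criteria1, [criteria_range2,
criteria2], ...); note that the sum range comes first. Example 1 would typically be
written =SUMIFS(A1:A10, A1:A10, ">5", A1:A10, "<10"). Assuming the amounts for Example 2
sit in C1:C10, it could be written =SUMIFS(C1:C10, B1:B10, "apple", C1:C10, ">5"). A
Python sketch with hypothetical sample values:

```python
values = [3, 8, 5, 12, 6, 1, 9, 4, 7, 10]    # hypothetical A1:A10
fruits = ["apple", "pear", "apple", "fig"]   # hypothetical B1:B4
quantities = [7, 9, 2, 8]                    # hypothetical C1:C4 (assumed sum range)

# Like =SUMIFS(A1:A10, A1:A10, ">5", A1:A10, "<10")
total_between = sum(v for v in values if 5 < v < 10)

# Like =SUMIFS(C1:C4, B1:B4, "apple", C1:C4, ">5")
big_apple_total = sum(
    q for f, q in zip(fruits, quantities) if f == "apple" and q > 5
)
```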

AVERAGEIF FUNCTION

The AVERAGEIF function calculates the average of the values in a range that
meet a certain condition.

Example 1: Calculate the average of the values in a range A1:A10 that have a
value greater than 5.

Example 2: Calculate the average of the values that correspond to the cells in a range
B1:B10 that have the text "apple", where the values sit in a second range such as C1:C10.
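
The syntax is =AVERAGEIF(range, criteria, [average_range]). Example 1 would typically
be written =AVERAGEIF(A1:A10, ">5"). Assuming the numbers for Example 2 sit in C1:C10,
it could be written =AVERAGEIF(B1:B10, "apple", C1:C10). A Python sketch with
hypothetical sample values:

```python
from statistics import mean

values = [2, 6, 10, 4, 8]             # hypothetical A1:A5
fruits = ["apple", "pear", "apple"]   # hypothetical B1:B3
quantities = [4, 9, 6]                # hypothetical C1:C3 (assumed average range)

# Like =AVERAGEIF(A1:A5, ">5")
average_over_5 = mean(v for v in values if v > 5)

# Like =AVERAGEIF(B1:B3, "apple", C1:C3)
apple_average = mean(q for f, q in zip(fruits, quantities) if f == "apple")
```

If no cell meets the criterion, Excel returns the #DIV/0! error, and statistics.mean
raises an exception in the same situation.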

AVERAGEIFS FUNCTION

The AVERAGEIFS function calculates the average of the values in one range for the
positions at which the cells in one or more criteria ranges meet all of the given
criteria.

Example 1: Calculate the average of the values in a range A1:A10 that have a
value greater than 5 and less than 10.

Example 2: Calculate the average of the values in a second range, such as C1:C10, for
the rows in which the range B1:B10 has the text "apple" and the value itself is greater
than 5.
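
The syntax is =AVERAGEIFS(average_range, criteria_range1, criteria1, [criteria_range2,
criteria2], ...). Example 1 would typically be written =AVERAGEIFS(A1:A10, A1:A10, ">5",
A1:A10, "<10"). Assuming the numbers for Example 2 sit in C1:C10, it could be written
=AVERAGEIFS(C1:C10, B1:B10, "apple", C1:C10, ">5"). A Python sketch with hypothetical
sample values:

```python
from statistics import mean

values = [3, 8, 5, 12, 6, 1, 9, 4, 7, 10]             # hypothetical A1:A10
fruits = ["apple", "pear", "apple", "fig", "apple"]   # hypothetical B1:B5
quantities = [7, 9, 2, 8, 11]                         # hypothetical C1:C5

# Like =AVERAGEIFS(A1:A10, A1:A10, ">5", A1:A10, "<10")
average_between = mean(v for v in values if 5 < v < 10)

# Like =AVERAGEIFS(C1:C5, B1:B5, "apple", C1:C5, ">5")
big_apple_average = mean(
    q for f, q in zip(fruits, quantities) if f == "apple" and q > 5
)
```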

LOWER FUNCTION

The LOWER function in Excel converts all the uppercase letters in the text string
to lowercase.

The syntax for the LOWER function is:

where text is the text string that you want to convert to lowercase.

Example 1: Suppose we have a cell containing the text "EXCEL FORMULAS". To convert
this text to lowercase, we can use the LOWER function as follows:

This will return the output as "excel formulas".


Example 2: Suppose we have a column of names in uppercase and we want to convert them
to lowercase. We can use the LOWER function and drag it down to apply the function to
all cells in the column. If the names start from cell A2, the formula in the first cell
would be =LOWER(A2).

UPPER FUNCTION

The UPPER function in Excel converts all the lowercase letters in the text string
to uppercase.

The syntax for the UPPER function is:

where text is the text string that you want to convert to uppercase.

Example 1: Suppose we have a cell containing the text "excel formulas". To convert
this text to uppercase, we can use the UPPER function as follows:

This will return the output as "EXCEL FORMULAS".

Example 2: Suppose we have a column of names in mixed case and we want to convert them
to uppercase. We can use the UPPER function and drag it down to apply the function to
all cells in the column. If the names start from cell A2, the formula in the first cell
would be =UPPER(A2).


PROPER FUNCTION

The PROPER function in Excel converts the first letter of each word in the text
string to uppercase and all other letters to lowercase.

The syntax for the PROPER function is:

where text is the text string that you want to convert to proper case.

Example 1: Suppose we have a cell containing the text "eXCEL fORMULAS". To convert
this text to proper case, we can use the PROPER function as follows:

This will return the output as "Excel Formulas".

Example 2: Suppose we have a column of names in lowercase and we want to convert them
to proper case. We can use the PROPER function and drag it down to apply the function
to all cells in the column. If the names start from cell A2, the formula in the first
cell would be =PROPER(A2).
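
The three case functions, written =LOWER(text), =UPPER(text) and =PROPER(text), have
close counterparts among Python's string methods. The sketch below uses the sample text
from the PROPER example; treating str.title() as a stand-in for PROPER is an
approximation, not an exact equivalence for every input.

```python
text = "eXCEL fORMULAS"

lowered = text.lower()    # like =LOWER(A2)
uppered = text.upper()    # like =UPPER(A2)
proper = text.title()     # roughly like =PROPER(A2)

names = ["ada lovelace", "alan turing"]             # hypothetical column starting at A2
proper_names = [name.title() for name in names]     # like filling the formula down
```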

MINIFS FUNCTION

• Excel has no separate MINIF function. To find the minimum value that meets a single
criterion, use MINIFS with one criteria range and one criterion.

• MINIFS: Returns the minimum value from a range that meets one or more given
criteria.

Example: Suppose you have a list of orders with multiple criteria, such as region,
product, and date. You can use the MINIFS function to find the minimum order
value that meets all the given criteria.

The formula would be:

This formula will return the minimum order value from column D, where column
A contains "West", column B contains "Product B", and column C is greater than
January 1st, 2022.
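
The syntax is =MINIFS(min_range, criteria_range1, criteria1, ...). Assuming whole-column
references, a formula matching this description could be written =MINIFS(D:D, A:A,
"West", B:B, "Product B", C:C, ">"&DATE(2022,1,1)). The same filter-then-minimum logic
can be sketched in Python; the order rows are made up for illustration.

```python
from datetime import date

# Hypothetical order rows: (region, product, order_date, order_value), i.e. columns A to D.
orders = [
    ("West", "Product B", date(2022, 3, 1), 250.0),
    ("West", "Product B", date(2021, 12, 15), 90.0),
    ("East", "Product B", date(2022, 2, 1), 40.0),
    ("West", "Product B", date(2022, 6, 9), 180.0),
]

# Keep only the rows that satisfy all three criteria, then take the smallest value.
matching = [
    value
    for region, product, order_date, value in orders
    if region == "West" and product == "Product B" and order_date > date(2022, 1, 1)
]
min_order = min(matching)
```

When no row matches, MINIFS returns 0, whereas min() on an empty list raises an error,
so the two differ in that edge case.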

MAXIFS FUNCTION

• Excel has no separate MAXIF function. To find the maximum value that meets a single
criterion, use MAXIFS with one criteria range and one criterion.

• MAXIFS: Returns the maximum value from a range that meets one or more given
criteria.

Example: Suppose you have a list of sales data with multiple criteria, such as
region, product, and date. You can use the MAXIFS function to find the
maximum sales value that meets all the given criteria. The formula would be:

This formula will return the maximum sales value from column D, where column
A contains "East", column B contains "Product C", and column C is less than or
equal to December 31st, 2021.
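
The syntax is =MAXIFS(max_range, criteria_range1, criteria1, ...). Assuming whole-column
references, a formula matching this description could be written =MAXIFS(D:D, A:A,
"East", B:B, "Product C", C:C, "<="&DATE(2021,12,31)). A Python sketch of the same
logic, using made-up sales rows:

```python
from datetime import date

# Hypothetical sales rows: (region, product, sale_date, sales_value), i.e. columns A to D.
sales = [
    ("East", "Product C", date(2021, 5, 4), 300.0),
    ("East", "Product C", date(2022, 1, 10), 900.0),
    ("West", "Product C", date(2021, 7, 1), 700.0),
    ("East", "Product C", date(2021, 12, 31), 450.0),
]

# Largest value among the rows that satisfy all three criteria.
max_sale = max(
    value
    for region, product, sale_date, value in sales
    if region == "East" and product == "Product C" and sale_date <= date(2021, 12, 31)
)
```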

UNIQUE FUNCTION

The UNIQUE function in Excel returns a list of unique values from a range or an
array. It removes all duplicates and returns only the unique values.

Example 1: Suppose we have a list of fruits in cells A2:A9. To extract only the unique
values, we can use the UNIQUE function as follows:

The function returns a list of unique values: "apple", "banana", "grape", "orange"
and "pear".

Example 2: We can also use the UNIQUE function to extract unique values from
multiple ranges. For example, we can extract unique values from two different
columns A and B by combining them into one array as shown below:

This formula returns a list of unique values from both columns A and B.
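
Example 1 would typically be written =UNIQUE(A2:A9). For Example 2, one way to combine
two columns in current versions of Excel is the VSTACK function, for instance
=UNIQUE(VSTACK(A2:A9, B2:B9)); the B2:B9 range is an assumption. UNIQUE keeps values in
order of first appearance, which the Python sketch below imitates with hypothetical
sample data.

```python
fruits_a = ["apple", "banana", "apple", "grape",
            "orange", "banana", "pear", "grape"]   # hypothetical A2:A9
fruits_b = ["kiwi", "apple", "kiwi"]               # hypothetical second column

# dict.fromkeys drops duplicates while keeping the first occurrence of each value.
unique_a = list(dict.fromkeys(fruits_a))               # like =UNIQUE(A2:A9)
unique_both = list(dict.fromkeys(fruits_a + fruits_b)) # like UNIQUE over the stacked columns
```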

TRIM FUNCTION

The TRIM function in Excel removes all leading and trailing spaces from a text string
and reduces any run of spaces between words to a single space. It is useful when
working with text strings that have unwanted spaces at the beginning or end of the
string.

Example 1: Suppose we have a text string in cell A2 that has unwanted spaces
before and after the string:

" Hello World "

To remove these spaces, we can use the TRIM function as follows:

This formula returns the text string "Hello World" without any leading or trailing
spaces.

Example 2: We can also use the TRIM function to remove unwanted spaces from
multiple cells. For example, suppose we have a list of names in cells A2:A5 that
have leading and trailing spaces. We can remove these spaces from all the names
using the TRIM function as follows:

This formula returns the trimmed names without any leading or trailing spaces.
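
Example 1 would typically be written =TRIM(A2); for Example 2, the same formula can be
filled down through row 5. A Python sketch of the behavior follows. The helper name trim
and the sample names are illustrative, and the helper deals only with ordinary space
characters.

```python
def trim(text):
    """Strip leading/trailing spaces and squeeze inner runs of spaces to one, like TRIM."""
    return " ".join(part for part in text.split(" ") if part)

greeting = trim("   Hello   World  ")

names = ["  Ada Lovelace", "Alan Turing   ", "  Grace  Hopper "]   # hypothetical A2:A4
trimmed_names = [trim(name) for name in names]
```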

CHAPTER FIVE

CREATE CHARTS AND GRAPHICS

This chapter covers how to create and customize charts and graphics in Excel,
including bar charts, line charts, and pie charts.

Creating charts in Excel is a great way to visualize data and make it easier to
understand.

Charts can help you identify trends, compare data, and present information in a
clear and concise manner. The sections below explain the steps to create charts in
Excel and provide examples of different types of charts.

Step 1: Select Data

The first step in creating a chart is to select the data that you want to display. The
data can be in a table or in a range of cells. The data should be organized with
rows representing categories and columns representing values. For example, if
you have data on the sales of different products in different months, the rows
would be the products and the columns would be the months.

Step 2: Insert a Chart

Once you have selected your data, you can insert a chart by selecting the data and
then clicking on the "Insert" tab in the top menu bar. From there, you can choose
the type of chart you want to create. Excel offers a variety of chart types,
including column charts, line charts, pie charts, bar charts, area charts, scatter
charts, and more.

Step 3: Customize the Chart

After inserting the chart, you can customize it to better suit your needs. You can
change the chart type, add titles, labels, and legends, adjust the axis scales, and
apply formatting options to make the chart more visually appealing. You can also
use Excel's chart styles and color schemes to quickly change the look of the chart.

EXAMPLES OF DIFFERENT TYPES OF CHARTS

1. Column Chart: A column chart is a chart that uses vertical bars to represent
the values in a dataset. It is commonly used to compare the values of different
categories. For example, you could use a column chart to compare the sales of
different products.

2. Line Chart: A line chart is a chart that uses lines to connect data points. It
is commonly used to show trends over time. For example, you could use a
line chart to show the trend in the stock prices of a company over the past
year.

3. Pie Chart: A pie chart is a chart that uses a circle to represent the whole
dataset, and slices of the circle to represent the values of each category. It
is commonly used to show the percentage breakdown of a dataset. For
example, you could use a pie chart to show the percentage of sales from
each product.

4. Bar Chart: A bar chart is a chart that uses horizontal bars to represent the
values in a dataset. It is similar to a column chart, but it is useful when the
category labels are long or when there are many categories. For example,
you could use a bar chart to compare the performance of different
departments in a company.

5. Area Chart: An area chart is a chart that uses lines to connect data points,
and the area below the lines is filled with color or shading. It is commonly
used to show the trend in a dataset over time. For example, you could use
an area chart to show the trend in website traffic over the past year.

6. Scatter Chart: A scatter chart is a chart that uses dots to represent data
points. It is commonly used to show the relationship between two
variables. For example, you could use a scatter chart to show the
relationship between a company's advertising spend and its sales.

In conclusion, creating charts in Excel is a simple process that can help you
display your data in a meaningful way. By selecting the right chart type and
customizing it to your needs, you can create charts that effectively communicate
your message to your audience.

CHAPTER SIX

USING SPARKLINES AND DATA BARS


Sparklines and data bars are two visual tools in Excel that can be used to quickly
represent data trends and patterns. Both tools are useful in highlighting trends in
data, but they differ in how they display the data.

SPARKLINES

Sparklines are small, simple charts that are designed to be embedded within a
cell. They provide a quick and easy way to visualize data trends over time,
without taking up much space.

For example, let's say you have a table with sales data for the past 12 months.
You can add a sparkline to each row that displays the trend in sales over the past
12 months. Here's how:

1. Select the cell where you want to insert the sparkline.

2. Go to the "Insert" tab in the ribbon and click "Sparklines".

3. Select "Line" as the type of sparkline and choose the range of data you
want to display (in this case, the sales data for the past 12 months).

4. Click "OK" to insert the sparkline.

Now you have a small chart that shows the trend in sales over the past 12 months.
You can quickly see whether sales have been increasing, decreasing, or staying
steady over time.

DATA BARS

Data bars are a visual representation of the values in a cell or range of cells. They
provide a way to quickly see how the values in a range of cells compare to one
another.

Example 1: Let's say you have a table with sales data for several products. You
can add a data bar to each row that displays the relative sales performance of each
product.

Here's how:

1. Select the range of cells that contains the sales data.

2. Go to the "Home" tab in the ribbon and click "Conditional Formatting".

3. Select "Data Bars" from the dropdown menu, and choose the color and
style you want to use.

4. Click "OK" to apply the data bars to the selected cells.

Now you have a visual representation of the sales data that makes it easy to see
which products are performing better than others. The longer the data bar, the
higher the sales for that product.

CHAPTER SEVEN

CONDITIONAL FORMATTING AND DATA VALIDATION

Conditional formatting and data validation are powerful features of Microsoft Excel
that allow users to customize the appearance and functionality of their
spreadsheets. Here's an explanation of how they work and some examples of how
they can be used.

CONDITIONAL FORMATTING

Conditional formatting allows users to apply formatting to cells based on specific
conditions or criteria. This is a great way to highlight important information, draw
attention to outliers or trends, or simply make data easier to read and interpret. To
use conditional formatting in Excel:

1. Select the cells you want to apply the formatting to.

2. From the Home tab, click on the Conditional Formatting button in the
Styles group.

3. Choose the type of formatting you want to apply, such as Highlight Cells
Rules or Top/Bottom Rules.

4. Set the criteria or conditions for the formatting. For example, you could
apply a red fill color to any cells with a value less than 0, or a green fill
color to any cells with a value greater than 100.

5. Click OK to apply the formatting.

Example 1: Highlighting Negative Numbers. Suppose you have a spreadsheet that
contains a list of expenses, and you want to highlight any expenses that are
negative (i.e., money you're owed). You can use conditional formatting to
automatically apply a fill color to any cells with negative values. Here's how:

1. Select the range of cells you want to format.


2. From the Home tab, click on the Conditional Formatting button and select
"Highlight Cells Rules" from the dropdown menu.

3. Choose "Less Than" from the submenu.

4. In the "Value" box, enter "0" (without the quotes).

5. Select the fill color you want to apply to the cells.

6. Click OK to apply the formatting.

Example 2: Heat Map. Suppose you have a large dataset containing sales data for
different products and regions. You want to create a visual representation of the
data using a heat map, where the highest sales figures are highlighted in a bright
color and the lowest sales figures are highlighted in a pale color. Here's how:

1. Select the range of cells you want to format.

2. From the Home tab, click on the Conditional Formatting button and select
"Color Scales" from the dropdown menu.

3. Choose the color scale you want to use. For example, you could choose the
"Green-Yellow-Red" color scale, which uses green for high values, yellow for
medium values, and red for low values.

4. Click OK to apply the formatting.

DATA VALIDATION

Data validation is a feature in Microsoft Excel that allows users to set rules or
constraints on data that is entered into a cell or range of cells. This helps to ensure
that data entered into the spreadsheet is accurate, consistent, and error-free.

Here are the steps to apply data validation in Excel:

Step 1: Select the cell or range of cells where you want to apply data validation.

Step 2: Click on the Data tab in the ribbon.


Step 3: Click on the Data Validation button.

Step 4: In the Data Validation dialog box, choose the type of validation you want
to apply (e.g. Whole Number, Decimal, List, etc.).

Step 5: Set the validation criteria based on the type of validation you have chosen.

Step 6: Customize the input message and error alert message as per your
requirement.

Step 7: Click OK to apply the data validation to the selected cell or range of cells.

Here are some examples of data validation:

1. Whole Number: Suppose you want to restrict users from entering decimal
values in a particular cell. To do this, you can apply the Whole Number
validation rule. In the Data Validation dialog box, select Whole Number in
the Allow field, and then set the minimum and maximum values allowed.

2. Decimal: Suppose you want users to enter a number, with or without a decimal
fraction, that falls within certain limits. To do this, you can apply the Decimal
validation rule. In the Data Validation dialog box, select Decimal in the Allow
field, and then set the minimum and maximum values allowed.

3. List: Suppose you have a list of items, and you want users to select an item
from the list instead of typing it manually. To do this, you can apply the
List validation rule. In the Data Validation dialog box, select List in the
Allow field, and then enter the list of items separated by a comma.

4. Date: Suppose you want users to enter dates that fall within a specific range. To
do this, you can apply the Date validation rule. In the Data Validation dialog
box, select Date in the Allow field, and then set the start date and end date
for the range of dates allowed.

5. Custom: Suppose you have a specific requirement that is not covered by any of
the above validation rules. In that case, you can apply the Custom validation
rule. In the Data Validation dialog box, select Custom in the Allow field, and
then enter the validation formula or function in the Formula field. For example,
you can create a custom validation rule that ensures that the sum of two cells is
greater than a certain value.
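
To make the logic of these rules concrete, here is a rough Python sketch of three of
them. It is not how Excel implements validation; the function names, the cell pair A1
and B1, and the threshold of 100 are all hypothetical. In Excel, a custom rule of this
kind might be entered as =A1+B1>100.

```python
def is_valid_whole_number(value, minimum, maximum):
    """Whole Number rule: an integer between minimum and maximum, inclusive."""
    is_integer = isinstance(value, int) and not isinstance(value, bool)
    return is_integer and minimum <= value <= maximum

def is_valid_list_item(value, allowed):
    """List rule: the entry must be one of the allowed items."""
    return value in allowed

def is_valid_custom(a1, b1, threshold=100):
    """Custom rule: the sum of two cells must exceed a threshold (hypothetical: 100)."""
    return a1 + b1 > threshold
```

An entry is accepted when the matching function returns True and rejected, with the
error alert shown, when it returns False.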

Data validation is a powerful tool that can help to prevent errors and improve the
accuracy of your Excel spreadsheets. By using data validation, you can ensure
that the data entered into your spreadsheet is valid and consistent, which can save
you time and improve the quality of your work.

CHAPTER EIGHT

CREATE PIVOTTABLES AND PIVOTCHARTS

PivotTables and PivotCharts are powerful tools in Excel that allow you to analyze
and summarize large amounts of data in a more meaningful and efficient way.
PivotTables allow you to group, sort, filter, and summarize data, while
PivotCharts visualize that data in a graphical format. In this tutorial, we will
explain how to create PivotTables and PivotCharts in Excel using step-by-step
instructions and examples.

PIVOT TABLE

Let's say we have a dataset with sales information for a company, and we want to
create a PivotTable to analyze this data.

Step 1: Select the data

To create a PivotTable, you need to select the data you want to analyze. In this
example, we have a table with sales data for different products, regions, and
months. We will select this table, including the column headers.

Step 2: Insert the PivotTable

To insert a PivotTable, click on the "Insert" tab in the Excel ribbon, then click on
the "PivotTable" button. This will open the "Create PivotTable" dialog box.

Step 3: Choose the data source

In the "Create PivotTable" dialog box, make sure that the "Select a table or range"
option is selected, and that the correct range of data is selected in the
"Table/Range" field. You can also choose to create the PivotTable in a new
worksheet or in the existing worksheet.

Step 4: Choose the location of the PivotTable



Choose where you want to place the PivotTable by selecting either "New
Worksheet" or "Existing Worksheet." If you choose "New Worksheet," Excel
will create a new worksheet for the PivotTable. If you choose "Existing
Worksheet," you need to select the cell where you want the PivotTable to start.

Step 5: Create the PivotTable

Once you have selected the data source and location for the PivotTable, click on
the "OK" button. This will create a blank PivotTable in the location you specified.

Step 6: Add fields to the PivotTable

To analyze the data, you need to add fields to the PivotTable. In the "PivotTable
Fields" pane, you will see a list of all the columns in the data source. To add a
field to the PivotTable, simply drag it from the field list to one of the four
areas at the bottom of the pane:

• Rows: this area is used for grouping data by rows. For example, you could
group sales data by product.

• Columns: this area is used for grouping data by columns. For example,
you could group sales data by month.

• Values: this area is used for summarizing data. For example, you could
summarize sales data by calculating the total sales amount.

• Filters: this area is used for filtering data. For example, you could filter
sales data by region.

Step 7: Customize the PivotTable

Once you have added the fields to the PivotTable, you can customize it by
applying different settings and options. For example, you can change the layout
of the PivotTable, apply filters, and change the calculation type. You can also use
the PivotTable to create PivotCharts.
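
The steps above are carried out in Excel's interface, but what the Rows,
Columns, and Values areas do to the data can be sketched in a few lines of plain
Python. The sample records and field names below are invented for illustration;
this is only a rough picture of the grouping, not how Excel implements it.

```python
from collections import defaultdict

# Hypothetical sales records: (product, month, amount)
sales = [
    ("Laptop", "Jan", 1200),
    ("Laptop", "Feb", 900),
    ("Phone", "Jan", 650),
    ("Phone", "Jan", 350),
    ("Phone", "Feb", 700),
]

# Rows = product, Columns = month, Values = sum of amount
pivot = defaultdict(lambda: defaultdict(int))
for product, month, amount in sales:
    pivot[product][month] += amount

# Print the summary as a small grid, one row per product
months = ["Jan", "Feb"]
print("Product".ljust(10) + "".join(m.rjust(8) for m in months))
for product in sorted(pivot):
    cells = "".join(str(pivot[product][m]).rjust(8) for m in months)
    print(product.ljust(10) + cells)
```

Each product becomes a row, each month becomes a column, and every cell holds
the summed amount for that combination, which is what the Values area shows.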

PIVOTCHART

To create a PivotChart in Excel, follow these steps:

1. Create a PivotTable: The first step in creating a PivotChart is to create a
PivotTable. This can be done by selecting the data you want to summarize,
going to the Insert tab, and clicking on the PivotTable button. This will
open the Create PivotTable dialog box, where you can choose the range of
data you want to summarize and where you want to place the PivotTable.

2. Create a PivotChart: Once you have created a PivotTable, you can create
a PivotChart by selecting any cell in the PivotTable and going to the Insert
tab. Click on the PivotChart button, and choose the type of chart you want
to create. Excel will automatically create a chart based on the data in your
PivotTable.

3. Customize the PivotChart: After creating the PivotChart, you can
customize it by changing the chart type, adding or removing chart
elements, formatting the chart, and so on. To do this, simply select the chart
and go to the Chart Design and Format tabs, where you can make various
adjustments to the chart's appearance and behavior.

Here's an example of how to create a PivotChart:

Suppose you have a large dataset that contains information about sales
transactions. You want to create a PivotChart that shows the total sales by product
category and by region. Here's how you can do it:

1. Select the dataset: Select the entire dataset, including the headers.

2. Create a PivotTable: Go to the Insert tab and click on the PivotTable
button. In the Create PivotTable dialog box, choose where you want to
place the PivotTable and click OK.

3. Configure the PivotTable: In the PivotTable Fields pane, drag the Product
Category field to the Rows area, and the Region field to the Columns area.
Drag the Sales Amount field to the Values area.

4. Create a PivotChart: Select any cell in the PivotTable and go to the Insert
tab. Click on the PivotChart button, and choose the type of chart you want
to create.

5. Customize the PivotChart: Select the chart and go to the Chart Design and
Format tabs to make any adjustments you want. For example, you can
change the chart type, add a chart title, change the axis labels, and so on.

CHAPTER NINE

WORKING WITH EXCEL TABLES

This chapter covers how to use Excel tables to manage and analyze data.

Excel tables are a powerful feature in Microsoft Excel that allow you to organize
and manipulate large amounts of data in an easy-to-use format. Excel tables
provide several benefits, including built-in filtering, sorting, and data analysis
tools.

Creating an Excel Table:

To create an Excel table, select the data you want to include in the table and go
to the "Insert" tab on the ribbon. Click on the "Table" button and Excel will
automatically create a table based on the selected data.

Benefits of Excel Tables:

1. Structured Format: Excel tables provide a structured format to store data
that allows for easier analysis and manipulation.

2. Easy Sorting and Filtering: Excel tables make it easy to sort and filter
data based on specific criteria, allowing you to quickly find the information
you need.

3. Built-in Data Analysis: Excel tables work well as a source for data analysis
tools such as PivotTables and PivotCharts that help you summarize and analyze
data.

4. Automatic Formatting: Excel tables automatically apply formatting to
data based on predefined styles, making it easier to read and understand.

5. Easy Data Entry: Excel tables make it easy to add new data to your table,
as the table will automatically expand to include new rows.

Example:

Suppose you have a large data set of sales data, and you want to analyze the data
to identify trends and patterns. By creating an Excel table, you can quickly sort
and filter the data to find the information you need.

To create an Excel table, follow these steps:

1. Select the entire data set, including headers.

2. Go to the "Insert" tab on the ribbon and click on the "Table" button.

3. In the "Create Table" dialog box, make sure the range is correct, and check
the box for "My table has headers."

4. Click "OK" to create the table.

Once you have created the table, you can easily sort and filter the data. For
example, to sort the data by the "Sales" column in descending order, click on the
arrow next to the "Sales" header and select "Sort Largest to Smallest."

To filter the data, click on the arrow next to the "Product" header and select the
products you want to include in the filtered data.
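
As a rough illustration of what the sort and filter arrows do to the rows, here
is a small Python sketch. The rows and the selected product are made up; only
the "Product" and "Sales" column names come from the example above.

```python
# Hypothetical table rows; the column names mirror the example above.
rows = [
    {"Product": "Laptop", "Sales": 1200},
    {"Product": "Phone", "Sales": 1000},
    {"Product": "Tablet", "Sales": 400},
    {"Product": "Phone", "Sales": 1500},
]

# "Sort Largest to Smallest" on the Sales column
sorted_rows = sorted(rows, key=lambda r: r["Sales"], reverse=True)

# Filter on the Product column, keeping only the ticked products
selected = {"Phone"}
filtered_rows = [r for r in sorted_rows if r["Product"] in selected]

for r in filtered_rows:
    print(r["Product"], r["Sales"])
```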

You can also use the built-in data analysis tools such as PivotTables and
PivotCharts to summarize and analyze the data.

Overall, Excel tables are a powerful tool that can help you organize, manipulate,
and analyze large amounts of data quickly and efficiently.

CHAPTER TEN

WHAT-IF ANALYSIS

This chapter covers how to use Excel's what-if analysis tools, including Goal
Seek and data tables.

What-if analysis in Excel refers to a set of tools and techniques that allow users
to explore different scenarios by changing input values and observing how the
output of a formula or a model changes in response. This is particularly useful
when you need to make decisions based on multiple variables or uncertain
outcomes, such as financial planning, budgeting, or forecasting.

Examples of what-if analysis tools in Excel:

1. Goal Seek: This tool allows you to determine the input value needed to
achieve a desired output. For example, if you want to determine what sales
volume is required to achieve a certain profit margin, you can use Goal
Seek to find the sales volume needed to achieve that goal.

2. Data Tables: This tool allows you to create a table of multiple scenarios
by varying one or two input variables. For example, if you want to
analyze the impact of changing both the price and the quantity of a product,
you can create a two-variable data table that shows the resulting revenue
for each combination of price and quantity.

3. Scenario Manager: This tool allows you to create and compare different
sets of input values for a model or a formula. For example, if you want to
analyze the impact of different interest rates on a loan repayment schedule,
you can create different scenarios with different interest rates and compare
the resulting repayment schedules.

4. Solver: This tool allows you to find the optimal solution to a problem by
changing input variables subject to certain constraints. For example, if you
want to determine the optimal production plan that minimizes cost and
meets certain demand constraints, you can use Solver to find the best
combination of production quantities that satisfies the constraints.
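
To make the idea behind Goal Seek concrete, here is a minimal Python sketch that
searches for the input value that produces a target output. The profit model,
its numbers, and the bisection search are all assumptions chosen for
illustration; Excel uses its own iterative method internally.

```python
def profit(units, price=25.0, unit_cost=15.0, fixed_cost=4000.0):
    """Hypothetical profit model: the 'formula cell' Goal Seek would target."""
    return units * (price - unit_cost) - fixed_cost


def goal_seek(func, target, low, high, tolerance=1e-6):
    """Find x in [low, high] where func(x) is close to target, by bisection.

    Assumes func increases across the interval. This illustrates the idea
    only; it is not Excel's internal algorithm.
    """
    while high - low > tolerance:
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid      # result too small, so search the upper half
        else:
            high = mid     # result large enough, so search the lower half
    return (low + high) / 2


# Which sales volume gives a profit of 10,000?
units_needed = goal_seek(profit, target=10000.0, low=0.0, high=10000.0)
print(round(units_needed))
```

With these made-up numbers each unit contributes 10 to profit, so the search
settles on about 1,400 units.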

CHAPTER ELEVEN

DATA CLEANING

Data cleaning in Excel is the process of removing or correcting inaccurate,
incomplete, or irrelevant data from your dataset.

The following are some examples of data cleaning techniques that can be applied
in Excel:

1. Removing duplicate data: Duplicates are common in data and can cause
inaccuracies in analysis. To remove duplicates, select the data range, go to
the "Data" tab and click on "Remove Duplicates." Choose the columns
where you want to remove duplicates and click OK.

2. Removing Blank Rows: Blank rows in your dataset can be distracting and
take up space. To remove them, you can use the 'Go To Special' feature in
Excel. First, select the range of cells you want to remove blank rows from,
then go to the 'Home' tab, click on 'Find & Select,' and then click on 'Go
To Special.' In the dialog box that appears, select 'Blanks' and click 'OK.'
All blank cells in the range will now be selected. Right-click one of them,
choose 'Delete,' and then select 'Entire row.' Note that this also removes
rows that are only partly blank, so check the selection first.

3. Correcting spelling errors: Spelling errors can occur when data is entered
manually. To correct them, use the "Find and Replace" feature under the
"Home" tab. Enter the misspelled word in the "Find what" field and the
corrected spelling in the "Replace with" field, then click "Replace All."

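4. Converting text to numbers: Sometimes numbers are stored as text in
Excel, which can cause issues when performing calculations. To convert
text to numbers, select the range of cells, click the warning icon that
appears next to the selection, and choose "Convert to Number."
Alternatively, use the VALUE function in a helper column, for example
=VALUE(A2). Changing the cell format to "Number" through "Format Cells"
alone usually does not convert values that are already stored as text.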
5. Removing unwanted characters: Sometimes data contains unwanted
characters such as commas, spaces, or special characters. To remove them,
use the "Find and Replace" feature under the "Home" tab. Enter the
unwanted character in the "Find what" field and leave the "Replace with"
field blank, then click "Replace All."
6. Handling Missing Data: Missing data can occur for various reasons and
can affect your analysis. To handle missing data in Excel, you can use the
'IF' function or the 'ISBLANK' function. For example, you can use the 'IF'
function to replace missing data with a default value. First, select the cell
you want to fill, then enter the following formula:
=IF(ISBLANK(A2),"Default Value",A2). This formula will check if the
cell A2 is blank and replace it with "Default Value" if it is.
7. Standardizing data: Data that is entered inconsistently can be difficult to
analyze. To standardize data, use the "LOWER," "UPPER," or "PROPER"
functions to convert text to a consistent case. You can also use the "TRIM"
function to remove extra spaces or the "SUBSTITUTE" function to replace
certain values with others.
8. Correcting Spelling Mistakes: Spelling mistakes can cause
inconsistencies in your data, and it's important to correct them. To do so,
you can use the 'Find and Replace' feature in Excel. First, select the range
of cells you want to correct spelling mistakes in, then go to the 'Home' tab,
click on 'Find & Select,' and then click on 'Replace.' In the dialog box that
appears, enter the incorrect spelling in the 'Find what' box and the correct
spelling in the 'Replace with' box. Then, click 'Replace All' to correct all
instances of the misspelling.
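
The same ideas (removing duplicates, filling missing values, trimming spaces,
and standardizing case) can be sketched outside Excel as well. The rows and the
"Unknown" default below are invented for illustration, and the Python string
methods are only rough counterparts of the Excel functions named in comments.

```python
# Hypothetical raw rows: (name, city); None marks a missing value.
raw = [
    ("  alice ", "lagos"),
    ("Bob", None),
    ("  alice ", "lagos"),   # exact duplicate of the first row
    ("CAROL", " abuja "),
]


def clean(value, default="Unknown"):
    """Fill missing values, trim spaces, and standardize case."""
    if value is None:                 # like =IF(ISBLANK(A2),"Default Value",A2)
        return default
    return value.strip().title()      # roughly TRIM followed by PROPER


cleaned = []
seen = set()
for row in raw:
    if row in seen:                   # like Data > Remove Duplicates
        continue
    seen.add(row)
    cleaned.append(tuple(clean(v) for v in row))

print(cleaned)
```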

POWER BI

CHAPTER ONE

POWER BI KEY TERMS

• Workspace: A workspace is a shared environment where team members
can collaborate on reports and dashboards.
• Dataset: A dataset is a collection of data that is imported, transformed, and
used in Power BI reports.
• Fact table: A fact table is a table in a data warehouse that stores the
quantitative measures or metrics of a business process. It typically contains
numerical data such as sales revenue, quantities sold, or costs incurred, and
has foreign keys to link to one or more dimension tables.
• Report: A report is a visual representation of data that is created using the
data in a dataset.
• Dashboard: A dashboard is a collection of visualizations and reports that
provide a high-level view of key performance indicators (KPIs).
• Visualizations: Visualizations are graphical representations of data, such
as charts, tables, and maps.
• Tiles: Tiles are individual components of a dashboard or report that display
a visualization or KPI.
• Filters: Filters are used to limit the data that is displayed in a report or
dashboard.
• Slicers: Slicers are interactive controls that allow users to filter data by
selecting values from a list.
• Measures: Measures are calculations that are performed on data in a
dataset, such as sums, averages, and counts.
• Drill-down: Drill-down is the process of navigating from a high-level view
of data to a more detailed view by clicking on a visual.

• Power Query: Power Query is a data transformation and cleansing tool
that allows users to import data from various sources, apply
transformations, and load the transformed data into Power BI.
• Power Pivot: Power Pivot is a data modeling tool that allows users to
create relationships between tables, define calculated columns and
measures, and perform advanced data analysis.
• Power BI Desktop: Power BI Desktop is a Windows application that
allows users to create and publish Power BI reports and dashboards.
• Power BI Service: Power BI Service is a cloud-based service that allows
users to view and share Power BI reports and dashboards.
• Calculated measure: A calculated measure is typically used for aggregations
and calculations on fact tables. It is defined as a formula that uses one or
more columns in the fact table to calculate a value, such as a sum, count,
average, or any other aggregate function. Measures are evaluated when a
report is queried, so their results adjust to the current report context.
• Calculated column: A calculated column is used to add new data to a table.
It is defined as a formula that calculates a value for each row in a table and
adds the results to a new column. Calculated columns are static, meaning
they do not adjust to the current report context. The new column created
by the calculated column formula becomes part of the original table and
can be used as a source for calculations in other measures or columns.
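
The difference between a calculated column and a measure can be pictured with a
small Python sketch. The table, the "line_total" column, and the "total_sales"
function are hypothetical names used only to illustrate the two ideas; Power BI
itself expresses both in DAX.

```python
# Hypothetical fact table rows.
sales = [
    {"region": "North", "price": 10.0, "quantity": 3},
    {"region": "South", "price": 20.0, "quantity": 1},
    {"region": "North", "price": 5.0, "quantity": 4},
]

# Calculated column: computed once for every row and stored with the table.
for row in sales:
    row["line_total"] = row["price"] * row["quantity"]


# Measure: computed on demand over whichever rows the current filter leaves.
def total_sales(rows):
    return sum(row["line_total"] for row in rows)


all_regions = total_sales(sales)
north_only = total_sales([r for r in sales if r["region"] == "North"])
print(all_regions, north_only)
```

The stored "line_total" values never change, while the measure returns a
different total depending on which rows are passed to it.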

CHAPTER TWO

DAX AGGREGATION FUNCTIONS

SUM()

The SUM function in DAX is used to calculate the sum of a column in a table.

Here's the basic syntax of the SUM function: SUM(table[column])

Example 1: Let's say we have a table called "sales" with columns "product",
"region", and "sales_amount". We can use the following DAX formula to
calculate the total sales amount:

This will sum up the "sales_amount" column in the "sales" table and give us the
total sales amount.

Example 2: Let's say we have a table called "orders" with columns "order_id",
"customer_id", and "order_total". We can use the following DAX formula to
calculate the total order amount for a specific customer:

This will filter the "orders" table to include only orders for customer ID 1, and
then sum up the "order_total" column to give us the total order amount for that
customer.
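
To make the two results concrete, here is a small Python sketch of the same
calculations. The rows are invented, and the DAX patterns in the comments are
the usual way to write such formulas, offered as an assumption about what the
examples describe rather than a copy of them.

```python
# Hypothetical rows of an "orders" table.
orders = [
    {"order_id": 1, "customer_id": 1, "order_total": 120.0},
    {"order_id": 2, "customer_id": 2, "order_total": 80.0},
    {"order_id": 3, "customer_id": 1, "order_total": 50.0},
]

# DAX pattern: SUM(orders[order_total])
total_orders = sum(row["order_total"] for row in orders)

# DAX pattern: CALCULATE(SUM(orders[order_total]), orders[customer_id] = 1)
customer_1_total = sum(
    row["order_total"] for row in orders if row["customer_id"] == 1
)

print(total_orders, customer_1_total)
```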

SUMX()

The SUMX function in DAX evaluates an expression for each row of a table and
then sums the results.

Here's the basic syntax of the SUMX function: SUMX(table, expression)

Example 1: Let's say we have a table called "sales" with columns "product",
"region", and "sales_amount". We can use the following DAX formula to
calculate the total sales amount for a specific region:

This will filter the "sales" table to include only sales in the "North" region, and
then sum up the "sales_amount" column to give us the total sales amount for that
region.

Example 2: Let's say we have a table called "orders" with columns "order_id",
"customer_id", "order_total", and "order_date". We can use the following DAX
formula to calculate the Total Order Amount for a specific Year:

This will filter the "orders" table to include only orders from the year 2022, and
then sum up the "order_total" column to give us the total order amount for that
year.
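
The row-by-row behavior of SUMX can be sketched in Python as follows. The
"sumx" helper and the sample rows are hypothetical, and the DAX pattern in the
comment is one common way to combine SUMX with FILTER.

```python
# Hypothetical rows of a "sales" table.
sales = [
    {"region": "North", "sales_amount": 300.0},
    {"region": "South", "sales_amount": 200.0},
    {"region": "North", "sales_amount": 150.0},
]


def sumx(table, expression):
    """Evaluate expression for each row of table and add up the results."""
    return sum(expression(row) for row in table)


# DAX pattern: SUMX(FILTER(sales, sales[region] = "North"), sales[sales_amount])
north_rows = [row for row in sales if row["region"] == "North"]
north_total = sumx(north_rows, lambda row: row["sales_amount"])
print(north_total)
```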

COUNT()

The COUNT function is used to count the number of rows in a table that contain
a non-blank value in a specified column.

Here's the basic syntax of the COUNT function: COUNT(table[column])



Example 1: Let's say we have a table called "Sales" with columns "Product",
"Category", and "Revenue". We can use the following query to count the number
of rows in the table that have a non-blank value in the "Product" column:

This will return the total number of rows in the "Sales" table that have a non-
blank value in the "Product" column.

Example 2: Let's say we have a table called "Employees" with columns "Name",
"Department", "Salary". We can use the following query to count the number of
rows in the table that have a non-blank value in the "Department" column:

This will return the total number of rows in the "Employees" table that have a
non-blank value in the "Department" column.
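
A short Python sketch shows the counting rule. The rows are invented and None
stands in for a blank cell.

```python
# Hypothetical "Employees" rows; None stands in for a blank cell.
employees = [
    {"Name": "Ada", "Department": "Finance"},
    {"Name": "Ben", "Department": None},
    {"Name": "Chi", "Department": "Sales"},
]

# DAX pattern: COUNT(Employees[Department])
department_count = sum(1 for row in employees if row["Department"] is not None)
print(department_count)
```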

COUNTA()

The COUNTA function is used to count the number of rows in a table that contain
a non-blank value of any data type (including text and Boolean values) in a
specified column.

Here's the basic syntax of the COUNTA function: COUNTA(table[column])

Example 1: Let's say we have a table called "Inventory" with columns "Product",
"Category", and "Quantity". We can use the following query to count the number
of rows in the table that have a non-blank value in the "Category" column:

This will return the total number of rows in the "Inventory" table that have a
non-blank value in the "Category" column.

Example 2: Let's say we have a table called "Customers" with columns "Name",
"Address", "Phone". We can use the following query to count the number of rows
in the table that have a non-blank value in the "Phone" column:

This will return the total number of rows in the "Customers" table that have a
non-blank value in the "Phone" column.
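
In DAX, COUNTA counts rows whose value is not blank, whatever the data type.
The Python sketch below illustrates that rule; the "counta" helper, the sample
rows, and the extra "Active" column are hypothetical.

```python
# Hypothetical "Customers" rows; None stands in for a blank cell.
customers = [
    {"Name": "Ada", "Phone": "0801-000-0001", "Active": True},
    {"Name": "Ben", "Phone": None, "Active": False},
    {"Name": "Chi", "Phone": "0801-000-0003", "Active": None},
]


def counta(table, column):
    """Count rows whose value in column is not blank, whatever its type."""
    return sum(1 for row in table if row[column] is not None)


phone_count = counta(customers, "Phone")     # COUNTA(Customers[Phone])
active_count = counta(customers, "Active")   # FALSE is a value, not a blank
print(phone_count, active_count)
```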

COUNTAX()

The COUNTAX function evaluates an expression for each row of a table and counts
the rows for which the result is non-blank.

Here's the basic syntax of the COUNTAX function: COUNTAX(table, expression)

Example 1: Let's say we have a table called "Sales" with columns "Product",
"Category", and "Revenue". We can use the following query to count the number
of rows in the table that have a non-blank value in a calculated column that sums
the revenue for each category:

This will return the total number of rows in the "Sales" table that have a non-
blank value in the calculated column that sums the revenue for each category.
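
As a simplified picture of COUNTAX, the sketch below evaluates an expression
for each row and counts the non-blank results. The "countax" helper and the
rows are hypothetical, and the expression simply reads the "Revenue" column
rather than summing by category.

```python
# Hypothetical "Sales" rows; None stands in for a blank cell.
sales = [
    {"Product": "Laptop", "Revenue": 1200.0},
    {"Product": "Phone", "Revenue": None},
    {"Product": "Tablet", "Revenue": 400.0},
]


def countax(table, expression):
    """Count rows for which expression(row) is not blank (None)."""
    return sum(1 for row in table if expression(row) is not None)


# DAX pattern: COUNTAX(Sales, Sales[Revenue])
rows_with_revenue = countax(sales, lambda row: row["Revenue"])
print(rows_with_revenue)
```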

COUNTBLANK()

The COUNTBLANK function returns the number of blank values in a column.

Here's the basic syntax of the COUNTBLANK function: COUNTBLANK(table[column])

Example 1: Let's say we have a table called "sales" with columns "product" and
"price". Some rows in the "price" column have a blank value. We can use the
following query to count the number of blank values in the "price" column:

This will return the number of blank values in the "price" column of the "sales"
table.

Example 2: Let's say we have a table called "students" with columns "name" and
"age". Some rows in the "age" column have a null value. We can use the following
query to count the number of null values in the "age" column:

This will return the number of null values in the "age" column of the "students"
table.
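
The blank count can be pictured with a few invented rows, using None for a
blank cell:

```python
# Hypothetical "students" rows; None stands in for a blank cell.
students = [
    {"name": "Ada", "age": 21},
    {"name": "Ben", "age": None},
    {"name": "Chi", "age": None},
]

# DAX pattern: COUNTBLANK(students[age])
blank_ages = sum(1 for row in students if row["age"] is None)
print(blank_ages)
```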

COUNTROWS()

The COUNTROWS function returns the number of rows in a table or table
expression.

Here's the basic syntax of the COUNTROWS function: COUNTROWS(table)

Example 1: Let's say we have a table called "sales" with columns "product" and
"price". We can use the following query to count the number of rows in the "sales"
table:

This will return the number of rows in the "sales" table.

Example 2: Let's say we have a calculated table called "filtered_sales" that
contains only rows where the price is greater than 100. We can use the following
query to count the number of rows in the "filtered_sales" table:

This will return the number of rows in the "filtered_sales" table.
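
Both examples amount to counting rows, first in the full table and then in a
filtered one. A Python sketch with invented rows:

```python
# Hypothetical "sales" rows.
sales = [
    {"product": "Laptop", "price": 1200},
    {"product": "Mouse", "price": 25},
    {"product": "Phone", "price": 650},
]

# DAX pattern: COUNTROWS(sales)
row_count = len(sales)

# The filtered table keeps only rows where the price is greater than 100.
# DAX pattern: COUNTROWS(FILTER(sales, sales[price] > 100))
filtered_sales = [row for row in sales if row["price"] > 100]
filtered_count = len(filtered_sales)

print(row_count, filtered_count)
```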

DISTINCTCOUNT()

The DISTINCTCOUNT function returns the number of distinct values in a
column.

Here's the basic syntax of the DISTINCTCOUNT function:
DISTINCTCOUNT(table[column])

Example 1: Let's say we have a table called "sales" with columns "product" and
"price". We can use the following query to count the number of distinct products
in the "sales" table:

This will return the number of distinct products in the "product" column of the
"sales" table.

Example 2: Let's say we have a table called "students" with columns "name" and
"gender". We can use the following query to count the number of distinct genders
in the "students" table:

This will return the number of distinct genders in the "gender" column of the
"students" table.

DISTINCTCOUNTNOBLANK()

The DISTINCTCOUNTNOBLANK function returns the number of distinct non-blank
values in a column.

Here's the basic syntax of the DISTINCTCOUNTNOBLANK function:
DISTINCTCOUNTNOBLANK(table[column])

Example: Let's say we have a table called "sales" with columns "product" and
"price". Some rows in the "price" column have a blank value. We can use the
following query to count the number of distinct non-blank prices in the "sales"
table:

This will return the number of distinct non-blank prices in the "price" column.
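
The difference from DISTINCTCOUNT is that DISTINCTCOUNT treats blank as one
more distinct value, while DISTINCTCOUNTNOBLANK leaves it out. A short Python
sketch with invented prices, using None for a blank cell:

```python
# Hypothetical "price" column values; None stands in for a blank cell.
prices = [10.0, None, 10.0, 25.0, None]

# DAX pattern: DISTINCTCOUNT(sales[price])  (blank counts as one value)
distinct_with_blank = len(set(prices))

# DAX pattern: DISTINCTCOUNTNOBLANK(sales[price])
distinct_no_blank = len({p for p in prices if p is not None})

print(distinct_with_blank, distinct_no_blank)
```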

MIN()

The MIN function returns the minimum value in a column.

Here's the basic syntax of the MIN function: MIN(table[column])

Example 1: Let's say we have a table called "sales" with columns "product" and
"price". We can use the following query to find the minimum price of all
products:

This will return the minimum price from the "price" column in the "sales" table.

Example 2: Let's say we have a table called "students" with columns "name" and
"age". We can use the following query to find the minimum age of all students:

This will return the minimum age from the "age" column in the "students" table.

MINX()

The MINX function returns the minimum value of an expression evaluated over
a table.

Here's the basic syntax of the MINX function: MINX(table, expression)

Example 1: Let's say we have a table called "sales" with columns "product",
"price", and "quantity". We can use the following query to find the smallest
row-level sales total (price multiplied by quantity):

This will evaluate the expression "[price] * [quantity]" for each row in the "sales"
table and return the minimum value.

Example 2: Let's say we have a table called "employees" with columns "name",
"age", and "salary". We can use the following query to find the minimum salary
for employees under the age of 30:

This will evaluate the expression "[salary]" for each row in the "employees" table
where the age is less than 30 and return the minimum value.
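
The second example can be sketched in Python as follows. The "minx" helper and
the employee rows are hypothetical, and the DAX pattern in the comment is one
common way to write the formula.

```python
# Hypothetical "employees" rows.
employees = [
    {"name": "Ada", "age": 28, "salary": 52000},
    {"name": "Ben", "age": 41, "salary": 48000},
    {"name": "Chi", "age": 25, "salary": 61000},
]


def minx(table, expression):
    """Evaluate expression per row and return the smallest non-blank result."""
    values = [expression(row) for row in table]
    values = [v for v in values if v is not None]   # blanks are skipped
    return min(values) if values else None


# DAX pattern: MINX(FILTER(employees, employees[age] < 30), employees[salary])
under_30 = [row for row in employees if row["age"] < 30]
min_salary_under_30 = minx(under_30, lambda row: row["salary"])
print(min_salary_under_30)
```

Ben has the lowest salary overall, but he is filtered out before the minimum is
taken, so the result comes from Ada and Chi only.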

MINA()

The MINA function returns the minimum value in a column. Unlike MIN, it also
handles logical values (TRUE counts as 1 and FALSE as 0); blank cells are
ignored.

Here's the basic syntax of the MINA function: MINA(table[column])

Example 1: Let's say we have a table called "sales" with columns "product" and
"price". Some rows in the "price" column have a blank value. We can use the
following query to find the minimum price of all products, ignoring any rows
with a blank value:

This will return the minimum price from the "price" column in the "sales" table,
ignoring any rows with a blank value.

Example 2: Let's say we have a table called "students" with columns "name" and
"age". Some rows in the "age" column have a null value. We can use the following
query to find the minimum age of all students, ignoring any rows with a null
value:

This will return the minimum age from the "age" column in the "students" table,
ignoring any rows with a null value.

MAX()

The MAX function returns the maximum value in a column.

Here's the basic syntax of the MAX function: MAX(table[column])

Example 1: Let's say we have a table called "sales" with columns "product" and
"price". We can use the following query to find the maximum price of all
products:

This will return the maximum price from the "price" column in the "sales" table.

Example 2: Let's say we have a table called "students" with columns "name" and
"age". We can use the following query to find the maximum age of all students:

This will return the maximum age from the "age" column in the "students" table.

MAXX()

The MAXX function returns the maximum value of an expression evaluated over
a table.

Here's the basic syntax of the MAXX function: MAXX(table, expression)

Example 1: Let's say we have a table called "sales" with columns "product",
"price", and "quantity". We can use the following query to find the largest
row-level sales total (price multiplied by quantity):

This will evaluate the expression "[price] * [quantity]" for each row in the "sales"
table and return the maximum value.

Example 2: Let's say we have a table called "employees" with columns "name",
"age", and "salary". We can use the following query to find the maximum salary
for employees under the age of 30:

This will evaluate the expression "[salary]" for each row in the "employees" table
where the age is less than 30 and return the maximum value.
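
The first MAXX example, which multiplies price by quantity on each row before
taking the maximum, can be sketched in Python like this. The rows are invented
for illustration.

```python
# Hypothetical "sales" rows.
sales = [
    {"product": "Laptop", "price": 1200.0, "quantity": 2},
    {"product": "Phone", "price": 650.0, "quantity": 5},
    {"product": "Mouse", "price": 25.0, "quantity": 40},
]

# DAX pattern: MAXX(sales, sales[price] * sales[quantity])
line_totals = [row["price"] * row["quantity"] for row in sales]
max_line_total = max(line_totals)

print(line_totals, max_line_total)
```

The laptop has the highest price, but the phone row has the largest
price-times-quantity result, which is why the expression is evaluated row by
row before the maximum is taken.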

MAXA()

The MAXA function returns the maximum value in a column. Unlike MAX, it also
handles logical values (TRUE counts as 1 and FALSE as 0); blank cells are
ignored.

Here's the basic syntax of the MAXA function: MAXA(table[column])



Example 1: Let's say we have a table called "sales" with columns "product" and
"price". Some rows in the "price" column have a blank value. We can use the
following query to find the maximum price of all products, ignoring any rows
with a blank value:

This will return the maximum price from the "price" column in the "sales" table,
ignoring any rows with a blank value.

Example 2: Let's say we have a table called "students" with columns "name" and
"age". Some rows in the "age" column have a null value. We can use the following
query to find the maximum age of all students, ignoring any rows with a null
value:

This will return the maximum age from the "age" column in the "students" table,
ignoring any rows with a null value.

AVERAGE()

The AVERAGE function returns the average (arithmetic mean) of a column of
numbers.

Here's the basic syntax of the AVERAGE function: AVERAGE(table[column])

Example 1: Let's say we have a table called "sales" with columns "product" and
"price". We can use the following query to find the average price of all products:

This will return the average price from the "price" column in the "sales" table.

Example 2: Let's say we have a table called "students" with columns "name" and
"age". We can use the following query to find the average age of all students:

This will return the average age from the "age" column in the "students" table.

AVERAGEX()

The AVERAGEX function returns the average of an expression evaluated over a
table.

Here's the basic syntax of the AVERAGEX function: AVERAGEX(table, expression)

Example 1: Let's say we have a table called "sales" with columns "product",
"price", and "quantity". We can use the following query to find the average price
per unit for each product:

This will evaluate the expression "[price] / [quantity]" for each row in the "sales"
table and return the average value.

Example 2: Let's say we have a table called "employees" with columns "name",
"age", and "salary". We can use the following query to find the average salary
for employees under the age of 30:

This will evaluate the expression "[salary]" for each row in the "employees" table
where the age is less than 30 and return the average value.
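
Here is a Python sketch of that filtered average. The "averagex" helper and the
employee rows are hypothetical, and the DAX pattern in the comment is one
common way to write the formula.

```python
# Hypothetical "employees" rows.
employees = [
    {"name": "Ada", "age": 28, "salary": 52000},
    {"name": "Ben", "age": 41, "salary": 48000},
    {"name": "Chi", "age": 25, "salary": 61000},
]


def averagex(table, expression):
    """Evaluate expression per row and average the non-blank results."""
    values = [expression(row) for row in table]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None


# DAX pattern: AVERAGEX(FILTER(employees, employees[age] < 30), employees[salary])
under_30 = [row for row in employees if row["age"] < 30]
avg_salary_under_30 = averagex(under_30, lambda row: row["salary"])
print(avg_salary_under_30)
```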

AVERAGEA()

The AVERAGEA function returns the average of the values in a column and, unlike
AVERAGE, handles non-numeric values: TRUE counts as 1, while FALSE and
non-numeric text count as 0.

Here's the basic syntax of the AVERAGEA function: AVERAGEA(table[column])

Example 1: Let's say we have a table called "sales" with columns "product" and
"price". Some rows in the "price" column have a blank value. We can use the
following query to find the average price of all products, including any rows with
a blank value:

This will return the average price from the "price" column in the "sales" table,
including any rows with a blank value.

Example 2: Let's say we have a table called "students" with columns "name" and
"age". Some rows in the "age" column have a null value. We can use the following
query to find the average age of all students, including any rows with a null value:

This will return the average age from the "age" column in the "students" table,
including any rows with a null value.

CHAPTER THREE

DATE FUNCTIONS

CALENDAR()

The CALENDAR function returns a table with one column of dates from a
specified start date to a specified end date.

Here's the basic syntax of the CALENDAR function:
CALENDAR(start_date, end_date)

Example 1: Let's say we want to create a calendar table with dates from January
1, 2021 to December 31, 2021. We can use the following query:

This will create a new table called "Calendar" with one column of dates from
January 1, 2021 to December 31, 2021.

Example 2: Let's say we want to create a calendar table with dates from January
1, 2022 to December 31, 2022. We can use the following query:

This will create a new table called "Calendar" with one column of dates from
January 1, 2022 to December 31, 2022.
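
To picture the table that CALENDAR produces, here is a Python sketch that
builds the same run of consecutive dates. The "calendar" helper is a
hypothetical stand-in, and the DAX pattern in the comment is the usual way to
define such a table.

```python
from datetime import date, timedelta


def calendar(start, end):
    """Return every date from start to end inclusive, like a one-column table."""
    days = (end - start).days
    return [start + timedelta(days=offset) for offset in range(days + 1)]


# DAX pattern: Calendar = CALENDAR(DATE(2021, 1, 1), DATE(2021, 12, 31))
calendar_2021 = calendar(date(2021, 1, 1), date(2021, 12, 31))
print(len(calendar_2021), calendar_2021[0], calendar_2021[-1])
```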

DATE()

The DATE function returns a date value from year, month, and day values.

Here's the basic syntax of the DATE function: DATE(year, month, day)

Example 1: Let's say we want to create a date value for January 1, 2021. We can
use the following query:

This will create a new date value for January 1, 2021.

Example 2: Let's say we want to create a date value for February 14, 2022. We
can use the following query:

This will create a new date value for February 14, 2022.

DATEDIFF()

The DATEDIFF function returns the number of interval boundaries (for example
days, months, or years) crossed between two dates.

Here's the basic syntax of the DATEDIFF function:
DATEDIFF(date1, date2, interval)

Example 1: Let's say we want to find the number of days between January 1,
2021 and December 31, 2021. We can use the following query:

This will return the number of days between January 1, 2021 and December 31,
2021.

Example 2: Let's say we want to find the number of years between January 1,
1980 and January 1, 2022. We can use the following query:

This will return the number of years between January 1, 1980 and January 1,
2022.
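
The two differences can be checked with Python's standard date type. This is
only a sketch of the arithmetic; the DAX patterns in the comments are the usual
form of such formulas.

```python
from datetime import date

# DAX pattern: DATEDIFF(DATE(2021, 1, 1), DATE(2021, 12, 31), DAY)
days_between = (date(2021, 12, 31) - date(2021, 1, 1)).days

# DAX pattern: DATEDIFF(DATE(1980, 1, 1), DATE(2022, 1, 1), YEAR)
# With the YEAR interval, the result is the number of year boundaries crossed.
years_between = date(2022, 1, 1).year - date(1980, 1, 1).year

print(days_between, years_between)
```

Note that the day count is 364 rather than 365, because the difference does not
include the starting day itself.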

DATEVALUE()

The DATEVALUE function in DAX converts a date in the form of a text string
to a date value.

Here's the basic syntax of the DATEVALUE function: DATEVALUE(date_text)

Example 1: Let's say we have a table called "sales" with a column called
"order_date" that contains dates in the format "MM/DD/YYYY". We can use the
following query to convert the order dates to date values:

This will convert each date in the "order_date" column to a date value.

Example 2: Let's say we have a table called "employees" with a column called
"hire_date" that contains dates in the format "DD/MM/YYYY". We can use the
following query to convert the hire dates to date values:

This will extract the first 10 characters of each hire date in the "hire_date" column
and convert them to a date value.
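
The text-to-date conversion can be sketched in Python as follows. The
"date_value" helper and the sample strings are hypothetical. One difference to
keep in mind: DAX's DATEVALUE relies on the locale settings to interpret the
text, whereas this sketch is told the format explicitly.

```python
from datetime import datetime


def date_value(text, fmt):
    """Convert a date written as text into a date value using a known format."""
    return datetime.strptime(text, fmt).date()


# First example: text in MM/DD/YYYY form
order_date = date_value("03/25/2022", "%m/%d/%Y")

# Second example: keep the first 10 characters (like LEFT(text, 10)),
# then convert text in DD/MM/YYYY form
raw_hire_date = "25/03/2022 09:30"
hire_date = date_value(raw_hire_date[:10], "%d/%m/%Y")

print(order_date, hire_date)
```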

DAY()

The DAY function in DAX returns the day of the month from a date value.

Here's the basic syntax of the DAY function:

DAY(<date>)

Example 1: Let's say we have a table called "sales" with a column called
"order_date" that contains date values. We can use the following query to extract
the day of the month from each order date:

This will return the day of the month from each date in the "order_date" column.

Example 2: Let's say we have a table called "employees" with a column called
"hire_date" that contains date values. We can use the following query to extract
the day of the month from each hire date:

This will return the day of the month from each date in the "hire_date" column.

Note that the DAY function returns a number between 1 and 31, depending on
the day of the month.

HOUR()

The HOUR function returns the hour component of a time value.

Here's the basic syntax of the HOUR function:

HOUR(<datetime>)

Example 1: Let's say we have a table called "orders" with a column "order_time"
that contains the date and time of each order. We can use the following query to
extract the hour component of each order time:

This will return the hour component of the "order_time" column in the "orders"
table.

Example 2: Let's say we have a table called "logs" with a column "log_time" that
contains the date and time of each log entry. We can use the following query to
extract the hour component of each log time:

This will return the hour component of the "log_time" column in the "logs" table.

MINUTE()

The MINUTE function returns the minute component of a time value.

Here's the basic syntax of the MINUTE function:

MINUTE(<datetime>)

Example 1: Let's say we have a table called "orders" with a column "order_time"
that contains the date and time of each order. We can use the following query to
extract the minute component of each order time:

This will return the minute component of the "order_time" column in the "orders"
table.

Example 2: Let's say we have a table called "logs" with a column "log_time" that
contains the date and time of each log entry. We can use the following query to
extract the minute component of each log time:

This will return the minute component of the "log_time" column in the "logs"
table.

MONTH()

The MONTH function returns the month component of a date value.

Here's the basic syntax of the MONTH function:

MONTH(<datetime>)

Example 1: Let's say we have a table called "orders" with a column "order_date"
that contains the date of each order. We can use the following query to extract the
month component of each order date:

This will return the month component of the "order_date" column in the "orders"
table.

Example 2: Let's say we have a table called "invoices" with a column
"invoice_date" that contains the date of each invoice. We can use the following
query to extract the month component of each invoice date:

This will return the month component of the "invoice_date" column in the
"invoices" table.

NOW()

The NOW function returns the current date and time.

Here's the basic syntax of the NOW function:

NOW()

Example 1: Let's say we want to calculate the number of seconds since the start
of the day. We can use the following query:

This will return the number of seconds since the start of the day.

Example 2: Let's say we want to create a timestamp for a log entry. We can use
the following query:

This will return a timestamp in the format "YYYY-MM-DD HH:MM:SS".

TIME()

The TIME function returns a time value based on the hour, minute, and second
provided.

Here's the basic syntax of the TIME function:

TIME(<hour>, <minute>, <second>)

Example 1: Let's say we want to create a calculated column that returns the time
8:30 AM for all rows in a table. We can use the following formula:

This will return the time value 8:30 AM.

Example 2: Let's say we have a table called "events" with columns "name" and
"start_time". We can use the following formula to create a calculated column that
extracts the time component from the "start_time" column:

This will return the time component of the "start_time" column for each row in
the "events" table.

TODAY()

The TODAY function returns the current date.

Here's the basic syntax of the TODAY function:

TODAY()

Example 1: Let's say we want to create a calculated column that calculates the
number of days between the current date and a date column in a table. We can
use the following formula:

This will return the number of days between the "date_column" and the current
date.

Example 2: Let's say we have a table called "tasks" with columns "name" and
"due_date". We can use the following formula to create a calculated column that
indicates whether a task is overdue or not:

This will return "Overdue" if the "due_date" column is before the current date,
and "Not Overdue" otherwise.

WEEKDAY()

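The WEEKDAY function returns the day of the week as a number (1-7) based on
a date provided. By default (return type 1), Sunday is 1 and Saturday is 7; with
return type 2, Monday is 1 and Sunday is 7.

Here's the basic syntax of the WEEKDAY function:

WEEKDAY(<date>, <return_type>)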

Example 1: Let's say we want to create a calculated column that indicates
whether a date column in a table is a weekday or a weekend day. We can use the
following formula:

This will return "Weekday" if the day of the week is Monday to Friday (WEEKDAY
numbers 1-5 when the return type is 2, which counts Monday as 1), and "Weekend"
otherwise.
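
The weekday/weekend rule can be sketched in Python as a cross-check. The
standard library's date.isoweekday() numbers Monday as 1 through Sunday as 7,
which is the Monday-first numbering the explanation above relies on. The
function name day_type is a hypothetical helper, not part of DAX.

```python
from datetime import date

def day_type(d: date) -> str:
    """Classify a date using Monday=1 ... Sunday=7 numbering."""
    return "Weekday" if d.isoweekday() <= 5 else "Weekend"

print(day_type(date(2022, 1, 3)))  # a Monday   -> Weekday
print(day_type(date(2022, 1, 8)))  # a Saturday -> Weekend
```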

Example 2: Let's say we have a table called "sales" with columns "date" and
"sales_amount". We can use the following formula to create a calculated column
that calculates the average sales amount for each weekday:

This will group the "sales" table by weekday and calculate the average sales
amount for each weekday.

YEAR()

The YEAR function returns the year value based on a date provided.

Here's the basic syntax of the YEAR function:

YEAR(<date>)

Example 1: Let's say we want to create a calculated column that returns the year
component of a date column in a table. We can use the following formula:

This will return the year component of the "date_column" for each row in the
table.

Example 2: Let's say we have a table called "sales" with columns "date" and
"sales_amount".

CHAPTER FOUR

FILTER FUNCTIONS

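The FILTER function returns a table containing only the rows that meet a
condition. Its syntax is FILTER(<table>, <filter>): the first argument is the
table or table expression to be filtered, and the second argument is the
condition or criteria to be applied.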

Example 1: Suppose we have a Sales table with columns Product, Region, and
SalesAmount. We want to calculate the total sales amount for the region "West".

In this example, the FILTER function is used to filter the rows in the Sales table
based on the condition Sales[Region] = "West". The resulting filtered table is then
used in the SUM function to calculate the total sales amount for the region
"West".

Example 2: Suppose we have a Customer table with columns CustomerName,
Age, and Gender. We want to calculate the average age of female customers.

In this example, the FILTER function is used to filter the rows in the Customer
table based on the condition Customer[Gender] = "Female". The resulting filtered
table is then used in the AVERAGE function to calculate the average age of
female customers.

The FILTER function can also be used with other functions such as MAX, MIN,
COUNT, etc. to perform more complex calculations.
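
The filter-then-aggregate pattern used in these examples can be illustrated in
Python for readers who want to see the logic step by step. The rows below are
made-up sample data, and this is an analogy to the DAX behaviour rather than a
DAX formula.

```python
# Hypothetical sample rows standing in for the Sales table
sales = [
    {"Product": "A", "Region": "West", "SalesAmount": 100},
    {"Product": "B", "Region": "East", "SalesAmount": 250},
    {"Product": "C", "Region": "West", "SalesAmount": 50},
]

# Step 1: keep only the rows where Region is "West" (the FILTER step)
west_rows = [row for row in sales if row["Region"] == "West"]

# Step 2: aggregate over the filtered rows (the SUM, MAX and COUNT steps)
total_west = sum(row["SalesAmount"] for row in west_rows)
max_west = max(row["SalesAmount"] for row in west_rows)
count_west = len(west_rows)

print(total_west, max_west, count_west)  # 150 100 2
```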

Example 3: Suppose we have a Product table with columns ProductName,
Category, and Price. We want to calculate the highest price for each category.

In this example, the FILTER function is used to filter the rows in the Product
table based on the condition Product[Category] = EARLIER(Product[Category]).
The EARLIER function is used to refer to the current category in the outer
MAXX function. The resulting filtered table is then used in the inner MAX
function to calculate the highest price for each category. The MAXX function is
used to iterate through each distinct category and return the maximum price.

Example 4: Suppose we have a Sales table with columns Product, Region,
Month, and SalesAmount. We want to calculate the total sales amount for each
product in the region "West" and month "January".

In this example, the FILTER function is used to filter the rows in the Sales table
based on the conditions Sales[Region] = "West" and MONTH(Sales[Month]) = 1.

The resulting filtered table is then used in the SUM function to calculate the total
sales amount for each product in the region "West" and month "January". The
MONTH function is used to extract the month from the Sales[Month] column.
The && operator is used to specify multiple conditions in the FILTER function.

CALCULATE()

The CALCULATE function in DAX is used to evaluate an expression in a
modified filter context. It takes an expression and a set of filter arguments that
define the filter context to apply before evaluating the expression.

The function is used to modify or override the current filter context for a given
expression.

Example 1: Calculate Total Sales for only California State

In the above example, we are calculating the total sales for only California state
by using the CALCULATE function. The expression we want to evaluate is
SUM(Sales[Sales Amount]) and the filter argument we want to apply is
Sales[State] = "California".

Example 2: Calculate Sales for a specific period

In the above example, we are calculating the total sales for a specific period using
the CALCULATE function. The expression we want to evaluate is
SUM(Sales[Sales Amount]) and the filter argument we want to apply is
FILTER(Sales, Sales[Date] >= DATE(2022,1,1) && Sales[Date] <=
DATE(2022,3,31)).

ALL()

The ALL function in DAX is used to remove one or more filters from the current
context. It takes one or more columns or tables as arguments, and returns a table
that contains all the rows from the original table, without any filters applied to
them.

Example 1: Remove filter from a single column

In the above example, we are calculating the total sales without any filter applied
to the "State" column using the ALL function.

Example 2: Remove filter from multiple columns

In the above example, we are calculating the total sales without any filter applied
to the "State" and "Product" columns using the ALL function.

ALLEXCEPT()

The ALLEXCEPT function in DAX is used to remove all filters from the current
context except for the ones specified in the function. It takes a table as its
first argument, followed by one or more columns of that table whose filters
should be kept, and returns a table that contains all the rows from the original
table except those filtered out by the specified columns.

Example 1: Remove filter from all columns except for "State"

In the above example, we are calculating the total sales without any filter applied
to all columns except for the "State" column using the ALLEXCEPT function.

Example 2: Remove filter from all columns except for "State" and "Product"

In the above example, we are calculating the total sales without any filter applied
to all columns except for the "State" and "Product" columns using the
ALLEXCEPT function.

ALLSELECTED()

The ALLSELECTED function in DAX removes the filters that come from the rows and
columns of the current visual while keeping all other filters, such as slicer
selections or report-level filters. The function is useful when you want a total
that reflects everything the user has selected, rather than only the single row
or column being evaluated.

Here are two examples that demonstrate how to use the ALLSELECTED function
in DAX:

Example 1: Suppose you have a Sales table that contains data about sales made
by your company. You want to create a measure that shows the total sales for the
selected category and subcategory, along with the sales for all the other categories
and subcategories. You can use the following DAX expression to accomplish
this:

In this example, the ALLSELECTED function is used to remove the filters that the
visual's rows and columns place on the Category and Subcategory columns in the
Sales table. This ensures that the measure calculates the total sales across all
the categories and subcategories included in the user's selection, rather than
only the one in the current row.

Example 2: Suppose you have a PivotTable that shows the total sales for each
product category and subcategory. You want to add a calculated field to the
PivotTable that shows the percentage of total sales for each category and
subcategory, based on the user's selections. You can use the following DAX
expression to accomplish this:

In this example, the ALLSELECTED function is used to remove the filters applied
by the rows and columns of the PivotTable, so that the percentage is calculated
against the total sales for everything the user has selected (the visual total)
rather than the grand total of the whole table.

Overall, the ALLSELECTED function is a powerful tool for working with filters
in DAX, and it can be used in many different scenarios where you need totals or
percentages that respect the user's selections.

KEEPFILTERS()

The KEEPFILTERS function changes how filter arguments are applied in CALCULATE
and CALCULATETABLE. Normally, a filter argument replaces any existing filter on
the same column; wrapped in KEEPFILTERS, the new filter is combined with the
existing one, so only rows that satisfy both are kept.

The syntax for the KEEPFILTERS function is as follows:

KEEPFILTERS(<expression>)

Example 1: Suppose you have two tables - Sales and Products, and you want to
calculate the average price of products sold in each region. However, you only
want to consider the products that were sold in the current year.

You can use the KEEPFILTERS function to keep the filter applied to the Sales
table while evaluating the expression that involves the Products table, as shown
below:

This expression calculates the average price of products sold in the current year,
while keeping the filter applied to the Sales table.

Example 2: Suppose you have a measure that calculates the total sales for each
product category, and you want to calculate the percentage contribution of each
product category to the total sales. You can use the KEEPFILTERS function to
keep the filter applied to the product category while evaluating the expression
that involves the total sales, as shown below:

This expression calculates the percentage contribution of each product category
to the total sales, while keeping the filter applied to the product category.

LOOKUPVALUE()

The LOOKUPVALUE function is used to return the value of a column from a
table that matches a specified value in another column. It is commonly used in
scenarios where you want to retrieve a value from one table and use it in a
calculation involving another table.

The syntax for the LOOKUPVALUE function is as follows:

LOOKUPVALUE(<result_column>, <search_column>, <search_value>[, <search_column>,
<search_value>]...[, <alternate_result>])

Example 1: Suppose you have a Sales table that contains information about sales
transactions, and a Products table that contains information about the products
sold. You want to retrieve the product name for a specific product ID and use it
in a report. You can use the LOOKUPVALUE function to retrieve the product
name from the Products table, as shown below:

This expression retrieves the product name for product ID "P001" from the
Products table. If the product ID is not found in the Products table, the function
returns "Unknown".

Example 2: Suppose you have a Sales table that contains information about sales
transactions, and a Calendar table that contains information about dates. You
want to calculate the sales amount for a specific date and use it in a report. You
can use the LOOKUPVALUE function to retrieve the sales amount for the
specified date from the Sales table, as shown below:

This expression retrieves the sales amount for the current date from the Sales
table. If the current date is not found in the Sales table, the function returns 0.

ERROR()

The ERROR function in DAX raises an error with a message that you specify. This
can be useful when you want to handle errors in a specific way or show custom
error messages in your DAX formulas.

Example 1: Suppose you have a measure that divides two values, but there is a
possibility that the denominator may be zero.

To handle this situation, you can use the ERROR function to return a custom error
message.

Example 2: Suppose you have a measure that looks up a value in a table, but
there is a possibility that the lookup value does not exist in the table.

To handle this situation, you can use the ERROR function to return a custom error
message.

BLANK()

The BLANK function in DAX returns a blank value. This can be useful when
you want to handle empty or null values in a specific way or show custom
messages for missing data.

Example 1: Suppose you have a measure that calculates the average sales for a
product category, but there are no sales data available for some categories. To
handle this situation, you can use the BLANK function to return a blank value.

Example 2: Suppose you have a measure that calculates the year-to-date sales
for a product, but there are no sales data available for the current year.

To handle this situation, you can use the BLANK function to return a blank value.

CHAPTER FIVE

IF AND SWITCH STATEMENT

IF and SWITCH are two important functions in DAX that are used to perform
conditional calculations.

IF()

The IF function in DAX is similar to the IF function in Excel. It evaluates a
logical test and returns a value if the test is true, and another value if the test is
false. The syntax of the IF function is as follows:

IF(<logical_test>, <value_if_true>[, <value_if_false>])

Example 1: Suppose we have a table of sales data with columns for Region,
Product, and Sales Amount. We want to create a calculated column that returns
"High" if the sales amount is greater than 1000, and "Low" if the sales amount is
less than or equal to 1000.

Example 2: Suppose we have a table of employee data with columns for
Employee Name and Salary. We want to create a calculated column that returns
"Above Average" if the employee's salary is greater than the average salary of all
employees, and "Below Average" if it is less than or equal to the average.

SWITCH()

The SWITCH function in DAX evaluates an expression against a list of values and
returns the result paired with the first value that matches. It is similar to a
nested IF statement, but easier to read and maintain. The syntax of the SWITCH
function is as follows:

SWITCH(<expression>, <value>, <result>[, <value>, <result>]...[, <else>])

Example 1: Suppose we have a table of sales data with columns for Product and
Sales Amount. We want to create a calculated column that returns a commission
rate based on the product sold. We can use the SWITCH function to evaluate each
product and return the appropriate commission rate.
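
The idea of mapping each product to a commission rate, with a fallback when
nothing matches, can be sketched in Python with a dictionary lookup. This is an
analogy rather than DAX, and the product names and rates are invented for
illustration.

```python
# Hypothetical commission rates per product
COMMISSION_RATES = {"Laptop": 0.05, "Phone": 0.03, "Tablet": 0.04}
DEFAULT_RATE = 0.01  # plays the role of SWITCH's final "else" result

def commission_rate(product: str) -> float:
    """Return the rate for a product, or the default if it is not listed."""
    return COMMISSION_RATES.get(product, DEFAULT_RATE)

print(commission_rate("Phone"))    # 0.03
print(commission_rate("Monitor"))  # 0.01
```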

Example 2: Suppose we have a table of customer data with columns for Customer
Name, Age, and Gender. We want to create a calculated column that categorizes
the customers into different age groups based on their gender. We can use the
SWITCH function to evaluate the customer's age and gender and return the
appropriate age group.

CHAPTER SIX

POWER QUERY

Power Query is a data transformation and preparation tool that is part of the
Microsoft Power BI suite. It allows users to easily extract, transform, and load
(ETL) data from a wide variety of sources into a structured format for analysis
and reporting. Power Query is built into Power BI Desktop and recent versions of
Microsoft Excel (as Get & Transform), and was offered as a separate add-in for
Excel 2010 and 2013.

Here are some of the key features of Power Query:

1. Data Sources: Power Query supports numerous data sources, including
Excel workbooks, CSV files, SQL Server databases, SharePoint lists, web
pages, and many others. It also supports various file formats, such as JSON,
XML, and HTML.

2. Data Import and Transformation: Power Query provides a wide range
of transformation options that can be applied to imported data, such as
removing duplicates, filtering, sorting, grouping, pivoting, and merging
tables. Additionally, Power Query allows users to create custom
transformations using a functional language called "M."

3. Query Editor: Power Query's Query Editor is a visual interface that allows
users to build and refine queries using drag-and-drop functionality. It also
includes a formula bar that allows users to create advanced transformations
and custom formulas using the M language.

4. Data Modelling: Power Query integrates seamlessly with Power BI's data
modelling features, allowing users to create relationships between tables,
define hierarchies, and create calculated columns and measures using the
DAX language.

5. Data Cleansing and Profiling: Power Query includes tools for identifying
and cleaning data quality issues, such as incomplete data, missing values,
and inconsistent formatting. It also provides data profiling features that
allow users to identify patterns and anomalies in their data.

6. Data Mashups: Power Query's data mashup capabilities allow users to
combine and merge data from multiple sources and transform them into a
single, unified dataset for analysis.

Here are some examples of how Power Query can be used:

1. Combining Data from Multiple Sources: Suppose you have data stored
in different sources, such as an Excel workbook and a SQL Server
database. Power Query can be used to combine the data from these sources
into a single dataset by importing the data and merging the tables.

2. Cleaning and Transforming Data: Suppose you have a large dataset that
contains missing values, duplicate rows, and inconsistent formatting.
Power Query can be used to clean and transform the data by removing the
missing values, deduplicating the rows, and standardizing the formatting.

3. Automated Data Refresh and Load: Suppose you have a report that
needs to be updated with the latest data on a regular basis. Power Query
can be used to automate the data refresh and load process by creating a
query that connects to the data source and refreshes the data at a specific
time or interval.

Overall, Power Query is a powerful and versatile tool that can help users to
streamline their data workflows, reduce manual data entry errors, and improve
the accuracy and reliability of their data.

The Power Query UI consists of several key components:

1. Ribbon: The ribbon contains tabs and groups that provide access to
various Power Query functions, such as data source connectors, data
transformation operations, and query settings.

2. Query Editor: The Query Editor is the main workspace for building
queries. It allows users to view, transform, and preview data, as well as
manage query settings and properties.

3. Navigation pane: The navigation pane displays a list of queries and query
groups in the current workbook. It allows users to select, manage, and
organize queries.

4. Query Settings pane: The Query Settings pane displays a list of query
properties and settings, such as data source credentials, query options, and
metadata.

5. Preview pane: The Preview pane displays a preview of the data in the
current query or transformation step. It allows users to inspect and verify
the results of data transformations.

6. Formula bar: The formula bar displays the Power Query formula that
corresponds to the current query or transformation step. It allows users to
edit, create, or copy formulas.

7. Applied Steps pane: The Applied Steps pane displays a list of applied
transformations in the current query. It allows users to view, modify, and
delete transformation steps.

The Power Query UI is designed to be intuitive and user-friendly, with
drag-and-drop functionality, visual data previews, and easy-to-use query
settings. Users can easily connect to a variety of data sources, apply
transformations, and load the data into Power BI for analysis and visualization.

Data transformation in Power Query

Data transformation is an important step in the process of preparing data for
analysis in Power BI. It involves cleaning, shaping, and manipulating the data in
various ways to ensure that it is in a suitable form for analysis. In this process,
the data is transformed from its raw form to a form that is more useful for analysis.

Here are some of the most common data transformation techniques used in Power
BI, along with examples:

1. Filtering data: This involves removing unwanted data from a dataset based
on specific criteria. For example, you may want to filter out all sales data
for a certain product category that is not relevant to your analysis.

2. Sorting data: This involves arranging the data in a dataset in a specific
order, such as by date, alphabetically, or by numerical value. For example,
you may want to sort a list of customers by their total purchases.

3. Merging data: This involves combining data from multiple tables into one.
For example, you may have one table that contains customer information
and another that contains order information. You can merge these tables to
create a single table that contains both sets of information.

4. Splitting data: This involves dividing a single column of data into multiple
columns based on a specific delimiter or pattern. For example, you may
want to split a column of customer names into separate columns for first
name and last name.

5. Grouping data: This involves aggregating data based on specific criteria,
such as summing sales data by month or counting the number of orders by
customer. For example, you may want to group sales data by product
category to see which categories are generating the most revenue.

6. Calculating new columns: This involves creating new columns of data
based on calculations or formulas. For example, you may want to create a
new column that calculates the profit margin for each sale.

7. Renaming columns: This involves giving columns more descriptive names
to make them easier to understand. For example, you may want to rename
a column from "col1" to "customer ID".
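
Several of the transformations listed above (filtering, calculating a new
column, sorting, and grouping) can be illustrated with plain Python. This is not
Power Query's M language, just a sketch of the same ideas on invented sample
rows; all names and values are assumptions.

```python
from collections import defaultdict

# Hypothetical sales rows
rows = [
    {"category": "Toys", "revenue": 120.0, "cost": 80.0},
    {"category": "Books", "revenue": 60.0, "cost": 45.0},
    {"category": "Toys", "revenue": 200.0, "cost": 150.0},
    {"category": "Games", "revenue": 0.0, "cost": 0.0},
]

# Filtering: drop rows with no revenue
filtered = [r for r in rows if r["revenue"] > 0]

# Calculating a new column: profit margin for each sale
for r in filtered:
    r["margin"] = (r["revenue"] - r["cost"]) / r["revenue"]

# Sorting: highest revenue first
by_revenue = sorted(filtered, key=lambda r: r["revenue"], reverse=True)

# Grouping: total revenue per category
revenue_by_category = defaultdict(float)
for r in filtered:
    revenue_by_category[r["category"]] += r["revenue"]

print(by_revenue[0]["category"])   # Toys
print(dict(revenue_by_category))   # {'Toys': 320.0, 'Books': 60.0}
```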

CHAPTER SEVEN

DATA MODELLING IN POWER BI

Data modelling in Power BI refers to the process of organizing and structuring
data to create meaningful relationships between different tables, allowing for
efficient data analysis and visualization. It involves identifying the sources of
data, cleaning, transforming, and shaping the data, creating relationships between
different tables, and defining measures and calculations.

The following are the key steps in data modelling in Power BI:

1. Identify data sources: Identify the sources of data and import them into
Power BI.

2. Clean and transform data: Clean and transform the data as required to
ensure that it is accurate and consistent. This may include removing
duplicates, filling in missing values, and formatting data.

3. Shape data: Create a data model by shaping the data to create tables and
columns. This involves creating calculated columns, grouping data, and
pivoting data.

4. Create relationships: Create relationships between different tables in the
data model. This involves defining primary keys and foreign keys and
setting up relationships between the tables.

5. Define measures: Define measures to perform calculations on the data.
Measures can be created using the DAX (Data Analysis Expressions)
language.

6. Test the model: Test the model to ensure that it is functioning as expected.

PYTHON

CHAPTER ONE

KEY TERMS IN PYTHON

Python has become one of the most popular programming languages in recent
years, due in part to its versatility, readability, and large community of
developers. It can be used for a wide variety of applications, including web
development, data analysis, scientific computing, artificial intelligence, and
more.

1. Variables: A variable is a name that refers to a value in Python. Variables
can store data types such as integers, floating-point numbers, strings, lists,
tuples, dictionaries, and more.
2. Data Types: Python supports various data types, including integers,
floating-point numbers, strings, booleans, lists, tuples, sets, and
dictionaries. Understanding the different data types and how to use them is
fundamental in Python programming.
3. Operators: Operators are symbols or keywords used to perform
arithmetic, logical, and comparison operations in Python. Examples
include +, -, *, /, %, >, <, ==, !=, and more.
4. Functions: A function is a block of code that performs a specific task. In
Python, functions are defined using the def keyword and can take
arguments and return values.
5. Control Structures: Python provides various control structures like if/else
statements, loops, and conditional expressions. These structures allow you
to control the flow of execution in a program.
6. Modules: A module is a file containing Python code that can be imported
into other Python programs. Python provides a wide range of modules for
performing different tasks like data manipulation, scientific computing,
web development, and more.
7. Classes and Objects: Python is an object-oriented programming language,
which means that it supports classes and objects. A class is a blueprint for
creating objects, and an object is an instance of a class.
8. Exception Handling: Exception handling is a way of dealing with errors
in Python. Python provides a try/except block for catching and handling
exceptions that may occur during program execution.
9. Packages: Packages are a collection of modules and other resources that
provide additional functionality to Python. Packages can be installed using
the pip command and are an essential part of Python programming.
10. Iterators and Generators: Iterators are objects that allow you to traverse
a sequence of values one by one. Generators are functions that allow you
to create iterators on the fly. Understanding how to use iterators and
generators is essential for efficient and memory-friendly programming.

CHAPTER TWO

VARIABLES

In Python, variables are used to store data values. A variable is a name that refers
to a specific value, which can be of any data type. To create a variable, you simply
give it a name and assign a value to it using the assignment operator (=).

Here's an example:

In this example, we're creating a variable called "x" and assigning the value 5 to
it. We can then use the variable "x" in our code to refer to this value.
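
A minimal sketch of the assignment just described:

```python
x = 5      # create a variable named x and assign it the value 5
print(x)   # 5
```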

Python variable names must follow certain rules:

• Variable names can contain letters, numbers, and underscores.

• Variable names must start with a letter or underscore.

• Variable names are case-sensitive.

• Variable names cannot be the same as a Python keyword (e.g. "if", "else",
"while", etc.)

Here are some examples of valid variable names in Python:

And here are some examples of invalid names:
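
Since the naming rules above are mechanical, they can also be checked in code.
The sketch below uses str.isidentifier() and the standard keyword module. The
sample names are illustrative, and is_valid_name is a hypothetical helper
introduced only for this example.

```python
import keyword

def is_valid_name(name: str) -> bool:
    """True if name is a legal identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

# The first three names are valid; the last three break a rule
for name in ["my_var", "_count", "total2", "2total", "my-var", "while"]:
    print(name, is_valid_name(name))
```

Here "2total" starts with a digit, "my-var" contains a hyphen, and "while" is a
Python keyword, so all three are rejected.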

You can assign any data type to a variable in Python.

Here are some examples:

In Python, there are different types of variables that can be used to store different
kinds of data. Some of the most common variable types are:

• Integer: stores whole numbers, such as 5, 10, 1000, etc.

• Float: stores decimal numbers, such as 3.14, 0.5, etc.

• String: stores text or characters, such as "Hello", "world", "Python", etc.

• Boolean: stores a value that is either True or False.



Here are some examples of creating variables with different data types:
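
A short sketch covering the four types listed above; the variable names and
values are illustrative:

```python
age = 25             # integer
price = 3.14         # float
name = "Python"      # string
is_active = True     # boolean

print(type(age).__name__, type(price).__name__,
      type(name).__name__, type(is_active).__name__)
# int float str bool
```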

You can also change the value of a variable during runtime by assigning a new
value to it.

For example:

In the above code, the variable x is first assigned the value 10, and then it is
assigned the value 20. When print() is called, it outputs the current value
of x.
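
A minimal sketch of the reassignment described above:

```python
x = 10     # x refers to 10
x = 20     # x now refers to 20
print(x)   # 20
```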

Note: anything after # is a comment and will be ignored by the interpreter.

Overall, variables are an important part of programming in Python, as they allow
you to store and manipulate data values in your code.

CHAPTER THREE

DATA TYPES

In Python, a data type is a classification of the type of data that a variable can
store.

Here are the most used data types in Python:

Integer: An integer is a whole number, without a decimal point. In Python,
integers are represented by the "int" data type. For example:

Float: A float is a number with a decimal point. In Python, floats are represented
by the "float" data type. For example:

Boolean: A boolean value is either True or False. In Python, boolean values
are represented by the "bool" data type. For example:

String: A string is a sequence of characters. In Python, strings are represented by
the "str" data type. Strings can be enclosed in single or double quotes. For
example:

List: A list is an ordered collection of items. In Python, lists are represented by
the "list" data type. Lists can contain values of different data types. For example:

Tuple: A tuple is an ordered collection of items, similar to a list. However, tuples
are immutable, meaning their values cannot be changed. In Python, tuples are
represented by the "tuple" data type. For example:

Set: A set is an unordered collection of unique items. In Python, sets are
represented by the "set" data type. For example:

Dictionary: A dictionary is a collection of key-value pairs in which each key is
unique. Since Python 3.7, dictionaries preserve the order in which keys were
inserted. In Python, dictionaries are represented by the "dict" data type. For
example:
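The eight types described above can be sketched together as follows. All names and values are hypothetical and were chosen only to show one literal of each type.

```python
count = 42                           # int
ratio = 3.14                         # float
is_ready = True                      # bool
greeting = 'Hello'                   # str (single or double quotes both work)
mixed = [1, "two", 3.0]              # list: ordered, may mix data types
point = (3, 4)                       # tuple: ordered and immutable
unique_ids = {1, 2, 2, 3}            # set: the duplicate 2 is stored only once
person = {"name": "Ada", "age": 36}  # dict: key-value pairs

for value in (count, ratio, is_ready, greeting, mixed, point, unique_ids, person):
    print(type(value).__name__, value)

# Tuples cannot be changed after creation
try:
    point[0] = 10
except TypeError:
    print("tuples are immutable")
```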

These are some of the most common data types in Python, but there are others as
well, such as complex numbers and bytes. Understanding data types is important
in programming, as it allows you to use the right type of variable for the data you
are working with, which can improve performance and prevent errors.

CHAPTER FOUR

LIST COMPREHENSION

List comprehension is a concise way to create a new list by applying an
expression to each element of an existing list or other iterable. It lets you
build the new list in a single line, rather than writing a multi-line loop.
Here's how it works:

The syntax for list comprehension is as follows:

[expression for item in iterable if condition]

• expression: The expression that is applied to each element of the iterable.

• item: The variable that represents each element of the iterable.

• iterable: The existing list or other iterable to be used.

• condition: The optional condition that filters the elements of the iterable.
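To make the four parts concrete, here is a small sketch that uses all of them. The list words and its contents are assumptions for illustration.

```python
# General form: [expression for item in iterable if condition]
words = ["sql", "excel", "python"]  # the iterable (hypothetical data)

# expression: word.upper()   item: word   condition: len(word) > 3
long_words_upper = [word.upper() for word in words if len(word) > 3]

print(long_words_upper)  # prints ['EXCEL', 'PYTHON']
```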

Here's an example that creates a new list of squares of the numbers in an existing
list:

Output:

In the above example, the expression num ** 2 is applied to each element num of
the iterable my_list, and a new list squares is created with the squared values. The
same result can be achieved using loops, but list comprehension makes the code
more concise and easier to read.
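A sketch of the squares example described above, next to the equivalent loop. The contents of my_list are an assumption, since the text only names the variable.

```python
my_list = [1, 2, 3, 4, 5]  # assumed values

# List comprehension: apply num ** 2 to each element
squares = [num ** 2 for num in my_list]
print(squares)  # prints [1, 4, 9, 16, 25]

# The same result written as an ordinary loop
squares_loop = []
for num in my_list:
    squares_loop.append(num ** 2)
```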

Here's another example that creates a new list of even numbers from an existing
list:

Output:

In this example, the condition num % 2 == 0 filters out the odd numbers from the
iterable, and the new list evens is created with only the even numbers. Again, the
same result can be achieved using loops, but list comprehension makes the code
more concise and easier to read.
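The filtering example above might look like this. Again, the values in my_list are assumed.

```python
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # assumed values

# Keep only the numbers whose remainder after dividing by 2 is zero
evens = [num for num in my_list if num % 2 == 0]
print(evens)  # prints [2, 4, 6, 8, 10]
```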

CHAPTER FIVE

OPERATORS IN PYTHON

Operators in Python are symbols or keywords that are used to perform operations
on variables and values. Python supports a wide range of operators, including
arithmetic operators, comparison operators, logical operators, and more.

ARITHMETIC OPERATORS

Arithmetic operators are used to perform mathematical operations such as
addition, subtraction, multiplication, division, modulus, and exponentiation. The
following are examples of arithmetic operators in Python:
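A short sketch of the arithmetic operators, using two assumed operands. Floor division (//) is included as well, although the text above does not list it.

```python
a, b = 7, 3  # assumed operands

print(a + b)   # addition: 10
print(a - b)   # subtraction: 4
print(a * b)   # multiplication: 21
print(a / b)   # division: always returns a float
print(a // b)  # floor division: 2
print(a % b)   # modulus (remainder): 1
print(a ** b)  # exponentiation: 343
```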

COMPARISON OPERATORS

Comparison operators are used to compare two values or variables and return a
Boolean value (True or False) based on the comparison. The following are
examples of comparison operators in Python:
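The comparison operators can be sketched as follows, with assumed operands:

```python
a, b = 7, 3  # assumed operands

print(a == b)  # equal to: False
print(a != b)  # not equal to: True
print(a > b)   # greater than: True
print(a < b)   # less than: False
print(a >= b)  # greater than or equal to: True
print(a <= b)  # less than or equal to: False
```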

LOGICAL OPERATORS

Logical operators are used to combine two or more Boolean expressions and
return a Boolean value based on the result of the logical operation. The following
are examples of logical operators in Python:
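A minimal sketch of the three logical operators (and, or, not). The variable names are hypothetical.

```python
is_adult = True  # hypothetical flags
has_id = False

print(is_adult and has_id)  # True only if both are True: False
print(is_adult or has_id)   # True if at least one is True: True
print(not is_adult)         # inverts the value: False
```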

ASSIGNMENT OPERATORS
Assignment operators are used to assign values to variables. The following are
examples of assignment operators in Python:
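The sketch below walks one variable through the common assignment operators. The starting value is an assumption.

```python
total = 10   # =   assign:              10
total += 5   # +=  add and assign:      15
total -= 3   # -=  subtract and assign: 12
total *= 2   # *=  multiply and assign: 24
total /= 4   # /=  divide and assign:   6.0
total %= 4   # %=  modulus and assign:  2.0
total **= 3  # **= power and assign:    8.0
total //= 3  # //= floor-divide:        2.0
print(total)
```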

BITWISE OPERATORS

Bitwise operators are used to perform bitwise operations on binary numbers. The
following are examples of bitwise operators in Python:
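A sketch of the bitwise operators on two assumed integers, written in binary so the effect on each bit is visible:

```python
a = 0b1100  # 12 (assumed operand)
b = 0b1010  # 10 (assumed operand)

print(a & b)   # AND: bits set in both -> 8 (0b1000)
print(a | b)   # OR: bits set in either -> 14 (0b1110)
print(a ^ b)   # XOR: bits set in exactly one -> 6 (0b0110)
print(~a)      # NOT: inverts all bits -> -13
print(a << 1)  # left shift by one bit -> 24
print(a >> 2)  # right shift by two bits -> 3
```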

MEMBERSHIP OPERATORS

Membership operators are used to test whether a value or variable is a member of
a sequence or collection. The following are examples of membership operators in
Python:
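The two membership operators, in and not in, can be sketched like this. The list contents are assumed.

```python
fruits = ["apple", "banana", "cherry"]  # assumed data

print("apple" in fruits)      # True: "apple" is an element of the list
print("mango" not in fruits)  # True: "mango" is not in the list
print("Py" in "Python")       # True: works on substrings of a string too
```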

CHAPTER SIX

CONTROL STRUCTURES

Python supports several control structures, including if-else statements, loops (for
and while), and functions.

Control structures are used in Python to control the flow of execution of a
program.

The most used control structures in Python include:

CONDITIONAL STATEMENTS

Conditional statements allow the program to make decisions based on certain
conditions. The most common conditional statements in Python are if, elif, and
else.

a) if STATEMENT

The if statement is used to execute a block of code if a certain condition is true.
Here's an example:

Output: You are eligible to vote!

In this example, the program checks if the age variable is greater than or equal to
18. If the condition is true, it prints "You are eligible to vote!".
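A sketch of the check described above. The value of age is an assumption; any value of 18 or more produces the output shown.

```python
age = 20  # assumed value

if age >= 18:
    print("You are eligible to vote!")
```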

b) elif STATEMENT

The elif statement is used to add more conditions to the if statement. It stands for
"else if". Here's an example:

Output: You are not old enough to vote yet.

In this example, the program first checks if the age variable is greater than or
equal to 18. If the condition is false, it moves on to the elif statement and checks
if the age is equal to 17. If this condition is true, it prints "You can vote next
year!". If both the if and elif conditions are false, it prints "You are not old enough
to vote yet.".
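The three branches described above can be sketched as follows. The value of age is assumed, and the message is stored in a variable before printing so that it is easy to inspect.

```python
age = 15  # assumed value; below 17, so the final branch runs

if age >= 18:
    message = "You are eligible to vote!"
elif age == 17:
    message = "You can vote next year!"
else:
    message = "You are not old enough to vote yet."

print(message)  # prints: You are not old enough to vote yet.
```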

c) else STATEMENT

The else statement is used to execute a block of code if none of the conditions in
the if or elif statements are true. Here's an example:

Output: You are eligible to vote!

In this example, if the age variable is greater than or equal to 18, it prints "You
are eligible to vote!". Otherwise, it prints "You are not old enough to vote yet.".
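A minimal sketch of the if-else pair described above, with an assumed age:

```python
age = 18  # assumed value

if age >= 18:
    message = "You are eligible to vote!"
else:
    message = "You are not old enough to vote yet."

print(message)  # prints: You are eligible to vote!
```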

LOOPS

Loops are used to repeat a block of code multiple times. There are two types of
loops in Python: for loop and while loop.

a) for LOOP

The for loop is used to iterate over a sequence (such as a list, tuple, or string) and
execute a block of code for each item in the sequence. Here's an example:

Output:

apple

banana

cherry

In this example, the program iterates over the fruits list and prints each fruit.
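The loop that produces the output above can be sketched as:

```python
fruits = ["apple", "banana", "cherry"]

# The loop variable fruit takes each item of the list in turn
for fruit in fruits:
    print(fruit)
```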

b) while LOOP

The while loop is used to execute a block of code repeatedly as long as a certain
condition is true. Here's an example:
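A small sketch of a while loop. The counter name and the limit of 5 are assumptions made for illustration.

```python
count = 1  # assumed starting value

# Repeat as long as the condition count <= 5 holds
while count <= 5:
    print(count)
    count += 1  # without this update the loop would never end
```

The loop prints the numbers 1 to 5 and stops once count reaches 6.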

CHAPTER SEVEN

TRANSFER STATEMENT
A transfer statement is used to transfer the flow of control from one part of the
code to another. There are three types of transfer statements in Python: break,
continue, and return.

BREAK STATEMENT

The break statement is used to terminate a loop statement (for or while) before its
normal completion. When a break statement is encountered inside a loop, the loop
is immediately terminated, and the program control is transferred to the next
statement after the loop.

Example:

Output:

In this example, we have a for loop that iterates from 0 to 9. Inside the loop, we
have an if statement that checks if the current value of i is equal to 5. If it is, the
break statement is executed, and the loop is terminated. Therefore, the output only
shows the values of i from 0 to 4.
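The loop described above can be sketched as follows. The list printed is an addition that records what was output.

```python
printed = []  # record of the values that were printed

for i in range(10):   # i takes the values 0 to 9
    if i == 5:
        break         # leave the loop as soon as i reaches 5
    print(i)
    printed.append(i)
```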

CONTINUE STATEMENT

The continue statement is used to skip the current iteration of a loop (for or while)
and continue with the next iteration. When a continue statement is encountered
inside a loop, the program control is transferred to the next iteration of the loop.

Example:

In this example, we have a for loop that iterates from 0 to 9. Inside the loop, we
have an if statement that checks if the current value of i is equal to 5. If it is, the
continue statement is executed, and the current iteration of the loop is skipped.
Therefore, the output shows the values of i from 0 to 4 and from 6 to 9.
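A sketch of the loop described above. As before, the list printed is an addition that records the output.

```python
printed = []  # record of the values that were printed

for i in range(10):   # i takes the values 0 to 9
    if i == 5:
        continue      # skip the rest of this iteration only
    print(i)
    printed.append(i)
```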

RETURN STATEMENT

The return statement is used to return a value from a function. When a return
statement is encountered inside a function, the function is terminated, and the
program control is transferred back to the caller with the returned value.

Example:

In this example, we have a function called add that takes two arguments a and b
and returns their sum. The function is called with arguments 2 and 3, and the
returned value is stored in the variable result. Finally, the value of result is printed,
which is 5.
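The function described above can be sketched as:

```python
def add(a, b):
    """Return the sum of a and b."""
    return a + b  # hands the value back to the caller and ends the function

result = add(2, 3)
print(result)  # prints 5
```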

EXCEPTION HANDLING

Exception handling is a mechanism in Python that allows us to handle errors and
exceptions that occur during program execution. Python provides a try-except
block for exception handling.

The try block contains the code that might raise an exception, and the except
block contains the code that is executed if an exception is raised. The except block
can also specify the type of exception to handle.

Example:
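A minimal sketch of a try-except block, assuming a division by zero as the failing operation:

```python
try:
    result = 10 / 0  # this line raises ZeroDivisionError
except ZeroDivisionError as error:
    # Runs only when the try block raises this specific exception type
    result = None
    print(f"Cannot divide: {error}")
```

Naming the exception type (ZeroDivisionError) keeps the handler from silently catching unrelated errors.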

CHAPTER EIGHT

OBJECT-ORIENTED PROGRAMMING

FUNCTIONS

A function is a block of organized, reusable code that performs a specific task.
Functions help in modularizing code, which makes code easy to read, understand,
and maintain.

A function in Python has the following basic syntax:

Here, def keyword is used to define the function, followed by the function_name
and parameters in parentheses. The Docstring is an optional string literal that
describes what the function does. Finally, the return statement is used to return
the output of the function.
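The parts named above fit together roughly like this. The body is a placeholder and the names function_name, parameters, and output are generic stand-ins.

```python
def function_name(parameters):
    """Optional docstring describing what the function does."""
    output = parameters  # placeholder body; real code would do some work here
    return output        # send the result back to the caller

print(function_name("example"))  # prints example
```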

Let's look at an example to see how a function is defined and called in Python:

In the above example, we have defined a function greet() that takes a name
parameter and greets the person by printing a message on the console. We have
then called this function by passing the name John as an argument.
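A sketch of the greet() function described above. The exact wording of the greeting is an assumption.

```python
def greet(name):
    """Print a greeting for the given name."""
    message = f"Hello, {name}!"  # assumed wording
    print(message)
    return message  # returned as well so the result can be inspected

greeting = greet("John")  # prints: Hello, John!
```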

Python allows functions to have default parameter values as well. Let's see an
example of a function with default parameter value:

In the above example, we have defined a function greet() that takes a name
parameter with a default value of "Anonymous". If the name parameter is not
provided while calling the function, it will use the default value.
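The default parameter value can be sketched as follows. The greeting text is again an assumption.

```python
def greet(name="Anonymous"):
    """Print a greeting; name falls back to "Anonymous" when omitted."""
    message = f"Hello, {name}!"  # assumed wording
    print(message)
    return message

with_name = greet("John")  # prints: Hello, John!
without_name = greet()     # prints: Hello, Anonymous!
```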

Python also allows functions to return multiple values. Let's see an example of a
function that returns multiple values:

In the above example, we have defined a function calculate() that takes two
parameters num1 and num2, calculates their sum and product, and returns both
the values. We have then called this function and assigned the returned values to
two variables result1 and result2.
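A sketch of calculate() as described above. The arguments 3 and 4 are assumed.

```python
def calculate(num1, num2):
    """Return the sum and the product of two numbers."""
    total = num1 + num2
    product = num1 * num2
    return total, product  # the two values are packed into a tuple

# Unpack the returned tuple into two variables
result1, result2 = calculate(3, 4)
print(result1, result2)  # prints 7 12
```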

CLASS

A class is a blueprint for creating objects. It defines a set of attributes and methods
that an object of that class will have. Here is an example of a simple class
definition:

In this example, we define a class called Person that has two attributes (name
and age) and five methods (__init__, get_name, get_age, set_name, and
set_age). The __init__ method is a special method called a constructor, which is
called when a new object is created from the class. The self parameter refers to
the object that is being created.

Here's an example of how to use the Person class to create objects:

In this example, we create two Person objects (person1 and person2) and use
the get and set methods to access and modify their attributes. The get methods
return the values of the attributes, and the set methods modify the values of the
attributes.
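The Person class and its use, as described above, can be sketched as follows. The names and ages passed in are assumptions.

```python
class Person:
    def __init__(self, name, age):
        # Constructor: runs when Person(...) is called
        self.name = name
        self.age = age

    def get_name(self):
        return self.name

    def get_age(self):
        return self.age

    def set_name(self, name):
        self.name = name

    def set_age(self, age):
        self.age = age

# Create two independent objects from the same class
person1 = Person("John", 30)
person2 = Person("Mary", 25)

person1.set_age(31)        # modify an attribute through a set method
person2.set_name("Maria")

print(person1.get_name(), person1.get_age())  # prints John 31
print(person2.get_name(), person2.get_age())  # prints Maria 25
```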

CHAPTER NINE

REGULAR EXPRESSIONS

Regular expressions (also called regex) are a powerful tool for working with text
data in Python. They provide a way to search for and manipulate patterns of text.
Here is a detailed explanation of regular expressions in Python with examples.

The re Module

Python's built-in re module provides functions for working with regular
expressions. Here are some of the most used functions:

• re.search(pattern, string): searches a string for a match to the given pattern.

• re.match(pattern, string): matches a pattern at the beginning of a string.

• re.findall(pattern, string): returns all non-overlapping matches of a pattern
in a string.

• re.sub(pattern, replacement, string): replaces all occurrences of a pattern in
a string with the replacement string.

Regular Expression Syntax

Regular expressions are made up of a combination of regular characters and
special characters that have special meanings. Here are some examples of regular
expression syntax:

• . : Matches any single character except a newline character.

• * : Matches zero or more occurrences of the preceding character.

• + : Matches one or more occurrences of the preceding character.

• ? : Matches zero or one occurrences of the preceding character.

• [] : Matches any one of the characters inside the square brackets.

• | : Matches either the pattern to the left or the pattern to the right.

• () : Matches a group of characters, which can be used to apply quantifiers
or to capture the group's content as a separate substring.

Regular Expression Examples

Here are some examples of how to use regular expressions in Python:

Example 1: Search for a pattern in a string

In this example, we search for the pattern "quick" in the string "The quick brown
fox jumps over the lazy dog". The re.search function returns a match object,
which we can use to access information about the match. In this case, we use the
group method to return the matched string.
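The search described above can be sketched as:

```python
import re

text = "The quick brown fox jumps over the lazy dog"

# re.search scans the whole string and returns a match object, or None
match = re.search("quick", text)

if match:
    print(match.group())  # prints quick
```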

Example 2: Match a pattern at the beginning of a string

In this example, we use the re.match function to match the pattern "^The" at the
beginning of the string. The ^ character is a special character that matches the
beginning of a string.
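A sketch of the match described above, assuming the same sentence as in Example 1:

```python
import re

text = "The quick brown fox jumps over the lazy dog"  # assumed string

# re.match only looks at the start of the string
match = re.match("^The", text)

if match:
    print(match.group())  # prints The
```

Because re.match already anchors at the start of the string, the ^ in the pattern is redundant here but harmless.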

Example 3: Find all occurrences of a pattern in a string

In this example, we use the re.findall function to find all occurrences of the
pattern "o" in the string. The function returns a list of all non-overlapping
matches.
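A sketch of the findall call described above, again assuming the sentence from Example 1:

```python
import re

text = "The quick brown fox jumps over the lazy dog"  # assumed string

# Collect every non-overlapping occurrence of the letter o
matches = re.findall("o", text)

print(matches)       # prints ['o', 'o', 'o', 'o']
print(len(matches))  # prints 4
```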

Example 4: Replace a pattern in a string

In this example, we use the re.sub function to replace all occurrences of the
pattern "brown" with the replacement string "red" in the original string.
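The replacement described above can be sketched as follows, assuming the sentence from Example 1 as the original string:

```python
import re

text = "The quick brown fox jumps over the lazy dog"  # assumed string

# re.sub returns a new string; the original text is left unchanged
new_text = re.sub("brown", "red", text)

print(new_text)  # prints The quick red fox jumps over the lazy dog
```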

CHAPTER TEN

NUMPY BASICS: ARRAYS AND VECTORIZED COMPUTATION

NumPy is a Python package that stands for "Numerical Python". It is a library
that provides support for multi-dimensional arrays and matrices, as well as a
range of mathematical functions to perform computations on those arrays.

Here are the key features of NumPy:

• N-dimensional array object: This is a fast and flexible container for large
datasets in Python. It provides functionality to create and manipulate arrays
and supports a wide range of mathematical operations.

• Broadcasting: This feature allows NumPy to perform element-wise operations
on arrays of different but compatible shapes, for example adding a single
number or a one-dimensional array to every row of a two-dimensional array.

• Vectorized Computation: This feature allows NumPy to perform
computations on entire arrays, rather than on individual elements, which is
much faster than looping over the elements in plain Python.

ARRAYS IN NUMPY

Arrays are the main data structure in NumPy. An array is a collection of elements
of the same data type, and can have any number of dimensions. A one-
dimensional array is called a vector, while a two-dimensional array is called a
matrix.

Here is an example of how to create a NumPy array:



This creates a one-dimensional array with four elements. We can access elements
of the array using indexing:

We can also perform operations on arrays, such as addition and multiplication:
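A sketch covering the three steps above: creating a four-element array, indexing it, and applying arithmetic to it. The element values are assumptions.

```python
import numpy as np

# Create a one-dimensional array with four elements
arr = np.array([1, 2, 3, 4])

# Indexing starts at 0; negative indices count from the end
print(arr[0])   # prints 1
print(arr[-1])  # prints 4

# Arithmetic is applied to every element
print(arr + 10)  # prints [11 12 13 14]
print(arr * 2)   # prints [2 4 6 8]
```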

VECTORIZED COMPUTATION

Vectorized computation is a key feature of NumPy that allows for fast
computations on arrays. Instead of performing operations on individual elements,
NumPy can perform operations on entire arrays at once, which is much faster.

Here is an example of how to perform vectorized computation in NumPy:
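The sketch below contrasts a vectorized operation with the equivalent element-by-element loop. The array contents are assumed.

```python
import numpy as np

a = np.array([1, 2, 3, 4])      # assumed data
b = np.array([10, 20, 30, 40])  # assumed data

# Vectorized: one expression operates on whole arrays
total = a + b      # [11 22 33 44]
product = a * b    # [10 40 90 160]
roots = np.sqrt(a) # square root of every element

# The same product with an explicit Python loop, for comparison
loop_product = [x * y for x, y in zip(a, b)]

print(total, product, roots)
```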



CHAPTER ELEVEN

PANDAS

Pandas is an open-source library for data manipulation and analysis in Python. It
provides easy-to-use data structures and data analysis tools for handling tabular
data, including numerical, categorical, and time-series data.

The key data structures provided by Pandas are:

Series: A one-dimensional array-like object that can hold any data type, such as
integers, floats, strings, etc. It has both a sequence of values and a sequence of
labels, called the index, which allows for fast lookups and operations on the data.

DataFrame: A two-dimensional table-like data structure that consists of rows
and columns. Each column in a DataFrame is a Series object, and each row is a
set of values that correspond to the columns. DataFrames can be thought of as a
spreadsheet or a SQL table.

Pandas provides a wide range of functions for data manipulation, including
filtering, grouping, aggregating, merging, reshaping, and sorting data.

Filtering: You can select specific rows and columns based on certain criteria
using boolean indexing, where you create a boolean mask based on a condition
and then apply it to the DataFrame.

Grouping: You can group data by one or more columns and apply an aggregation
function, such as sum, mean, or count, to each group. This is useful for analyzing
trends and patterns in your data.

Aggregating: You can apply aggregation functions to your data, such as sum,
mean, min, max, and std, to compute summary statistics on your data.

Merging: You can combine two or more DataFrames based on a common
column using different types of joins, such as inner, outer, left, and right joins.
This is useful for combining different sources of data.

Reshaping: You can reshape your data from wide to long or long to wide using
the melt and pivot functions, respectively. This is useful for converting your data
into different formats for analysis and visualization.
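Several of the operations above can be sketched on one small DataFrame. The column names and values are hypothetical sales data invented for this illustration.

```python
import pandas as pd

# Hypothetical data in long format: one row per region and product
df = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "product": ["A", "A", "B", "B"],
    "sales": [100, 150, 200, 50],
})

# Filtering: build a boolean mask and apply it to the DataFrame
high_sales = df[df["sales"] > 90]

# Grouping: total sales per region
sales_by_region = df.groupby("region")["sales"].sum()

# Aggregating: several summary statistics at once
summary = df["sales"].agg(["sum", "mean", "min", "max"])

# Reshaping long to wide: one row per region, one column per product
wide = df.pivot(index="region", columns="product", values="sales")

print(high_sales, sales_by_region, summary, wide, sep="\n\n")
```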

Pandas also integrates well with other Python libraries, such as NumPy,
Matplotlib, and Scikit-learn, making it a powerful tool for data analysis and
machine learning.

DATA LOADING AND FILE FORMATS

Data loading in pandas refers to the process of reading data from external sources
into a pandas DataFrame, which is a two-dimensional tabular data structure.
Pandas supports various file formats that can be used to store and exchange data.
This section explains the different file formats supported by pandas and how to
load data from them.

CSV (Comma Separated Values): CSV is a simple text format in which each
line represents a row of data, and each field within a row is separated by a comma.
Pandas provides a read_csv() function that can be used to read CSV files into a
DataFrame. You can specify various parameters such as the delimiter, header,
and encoding to customize the behavior of the function.

Example code:
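A minimal sketch of read_csv(). To keep it self-contained, the CSV text is supplied through io.StringIO instead of a file on disk, and the column names and values are assumptions. With a real file you would pass its path instead.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = "name,age\nAda,36\nLinus,54\n"

# sep sets the delimiter; header=0 uses the first line as column names
df = pd.read_csv(io.StringIO(csv_text), sep=",", header=0)

print(df)
```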

Excel: Excel is a spreadsheet application developed by Microsoft. Pandas
provides a read_excel() function that can be used to read Excel files into a
DataFrame. You can specify various parameters such as the sheet name, header,
and the columns to read (usecols) to customize the behavior of the function.

Example code:

JSON (JavaScript Object Notation): JSON is a lightweight data interchange
format that is easy for humans to read and write and easy for machines to parse
and generate. Pandas provides a read_json() function that can be used to read
JSON data into a DataFrame. You can specify various parameters such as the
orient, lines, and encoding to customize the behavior of the function.

Example code:

SQL: SQL is a domain-specific language used in programming and designed for
managing data held in a relational database management system. Pandas provides
a read_sql() function that can be used to read data from a SQL database into a
DataFrame. You can specify various parameters such as the table name, SQL
query, and connection string to customize the behavior of the function.

Example code:
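A sketch of read_sql() against an in-memory SQLite database created with Python's built-in sqlite3 module. The table name, columns, and rows are assumptions made so the example can run on its own.

```python
import sqlite3
import pandas as pd

# Build a small throwaway database in memory (hypothetical data)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ada", 5000), ("Linus", 4000)],
)
conn.commit()

# Run a query and load the result straight into a DataFrame
df = pd.read_sql("SELECT name, salary FROM employees WHERE salary > 4500", conn)
conn.close()

print(df)
```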

HTML: HTML is a markup language used for creating web pages. Pandas
provides a read_html() function that can be used to read HTML tables into a
DataFrame. You can specify various parameters such as the URL, header, and
encoding to customize the behavior of the function.

Example code:

CHAPTER TWELVE

DATA CLEANING AND PREPARATION

Importing Data

The first step in data cleaning and preparation is importing the data. Pandas
provides several functions to read different types of data formats such as CSV,
Excel, SQL, and more. Let's use the read_csv function to read a CSV file:

Here, we imported pandas and used the read_csv function to read a CSV file
called data.csv. The resulting data is stored in a pandas DataFrame object called
df.

Handling Missing Values

Missing values are a common problem in datasets and need to be handled before
further analysis. Pandas provides several functions to handle missing values such
as dropna and fillna. Let's demonstrate these functions using an example:

Here, we used the dropna function to drop rows with missing values and the fillna
function to fill missing values with the mean of the column. The inplace=True
parameter is used to modify the DataFrame in place.
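A sketch of dropna and fillna on a small hypothetical DataFrame. The text above passes inplace=True; this sketch assigns the results to new names instead, so the original frame stays available for comparison.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column
df = pd.DataFrame({
    "age": [25.0, np.nan, 35.0],
    "score": [80.0, 90.0, np.nan],
})

# dropna: keep only the rows that have no missing values
without_missing = df.dropna()

# fillna: replace missing values with the mean of their column
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].mean())        # mean is 30.0
filled["score"] = filled["score"].fillna(filled["score"].mean())  # mean is 85.0

print(without_missing)
print(filled)
```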

Handling Duplicates

Duplicate rows can cause problems in analysis, so it is important to remove them.
Pandas provides the drop_duplicates function to remove duplicate rows. Let's use
an example to demonstrate this function:

Here, we used the drop_duplicates function to remove duplicate rows. The
inplace=True parameter is used to modify the DataFrame in place.
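The call described above can be sketched as follows, on hypothetical data in which the first and third rows are identical:

```python
import pandas as pd

# Hypothetical data: rows 0 and 2 are exact duplicates
df = pd.DataFrame({
    "name": ["Ada", "Linus", "Ada"],
    "age": [36, 54, 36],
})

# inplace=True changes df itself instead of returning a new DataFrame
df.drop_duplicates(inplace=True)

print(df)  # only the first occurrence of each duplicated row remains
```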

Fixing Data Types

Data types need to be fixed to ensure consistency in the data. Pandas provides
several functions to fix data types such as astype and to_datetime. Let's use an
example to demonstrate these functions:

Here, we used the astype function to convert the age column to an integer and the
to_datetime function to convert the date column to a datetime object.
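A sketch of the two conversions described above. The column names age and date come from the text; the values, stored as strings to begin with, are assumptions.

```python
import pandas as pd

# Hypothetical data loaded as text
df = pd.DataFrame({
    "age": ["25", "30"],
    "date": ["2023-01-15", "2023-02-20"],
})

df["age"] = df["age"].astype(int)        # strings -> integers
df["date"] = pd.to_datetime(df["date"])  # strings -> datetime values

print(df.dtypes)
```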

CHAPTER THIRTEEN

DATA WRANGLING: JOINING DATAFRAME

Let us start with creating two data frames for joining. For this purpose, we will
use the following code:

In this example, we have created two data frames, df1 and df2, each with
columns named key and value.

There are several types of join operations that can be performed in Pandas. These
are as follows:

• Inner Join

• Left Join

• Right Join

• Outer Join

We will explain each of these join types in the following sections with examples.

INNER JOIN

Inner join returns only the matching rows from both data frames. In other words,
the result will contain only the rows where the key values are present in both data
frames.

To perform an inner join, we will use the merge method and specify inner as the
value of the how parameter. Here is an example:

In the output, we can see that only the rows with keys 'B' and 'D' are present in
both data frames. Therefore, these rows are returned as the result.
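A sketch of the two data frames and the inner join described above. The keys are chosen so that only 'B' and 'D' appear in both frames; the value numbers and the extra keys 'E' and 'F' are assumptions.

```python
import pandas as pd

df1 = pd.DataFrame({"key": ["A", "B", "C", "D"], "value": [1, 2, 3, 4]})
df2 = pd.DataFrame({"key": ["B", "D", "E", "F"], "value": [5, 6, 7, 8]})

# Keep only the keys present in both data frames
inner = df1.merge(df2, on="key", how="inner")

print(inner)
```

Because both frames have a column called value, pandas adds the suffixes _x (left) and _y (right) to tell them apart in the result.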

LEFT JOIN

Left join returns all the rows from the left data frame and the matching rows from
the right data frame. If there is no match in the right data frame, the result will
contain NaN values in the corresponding columns.

To perform a left join, we will use the merge method and specify left as the value
of the how parameter. Here is an example:

In the output, we can see that all the rows from the left data frame (df1) are present
in the result. Rows with keys 'B' and 'D' are also present in the right data frame
(df2), and their values are included in the result. However, since there is no key
'A' or 'C' in the right data frame, the corresponding columns contain NaN values.
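A left join on the same kind of data can be sketched as follows. The frames are redefined so the block runs on its own, and the values are assumptions.

```python
import pandas as pd

df1 = pd.DataFrame({"key": ["A", "B", "C", "D"], "value": [1, 2, 3, 4]})
df2 = pd.DataFrame({"key": ["B", "D", "E", "F"], "value": [5, 6, 7, 8]})

# Keep every row of df1; fill value_y with NaN where df2 has no match
left = df1.merge(df2, on="key", how="left")

print(left)
```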

RIGHT JOIN

Right join returns all the rows from the right DataFrame and the matching rows
from the left DataFrame. If there are no matching rows from the left DataFrame,
then NaN values are used to fill in the columns from the left DataFrame.

Here is an example of how to perform a right join using Pandas:



As you can see, the resulting DataFrame contains all the rows from the second
DataFrame and only the matching rows from the first DataFrame (rows with 'A2'
and 'A3').
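A sketch of a right join in which only the rows with 'A2' and 'A3' match. The frame names, column names, and values are all assumptions, since the text names only the two matching keys.

```python
import pandas as pd

# Hypothetical frames sharing the key column A
left_df = pd.DataFrame({"A": ["A0", "A1", "A2", "A3"],
                        "B": ["B0", "B1", "B2", "B3"]})
right_df = pd.DataFrame({"A": ["A2", "A3", "A4", "A5"],
                         "C": ["C2", "C3", "C4", "C5"]})

# Keep every row of right_df; B is NaN where left_df has no match
right = left_df.merge(right_df, on="A", how="right")

print(right)
```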

OUTER JOIN

Outer join returns all the rows from both the left and the right DataFrames. If
there are no matching rows, then NaN values are used to fill in the columns.

An outer join combines data from two or more data frames based on a common
column, while retaining all the rows from both data frames. The rest of this
section walks through an outer join in pandas.

Let's start with an example. Suppose we have two data frames, df1 and df2, that
contain information about employees and their departments, respectively. Here is
what they look like:

In this example, df1 contains the names, salaries, and department IDs of four
employees, while df2 contains the names of departments along with their
corresponding IDs.

To combine the information from both data frames, we can perform an outer join
on the common column, 'DepartmentID', using the merge() function in pandas.
The syntax for performing an outer join is as follows:

Here, df1 and df2 are the data frames to be merged, on='DepartmentID'
specifies the common column to join on, and how='outer' specifies that we want
to perform an outer join.

After performing the outer join, merged_df will contain all the rows from both
df1 and df2, and will fill in missing values with NaN if there is no match between
the two data frames. Here is what merged_df looks like after performing the outer
join:

As we can see, merged_df now contains all the rows from both df1 and df2, even
though the 'DepartmentID' 4 in df2 didn't have any corresponding row in df1.
Moreover, merged_df fills in the missing values in the 'Name' and 'Salary'
columns with NaN.
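The outer join described above can be sketched as follows. The column names Name, Salary, and DepartmentID come from the text; the column DepartmentName and all of the values are assumptions.

```python
import pandas as pd

# Four employees; nobody works in department 4
df1 = pd.DataFrame({
    "Name": ["Alice", "Bob", "Carol", "Dave"],
    "Salary": [5000, 4200, 6100, 3900],
    "DepartmentID": [1, 2, 3, 1],
})

# Four departments, including department 4
df2 = pd.DataFrame({
    "DepartmentID": [1, 2, 3, 4],
    "DepartmentName": ["HR", "Finance", "IT", "Marketing"],
})

merged_df = pd.merge(df1, df2, on="DepartmentID", how="outer")

print(merged_df)
```

The row for department 4 appears in the result with NaN in the Name and Salary columns.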

CONCATENATING DATAFRAMES

The concat() function concatenates two or more data frames along a specified
axis. The syntax for concatenating data frames is as follows:

Here, df1 and df2 are the data frames to be concatenated, axis=0 specifies the
axis along which to concatenate the data frames (0 for rows and 1 for columns),
and ignore_index=True creates a new index for the concatenated data frame.

Let's illustrate this with an example. Suppose we have two data frames, df1 and
df2, that have the same columns but different values:

We can concatenate these data frames along the rows (axis=0) using the concat()
function as follows:
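A sketch of row-wise concatenation. The column names A and B and the values are assumptions.

```python
import pandas as pd

# Two hypothetical data frames with the same columns
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"A": [5, 6], "B": [7, 8]})

# axis=0 stacks the rows; ignore_index=True renumbers them 0..3
combined = pd.concat([df1, df2], axis=0, ignore_index=True)

print(combined)
```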

CHAPTER FOURTEEN

PLOTTING AND VISUALIZATION USING PANDAS

Pandas is a powerful data analysis and manipulation library for Python that
provides a range of built-in functions for data visualization. This chapter
explains how to create different types of plots using pandas, with examples.

To start with, let's create a simple data frame with some data. Here is what it looks
like:

This data frame contains 100 rows, with columns 'x' and 'y', which are created
using NumPy's linspace() and sin() functions, respectively.

Now, let's explore different types of plots that we can create using pandas.

Line Plot

A line plot is a graph that displays information as a series of data points connected
by straight lines. In pandas, we can create a line plot using the plot() method with
kind='line'. Here is how to create a line plot of the 'y' column against the 'x'
column of our data frame:

This will create a line plot of the 'y' column against the 'x' column, with the x-axis
labeled 'x' and the y-axis labeled 'y'.
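A sketch of the data frame and line plot described above. The range 0 to 2π passed to linspace() is an assumption, and the non-interactive Agg backend is selected only so the sketch runs without a display. In an interactive session you would omit that line and call matplotlib.pyplot.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed

import numpy as np
import pandas as pd

# 100 evenly spaced x values and their sine
x = np.linspace(0, 2 * np.pi, 100)  # assumed range
df = pd.DataFrame({"x": x, "y": np.sin(x)})

# kind="line" draws y against x; plot() returns the Matplotlib Axes
ax = df.plot(x="x", y="y", kind="line")
ax.set_xlabel("x")
ax.set_ylabel("y")
```

Changing kind to 'bar', 'scatter', or 'hist' selects the other plot types covered in this chapter.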

Bar Plot

A bar plot is a graph that represents categorical data with rectangular bars with
lengths proportional to the values that they represent. In pandas, we can create a
bar plot using the plot() method with kind='bar'. Here is how to create a bar plot
of the 'y' column against the 'x' column of our data frame:

This will create a bar plot of the 'y' column against the 'x' column, with the x-axis
labeled 'x' and the y-axis labeled 'y'.

Scatter Plot

A scatter plot is a graph that displays the relationship between two variables by
plotting their values as points on a two-dimensional coordinate system. In pandas,
we can create a scatter plot using the plot() method with kind='scatter'. Here is
how to create a scatter plot of the 'y' column against the 'x' column of our data
frame:

This will create a scatter plot of the 'y' column against the 'x' column, with the x-
axis labeled 'x' and the y-axis labeled 'y'.

Histogram

A histogram is a graph that displays the distribution of a set of continuous data
by dividing the data into bins and displaying the frequency of the values falling
into each bin. In pandas, we can create a histogram using the plot() method with
kind='hist'. Here is how to create a histogram of the 'y' column of our data frame:

CHAPTER FIFTEEN

PLOTTING AND VISUALIZATION USING MATPLOTLIB

NUMPY

NumPy is a fundamental library for numerical computing in Python. It
provides support for multi-dimensional arrays and matrices, as well as a
variety of mathematical operations on them. NumPy is essential for many
other libraries in the data science and machine learning ecosystem.

Example:

MATPLOTLIB

Matplotlib is a library for creating visualizations in Python. It provides a range of
tools for creating plots, charts, and graphs, and can be used for both basic and
advanced visualization tasks.

Basic Plotting

To create a simple plot using Matplotlib, you can use the plot() function. Here's
an example that creates a plot of the sine function:

In this example, we first generate 100 data points for the sine function using the
np.linspace() and np.sin() functions. Then, we use the plot() function to create
the plot, and title(), xlabel(), and ylabel() functions to add labels to the plot.
Finally, we use the show() function to display the plot.
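The steps above can be sketched as follows. The x range and the label text are assumptions. So that the sketch also runs without a display, it selects the Agg backend and renders the figure to an in-memory buffer. In an interactive session you would drop those lines and call plt.show() as described.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# Generate 100 data points for the sine function
x = np.linspace(0, 2 * np.pi, 100)  # assumed range
y = np.sin(x)

plt.plot(x, y)
plt.title("Sine Function")  # assumed label text
plt.xlabel("x")
plt.ylabel("sin(x)")

# Render to memory here; use plt.show() instead when working interactively
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
plt.close()
```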

Subplots

Matplotlib allows you to create multiple plots in a single figure using subplots.
Here's an example that creates two subplots of the sine and cosine functions:

Bar Charts

Matplotlib can be used for creating bar charts as well. Here's an example that
creates a bar chart of the number of students in a class:

SEABORN

Seaborn is a Python data visualization library based on Matplotlib that provides
a high-level interface for creating attractive and informative statistical graphics.
It is designed to work well with Pandas data frames and can be used to create a
wide range of visualization types, including scatter plots, line plots, bar plots,
heatmaps, and more. This section gives an overview of Seaborn's main features
and shows some examples of how to use it to create different types of plots.

Installation and Importing

First, let's start with installing Seaborn. You can install it using pip by
running pip install seaborn.

Once Seaborn is installed, you can import it in your Python script or Jupyter
notebook using:

Now, let's take a look at some examples of Seaborn plots.

Scatter Plot

A scatter plot is a common way to visualize the relationship between
two variables. In Seaborn, you can create a scatter plot using the scatterplot()
function. Here's an example:
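
A minimal sketch follows. Note that load_dataset() downloads the data the first time it is called, so the try/except fallback to a few made-up rows is an addition here, included only so the sketch still runs without an internet connection.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

try:
    # Downloads the tips dataset on first use, then reads it from a local cache.
    tips = sns.load_dataset("tips")
except OSError:
    # Assumption: made-up stand-in rows, used only when the download fails.
    tips = pd.DataFrame({
        "total_bill": [10.0, 20.0, 30.0, 40.0],
        "tip": [1.5, 3.0, 4.0, 6.0],
    })

# One point per row: total bill on the x axis, tip on the y axis.
ax = sns.scatterplot(data=tips, x="total_bill", y="tip")
plt.show()
```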

In this example, we load the tips dataset from Seaborn's built-in datasets and
create a scatter plot of the total_bill and tip variables. We use the
sns.scatterplot() function to create the plot and the plt.show() function to display
it. The resulting plot should show a positive relationship between the total bill
and the tip amount.

Line Plot

A line plot is a useful way to visualize changes in a variable over time.
In Seaborn, you can create a line plot using the lineplot() function. Here's an
example:

In this example, we load the flights dataset from Seaborn's built-in datasets and
create a line plot of the passengers variable over time (i.e., the year variable).
We use the sns.lineplot() function to create the plot and the plt.show() function
to display it. The resulting plot should show an increasing trend in the number of
airline passengers over time.

Bar Plot

A bar plot is a useful way to compare the values of different categories.
In Seaborn, you can create a bar plot using the barplot() function.

Here's an example:
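
A minimal sketch follows. The fallback rows are made-up values added only so the sketch runs when the dataset cannot be downloaded.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

try:
    tips = sns.load_dataset("tips")  # downloaded on first use
except OSError:
    # Assumption: made-up stand-in rows, used only when the download fails.
    tips = pd.DataFrame({
        "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sun", "Sun"],
        "total_bill": [15.0, 19.0, 16.0, 18.0, 20.0, 22.0, 21.0, 23.0],
    })

# barplot() groups the rows by day and draws the mean total_bill of each group.
ax = sns.barplot(data=tips, x="day", y="total_bill")
plt.show()
```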

In this example, we load the tips dataset from Seaborn's built-in datasets and
create a bar plot of the total_bill variable grouped by the day of the week. We
use the sns.barplot() function to create the plot and the plt.show() function to
display it. The resulting plot should show the average total bill amount for each
day in the dataset (Thursday through Sunday).

CHAPTER SIXTEEN

MACHINE LEARNING ALGORITHMS (SCIKIT-LEARN)

Scikit-learn is a library for machine learning in Python. It provides a range of
tools for classification, regression, clustering, and dimensionality reduction, as
well as model selection and evaluation.

Example:
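
As a minimal sketch of a classification workflow, the following trains a logistic regression model on the iris dataset that ships with scikit-learn and measures its accuracy on held-out data. The choice of dataset, model, split size, and random seed are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the bundled iris data: 150 flowers, 4 measurements each, 3 species.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the rows for testing (the seed makes the split repeatable).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the model on the training rows only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate on rows the model has not seen.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```

The same split, fit, predict, and evaluate pattern applies to most scikit-learn estimators.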

CHAPTER SEVENTEEN

WEB SCRAPING WITH PYTHON

Web scraping is the process of extracting data from websites using automated
tools. Python has several libraries that make web scraping easier, including
BeautifulSoup, Scrapy, and Requests.

Here is an explanation of how to perform web scraping in Python using the
Requests and BeautifulSoup libraries, along with examples.

THE REQUESTS LIBRARY

The Requests library is a popular Python library for making HTTP requests. It
can be used to send GET and POST requests to web servers and retrieve the
HTML content of web pages.

Here is an example of how to use the Requests library to retrieve the HTML
content of a web page:
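
A minimal sketch follows. The timeout and the try/except wrapper are additions for robustness, so that a network failure prints a message instead of crashing the script.

```python
import requests

url = "https://www.example.com"

try:
    # Send a GET request; give up if the server takes longer than 10 seconds.
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise an error for 4xx/5xx status codes
    html = response.content      # the page body as raw bytes
    print(html[:200])            # show the first 200 bytes
except requests.RequestException as exc:
    html = b""
    print(f"Request failed: {exc}")
```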

In this example, we use the get method of the requests library to send a GET
request to the URL "https://www.example.com". The content attribute of the
response object contains the HTML content of the page as raw bytes; the text
attribute gives the same content decoded to a string.

THE BEAUTIFULSOUP LIBRARY

The BeautifulSoup library is a Python library for parsing HTML and XML
documents. It can be used to extract data from HTML documents using various
methods, such as searching for specific HTML tags or attributes.

Here is an example of how to use the BeautifulSoup library to extract all links
from an HTML document:
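
A minimal sketch follows. The helper name extract_links, the choice of Python's built-in "html.parser", and the error handling are assumptions added for illustration.

```python
import requests
from bs4 import BeautifulSoup


def extract_links(html):
    """Return the href value of every <a> tag that has one."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for tag in soup.find_all("a"):
        href = tag.get("href")
        if href:  # skip anchors without an href attribute
            links.append(href)
    return links


url = "https://www.example.com"
try:
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    links = extract_links(response.content)
except requests.RequestException as exc:
    links = []
    print(f"Request failed: {exc}")

print(links)
```

Keeping the parsing in its own function means it can also be tried on a plain HTML string, without making a network request.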

In this example, we first retrieve the HTML content of the page using the
Requests library, and then use the BeautifulSoup library to parse the HTML
content. We then use the find_all method of the soup object to find all a tags, and
append the href attribute of each link to a list.

EXAMPLE OF WEB SCRAPING

Here is an example of how to use web scraping to extract data from a website.
Suppose we want to extract the top headlines from the CNN news website. We
can use the Requests and BeautifulSoup libraries to retrieve the HTML content
of the page and extract the headlines. Here is the code:
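
A minimal sketch follows. The URL "https://www.cnn.com", the helper name extract_headlines, and the error handling are assumptions; the class name cd__headline is the one described in the text, and the list will be empty if the site's markup no longer uses it.

```python
import requests
from bs4 import BeautifulSoup


def extract_headlines(html):
    """Return the text of every <h3> tag with the class cd__headline."""
    soup = BeautifulSoup(html, "html.parser")
    headlines = []
    for tag in soup.find_all("h3", class_="cd__headline"):
        headlines.append(tag.text.strip())  # strip() removes stray whitespace
    return headlines


url = "https://www.cnn.com"  # assumed address of the CNN home page
try:
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    headlines = extract_headlines(response.content)
except requests.RequestException as exc:
    headlines = []
    print(f"Request failed: {exc}")

print(headlines)
```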

In this example, we first retrieve the HTML content of the CNN news website
using the Requests library. We then use the BeautifulSoup library to parse the
HTML content. We use the find_all method of the soup object to find all h3 tags
with the class cd__headline, which corresponded to the top headlines on the page
at the time of writing. Class names like this can change whenever the site is
redesigned.

We then extract the text content of each headline using the text attribute and
append it to a list. Finally, we print the list of headlines.

We are currently working on a new edition.

I wish you good luck on your job search.



8 websites to get Remote Jobs:

1. https://t.co/CnGzmlytfJ
2. https://www.linkedin.com/jobs/
3. https://datajobs.com/
4. https://hired.com/
5. https://www.careerbuilder.com/
6. https://www.flexjobs.com/search?search=data+analysis&location=
7. https://www.glassdoor.com/index.htm
8. https://ng.indeed.com/

Send me a DM if you find this book helpful.

Twitter: https://twitter.com/ezekiel_aleke

Email: ezekielaleke@gmail.com
