0% found this document useful (0 votes)
17 views120 pages

PGSQL NoSQL

This document serves as reference material for the SQL and NoSQL course conducted by Mohammed Abdul Sami, focusing on RDBMS concepts with PostgreSQL and NoSQL concepts with MongoDB. It includes detailed lessons on various database topics, including SQL commands, database management, and normalization techniques. The document is intended for course participants and is not for unauthorized distribution or sale.

Uploaded by

jay khandelwal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views120 pages

PGSQL NoSQL

This document serves as reference material for the SQL and NoSQL course conducted by Mohammed Abdul Sami, focusing on RDBMS concepts with PostgreSQL and NoSQL concepts with MongoDB. It includes detailed lessons on various database topics, including SQL commands, database management, and normalization techniques. The document is intended for course participants and is not for unauthorized distribution or sale.

Uploaded by

jay khandelwal
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

SQL AND NOSQL COURSE

With PostgreSQL & MongoDB

MOHAMMED ABDUL SAMI


Database Consultant
[email protected]
SQL and NoSQL Course

ABOUT THIS DOCUMENT


Reference material for the SQL and NoSQL course designed by Mohammed Abdul Sami. This material is
intended for the use of student who attended this course conducted by Mohammed Abdul Sami.

This course is primarily focused on teaching RDBMS concepts, SQL with PostgreSQL and NoSQL concepts
and NoSQL practice with MongoDB database.

Concepts and Commands explained in the document confirms to the concepts and commands as
described in the official documentation of PostgreSQL and MongoDB databases. All the examples are
tested and verified.

ABOUT TRAINER
Mohammed Abdul Sami is an experienced System Administrator & DBA having more than 25 years of
experience in managing Database Servers, Linux and Unix servers.

Since 2014 he is mostly involved in trainings and a frequent instructor at Oracle University for MySQL and
Solaris Courses.

DISCLAIMER
This document is provided as reference material to the participants who has attended SQL & NoSQL
training conducted by Mr. Mohammed Abdul Sami at Valtech (Ahmedabad Campus).

Unauthorized use, duplication or sale of this document or any of its contents without the author’s consent
is strictly prohibited.

This Document is provided for free to the intended audience only and is not for sale under any
circumstances.

All trademarks, logos, or brand names mentioned in the document are the property of their respective
owners.

BIBLIOGRAPHY
• PostgreSQL official documentation available here
• MongoDB official documentation available here

Page 1 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Contents
Contents ....................................................................................................................................... 2
RDBMS ........................................................................................................................................ 3
Lesson 1: Introduction................................................................................................................... 4
Lesson 2: Introduction to PostgreSQL......................................................................................... 17
Lesson 3: PostgreSQL Clients ................................................................................................... 22
Lesson 4 : Structured Query Language....................................................................................... 26
Lesson 5 : SELECT Statement ................................................................................................... 51
Lesson 6: Creating and Managing Databases............................................................................. 58
Lesson 7: Advance Querying ...................................................................................................... 63
Lesson 8: Indexes & Views ......................................................................................................... 70
Lesson 9: Transactions ............................................................................................................... 76
Lesson 10: Copying and Moving Data......................................................................................... 78
NoSQL Databases ..................................................................................................................... 81
Lesson 11: Introduction ............................................................................................................... 82
Lesson 12: Introduction to MongoDB .......................................................................................... 88
Lesson 13: CRUD Operations ..................................................................................................... 93
Lesson 14: Aggregation ............................................................................................................ 106
Lesson 15: Indexes ................................................................................................................... 112
Lesson 16: Exporting and Importing Data ................................................................................. 117

Page 2 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

RDBMS
(SQL Databases)

Page 3 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 1: Introduction

Page 4 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

RDBMS
RDBMS stands for Relational Database Management System. RDBMS recommends data to be logically
organized in Relations (tables), Tuples (rows or records), Attributes (fields or columns). Robustness and
power of RDBMS model derives from the use of simple tabular structure to represent the data. Data is
represented uniformly in a tabular structure as it is easy to organize and manipulate it.

Most of the RDBMS theory is based on mathematical set theory. It was first proposed by Dr. Edgar Frank
Codd, A computer scientists working at IBM company in 1970 and he improvised the theory throughout
1970s & 1980s. Theoretical development of the model continues till this day.

The Relations (tables) in RDBMS theory corresponds to mathematical sets, tuples (rows) correspond to
members of set. Many of the principles from mathematical set theory is applied in RDBMS.

RDBMS Terminology
Relation (Table)

In Relational database model, a table is a collection of data elements organised in terms of rows and
columns. A table is also considered as a convenient representation of relations. But a table can have
duplicate row of data while a true relation cannot have duplicate data.

Tuple (Record or Row)

A single entry in a table is called a Tuple or Record or Row. A tuple in a table represents a set of related
data.

Attribute (Field or Column name)

A table consists of several records(row), each record can be broken down into several smaller parts of
data known as Attributes.

Domain (Datatype)

When an attribute is defined in a relation(table), it is defined to hold only a certain type of values, which is
known as Attribute Domain.

Keys

A relation key is an attribute which can uniquely identify a particular tuple(row) in a relation(table). It is
also called as Primary Key

Foreign Key

An attribute which can link two relations. For example, a course-id uniquely identifies all the offered
courses along with their details in Course_table and which student has opted for which course in the
student-table.

Page 5 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Degree

The degree of a relation/table is the number of columns it contains.

Cardinality

The cardinality of a relation/table is the number of rows it contains.

Instance

The data stored in database at a particular moment of time is called instance of database. .

Schema

A relation schema describes the structure of the relation, with the name of the relation(name of table), its
attributes and their names and datatype.

E. F. Codd's Rules for RDBMS


Dr. Edgar Frank Codd was an English computer scientist who, while working for IBM, invented the
relational model for database management, the theoretical basis for relational databases.

Rule 1 – Information Rule

All information is explicitly and logically represented in exactly one way – by data values in tables.

If an item of data does not reside somewhere in a table in the database, then it does not exist.

Rule 2 – Rule of Guaranteed Access

Every item of data must be logically addressable by resorting to a combination of table name, primary key
value and a column name.

Rule 3 – The Systematic Treatment of Null Values

The RDBMS handles records that have unknown or inapplicable values in a pre-defined fashion.

The RDBMS distinguishes between zeros, blanks and nulls in the records hand handles such values in a
consistent manner that produces correct answers, comparisons and calculations. Through the set of rules
for handling nulls, users can distinguish results of the queries that involve nulls, zeros and blanks.

Rule 4 – The Database Description Rule

The description of a database and in its contents are database tables and therefore can be queried on-line
via the data manipulation language.

There must be a Data Dictionary within the RDBMS that is constructed of tables and/or views that can be
examined using SQL.

Page 6 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Rule 5 – The Comprehensive Sub-language rule

A RDBMS may support several languages. But at least one of them should allow user to do all of the
following:

o Define tables and views


o Query and update the data
o Set integrity constraints
o Set authorizations
o Define transactions.

RDBMS must be completely manageable through its own dialect of SQL.

Rule 6 – The View Updating Rule

All views that can be updated in theory, can also be updated by the system.

Data consistency is ensured since the changes made in the view are transmitted to the base table and
vice-versa.

Rule 7 – High Level Insert, Update and Delete Rule

The RDBMS supports insertions, updating and deletion at a table level.

This means that data can be retrieved from a relational database in sets constructed of data from multiple
rows and/or multiple tables. This rule states that insert, update, and delete operations should be
supported for any retrievable set rather than just for a single row in a single table.

Rule 8 – The Physical Independence Rule

The execution of adhoc requests and application programs is not affected by changes in the physical data
access and storage methods.

Database administrators can make changes to the physical access and storage method which improve
performance and do not require changes in the application programs or requests. Here the user specified
what he wants and need not worry about how the data is obtained.

Rule 9 – Logical Data Interdependence Rule

Logical changes in tables and views such adding/deleting columns or changing fields lengths need not
necessitate modifications in the programs or in the format of adhoc requests.

The database can change and grow to reflect changes in reality without requiring the user intervention or
changes in the applications. For example, adding attribute or column to the base table should not disrupt
the programs or the interactive command that have no use for the new attribute.

Page 7 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Rule 10 – Integrity Independence Rule

Integrity constraints must be specified separately from application programs and stored in the catalog. It
must be possible to change such constraints as and when appropriate without unnecessarily affecting
existing applications.

The following Integrity rules(Constraints) should apply to every RDBMS :-

No component of a primary key can have missing values – this is the basic rule of Entity Integrity.

For each distinct foreign key value there must exist a matching primary key value in the same domain.
Conformation to this rule ensures what is called Referential integrity.

Rule 11 – Distribution Rule

Application programs and adhoc requests are not affected by change in the distribution of physical data.

Improved systems reliability since application programs will work even if the programs and data are
moved in different sites.

Rule 12 – No Subversion Rule

If the RDBMS has a language that accesses the information of a record at a time, this language should not
be used to bypass the integrity constraints.

Normalization
Database Anomalies

Anomalies in Database design are faults in organizing data which may lead to in efficient way of managing
database.

Update anomalies

If data items are scattered and are not linked to each other properly, then it could lead to strange
situations. For example, when we try to update one data item having its copies scattered over several
places, a few instances get updated properly while a few others are left with old values. Such instances
leave the database in an inconsistent state.

Deletion anomalies

We tried to delete a record, but parts of it was left undeleted because of unawareness, the data is also
saved somewhere else.

Insert anomalies

We tried to insert data in a record that does not exist at all.

Page 8 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Normalization

Normalization is a method to remove all these anomalies and bring the database to a consistent state.

The primary objective of the normalization is that the data is structured in a way that all the attributes are
grouped along with primary key which ensures unique identification of all tuples. To achieve this we may
have to decompose the data into many relations.

To achieve consistent database the below Normalization forms available

Un-Normalized Form (UNF)

If a table contains non-atomic values at each row, it is said to be in UNF. An atomic value is something that
can not be further decomposed

Students Data

Name Address Phone No Course Duration Course Fee Fee Paid Final Marks Division Instructor

875642,9875
Niel 10, Rd-5 MySQL 40 25000 20000 97 Distinction Sami
43

Dave city center 874321 Oracle 80 50000 50000 Aaron

Mia 2nd avenue 982133 Postgresql 32 30000 30000 89 First Sami

Riley barrows 982156 Oracle 80 50000 45000 73 Second Aaron

870987,
Emma 5, Rd-8 MySQL 40 25000 25000 75 Second Sami
870543

George City Center MySQL 40 25000 25000 91 First Sami

Jane 2nd Avenue 984321 Postgresql 32 30000 30000 93 First Sami

First Normal Form (1NF)

A relation is said to be in 1NF if every attribute contains no non-atomic values and each row can provide a
unique combination of values.

Page 9 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Students Data- 1st NF

Home Duratio Final


Name Address Phone No Course Course Fee Fee Paid Division Instructor
Phone n Marks

Niel 10, Rd-5 875642 987543 MySQL 40 25000 20000 97 Distinction Sami

Dave city center 874321 Oracle 80 50000 50000 Aaron

2nd
Mia 982133 Postgresql 32 30000 30000 89 First Sami
avenue

Riley barrows 982156 Oracle 80 50000 45000 73 Second Aaron

Emma 5, Rd-8 870987 870543 MySQL 40 25000 25000 75 Second Sami

City
George MySQL 40 25000 25000 91 First Sami
Center

2nd
Jane 984321 Postgresql 32 30000 30000 93 First Sami
Avenue

Second Normal Form (2NF)

A relation is said to be in 2NF if it is already in 1NF and each and every attribute fully depends on the
primary key of the relation.

Course Data

Course Duration Course Fee

Oracle 80 50000

MySQL 40 25000

Postgresql 32 30000

Page 10 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Students Data in 2nd NF

Name Address Phone No Home Phone Course Fee Paid Final Marks Division Instructor

Niel 10, Rd-5 875642 987543 MySQL 20000 97 Distinction Sami

Dave city center 874321 Oracle 50000 Aaron

Mia 2nd avenue 982133 Postgresql 30000 89 First Sami

Riley barrows 982156 Oracle 45000 73 Second Aaron

Emma 5, Rd-8 870987 870543 MySQL 25000 75 Second Sami

George City Center MySQL 25000 91 First Sami

Jane 2nd Avenue 984321 Postgresql 30000 93 First Sami

Third Normal Form (3NF)

A relation is said to be in 3NF, if it is already in 2NF and there exists no transitive dependency in that
relation.

Course Data- 3rd NF

CourseID Course Duration Course Fee

100 Oracle 80 50000

200 MySQL 40 25000

300 Postgresql 32 30000

Page 11 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Students Data- 3rd NF

StudentID Name Address Phone No Home Phone CourseID Fee Paid Final Marks Instructor

1551 Niel 10, Rd-5 875642 987543 MySQL 20000 97 Sami

1552 Dave city center 874321 Oracle 50000 Aaron

1553 Mia 2nd avenue 982133 Postgresql 30000 89 Sami

1554 Riley barrows 982156 Oracle 45000 73 Aaron

1555 Emma 5, Rd-8 870987 870543 MySQL 25000 75 Sami

1556 George City Center MySQL 25000 91 Sami

1557 Jane 2nd Avenue 984321 Postgresql 30000 93 Sami

Divisions Chart - 3rd NF

Division Min Marks Max Marks

Distinction 95 100

First 85 94

Second 75 84

Passed 65 74

Need Improvement 0 64

Fourth Normal Form (4NF)

When attributes in a relation have multi-valued dependency, further Normalization to 4NF.

Page 12 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Course Data - 4th NF

CourseID Course Duration Course Fee

100 Oracle 80 50000

200 MySQL 40 25000

300 Postgresql 32 30000

Students Data - 4th NF

StudentID Name Address Phone No Home Phone CourseID Fee Paid Final Marks

1551 Niel 10, Rd-5 875642 987543 MySQL 20000 97

1552 Dave city center 874321 Oracle 50000

1553 Mia 2nd avenue 982133 Postgresql 30000 89

1554 Riley barrows 982156 Oracle 45000 73

1555 Emma 5, Rd-8 870987 870543 MySQL 25000 75

1556 George City Center MySQL 25000 91

1557 Jane 2nd Avenue 984321 Postgresql 30000 93

Instructor 4th NF

InstructorID Name address

DB01 Sami Arlington Drive

Page 13 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Instructor 4th NF

InstructorID Name address

DB02 Aaron 4th Avenue

Divisions Chart - 4th NF

Division Min Marks Max Marks

Distinction 95 100

First 85 94

Second 75 84

Passed 65 74

Need Improvement 0 64

Instructor / Student Map - 4th NF

InstructorID StudentID

DB02 1551

DB01 1552

DB02 1553

DB01 1554

DB02 1555

DB02 1556

DB02 1557

Page 14 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Workbook
Exercise - 1
Normalize the below health history data upto 3rd NF

HEALTH HISTORY REPORT

PET ID PET NAME PET TYPE PET AGE OWNER VISIT DATE PROCEDURE

246 ROVER DOG 12 SAM COOK JAN 13/2002 01 - RABIES VACCINATION

MAR 27/2002 10 - EXAMINE and TREAT WOUND

APR 02/2002 05 - HEART WORM TEST

298 SPOT DOG 2 TERRY KIM JAN 21/2002 08 - TETANUS VACCINATION

MAR 10/2002 05 - HEART WORM TEST

341 MORRIS CAT 4 SAM COOK JAN 23/2001 01 - RABIES VACCINATION

JAN 13/2002 01 - RABIES VACCINATION

519 TWEEDY BIRD 2 TERRY KIM APR 30/2002 20 - ANNUAL CHECK UP

APR 30/2002 12 - EYE WASH

Page 15 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Exercise - 2

Normalize the below Invoice data unto 3rd Normal Form

INVOICE
HILLTOP ANIMAL HOSPITAL DATE: JAN 13/2002

INVOICE # 987

MR. RICHARD COOK

123 THIS STREET

MY CITY, ONTARIO

Z5Z 6G6

PET PROCEDURE AMOUNT

ROVER RABIES VACCINATION 30.00

MORRIS RABIES VACCINATION 24.00

TOTAL 54.00

TAX (8%) 4.32

AMOUNT OWING 58.32

Page 16 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 2: Introduction to PostgreSQL

Page 17 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Introduction to PostgreSQL
PostgreSQL is an object-relational database management system (ORDBMS) based on POSTGRES, Version
4.2, developed at the University of California at Berkeley Computer Science Department. POSTGRES
pioneered many concepts that only became available in some commercial database systems much later.
PostgreSQL is an open-source descendant of this original Berkeley code. It supports a large part of the SQL
standard and offers many modern features:

o complex queries
o foreign keys
o triggers
o updatable views
o transactional integrity
o multiversion concurrency control (MVCC)

Brief History of PostgreSQL

The object-relational database management system now known as PostgreSQL is derived from the POST-
GRES package written at the University of California at Berkeley. With over two decades of development
behind it, PostgreSQL is now the most advanced open-source database available anywhere.

Page 18 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Database Limits

Limit Value

Maximum Database Size Unlimited

Maximum Table Size 32 TB

Maximum No Of Databases 4,29,49,50,911

Relations per Databse 1,43,16,50,303

Maximum Columns per Table 1600

Maximum Indexes per Table Unlimited

Columns per Index 32

Partition Keys 32

Common Database Object Names

Industry Term PostgreSQL Term

Table or Index Relation

Row Tuple

Column Attribute

Data Block Page (when block is on disk)

Page Buffer (when block is in memory)

Page 19 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Client / Server Model

PostgreSQL connection lifecycle

A client begins a new session by asking for and authenticating a connection to the server.

• The server forks a new system process to handle the connection and work session. The session’s
state is initialized per a combination of server-level, database-level, and user-level configuration
parameters.
• The client does as much work as it needs by executing one or more transactions. Examples
include:
o Execute reads and writes against relations (tables, views, etc.)
o Use the SET command to change the session or transaction state
o Prepare and execute prepared statements
• The session ends when the client disconnects.
• The server destroys the process that handled the session.

Page 20 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

A database session: consists of all the work done over a single connection’s lifetime. Database sessions
are of variable length in time and consume a variable amount of resources on both the client and server.

Page 21 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 3: PostgreSQL Clients

Page 22 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

GUI Client PgAdmin4


PostgreSQL provides PgAdmin as default graphical client for managing PostgreSQL database. Almost all
administration task can be performed from PgAdmin interface except for restarting the reserver or editing
postgresql.conf / pg_hba.conf files

o Available on MacOS, Windows and Linux


o Supports Desktop mode
o Supports in Web-Server mode and act as central GUI client fo many PostgreSQL Instances / RDS
for PostgreSQL instance.
o GUI Interface is same for both Desktop and Web-Server modes

Page 23 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

PGAdmin4 Web-Server Architecture

A central Web-Server based PGAdmin4 running on a linux machine and the DBA/Developers
connect through a Browser.

• Uses SQLLite database for storing metadata


• Requires authentication to access PGAdmin4
• Ability to store multiple PostgreSQL db connection details for each PGAdmin4 user

Centralized access to PostgreSQL databases for DBAs and Developers

DBeaver

Now a days very popular GUI base Database client capable of supporting almost all popular SQL and
NoSQL databases.

1. GUI based desktop client available on Windows, Linux and MacOS


2. Supports Any Database by using JDBC connections
3. Easy to use
4. A free version available as DBeaver Community Edition

Page 24 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

PostgreSQL connectors

PostgreSQL connectors enable applications to connect to and interact with a PostgreSQL database. They
allow developers to integrate PostgreSQL database operations within various environments and
programming languages

• Libpq : Native client, C Language connector


• JDBC : Language: Java3
• Npgsql : Language: .NET
• psycopg2 : Language: Python
• pgx : Language: Go
• node-postgres (pg) : Language: JavaScript (Node.js)
• DBD::pg : Language PERL
• PHP-pgsql : Language PHP

These are more connectors for various programming languages. Refer link below

PostgreSQL Connectors

Page 25 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 4 : Structured Query Language

Page 26 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

SQL (Structured Query Language)


SQL was initially developed at IBM by Donald D. Chamberlin and Raymond F. Boyce for manipulating,
retrieving and storing data from RDBMS in the early 1970s.This version, initially called SEQUEL (Structured
English Query Language)

Also parallelly developed at Oracle Corporation and Ingres Database. Later it has been standardize under
ANSI SQL standards.

Sub Languages

Further SQL language has been classified in the following sub-languages

DDL

Data Definition Language. Useful create design relational schemas and create and manage other objects.
It consists of 4 statements

CREATE

ALTER

DROP

TRUNCATE

DML

Data Manipulation Language. used to save, retrieve and manipulate data. It consists of 4 statements

INSERT

UPDATE

DELETE

SELECT

TCL

Transaction Control Language. In a multiuser system transactions ensure concurrent DML operations
happen without any conflicts and ensures data consistency. It consists of 5 statements

BEGIN or START TRANSACTION

COMMIT

ROLLBACK

SAVEPOINT

Page 27 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

ROLLBACK TO SAVEPOINT

DCL

Data Control Language. It helps In implementing access restriction to the data so that data can only be
accessed or manipulated by authorized user. It consists of 2 statements

GRANT

REVOKE

Data Types
Data type defines what type data will be stored in a particular column.

• Numeric Data Types


• Date and Time Data Types
• String Data Types
• Special Data Types

Numeric Data Type

Storage
Name Size Description Range
smallint 2 bytes small-range integer -32768 to +32767
Integer, int 4 bytes typical choice for -2147483648 to +2147483647
integer
bigint 8 bytes large-range integer -9223372036854775808 to
+9223372036854775807
decimal variable user-specified up to 131072 digits before the decimal point;
precision, exact up to 16383 digits after the decimal point
numeric variable user-specified up to 131072 digits before the decimal point;
precision, exact up to 16383 digits after the decimal point
real 4 bytes variable-precision, 6 decimal digits precision
inexact
double 8 bytes variable-precision, 15 decimal digits precision
precision inexact
smallserial 2 bytes small 1 to 32767
autoincrementing
integer

Page 28 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Storage
Name Size Description Range
serial 4 bytes autoincrementing 1 to 2147483647
integer
bigserial 8 bytes large 1 to 9223372036854775807
autoincrementing
integer

Monetary Types

The money type stores a currency amount with a fixed fractional precision. The fractional precision is
determined by the database's lc_monetary setting.

Name Storage Size Description Range

money 8 bytes currency amount -92233720368547758.08 to +92233720368547758.07

Character Types

Name Description

character varying(n), varchar(n) variable-length with limit

character(n), char(n), bpchar(n) fixed-length, blank-padded

bpchar variable unlimited length, blank-trimmed

text variable unlimited length

Date/Time Types

Page 29 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Storage
Name Size Description Low Value High Value

timestamp [ (p) ] [ without 8 bytes both date and time (no time 4713 BC 294276 AD
time zone ] zone)

timestamp [ (p) ] with time 8 bytes both date and time, with 4713 BC 294276 AD
zone time zone

date 4 bytes date (no time of day) 4713 BC 5874897 AD

time [ (p) ] [ without time 8 bytes time of day (no date) 00:00:00 24:00:00
zone ]

time [ (p) ] with time zone 12 bytes time of day (no date), with 00:00:00+1559 24:00:00-1559
time zone

interval [ fields ] [ (p) ] 16 bytes time interval -178000000 178000000


years years

Where (p) specifies the number of fractional digits retained in the seconds field and range is 0-6.

Interval field values:

YEAR

MONTH

DAY

HOUR

MINUTE

SECOND

Page 30 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Binary Data Types

Name Storage Size Description

bytea 1 or 4 bytes plus the actual binary string variable-length binary string

Boolean Data Type

Name Storage Size Description Valid Values

boolean 1 byte state of true or false True, yes,1,on False,no,0,off

Enumerated Types

We can create a new data type associated with a specific list of values. When any column declared with
this type then it will accept values from given list.

Create ENUM Type:

Syntax:
CREATE TYPE <typename> AS ENUM ('v1', 'v2', 'v3',….’vn’);

Example:
CREATE TYPE weekdays AS ENUM ('Monday', 'Tuesday', 'Wednesday',’Thursday’, ‘Friday’);

Data Definition Language


DDL consists of 4 statements namely CREATE, ALTER, DROP and TRUNCATE. This sub language is helpful in
defining table structure, (like name of table, columns names and their data types) change the definition of
table (like adding a new columns) or dropping the table.

Creating a Database

A database is a collection of database objects such as tables etc. It allows us to group the objects for
efficient management.

Syntax:
CREATE DATABASE <DATABASENAME>;

Page 31 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Example:
CREATE DATABASE students;

List Database or Table from PSQL

SHOW command helps in listing all the existing databases and tables which are present in the current
database.

Postgres=> \l
Postgres=> \d

Creating Tables

• A table is a named collection of rows


• Each table row has same set of columns
• Each column has a data type and optionally a constraint

Tables can be created using the CREATE TABLE statement

Syntax:
CREATE TABLE table_name
( [column_name data_type [ column_constraint], ..... );

Example:
CREATE TABLE COURSES (courseid int, course_name varchar(25), duration_in_hrs int);

Checking Table Structure from PSQL

Table structure can be displayed from PSQL command line client by the following meta command

Postgres=> \d <table name>

Changing the table definition (ALTER TABLE)

ALTER TABLE is used to modify the structure of a table

Can be used to:

• Add a new column


• Modify an existing column
• Add a constraint
• Drop a column

Page 32 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

• Rename a column, constraint or table

Syntax:
ALTER TABLE table_name ADD COLUMN column_name datatype column_constraint;
ALTER TABLE table_name DROP COLUMN column_name;
ALTER TABLE table_name RENAME COLUMN column_name TO new_column_name;
ALTER TABLE table_name ALTER COLUMN column_name [SET DEFAULT value | DROP DEFAULT];
ALTER TABLE table_name ALTER COLUMN column_name [SET NOT NULL| DROP NOT NULL];
ALTER TABLE table_name ADD CHECK expression;
ALTER TABLE table_name ADD CONSTRAINT constraint_name constraint_definition;
ALTER TABLE table_name RENAME TO new_table_name;

DROP TABLE

The DROP TABLE statement is used to drop a table All data and structure is deleted

Related indexes and constraints are dropped Syntax:

DROP TABLE [ IF EXISTS ] name [, ...];

TRUNCATE TABLE

Truncate table statement deletes all the rows from the given table and empties the table.

TRUNCATE TABLE <TABLE NAME>;

Data Manipulation Language

Data Definition Language consists of 4 statements INSERT, UPDTAE, DELETE and SELECT. These statements
are useful to add, modify or retrieve data from tables.

INSERT

INSERT statement add a new row to the table.

Syntax:
INSERT INTO <table name> VALUES (v1, v2,.....vn);

Note: Need to provide values for all the columns in the same order as they are arranged in the table.

INSERT statement can be used to add new row into the table by providing values only for few columns.

Syntax:
INSERT INTO <table name> (column1, column2, ...) VALUES (v1, v2,...);

Page 33 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

SELECT

SELECT statement allows us to retrieve the data from the tables.

Syntax:
SELECT * FROM <table name>;

Choosing Columns with SELECT:

Syntax:
SELECT column1, column2,... FROM <table name>;

UPDATE

UPDATE statement is useful to change the columns values in the table.

Syntax:
UPDATE <table name> SET <column1> = <value>, <column2> = <value> [WHERE
<condition>];

All the rows matching the giving conditions are changed. If WHERE is omitted then all rows in the table are
changed.

DELETE

DALETE statements deletes selected rows or all rows from the table.

Syntax:
DELETE FROM <table name> [WHERE <condition>];

If WHERE clause is omitted then all rows from the table are deleted.

PostgreSQL Operators
Operators helps to perform various data manipulations like arithmetic operations,
comparisons etc.

Arithmetic Operators

Operator Description Example Result

+ addition 2 + 3 5

Page 34 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Operator Description Example Result

- subtraction 2 - 3 -1

* multiplication 2 * 3 6

/ division (integer division truncates the result) 4 / 2 2

% modulo (remainder) 5 % 4 1

^ exponentiation (associates left to right) 2.0 ^ 3.0 8

|/ square root |/ 25.0 5

||/ cube root ||/ 27.0 3

@ absolute value @ -5.0 5

Comparison operators
Operator Description Usage

< Less than? 1 < 2

<= Less than or equal to? 1 <= 2

<> Not equal? 1 <> 2

= Equal? 1 = 1

> Greater than? 2 > 1

>= Greater than or equal to? 2 >= 1

|| Concatenate strings Postgre' || 'SQL'

!!= NOT IN 3 !!= i

~~ LIKE scrappy,marc,hermit' ~~ '%scrappy%'

!~~ NOT LIKE bruce' !~~ '%al%'

~ Match (regex), case sensitive thomas' ~ '*.thomas*.'

~* Match (regex), case insensitive thomas' ~* '*.Thomas*.'

!~ Does not match (regex), case thomas' !~ '*.Thomas*.'


sensitive
!~* Does not match (regex), case thomas' !~ '*.vadim*.'
insensitive
IS NULL Test whether value is NULL

IS TRUE Test whether value is true

Page 35 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

IS DISTINCT Test whether values are distinct x IS DISTINCT y

LIKE X LIKE '%s%'

IN X in (v1, v2,..vn)

Logical Operators
Operator Description

AND True if Both the given conditions are


true
OR True if Any of the given conditions is
true
NOT Logical negation

Special Operators for Sub Queries


Operator Description

ALL True if all sub query results meets the


condition
ANY True if any of the sub query results
meets the condition
EXISTS True if sub query returns results

Constraints
Constraints are used to enforce data integrity

PostgreSQL supports different types of constraints:

• NOT NULL
• CHECK
• UNIQUE
• PRIMARY KEY
• FOREIGN KEY

Page 36 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Constraints can be defined on a table at the time of creation or can also be enforced latter using ALTER
TABLE statement.

Primary Key

Any table can have only one primary key. It restricts duplicate values and NULL values in the column or
combination of columns. Primary key identifies all rows uniquely within the table it is defined. It is
recommended to have a primary key for each table.

Syntax-1
CREATE TABLE <table name> (
<columns name> <datatype> PRIMARY KEY,
<column name2> <datatype>,
..............
);

Syntax-2:
CREATE TABLE <table name> (
<columns name> <datatype> ,
<column name2> <datatype>,
<columns name3> <datatype>,
PRIMARY KEY (column1, column2......)
);

Syntax-3:
ALTER TABLE <table name> ADD CONSTRAINT PRIMARY KEY (col1, col2...);

Note: Creating a primary key also create a Unique index on the columns.

Unique Key

Like primary key Unique Key also restricts the duplicate values in the columns but it allows NULL values. A
table can have more than one Unique Keys

Syntax-1
CREATE TABLE <table name> (
<columns name> <datatype> UNIQUE,
<column name2> <datatype>,
..............
);

Syntax-2:

Page 37 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

CREATE TABLE <table name> (


<columns name> <datatype> ,
<column name2> <datatype>,
<columns name3> <datatype>,
UNIQUE(column1, column2......)
);

Syntax-3:
ALTER TABLE <table name> ADD CONSTRAINT <constraint name> UNIQUE (col1, col2...);

Note: Creating a Unique index also enforces UNIQUE constraints.

NOT NULL

NOT NULL constraint disallow NULL values in the columns.

Syntax:
CREATE TABLE <table name> (
<columns name> <datatype> NOT NULL,
<column name2> <datatype>,
..............
);

CHECK Constraint

CHECK constraint ensures that the values entered in the column satisfy the given condition.

Syntax-1
CREATE TABLE <table name> (
<columns name> <datatype>,
<column name2> <datatype> CHECK (<condition>,
..............
);

Syntax-2:
CREATE TABLE <table name> (
<columns name> <datatype> ,
<column name2> <datatype>,
<columns name3> <datatype>,
CONSTRAINT <constraint name> CHECK (<condition>)
);

Page 38 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Syntax-3:
ALTER TABLE <table name> ADD CONSTRAINT <constraint name> CHECK (<condition>);

FOREIGN KEY

FOREIGN KEY constraint restricts the values in the columns to the available values in the PRIMARY KEY of
another table. It establishes the link between two tables by referencing the PRIMARY KEY of another table.

Syntax-1
CREATE TABLE <table name> (
<columns name> <datatype>,
<column name2> <datatype> REFERENCES <another table name(<column>),
..............
);

Syntax-2:
CREATE TABLE <table name> (
<columns name> <datatype> ,
<column name2> <datatype>,
<columns name3> <datatype>,
CONSTRAINT <constraint name> FOREIGN KEY (column) REFERENCES
<tablename>(<condition>)
);

Syntax-3:
ALTER TABLE table_name
ADD CONSTRAINT constraint_name
FOREIGN KEY (column_name) REFERENCES parent_table (parent_column);

• Constraints can be defined at the column level or table level


• Constraints can be added to an existing table using the ALTER TABLE statement
• Constraints can be declared DEFERRABLE or NOT DEFERRABLE
• Constraints prevent the deletion of a table if there are dependencies

Default values

Any columns can be assigned a default value which will replace NULL in case if the values for the column is
not provided.

Syntax-1

Page 39 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

CREATE TABLE <table name> (


<columns name> <datatype>,
<column name2> <datatype> DEFAULT <value>,
..............
);

Syntax-2:
ALTER TABLE table_name ALTER COLUMN column_name [SET DEFAULT value | DROP DEFAULT];

Generated Columns

Value of generated column is computed from the expression given in columns definition.

Example1:
CREATE TABLE t1 (
first_name VARCHAR(10),
last_name VARCHAR(10),
full_name VARCHAR(255) GENERATED ALWAYS A
(CONCAT(first_name,' ',last_name)) );

Example1:
CREATE TABLE trinagles(
base float,
height float,
area float GENERATED ALWAYS AS (base * height * 0.5) );

Built-In Functions
Numeric Functions

Function Description Example Result

ABS Calculate the absolute value of a number ABS(-10) 10

CBRT Calculate the cube root of a number CBRT(8) 2

Page 40 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Round a number up to the nearest integer, which is greater


CEIL CEIL(-12. 8) -12
than or equal to number

DIV Return the integer quotient of two numeric values DIV(8,3) 2

Return the exponential value in scientific notation of a


EXP EXP(1) 2.7182818
number

FLOOR
Round a number down to the nearest integer, which is less
FLOOR(1 0.6) 10
than or equal to number

LN Return the natural logarithm of a numeric value LN(3) 1.0986122

LOG Return the base 10 logarithm of a numeric value LOG(100 0) 3

LOG Return the logarithm of a numeric value to a specified base LOG(2, 64) 6

Divide the first parameter by the second one and return the
MOD MOD(10,4 ) 1
remainder

PI Return the value of PI PI() 3.1415926

Raise a numeric value to the power of a second numeric


POWER POWER( 5, 3) 125
value

RADIANS Convert degrees to radians RADIANS (60) 1.0471975

Page 41 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

ROUND
Round a number to the nearest integer or to a specified
ROUND(1 0.3) 10
decimal places

SCALE Return the number of decimal digits in the fractional part SCALE(1. 234) 3

SIGN Return the sign (positive, negative) of a numeric value SIGN(-1) -1

SQRT Return the square root of a numeric value SQRT(3.0 ) 1.7320508

TRUNC
Truncate a numeric value to a whole number of to the
TRUNC(1 2.3) 12
specified decimal places

1 0.9684356
RANDOM Return a random number that ranges from 0 to 1 random()

String Functions

Function Description Example Result

Return the ASCII code value of a


ASCII character or Unicode code point of a ASCII(‘A’) 65
UTF8 character

Convert an ASCII code to a character


CHR or a Unicode code point to a UTF8 CHR(65) A’
character

Page 42 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

CONCAT
Concatenate two or more strings into
CONCAT(‘A’,’B’,’C’) ABC’
one

CONCAT_WS
Concatenate strings with a separator CONCAT_WS(‘,’,’A’,’B’,’ C’) A,B,C’

INITCAP
Convert words in a string to title case INITCAP(‘hI tHERE’) Hi There

LEFT Return the first n character in a string LEFT(‘ABC’,1) A’

LENGTH
Return the number of characters in a
LENGTH(‘ABC’) 3
string

LOWER
Convert a string to lowercase LOWER(‘hI tHERE’) hi there’

Pad on the left a a string with a


LPAD LPAD(‘123′, 5, ’00’) 00123’
character to a certain length

LTRIM Remove the longest string that


contains specified characters from LTRIM(‘00123’) 123’
the left of the input string

Page 43 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Return MD5 hash of a string in


MD5 MD5(‘ABC’)
hexadecimal

POSITION
Return the location of a substring in a
POSTION(‘B’ in ‘A B C’) 3
string

Repeat string the specified number of


REPEAT REPEAT(‘*’, 5) *****’
times

REPLACE Replace all occurrences in a string of


REPLACE(‘ABC’,’B’,’A’) AAC’
substring from with substring to

REVERSE Return reversed string. REVERSE(‘ABC’) CBA’

Return last n characters in the string.


RIGHT
When n is negative, return all but first RIGHT(‘ABC’, 2) BC’
|n| characters.

Pad on the right of a string with a


RPAD RPAD(‘ABC’, 6, ‘xo’) ABCxox’
character to a certain length

Remove the longest string that


RTRIM
contains specified characters from RTRIM(‘abcxxzx’, ‘xyz’) abc’
the right of the input string

SUBSTRING
Extract a substring from a string SUBSTRING(‘ABC’,1,1) A’

Remove the longest string that


contains specified characters from
TRIM TRIM(‘ABC ‘) ABC’
the left, right or both of the input
string

UPPER
Convert a string to uppercase UPPER(‘hI tHERE’) HI THERE’

Page 44 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

splits a string on a specified delimiter


and returns the nth substring.
Split_part Slipt_part(‘wel-come’,’-‘,2) come
SPLIT_PART(string, delimiter,
position)

Date Functions

Function Description Example

current_date Returns Current date select current_date;

current_time Returns Current Time select current_time;

now() Returns Current date and time select now();

select
current_timesta mp Returns Current date and time
current_timestamp ;

Return the FIELD part from the date. Valid FIELDS are below:
extract(FIELD from date select extract(day from
'<date>') YEAR, DAY, MONTH, HOUR, MINUTE, SECOND, date '2021-05-24')
MILLISECONDS, DOY, DOW, CENTURY, QUARTER

age(timestamp '2001-
age ( timestamp, times Returns difference between given 2 dates or given date and
04-10', timestamp
tamp ) now()
'1957-06-13')

pg_sleep(n) Sleeps n number of seconds select pg_sleep(10)

Page 45 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

select pg_sleep('5
pg_sleep_for('interval') Sleeps for given interval
minutes')

Date Arithmetic
Date Addition

Date Additions can be done using interval key word.


select now() + interval '13 days'
select now() + interval '1 year'
select now() + interval '1 month'
select date '2021-02-15' + interval '1 month'

Interval can be in DAYS, MONTH, YEAR, HOURS, MINUTES and SECONDS

Date Subtraction
select date '2021-02-15' - date '2021-03-15' ;
select date '2021-02-15' + interval '30 days';
select ename, now()-hiredate from emp;
select ename, (now()-hiredate) from emp;
select ename, age(now(),hiredate) from emp;

Data Type Formatting Functions


to_char()
to_date()
to_number()
to_timestamp()

to_char() function can be used to format date or numeric values and output will be treated as
string.
Examples:
to_char(timestamp '2002-04-20 17:31:12.66', 'HH12:MI:SS')
to_char(now(), 'YYYY-MON-DD')
to_char(125, '999') → 125
to_char(125.8::real, '999D9') → 125.8
to_char(-125.8, '999D99S') → 125.80-

to_date() Function

Page 46 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

to_date('05 Dec 2000', 'DD Mon YYYY') → 2000-12-05


to_date('05-Dec-2000', 'DD-MON-YYYY') -> 2000-12-05

Date Time formatting Patterns

Pattern Description

HH hour of day (01–12)

HH12 hour of day (01–12)

HH24 hour of day (00–23)

MI minute (00–59)

SS second (00–59)

MS millisecond (000–999)

AM, am, PM or pm meridiem indicator (without periods)

YYYY year (4 or more digits)

BC, bc, AD or ad era indicator (without periods)

MONTH full upper case month name (blank-padded to 9 chars)

MON abbreviated upper case month name (3 chars in English, localized lengths vary)

MM month number (01–12)

DAY full upper case day name (blank-padded to 9 chars)

DY abbreviated upper case day name (3 chars in English, localized lengths vary)

DD day of month (01–31)

TZ upper case time-zone abbreviation (only supported in to_char)

to_number() Function
to_number('122345', '99G999D9S') -> 12,234.5

Page 47 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Numeric Formatting

Pattern Description

9 digit position (can be dropped if insignificant)

0 digit position (will not be dropped, even if insignificant)

. (period) decimal point

, (comma) group (thousands) separator

PR negative value in angle brackets

S sign anchored to number (uses locale)

L currency symbol (uses locale)

D decimal point (uses locale)

G group separator (uses locale)

RN Roman numeral (input between 1 and 3999)

TH or th ordinal number suffix

EEEE exponent for scientific notation

Aggregate Functions
sum()
avg()
count()
min()
max()
string_agg()

Examples of Aggregate Function:


employees=# select sum(sal) from emp;
sum
----------
29025.00 (1 row)
employees=# select max(sal) from emp;

Page 48 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

max
---------
5000.00 (1 row)
employees=# select min(sal) from emp;
min
--------
800.00 (1 row)
employees=# select count(sal) from emp;
count
-------
14 (1 row)
employees=# select avg(sal) from emp;
avg
-----------------------
2073.2142857142857143
(1 row)
employees=# select job, count(*) from emp group by job;
job | count
-----------+-------
CLERK | 4
PRESIDENT | 1
MANAGER | 3
SALESMAN | 4
ANALYST | 2
(5 rows)
employees=# select deptno, count(*) from emp group by deptno;
deptno | count
--------+-------
30| 6
10| 3
20| 5
(3 rows) employees=#
employees=# select deptno, count(*) from emp group by deptno having count(*) < 4;
deptno | count
--------+-------
10| 3
(1 row)
employees=#
employees=# select * from emp limit 3;
employees=# select * from emp order by ename;
employees=# select job, count(*) from emp group by job order by count(*);
job | count

Page 49 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

-----------+-------
PRESIDENT | 1

job | count
-----------+-------
CLERK | 4
SALESMAN | 4
MANAGER | 3
ANALYST | 2
PRESIDENT | 1

(5 rows)

Page 50 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 5 : SELECT Statement

Page 51 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

The SELECT statement can be used to retrieve (select) data from a table or an expression. It is one of the
most complex statement in SQL language with a lot of clauses to perform data retrieval and
transformation.

The SELECT statement can be used to retrieve (select) data from a table or an expression. It is one of the
most complex statement in SQL language with a lot of clauses to perform data retrieval and
transformation.

The SELECT statement has the following clauses:

• Select distinct rows using DISTINCT operator.


• Sort rows using ORDER BY clause.
• Filter rows using WHERE clause.
• Select a subset of rows from a table using LIMIT or FETCH clause.
• Group rows into groups using GROUP BY clause.
• Filter groups using HAVING clause.

Syntax:
SELECT [DISTINCT] [columns selection] [FROM <table>] [WHERE <expression>]
[GROUP BY <Expression> [HAVING <Expression>]] [ORDER BY <sort expression> ASC |
DESC] [LIMIT row_count OFFSET row_to_skip]

Aliasing

Alias is a temporary name given to a table or a column during execution of the query.

Below are the various ways of using an alias.

Column Alias

• Column aliases can be used in the SELECT list of a SQL query in PostgreSQL.
• Like all objects, aliases will be in lowercase by default. If mixed-case letters or special symbols, or
spaces are required, quotes must be used.
• Column aliases can be used for derived columns.
• Column aliases can be used with GROUP BY and ORDER BY clauses.
• We cannot use a column alias with WHERE and HAVING clauses.

Page 52 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Table Alias

• Table aliases can be used in SELECT lists and in the FROM clause to show the complete record or
selective columns from a table.
• Table aliases can be used in WHERE, GROUP BY, HAVING, and ORDER BY clauses.
• When we need data from multiple tables, we need to join those tables by qualifying the columns using
table name/table alias.
• The aliases are mandatory for inline queries (queries embedded within another statement) as a data
source to qualify the columns in a select list.

Example:
SELECT ename employee_name, empno AS employee_id FROM emp;

Quoting
Single Quote / $ (dollar Quote)

• Single quotes and dollar quotes are used to specify non-numeric values

Example:
'hello world'
'2011-07-04 13:36:24' '{1,4,5}'
$$A string "with" various 'quotes' in.$$
$foo$A string with $$ quotes in $foo$

Double Quote

• Double quotes are used for names of database objects which either clash with keywords, contain
mixed case letters, or contain characters other than a-z, 0-9 or underscore

Example:
select * from "select" create table "HelloWorld" ...
select * from "Hi everyone and everything"

PostgreSQL Joins, Unions & Sub Queries


Table joining technique allows to combine rows from from 2 or more table. Following join types can be
used in PostgreSQL

Page 53 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

• Cross Join
• Inner Join
• Natural Join
• Left Outer Join
• Right Outer Join
• Full Outer Join

Join Condition

The join condition determines which rows from the two source tables are considered to be a “match”. The
join condition is specified with the ON or USING clause or implicitly by the word NATURAL.

Cross Join

Cross Join is a cartesian product which contains all rows from first table are joined with all rows from
second table. For example if the first table has m rows and second table has n rows then resultant cross
joined tables will have m*n rows.

Inner Join

Inner also called as Join combines only matching rows for two tables and leaves out the rows without
matches.

Syntax:
Select <columns list> from tableA A Inner Join TableB B On <condidition>;
Select <column list> from TableA A Join TableB B On <condition>;
Select <column list> from TableA A Join TableB B Using <common column>;

Natural Join

Natural Join is Inner Join only difference is that there is no need to give any Join Condition.

Syntax:
Select <column list> from TableA A Natural Join TableB B;

Left Outer Join

Left Outer Join also called as Left Join combines all matching rows from both table plus rows from left side
table which does not find matches also included.

Syntax:
Select <columns list> from tableA A Left Outer Join TableB B On <condidition>;

Page 54 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Select <column list> from TableA A Left Join TableB B On <condition>;


Select <column list> from TableA A Left Join TableB B Using <common column>;

Right Outer Join

Right Outer Join also called as Right Join combines all matching rows from both table plus rows from Right
side table which does not find matches also included.

Syntax:
Select <columns list> from tableA A Right Outer Join TableB B On
<condidition>;
Select <column list> from TableA A Right Join TableB B On <condition>;
Select <column list> from TableA A Right Join TableB B Using <common column>;

Full Outer Join

Full Outer Join also called as Full Join combines all matching rows from both table plus rows from both left
side table and right side table which does not find matches also included.

Syntax
Select <columns list> from tableA A Full Outer Join TableB B On <condidition>;
Select <column list> from TableA A Full Join TableB B On <condition>;
Select <column list> from TableA A Full Join TableB B Using <common column>;

Combining Queries (Unions)

Results of 2 or more queries can be combined using SET operations like UNION, INTERSECT. Supported set
operations in PotgreSQL as follows:

• UNION
• UNION ALL
• INTERSECT
• EXCEPT

UNION

Union set operator allows to combine the results of 2 queries as a single result set.

The following are requirements for the queries in the syntax above:

o The number and the order of the columns must be the same in both queries.
o The data types of the corresponding columns must be the same or compatible.

Page 55 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Syntax:
select <col1>, <col2>, ... from <table1> UNION select <col1>, <col2> ... from
<table2>

UNION ALL

UNION ALL works same as UNION but it combines all elements from both the sets where as UNION
returns common elements only once

Syntax:
select <col1>, <col2>, ... from <table1> UNION ALL select <col1>, <col2> ... from
<table2>

INTERSECT

Intersect allows to combine results of 2 queries and returns only common rows between them.

Syntax:
select <col1>, <col2>, ... from <table1> INTERSECT select <col1>, <col2>,.....
from <table2>

EXCEPT

EXCEPT compares the result sets of two queries and returns the distinct rows from the first query that are
not output by the second query.

Syntax:
select <col1>, <col2>, ... from <table1> EXCEPT select <col1>, <col2>,..... from
<table2>

Sub Queries

A Sub Query is also a query, which is defined under a main query. First Query is also called as outer query
or main query and second query is also called as inner query or subquery.

Server first sub query executed based on the sub query value main query will get executed.

If a sub query send single value to its nearest main query then that sub query is called Single-Valued-Sub
query.

If a sub query sends multiple values to its nearest main query then that sub query is called Multi-Valued-
Sub query.

Page 56 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Note: If a sub query sends multiple values to its nearest main query then we have to use IN or NOT IN
operator between Main query and Sub query.

Single valued Subquery:


SELECT * FROM TABLENAME WHERE COLUMNNAME = (SELECT STATEMENT WHERE CONDITION)

Multi valued subquery

A subquery which returns more than one row to the outer query is called as Multi Values subquery.
Multivalues values sub queries can be handled by IN,
Multivalue SubQuery
select * from tablename where columname in (select statement where condition)

Example:

display all emp details whose earning salary above avg.salary


select * from emp where sal>=(select avg(sal) from emp)
select * from student where courseid in (select courseid from course where duration >
6)

Correlated Sub Query

In Correlated sub queries the outer query and the inner (sub) query are executed simultaneously.
Example:

display max salary for each dept and corresponding emp details
select * from emp e where sal=(select max(sal) from emp where deptno=e.deptno)

"First outer query gets executed, extracts 1 row at a time(candidate row) and that row is given to inner
query for processing, inner query will provide its output to outer query and based on the condition outer
query will display the extracted record".

Page 57 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 6: Creating and Managing Databases

Page 58 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Object Hierarchy

Database
• A running PostgreSQL server can manage multiple databases.
• A database is a named collection of SQL objects. It is a collection of schemas and the schemas
contain the tables, functions, etc.
• Databases are created with CREATE DATABASE command.
• Databases are destroyed with DROP DATABASE command .
• To determine the set of existing databases:
• SQL: SELECT datname FROM pg_database;
• PSQL META COMMAND: \l (backslash lowercase L)

Creating a Database

There is a program that you can execute from the shell to create new databases, createdb.

Syntax:
$ createdb <dbname>

Create Database SQL command can be used to create a database in a cluster.

Syntax:

Page 59 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

postgres=# CREATE DATABASE name


[ [ WITH ] [ OWNER [=] user_name ]
[ TEMPLATE [=] template ]
[ ENCODING [=] encoding ]
[ LC_COLLATE [=] lc_collate ]
[ LC_CTYPE [=] lc_ctype ]
[ TABLESPACE [=] tablespace_name ]
[ ALLOW_CONNECTIONS [=] allowconn ]
[ CONNECTION LIMIT [=] connlimit ]
[ IS_TEMPLATE [=] istemplate ] ]

Switching to a Database:
postgres=# \connect <dbname>

(meta shortcut \c can also be used instead of \connect).

Connecting to a Database:

We can connect to database using psql client

$ psql -U <USer> <dbname>


dname=#

Querying How Many Database Exists:

We can query pg_database system table to find the list of databases in the cluster.

Example:
postgres=# select datname from pg_database;

Or we can also use psql meta command to find the list of databases created so far.

postgres=# \l

Dropping Database:
postgres=# DROP DATABASE <dbname>;

(only Owner or postgres user can drop a database)

What is a Schema
• A database can contains one or more named schemas.
• By Default, all database contain public schema.

Page 60 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

• There are several reasons why one might want to use schemas:
✓ To allow many users to use one database without interfering with each other.
✓ To organize database objects into logical groups to make them more manageable.
✓ Third-party applications can be put into separate schemas so they cannot collide with the
names of other objects.

Relation Between USER / SCHEMA and OBJECTS

User Owns Schema and Schema contains the database objects like table, views etc.

PUBLIC Schema

Every database will be having a default schema named PUBLIC accessible to OWNER of database.

Creating Schema

A Schema can be created by connecting to the desired database and running the CREATE SCHEMA
command.

Syntax:
CREATE SCHEMA <schema name> AUTHORIZATION <username>;

For example if we need to create a schema “raj” in PROD database for user ‘raj’ then run the following
commands

Example
postgres=# \c prod

prod=# CREATE SCHEMA raj AUTHORIZATION raj;

Now a new schema called “raj” is created in the database PROD which is owned by user “raj”

Schema Search Path

Object in the database should be referred as <schema_name>.<object_name> however it is tedious to


give qualified names. To make this easier we can set SCHEMA SEARCH PATH so that the objects in the
schemas mentioned in SCHEMA SEARCH PATH can be accessed without qualified names.

To Check the Current Search PATH

To see the current search path use SHOW Command

postgres=# SHOW search_path;

Page 61 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

To SET the search path use SET command


postgres=# SET search_path TO myschema, public, raj;

Listing Schemas
postgres=# \dn
postgres=# select * from pg_namespace;

Database Objects

Database Schemas can contain different types of objects including:

• Tables
• Sequences
• Views
• Synonyms (only in commercial Postgresql)
• Domains
• Packages (only in commercial Postgresql)
• Functions
• Procedures

Page 62 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 7: Advance Querying

Page 63 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

CASE Function
The CASE statement in PostgreSQL allows you to add conditional logic to your SQL queries. Here’s a basic
syntax and example:

Syntax:

SELECT

column1,

column2,

CASE

WHEN condition1 THEN result1

WHEN condition2 THEN result2

ELSE default_result

END AS new_column

FROM

table_name;

Example :

SELECT

empno,

ename,

sal,

CASE

WHEN sal > 2500 THEN 'High'

WHEN salary BETWEEN 1500 AND 2500 THEN 'Medium'

ELSE 'Low'

END AS salary_category

FROM

emp;

Page 64 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

COALESCE
The COALESCE function in PostgreSQL is used to return the first non-null value from a list of expressions.
This is particularly useful when dealing with potential NULL values in your data.

Syntax:

SELECT COALESCE(NULL, NULL, 'A BLANK', 'Default')

-- Output: 'A BLANK'

Example:

SELECT

empno,

ename,

sal + COALESE(comm, 0) AS Take_home

FROM emp;

NULLIF
The NULLIF function in PostgreSQL is used to compare two expressions and returns NULL if they are equal.
Otherwise, it returns the first expression.

Examples:
SELECT NULLIF(10, 10); -- Output: NULL
SELECT NULLIF(10, 20); -- Output: 10

Use Cases:

• NULLIF is commonly used to replace empty strings ('') or other default values with NULL in SELECT
statements.
• useful for avoiding division by zero errors

Type Casting
Data type casting is helpful to convert a datatype to another type. It can be done by using CAST function
or by using :: operator.

Syntax:

CAST ( expression AS target_data_type )

Page 65 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

-- OR

expression::target_data_type

Example:

SELECT CAST('2025-01-26' AS DATE);

-- OR

SELECT '2025-01-26'::DATE;

Example:

SELECT CAST(123 AS TEXT);

-- OR

SELECT 123::TEXT;

JSON / JSONB
JSON & JSONB datatype brings NoSQL support in PostgreSQL allowing users to save data in JSON objects.
PostgreSQL validates the syntax any input and rejects if it does not confirm to JSON syntax.

JSON support was included in PostgreSQL in version 9.2 and JSONB support included in version 9.4.

JSONB is much better choice to use as it is faster and supports many features.

comparison JSON vs JSONB


JSON JSONB
Stores data in plain text Stores data in binary format
Preserve the indentation of input data Does not preserve indentation of input data
Does not support indexing Supports Indexing (GIN indexes)
Does not allow adding or removing field Allows adding OR removing fields

Creating a Table With JSONB


create table emp_json (id serial, details JSONB);

Inserting Data in Emp_Json Table


insert into emp_json (details) values ('{ <valid JSON document> }'), ('{ <valid JSON
document> }'), ..

Page 66 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Querying
select * from emp_json;

Querying a Specific JSON Field


select details -> 'ename' from test;

Avoiding Quotes in Output


select details ->> 'ename' from test;

Using a JSON Field in WHERE Clause


select details ->> 'ename', details ->> 'job', details ->> 'sal' from test where
details ->> 'ename' = 'FORD';
select details ->> 'ename' as ename, details ->> 'job' as Job, details ->> 'sal' as
sal from test where cast (details ->> 'sal' as integer) >= 2000;

Using a JSON Field in Order by Clause


select details ->> 'ename' as ename, details ->> 'job' as Job, details ->> 'sal' as
sal from test order by cast (details ->> 'sal' as numeric);

Using a JSON Field in Group by Clause


select details ->> 'job' as Job, count(*) from test group by details ->> 'job'

Check if the JSON Field Exists


select details ->> 'ename' as ename, details ->> 'job' as Job, details ->> 'sal' as
sal from test where details ? 'sal';

Check if a Key Value Pair Exists in


select details ->> 'ename' as ename, details ->> 'job' as Job, details ->> 'sal' as
sal from emp_json where details @> '{"ename":"KING"}';

Updating a Field (Only JSONB)


update emp_json set details = jsonb_set(details, '{sal}', '1000',false) where details
->> 'ename' = 'SMITH';

Removing a Field (Only JSONB)

update emp_json set details = details - 'comm' where details ->> 'ename' = 'SMITH';

Page 67 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

CTE - Common Table Expressions


Common Table Expressions (CTE) are useful to simplify the complex queries. CTEs are the auxiliary query
whose results can be used by the primary query. Result set of CTEs auxiliary query is temporary until
primary query execution completes.

Syntax:
WITH <cte_name> AS (auxiliary query) primary_query;

Example:
WITH emp_salgrade AS (select ename, job, sal, deptno, grade from emp join salgrade on
sal between losal and hisal) select ename, job, sal, grade from emp_salgrade es join
dept d on es.deptno=d.deptno;

Data Modifying Statement

Both the primary and auxiliary queries can be data modifying queries like INSERT, UPDATE or DELETE.
Temporary result set of the auxiliary query can be referenced by primary query only if the RETURNING *
clause is used otherwise not.

Example:
WITH del_table AS (delete from visitor_logs where log_date < CURRENT_DATE - interval
'1 month' RETURNING *) insert into visitor_archive select * from del_table;

The above statement deletes the rows from visitor_log table and also insert them in visitor_archive table.

Window Functions
Windows functions performs the calculations on set of rows that are related to current row.

Window Functions
o min(),
o max(),
o avg()
o rank() – ranks the output
o row_number() - generates the row numbers
o lag() - Showing the value of previous row in current row
o lead() - Showing the value of next row in current row
o last_value() - Showing last value

Page 68 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

o first_value() - Showing first value

min()
select ename, job, sal, min(sal) over (partition by job) from emp;
select ename, job, sal, min(sal) over (partition by job order by sal desc ) from emp
select ename, job, sal, min(sal) over () from emp

rank()
select ename, job, sal, rank() over (order by sal desc ) from emp;
select ename, job, sal, rank() over (partition by job order by sal desc ) from emp;

row_number()
select ename, job, sal, row_number() over (order by sal desc ) from emp;

lag()
select ename, job, sal, lag(sal,1) over (partition by job order by sal desc ) as
prev_sal from emp;

lead()
select ename, job, sal, lead(sal,1) over (partition by job order by sal desc ) as
next_sal from emp;

last_value()
select ename, job, sal, last_value(sal) over (partition by job order by sal desc )
from emp;

first_value()

select ename, job, sal, first_value(sal) over (partition by job order by sal desc ) from emp;

Page 69 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 8: Indexes & Views

Page 70 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

PostgreSQL Indexes
An index allows the database server to find and retrieve specific rows much faster than it could do without
an index.

Types of Indexes Supported by PostgreSQL

o Hash Indexes
o B-Tree Indexes
o GIN Indexes
o GiST Indexes
o BRIN Indexes

B-Tree Indexes

This the default type of index created in PostgreSQL. B-Tree indexes can handle equality searches and
range searches efficiently. If the searches with following operators are done then optimizer selects B-Tree
index, if present.

Operations supported by B-tree


=, >, <, >=, <=, IS NULL, IS NOT NULL, BETWEEN, IN

However LIKE operator is supported only if the search term is anchored at the beginning of string (eg.
ename LIKE "foo%).

Syntax:
CREATE INDEX <indexname> ON <table_name> (<indexed_column> [ASC | DESC] [NULLS
{FIRST | LAST }, ,indexed_column2....);

Indexes and ORDER BY

You can adjust the ordering of a B-tree index by including the options ASC, DESC, NULLS FIRST, and/or
NULLS LAST when creating the index;

Syntax:
CREATE INDEX <index name> ON <tablename> (<col> NULLS FIRST);
CREATE INDEX <index name> ON <tablename> (<col> DESC NULLS LAST);

This will save sorting time spent by the query


Multicolumn Index

Page 71 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

If any of frequently running queries uses multiple columns in where clause then for such queries a
multicolumn index can be created which includes all columns used in where clause.

This speeds up the query execution.

Syntax:
CREATE INDEX test_idx1 ON test (id_1, id_2);

Currently, only the B-tree, GiST, GIN, and BRIN index types support multicolumn indexes. Up to 32
columns can be specified

Unique Indexes

Indexes can also be used to enforce uniqueness of a column's value or the uniqueness of the combined
values of more than one column

Syntax:
CREATE UNIQUE INDEX name ON table (column [, ...]);

Currently only B-tree indexes can be declared unique.

PostgreSQL automatically creates a unique index when a unique constraint or primary key is defined for a
table

Functional Indexes

An index can be created on a computed value from the table columns. Usually if the expression used in
where clause optimizer ignore the indexes and prefer full tablescan. We can avoid full tablescans by
creating the index with expression instead of column.

Syntax:
CREATE INDEX <index name> ON <tablename> (lower(col1));

They can also be created on user defined functions


Partial Index

A partial index is an index built over a subset of a table rather than whole table which reduces
maintenance of the index.

The index contains entries only for those table rows that satisfy the predicate and also optimizer uses this
index when the query predicate matches the index predicate.

Syntax:
CREATE INDEX <indexname> on <tablename> (<column>) where <condition;

Page 72 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Views
The view is not physically object or table. Instead, the query is run every time the view is referenced in a
query.

PostgreSQL supports 2 types Views


o Normal Views
o Materialized Views

Normal Views
Creating Views

Before a view is used it needs to be created.

syntax
CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] VIEW name [ WITH ( view_option_name [=
view_option_value] [, ... ] ) ]
AS query
[ WITH [ CASCADED | LOCAL ] CHECK OPTION ]

Syntax Explanation

TEMPORARY or TEMP

If specified, the view is created as a temporary view. Temporary views are automatically dropped at the
end of the current session.
name

The name (optionally schema-qualified) of a view to be created


WITH (options)

This clause specifies optional parameters for a view; the following parameters are supported:
check_option (string)

allowed values are LOCAL or CASCADED.

Local: New rows are only checked against the conditions defined directly in the view itself

CASCADED: New rows are checked against the conditions of the view and all underlying base views

security_barrier (boolean)

Page 73 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

This should be used if the view is intended to provide row-level security.

security_invoker (boolean)

This option causes the underlying base relations to be checked against the privileges of the user of the
view rather than the view owner

query

A SELECT command which will provide the columns and rows of the view.

WITH [ CASCADED | LOCAL ] CHECK OPTION

When this option is specified, INSERT and UPDATE commands on the view will be checked to ensure that
new rows satisfy the view-defining condition If they are not, the update will be rejected.

This same as WITH (options) clause

Changing the Query of View

It is possible to change the query of the view only if all the existing columns are included in new query
otherwise such change is rejected however where clause can be changed.

To change the query of any view use


CREATE OR REPLACE VIEW name AS New-Query;

Altering View

A view can be alerted to change OWNER, SCHEMA, RENAME and view options

Syntax
ALTER VIEW [ IF EXISTS ] name OWNER TO { new_owner | CURRENT_USER | SESSION_USER };
ALTER VIEW [ IF EXISTS ] name RENAME TO new_name;
ALTER VIEW [ IF EXISTS ] name SET SCHEMA new_schema;
ALTER VIEW [ IF EXISTS ] name SET ( view_option_name [= view_option_value] [, ... ]
);
ALTER VIEW [ IF EXISTS ] name RESET ( view_option_name [, ... ] );

Drop View

A view can be dropped when it is not referenced by other views otherwise a error is returned.

Syntax:

Page 74 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

DROP VIEW name [CASCADE];

CASCADE option deletes all dependent objects as well

Materialized Views

Materialized views are created same like normal view the only difference is that the results are persisted
in materialized until materialized is refreshed. Materialized view behaves similar to tables.

Materialized view are read only.

Creating Materialized View

Syntax:
CREATE MATERIALIZED VIEW name AS query;

Example
CREATE MATERIALIZED VIEW mymatview AS SELECT * FROM emp;

Refreshing the Materialized view

Changes made to the base table are not reflected in materialized view until it is refreshed.
Syntax
REFRESH MATERIALIZED VIEW view name;

Example
REFRESH MATERIALIZED VIEW mymatview;

Indexes

Indexes can be created on materialized views in same way as they can be created on tables

Dropping Materialized View

A materialized view can be dropped just like any other object

Syntax
DROP MATERIALIZED VIEW [ IF EXISTS ] name;

Example
DROP MATERIALIZED VIEW myview;

Page 75 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 9: Transactions

Page 76 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Transactions provide a way of grouping DML statements and before accepting the changes we can review
and other accepts them all or discard the changes. Other benefit of the transaction is it protect transaction
from concurrently running other transactions and manipulating same data.

How it works

Every transaction starts with START TRANSACTION statement and ends with either COMMIT or ROLLBACK
statement.

Syntax:
START TRANSACTION;
some DML Statements;
COMMIT; / ROLLBACK;

Example:
START TRANSACTION;
SELECT * FROM EMP;
UPDATE EMP SET SAL = SAL + 100 WHERE ENAME = 'MILLER';
COMMIT;

Syntax:
START TRANSACTION;
some DML Statements;
SAVEPOINT <savepoint name>;
some more DML statements;
ROLLBACK TO SAVEPOINT <savepoint name>;
COMMIT; / ROLLBACK;

Page 77 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 10: Copying and Moving Data

Page 78 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Exporting / Importing Data


COPY command can be used for both importing and exporting data from / to a table to CSV, TEST or
BINARY files.

Exporting

A table data can be exported to a portable CSV or TAB Separated text format. PostgreSQL gives flexibility
of choosing which columns to include or can also specify a query instead of Table.

Syntax 1:
COPY table_name ( column_name [, ...] ) TO 'filename' WITH Options…..;

Syntax 2:
COPY ( SELECT query ) TO 'filename' WITH Options…..;

Syntax 3:
COPY ( SELECT query ) TO PROGRAM 'command' WITH Options…..;

Syntax 4:
COPY table_name ( column_name [, ...] ) TO PROGRAM 'command' WITH Options…..;

OPTIONS:
CSV

TEXT

DELIMITER <character>

Null <null replacement>

QUOTE ,quotation character>

HEADERS {TRUE | FALSE}

Example of Exporting Emp table:

In this example EMP table is exported to ‘/var/lib/pgsql/emp.csv’ file in CSV format


postgres=# copy emp to '/var/lib/pgsql/emp.csv' with CSV delimiter ',' null 'NULL'
quote '"' ;
COPY 14

Page 79 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Importing
Any CSV or TAB Separated text file can be imported into a table If the column count and data types
matches irrespective of the origin of the file.
COPY <TableName>(Column1, Column2, ….) FROM /path/to/CSVfile WITH options.. [
WHERE condition ];

Options are same a Exporting options.

Example of Importing into emp_test table:


postgres=# copy emp from '/var/lib/pgsql/emp.csv' with CSV delimiter ',' null 'NULL'
quote '"' ;
COPY 14

Table to Table Data Transfer


Copy Data from Existing Table

Insert statement allows inserting data from a table to another using SELECT query provided provided the
datatype and constraints are not in conflict

Syntax:
INSERT INTO target_table SELECT * FROM source_table [WHERE ..];

Create New table and Insert Data from Existing Table


Create table statement allows to create a new table based on a SELECT Query.

Syntax:
create table new_table as select * from source_table ;

Create an Empty Table like Existing Table


We can create a new empty table with same structure as any of the existing tabl.

Syntax:
CREATE TABLE new_table (LIKE existing_table );

Page 80 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

NoSQL Databases

Page 81 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 11: Introduction

Page 82 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

NoSQL Database

NoSQL is an approach to database design that can accommodate a wide variety of data models, including
key-value, document, columnar and graph formats. NoSQL, which stand for "not only SQL," is an
alternative to traditional relational databases in which data is placed in tables and data schema is carefully
designed before the database is built. NoSQL databases are especially useful for working with large sets of
distributed data.

Evolution of NoSQL

The evolution of NoSQL started in 1990's with the development and release of berkeleyDB. Developed by
University of California, Berkeley, influenced the development of NoSQL which supported certain
applications with specific storage needs.

NoSQL database Classification

NoSQL database is been classified in the following ways.

Key-value stores

Key-value stores, or key-value databases, implement a simple data model that pairs a unique key with an
associated value. Because this model is simple, it can lead to the development of key-value databases,
which are extremely performant and highly scalable for session management and caching in web
applications.

Examples
Aerospike, Berkeley DB, MemchacheDB, Redis and Riak.

Document databases

Document databases, also called document stores, store semi-structured data and descriptions of that
data in document format. They allow developers to create and update programs without needing to
reference master schema. Use of document databases has increased along with use of JavaScript and
the JavaScript Object Notation (JSON)

Examples
Couchbase Server, CouchDB, DocumentDB, MarkLogic and MongoDB.

Wide-column stores

Wide-column stores organize data tables as columns instead of as rows. Wide-column stores can be found
both in SQL and NoSQL databases. Wide-column stores can query large data volumes faster than
conventional relational databases.

Examples

Page 83 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Google BigTable, Cassandra and HBase.

Graph stores

Graph data stores organize data as nodes, which are like records in a relational database, and edges,
which represent connections between nodes. Because the graph system stores the relationship between
nodes, it can support richer representations of data relationships.

Examples
AllegroGraph, IBM Graph, Neo4j and Titan

NoSQL classification and Adherence by Vendors

The basic NoSQL database classifications are only guides. Over time, vendors have mixed and matched
elements from different NoSQL database family trees to achieve more generally useful systems. That
evolution is seen, for example, in MarkLogic, which has added a graph store and other elements to its
original document databases. Couchbase Server supports both key-value and document
approaches. Cassandra has combined key-value elements with a wide-column store and a graph
database. Sometimes NoSQL elements are mixed with SQL elements, creating a variety of databases that
are referred to as multimodel databases.

NoSQL vs RDBMS

SQL NOSQL

RELATIONAL DATABASE MANAGEMENT SYSTEM Non-relational or distributed database system.


(RDBMS)

These databases have fixed or static or predefined They have have dynamic schema
schema

These databases are not suited for hierarchical data These databases are best suited for hierarchical
storage. data storage.

These databases are best suited for complex queries These databases are not so good for complex
queries

Verticlly Scalable Horizontally scalable

Page 84 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

CAP Theorem

The CAP Theorem is a fundamental theorem in distributed database systems that states any distributed
system can have at most two of the following three properties.

• Consistency
• Availability
• Partition tolerance

Lets Understand CAP Theorem

The CAP theorem states that a distributed system cannot simultaneously be consistent, available, and
partition tolerant.

Distributed databases

Distributed databases are usually non-relational databases that enable a quick access to data over a large
number of nodes.

Consistency

Any read operation that begins after a write operation completes must return the latest value.

In a consistent system, once a client writes a value to any server and gets a response, it expects to get that
value (or a fresher value) back from any server it reads from.

Availability

Every request receives a (non-error) response – without the guarantee that it contains the most recent
write

Partition Tolerance

The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by
the network between nodes

PACELC theorem

PACELC theorem is an extension to the CAP theorem. It states that in case of network partitioning (P) in
a distributed computer system, one has to choose between availability (A) and consistency (C) (as per the
CAP theorem), but else (E), even when the system is running normally in the absence of partitions, one has
to choose between latency (L) and consistency (C )

The purpose of PACELC is to address the oversight in CAP theorem pertaining to consistency/latency
tradeoff of replicated systems as it is present at all times during system operation.

Page 85 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Markup Languages
What is JSON?
History

Douglas Crockford is typically credited with discovering the format because he was the first to popularize
and specify its use around the year 2001.

Crockford, was developing an AJAX web application framework. However, needed to find a way to enable
real-time two-way browser-to-server communication without relying on Flash plugins or Java applets. It
was this need that served as the impetus behind the discovery of JSON. As JSON is used to transfer data
between browser and server.

JSON consists of nothing more than commas, curly braces, square brackets, and data, it can be easily
parsed into an array or object by any programming language.

JSON Data Structures

JSON data is formatted into two data structures that are used universally in all modern
programming languages:

• A JSON array is a list of values.


• A JSON object is a collection of name-value pairs.

JSON Arrays:
[ "red", "green", "blue", 7 ]

JSON Object (Document)


{
name: "Sami"
Occupation: "Trainer"
Location: "Hyderabad"
}

Complex JSON objects:


{
name: "Sami"
Occupation: "Trainer"
Location: "Hyderabad"
Canteach: ["mongoDB", "MySQL", "PosgreSQL", "Linux"]
LanguageFluency: {
English: "Very Good"
Hindi: "Very Good"
Telugu: "Very Good"

Page 86 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Tamil: "Can Understand"


}
}

BSON

BSON is a computer data interchange format. The name "BSON" is based on the term JSON and stands for
"Binary JSON". It is a binary form for representing simple or complex data structures including associative
arrays (also known as name-value pairs), integer indexed arrays, and a suite of fundamental scalar types.
BSON originated in 2009 at MongoDB.

MongoDB represents JSON documents in binary-encoded format called BSON behind the scenes. BSON
extends the JSON model to provide additional data types, ordered fields, and to be efficient for encoding
and decoding within different languages

What is YAML?

Yet Another Markup Language (YAML)

Is a human-readable data-serialization language. It is commonly used for configuration files, but could be
used in many applications where data is being stored or transmitted. YAML targets many of the same
communications applications as XML but has a minimal syntax. It uses both Python-style indentation to
indicate nesting, and a more compact format that uses [] for lists and {} for maps making YAML
a superset of JSON.

employee
name: "Sami"
Occupation: "Trainer"
Location: "Hyderabad"
Canteach:
- mongoDB
- MySQL
- PosgreSQL
- Linux
LanguageFluency:
English: "Very Good"
Hindi: "Very Good"
Telugu: "Very Good"
Tamil: "Can Understand"

Page 87 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 12: Introduction to MongoDB

Page 88 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

MongoDB is a document oriented database server developed in the C++ programming language. The word
Mongo is derived from Humongous.

History

The initial development of MongoDB began in 2007 when the company was building a cloud platform.

MongoDB was developed by a NewYork based organization named 10gen which is now known as
MongoDB Inc. It was initially developed as a PAAS (Platform As A Service). Later in 2009, it is introduced in
the market as an open source database server that was maintained and supported by MongoDB Inc.

The first ready production of MongoDB has been considered from version 1.4 which was released in
March 2010.

Features of MongoDB

• Support ad hoc queries

In MongoDB, you can search by field, range query and it also supports regular expression
searches.

• Indexing

You can index any field in a document.

• Replication

MongoDB supports Master Slave replication.

A master can perform Reads and Writes and a Slave copies data from the master and can only be
used for reads or back up (not writes)

• Duplication of data

MongoDB can run over multiple servers. The data is duplicated to keep the system up and also
keep its running condition in case of hardware failure.

• Load balancing

It has an automatic load balancing configuration because of data placed in shards.

• Supports mapreduce and aggregation tools.

MongoDB Architecture
Client Connection Protocols:

Page 89 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

The MongoDB Wire Protocol is a simple socket-based, request-response style protocol. Clients
communicate with the database server through a regular TCP/IP socket.

• TCP/IP Socket

Clients should connect to the database with a regular TCP/IP socket.

• Port

The default port number for mongod and mongos instances is 27017.

• Default Clients

The mongo shell is an interactive JavaScript interface to MongoDB. You can use the mongo shell to
query and update data as well as perform administrative operations.

MongoDB Compass is the GUI client for MongoDB.

• Security

MongoDB provides security by implementing user and roles. Security can be disabled.

• MongoDB Query Language

MongoDB uses its own Query Language which is similar to Javascript and does not uses standard
SQL.

Object Hierarchy

A MongoDB instance consisted of 2 high level objects called as DATABASES and USER/ROLES.

Page 90 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

DATABASE in turn may have one or more COLLECTIONS and INDEXES which actually stores data.

MONGO DB Vs Traditional RDBMS

RDBMS MONGODB

ROW DOCUMENT

TABLE COLLECTION

COLUMN PROPERTY or Field

One –Many Mapping Enbbeded-document’s

Data Types

MongoDB supports many datatypes. Some of them are −

String

This is the most commonly used datatype to store the data. String in MongoDB must be UTF-8 valid.

Integer

This type is used to store a numerical value. Integer can be 32 bit or 64 bit depending upon your server.

Boolean

This type is used to store a boolean (true/ false) value.

Double

This type is used to store floating point values.

Min/ Max keys

This type is used to compare a value against the lowest and highest BSON elements.

Arrays

Page 91 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

This type is used to store arrays or list or multiple values into one key.

Timestamp

This can be handy for recording when a document has been modified or added.

Object

This datatype is used for embedded documents.

Null

This type is used to store a Null value.

Symbol

This datatype is used identically to a string; however, it's generally reserved for languages that use a
specific symbol type.

Date

This datatype is used to store the current date or time in UNIX time format. You can specify your own date
time by creating object of Date and passing day, month, year into it.

Object ID

This datatype is used to store the document’s ID.

Binary data

This datatype is used to store binary data.

Code

This datatype is used to store JavaScript code into the document.

Regular expression

This datatype is used to store regular expression.

Page 92 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 13: CRUD Operations

Page 93 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

CRUD stands for Create, Read, Update and Delete basically they are equivalent to DML operations in
RDBMS.

MongoDB Operators
Comparison Operators:

Name Description

$eq Matches values that are equal to a specified value.

$gt Matches values that are greater than a specified value.

$gte Matches values that are greater than or equal to a specified value.

$in Matches any of the values specified in an array.

$lt Matches values that are less than a specified value.

$lte Matches values that are less than or equal to a specified value.

$ne Matches all values that are not equal to a specified value.

$nin Matches none of the values specified in an array.

$regex Matches values as per regular expression given

Page 94 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Logical Operators:

Name Description

$and Joins query clauses with a logical AND returns all documents that match the conditions
of both clauses.

$not Inverts the effect of a query expression and returns documents that do not match the
query expression.

$nor Joins query clauses with a logical NOR returns all documents that fail to match both
clauses.

$or Joins query clauses with a logical OR returns all documents that match the conditions
of either clause.

Element Operators

Name Description

$exists Matches documents that have the specified field.

$type Selects documents if a field is of the specified type.

The Crud Operations


insert Method

Inserts a document or documents into a collection.


Syntax:

Page 95 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

db.collection.insert([
{doc1},{doc2},....],
{
writeConcern: <document>,
ordered: <boolean>
}
)

Examples:
db.products.insert( { item: "card", qty: 15 } )
db.products.insert(
[
{ _id: 11, item: "pencil", qty: 50, type: "no.2" },
{ item: "pen", qty: 20 },
{ item: "eraser", qty: 25 }
]
)

Unordered Inserts:

• Behavior: MongoDB attempts to insert all documents, regardless of errors.


• Use Case: Suitable when document order isn't critical and you want to insert as many documents
as possible, even if some fail.

db.products.insert(
[
{ _id: 20, item: "lamp", qty: 50, type: "desk" },
{ _id: 21, item: "lamp", qty: 20, type: "floor" },
{ _id: 22, item: "bulk", qty: 100 }
],
{ ordered: false }
)

Inserts with Write Concerns:


db.products.insert(
{ item: "envelopes", qty : 100, type: "Clasp" },
{ writeConcern: { w: "majority", wtimeout: 5000 } }
)

insertOne method

Inserts a single document into collection

Page 96 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

db.collection.insertOne(
<document>,
{
writeConcern: <document>
}
)

insertMany Method

Inserts multiple documents into the collection

db.collection.insertMany(
[ <document 1> , <document 2>, ... ],
{
writeConcern: <document>,
ordered: <boolean>
}
)

find Method

Selects documents in a collection or view and returns the selected documents.

Syntax:
db.collection.find({query}, {projection})

query: is a condition to match the documents

Projection: Selection of fields to return. Projection cannot contain both exclude and include specification
except for _id field

Query Examples:
db.bios.find( { qty: 23 } )
db.bios.find( { "name.last": "Hopper" } )

Note: To access fields in an embedded document, use dot notation


("<embedded document>.<field>").

Using $in Operator


db.employees.find(
{ _id: { $in: [ 1001, 1005) ] } }
)

Using $in Operator on Array to match any of the element

Page 97 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

db.employees.find(
{ canteach: { $in: [ "mongodb", "MySQL") ] } }
)

Using Regex with find method

db.employees.find(
{ name: { $regex: /^N/ } }
)

Using comparison Operators


db.employees.find( { doj: { $gt: new Date('2015-01-01') } } )
db.employees.find( { salary: { $gt: 10000 } } )
db.employees.find( { salary: { $lt: 10000 } } )

Range Queries:
db.employees.find( { salary: { $gte: 10000, $lt: 20000 } } )
db.employees.find( { doj: { $gt: new Date('2015-01-01'), $lt: new Date('2017-12-31')
} } )

Queries on Multiple fields


db.employees.find( {
doj: { $gt: new Date('1920-01-01') },
salary: { $gt : 10000}
} )

Querying for documents without a particular field


db.employees.find( {
doj: { $exists: false }
} )
db.employees.find( {
doj: null
} )

Querying an Array
db.employees.find( { canteach: { $all: [ "mongodb", "mysql" ] } } )

db.employees.find(
{ canteach: { $in: [ "mongodb", "MySQL") ] } }
)

db.employees.find( { canteach: { $size: 2} } )

Page 98 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

$elemmatch : The $elemMatch operator matches documents that contain an array field with at
least one element that matches all the specified query criteria

Consider the following collection


{ _id: 1, results: [ 82, 85, 88 ] }
{ _id: 2, results: [ 75, 88, 89 ] }
db.scores.find(
{ results: { $elemMatch: { $gte: 80, $lt: 85 } } }
)

Specify the Fields to Return (projection)


db.employees.find( {}, {_id:0, name:1, salary:1 } } )

Sorting Result

Sorting the Result (sort() method)

db.employees.find().sort({name: 1})

Limiting output

Limit method limit()

db.employees.find().limit(5)

Skip method
db.employees.find().skip(5)

Updating Data
Update Method
db.collection.update(query, update, options)

Modifies an existing document or documents in a collection. The method can modify specific fields of an
existing document or documents or replace an existing document entirely, depending on the update
parameter.

Syntax:
db.collection.update(
<query>,
<update>,
{
upsert: <boolean>,

Page 99 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

multi: <boolean>,
writeConcern: <document>,

}
)

Upsert Behavior

If upsert is true and no document matches the query criteria, update() inserts a single document. The
update creates the new document.

If upsert is true and there are documents that match the query criteria, update() performs an update.

db.people.update(
{ name: "Andy" },
{
name: "Andy",
rating: 1,
score: 1
},
{ upsert: true }
)

Updating Multiple Documents

If multi is set to true, the update() method updates all documents that meet the <query> criteria.

db.books.update(
{ stock: { $lte: 10 } },
{ $set: { reorder: true } },
{ multi: true }
)

With write concern


db.books.update(
{ stock: { $lte: 10 } },
{ $set: { reorder: true } },
{
multi: true,
writeConcern: { w: "majority", wtimeout: 5000 }
}
)

Changing Data

Page 100 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Changes to documents can be done using any of the $set or $inc Operators

Example of $set:
db.books.update(
{ _id: 1 },
{
$set: {
item: "ABC123",
"info.publisher": "2222",
tags: [ "software" ],
"ratings.1": { by: "xyz", rating: 3 }
}
}
)

Example of $inc operator:


db.books.update(
{ _id: 1 },
{ $inc: {stock:5}
)

Remove Fields

Remove Fields from a Document

The following operation uses the $unset operator to remove the field

db.employees.update( { _id: 1 }, { $unset: { salary: 1 } } )

updateOne method

Updates one document which matches the filter

db.collection.updateOne(
<filter>,
<update>,
{
upsert: <boolean>,
writeConcern: <document>,
}
)

Page 101 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

updateMany method

Updates multiple documents within the collection based on the filter.

Syntax:
db.collection.updateMany(
<filter>,
<update>,
{
upsert: <boolean>,
writeConcern: <document>,
}
)

Deleting Data
Remove Method

Remove method removes / deletes the documents matching given query.

Syntax:
db.collection.remove(
{<query>},
{
justOne: <boolean>,
writeConcern: <document>,
}
)

And empty query deletes all documents from the collections. Example below.

db.employees.remove( { } )

remove method with a query deletes all the matching document

db.employees.remove({name:"Sami"})

Remove method with justOne deletes first match

db.employees.remove({name:"Sami"}, {justOne: true})

Delete Method

Page 102 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Like remove method delete method also deletes the documents matching given query. There are 2
variations namely deletemany and deleteone.

Syntax:
db.collection.deleteMany(
<filter>,
{
writeConcern: <document>,
collation: <document>
}
)
db.collection.deleteOne(
<filter>,
{
writeConcern: <document>,
collation: <document>,
hint: <Index hint>
}
)

Advance Collection Creation

MongoDB is schema less, means there is no fixed schema or field definitions for collections however
MongoDB allow to specify certain validations such as size limit for collection or required fields.

Creating capped collection

Creating capped collection requires us to provide SIZE and MAX values.

SIZE: Maximum size of the collections. When Collection reaches this size then the oldest documents are
deleted to make room for new documents. MongoDB ensures that the collection does not grow above the
mention size. Mentioning SIZE is must for capped collections.

MAX: specifies maximum number of document a collection can have. SIZE takes the precedence when SIZE
limit is reached before reaching MAX.

Example:
db.createCollection("stock", { capped : true, size : 5242880, max : 5000 } )

The above example creates a collection named stock with 5MB size cap.

Page 103 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Collection Validations

MongoDB does not enforce fixed schema requirements but it provides validations to declare required field
list and field properties like datatype, input validations etc.

Required Fields

Creating a collections with required fields

db.createCollection( "employees", {
validator: { $jsonSchema: {
bsonType: "object",
required: [ "ename", "job", "salary" ] }
}
}
)

A new collection named "employees" is created where every document required to have "ename", "job"
and "salary" columns.

Input Validation

Schema validations allows to specify datatype and other input validations for documents in a collection.
properties object which is embedded object within validator object facilitates input validations.

properties: {
field1 : {
bsonType: "<type>",
enum : [v1,v2,....],
description: "<some description>"
},
field2: {
bsonType: "<type>",
minimum : <number>,
maximum : <number>,
description: "<some descriptions>"
}
}

Example:
db.createCollection( "employee2", {
validator: { $jsonSchema: {
bsonType: "object",
required: [ "ename", "job", "salary" ],

Page 104 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

properties: {
ename : {
bsonType: "string",
description: "Name of the Employee"
},
job: {
bsonType: "string",
enum : ["Manager", "Analyst","Clerk","Salesman"],
description: "Designation of Employee, Choose one of the mentioned jobs"
},
salary: {
bsonType: "decimal",
minimum : 1000,
maximum : 1000000,
description: "Salary of Employee"
},
deptno: {
bsonType: "int",
description: "Department number of Employee"
}
}
}
}
}
)

View Collection Validation Specification use following method.

db.getCollectionInfos()

Page 105 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 14: Aggregation

Page 106 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

The aggregation pipeline is a framework for data aggregation modeled on the concept of data processing
pipelines. Documents enter a multi-stage pipeline that transforms the documents into aggregated results.

Syntax:
db.collection.aggregate([ {stage1}, {stage2}, {stage3},....])

An aggregation pipeline consists of one or more stages that process documents:

• Each stage performs an operation on the input documents. For example, a stage can filter
documents, group documents, and calculate values.
• The documents that are output from a stage are passed to the next stage.
• An aggregation pipeline can return results for groups of documents. For example, return the total,
average, maximum, and minimum values.

Aggregation stages
$match stage

Usage:
{ $match: { <query> } }

Example:
db.emp.aggregate([{$match: {job:"CLERK"}}])

In Aggregation $match stage can come at any position.

$group Stage

Group stage perform group level aggregation computations like sum, count, min, max etc. For group level
computations it further needs accumulators as shown in table below.

Name Description

$avg Returns an average of numerical values. Ignores non-numeric values.

$first Returns a value from the first document for each group. Order is only defined if the
documents are in a defined order.

Page 107 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

$last Returns a value from the last document for each group. Order is only defined if the
documents are in a defined order.

$max Returns the highest expression value for each group.

$min Returns the lowest expression value for each group.

$stdDevPop Returns the population standard deviation of the input values.

$stdDevSamp Returns the sample standard deviation of the input values.

$sum Returns a sum of numerical values. Ignores non-numeric values.

Grouping Examples:

Sum of Salary for each job title:

db.emp.aggregate([ {$group:{ _id:"$job", tot_sal:{$sum:"$sal"}}}])

Number of employees per job title:

db.emp.aggregate([ {$group:{ _id:"$job", No_OF_Emp:{$sum:1}}}])

Total No. of employees:

db.emp.aggregate([ {$group:{ _id:null, All_Emp_COUNT:{$sum:1}}}])

Lowest Salary:

db.emp.aggregate([ {$group:{ _id:null, All_Emp_COUNT:{$min: "$sal" } }}])

Highest Salary:

db.emp.aggregate([ {$group:{ _id:null, All_Emp_COUNT:{$max:"$sal"}}}])

Page 108 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

$bucket stage

Syntax:
{
$bucket: {
groupBy: <expression>,
boundaries: [ <lowerbound1>, <lowerbound2>, ... ],
default: <literal>,
output: {
<output1>: { <$accumulator expression> },
...
<outputN>: { <$accumulator expression> }
}
}
}

Lets say that we want categorized products according to salePrice and count number of products within
each price range.

Here the price range we are using is called as boundaries based on which grouping is done for salePrice
field

Example:
db.emp.aggregate([
{$bucket:{groupBy:"$sal",boundaries:[0,1000,2000,5000],default:"Other",output:{"count
":{$sum:1}}}}])

Example of bucket with 2 output fields fist will count no. of products in each bucket and second will report
what is the maximum salePrice in each bucket

db.emp.aggregate([
{$bucket:{groupBy:"$sal",boundaries:[0,1000,2000,3000,4000,5000],default:"Other",outp
ut:{"count":{$sum:1},"max":{$max:"$sal"}}}}])

$bucketAuto stage

Here it will automatically decides on the boundaries and produce result

syntax:
{
$bucketAuto: {

Page 109 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

groupBy: <expression>,
buckets: <number>,
output: {
<output1>: { <$accumulator expression> },
...
}
}
}

Example:

db.emp.aggregate([
{$bucketAuto:{groupBy:"$sal",buckets:4,output:{"count":{$sum:1},"max":{$max:"$sal"}}}
}])

$addFields stage

Adds new fields to documents. $addFields outputs documents that contain all existing fields from the
input documents and newly added fields

Syntax:
{ $addFields: { <newField>: <expression>, ... } }

Lets add a new field called Annual Salary which is computed 12 times of sal.

db.emp.aggregate([ {$addFields: {Annual_Salary:{$multiply:["$sal",12]}}}])

$project stage

Passes along the documents with the requested fields to the next stage in the pipeline. The specified fields
can be existing fields from the input documents or newly computed fields.

Syntax:
{ $project: { <specification(s)> } }

Example:
db.emp.aggregate([ {$project: {_id:0,ename:1,job:1, sal:1}}])

$project and $addField Together


db.emp.aggregate([ {$addFields: {Ann_Sal:{$multiply:["$sal",12]}}}, {$project:
{_id:0,ename:1,job:1, sal:1, Ann_Sal:1}} ])

Page 110 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

$lookup stage

Performs a left outer join to an un-sharded collection in the same database to filter in documents from the
“joined” collection for processing.

Syntax:
{
$lookup:
{
from: <collection to join>,
localField: <field from the input documents>,
foreignField: <field from the documents of the "from" collection>,
as: <output array field>
}
}

For example if we have a dept collection with deptno, name and location filelds and an
employees collection with a matching deptno filed we can match both the collections.
db.dept.aggregate([{ $lookup: {
from: "emp",
localField: "_id",
foreignField: "deptno",
as: "employee-details" }
}
]
)

Page 111 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 15: Indexes

Page 112 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

In MongoDB, indexes enhance the performance of queries by allowing the database to efficiently locate
documents within collections. Here we learn all about Indexes in MongoDB.

Supported Type of Indexes in MongoDB

• Single Field Indexes


• Compound Indexes
• Multikey Indexes
• Text Index
• Hashed Index
• Unique Index
• Sparse Index
• TTL Index
• Partial Index

Viewing existing indexes on a given collection


db.collection.getIndexes()

Single Field Indexes

Indexes created on single field either ascending or descending are Single Field Indexes

Syntax:
db.collection.createIndex( { field: 1 } )
Creates an ascending index
db.collection.createIndex( { field: -1 } )

Creates an descending index

The given field can be embedded document or a field from embedded document

Compound Indexes

MongoDB supports compound indexes, where a single index structure holds references to multiple
fields within a collection’s documents.

Syntax:
db.collection.createIndex( { <field1>: <type>, <field2>: <type2>, ... } )

Example:

Page 113 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

db.employees.createIndex({name:1, designation:-1})

Multikey Indexes

MongoDB creates an index key for each element in the array if the field used for index is an array.

Syntax:
db.coll.createIndex( { <field>: < 1 or -1 > } )

Example:
db.employees.createIndex({canteach:1})

Text Index

Text Indexes are useful for Fulltext searches on fields having large text data

db.collection.createIndex({text-column: "text"})

Example :
db.reviews.createIndex( { comments: "text" } )

Hashed Indexes

Hashed indexes are useful if used as sharing key as it result in more even distribution of data across the
shards.

db.collection.createIndex( { _id: "hashed" } )

TTL Indexes

TTL indexes are created with a particular expire time period in seconds to automatically remove the
associated document from the collection. TTL indexes are effective only if created on a date field or an
array containing date value.

Document Deletion:
• TTL Index deletes the document when the specified number of seconds lapses from the date
mentioned in field value
• If the specified field does not contain date then the document never expires
• If the document does to contain the specified field then also it will never expires

Syntax:
db.collection.createIndex( { "Date Field": 1 }, { expireAfterSeconds: 3600 } )
db.collection.createIndex( { "postedOn": 1 }, { expireAfterSeconds: 86400 } )

Page 114 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Unique Indexes

A unique index ensures that the indexed fields do not store duplicate values; i.e. enforces uniqueness for
the indexed fields. By default, MongoDB creates a unique index on the _id field during the creation of a
collection.

Syntax:
db.collection.createIndex( <key and index type specification>, { unique: true } )

Example:
db.members.createIndex( { "user_id": 1 }, { unique: true } )

Create a Partial Index

Partial indexes only index the documents in a collection that meet a specified filter expression. By indexing
a subset of the documents in a collection, partial indexes have lower storage requirements and reduced
performance costs for index creation and maintenance.

Example:
db.products.createIndex(
{ pname: 1 },
{ partialFilterExpression: { stock: { $gt: 50 } } }
)

Sparse Indexes

Sparse indexes only contain entries for documents that have the indexed field, even if the index field
contains a null value. The index skips over any document that is missing the indexed field.

Syntax:
db.collection.createIndex( { "field1": 1 }, { sparse: true } )

Dropping Indexes
db.collection.dropIndex()
db.collection.dropIndexes()

example:
db.employees.dropIndex({name:1})

Drops index created on name field of employees collection

Page 115 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

db.employees.dropIndexes()

Drops all indexes created for employees collection

Reindexing

Can rebuild all indexes in a collection

db.collection.reIndex()

Querying Index Size


db.collection.totalIndexSize()

Displays the size of all indexes in a collection.

Page 116 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

Lesson 16: Exporting and Importing Data

Page 117 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

JSON Exports
MongoDB allows to export a collection in JSON format and also imports a JSON file into a collection using
mongoexport and mongoimport commands

Syntax:
mongoexport --host <hostname> --db <database> --collection <collection> --out
<outputfile.json>

Example:
mongoexport --host localhost --port 27017 --db mydatabase --collection users --out
users.json

CSV Exports
MongoDB allows to export a collection in CSV format and also imports a CSV file into a collection using
mongoexport and mongoimport commands

Syntax:
mongoexport --host <hostname> --db <database> --collection <collection> --type=csv -
-fields <field1>,<field2>,... --out <outputfile.csv>

example:
mongoexport --host localhost --port 27017 --db mydatabase --collection users --
type=csv --fields name,email,age --out users.csv

JSON Imports
Mongoimport command allows importing a fie containing valid JSON documents to a collection.

Syntax:
mongoimport --host <hostname> --db <database> --collection <collection> --file
<inputfile.json> --jsonArray

Example:
mongoimport --host localhost --port 27017 --db mydatabase --collection users --file
users.json –jsonArray

Page 118 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.
SQL and NoSQL Course

CSV Imports
Same mogoimport command allows to import a .CSV file into a MongoDB collections.

Syntax:
mongoimport --host <hostname> --db <database> --collection <collection> --type=csv --
file <inputfile.csv> --headerline

Example:
mongoimport --host localhost --port 27017 --db mydatabase --collection products --
type=csv --file products.csv –headerline

Page 119 of 119 by :- Mohammed Abdul Sami


For internal reference only not for distribution or sale.

You might also like