
Unit 5

NoSQL Database

By - Prof. Priyanka.H.Shingate
Department of Computer Engineering, ZCOER, Pune

Distributed Database System


• A distributed database is a database system in which several databases are stored at
different locations and interconnected through a computer network.
• Definition: A distributed database is a collection of interconnected databases that
are physically spread across various locations and communicate via a computer network.
• In a distributed database system, each site may have its own memory and its own
database server.


Example of Distributed Database System


Advantages of Distributed Database System

▪ Modular Development − In a centralized database system, expanding to new locations or new units requires substantial effort and disrupts existing operations. In a distributed database, the work only requires adding new computers and local data at the new site and then connecting them to the distributed system, with no interruption to current functions.

▪ More Reliable − When a centralized database fails, the whole system comes to a halt. In a distributed system, when a component fails, the system continues to function, possibly at reduced performance. Hence a DDBMS is more reliable.

▪ Better Response − If data is distributed efficiently, user requests can be met from local data, which gives a faster response. In centralized systems, all queries have to pass through the central computer for processing, which increases the response time.

▪ Lower Communication Cost − If data is located where it is mostly used, the communication costs for data manipulation can be minimized. This is not feasible in centralized systems.

▪ Increased Availability − In a DDBMS, data is scattered across various nodes. If one node fails, the data can still be obtained from another node that holds it, which gives the system higher availability.


Disadvantages of Distributed Database System

▪ Overall cost: Maintenance, procurement, hardware, network/communication, and labor costs add up and make a distributed database costlier than a centralized DBMS.

▪ Complex data design: Designing a distributed database is more complex than designing a centralized database.

▪ Lack of professional support: It is difficult to provide resources to every user. Because of limited communication support, resources produced by different vendors may not be linked to the network, so many useful resources are unavailable to users at other locations.

Types of Distributed Database System


Homogeneous Distributed Database System


Heterogeneous Distributed Database System


CAP Theorem
CAP theorem states that in networked shared-data systems or distributed
systems, we can only achieve at most two out of three guarantees for a
database: Consistency, Availability and Partition Tolerance.

Consistency
▪ Consistency means that all clients see the same data at the same time, no matter which node they connect to.
▪ For this to happen, whenever data is written to one node, it must be instantly forwarded or replicated to all the other nodes in the system before the write is deemed "successful."

Availability
▪ Availability means that any client making a request for data gets a response, even if one or more nodes are down.
▪ Another way to state this: all working nodes in the distributed system return a valid response for any request, without exception.

Partition tolerance
▪ A partition is a communications break within a distributed system: a lost or temporarily delayed connection between two nodes.
▪ Partition tolerance means that the cluster must continue to work despite any number of communication breakdowns between nodes in the system.

CP database:
▪ A CP database delivers consistency and partition tolerance at the expense of availability.
▪ When a partition occurs between any two nodes, the system has to shut down the non-consistent node (i.e., make it unavailable) until the partition is resolved.
▪ Example: In a banking system, availability is less important than consistency.

AP database:
▪ An AP database delivers availability and partition tolerance at the expense of consistency.
▪ When a partition occurs, all nodes remain available, but those at the wrong end of a partition might return an older version of data than others. (When the partition is resolved, AP databases typically resync the nodes to repair the inconsistencies.)

[Figure: Write (x=10) is sent to Node A and Read (x) to Node B while the two nodes are partitioned. Availability is lost if one node is failed; consistency is lost if both requests are processed.]

CA database:
▪ A CA database delivers consistency and availability across all nodes.
▪ It can't do this if there is a partition between any two nodes in the system, however, and therefore can't deliver fault tolerance.
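
The CP and AP trade-off can be illustrated with a toy model. The sketch below is not how any real database is implemented: the class `TwoNodeCluster`, its `mode` switch, and the assumption that writes always reach node A and reads always reach node B are all invented for this illustration.

```javascript
// Toy model: two replicas, A and B. Writes arrive at A, reads at B.
// The "CP"/"AP" mode switch is a simplification for this sketch only.
class TwoNodeCluster {
  constructor(mode) {
    this.mode = mode;          // "CP" or "AP"
    this.a = {};               // data held by node A
    this.b = {};               // data held by node B
    this.partitioned = false;  // true while A and B cannot talk
  }

  write(key, value) {
    this.a[key] = value;
    if (!this.partitioned) this.b[key] = value; // replicate while connected
  }

  read(key) {
    if (this.partitioned && this.mode === "CP") {
      // CP: the node that may be stale is taken offline.
      throw new Error("node B unavailable during partition");
    }
    return this.b[key]; // AP: always answers, possibly with old data
  }

  heal() {
    this.partitioned = false;
    Object.assign(this.b, this.a); // resync B from A
  }
}

const ap = new TwoNodeCluster("AP");
ap.write("x", 5);
ap.partitioned = true;
ap.write("x", 10);
const staleRead = ap.read("x");   // B still holds the old value
ap.heal();
const freshRead = ap.read("x");   // B has caught up

const cp = new TwoNodeCluster("CP");
cp.write("x", 5);
cp.partitioned = true;
cp.write("x", 10);
let cpRefused = false;
try { cp.read("x"); } catch (err) { cpRefused = true; }
```

In AP mode the read during the partition succeeds but returns the older value; in CP mode the same read is refused until the partition heals.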


Structured, Unstructured and semi-structured


Structured Data:
▪ Data that is factual, to the point, and highly organized is referred to as structured data.
▪ It is quantitative in nature: it contains measurable values such as numbers, dates, and times.
▪ Example: relational data such as tables, spreadsheets, and databases.

Unstructured Data:
▪ Unstructured data includes free-form files such as log files, audio files, and image files.
▪ Some organizations have a lot of data available but do not know how to derive value from it because the data is raw.


Semi-Structured Data:
Semi-structured data is information that does not reside in a relational database but has
some organizational properties that make it easier to analyze. With some processing it can be
stored in a relational database (this can be very hard for some kinds of semi-structured
data), but keeping it semi-structured saves space.
Example: XML data, email, WWW.


Difference between Structured & Unstructured


Introduction of NoSQL Database


• NoSQL is an approach to database design that accommodates a wide variety of data models, such as key-value, document, columnar, and graph formats, as well as multimedia and external files.
• NoSQL databases are also called non-relational databases: they do not use tables of rows and columns to store data.
• They are designed for distributed data stores with very large-scale storage needs (for example Google or Facebook, which collect terabits of data every day for their users).
• NoSQL is purpose-built for specific data models and has flexible schemas for building modern applications.
• It is generally used for big data and real-time web applications, and it can store different types of unstructured data.
• Some well-known examples are MongoDB, Neo4j, HyperGraphDB, etc.


• NoSQL databases are generally characterized as:
▪ Non-relational
▪ Schemaless
▪ 21st-century databases that store unstructured data generated by Twitter, IoT, e-commerce, and networking sites


Advantages and disadvantages of NoSQL Database


Advantages:
▪ High scalability
▪ Distributed computing
▪ Lower cost
▪ Schema flexibility, semi-structured data
▪ No complicated relationships

Disadvantages:
▪ Lack of standardization
▪ Backup of database
▪ Consistency


RDBMS vs NoSQL

RDBMS
▪ Structured and organized data
▪ Structured Query Language (SQL)
▪ Data and its relationships are stored in separate tables
▪ Data Manipulation Language, Data Definition Language
▪ Tight consistency

NoSQL
▪ Stands for Not Only SQL
▪ No declarative query language
▪ No predefined schema
▪ Key-value pair storage, column store, document store, graph databases
▪ Eventual consistency rather than ACID properties
▪ Unstructured and unpredictable data
▪ CAP theorem
▪ Prioritizes high performance, high availability, and scalability
▪ BASE transactions

Features of NoSQL Database

Following are the features of NoSQL:

1) Non-Relational:
▪ NoSQL databases do not follow the relational model.
▪ They do not store data in tables with flat, fixed-column records.
▪ Many of them omit advanced relational features such as declarative query languages, query planners, referential integrity, joins, and full ACID compliance.

2) Schema-Free:
▪ NoSQL databases are either schema-free or have relaxed schemas.
▪ No data structure definition is required in advance.
▪ Heterogeneous data structures can coexist within the same domain.

3) Simple API:
▪ Provides simple interfaces for storing and querying data.
▪ APIs allow low-level data manipulation and selection.
▪ NoSQL query languages are mostly not based on any standard.

4) Distributed:
▪ Many NoSQL databases can be run in a distributed fashion.
▪ Many of them guarantee only eventual consistency.

Types of NoSQL Databases


• NoSQL is a non-relational database that stores data in a non-tabular form.
• NoSQL stands for "Not only SQL".

• Types of NoSQL database:
1. Key-value stores
2. Document-based databases
3. Column-oriented databases
4. Graph-based databases


Key-value NoSQL Database


1. Key-value store:
• A key-value data store is a type of database that stores data as a collection of key-value pairs.
• Each data item is identified by a unique key, and the value associated with that key can be anything, such as a string, a number, an object, or even another data structure.
• Key features of the key-value store:
▪ Simplicity
▪ Scalability
▪ Speed
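
As a rough illustration of the key-value model, the sketch below wraps JavaScript's built-in `Map`. The class `KeyValueStore`, its method names, and the sample keys are hypothetical; a real key-value database would add persistence and distribution on top of the same idea.

```javascript
// In-memory sketch of a key-value store built on the standard Map.
// KeyValueStore and its method names are hypothetical, for illustration only.
class KeyValueStore {
  constructor() { this.items = new Map(); }
  put(key, value) { this.items.set(key, value); }
  get(key) { return this.items.get(key); }
  delete(key) { return this.items.delete(key); }
}

const store = new KeyValueStore();
store.put("user:1:name", "Asha");                        // string value
store.put("user:1:visits", 42);                          // number value
store.put("user:1:cart", { items: ["pen"], total: 10 }); // nested object value

const visits = store.get("user:1:visits");
const cartTotal = store.get("user:1:cart").total;
store.delete("user:1:name");
const missing = store.get("user:1:name");                // undefined after delete
```

Every operation goes through the key alone, which is what keeps the model simple and fast.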


Document NoSQL Database


2. Document NoSQL Database:
▪ A document-based database is a non-relational database.
▪ Instead of storing data in rows and columns (tables), it stores data in documents.
▪ A document database stores data in JSON, BSON, or XML documents.
▪ In a document database, particular elements can be indexed so that they can be queried faster.


Column Oriented Databases


• A column-oriented database is a non-relational database that stores data by column instead of by row.
• This means that when you run analytics on a small number of columns, you can read those columns directly without loading unwanted data into memory.
• Columnar databases are designed to read data efficiently and retrieve it quickly. They are used to store large amounts of data.
• Key features of column-oriented databases:
▪ Scalability
▪ Compression
▪ Very responsive

Graph-based Databases
• Graph-based databases focus on the relationships between elements.
• Data is stored as nodes in the database.
• The connections between the nodes are called links or relationships.
• Key features of graph databases:
▪ It is easy to identify the relationships between data items by following the links.
▪ Queries return results in real time.
• Example: a social network graph. Given the people (nodes) and their relationships (edges), you can find out who the "friends of friends" of a particular person are, for example the friends of Howard's friends.
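
The friends-of-friends query can be sketched with a plain adjacency list. This is an illustration only, not a graph database API: every name other than Howard, and the function `friendsOfFriends`, are made up.

```javascript
// Undirected friendship graph stored as an adjacency list.
// All names except Howard are invented for this sketch.
const friends = new Map([
  ["Howard", ["Ana", "Ben"]],
  ["Ana", ["Howard", "Chen"]],
  ["Ben", ["Howard", "Chen", "Dina"]],
  ["Chen", ["Ana", "Ben"]],
  ["Dina", ["Ben"]],
]);

// People exactly two hops away: friends of friends who are not
// the person themselves and not already direct friends.
function friendsOfFriends(person) {
  const direct = new Set(friends.get(person) || []);
  const result = new Set();
  for (const friend of direct) {
    for (const candidate of friends.get(friend) || []) {
      if (candidate !== person && !direct.has(candidate)) result.add(candidate);
    }
  }
  return [...result];
}

const howardsFoF = friendsOfFriends("Howard");
```

A graph database answers this kind of query by following stored links directly instead of joining tables.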

BASE Properties
• The BASE properties of a database management system are a set of principles that
guide the design and operation of modern databases.
• The acronym BASE stands for Basically Available, Soft State, and Eventual
Consistency.

1) Basically Available:
• This property refers to the fact that the database system should always be
available to respond to user requests, even if it cannot guarantee
immediate access to all data.
• The database may experience brief periods of unavailability, but it should
be designed to minimize downtime and provide quick recovery from
failures.
2) Soft State
• This property refers to the fact that the state of the database can change
over time, even without any explicit user intervention.
• This can happen due to the effects of background processes, updates
to data, and other factors.
• The database should be designed to handle this change gracefully, and
ensure that it does not lead to data corruption or loss.
3) Eventual Consistency
• This property refers to the eventual consistency of data in the database, despite
changes over time.
• In other words, the database should eventually converge to a consistent state, even
if it takes some time for all updates to propagate and be reflected in the data.
• This is in contrast to the immediate consistency required by traditional ACID-
compliant databases.


Difference between ACID and BASE Properties


Sr.No  Criteria                      ACID                    BASE
1      Simplicity                    Simple                  Complex
2      Focus                         Commits                 Best attempt
3      Maintenance                   High                    Low
4      Consistency of data           Strong                  Weak/Loose
5      Implementation                Difficult to implement  Easy to implement
6      Type of database              Robust                  Simple
7      Type of code                  Simple                  Harder
8      Time required for completion  Less time               More time


MongoDB Database
• MongoDB is an open-source, cross-platform, distributed document database designed for ease of application development and scaling.
• It is classified as a "NoSQL" database.
• Unlike SQL-based databases, it does not normalize data into schemas and tables in which every table has a fixed structure.
• Instead, it stores data in collections as JSON-based documents and does not enforce schemas.
• JavaScript Object Notation (JSON) is an open data interchange format that is both human- and machine-readable.

• The following table lists the relation between MongoDB and RDBMS terminologies.

RDBMS (SQL Server, Oracle, etc.)    MongoDB (NoSQL Database)
Database                            Database
Table                               Collection
Row (Record)                        Document
Column                              Field

• In the RDBMS database, a table can have multiple rows and columns.
• Similarly in MongoDB, a collection can have multiple documents which are equivalent
to the rows.
• Each document has multiple "fields" which are equivalent to the columns. Documents
in a single collection can have different fields.
• The following is an example of JSON based document.
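
A possible pair of documents is sketched here as JavaScript objects. The field names and values are invented and are not the example from the original slide; the point is only that two documents in one collection can carry different fields, including nested ones.

```javascript
// Two documents that could live in the same collection.
// Field names and values are invented for illustration.
const employees = [
  { _id: 1, name: "Asha", age: 30, skills: ["Java", "SQL"] },
  { _id: 2, name: "Ravi", address: { city: "Pune", pin: "411041" } },
];

// The JSON text form of the second document.
const asJson = JSON.stringify(employees[1]);

// Each document has its own set of fields.
const fieldsPerDocument = employees.map((doc) => Object.keys(doc));
```

The first document has `age` and `skills`, the second has a nested `address` instead, and neither needs a schema change.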


Features of MongoDB Database


Following are the features of MongoDB:
1) Schema-less Database:
▪ A schema-less database means that one collection can hold different types of documents.
▪ Unlike rows in a relational table, one document does not need to have the same structure as another.
▪ This gives MongoDB databases great flexibility.

2) Document Oriented:
▪ In MongoDB, all data is stored in documents instead of tables as in an RDBMS.
▪ In these documents, data is stored in fields (key-value pairs) instead of rows and columns, which makes the data much more flexible than in an RDBMS. Each document also contains a unique object id.


3) Indexing:
▪ In MongoDB, every document is indexed on its _id field by default, and secondary indexes can be created on any other field. Indexes make it easier and faster to search the data.
▪ If a field is not indexed, the database has to check every document against the query, which takes a lot of time and is inefficient.

4) Replication:
▪ MongoDB provides high availability and redundancy through replication: it creates multiple copies of the data and stores them on different servers, so that if one server fails the data can be retrieved from another.



5) Aggregation:
▪ It allows operations to be performed on grouped data to obtain a single or computed result.
▪ It is similar to the SQL GROUP BY clause. MongoDB provides three kinds of aggregation: the aggregation pipeline, the map-reduce function, and single-purpose aggregation methods.

6) High Performance:
▪ MongoDB offers high performance and data persistence compared with other databases, owing to features such as scalability, indexing, and replication.


Advantages of MongoDB Database

▪ It is a schema-less NoSQL database: you do not need to design the schema of the database before working with MongoDB.
▪ It does not rely on join operations.
▪ It provides great flexibility in the fields of documents.
▪ It can store heterogeneous data.
▪ It provides high performance, availability, and scalability.
▪ It is a document-oriented database, and the data is stored in JSON documents.


Disadvantages of MongoDB Database

▪ It does not support joins the way a relational database does. Join behavior can be implemented manually in application code, but this can slow execution and hurt performance.
▪ MongoDB stores the key names with every value pair, and the limitations on joins lead to data redundancy. As a result, more memory is used than necessary.
▪ The maximum document size is 16 MB.
▪ Security.


Where to Use MongoDB

• MongoDB is well suited to:
▪ Big and complex data
▪ Mobile and social infrastructure
▪ Content management and delivery
▪ User data management


MongoDB:CRUD Operations
• MongoDB provides a set of some basic but most essential operations that will help
you to easily interact with the MongoDB server and these operations are known as
CRUD operations.


MongoDB: Create Operations


• The create or insert operations are used to insert or add new documents in the
collection.
• If a collection does not exist, then it will create a new collection in the database.

• You can perform, create operations using the following methods provided by the
MongoDB:
Method                        Description
db.collection.insertOne()     It is used to insert a single document in the collection.
db.collection.insertMany()    It is used to insert multiple documents in the collection.
db.createCollection()         It is used to create an empty collection.

MongoDB: Example of Create Operations


In the following example, we can see the Collection "users_detail" is created
with the below data.
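
A minimal sketch of the create calls follows. The documents are invented sample data, not the data shown on the original slide, and the wrapper function `createUsers` is hypothetical: it only makes the database handle explicit, whereas in the mongo shell the global `db` already exists and the three calls can be typed directly.

```javascript
// Sketch: create operations wrapped in a function so the database handle is explicit.
// In the shell, `db` is predefined and these calls can be run as-is.
// The documents below are invented sample data.
function createUsers(db) {
  // Explicitly create the empty collection; insertOne would also create it on first use.
  db.createCollection("users_detail");

  // Insert a single document.
  db.users_detail.insertOne({ name: "Asha", age: 21, city: "Pune" });

  // Insert several documents at once; they need not share the same fields.
  db.users_detail.insertMany([
    { name: "Ravi", age: 23 },
    { name: "Meera", city: "Mumbai", hobbies: ["chess"] },
  ]);
}
```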


MongoDB: Read Operations


• The Read operations are used to retrieve documents from the collection, or in
other words, read operations are used to query a collection for a document.
• You can perform read operation using the following method provided by the
MongoDB:
Method                        Description
db.collection.find()          It is used to retrieve documents from the collection.

• .pretty() : this method is used to decorate the result such that it is easy to read.


MongoDB: Example of Read Operations
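
A minimal sketch of read calls, assuming a `users_detail` collection with an `age` field (both are assumptions for illustration). The wrapper `readUsers` is hypothetical; in the shell the `find()` calls can be run directly on the predefined `db`.

```javascript
// Sketch: read operations. find() returns a cursor; toArray() collects its documents.
function readUsers(db) {
  // No filter: every document in the collection.
  const everyone = db.users_detail.find().toArray();

  // Filter document: only documents whose age is greater than 21.
  // $gt is a query operator meaning "greater than".
  const olderThan21 = db.users_detail.find({ age: { $gt: 21 } }).toArray();

  return { everyone, olderThan21 };
}
```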


MongoDB: Update Operations


• The update operations are used to update or modify the existing document in
the collection.
• You can perform update operations using the following methods provided by the
MongoDB:
Method                        Description
db.collection.updateOne()     It is used to update a single document in the collection that satisfies the given criteria.
db.collection.updateMany()    It is used to update all documents in the collection that satisfy the given criteria.
db.collection.replaceOne()    It is used to replace a single document in the collection that satisfies the given criteria.


MongoDB: Example of Update Operations
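
A minimal sketch of the three update methods, again assuming a hypothetical `users_detail` collection and invented field values. The wrapper `updateUsers` exists only to make the database handle explicit.

```javascript
// Sketch: update operations. Each call takes a filter and then an update or replacement.
function updateUsers(db) {
  // $set changes only the listed fields of the first matching document.
  db.users_detail.updateOne({ name: "Asha" }, { $set: { city: "Nashik" } });

  // $inc adds 1 to age in every document that has an age field.
  db.users_detail.updateMany({ age: { $exists: true } }, { $inc: { age: 1 } });

  // replaceOne swaps the whole matching document (except _id) for the new one.
  db.users_detail.replaceOne({ name: "Ravi" }, { name: "Ravi", age: 30, city: "Pune" });
}
```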


MongoDB: Delete Operations


• The delete operations are used to delete or remove documents from a
collection.
• You can perform delete operations using the following methods provided by
MongoDB:
Method                        Description
db.collection.deleteOne()     It is used to delete a single document from the collection that satisfies the given criteria.
db.collection.deleteMany()    It is used to delete all documents from the collection that satisfy the given criteria.


MongoDB: Example of Delete Operations
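
A minimal sketch of the two delete methods, with the same assumptions as before: the `users_detail` collection, its fields, and the wrapper `deleteUsers` are illustrative only.

```javascript
// Sketch: delete operations. Each call takes a filter document.
function deleteUsers(db) {
  // Removes at most one document: the first whose name is "Asha".
  db.users_detail.deleteOne({ name: "Asha" });

  // Removes every document whose age is less than 18 ($lt means "less than").
  db.users_detail.deleteMany({ age: { $lt: 18 } });
}
```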


MongoDB: Indexing
• Indexes support the efficient resolution of queries.
• Without indexes, MongoDB must scan every document of a collection to select those documents that match the query statement. This scan is highly inefficient and requires MongoDB to process a large volume of data.
• Indexes are special data structures that store a small portion of the data set in an easy-to-traverse form.
• The index stores the value of a specific field or set of fields, ordered by the value of the field as specified in the index.
• A database index is a way to organize information so that the database engine can quickly find the relevant results.
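
The difference between a full scan and an index lookup can be sketched in plain JavaScript. This is a simplified stand-in: MongoDB's real indexes are B-tree structures, while the sketch uses a sorted array with binary search, and the data and function names are invented.

```javascript
// Toy collection; in a real database the documents live on disk.
const docs = [
  { _id: 1, title: "NoSQL" },
  { _id: 2, title: "CAP" },
  { _id: 3, title: "MongoDB" },
  { _id: 4, title: "BASE" },
];

// Without an index: examine every document (a collection scan).
function scanByTitle(title) {
  let examined = 0;
  let found = null;
  for (const doc of docs) {
    examined += 1;
    if (doc.title === title) found = doc;
  }
  return { found, examined };
}

// The "index": (title, position) entries kept sorted by title.
const titleIndex = docs
  .map((doc, position) => ({ key: doc.title, position }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

// With the index: binary search over the sorted entries.
function lookupByTitle(title) {
  let lo = 0;
  let hi = titleIndex.length - 1;
  let examined = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    examined += 1;
    const entry = titleIndex[mid];
    if (entry.key === title) return { found: docs[entry.position], examined };
    if (entry.key < title) lo = mid + 1;
    else hi = mid - 1;
  }
  return { found: null, examined };
}

const viaScan = scanByTitle("NoSQL");
const viaIndex = lookupByTitle("NoSQL");
```

The scan always touches every document, while the indexed lookup touches a number of entries that grows only logarithmically with the collection size.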

1) The createIndex() Method:
• To create an index, you need to use createIndex() method of MongoDB.
Syntax:
>db.COLLECTION_NAME.createIndex({KEY:1})
Here key is the name of the field on which you want to create index and 1 is
for ascending order. To create index in descending order you need to use -1.
Example:
>db.mycol.createIndex({"title":1})

• In createIndex() method you can pass multiple fields, to create index on multiple
fields.
Example: >db.mycol.createIndex({"title":1,"description":-1})

2. Display Index:
The getIndexes() method:
• This method returns the description of all the indexes in the collection.
Syntax: >db.COLLECTION_NAME.getIndexes()
Example: >db.mycol.createIndex({"title":1,"description":-1})
>db.mycol.getIndexes()

3. Drop Index:
The dropIndex() method:
You can drop a particular index using the dropIndex() method of MongoDB.
Syntax: >db.COLLECTION_NAME.dropIndex({KEY:1})
Example: >db.mycol.dropIndex({"title":1})

The dropIndexes() method:
• This method drops the specified index or indexes of a collection. Called without arguments, it drops all indexes except the one on _id.
Syntax: >db.COLLECTION_NAME.dropIndexes()
Example: >db.mycol.dropIndexes({"title":1,"description":-1})

4. Sorting (ascending and descending):

Ascending:
Example: db.Customer.find().sort({'CustID':1})

Descending:
Example: db.Customer.find().sort({'CustID': -1})


MongoDB: Aggregation
• Aggregation operations process data records and return computed results.
• They group values from multiple documents together and can perform a variety of operations on the grouped data to return a single result.
• One of the most common use cases of aggregation is to calculate aggregate values for groups of documents.
• This is similar to aggregation in SQL with the GROUP BY clause and the COUNT, SUM, and AVG functions.
• Aggregation operations allow you to group, sort, perform calculations, analyze data, and much more.


MongoDB: Aggregation Pipeline stages

The aggregate() function is used to perform aggregation. It works with three kinds of
operators: stages, expressions, and accumulators.

Stages: Each stage starts with a stage operator. The stage operators are:
▪ $match: Filters the documents, which can reduce the number of documents passed as input to the next stage.
▪ $project: Selects specific fields from the documents.
▪ $group: Groups documents based on some value.
▪ $sort: Sorts (reorders) the documents.
▪ $skip: Skips the first n documents and passes on the remaining documents.
▪ $limit: Passes on only the first n documents.
▪ $out: Writes the resulting documents to a new collection.

Expressions: An expression refers to the name of a field in the input documents, e.g.
{ $group : { _id : "$id", total : { $sum : "$fare" } } }. Here "$id" and "$fare" are expressions.

Accumulators: These are mainly used in the $group stage.
▪ $sum: Sums the numeric values of the documents in each group.
▪ $count: Counts the number of documents in each group.
▪ $avg: Calculates the average of the given values across the documents in each group.
▪ $min: Gets the minimum value across the documents in each group.
▪ $max: Gets the maximum value across the documents in each group.
▪ $first: Gets the value from the first document in each group.
▪ $last: Gets the value from the last document in each group.


How Aggregation Works in MongoDB

1. Find the customer whose status is "completed"
customer> db.Customer.aggregate([{$match:{'Status':'completed'}}])
[
{
_id: ObjectId("635a9a5d29e81711afa59d0b"),CustID: 'A123',
Amount: 500, Status: 'completed'
},
{
_id: ObjectId("635a9a7c29e81711afa59d0c"),CustID: 'A124',
Amount: 200, Status: 'completed'
}
]

2. Find sum of amount of customer under status "completed"
Example: customer>db.Customer.aggregate([{$match:{'Status':'completed'}},{$group:
{_id:'$CustID','totalAmount':{$sum:'$Amount'}}}])
Output:
[
{ _id: 'A123', totalAmount: 500 },
{ _id: 'A124', totalAmount: 200 }
]
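
To see what the two stages above compute, the same result can be reproduced in plain JavaScript on an in-memory array. This is only an emulation for understanding, not how MongoDB executes a pipeline; the third sample row is invented to show that non-matching documents are filtered out.

```javascript
// In-memory stand-in for the Customer collection.
const customers = [
  { CustID: "A123", Amount: 500, Status: "completed" },
  { CustID: "A124", Amount: 200, Status: "completed" },
  { CustID: "A123", Amount: 250, Status: "pending" }, // invented extra row
];

// Equivalent of {$match: {Status: 'completed'}}: keep only matching documents.
const matched = customers.filter((doc) => doc.Status === "completed");

// Equivalent of {$group: {_id: '$CustID', totalAmount: {$sum: '$Amount'}}}:
// one running total per distinct CustID.
const totals = new Map();
for (const doc of matched) {
  totals.set(doc.CustID, (totals.get(doc.CustID) || 0) + doc.Amount);
}
const grouped = [...totals].map(([id, totalAmount]) => ({ _id: id, totalAmount }));
```

Each stage consumes the output of the previous one, which is the idea behind the pipeline.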

Refer the Examples of Aggregation in DBMS Practical No : 11


MongoDB: Map Reduce



MapReduce Syntax:
Following is the syntax of the basic mapReduce command:
db.collection.mapReduce(
   function() { emit(key, value); },                  // map function
   function(key, values) { return reduceFunction },   // reduce function
   {
      out: collection,
      query: document,
      sort: document,
      limit: number
   }
)

In the above syntax:
• map is a JavaScript function that maps a value with a key and emits a key-value pair
• reduce is a JavaScript function that reduces or groups all the documents having the same key
• out specifies the location of the map-reduce query result
• query specifies the optional selection criteria for selecting documents
• sort specifies the optional sort criteria
• limit specifies the optional maximum number of documents to be returned


MongoDB: Example of Map Reduce
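
The map and reduce phases can be imitated on an in-memory array. The helper `mapReduce` below is a hypothetical teaching aid, not the MongoDB command: in MongoDB the map function reads the document through `this` and calls a global `emit`, whereas here both are passed as arguments. The sample orders are invented.

```javascript
// Minimal in-memory imitation of map-reduce.
function mapReduce(docs, map, reduce) {
  const groups = new Map();                      // key -> list of emitted values
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) map(doc, emit);        // map phase
  return [...groups].map(([key, values]) => ({   // reduce phase, once per key
    _id: key,
    value: reduce(key, values),
  }));
}

// Invented sample orders.
const orders = [
  { CustID: "A123", Amount: 500 },
  { CustID: "A124", Amount: 200 },
  { CustID: "A123", Amount: 100 },
];

const totalsPerCustomer = mapReduce(
  orders,
  (doc, emit) => emit(doc.CustID, doc.Amount),            // map: key = customer, value = amount
  (key, values) => values.reduce((sum, v) => sum + v, 0)  // reduce: sum the amounts per key
);
```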


MongoDB: Replication

• In MongoDB, replication ensures that the same data is accessible on more than one MongoDB server.
• Replication also helps protect a database against the loss of a particular server.
• With replication, data can be recovered after a hardware failure or service interruption.

Advantages:
• Helps in disaster recovery and backup of data.
• Data can be kept safe through this redundant backup approach.
• Minimizes downtime for maintenance.

Disadvantages:
• More space required.

How Replication Works in MongoDB:
• MongoDB uses a replica set to achieve replication.
• A replica set is a group of mongod instances that host the same data set.
• A replica set has only one primary node.
▪ A minimum of three nodes is recommended for a replica set.
▪ MongoDB treats one node of the replica set as the primary node and the remaining nodes as secondary nodes.
▪ Data is replicated from the primary node to the secondary nodes.
▪ A new primary node is elected automatically in case of maintenance or failover.
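
Primary-to-secondary replication and failover can be sketched with a toy model. The class `ReplicaSet` and its election rule (pick the live node with the longest log) are simplifications invented for this sketch; real MongoDB elections involve voting among the members.

```javascript
// Toy replica set: one primary accepts writes and copies them to live secondaries.
class ReplicaSet {
  constructor(names) {
    this.nodes = names.map((name) => ({ name, up: true, log: [] }));
    this.primary = this.nodes[0];
  }

  write(entry) {
    this.primary.log.push(entry);
    for (const node of this.nodes) {
      if (node !== this.primary && node.up) node.log.push(entry); // replicate
    }
  }

  failPrimary() {
    this.primary.up = false;
    // Simplified election: the live node with the longest log becomes primary.
    const candidates = this.nodes.filter((node) => node.up);
    this.primary = candidates.reduce((best, node) =>
      node.log.length > best.log.length ? node : best
    );
  }
}

const rs = new ReplicaSet(["node1", "node2", "node3"]);
rs.write({ x: 10 });
rs.failPrimary();                           // node1 goes down, a secondary takes over
rs.write({ x: 11 });
const newPrimaryName = rs.primary.name;
const entriesOnNewPrimary = rs.primary.log.length;
```

The write made before the failure is still present on the new primary, which is the protection that replication provides.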

MongoDB: Sharding
• Sharding is a method for distributing or partitioning data across multiple
machines.
• MongoDB uses sharding to support deployments with very large data sets and high
throughput operations.

There are two methods for addressing system growth: vertical and horizontal
scaling.

Sharded Cluster:
A MongoDB sharded cluster consists of the following components:
▪ shard: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.
▪ mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster. Starting in MongoDB 4.4, mongos can support hedged reads to minimize latencies.
▪ config servers: Config servers store metadata and configuration settings for the cluster.
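
The routing role of mongos can be pictured with a small sketch that maps each shard key to one of several shards. This is a deliberate simplification: the shard names, the hash function, and the modulo rule are all invented here, and MongoDB itself assigns ranges of key values (chunks) to shards rather than using a plain modulo.

```javascript
// Hypothetical shard names for the sketch.
const SHARDS = ["shard0", "shard1", "shard2"];

// Simple deterministic string hash (not MongoDB's hash function).
function hashKey(key) {
  let h = 0;
  for (const ch of String(key)) h = (h * 31 + ch.codePointAt(0)) >>> 0;
  return h;
}

// Router: the same key is always sent to the same shard.
function shardFor(key) {
  return SHARDS[hashKey(key) % SHARDS.length];
}

// Distribute some invented customer ids across the shards.
const placement = new Map(SHARDS.map((shard) => [shard, []]));
for (const custId of ["A123", "A124", "A125", "A126"]) {
  placement.get(shardFor(custId)).push(custId);
}
```

Because the mapping is deterministic, a query for one key needs to contact only the shard that owns it.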


Thank You !!!!
