Unit 1 MongoDB

The document provides a comprehensive overview of MongoDB, a popular NoSQL database, highlighting its features, advantages, and installation process. It covers the differences between NoSQL and traditional relational databases, the CAP theorem, and various types of databases, including document, key-value, and graph databases. Additionally, it outlines the need for MongoDB's flexible schema design, scalability, and security features.

NoSQL - MongoDB

By

Dr T.LOGESWARI
Assistant Professor
Department of
BUSINESS ANALYTICS
PSG College of Arts
and Science
Coimbatore – 641014.
UNIT -I
Introduction to MongoDB
• Big Databases
• SQL-NoSQL Tradeoffs
• CAP Theorem
• Eventual Consistency
• NoSQL Database Types

MongoDB - Introduction
• Need
• MongoDB Vs RDBMS
• MongoDB Driver Installation
• Configuration
• Import & Export MongoDB Server Configuration
Big Database
• Definition: Databases specifically designed to store and manage large-scale data efficiently.

• Types:
  • Relational (e.g., PostgreSQL, MySQL; limited for very big data)
  • NoSQL (e.g., MongoDB, Cassandra; designed for scale)
  • Distributed data systems (e.g., Hadoop HDFS, a distributed file system, and Amazon Redshift, a data warehouse)
• Focus: How to organize, store, retrieve, and update large volumes of data efficiently and reliably.
• Examples: Google Bigtable, Apache Cassandra, Amazon DynamoDB.
Big Databases are designed for:
• Handling large-scale data that traditional databases cannot manage efficiently. Their main design goals are:
1. Scalability
• Handle huge volumes of data, from terabytes to
petabytes.
• Support horizontal scaling (adding more machines)
and vertical scaling (upgrading hardware).
2. High Availability & Fault Tolerance
• Ensure data is always accessible, even if some
servers fail.
• Use replication and clustering to avoid data loss
and downtime.
3. Fast Data Access & Query Performance
• Optimize read and write operations for large data.
• Use indexing, partitioning, and caching to ensure
quick responses.
4. Support for Different Data Models
• Manage structured, semi-structured, and even some unstructured data.
• Examples:
  • Relational databases (structured)
  • Document databases like MongoDB (semi-structured)
  • Column stores like Cassandra (structured but flexible)
5. Distributed Architecture
• Data is stored across multiple machines or locations.
• Used in cloud environments or on-premises data centers.
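The horizontal scaling and partitioning points above can be illustrated with a minimal sketch of hash partitioning: each key is hashed to pick one of several "machines" (plain dictionaries here). The names `shard_for` and `ShardedStore` are hypothetical and are not taken from any particular database; real systems add replication, rebalancing, and failure handling on top of this idea.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy store that spreads keys across several in-memory shards."""

    def __init__(self, num_shards: int):
        # Each dict stands in for one machine in the cluster.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=3)
for i in range(9):
    store.put(f"user:{i}", {"id": i})

# How many keys landed on each shard (depends on the hash values).
print([len(shard) for shard in store.shards])
print(store.get("user:4"))
```

Because the shard is computed from the key alone, any client can find the right machine without a central lookup. The cost is that changing the number of shards moves most keys, which is why production systems tend to use more elaborate schemes.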
Examples of Big Databases
Type               | Example           | Use Case
NoSQL Document     | MongoDB           | Flexible schema, fast inserts
Column Store       | Apache Cassandra  | High write throughput, IoT data
Wide-column        | Google Bigtable   | Real-time analytics, time-series data
Distributed RDBMS  | Amazon Aurora     | High-availability SQL-based apps
Basic Understanding
1. Which big database is a NoSQL Document type?
   a) Google Bigtable
   b) MongoDB
   c) Amazon Aurora
   d) Apache Cassandra
2. Fill in the blank: ________ is used for high write throughput and is suitable for IoT data.
3. What type of database is Amazon Aurora? Answer: __________
4. Which database supports real-time analytics and time-series data?
5. True or False: Apache Cassandra is a wide-column database.
Introduction to NoSQL

• The term “NoSQL” means “Not Only SQL”.
• It refers to “non-relational” databases.
• They store data in unstructured or semi-structured form, without following a fixed schema.
• They are distributed data stores built for high storage capacity requirements.
• They are used for Big Data and real-time web applications.
• For example, Twitter, Facebook, and Google collect several terabytes of data about their users every day.
History - NoSQL
• 1998 - The term was coined by Carlo Strozzi to name his lightweight, open-source relational database.
• The approach was later adopted and popularized by GAFAM companies such as Google, Facebook, and Amazon, which faced huge volumes of data.
• 2000 - Graph database Neo4j
• 2004 - Google Bigtable
• 2005 - CouchDB
• 2007 - Amazon Dynamo
• 2008 - Facebook open-sourced Cassandra, the non-relational database it used internally.
Data Store - RDBMS
Data Store - NoSQL
Categories (Data Store) - NoSQL
• Document databases: These databases store data
as semi-structured documents, such as JSON or
XML, and can be queried using document-oriented
query languages.
• Key-value stores: These databases store data as
key-value pairs, and are optimized for simple and
fast read/write operations.
• Column-family stores: These databases store
data as column families, which are sets of columns
that are treated as a single entity. They are
optimized for fast and efficient querying of large
amounts of data.
• Graph databases: These databases store data as
nodes and edges, and are designed to handle
complex relationships between data.
Data Store - Key Value
True Key-Value Store Format:
In a Key-Value Store, each entry is a key and a value, like this:

• Key → Value
• "user:111" → {FName: "ABC", LName: "XYZ", City: "Bangalore", ...}
• "order:200" → {UserId: 111, Product: "Mobile", Amount: 1000}
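The key → value format above can be sketched in a few lines of Python. This is only an illustration built on a dictionary; the class name `KeyValueStore` and its methods are hypothetical, not the API of any real key-value product. The point is that the store only understands whole keys and treats each value as an opaque blob.

```python
class KeyValueStore:
    """Toy key-value store: every operation goes through the key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        """Remove a key; return True if it existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False


kv = KeyValueStore()
kv.put("user:111", {"FName": "ABC", "LName": "XYZ", "City": "Bangalore"})
kv.put("order:200", {"UserId": 111, "Product": "Mobile", "Amount": 1000})

print(kv.get("user:111")["City"])    # Bangalore
print(kv.get("order:999"))           # None (no such key)
```

Note that there is no way to ask "which orders belong to user 111?" without reading every value. That limitation is the price paid for very fast lookups by key.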
Data Store - Document
• Analogy:
• “A Document Store is like a folder full of individual Word
files (documents), each with different structures. You
can open and read any one without needing to follow a
strict format.”
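The folder-of-documents analogy can be made concrete with a small sketch: a "collection" is a list of dictionaries, each with its own set of fields, and a query matches on whichever fields a document happens to have. The `find` helper and the sample data are hypothetical and only mimic the idea of a document query; they are not a real database API.

```python
# Each document has its own structure; no shared schema is declared.
documents = [
    {"_id": 1, "name": "ABC", "city": "Bangalore"},
    {"_id": 2, "name": "DEF", "city": "Coimbatore", "phones": ["111", "222"]},
    {"_id": 3, "title": "Invoice", "amount": 1000},
]


def find(collection, criteria):
    """Return the documents whose fields equal every key/value in criteria."""
    return [
        doc
        for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


print(find(documents, {"city": "Coimbatore"}))
print(find(documents, {"amount": 1000}))
print(find(documents, {"city": "Chennai"}))   # [] (nothing matches)
```

Unlike a key-value store, the database can look inside the value, so queries on any field are possible even though the documents differ in shape.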
Data Store - Column
• What is a Wide Column Store?
• A wide column store stores data in rows and columns, but columns are grouped into column families, and each row can have a different set of columns.
• Example systems: Apache Cassandra, Apache HBase
• Analogy: “Think of it like a shelf of boxes, where each box is a row, and inside each box you can have different sets of labeled folders (column families) depending on what that row needs.”
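The shelf-of-boxes picture maps naturally onto nested dictionaries: row key → column family → column name → value. The sketch below is an assumption-laden illustration (the helper names `put` and `get_family` and the sample rows are made up) and does not reflect how Cassandra or HBase store data on disk.

```python
from collections import defaultdict

# row key -> column family -> column name -> value
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, column, value):
    """Write one cell; the row and family are created on first use."""
    table[row_key][family][column] = value


def get_family(row_key, family):
    """Read all columns of one family without creating empty entries."""
    return dict(table.get(row_key, {}).get(family, {}))


# Two rows with different sets of columns and families.
put("user:111", "profile", "name", "ABC")
put("user:111", "profile", "city", "Bangalore")
put("user:111", "orders", "order:200", "Mobile")
put("user:222", "profile", "name", "XYZ")

print(get_family("user:111", "profile"))
print(get_family("user:222", "orders"))   # {} (this row has no orders family)
```

Reading one family of one row touches only that small group of columns, which is the access pattern these systems are built around.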
Data Store - Graph
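As described in the categories list, a graph database stores data as nodes and edges. A minimal sketch, assuming a hypothetical `Graph` class with labeled, directed edges, shows why relationship queries are natural in this model: following an edge is a direct lookup rather than a join.

```python
from collections import defaultdict


class Graph:
    """Toy property graph: nodes with properties, labeled directed edges."""

    def __init__(self):
        self.nodes = {}                   # node id -> properties
        self.edges = defaultdict(list)    # node id -> [(label, target id)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, source, label, target):
        self.edges[source].append((label, target))

    def neighbors(self, node_id, label):
        """Targets reachable from node_id over edges with the given label."""
        return [t for (lbl, t) in self.edges.get(node_id, []) if lbl == label]


g = Graph()
g.add_node("alice", kind="Person")
g.add_node("bob", kind="Person")
g.add_node("carol", kind="Person")
g.add_edge("alice", "FOLLOWS", "bob")
g.add_edge("bob", "FOLLOWS", "carol")

# Two-hop traversal: who do the people Alice follows follow?
second_hop = [
    person
    for friend in g.neighbors("alice", "FOLLOWS")
    for person in g.neighbors(friend, "FOLLOWS")
]
print(second_hop)   # ['carol']
```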
SQL - NoSQL Tradeoff
CAP Theorem
• CA (Consistency and Availability)
  • CA systems prioritize consistency and availability, but are not partition tolerant.
  • CA is mostly theoretical and rarely implemented in modern distributed databases, because network partitions cannot be ruled out in practice.
  • Example: a single-node relational database such as MySQL or PostgreSQL.
• AP (Availability and Partition Tolerance)
  • The system prioritizes availability over consistency and can respond with possibly stale data.
  • The system can be distributed across multiple nodes and is designed to operate reliably even in the face of network partitions.
  • Example databases: Amazon DynamoDB, Apache Cassandra, CouchDB, Riak, Voldemort.
• CP (Consistency and Partition Tolerance)
  • The system prioritizes consistency over availability and responds with the latest updated data.
  • The system can be distributed across multiple nodes and is designed to operate reliably even in the face of network partitions.
  • Example databases: Apache HBase, MongoDB, Redis, Google Cloud Spanner.
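The difference between AP and CP behavior can be seen in a deliberately tiny simulation. The `Cluster` class below is a hypothetical model with two replicas and a `partitioned` flag. It is a strong simplification (real CP systems usually keep serving from the side that holds a majority), but it shows the trade-off: during a partition, AP answers with possibly stale data, while CP refuses to answer. Healing the partition in AP mode also illustrates eventual consistency.

```python
class Replica:
    def __init__(self):
        self.value = None
        self.version = 0


class Cluster:
    """Two replicas, A and B. Writes arrive at A, reads are served by B."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False   # True: A and B cannot communicate

    def write(self, value):
        if self.partitioned and self.mode == "CP":
            # CP: refuse the write rather than let the replicas diverge.
            raise RuntimeError("unavailable: cannot reach all replicas")
        self.a.version += 1
        self.a.value = value
        if not self.partitioned:
            self.b.value, self.b.version = self.a.value, self.a.version

    def read_from_b(self):
        if self.partitioned and self.mode == "CP":
            # CP: refuse the read rather than risk returning stale data.
            raise RuntimeError("unavailable: cannot confirm latest value")
        return self.b.value

    def heal(self):
        """End the partition and let B catch up with A."""
        self.partitioned = False
        if self.a.version > self.b.version:
            self.b.value, self.b.version = self.a.value, self.a.version


ap = Cluster("AP")
ap.write("v1")
ap.partitioned = True
ap.write("v2")                 # accepted by A only
print(ap.read_from_b())        # v1 (stale, but the system stayed available)
ap.heal()
print(ap.read_from_b())        # v2 (replicas converge: eventual consistency)

cp = Cluster("CP")
cp.write("v1")
cp.partitioned = True
try:
    cp.write("v2")
except RuntimeError as err:
    print("CP write rejected:", err)
```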
Database
Hierarchical Databases
A hierarchical database organizes data in ranks or levels, like a tree: records are grouped under a common parent record (the point of linkage), and each child record has exactly one parent.
Example: IBM IMS
Network Databases
A network database links records in a net-like structure in which a record can have more than one parent. In the slide's example, the Student, Faculty, and Resources elements each have two parent records: Departments and Clubs.
Example: Integrated Data Store (IDS)
Object-Oriented Databases
Information stored in the database is represented as objects, each of which is an instance of a class in the database model. An object can therefore be referenced, and its methods called, directly.
• Objects are linked to one another through methods; for example, the address of a person (represented by the Person object) can be obtained with the livesAt() method. Example: ObjectDB
Relational Databases
A relational database organizes data into tables (relations) of rows and columns. Every record has a unique identity in the form of a key, and records in different tables are related to one another through these keys.
Cloud Databases
• A cloud database stores and runs data in a virtual environment on a cloud platform. Many cloud computing service models (such as SaaS and PaaS) are available for accessing the data.
• Some cloud platforms are:
  • Amazon Web Services (AWS)
  • Google Cloud Platform (GCP)
  • Microsoft Azure

Centralized Databases
• A centralized database is stored, located, and maintained at a single location, which makes access to the data easier to secure.

Advantages
• Data Security
• Reduced Redundancy
• Consistency
Disadvantages
• A centralized database tends to be large, which increases response and retrieval time.
• It is not easy to modify, delete, and update.
Distributed Databases
• A distributed database consists of multiple databases that are connected with each other and spread across different physical locations.
• Typically, distributed databases operate on two or more interconnected servers on a computer network.
• Examples: Apache Ignite, Apache Cassandra, Apache HBase, Amazon SimpleDB
NoSQL Databases
Advantages of NoSQL
• There are many advantages of working
with NoSQL databases such as MongoDB
and Cassandra. The main advantages are
high scalability and high availability.

Disadvantages of NoSQL
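• NoSQL has the following disadvantages:
  • Many NoSQL databases are open-source projects, and there is no common standard across them.
  • GUI tools are less widely available than for relational databases.
  • Backup is a weak point for some NoSQL databases, such as MongoDB.
  • Documents can be large.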
MongoDB - Introduction
• MongoDB is one of the most popular open-source NoSQL databases, and it is written in C++.
• As of February 2015, MongoDB was the fourth most popular database management system.
• It was developed by the company 10gen, which is now known as MongoDB Inc.
• MongoDB is a document-oriented database that stores data in JSON-like documents with a dynamic schema.
• This means you can store records without fixing the data structure in advance, such as the number of fields or the types of the values they hold.
• MongoDB documents are similar to JSON objects.
MongoDB - History
• Dwight Merriman and Eliot Horowitz encountered development and scalability issues with traditional relational database approaches.
• MongoDB is a document-based database developed in the C++ programming language.
• The word Mongo is derived from “humongous”. MongoDB was first developed in 2007 by a New York-based organization named 10gen.
• 10gen later changed its name and is known today as MongoDB Inc.
• Initially, MongoDB was developed as part of a PaaS (Platform as a Service) product.
• In 2009, it was released as an open-source database under the name MongoDB 1.0.
• [Figure: MongoDB release history]
• MongoDB 4.0 was released in June 2018; newer major versions have followed (the installation slides in this unit refer to MongoDB 7.0).
MongoDB - Need
• Flexibility:
  MongoDB uses documents that can contain sub-documents in complex hierarchies, making it expressive and flexible.
  MongoDB documents map naturally to objects in most programming languages, which eases implementation and maintenance.

• Flexible Query Model:
  The user can selectively index parts of each document and query based on regular expressions, ranges, or attribute values, with as many properties per object as the application layer needs.
• Native Aggregation:
  Native aggregation allows users to extract and transform data from the database. The data can either be loaded into a new format or exported to other data sources.

• Schema-less model:
  Applications get the power and responsibility to interpret the different properties found in a collection's documents.
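The "extract and transform" idea behind native aggregation can be sketched in plain Python, without a database. The function below filters documents and then groups and sums them, which is comparable in spirit to a `$match` stage followed by a `$group` stage with `$sum` in MongoDB's aggregation pipeline. The data and the function name `total_by_city` are made up for illustration, and this code does not call MongoDB.

```python
from collections import defaultdict

orders = [
    {"city": "Bangalore", "product": "Mobile", "amount": 1000},
    {"city": "Coimbatore", "product": "Laptop", "amount": 5000},
    {"city": "Bangalore", "product": "Tablet", "amount": 2000},
    {"city": "Coimbatore", "product": "Mobile", "amount": 500},
]


def total_by_city(docs, min_amount=0):
    """Filter documents, then sum the amount per city."""
    # Filter step (comparable to a $match stage).
    selected = [doc for doc in docs if doc["amount"] >= min_amount]

    # Group-and-sum step (comparable to a $group stage using $sum).
    totals = defaultdict(int)
    for doc in selected:
        totals[doc["city"]] += doc["amount"]

    # Reshape the result into new documents, sorted by city name.
    return [{"_id": city, "total": total} for city, total in sorted(totals.items())]


print(total_by_city(orders))
# [{'_id': 'Bangalore', 'total': 3000}, {'_id': 'Coimbatore', 'total': 5500}]
print(total_by_city(orders, min_amount=1000))
# [{'_id': 'Bangalore', 'total': 3000}, {'_id': 'Coimbatore', 'total': 5000}]
```

In a real deployment the database would run these steps itself, so only the small result set travels to the application.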
1. General-Purpose Database:
MongoDB can serve diverse sets of data and multiple purposes within a single application.

2. Flexible Schema Design:
The document-oriented approach allows attributes that were not defined in advance to be added or modified on the fly. This is a key contrast between MongoDB and relational databases.

3. Load Balancing and Scalability:
It is built to scale, both vertically and horizontally. Using the technique of sharding, an architect can achieve both write and read scalability. Data balancing occurs automatically and transparently.

4. Aggregation Framework:
MongoDB offers an aggregation framework that can perform Extract, Transform, Load (ETL) style processing inside the database, which reduces the need for complex external data pipelines.

5. Native Replication:
Data gets replicated across a replica set without a complicated setup.

6. Security Features:
Authentication and authorization are built in.

7. JSON:
JSON is widely used for frontend and API communication, so it makes sense for the database to use a similar format. MongoDB stores documents in BSON, a binary JSON-like format.

8. MapReduce:
MongoDB offers MapReduce, a batch processing framework, as a tool for building data pipelines. In MongoDB 5.0 and later, map-reduce is deprecated in favor of the aggregation pipeline.
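The MapReduce model mentioned in item 8 has three phases: map each document to key/value pairs, group the pairs by key, and reduce each group to a single value. The sketch below implements those phases in plain Python to show the idea; the function names and sample documents are hypothetical, and this is not MongoDB's own map-reduce interface.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn):
    """Run map, shuffle (group by key), and reduce over a list of documents."""
    # Map: each document may emit any number of (key, value) pairs.
    pairs = []
    for doc in docs:
        pairs.extend(map_fn(doc))

    # Shuffle: collect all values emitted for the same key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)

    # Reduce: collapse each key's values into one result.
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


products = [
    {"product": "Mobile", "tags": ["electronics", "phone"]},
    {"product": "Laptop", "tags": ["electronics", "computer"]},
    {"product": "Case", "tags": ["phone"]},
]


def emit_tags(doc):
    """Map function: emit (tag, 1) for every tag on the document."""
    return [(tag, 1) for tag in doc["tags"]]


def count(key, values):
    """Reduce function: add up the emitted ones."""
    return sum(values)


tag_counts = map_reduce(products, emit_tags, count)
print(tag_counts)
# {'electronics': 2, 'phone': 2, 'computer': 1}
```

Because the map and reduce functions work on one document or one key at a time, the phases can be spread across many machines, which is what makes the model suitable for batch processing.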
Install MongoDB on Windows
• MongoDB 4.4 and later only support 64-bit versions of
Windows.
• MongoDB 7.0 Community Edition supports the following
64-bit versions of Windows on x86_64 architecture:
• Windows Server 2022
• Windows Server 2019
• Windows 11
• Step 1: Go to the MongoDB Download Center and download the MongoDB Community Server.
• Step 2: When the download is complete, open the .msi file and click Next on the startup screen.
• Step 3: Accept the End-User License Agreement and click Next.
• Step 4: Select the Complete option to install all the program features.
• Step 5: Select “Run service as Network Service user” and copy the path of the data directory. Click Next.
• Step 6: Click Install to start the MongoDB installation process.
• Step 7: After you click Install, the installation of MongoDB begins.
• Step 8: Click Finish to complete the MongoDB installation process.
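After the installer finishes, one way to confirm that the server binary is usable is to ask it for its version. The sketch below assumes that the MongoDB `bin` directory has been added to the `PATH` environment variable (the installer may not do this for you); if `mongod` cannot be found, the helper simply returns `None`. The function name `mongod_version` is made up for this example.

```python
import shutil
import subprocess


def mongod_version():
    """Return the first line printed by `mongod --version`, or None."""
    exe = shutil.which("mongod")   # looks for the executable on PATH
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


version = mongod_version()
print(version if version else "mongod not found on PATH")
```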
