BCA Semester – V (Third Year)
Subject Title: Next Generation Database (CAM302-3C)
Introduction to NoSQL and MongoDB
What is NoSQL?
NoSQL stands for "Not Only SQL". It refers to a new kind of database that
provides a mechanism for storing and retrieving data in ways that are different
from traditional relational databases (RDBMS) like MySQL, PostgreSQL, or
Oracle.
Why NoSQL?
Traditional SQL databases store data in tables with rows and columns. This
works well for structured data, but today’s applications deal with:
Large volumes of unstructured or semi-structured data (e.g., social
media, logs, IoT)
The need for high performance and scalability
Frequent changes to the data structure (schema)
NoSQL databases solve these challenges by being more flexible, scalable, and
better suited for modern web and cloud applications.
Characteristics of NoSQL Databases
| Feature | Description |
| --- | --- |
| Schema-less | No fixed structure. You can add different fields in different records. |
| Horizontal scaling | Easily scale by adding more servers (good for big data). |
| High performance | Fast read/write speeds. |
| Flexible data models | Supports various formats: documents, key-value pairs, graphs, etc. |
Types of NoSQL Databases
| Type | Description | Example |
| --- | --- | --- |
| Document-oriented | Stores data in JSON-like documents | MongoDB |
| Key-value | Data is stored as key-value pairs | Redis |
| Column-family | Stores data in columns instead of rows | Cassandra |
| Graph-based | Stores data as nodes and relationships | Neo4j |
Overview of Database Types: Relational vs. NoSQL
Relational Databases (RDBMS)
Relational databases store data in tables (rows and columns). Each table has a
fixed schema (structure), and relationships between data are maintained using
foreign keys.
Common Examples:
MySQL
PostgreSQL
Oracle
Microsoft SQL Server
Key Characteristics:
| Feature | Description |
| --- | --- |
| Schema | Fixed (predefined structure) |
| Data Format | Tabular (rows and columns) |
| Relationships | Strong support using foreign keys |
| Transactions | Strong ACID compliance |
| Query Language | SQL (Structured Query Language) |
| Best For | Banking, ERP, inventory systems, traditional applications |
Example Table:
| ID | Name | Age | City |
| --- | --- | --- | --- |
| 1 | Alice | 25 | Mumbai |
| 2 | Bob | 30 | Bangalore |
Introduction to MongoDB
What is MongoDB?
MongoDB is one of the most widely used document-based NoSQL databases. It is
used by companies like Adobe, eBay, and Forbes.
MongoDB is a modern database that stores data in the form of documents
rather than rows and columns.
Traditional databases use tables with strict structures. MongoDB stores
data in flexible, JSON-like documents, which it encodes internally as BSON
(Binary JSON).
These documents are made of key-value pairs, where each key has a
value. Values can be numbers, text, lists, or even other documents.
Example MongoDB Document:
{
"name": "Amit",
"age": 21,
"course": "BCA",
"hobbies": ["Reading", "Coding"]
}
MongoDB documents can have different fields. One student may have
hobbies, another may not.
This makes MongoDB very useful for dynamic data, such as in web and
mobile applications, where data keeps changing.
MongoDB vs. SQL (RDBMS)
| Feature | MongoDB | SQL (MySQL, etc.) |
| --- | --- | --- |
| Storage Format | BSON (like JSON) | Tables with rows |
| Schema | Dynamic | Fixed |
| Joins | Limited support | Full join support |
| Scalability | Horizontally scalable | Often vertically scalable |
| Ideal Use Case | Big data, real-time apps | Traditional enterprise apps |
MongoDB Structure
Database → Contains multiple collections
Collection → Similar to a table in SQL; contains documents
Document → Similar to a row; contains fields with values (like a JSON
object)
Example Document:
{
"name": "Alice",
"age": 25,
"email": "alice@[Link]",
"skills": ["Python", "MongoDB", "React"]
}
Basic MongoDB Commands
| Operation | Command |
| --- | --- |
| Create Database | use myDatabase |
| Create Collection | db.createCollection("users") |
| Insert Document | db.users.insertOne({name: "Alice", age: 25}) |
| Find Document | db.users.find() |
| Update Document | db.users.updateOne({name: "Alice"}, {$set: {age: 26}}) |
| Delete Document | db.users.deleteOne({name: "Alice"}) |
Real-World Use Cases of MongoDB
Content management systems (CMS)
Product catalogs (e.g., e-commerce)
Real-time analytics
Mobile and IoT applications
Social media applications
Advantages of MongoDB over RDBMS
Schema less: MongoDB is a document database in which one collection
holds different documents. Number of fields, content and size of the
document can differ from one document to another.
Structure of a single object is clear.
No complex joins.
Deep query-ability. MongoDB supports dynamic queries on documents
using a document-based query language that's nearly as powerful as SQL.
Tuning.
Ease of scale-out: MongoDB is easy to scale.
Conversion/mapping of application objects to database objects not
needed.
Uses internal memory for storing the (windowed) working set, enabling
faster access of data.
Why Use MongoDB?
Document Oriented Storage: Data is stored in the form of JSON style
documents.
Index on any attribute
Replication and high availability
Auto-sharding
Rich queries
Fast in-place updates
Professional support by MongoDB
Limitations of MongoDB
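Joins are limited compared to SQL: $lookup can combine collections, but
complex multi-collection joins are harder to express
Multi-document ACID transactions are supported (since MongoDB 4.0), but with
more restrictions than in a typical RDBMS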
High memory usage for very large datasets if not designed carefully
Where MongoDB is Used
MongoDB is used in a wide variety of fields because of its scalability, real-time
updates, and ability to handle different types of data.
A. E-Commerce Websites
Store product details like name, brand, price, image, and user reviews.
Each product can have different fields depending on category (e.g.,
clothes vs. electronics).
B. Real-Time Applications
Used in apps like Uber, Swiggy, and Zomato.
Manages user profiles, live locations, orders, and delivery statuses.
C. IoT (Internet of Things)
Smart gadgets like smartwatches and smart fridges send different types of
sensor data.
MongoDB stores this data efficiently and handles high data volume.
D. Social Media and Chat Apps
Stores user info, messages, posts, and images.
MongoDB handles real-time data changes well.
E. Healthcare and Education
Hospitals store different patient records and medical reports.
Schools store data like grades, attendance, and homework.
Summary
| SQL | NoSQL (MongoDB) |
| --- | --- |
| Structured data | Flexible schema |
| Fixed schema | Schema-less |
| Slower with big data | Optimized for speed and scale |
| Good for relational data | Good for dynamic and hierarchical data |
1. Installing MongoDB on Different Platforms
A. Windows
Steps:
1. Download the MongoDB installer from:
[Link]
2. Choose:
o Version: Current Stable
o OS: Windows
o Package: MSI (Installer)
3. Run the Installer:
o Choose "Complete" Setup
o Tick “Install MongoDB as a Service” (Recommended)
4. After installation, MongoDB will be available in:
C:\Program Files\MongoDB\Server\<version>\
▶️To Start MongoDB:
MongoDB runs as a Windows Service.
To use the MongoDB shell, open Command Prompt and type:
mongosh
MongoDB Shell (mongosh)
Type mongosh in terminal to:
Connect to MongoDB server
Run commands like:
show dbs
use test
MongoDB comes with a JavaScript shell that allows interaction with a
MongoDB instance from the command line. The shell is useful for performing
administrative functions, inspecting a running instance, or just playing around.
The shell (mongosh, which replaces the older mongo shell) is a crucial tool for
using MongoDB and is used extensively throughout the rest of these notes.
The shell automatically attempts to connect to a MongoDB server on startup, so
make sure you start mongod before starting the shell.
The shell is a full-featured JavaScript interpreter, capable of running
arbitrary JavaScript programs.
MongoDB Statistics
To get statistics about the current database, type the command db.stats() in
the shell. This shows the database name and the number of collections and
documents in the database.
MongoDB ─ Create Database
The use Command
The use DATABASE_NAME command switches to the named database. If the database
does not exist yet, MongoDB creates it when you first store data in it;
otherwise the command selects the existing database.
Syntax
use DATABASE_NAME
Example
To work with a database named mydb, the use statement is as follows:
>use mydb
switched to db mydb
To check your currently selected database, use the command db
>db
mydb
If you want to check your databases list, use the command show dbs.
>show dbs
local 0.78125GB
test 0.23012GB
The database you created (mydb) is not in the list. A database appears in the
list only after at least one document has been inserted into it.
MongoDB ─ Drop Database
The dropDatabase() Method
The db.dropDatabase() method drops an existing database.
Syntax
The basic syntax of dropDatabase() is as follows:
db.dropDatabase()
This will delete the selected database. If you have not selected any database,
then it will delete default 'test' database.
Example
First, check the list of available databases by using the command, show dbs.
>show dbs
local 0.78125GB
mydb 0.23012GB
test 0.23012GB
>
To delete the new database mydb, use dropDatabase() as follows:
>use mydb
switched to db mydb
>db.dropDatabase()
{ "dropped" : "mydb", "ok" : 1 }
>
Now check the list of databases.
>show dbs
local 0.78125GB
test 0.23012GB
>
MongoDB ─ Create Collection
In MongoDB, a collection is a group of MongoDB documents, similar to a table
in relational databases.
You can create collections explicitly using the createCollection() method, or
implicitly when you first insert a document into a non-existent collection.
Method: db.createCollection(name, options)
Syntax:
db.createCollection("<collection_name>", <options>)
<collection_name>: Name of the collection to create.
<options> (optional): A document of settings for the collection. If you omit
it, you need to specify only the name of the collection.
Following is the list of options you can use:
capped (Boolean): If true, creates a capped collection. A capped collection
is a fixed-size collection that automatically overwrites its oldest entries
when it reaches its maximum size. If you specify true, you must also
specify the size parameter.
size (Number): Specifies the maximum size in bytes (for capped collections).
max (Number): Maximum number of documents (for capped collections).
validator: Validation rules for documents, for example a JSON Schema.
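The overwrite behaviour of a capped collection can be pictured with a small plain-JavaScript model. This is only a sketch, not MongoDB code: CappedList is a hypothetical class, and it models just the max document count, whereas a real capped collection is limited primarily by its size in bytes.

```javascript
// Plain-JavaScript model of a capped collection; NOT MongoDB code.
// CappedList and its `max` parameter are hypothetical names for illustration.
class CappedList {
  constructor(max) {
    this.max = max; // maximum number of documents to keep
    this.docs = []; // documents in insertion order, oldest first
  }

  insert(doc) {
    this.docs.push(doc);
    // Once the cap is exceeded, drop the oldest document.
    if (this.docs.length > this.max) {
      this.docs.shift();
    }
  }
}

const logs = new CappedList(3);
for (let i = 1; i <= 5; i++) {
  logs.insert({ message: "event " + i });
}
console.log(logs.docs.map((d) => d.message)); // [ 'event 3', 'event 4', 'event 5' ]
```

After five inserts only the three newest entries remain, which is why capped collections suit data such as logs.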
Example 1: Create a Simple Collection
use myDatabase // Switch to the database (created on first write)
db.createCollection("students")
Explanation: This creates an empty collection named students in the
myDatabase database.
Example 2: Create a Capped Collection
db.createCollection("logs", { capped: true, size: 10240, max: 100 })
Example 3: Create Collection with Schema Validation
db.createCollection("employees", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["name", "salary"],
      properties: {
        name: {
          bsonType: "string",
          description: "must be a string and is required"
        },
        salary: {
          // Accept any numeric BSON type; mongosh stores whole numbers as int
          bsonType: ["int", "long", "double", "decimal"],
          minimum: 0,
          description: "must be a number and is required"
        }
      }
    }
  }
})
Explanation:
This collection enforces that documents must include a name (string) and
salary (non-negative number).
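To see what these two rules accept and reject, here is a minimal plain-JavaScript sketch of the same checks. It is not MongoDB code: validateEmployee is a hypothetical helper, and the sample names are made up for illustration.

```javascript
// Plain-JavaScript model of the employees validator; NOT MongoDB code.
// validateEmployee is a hypothetical helper that mirrors the two rules above.
function validateEmployee(doc) {
  const errors = [];
  if (typeof doc.name !== "string") {
    errors.push("name must be a string and is required");
  }
  if (typeof doc.salary !== "number" || doc.salary < 0) {
    errors.push("salary must be a non-negative number and is required");
  }
  return errors; // an empty array means the document would be accepted
}

console.log(validateEmployee({ name: "Asha", salary: 42000 })); // []
console.log(validateEmployee({ name: "Ravi" }));                // salary error
console.log(validateEmployee({ name: "Meera", salary: -5 }));   // salary error
```

In MongoDB itself, a document that fails validation is rejected at insert or update time by default.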
Notes
You don’t need to create collections manually. If you insert a document
into a non-existent collection, MongoDB will create it automatically.
Example:
db.courses.insertOne({ name: "Python", duration: "3 months" })
MongoDB Collection Methods related to CRUD Operations
(CREATE, READ, UPDATE, DELETE)
1. C – CREATE: Insert Documents
db.collection.insertOne(document)
Purpose: Inserts a single document.
db.students.insertOne({
  name: "Amit",
  age: 20,
  course: "BCA"
})
db.collection.insertMany([doc1, doc2, ...])
Purpose: Inserts multiple documents at once.
db.students.insertMany([
  { name: "Priya", age: 21, course: "BSc" },
  { name: "Raj", age: 22, course: "BCA" }
])
2. R – READ: Query Documents
db.collection.find() – Returns all documents.
db.students.find()
db.collection.find(query) – Conditional retrieval.
db.students.find({ course: "BCA" })
db.collection.findOne(query) – Returns the first matching document.
db.students.findOne({ name: "Amit" })
Example: database name college
Example Dataset: students Collection
db.students.insertMany([
  { name: "Amit", age: 20, course: "BCA", marks: 72 },
  { name: "Priya", age: 21, course: "BSc", marks: 88 },
  { name: "Raj", age: 22, course: "BCA", marks: 67 },
  { name: "Neha", age: 20, course: "BCom", marks: 91 },
  { name: "Karan", age: 23, course: "BCA", marks: 55 }
])
db.students.find()
Output: Returns all 5 documents.
db.students.find({ course: "BCA" })
Output: Returns students enrolled in BCA.
Find students with marks greater than 70
db.students.find({ marks: { $gt: 70 } })
Find with AND condition
db.students.find({ course: "BCA", marks: { $gt: 60 } })
Find with OR condition
db.students.find({
  $or: [{ course: "BSc" }, { course: "BCom" }]
})
Find with projection (show only specific fields)
db.students.find(
  { course: "BCA" },
  { name: 1, marks: 1, _id: 0 }
)
Find and sort by marks (descending)
db.students.find().sort({ marks: -1 })
Find with limit
db.students.find().limit(3)
Output: First 3 documents in the collection.
Find with skip (pagination)
db.students.find().skip(2).limit(2)
Output: Skips first 2 docs, shows next 2.
Find students whose names start with "P"
db.students.find({ name: /^P/ })
Output: Priya
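The filter, projection, sort, skip, and limit steps above can be mimicked on an ordinary array. The sketch below is plain JavaScript, not MongoDB code; it reuses the five student documents and only approximates what the server does.

```javascript
// Plain-JavaScript model of find() with sort, skip, limit and projection.
// This is NOT MongoDB code; it only imitates the results on an array.
const students = [
  { name: "Amit", age: 20, course: "BCA", marks: 72 },
  { name: "Priya", age: 21, course: "BSc", marks: 88 },
  { name: "Raj", age: 22, course: "BCA", marks: 67 },
  { name: "Neha", age: 20, course: "BCom", marks: 91 },
  { name: "Karan", age: 23, course: "BCA", marks: 55 },
];

// Rough equivalent of find({ course: "BCA", marks: { $gt: 60 } })
const bcaAbove60 = students.filter((s) => s.course === "BCA" && s.marks > 60);

// Rough equivalent of find().sort({ marks: -1 }).skip(2).limit(2)
const page = [...students]
  .sort((a, b) => b.marks - a.marks) // descending by marks
  .slice(2, 2 + 2);                  // skip 2, then take 2

// Rough equivalent of the projection { name: 1, marks: 1, _id: 0 }
const projected = bcaAbove60.map(({ name, marks }) => ({ name, marks }));

console.log(bcaAbove60.map((s) => s.name)); // [ 'Amit', 'Raj' ]
console.log(page.map((s) => s.name));       // [ 'Amit', 'Raj' ]
console.log(projected);
```

Sorted by marks the order is Neha, Priya, Amit, Raj, Karan, so skipping two and taking two yields Amit and Raj.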
findOne() Examples
findOne() returns only the first matching document.
Find one student enrolled in BCA
db.students.findOne({ course: "BCA" })
Output: First BCA student (likely Amit)
Find one with projection
db.students.findOne(
  { course: "BCA" },
  { name: 1, _id: 0 }
)
Output: Only name of first BCA student
db.collection.find().pretty()
The db.collection.find().pretty() command retrieves all
documents from a collection and formats the output in a readable
(pretty-printed) way. In mongosh, results are already pretty-printed by
default, so pretty() mainly matters in the legacy mongo shell.
db.collection.find().pretty()
db – refers to the current database.
collection – the name of the collection (e.g., students, products, etc.).
.find() – fetches all documents in the collection.
.pretty() – formats the output for better readability (like JSON).
db.students.find().pretty()
{
"_id" : 1,
"name" : "Rahul",
"age" : 20,
"course" : "BCA"
}
{
"_id" : 2,
"name" : "Priya",
"age" : 21,
"course" : "[Link] CS"
}
{
"_id" : 3,
"name" : "Amit",
"age" : 22,
"course" : "BBA"
}
3. U – UPDATE: Modify Existing Documents
db.collection.updateOne(filter, update)
Purpose: Updates the first matching document.
db.students.updateOne(
  { name: "Amit" },
  { $set: { age: 21 } }
)
db.collection.updateMany(filter, update)
Purpose: Updates all matching documents.
db.students.updateMany(
  { course: "BCA" },
  { $set: { status: "Active" } }
)
db.collection.replaceOne(filter, replacement)
Purpose: Replaces the entire document (the _id stays the same).
db.students.replaceOne(
  { name: "Raj" },
  { name: "Raj", age: 23, course: "BSc" }
)
The tables below list commonly used MongoDB query and update operators,
grouped by category.
These operators are frequently used in CRUD operations, especially in update,
find, and aggregation queries.
1. Update Operators (used in updateOne, updateMany)
| Operator | Description |
| --- | --- |
| $set | Sets the value of a field |
| $unset | Removes the specified field |
| $inc | Increments a field by a value |
| $mul | Multiplies a field by a value |
| $rename | Renames a field |
| $currentDate | Sets the field to the current date |
| $min | Only updates the field if the specified value is less than the existing value |
| $max | Only updates the field if the specified value is greater than the existing value |
| $push | Adds an item to an array |
| $pop | Removes first or last item from an array |
| $pull | Removes items from array that match a condition |
| $addToSet | Adds an item to an array only if it doesn’t already exist |
| $setOnInsert | Sets a field only during upsert operation |
| $bit | Performs bitwise operations (AND, OR, XOR) on integer fields |
2. Query Comparison Operators (used in find())
| Operator | Description |
| --- | --- |
| $eq | Equal to |
| $ne | Not equal to |
| $gt | Greater than |
| $gte | Greater than or equal to |
| $lt | Less than |
| $lte | Less than or equal to |
| $in | Matches any value in an array |
| $nin | Matches none of the values in an array |
3. Logical Operators
| Operator | Description |
| --- | --- |
| $and | Matches all the given conditions |
| $or | Matches any one of the conditions |
| $not | Inverts the effect of a query expression |
| $nor | Matches if none of the conditions are true |
4. Element Operators
| Operator | Description |
| --- | --- |
| $exists | Checks if a field exists or not |
| $type | Checks the data type of a field |
5. Evaluation Operators
| Operator | Description |
| --- | --- |
| $regex | Matches string using regular expression |
| $expr | Allows the use of aggregation expressions in queries |
| $mod | Performs modulo operation to match values |
| $text | Performs text search (requires text index) |
| $where | Matches documents using a JavaScript expression (not recommended for performance/security) |
Example Usage of Operators:
// Set new age
db.students.updateOne({ name: "Amit" }, { $set: { age: 22 } })
// Increment age (the empty filter matches every document)
db.students.updateMany({}, { $inc: { age: 1 } })
// Find students with age > 20
db.students.find({ age: { $gt: 20 } })
// Find students whose name is either "Amit" or "Priya"
db.students.find({ $or: [ { name: "Amit" }, { name: "Priya" } ] })
// Remove feesPaid field
db.students.updateOne({ name: "Raj" }, { $unset: { feesPaid: "" } })
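What $set, $inc, and $unset do to a single document can be sketched in plain JavaScript. This is an illustration only, not MongoDB code: applyUpdate is a hypothetical helper that handles just these three operators on top-level fields.

```javascript
// Plain-JavaScript model of three update operators; NOT MongoDB code.
// applyUpdate is a hypothetical helper covering only $set, $inc and $unset
// on top-level fields.
function applyUpdate(doc, update) {
  const result = { ...doc }; // work on a shallow copy, leave the input intact
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value; // create or overwrite the field
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + amount; // a missing field starts at 0
  }
  for (const field of Object.keys(update.$unset || {})) {
    delete result[field]; // remove the field entirely
  }
  return result;
}

const raj = { name: "Raj", age: 22, course: "BCA", feesPaid: false };
console.log(applyUpdate(raj, { $set: { age: 23 } }));
console.log(applyUpdate(raj, { $inc: { age: 1 } }));
console.log(applyUpdate(raj, { $unset: { feesPaid: "" } }));
```

The value given to $unset ("" here) is ignored; only the field name matters.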
Let’s assume the students collection has the following documents:
[
{ name: "Amit", age: 20, course: "BCA", feesPaid: false },
{ name: "Priya", age: 21, course: "BSc", feesPaid: true },
{ name: "Raj", age: 22, course: "BCA", feesPaid: false },
{ name: "Neha", age: 20, course: "BSc", feesPaid: true },
{ name: "John", age: 21, course: "BCA", feesPaid: false }
]
Part A: updateOne() Exercises
Exercise 1: Update a student's age
Task: Change Amit’s age to 21.
db.students.updateOne(
  { name: "Amit" },
  { $set: { age: 21 } }
)
Exercise 2: Mark fees as paid for a specific student
Task: Update Raj's feesPaid to true.
db.students.updateOne(
  { name: "Raj" },
  { $set: { feesPaid: true } }
)
Exercise 3: Add a new field (e.g., "grade") to one student
Task: Add a grade: "A" to Neha's record.
db.students.updateOne(
  { name: "Neha" },
  { $set: { grade: "A" } }
)
Exercise 4: Increment a student's age by 1
Task: Increase John’s age by 1.
db.students.updateOne(
  { name: "John" },
  { $inc: { age: 1 } }
)
Part B: updateMany() Exercises
Exercise 5: Add a common field to all documents
Task: Add status: "Enrolled" to all students.
db.students.updateMany(
  {},
  { $set: { status: "Enrolled" } }
)
Exercise 6: Update course name for a group
Task: Change course from "BCA" to "BCA Honors".
db.students.updateMany(
  { course: "BCA" },
  { $set: { course: "BCA Honors" } }
)
Exercise 7: Increment age of all BSc students by 1
db.students.updateMany(
  { course: "BSc" },
  { $inc: { age: 1 } }
)
Exercise 8: Reset feesPaid to false for all students
db.students.updateMany(
  {},
  { $set: { feesPaid: false } }
)
Exercise 9: Set grade to "Pending" if it doesn't exist
db.students.updateMany(
  { grade: { $exists: false } },
  { $set: { grade: "Pending" } }
)
Bonus Challenge Exercises for Students
1. Add a contact field to all students with a default number.
2. Change all students aged more than 21 to have a status "Senior".
3. Add a lastUpdated field with the current date using $currentDate.
4. For all students whose feesPaid is false, set dueAmount: 10000.
Collection: products
db.products.insertMany([
  { name: "Laptop", brand: "Dell", price: 50000, stock: 25, categories: ["Electronics", "Computers"] },
  { name: "Smartphone", brand: "Samsung", price: 30000, stock: 50, categories: ["Electronics", "Mobile"] },
  { name: "Headphones", brand: "Boat", price: 1500, stock: 200, categories: ["Electronics", "Audio"] },
  { name: "Keyboard", brand: "Logitech", price: 1200, stock: 100, categories: ["Computers", "Accessories"] },
  { name: "Monitor", brand: "LG", price: 8000, stock: 15, categories: ["Electronics", "Display"] }
])
Part A: Basic Updates
1. Update the price of "Laptop" to ₹55,000
db.products.updateOne(
  { name: "Laptop" },
  { $set: { price: 55000 } }
)
2. Increase the stock of "Smartphone" by 10 units
db.products.updateOne(
  { name: "Smartphone" },
  { $inc: { stock: 10 } }
)
3. Add a new field rating: 4.5 to "Monitor"
db.products.updateOne(
  { name: "Monitor" },
  { $set: { rating: 4.5 } }
)
4. Reduce the price of "Headphones" by ₹500
db.products.updateOne(
  { name: "Headphones" },
  { $inc: { price: -500 } }
)
5. Set stock: 0 for "Keyboard"
db.products.updateOne(
  { name: "Keyboard" },
  { $set: { stock: 0 } }
)
Part B: Advanced Field Operations
6. Remove the categories field from "Monitor"
db.products.updateOne(
  { name: "Monitor" },
  { $unset: { categories: "" } }
)
7. Rename brand to manufacturer for all products
db.products.updateMany(
  {},
  { $rename: { brand: "manufacturer" } }
)
8. Update price to ₹1999 where price < ₹2000
db.products.updateMany(
  { price: { $lt: 2000 } },
  { $set: { price: 1999 } }
)
9. Add discounted: true for products priced > ₹30000
db.products.updateMany(
  { price: { $gt: 30000 } },
  { $set: { discounted: true } }
)
Array Operations
10. Add "Gadgets" to "Smartphone" categories list
db.products.updateOne(
  { name: "Smartphone" },
  { $push: { categories: "Gadgets" } }
)
11. Add "Audio" only if it doesn't exist in "Headphones"
db.products.updateOne(
  { name: "Headphones" },
  { $addToSet: { categories: "Audio" } }
)
12. Remove "Accessories" from "Keyboard" categories
db.products.updateOne(
  { name: "Keyboard" },
  { $pull: { categories: "Accessories" } }
)
13. Add "Peripherals" to products having "Computers" in categories
db.products.updateMany(
  { categories: "Computers" },
  { $addToSet: { categories: "Peripherals" } }
)
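The difference between $push, $addToSet, and $pull is easiest to see on a plain array. The helpers below are a hypothetical plain-JavaScript sketch for arrays of simple values, not MongoDB code.

```javascript
// Plain-JavaScript model of three array update operators; NOT MongoDB code.
// The helper names are hypothetical and only handle arrays of simple values.
function push(arr, item) {
  return [...arr, item]; // always appends, duplicates allowed
}

function addToSet(arr, item) {
  return arr.includes(item) ? [...arr] : [...arr, item]; // appends only if absent
}

function pull(arr, item) {
  return arr.filter((x) => x !== item); // removes every matching item
}

const categories = ["Electronics", "Audio"];
console.log(push(categories, "Audio"));     // [ 'Electronics', 'Audio', 'Audio' ]
console.log(addToSet(categories, "Audio")); // [ 'Electronics', 'Audio' ]
console.log(pull(categories, "Audio"));     // [ 'Electronics' ]
```

Use $addToSet when the array should behave like a set, and $push when duplicates are acceptable.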
Delete Operations
1. deleteOne()
Deletes a single document that matches the specified filter.
db.collection.deleteOne(<filter>)
db.students.deleteOne({ name: "John" })
Deletes the first document where name is "John".
2. deleteMany()
Deletes all documents that match the specified filter.
db.collection.deleteMany(<filter>)
db.students.deleteMany({ class: "10A" })
Deletes all documents where class is "10A".
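The "first match" versus "all matches" distinction can be sketched on a plain array. This is a hypothetical plain-JavaScript model, not MongoDB code; the sample records and the class values are made up for illustration.

```javascript
// Plain-JavaScript model of deleteOne vs deleteMany; NOT MongoDB code.
// Both helpers are hypothetical and take a predicate instead of a filter document.
function deleteOne(docs, matches) {
  const i = docs.findIndex(matches); // position of the first match, or -1
  return i === -1 ? [...docs] : [...docs.slice(0, i), ...docs.slice(i + 1)];
}

function deleteMany(docs, matches) {
  return docs.filter((d) => !matches(d)); // keep only the non-matching documents
}

const roster = [
  { name: "John", class: "10A" },
  { name: "Asha", class: "10B" },
  { name: "John", class: "10A" },
];
console.log(deleteOne(roster, (d) => d.name === "John").length);  // 2
console.log(deleteMany(roster, (d) => d.class === "10A").length); // 1
```

With two documents named John, deleteOne removes only the first, while deleteMany removes every match.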
MongoDB Compass (GUI Tool)
Download:
[Link]
Use it to:
o View collections and documents
o Run queries
o Monitor performance
o Create databases visually
MongoDB Architecture
What is Architecture in a Database?
Think of architecture like the blueprint of a house — it shows how
different rooms (components) are arranged and how they work together.
In MongoDB, the architecture explains how the database engine, storage,
clients, and servers communicate and store data.
Key Components of MongoDB Architecture
Client
These are applications or users who want to access the database.
They send requests to read or write data.
Examples: A Python script, [Link] app, or MongoDB Compass GUI.
MongoDB Server (mongod)
This is the heart of MongoDB. It runs as a background service.
It handles:
o Data storage
o Query processing
o Authentication
o Replication
It stores data in databases → collections → documents.
Databases, Collections, Documents
Think of it like this:
🏢 Database = School
📂 Collection = Class (e.g., students)
📄 Document = Student record (JSON)
MongoDB stores data as documents (BSON format) inside collections, which
belong to databases.
MongoDB Shell (mongosh)
A command-line tool to interact with the database.
Used by developers/admins for running commands and queries.
MongoDB - Data Storage
MongoDB is a NoSQL, document-oriented database that stores data in
a flexible, JSON-like format called BSON (Binary JSON). It is designed
for scalability, high performance, and ease of development. Here's how
data storage in MongoDB works:
1. Basic Storage Hierarchy
MongoDB stores data in the following hierarchy:
Database → Collections → Documents → Fields
| Level | Description |
| --- | --- |
| Database | A container for collections, like a folder. |
| Collection | A group of related documents, like a table in SQL. |
| Document | The actual record, stored in BSON format (like JSON). |
| Field | A key-value pair in a document (like a column). |
2. BSON Format
BSON is a binary representation of JSON documents.
It supports additional data types like Date, Int, Double, Binary, etc.
Example document:
{
"_id": ObjectId("507f1f77bcf86cd799439011"),
"name": "Paneer Tikka",
"price": 250,
"category": "Starter",
"available": true
}
3. Flexible Schema (Schema-less Design)
Documents in the same collection can have different fields.
No need to predefine structure like in relational DBs.
This makes MongoDB very flexible and ideal for agile development.
Example – same collection with two different documents:
{ "name": "Pizza", "price": 300 }
{ "name": "Burger", "price": 150, "size": "medium" }
4. Storage Engine
MongoDB uses a storage engine to manage how data is stored on disk.
Default Engine: WiredTiger
Supports compression to save disk space.
Provides concurrency control and journaling for crash recovery.
Keeps frequently used data in its own internal cache and also relies on the
operating system's filesystem cache.
Optional: In-Memory Storage Engine (MongoDB Enterprise)
Stores all data in RAM.
Very fast, but non-persistent.
5. Indexing for Fast Access
MongoDB automatically creates an index on _id field. You can add more
indexes to optimize queries.
db.collection.createIndex({ category: 1 })
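Why an index speeds up lookups can be shown with a small in-memory model. This is a sketch under loud assumptions: MongoDB indexes are B-tree based, whereas the hypothetical buildIndex helper below uses a simple Map, and the menu items other than "Paneer Tikka" are made up.

```javascript
// Hypothetical in-memory model of a single-field index; NOT MongoDB code
// and not how MongoDB stores indexes (those are B-tree based).
const menu = [
  { _id: 1, name: "Paneer Tikka", category: "Starter" },
  { _id: 2, name: "Pizza", category: "Main" },
  { _id: 3, name: "Spring Roll", category: "Starter" },
];

function buildIndex(docs, field) {
  const index = new Map(); // field value -> list of matching _id values
  for (const doc of docs) {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc._id);
  }
  return index;
}

const byCategory = buildIndex(menu, "category");
// One lookup in the index replaces a scan over every document.
console.log(byCategory.get("Starter")); // [ 1, 3 ]
```

The trade-off is that every insert or update must also maintain the index, so indexes speed up reads at some cost to writes.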
6. Data Compression
MongoDB (via WiredTiger) uses:
Snappy Compression (default): Balanced performance and size.
Zlib: Higher compression, slower.
Zstd: High compression ratio, good performance (MongoDB 4.2+)
7. Write Ahead Logging (Journaling)
MongoDB uses journaling to prevent data loss.
All changes are written to a journal file before being written to the actual
database.
This ensures durability and data recovery in case of a crash.
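The write-ahead idea can be sketched as a toy store that records each change before applying it. This is a plain-JavaScript illustration, not MongoDB's journal format: TinyStore and its methods are hypothetical, and the array and Map merely stand in for the journal file and data files.

```javascript
// Toy model of write-ahead logging; NOT MongoDB's actual journal format.
// TinyStore is a hypothetical class for illustration.
class TinyStore {
  constructor() {
    this.journal = [];     // stands in for the journal file
    this.data = new Map(); // stands in for the data files
  }

  write(key, value) {
    this.journal.push({ key, value }); // 1. record the change first
    this.data.set(key, value);         // 2. then apply it
  }

  // Rebuild the data by replaying journal entries in their original order.
  static recover(journal) {
    const store = new TinyStore();
    for (const { key, value } of journal) {
      store.write(key, value);
    }
    return store;
  }
}

const store = new TinyStore();
store.write("stock:laptop", 25);
store.write("stock:laptop", 24);
// Simulate a crash: assume the data is lost but the journal survives.
const recovered = TinyStore.recover(store.journal);
console.log(recovered.data.get("stock:laptop")); // 24
```

Because every change reaches the journal before the data, replaying the journal after a crash restores the latest state.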
8. Replica Sets and Sharding (Advanced Storage)
Replica Set: Copies of data for redundancy and high availability.
Sharding: Splits large datasets across multiple servers (horizontal
scaling).
Storage Engine and Performance in MongoDB
In MongoDB, the storage engine is a critical component that manages how data is stored,
accessed, and modified on disk or in memory. It directly impacts the database's performance,
scalability, and reliability.
MongoDB supports multiple storage engines, with the most prominent being:
1. WiredTiger (Default since MongoDB 3.2):
Overview: A high-performance, scalable storage engine optimized for modern
hardware.
Key Features:
o Document-level concurrency: Uses optimistic concurrency control, allowing
multiple write operations to proceed concurrently without locking the entire
collection.
o Compression: Supports data compression (snappy, zlib, or zstd for collections;
snappy for the journal) to reduce disk usage, improving I/O performance.
o Cache management: Uses an internal cache to keep frequently accessed data
in memory, reducing disk I/O.
o Checkpoints: Periodically writes consistent snapshots to disk, ensuring
durability and faster recovery.
o Journaling: Logs operations for crash recovery, balancing durability and
performance.
Performance Benefits:
o High throughput for read and write operations due to fine-grained locking.
o Efficient use of disk space with compression.
o Scales well with multi-core CPUs and large datasets.
Use Case: General-purpose workloads, including high-concurrency applications, large
datasets, and mixed read/write operations.
2. In-Memory Storage Engine:
Overview: Stores data entirely in RAM for ultra-low latency and high throughput.
Key Features:
o Optimized for workloads requiring real-time processing and minimal latency.
o No disk I/O, as data resides in memory.
o Supports durability via replication (not persistent on disk).
Performance Benefits:
o Extremely fast read/write operations due to in-memory storage.
o Ideal for caching, real-time analytics, or transient data.
Trade-offs:
o Limited by available RAM.
o Data loss risk on server restart unless paired with replication.
Use Case: High-performance, low-latency applications like real-time analytics or
session stores.
3. MMAPv1 (Deprecated):
Overview: The original storage engine used in earlier MongoDB versions. It was
deprecated in MongoDB 4.0 and removed in MongoDB 4.2.
Key Features:
o Maps data files to virtual memory, relying on the operating system for
caching.
o Collection-level locking, which can lead to contention in write-heavy
workloads.
Performance Drawbacks:
o Less efficient concurrency compared to WiredTiger.
o Higher disk usage due to lack of compression.
Use Case: Rarely used in modern deployments due to inferior performance and
scalability.
Replica Sets and Sharding
In MongoDB, replication and sharding are two key mechanisms used to ensure
high availability, scalability, and performance in distributed database systems.
Below is a concise explanation of each:
Replica (Replication)
Replication is the process of maintaining multiple copies of data across different
servers (nodes) to ensure high availability and data redundancy.
How it works:
MongoDB uses a replica set, which is a group of nodes (typically 3 or
more) that maintain identical copies of the data.
One node is the primary node, which handles all write operations. The
other nodes are secondary nodes, which replicate the primary's data and
can serve read operations (configurable).
If the primary node fails, an election process among the secondary nodes
selects a new primary to ensure continuous availability.
Replication provides fault tolerance and data durability by ensuring
data is available even if a node goes down.
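The failover step can be pictured with a toy model. This sketch is heavily simplified and is not MongoDB's real election protocol, which also involves voting rounds, priorities, and timeouts; the function and the field names (name, up, lastOp) are hypothetical.

```javascript
// Toy model of replica set failover; NOT MongoDB's real election protocol.
// The field names (name, up, lastOp) are hypothetical.
function electPrimary(members) {
  const reachable = members.filter((m) => m.up);
  // Electing a primary needs a majority of all members to be reachable.
  if (reachable.length * 2 <= members.length) {
    return null; // no majority: no primary, so writes are not accepted
  }
  // Prefer the reachable member whose copy of the data is most recent.
  let best = reachable[0];
  for (const m of reachable) {
    if (m.lastOp > best.lastOp) best = m;
  }
  return best.name;
}

const replicaSet = [
  { name: "node1", up: false, lastOp: 120 }, // the failed primary
  { name: "node2", up: true, lastOp: 118 },
  { name: "node3", up: true, lastOp: 120 },
];
console.log(electPrimary(replicaSet)); // node3
```

The majority rule is one reason replica sets usually have an odd number of members: a three-node set can lose one node and still elect a primary.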
Key Features:
Automatic failover: If the primary fails, a secondary takes over.
Data redundancy: Multiple copies protect against data loss.
Read scaling: Reads can be distributed to secondary nodes (with
eventual consistency).
Use Case: Replication is ideal for ensuring high availability, disaster
recovery, and read-heavy workloads.
Sharding
Sharding is the process of distributing data across multiple servers (shards) to
handle large datasets and high throughput by dividing the data into smaller
chunks.
How it works:
MongoDB partitions a sharded collection into smaller ranges of data called
chunks and distributes them across the shards, each running on a different
server (in production, usually a replica set).
A shard key (a field or combination of fields) determines how data is
partitioned across shards.
The config server stores metadata about the shards, and the mongos
router directs queries to the appropriate shard(s).
Sharding enables horizontal scaling by distributing data and query load
across multiple machines.
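How a shard key decides where a document lives can be sketched with a toy hash router. This is plain JavaScript under stated assumptions, not MongoDB's real hashing or chunk logic: hashKey and pickShard are hypothetical helpers, and the user IDs are made up.

```javascript
// Toy model of hash-based shard routing; NOT MongoDB's real hashing or
// chunk management. hashKey and pickShard are hypothetical helpers.
function hashKey(value) {
  let h = 0;
  for (const ch of String(value)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the hash within 32 bits
  }
  return h;
}

function pickShard(shardKeyValue, shardCount) {
  return hashKey(shardKeyValue) % shardCount; // shard index 0..shardCount-1
}

const shardCount = 3;
const shards = Array.from({ length: shardCount }, () => []);
for (const userId of ["u1001", "u1002", "u1003", "u1004", "u1005", "u1006"]) {
  shards[pickShard(userId, shardCount)].push(userId);
}
console.log(shards); // each inner array lists the keys routed to one shard
```

The same key always maps to the same shard, which is what lets a router such as mongos send a query for one user to a single shard instead of all of them.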
Key Features:
Scalability: Handles large datasets and high write/read throughput by
distributing workload.
Flexible distribution: Supports range-based or hash-based sharding for
even data distribution.
Parallel processing: Queries can be executed on multiple shards
simultaneously.
Use Case: Sharding is used for applications with massive datasets or high
traffic, such as social media platforms or e-commerce systems.
Key Differences
| Feature | Replication | Sharding |
| --- | --- | --- |
| Purpose | High availability and data redundancy | Scalability and data distribution |
| Data Distribution | Same data on all nodes | Data split across nodes |
| Scales | Read performance (via secondaries) | Read and write performance (via shards) |
| Failure Handling | Automatic failover to secondary | No failover; relies on replication |
| Complexity | Simpler to set up | More complex (requires shard key, config servers, etc.) |
How They Work Together
In a production MongoDB deployment, replication and sharding are
often combined. Each shard can be a replica set, ensuring both
scalability (via sharding) and high availability (via replication).
For example, a large dataset is split into shards, and each shard is
replicated across multiple nodes in a replica set to ensure fault tolerance.
Summary
Replication ensures data availability and redundancy by maintaining
identical copies across nodes.
Sharding enables scalability by distributing data across multiple servers.
Together, they provide a robust solution for handling large-scale, high-
availability applications in MongoDB.