Module 7 NoSQL

Uploaded by

koustavdas23052004

Module 7: NOSQL DB

Introduction to NoSQL

NoSQL (Not Only SQL) databases are designed to handle large volumes of unstructured and semi-structured
data. Unlike traditional relational databases that rely on fixed schemas and tables, NoSQL offers flexible data
models and supports horizontal scaling. This makes them well-suited for modern applications that require high
performance, scalability, and the ability to manage diverse data types efficiently.
Key Features of NoSQL Databases

● Dynamic schema: Data can be reshaped to meet new requirements without the need to migrate or
change schemas.
● Horizontal scalability: Capacity grows by adding more nodes to the cluster, which distributes storage
and load across multiple servers and accommodates bigger datasets and much higher traffic.
● Document-based: Data is stored in flexible, semi-structured formats such as JSON/BSON (e.g.,
MongoDB).
● Key-value-based: Data is stored as pairs of keys and values, giving a simple but fast access pattern
(e.g., Redis).
● Column-based: Data is organized into columns instead of rows (e.g., Cassandra).
● Distributed and high availability: Data is replicated across multiple nodes in a database cluster, and
node failures are handled automatically to keep the system highly available.
● Flexibility: Developers can store and retrieve data in a flexible and dynamic manner, with
support for multiple data types and changing data structures.
● Performance: Well suited to big data, real-time analytics, and high-volume applications.

Challenges of NoSQL Databases

● Lack of standardization: NoSQL systems differ widely from one another, which makes it
harder to choose the right one for a specific use case.
● Limited ACID compliance: Many NoSQL databases relax consistency guarantees, which is a
disadvantage for applications that need strict data integrity.
● Narrow focus: They are strong at storage but often lack features such as transaction management,
where relational databases excel.
● Limited complex query support: Most are not designed to handle complex queries, so they are not
a good fit for applications that require complex data analysis or reporting.
● Lack of maturity: Being relatively new, NoSQL systems may not have the reliability, security, and
feature set of traditional relational databases.
● Management complexity: For large datasets, maintaining a NoSQL database can be considerably more
complicated than managing a relational database.
● Limited GUI tools: Some NoSQL databases offer GUI tools (for example, MongoDB Compass for
MongoDB), but not all NoSQL databases provide flexible or user-friendly ones.
SQL vs. NoSQL:

Feature | SQL (Relational DB) | NoSQL (Non-Relational DB)
Data Model | Structured, Tabular | Flexible (Documents, Key-Value, Graphs)
Scalability | Vertical Scaling | Horizontal Scaling
Schema | Predefined | Dynamic & Schema-less
ACID Support | Strong | Limited or Eventual Consistency
Best For | Transactional applications | Big data, real-time analytics
Examples | MySQL, PostgreSQL, Oracle | MongoDB, Cassandra, Redis

Popular NoSQL Databases & Their Use Cases

NoSQL Database | Type | Use Cases
MongoDB | Document-based | Content management, product catalogs
Redis | Key-Value Store | Caching, real-time analytics, session storage
Cassandra | Column-Family Store | Big data, high availability systems
Neo4j | Graph Database | Fraud detection, social networks

Use of NoSQL
● Big Data Applications: Efficiently stores and processes massive amounts of unstructured and
semi-structured data.
● Real-Time Analytics: Supports fast queries and analysis for use cases like recommendation engines
or fraud detection.
● Scalable Web Applications: Handles high traffic and large user bases by scaling horizontally across
servers.
● Flexible Data Storage: Manages diverse data formats (JSON, key-value, documents, graphs) without
rigid schemas.
Reasons to Choose NoSQL

A growing business faces many challenges and opportunities, so it needs future-proof planning. Tools and
technologies that fit your application today may not fit it tomorrow. Choosing the right database is part of
architecting the application, and it is a difficult decision for organizations. Suppose you choose a database
based on the current situation and a small number of users. What happens after a couple of years? If the
user base grows, you will start to face scalability issues and several other problems on your website. If the
site cannot handle the growth in users, the business will suffer and may even fail.

Developers have long argued about which database is best suited to a given application. Let us see why you
might choose NoSQL. Before that, let's look at how to choose between relational and non-relational
databases.

Relational or non-relational?
Both relational and non-relational databases store information; the difference lies in how they are built, the
kind of information they store, and how they store it. This section discusses some scenarios where you might
choose a non-relational database over a relational one for your application. It covers some features of
NoSQL, but remember that no single technology or database fits every case. Each one has pros and cons.
To make a choice, first ask a few questions about the needs of your application. The answers will help you
decide whether NoSQL is the best fit for your application or not.
● Can your application support your projected user volume growth?
● Do you need to scale your application to cope with user demand and activity?
● How much money and time can be saved with 0% downtime?
● Would your application benefit from rapid development cycles? (flexible data model)
● Does your application generate huge amounts of data?

Why NoSQL?
For over forty years, relational databases have been the go-to solution for data storage. Data in a relational
database, queried with Structured Query Language (SQL), is highly organized, much like a phone book, and
the system is designed for reliable transactions. However, as applications grow and need to handle millions
or even billions of queries in real time, relational databases start to face scalability issues. Large websites like
Facebook, Google, and Amazon generate massive amounts of data and queries, and relational databases
struggle to keep up. This is where NoSQL comes into play.
NoSQL databases are designed to address two primary concerns:
● High speed operations
● Flexibility in data storage

These features make NoSQL ideal for handling massive amounts of unstructured data where the data
requirements aren't clearly defined at the start. Now, let's discuss five important features of NoSQL databases
that make them a great option for large-scale applications.
Why Choose NoSQL for Large-Scale Applications

1. Multi-Model

Relational databases require a fixed schema with tables and columns, and they need adjustments when
requirements change. This can result in the need for new columns, additional relationships, and coordination
with database administrators. NoSQL databases, on the other hand, provide much more flexibility. They don't
require predefined schemas and allow you to store different types of data together.
● Flexible Schema: NoSQL allows changes as your application evolves.
● Agile Development: Ideal for fast-paced, agile projects that need quick implementation.
● Handle Various Data Types: NoSQL lets you add new data types without restructuring the entire
database.

2. Easily Scalable

A primary reason to choose NoSQL is its scalability. Relational databases can be scaled, but the process is
often complicated and costly. It typically requires server upgrades (vertical scaling) or "sharding" (splitting
data into smaller parts across multiple servers), and either can lead to downtime.
● Effortless Scaling: Many NoSQL databases (for example, Cassandra) use a masterless, peer-to-peer
architecture.
● Easy Expansion: Adding new servers to the cluster can be done quickly with minimal downtime.
● Performance Boost: Spreading the load across more nodes improves performance and read/write
throughput.
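
The idea of sharding mentioned above can be illustrated with a small sketch. It assumes a simple hash-modulo placement rule; the names shard_for and ShardedStore are hypothetical and do not come from any particular database. Production systems often use more elaborate schemes (for example, consistent hashing) so that adding a server moves fewer keys.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (illustrative rule only)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy store that spreads keys over several in-memory 'servers' (dicts)."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=3)
for user_id in ["user:1", "user:2", "user:3", "user:4", "user:5", "user:6"]:
    store.put(user_id, {"id": user_id})

# Each shard holds only part of the data; together they hold all of it.
shard_sizes = [len(shard) for shard in store.shards]
print(shard_sizes, store.get("user:4"))
```

Because every key always hashes to the same shard, a read goes straight to the one server that owns the key instead of searching all of them.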

3. Distributed

NoSQL databases are designed to work on a global scale by distributing data across multiple locations,
including various data centers or cloud regions. This distribution improves both write and read operations,
which is something relational databases struggle with since they are usually centralized.
● Global Distribution: Data is distributed across multiple locations for greater reliability and
accessibility.
● Continuous Availability: NoSQL databases are built to keep systems running smoothly, even if one
node fails.

4. Redundancy and Zero Downtime

One major concern in building applications is the risk of hardware failure. NoSQL databases are designed to
address this at the architectural level to provide high availability. For example, Cassandra uses various
strategies to detect node failure, and Riak is designed to keep operating through network partitions and to
repair inconsistent replicas afterwards.
● Multiple Data Copies: Data is stored on several nodes, ensuring access even if one node fails.
● Zero Downtime: NoSQL databases ensure your application remains available, even during hardware
failures.
● Built-In Redundancy: No need for developers to create custom redundant solutions.
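
As a rough illustration of the "multiple data copies" point, the following sketch writes each value to every live node and reads from any live node that has it. The Node and ReplicatedStore classes are hypothetical and greatly simplified; real databases also handle failure detection, repair of nodes that missed writes, and conflicting updates.

```python
class Node:
    """One storage node; 'alive' is flipped to simulate a hardware failure."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data = {}
        self.alive = True


class ReplicatedStore:
    def __init__(self, nodes) -> None:
        self.nodes = nodes

    def put(self, key, value) -> int:
        """Write to every live replica and return how many copies were made."""
        written = 0
        for node in self.nodes:
            if node.alive:
                node.data[key] = value
                written += 1
        if written == 0:
            raise RuntimeError("no live replica available for write")
        return written

    def get(self, key):
        """Read from the first live replica that holds the key."""
        for node in self.nodes:
            if node.alive and key in node.data:
                return node.data[key]
        raise KeyError(key)


nodes = [Node("a"), Node("b"), Node("c")]
store = ReplicatedStore(nodes)
copies = store.put("cart:42", ["book"])

nodes[0].alive = False                      # simulate one node failing
value_after_failure = store.get("cart:42")  # still served by another replica
print(copies, value_after_failure)
```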

5. Big Data Applications

NoSQL databases are optimized for handling large volumes of data quickly, making them the perfect solution
for big data applications. They ensure that data doesn’t become a bottleneck in your system, allowing your
application to run seamlessly even when dealing with vast amounts of data.
● Handles Massive Data: NoSQL is designed to scale and process large data sets.
● Optimized Performance: Helps avoid data bottlenecks in fast, high-volume environments.

When to Use NoSQL vs. SQL

While NoSQL offers many advantages, it’s important to note that it isn’t the right fit for every application. For
certain types of data, SQL may still be a better choice.
● Transactional Data: If your application is focused on transactional data, SQL is the ideal choice.
SQL databases are designed for processing and managing transactions effectively.
● Analytical Data: For applications that handle large amounts of analytical data, NoSQL is
often more suitable. Traditional row-oriented SQL databases were designed primarily for transactions
and can struggle with very large analytical datasets.

The CAP Theorem in DBMS

The inherent trade-offs in networked shared-data system design make it very difficult to create a dependable and
effective system. The CAP theorem, or CAP principle, is a central foundation for comprehending these
trade-offs in distributed systems. The CAP theorem emphasizes the limitations that system designers have while
addressing distributed data replication. It states that only two of the three properties—consistency, availability,
and partition tolerance—can be concurrently attained by a distributed system.
Developers must carefully balance these attributes according to their particular application demands because of
this underlying restriction. Designers may decide which qualities to prioritize to obtain the best performance
and reliability for their systems by knowing the CAP theorem. This article will provide a thorough analysis of
all the properties given in the CAP theorem, investigate the associated trade-offs, and talk about how these ideas
relate to distributed systems in the real world.

What is the CAP Theorem?

The CAP theorem is a fundamental concept in distributed systems theory that was first proposed by Eric Brewer
in 2000 and subsequently proved by Seth Gilbert and Nancy Lynch in 2002. It asserts that the following three
properties cannot all be guaranteed at the same time in any distributed data system:
1. Consistency

Consistency means that all the nodes (databases) in a network hold the same copy of a replicated data item,
as seen by every transaction. Every node in a distributed cluster returns the same, most recent successful
write, so every client has the same view of the data. There are various types of consistency models.
Consistency in CAP refers to linearizability (also called atomic consistency), a very strong form of consistency.
Note that Consistency in ACID and Consistency in CAP are different concepts. In CAP, it refers to the
consistency of the values in different copies of the same data item in a replicated distributed system. In ACID,
it means that a transaction will not violate the integrity constraints specified on the database schema.
For example, a user checks his account balance and sees that he has 500 rupees. He spends 200 rupees on
some products, so 200 must be deducted, changing his account balance to 300 rupees. This change must be
committed and communicated to all other databases that hold this user's details. Otherwise there will be
inconsistency, and another database might still show his account balance as 500 rupees, which is not true.

2. Availability

Availability means that each read or write request for a data item will either be processed successfully or will
receive a message that the operation cannot be completed. Every non-failing node returns a response for all the
read and write requests in a reasonable amount of time. The key word here is "every". In simple terms, every
node (on either side of a network partition) must be able to respond in a reasonable amount of time.
For example, user A is a content creator with 1000 other users subscribed to his channel. Another user B, who
is far away from user A, tries to subscribe to user A's channel. Because the two users are so far apart,
they are connected to different database nodes of the social media network. If the distributed system provides
availability, user B must be able to subscribe to user A's channel.
3. Partition Tolerance

Partition tolerance means that the system can continue operating even if the network connecting the nodes has a
fault that results in two or more partitions, where the nodes in each partition can communicate only with each
other. The system continues to function and upholds its guarantees in spite of network partitions. Network
partitions are a fact of life. Distributed systems that guarantee partition tolerance can recover gracefully once
the partition heals.
For example, take the same social media network, where two users are trying to find the subscriber count of a
particular channel. A technical fault causes a network outage, and the second database, to which user B is
connected, loses its connection with the first database. The subscriber count is therefore shown to user B
from a replica of the data in database 1 that was copied before the network outage. Hence the distributed
system is partition tolerant.

The CAP theorem states that distributed databases can have at most two of the three properties: consistency,
availability, and partition tolerance. As a result, database systems prioritize only two properties at a time.
The Trade-Offs in the CAP Theorem

The CAP theorem implies that a distributed system can only provide two out of three properties:

1. CA (Consistency and Availability)

These systems always accept the user's request to view or modify data, and they always respond with data that
is consistent across all the database nodes of a large distributed network.

However, such distributed systems are not realizable in the real world, because when a network failure
occurs there are only two options: either send old data that was replicated before the failure, or do not
allow users to access the existing data. Choosing the first option makes the system Available; choosing the
second makes it Consistent.
Consistency and availability cannot both be guaranteed in a distributed system that must tolerate partitions.
To achieve CA, the system has to be monolithic, so that when one user updates the state of the system, all
other users accessing it see the new changes, which maintains consistency. Because all users are connected to
a single system, it is also available. Such systems are generally not preferred when distributed computing is
required, which is only possible when consistency or availability is sacrificed for partition tolerance.
Example databases: MySQL
2. AP (Availability and Partition Tolerance)

These systems are distributed in nature and ensure that user requests to view or modify data in the database
nodes are not dropped and are processed even in the presence of a network partition.

The system prioritizes availability over consistency and may respond with stale data that was replicated from
other nodes before the partition occurred. Such design choices are common for social media websites such as
Facebook, Instagram, and Reddit, and for online content websites such as YouTube, blogs, and news sites,
where strict consistency is usually not required. For these services unavailability is the bigger problem,
because users may move to another platform and the company loses money.
The system can be distributed across multiple nodes and is designed to operate reliably even in the face of
network partitions.
Example databases: Amazon DynamoDB, Apache Cassandra. (Google Cloud Spanner favors strong consistency and
is generally classified as CP.)

3. CP (Consistency and Partition Tolerance)

These systems are distributed in nature and ensure that, in the presence of a network partition, user requests
to view or modify data in the database nodes are rejected rather than answered with inconsistent data.

The system prioritizes consistency over availability and does not allow users to read crucial data from a
replica that was copied before the network partition occurred. Consistency is chosen over availability for
critical applications where the latest data matters, such as stock market, ticket booking, and banking
applications, where showing old data to users causes problems.
For example, in a train ticket booking application, one seat remains available. A replica of the database is
created and sent to other nodes of the distributed system. A network outage then occurs, so a user connected
to the partitioned node fetches details from this replica. Meanwhile, a user connected to the unpartitioned
part of the network books the last remaining seat. The user on the partitioned node still sees one seat
available, so the data is inconsistent. It would be better to show this user an error, making the system
unavailable to them while maintaining consistency. Hence consistency is chosen in such scenarios.
Example databases: Apache HBase, MongoDB, Redis.
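
The difference between the AP and CP choices can be sketched with the train-seat example. This is a toy model under stated assumptions: the Replica class, its mode flag, and the simulate function are hypothetical and only mimic the behavior described above, not how any real database implements it.

```python
class Unavailable(Exception):
    """Raised by a CP replica that refuses to answer during a partition."""


class Replica:
    def __init__(self, mode: str, seats_left: int) -> None:
        self.mode = mode              # "AP" or "CP" (assumed labels)
        self.seats_left = seats_left  # local copy of the replicated value
        self.partitioned = False      # True when cut off from the other replica

    def read_seats(self) -> int:
        if self.partitioned and self.mode == "CP":
            # CP: refuse to answer rather than risk returning stale data.
            raise Unavailable("cannot confirm the latest value during a partition")
        # AP: always answer, even though the value may be stale.
        return self.seats_left


def simulate(mode: str):
    main = Replica(mode, seats_left=1)
    remote = Replica(mode, seats_left=1)  # replica copied before the outage
    remote.partitioned = True             # network outage isolates this replica
    main.seats_left = 0                   # last seat booked on the reachable side
    try:
        return remote.read_seats()
    except Unavailable:
        return "error"


ap_result = simulate("AP")  # stale answer: still reports one seat
cp_result = simulate("CP")  # request rejected to preserve consistency
print(ap_result, cp_result)
```

In the AP run the isolated replica answers with the old seat count, while in the CP run the same request is rejected, which matches the trade-off described in the two sections above.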
Types of NoSQL Databases

NoSQL databases can be classified into four main types, based on their data storage and retrieval methods:
1. Document-based databases
2. Key-value stores
3. Column-oriented databases
4. Graph-based databases

Each type has unique advantages and use cases, making NoSQL a preferred choice for big data applications,
real-time analytics, cloud computing and distributed systems.

1. Document-Based Database
A document-based database is a non-relational database. Instead of storing data in rows and columns
(tables), it stores data as documents, typically in JSON, BSON, or XML format.
Documents can be stored and retrieved in a form that is much closer to the data objects used in applications,
so less translation is required to use the data in application code. In a document database, individual
fields can be indexed for faster querying.
A collection is a group of documents with similar contents. Documents in a collection are not required to
share the same schema, because document databases have a flexible schema.

Key features of document database:

● Flexible schema: Documents in the same database need not share the same schema.
● Faster creation and maintenance: Documents are easy to create, and minimal maintenance is
required once a document has been created.
● No foreign keys: Documents can be independent of one another, with no enforced relationship
between them, so a document database does not require foreign keys.
● Open formats: Documents are built using open formats such as XML and JSON.
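
The flexible-schema idea can be shown with plain Python dictionaries standing in for JSON documents. This is only a sketch: the products list and the find helper are hypothetical and do not use any real database driver.

```python
import json

# A "collection" of documents; each document may have different fields.
products = [
    {"_id": 1, "name": "Laptop", "price": 55000,
     "specs": {"ram_gb": 16, "cpu": "8-core"}},
    {"_id": 2, "name": "T-shirt", "price": 499, "sizes": ["S", "M", "L"]},
]


def find(collection, **criteria):
    """Return documents whose top-level fields match all given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# A new kind of document can be added without changing any schema.
products.append({"_id": 3, "name": "E-book", "price": 199,
                 "download_url": "https://example.com/ebook"})

matches = find(products, name="T-shirt")
as_json = json.dumps(products[0])  # documents map directly to JSON text
print(matches, as_json)
```

Note how the nested specs object and the sizes list live inside their documents, close to how the application would hold them in memory.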

Popular Document Databases & Use Cases

Database | Use Case
MongoDB | Content management, product catalogs, user profiles
CouchDB | Offline applications, mobile synchronization
Firebase Firestore | Real-time apps, chat applications

2. Key-Value Stores
A key-value store is a non-relational database and the simplest form of NoSQL database. Every data element
is stored as a key-value pair, and the data is retrieved using the unique key assigned to each element.
Values can be simple data types such as strings and numbers, or complex objects. A key-value store is like a
relational table with only two columns: the key and the value.

Key features of the key-value store:

● Simplicity: Data retrieval is extremely fast due to direct key access.

● Scalability: Designed for horizontal scaling and distributed storage.
● Speed: Ideal for caching and real-time applications.
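
A minimal in-memory sketch of the key-value access pattern is shown below. The KeyValueStore class is hypothetical and wraps a Python dict; it is not the API of Redis or any other product.

```python
class KeyValueStore:
    """Toy key-value store: every operation goes directly through the key."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value) -> None:
        self._data[key] = value

    def get(self, key: str, default=None):
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        """Remove a key; return True if it existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False


sessions = KeyValueStore()
# Values may be simple types or complex objects.
sessions.put("session:abc123", {"user_id": 7, "cart": ["pen"]})
sessions.put("page_views", 1024)

print(sessions.get("session:abc123"), sessions.get("page_views"))
```

There is no query language here: the caller must know the key, which is what makes lookups simple and fast.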

Popular Key-Value Databases & Use Cases

Database | Use Case
Redis | Caching, real-time leaderboards, session storage
Memcached | High-speed in-memory caching
Amazon DynamoDB | Cloud-based scalable applications

3. Column Oriented Databases

A column-oriented database is a non-relational database that stores the data in columns instead of rows. That
means when we want to run analytics on a small number of columns, we can read those columns directly
without consuming memory with the unwanted data. Columnar databases are designed to read data more
efficiently and retrieve the data with greater speed. A columnar database is used to store a large amount of data.
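
The difference between row and column layouts can be sketched with ordinary Python structures. The sensor readings below are made-up sample values, and the variable names are hypothetical; the point is only that an aggregate over one column needs to read just that column.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"sensor": "s1", "temp_c": 21.5, "humidity": 40},
    {"sensor": "s2", "temp_c": 23.0, "humidity": 42},
    {"sensor": "s3", "temp_c": 22.0, "humidity": 45},
]

# Column-oriented layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Row layout: every record is visited even though only temp_c is needed.
avg_temp_rows = sum(row["temp_c"] for row in rows) / len(rows)

# Column layout: only the temp_c column is read.
avg_temp_cols = sum(columns["temp_c"]) / len(columns["temp_c"])

print(columns["temp_c"], avg_temp_rows, avg_temp_cols)
```

Storing similar values side by side is also what makes column data compress well, as noted in the feature list that follows.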

Key features of Column-Oriented Databases

● High Scalability: Supports distributed data processing.

● Compression: Columnar storage enables efficient data compression.
● Faster Query Performance: Best for analytical queries.

Popular Column-Oriented Databases & Use Cases

Database | Use Case
Apache Cassandra | Real-time analytics, IoT applications
Google Bigtable | Large-scale machine learning, time-series data
HBase | Hadoop ecosystem, distributed storage

4. Graph-Based Databases
Graph-based databases focus on the relationships between elements. They store data as nodes, and the
connections between nodes are called links or relationships (edges), which makes them ideal for
complex relationship-based queries.
● Data is represented as nodes (objects) and edges (connections).
● Fast graph traversal algorithms help retrieve relationships quickly.
● Used in scenarios where relationships are as important as the data itself.
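
The node-and-edge model and a simple traversal can be sketched as follows. The people, the FOLLOWS relationship, and the reachable_within function are hypothetical; the traversal is a standard breadth-first search limited to a number of hops, not the query engine of any specific graph database.

```python
from collections import deque

# Edges as (source, relationship, destination) triples.
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("carol", "FOLLOWS", "dave"),
    ("alice", "FOLLOWS", "erin"),
]

# Adjacency list: each node points directly at its neighbours.
adjacency = {}
for src, rel, dst in edges:
    adjacency.setdefault(src, []).append((rel, dst))


def reachable_within(start: str, max_hops: int) -> dict:
    """Breadth-first traversal returning {node: hops} for nodes within max_hops."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for _rel, neighbour in adjacency.get(current, []):
            if neighbour not in hops:
                hops[neighbour] = hops[current] + 1
                queue.append(neighbour)
    del hops[start]
    return hops


two_hops = reachable_within("alice", 2)
print(two_hops)  # {'bob': 1, 'erin': 1, 'carol': 2}
```

A "friends of friends" query is just this traversal with a limit of two hops, which is why such queries are natural in a graph model.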

Key features of Graph Database

● Relationship-Centric Storage: Well suited to social networks, fraud detection, and recommendation engines.
● Real-Time Query Processing: Relationship queries can be answered with low latency.
● Schema Flexibility: Easily adapts to evolving relationship structures.
Popular Graph Databases & Use Cases

Database | Use Case
Neo4j | Fraud detection, social networks
Amazon Neptune | Knowledge graphs, AI recommendations
ArangoDB | Multi-model database, cybersecurity

Comparison of NoSQL Database Types

Feature | Document-Based | Key-Value Store | Column-Oriented | Graph-Based
Data Model | JSON-like documents | Key-Value pairs | Columns instead of rows | Nodes & Relationships
Best Use Case | Semi-structured data | Fast lookups & caching | Analytics & big data | Relationship-heavy data
Query Performance | Moderate | Fast | High for analytics | Optimized for relationships
Schema | Flexible | Dynamic | Semi-structured | Schema-less
Scalability | Horizontal | Highly scalable | High horizontal | Scales with relationships
Examples | MongoDB, CouchDB | Redis, DynamoDB | Cassandra, HBase | Neo4j, Amazon Neptune

What is a Graph Database?

A graph database (GDB) is a database that uses graph structures for storing data. It uses nodes, edges, and
properties instead of tables or documents to represent and store data. The edges represent relationships between
the nodes. This makes connected data easier to retrieve, in many cases with a single operation. Graph databases
are commonly classified as NoSQL databases. Examples: Neo4j, Amazon Neptune, ArangoDB.

Representation:

The graph database is based on graph theory. The data is stored in the nodes of the graph and the relationship
between the data is represented by the edges between the nodes.
When do we need a Graph Database?

1. It solves Many-To-Many relationship problems

Relationships such as friends of friends are many-to-many relationships. A graph database is useful when the
equivalent query in a relational database would be very complex.

2. When relationships between data elements are more important

For example, each profile in a network holds some specific information, but the major selling point is the
relationships between the profiles, that is, how you are connected within the network.
In the same way, a graph database may hold many user data elements, but the relationships between them are
the deciding factor for the data stored in the graph database.

3. Low latency with large scale data

When you add many relationships to a relational database, the data sets become huge, and the queries over
them become more complex and slower than usual. Graph databases are designed specifically for this purpose,
so relationships can be queried with ease.

Why do Graph Databases matter?
Because graphs are good at handling relationships, some databases store data in the form of a graph.

For example, suppose we have a social network in which five friends are all connected: Anay, Bhagya,
Chaitanya, Dilip, and Erica. A relational database that stores their personal information may look something
like this:

id | first name | last name | email | phone
1 | Anay | Agarwal | [email protected] | 555-111-5555
2 | Bhagya | Kumar | [email protected] | 555-222-5555
3 | Chaitanya | Nayak | [email protected] | 555-333-5555
4 | Dilip | Jain | [email protected] | 555-444-5555
5 | Erica | Emmanuel | [email protected] | 555-555-5555

Now, we will also need another table to capture the friendship/relationship between users/friends. Our
friendship table will look something like this:

user_id | friend_id
1 | 2
1 | 3
1 | 4
1 | 5
2 | 1
2 | 3
2 | 4
2 | 5
3 | 1
3 | 2
3 | 4
3 | 5
4 | 1
4 | 2
4 | 3
4 | 5
5 | 1
5 | 2
5 | 3
5 | 4

We will avoid going deep into database theory (primary keys and foreign keys). Instead, just assume that the
friendship table uses the ids of both friends. Assume that our social network has a feature that lets every
user see the personal information of their friends. So, if Chaitanya requests this information, she needs
information about Anay, Bhagya, Dilip, and Erica. We will first approach this problem the traditional way
(relational database). We must first identify Chaitanya's id in the users table:

id | first name | last name | email | phone
3 | Chaitanya | Nayak | [email protected] | 555-333-5555

Now, we'd look for all tuples in the friendship table where the user_id is 3. Resulting relation would be
something like this:
user_id | friend_id
3 | 1
3 | 2
3 | 4
3 | 5

Now, let's analyse the time taken by this relational database approach. Each lookup takes approximately
log(N) steps, where N is the number of tuples in the friendship table, because the database keeps the rows
ordered (indexed) by id. So, in general, for M lookups we have a time complexity of about M*log(N). With a
graph database approach the cost would be lower: once we have located Chaitanya in the database, we need
only a single step to reach each of her friends, so finding M friends takes time proportional to M rather
than M*log(N).
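
The two approaches can be compared in a small sketch that reuses the five users and the friendship pairs from the tables above. The function names and the PersonNode class are hypothetical. For brevity the relational version scans the friendship list linearly, whereas a real relational database would use an index (the log(N) lookup discussed above).

```python
users = {1: "Anay", 2: "Bhagya", 3: "Chaitanya", 4: "Dilip", 5: "Erica"}

# Relational style: a separate table of (user_id, friend_id) rows.
friendship = [(u, f) for u in users for f in users if u != f]


def friends_relational(user_id: int) -> list:
    # Step 1: find matching rows in the friendship table.
    friend_ids = [f for (u, f) in friendship if u == user_id]
    # Step 2: look up each friend in the users table by id.
    return [users[f] for f in friend_ids]


# Graph style: each node holds direct references to its neighbours.
class PersonNode:
    def __init__(self, name: str) -> None:
        self.name = name
        self.friends = []


graph = {uid: PersonNode(name) for uid, name in users.items()}
for u, f in friendship:
    graph[u].friends.append(graph[f])


def friends_graph(user_id: int) -> list:
    # Once the start node is located, each friend is one step away.
    return [friend.name for friend in graph[user_id].friends]


print(friends_relational(3))
print(friends_graph(3))
```

Both functions return Anay, Bhagya, Dilip, and Erica for Chaitanya (id 3); the difference is that the graph version follows stored references instead of searching a separate table for every friend.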
Advantages: The graph model handles frequent schema changes, large volumes of data, real-time query
response requirements, and more intelligent data activation requirements.
Disadvantages: Graph databases aren't always the best solution for an application. The needs of the
application must be assessed before deciding on the architecture.
Limitations of Graph Databases:
● Graph databases may not be a better choice than other NoSQL variations.
● Scaling an application horizontally may lead to poor performance.
● They are not very efficient for updating all nodes that have a given parameter.
