Next Generation Technologies
Class: TYBSc IT, Sem: V
Chapter 2: NoSQL
Prepared by: Ms. Anuja Narvekar
NoSQL
“NoSQL is a new way of designing Internet-scale database solutions. It is not a product or
technology but a term that defines a set of database technologies that are not based on the
traditional RDBMS principles.”
SQL
The idea of the RDBMS was born from E.F. Codd’s 1970 paper titled “A relational model of
data for large shared data banks.” The language used to query RDBMS systems is SQL
(Structured Query Language).
RDBMS systems are well suited for structured data held in columns and rows, which can be
queried using SQL.
The RDBMS systems are based on the concept of ACID transactions. ACID stands for Atomic,
Consistent, Isolated, and Durable, where
• Atomic implies either all changes of a transaction are applied completely or not applied at
all.
• Consistent means the data is in a consistent state after the transaction is applied. This
means after a transaction is committed, the queries fetching a particular data will see the
same result.
• Isolated means the transactions that are applied to the same set of data are independent of
each other.
NoSQL
NoSQL is a term used to refer to non-relational databases.
NoSQL is usually expanded as “Not Only SQL”, which means that in addition to SQL other
complementary database solutions exist.
Thus, it encompasses the majority of the data stores that are not based on the conventional
RDBMS principles and are used for handling large data sets on an Internet scale.
Big data is posing challenges to the traditional ways of storing and processing data, such as
the RDBMS systems.
As a result, we see the rise of NoSQL databases, which are designed to process this huge
amount and variety of data within time and cost constraints.
Some examples of big data use cases that are a good fit for NoSQL databases are the
following:
Social Network Graph : Who is connected to whom? Whose post should be visible on the user’s
wall or homepage on a social network site?
Search and Retrieve : Search all relevant pages with a particular keyword ranked by the
Needs of NoSQL
Explosion of social media sites.
Rise of cloud-based solutions such as Amazon S3 (Simple Storage Service).
Expansion of the open-source community.
A NoSQL solution is more acceptable to a client now than it was a year ago.
Google, Facebook, Microsoft, etc. are examples of companies moving toward
NoSQL.
NoSQL Database Features
Schema-less
Non-relational
Commodity hardware
Highly distributable
CAP Theorem (Brewer’s Theorem)
Eric Brewer outlined the CAP theorem in 2000.
The theorem states that when designing an application in a distributed environment
there are three basic requirements, namely consistency, availability, and
partition tolerance, and that a distributed system can guarantee at most two of
the three at the same time.
Consistency means that the data remains consistent after any operation is
performed that changes the data, and that all users or clients accessing the
application see the same updated data.
Availability means that the system is always available: every request receives a
response.
Partition Tolerance means that the system will continue to function even if it is
partitioned into groups of servers that are not able to communicate with one
another.
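The trade-off between these three requirements can be sketched with a toy two-replica cluster. This is only an illustration, not how any particular database works: the Cluster class, its "CP"/"AP" mode switch, and the partitioned flag are hypothetical names introduced for this example.

```python
class Cluster:
    """Toy two-replica cluster; 'mode' is a hypothetical CP/AP switch."""

    def __init__(self, mode):
        self.mode = mode                 # "CP" or "AP"
        self.values = {"a": None, "b": None}
        self.partitioned = False         # True when a and b cannot communicate

    def write(self, replica, value):
        """Try to write through one replica; return True if accepted."""
        if not self.partitioned:
            # No partition: both replicas are updated together.
            self.values["a"] = value
            self.values["b"] = value
            return True
        if self.mode == "CP":
            # Keep consistency by refusing the write (availability is lost).
            return False
        # AP: stay available, but the replicas now disagree.
        self.values[replica] = value
        return True


cp = Cluster("CP")
cp.partitioned = True
cp_accepted = cp.write("a", 1)

ap = Cluster("AP")
ap.partitioned = True
ap_accepted = ap.write("a", 1)

print(cp_accepted, cp.values)
print(ap_accepted, ap.values)
```

During the partition, the CP cluster rejects the write and both replicas stay in agreement, while the AP cluster accepts it and the two replicas diverge until they can communicate again.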
The BASE
Eric Brewer coined the BASE acronym.
BASE can be explained as follows:
Basically Available means the system will be available in terms of the CAP theorem.
Soft state indicates that even if no input is provided to the system, the state will
change over time. This is in accordance with eventual consistency.
Eventual consistency means the system will attain consistency in the long run,
provided no input is sent to the system during that time.
Hence BASE is in contrast with the RDBMS ACID transactions.
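Eventual consistency, as defined above, can be illustrated with a small sketch. It assumes a simple "last write wins" rule based on a timestamp, which is only one of several possible strategies; the sync_replicas function and the ts/value fields are hypothetical names used for illustration.

```python
def sync_replicas(replicas):
    """Copy the newest (ts, value) pair to every replica (last write wins)."""
    newest = max(replicas, key=lambda r: r["ts"])
    ts, value = newest["ts"], newest["value"]
    for r in replicas:
        r["ts"], r["value"] = ts, value


# Three replicas that start out in agreement.
replicas = [{"ts": 0, "value": "old"} for _ in range(3)]

# A write reaches only the first replica.
replicas[0].update(ts=1, value="new")

# A read from another replica is stale until the replicas synchronise.
stale_read = replicas[2]["value"]

# With no further input, background synchronisation makes all copies agree.
sync_replicas(replicas)
fresh_read = replicas[2]["value"]

print(stale_read, fresh_read)
```

The read before synchronisation returns the old value; once no new writes arrive and the replicas exchange state, every copy returns the same value.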
You have seen that NoSQL databases are eventually consistent but the eventual
consistency implementation may vary across different NoSQL databases.
NRW is the notation used to describe how the eventual consistency model is implemented
across NoSQL databases, where:
N is the number of data copies that the database has maintained.
R is the number of copies that an application needs to refer to before returning a read
request’s output.
W is the number of data copies that need to be written to before a write operation is marked
as completed successfully.
Using different configurations of N, R, and W, databases implement the model of eventual
consistency.
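The effect of different N, R, and W settings can be sketched as follows. The rule used here (R + W > N means every read overlaps the latest write) is the commonly cited quorum condition and is not stated in these slides; the classify_nrw function is a hypothetical helper written for this example.

```python
def classify_nrw(n, r, w):
    """Classify an N/R/W configuration using the quorum rule R + W > N."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must each be between 1 and N")
    if r + w > n:
        # Any R copies read must include at least one of the W copies written.
        return "strong"
    # A read may touch only copies that the last write has not reached yet.
    return "eventual"


print(classify_nrw(3, 2, 2))
print(classify_nrw(3, 1, 1))
print(classify_nrw(3, 1, 3))
```

With N = 3, the settings R = 2, W = 2 and R = 1, W = 3 both give overlapping reads and writes, while R = 1, W = 1 allows a read to hit a copy that has not yet received the latest write.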
Advantages of NoSQL
High scalability
Manageability and administration
Low cost
Flexible data models
Support for unstructured data
Disadvantages of NoSQL
Maturity
Support
Limited query capabilities
Administration
Expertise
SQL vs NoSQL
Types of NoSQL Databases
Here are the four main types of NoSQL databases:
• Document databases
• Key-value stores
• Column-oriented databases
• Graph databases
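To give a rough feel for how the four types differ, the sketch below stores the same two users in each style using plain Python structures. It is a simplified illustration only: real products add indexing, distribution, and their own query languages, and all names and sample values here are made up.

```python
import json

# Document database: each record is a self-contained, nested document.
documents = [
    {"_id": 1, "name": "Asha", "city": "Mumbai", "follows": [2]},
    {"_id": 2, "name": "Ravi", "city": "Pune", "follows": []},
]

# Key-value store: an opaque value looked up by a single key.
key_value = {f"user:{d['_id']}": json.dumps(d) for d in documents}

# Column-oriented database (simplified): values grouped by column, not by row.
columns = {
    "name": {d["_id"]: d["name"] for d in documents},
    "city": {d["_id"]: d["city"] for d in documents},
}

# Graph database: nodes plus explicit, labelled relationships between them.
nodes = {d["_id"]: {"name": d["name"]} for d in documents}
edges = [(d["_id"], "FOLLOWS", target) for d in documents for target in d["follows"]]

# The same question ("which city does user 1 live in?") in three of the models.
city_from_document = next(d["city"] for d in documents if d["_id"] == 1)
city_from_key_value = json.loads(key_value["user:1"])["city"]
city_from_columns = columns["city"][1]

# A relationship question ("whom does user 1 follow?") suits the graph model.
followed_by_1 = [nodes[dst]["name"] for src, label, dst in edges
                 if src == 1 and label == "FOLLOWS"]

print(city_from_document, city_from_key_value, city_from_columns, followed_by_1)
```

The lookup by key is simplest in the key-value form, reading one attribute across many users is simplest in the column form, and following relationships is simplest in the graph form.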