DBMS Unit 5 Macro
Uploaded by

kkartikb18

Advantages of Distributed Database System:
Distributed database management has been introduced for various reasons, ranging from organizational decentralization to data processing at lower cost.
1. Management of distributed data with different levels of transparency: A DBMS should hide the details of where each data item (such as a table or relation) is physically stored within the system.
(a) Distribution/Network transparency: All internal network operations are hidden from the user. Network transparency may be divided into location transparency and naming transparency.
(b) Fragmentation transparency: Fragmentation is the process of decomposing the database into multiple smaller units. Fragmentation transparency makes the user unaware of the existence of data fragments.
(c) Replication transparency: Replication is copying data to multiple sites and locations for better availability and reliability. Replication transparency makes the user unaware of the existence of copies of data.
2. Increased reliability: Reliability is defined as the probability that a system is running (working) at any point of time. In a distributed database system, if one system fails or is down for some time, another system can take over all its functions, so the overall system is not affected.
3. Increased availability: Availability is defined as the probability that the system is continuously available (accessible) during a certain time interval for any database operation. In a distributed database system, if one system is not available (due to failure) for some time, another system can take over all its functions, so the system remains available.
4. Improved performance: A distributed DBMS fragments the database and tries to keep data close to the site where it is needed most of the time.

Disadvantages of Distributed Database System:
1) The system becomes complex to manage and control.
2) Security issues must be carefully managed.
3) The system requires deadlock handling during transaction processing; otherwise the entire system may end up in an inconsistent state.
4) Some standardization is needed for processing in a distributed database system.

Distributed DBMS: The database files are stored at geographically different locations across the network. As the data is distributed over the network, it takes time to synchronize and is therefore difficult to maintain. If one database fails, the user can still access the other database files. Because the database is distributed, data can be replicated, so there can be some data inconsistency.

Centralized DBMS: The database is stored at a centralized location. A centralized database is easier to maintain and keep updated since all the data is stored in a single location. If the centralized database fails, there is no access to the database. It has a single database system, so there is no data replication and therefore no data inconsistency.

NoSQL Database:
• NoSQL stands for "not only SQL".
• It is a non-tabular database system that stores data differently from relational tables. There are various types of NoSQL databases, such as document, key-value, wide column and graph.
• Using NoSQL we can maintain flexible schemas, and these schemas scale easily with large amounts of data.

Need: NoSQL database technology is usually adopted for the following reasons -
1) NoSQL databases are often used for handling big data as a part of the fundamental architecture.
2) NoSQL databases are used for storing and modelling structured, semi-structured and unstructured data.
3) NoSQL is used for efficient execution of a database with high availability.
4) A NoSQL database is non-relational, so it scales out better than relational databases, and it can be designed with web applications in mind.
5) NoSQL is used for easy scalability.

Features:
1) NoSQL does not follow any relational model.
2) It is either schema-free or has a relaxed schema, which means it does not require a specific schema definition.
3) Multiple NoSQL databases can be executed in a distributed fashion.
4) It can process both unstructured and semi-structured data.
5) NoSQL has higher scalability.
6) It is cost effective.
7) It supports data in the form of key-value pairs, wide columns and graphs.

Types of NoSQL Databases:
1. Key-Value Store: The key-value pair is the simplest type of NoSQL database.
• It is designed to handle lots of data and heavy load.
• In key-value storage the key is unique and the value can be JSON, a string or a binary object.
• For example: {"Customer": [ {"id":1, "name":"Ankita"}, {"id":2, "name":"Kavita"} ] }. Here id and name are the keys, and 1, 2, "Ankita" and "Kavita" are the values corresponding to those keys.
Key-value stores help the developer store schema-less data. They work best for data such as shopping cart contents. DynamoDB, Riak and Redis are some famous examples of key-value stores.
2. Document Store: The document store makes use of key-value pairs to store and retrieve data.
• The document is stored in the form of XML or JSON.
• Document stores appear the most natural among the NoSQL database types.
• They are the most commonly used because of their flexibility and the ability to query on any field.
• For example: {"id":101, "Name": "AAA", "City": "Pune"}.
MongoDB and CouchDB are two popular document-oriented NoSQL databases.
3. Graph: The graph database is typically used in applications where the relationships among the data elements are an important aspect. The connections between elements are called links or relationships. In a graph database, connections are first-class elements of the database and are stored directly. In relational databases, links are implied, using data to express the relationships. A graph database has two components -
1) Node: The entities themselves. For example: people, students.
2) Edge: The relationships among the entities.
Graph databases are mostly used for social networks, logistics and spatial data. Examples of graph databases are Neo4J, Infinite Graph and OrientDB.
4. Wide Column Store: The wide column store model is similar to the traditional relational database. In this model, the columns are created for each row rather than being predefined by the table structure.
• In this model the number of columns is not fixed for each record.
• Column databases can quickly aggregate the values of a given column.
Column store databases are widely used to manage data warehouses and business intelligence. HBase and Cassandra are examples of column-based databases.

BASE Properties: Relational databases strongly follow the ACID properties (Atomicity, Consistency, Isolation and Durability), while NoSQL databases follow the BASE properties. The BASE properties consist of:
1) Basically Available: The system is guaranteed to be available in the event of failure.
2) Soft State: Even without an input, the system state may change.
3) Eventual Consistency: The system will become consistent over time.

ACID Vs BASE:
ACID:
1. It stands for atomicity, consistency, isolation and durability.
2. It provides strong consistency.
3. Availability is less important.
4. Evolution is difficult.
5. It possesses expensive joins and relationships.
6. It has high maintenance costs.
7. It provides vertical scaling.
BASE:
1. It stands for basic availability, soft state and eventual consistency.
2. It provides weak consistency.
3. Availability of the system is more important.
4. Evolution is easy.
5. It is free from joins and relationships.
6. It has low maintenance cost.
7. It provides horizontal scaling.

RDBMS VS NoSQL:
RDBMS:
1. The relational database system is based on relationships among the tables.
2. It is vertically scalable.
3. It has a predefined schema.
NoSQL:
1. It is a non-relational database system. It can be used in a distributed environment.
2. It is horizontally scalable.
3. It does not have a schema, or it may have a relaxed schema.

Types Of Data:
Structured data: Information stored in a database is represented in a strict format, hence it is called structured data. Data recorded in the form of rows and columns is also a type of structured data. Only 2% of the data available in the world exists in the form of structured data. The DBMS checks to ensure that all the data follows the structure and constraints specified in the schema.
1. It has a fixed and organized form.
2. It is schema dependent and less flexible.
3. Structured query languages are used to access the data present in the schema.
4. The storage requirement for the data is low.
5. Examples: phone numbers, customer names, Social Security numbers.
Semi-structured data: It is a combination of structured and unstructured data.
1. It is more flexible than structured data but less flexible than unstructured data.
2. Tags and elements are used to access the data.
3. The storage requirement for the data is significant.
4. Examples: server logs, tweets organized by hashtags, emails sorted by the inbox, sent or draft folders.
Unstructured data: It is not a predefined or organized form of data.
1. It is the most flexible data.
2. Only textual queries are possible.
3. The storage requirement for the data is huge.
4. Examples: emails and messages, image files, open-ended survey answers.

CAP Theorem:
The CAP theorem is also called Brewer's theorem. The
CAP theorem comprises three components (hence its name) as they relate to distributed data stores:
o Consistency: All reads receive the most recent write or an error.
o Availability: All reads contain data, but it might not be the most recent.
o Partition tolerance: The system continues to operate despite network failures (e.g., dropped partitions, slow network connections, or unavailable network connections between nodes).
• The CAP theorem states that it is not possible to guarantee all three of the desirable properties (consistency, availability and partition tolerance) at the same time in a distributed system with data replication.

BASE Consistency Model:
1. BASE Introduction: Relational databases have rules that decide the behaviour of database transactions. The ACID model maintains the atomicity, consistency, isolation and durability of database transactions. NoSQL replaces the ACID model with the BASE model.
2. Data storage: NoSQL databases use the concept of a key/value store. There are no schema restrictions for a NoSQL database. It simply stores values for each key and distributes them across the database, which offers efficient retrieval.
3. BASE Guidelines:
Basic availability - The database is accessible to all users most of the time.
Soft state - Data stores need not be write-consistent, and different replicas need not be mutually consistent all the time.
Eventual consistency - Data stores exhibit consistency at some later point; this is also called lazy read or write.
4. Redundancy and Scalability: To add redundancy to a database, we can add duplicate nodes and configure replication. Scalability is simply a matter of adding additional nodes. A hash function can be designed to allocate data to servers.
5. The BASE properties are much softer constraints than the ACID properties. There is no direct mapping between the two consistency models. A BASE data store values availability, but it cannot guarantee consistency of replicated data at write time.

SQL VS NOSQL:
SQL:
1. Full form is Structured Query Language.
2. SQL is a declarative query language.
3. SQL databases work on the ACID properties: Atomicity, Consistency, Isolation, Durability.
4. Structured and organized data.
5. A relational database is table based.
6. Data and its relationships are stored in separate tables.
7. Tight consistency.
8. Examples: MySQL, Oracle, MS SQL, PostgreSQL, SQLite, DB2.
NOSQL:
1. Full form is Not Only SQL or non-relational database.
2. It has no declarative query language.
3. NoSQL databases follow Brewer's CAP theorem: Consistency, Availability, Partition tolerance.
4. Unstructured and unpredictable data.
5. Key-value pair storage, column store, document store, graph databases.
6. No predefined schema.
7. Eventual consistency rather than the ACID properties.
8. Examples: MongoDB, Big Table, Neo4j, CouchDB, Cassandra, HBase.

MongoDB:
• MongoDB is an open source, document-based database.
• It is developed and supported by a company named 10gen, which is now known as MongoDB Inc.
• The first production-ready version of MongoDB was released in March 2010.
There are many efficient RDBMS products available in the market, but modern applications require big data, faster development and flexible deployment. This need is satisfied by document-based databases like MongoDB.

Features of MongoDB:
1) It is a schema-less, document-based database system.
2) It provides high performance data persistence.
3) It supports multiple storage engines.
4) It has rich query language support.
5) MongoDB provides high availability and redundancy with the help of replication. That means it creates multiple copies of the data and sends these copies to different servers, so that if one server fails the data is retrieved from another server.
6) MongoDB provides horizontal scalability with the help of sharding. Sharding means distributing data across multiple servers.
7) In MongoDB, any field in a document can be indexed with a primary or secondary index, so data can be searched very efficiently.

Data Types: Following are the data types supported by MongoDB.
1) Integer: Used for storing numerical values.
2) Boolean: Used for storing Boolean values, i.e. true or false.
3) Double: Used for storing floating point data.
4) String: The most commonly used data type, used for storing string values.
5) Min/Max keys: Used to compare a value against the lowest or highest BSON element.
6) Arrays: Used for storing an array or list of multiple values in one key.
7) Object: Used for embedded documents.
8) Symbol: Similar to the string data type; used to store a specific symbol type.
9) Null: Used for storing null values.
10) Date: Used to store the current date or time. We can also create our own date or time object.
11) Binary data: Used to store binary data.
12) Regular expression: Used to store a regular expression.

Aggregation: Aggregation is an operation that processes data and returns computed results. It groups the data from multiple documents and operates on the grouped data to produce a combined result. MongoDB aggregation is equivalent to count(*) with group by in SQL. MongoDB supports the concept of an aggregation framework. The typical pipeline of the aggregation framework is as follows:
1) $match stage - filters the documents we need to work with, i.e. those that fit our needs.
2) $group stage - does the aggregation job.
3) $sort stage - sorts the resulting documents the way we require (ascending or descending).
The input of the pipeline can be one or several collections. The pipeline then performs successive transformations on the data and the result is obtained.

Map Reduce: Map reduce is a data processing programming model that helps in performing operations on large data sets and producing aggregate results. Map reduce is used for large volumes of data. Its parts are:
1) map function: It uses the emit() function, which takes two parameters, a key and a value. The key is the field on which we make groups (such as group by name or age), and the value is the field on which the aggregation, such as avg() or sum(), is calculated.
2) reduce function: The function in which we perform aggregate functions like avg() and sum().
3) out: Specifies the name of the collection where the result will be stored.
4) query: The query used to filter the result set.
5) sort: Specifies the optional sort criteria.
6) limit: Specifies the optional maximum number of documents to be returned.

Comparison MongoDB and RDBMS:
MongoDB:
1. MongoDB is schema-less.
2. Complex joins are avoided.
3. It is a document-based database.
4. MongoDB has its own ad-hoc query language with a rich feature set.
5. Easy to scale.
6. Mapping application objects to database objects is not needed.
7. Data access is faster.
8. MongoDB automatically uses all free memory available on the machine as cache memory.
RDBMS:
1. RDBMS requires a schema.
2. Complex joins are used.
3. It is a table-based database.
4. RDBMS has SQL.
5. Not easy to scale.
6. RDBMS requires mapping application objects to database objects.
7. Data access is comparatively slower than in MongoDB.
8. This feature is provided in RDBMS.

Database Replication and Sharding: Database replication means creating multiple copies of data. Database sharding is related to horizontal partitioning in databases for the purpose of improving availability of data. Horizontal partitioning means separating the rows of one table into multiple different tables, which are also called database shards. In such a case each partition has the same schema and columns, but a different set of rows. The data held in each partition is unique and independent of the data in the other partitions. Vertical partitioning separates entire columns out of a table and puts them into new tables. The data held within one vertical partition is independent of the data in all the other vertical partitions.
1. Sharding Process: The sharding process involves breaking up the data on one server into multiple smaller chunks or parts, which are called logical shards. The logical shards are distributed across separate database servers or nodes. A physical server that holds one or more logical shards is called a physical shard. All shards together represent the entire logical dataset.
2. Shared-Nothing Architecture: In this architecture the shards do not share any data or computing resources. Certain tables can be replicated into each shard to serve as reference tables. A table containing data that is necessary for a computation must be kept in all shards, which ensures that all of the data required for the computation is held in every shard. It is possible to implement sharding at the application level, in which case the application includes code that decides which shard the data is sent to. A few DBMSs have sharding capabilities built in, allowing you to implement sharding directly at the database level.
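
The notes on sharding mention that sharding can be implemented at the application level, with application code deciding which shard receives each record, and that a hash function can be designed to allocate data to servers. A minimal sketch of that idea follows. It is an illustration only: the names `shard_for` and `ShardedStore` are hypothetical, each shard is just an in-memory dictionary standing in for a database server, and SHA-256 is an assumed choice of hash, not one named in the notes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index with a stable hash (assumed: SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy store: each dict stands in for one shard (one database node)."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        # The application, not the database, picks the target shard.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        # The same hash sends the read to the shard that holds the key.
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
for i in range(100):
    store.put(f"customer:{i}", {"id": i})

print(store.get("customer:7"))          # {'id': 7}
print([len(s) for s in store.shards])   # rows held by each shard
```

Each key lives in exactly one shard, so the shards hold disjoint sets of rows with the same structure, which matches the description of horizontal partitioning. Note that with plain modulo hashing, changing the number of shards remaps most keys; that limitation is one reason real systems often use other placement schemes.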
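
The Aggregation section of these notes describes a typical pipeline of a $match stage, a $group stage and a $sort stage. The sketch below imitates those three stages in plain Python on a small in-memory list, so it can run without a MongoDB server. The `orders` data, its field names and the function `run_pipeline` are made up for illustration; the comment shows what an equivalent MongoDB pipeline would plausibly look like under those assumed field names.

```python
from collections import defaultdict

# Hypothetical sample documents (not from the notes).
orders = [
    {"customer": "Ankita", "city": "Pune", "amount": 250},
    {"customer": "Kavita", "city": "Pune", "amount": 100},
    {"customer": "Ankita", "city": "Pune", "amount": 50},
    {"customer": "AAA", "city": "Mumbai", "amount": 500},
]

# Roughly equivalent MongoDB pipeline for the assumed fields:
# [{"$match": {"city": "Pune"}},
#  {"$group": {"_id": "$customer",
#              "total": {"$sum": "$amount"},
#              "count": {"$sum": 1}}},
#  {"$sort": {"total": -1}}]


def run_pipeline(docs):
    # Stage 1 ($match): keep only the documents we need to work with.
    matched = [d for d in docs if d["city"] == "Pune"]

    # Stage 2 ($group): group by customer and aggregate each group.
    groups = defaultdict(lambda: {"total": 0, "count": 0})
    for d in matched:
        group = groups[d["customer"]]
        group["total"] += d["amount"]
        group["count"] += 1
    grouped = [{"_id": key, **agg} for key, agg in groups.items()]

    # Stage 3 ($sort): order the grouped results by total, descending.
    return sorted(grouped, key=lambda g: g["total"], reverse=True)


for row in run_pipeline(orders):
    print(row)
# {'_id': 'Ankita', 'total': 300, 'count': 2}
# {'_id': 'Kavita', 'total': 100, 'count': 1}
```

The grouping step here plays the same role as the map and reduce functions described in the Map Reduce section: the group key corresponds to the key passed to emit(), and the running sum corresponds to the reduce step.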
Study material provided by: Vishwajeet Londhe

Join the community using the links below:

Telegram Channel (for all engineering resources)
https://t.me/SPPU_TE_BE_COMP

WhatsApp Channel (for all tech updates)
https://whatsapp.com/channel/0029ValjFriICVfpcV9HFc3b

Insta Page (for all engg & tech updates)
https://www.instagram.com/sppu_engineering_update
