Introduction To NoSQL

Uploaded by

shivaraj BG

NoSQL Big Data Management, MongoDB and Cassandra

1. Introduction
2. NoSQL Data Store
• NoSQL
• Schema-less Models
• Increasing flexibility for Data Manipulation
3. NoSQL Data Architecture Patterns
• Key-value store
• Document store
• Tabular data
• Object data store
• Graph Database
• Variations of NoSQL Architectural patterns
4. NoSQL to Manage Big Data
• Using NoSQL to Manage Big Data
5. Shared-Nothing Architecture for Big Data Tasks
• Choosing the distribution models
• Ways of Handling Big Data problems
6. MongoDB Databases
7. Cassandra Databases
This chapter covers NoSQL data architecture patterns, the management of Big Data, data
distribution models, and the handling of Big Data problems using NoSQL, with MongoDB as
the document store and Cassandra as the columnar store.

Learning Objectives:
1. Get conceptual understanding of NoSQL data stores, big data solutions, schema-less
models, and increased flexibility for data manipulation.
2. Get knowledge of NoSQL data architecture patterns, namely key-value pairs, tabular,
column family, Bigtable, record columnar (RC), optimized row columnar (ORC) and
Parquet, document, object and graph data stores, and the variations in architectural
patterns.
3. Get conceptual understanding of NoSQL data store management, applications and
handling problems in big data.
4. Solve Big Data analytics problems using shared-nothing architecture, choose a
distribution model between the master-slave and peer-to-peer models, and learn the four
ways in which NoSQL handles Big Data problems.
5. Apply the MongoDB databases and query commands.
6. Use the Cassandra databases, data model, clients, and integrate them with Hadoop.
Learning Outcome:
1. A new category of data stores is NoSQL (Not Only SQL) databases. NoSQL is an
altogether new approach of thinking about data stores.
2. The NoSQL data model relaxes one or more of the ACID properties (atomicity,
consistency, isolation, durability) and instead follows the CAP theorem and BASE
(basically available, soft state, eventual consistency).
3. NoSQL DBs possess greater flexibility for data manipulation (compared to SQL).
4. NoSQL data does not need a fixed schema. The data model may drop support for joins in
a Big Data environment.
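
The schema-less idea in outcome 4 can be illustrated with a minimal sketch. It uses a plain Python list of dictionaries as a stand-in for a document collection; the names `products`, `find` and `fields_in_use` are hypothetical and do not belong to any NoSQL product's API.

```python
from collections import defaultdict

# Hypothetical in-memory "collection": each document is a plain dict,
# and documents need not share the same fields (no fixed schema).
products = [
    {"_id": 1, "name": "T-shirt", "price": 12.5, "sizes": ["S", "M", "L"]},
    {"_id": 2, "name": "Laptop", "price": 899.0, "specs": {"ram_gb": 16}},
    {"_id": 3, "name": "Gift card"},  # this document has no price field
]


def find(collection, **criteria):
    """Return the documents whose fields equal all the given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


def fields_in_use(collection):
    """Count how many documents carry each field name."""
    counts = defaultdict(int)
    for doc in collection:
        for key in doc:
            counts[key] += 1
    return dict(counts)
```

A query such as `find(products, name="Laptop")` works even though the three documents have different fields, and `fields_in_use(products)` shows which fields are only partially populated. In a relational table the same variation would require nullable columns or a schema change.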
Introduction
• Big Data uses distributed systems.
• A distributed system consists of multiple data nodes and distributed software
components.
• The tasks are executed in parallel.
The following are the features of a distributed-computing architecture:
1. Increased reliability and fault tolerance.
2. Flexibility.
3. Sharding, which stores different parts of the data on different sets of data nodes,
clusters or servers.
4. Speed.
5. Scalability.
6. Resource sharing.
7. Open system, which makes the services accessible to all nodes.
8. Performance.
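
Sharding (feature 3) can be sketched as follows. This is a minimal illustration that assumes hash-based placement, which is only one of several possible sharding strategies; the function names and the `customer-N` keys are hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    # A cryptographic digest is used only because it is stable across runs,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(keys, num_shards):
    """Group keys by the shard (data node) that would store them."""
    shards = {index: [] for index in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


# Example: spread 100 hypothetical customer keys over 4 data nodes.
placement = distribute([f"customer-{i}" for i in range(100)], 4)
```

Each key always lands on the same shard, so a lookup only needs to contact one node. A known drawback of simple modulo placement is that changing the number of shards moves most keys, which is why production systems often use consistent hashing or range-based schemes instead.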
The following are the demerits of distributed computing:
1. Issues in troubleshooting in a larger networking infrastructure.
2. Additional software requirements.
3. Security risks for data and resources.

1. Solutions for Troubleshooting Issues in a Larger Networking Infrastructure
• Effective Monitoring and Management
• Centralized Logging and Diagnostics
• Automation and Orchestration
2. Solutions for Additional Software Requirements
• Containerization
• Virtualization
• Microservices Architecture
3. Solutions for Security Risks for Data and Resources
• Encryption
• Access Controls, Security Policies and Training, Security Tools
• Security Audits
Software Used for NoSQL Big Data Management
1. NoSQL Databases:
• MongoDB: A popular document-oriented NoSQL database.
• Cassandra: A distributed column-family NoSQL database.
• Couchbase: A key-value and document-oriented NoSQL database.
• HBase: A distributed and scalable column-family store for Big Data.
• Neo4j: A graph database for managing highly interconnected data.
2. Big Data Processing Frameworks:
• Apache Hadoop: A framework for distributed storage and batch processing.
• Apache Spark: A fast and versatile data processing engine for real-time and batch
processing.
• Apache Flink: A stream processing framework for real-time analytics.
• Apache Kafka: A distributed data streaming platform for real-time data ingestion and
processing.
3. Data Ingestion and ETL Tools:
• Apache NiFi: An open-source data integration tool for automating data flows.
• Talend: A data integration and transformation tool for Big Data.
• Apache Flume: A distributed data collection and aggregation system.
4. Data Warehousing and Analytics:
• Amazon Redshift: A cloud-based data warehousing solution.
• Google BigQuery: A serverless, highly scalable data warehouse.
• Snowflake: A cloud-based data warehousing platform.
5. Data Visualization and BI Tools:
• Tableau: A popular data visualization tool.
• Power BI: Microsoft's business intelligence and data visualization tool.
• QlikView/Qlik Sense: Business intelligence and data discovery software.
6. Machine Learning and AI Frameworks:
• TensorFlow: An open-source machine learning framework.
• PyTorch: An open-source deep learning framework.
• Scikit-Learn: A machine learning library for Python.
7. Data Security and Governance:
• Apache Ranger: A framework for centralized security and governance for Big Data.
• Apache Sentry: A system for role-based access control in Big Data environments.
8. Monitoring and Management Tools:
• Cloudera Manager: A management and monitoring tool for Hadoop clusters.
• Hortonworks Data Platform (HDP): An open-source platform for Big Data
management.
• Datadog: A cloud-based monitoring and analytics platform for real-time data insights.
9. Containerization and Orchestration:
• Docker: A platform for containerization of applications.
• Kubernetes: An open-source container orchestration system.
10. Data Storage:
• Hadoop Distributed File System (HDFS): A distributed file system for storing Big
Data.
• Amazon S3: A scalable cloud-based object storage service.
• Google Cloud Storage: Google's object storage solution.

NoSQL Big Data Management

Use Case: Retail Analytics with NoSQL Big Data Management
Scenario: A retail company wants to analyze sales data, customer behavior, and inventory
management in real time to optimize its operations and enhance the customer experience.
Data Ingestion: Data is ingested from various sources, such as point-of-sale (POS) systems,
e-commerce platforms, and inventory databases.
• Tools: Apache NiFi and Apache Kafka for real-time data streaming.
Data Storage: Data is stored in a NoSQL database for flexible and scalable data management.
• Database: MongoDB for document storage.
Real-Time Data Processing: Real-time data processing is performed to monitor sales and
inventory.
• Framework: Apache Spark for real-time analytics.
Machine Learning and Predictive Analytics: Machine learning models are applied to predict
customer preferences and optimize inventory.
• Framework: TensorFlow for model development.
Data Warehouse: Aggregated and processed data is loaded into a data warehouse for historical
analysis and reporting.
• Data Warehouse: Amazon Redshift for historical data storage.
Data Visualization and Reporting: Data is visualized and reported to provide insights to
decision-makers.
• Tools: Tableau for creating dashboards and reports.
Data Security and Governance: Data access controls and governance policies are enforced.
• Framework: Apache Ranger for access control.
Monitoring and Management: The entire system is monitored and managed to ensure stability
and performance.
• Tools: Cloudera Manager and Datadog for monitoring.
Containerization and Orchestration: The entire system can be containerized for scalability and
ease of management.
• Tools: Docker for containerization, Kubernetes for orchestration.
Data Archiving: Older data is archived for compliance and historical analysis.
• Storage: Amazon S3 for cost-effective data archiving.
Outcome:
• The retail company can monitor real-time sales data, predict inventory needs, optimize
pricing, and offer personalized recommendations to customers.
• Decision-makers can access interactive dashboards and historical reports to make
informed decisions.
• Data is securely managed, and the system can scale horizontally as data volumes
increase.
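
The sales-monitoring and inventory steps of this use case can be sketched in a few lines. This is only an in-memory illustration of the logic; it does not use Kafka, Spark or MongoDB, and the event fields (`sku`, `qty`, `price`) and function names are assumptions made for the example.

```python
from collections import defaultdict


def aggregate_sales(events):
    """Sum units sold and revenue per SKU from a stream of sale events."""
    totals = defaultdict(lambda: {"units": 0, "revenue": 0.0})
    for event in events:
        entry = totals[event["sku"]]
        entry["units"] += event["qty"]
        entry["revenue"] += event["qty"] * event["price"]
    return dict(totals)


def needs_reorder(stock, totals, threshold):
    """Return SKUs whose stock, after recorded sales, falls below threshold."""
    return sorted(
        sku for sku, on_hand in stock.items()
        if on_hand - totals.get(sku, {"units": 0})["units"] < threshold
    )


# Hypothetical POS events and opening stock levels.
events = [
    {"sku": "A", "qty": 2, "price": 5.0},
    {"sku": "B", "qty": 1, "price": 10.0},
    {"sku": "A", "qty": 3, "price": 5.0},
]
totals = aggregate_sales(events)
to_reorder = needs_reorder({"A": 6, "B": 50}, totals, threshold=5)
```

In the architecture described above, the events would arrive through the ingestion layer and the aggregation would run in the processing framework; the sketch only shows the kind of computation involved.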
