Q1.
Define Big Data & explain its characteristics in detail
Big Data refers to large and complex datasets that cannot be efficiently stored, processed, or analyzed using traditional database management systems or tools. These datasets are characterized by their massive size, high speed of generation, and diverse formats. The concept of Big Data is not just about handling large volumes of data, but also about extracting meaningful insights and patterns to support decision-making.
Characteristics of Big Data (5 Vs):
- Volume: Huge amount of data from multiple sources (e.g., Facebook generates petabytes daily).
- Velocity: Speed of data generation and processing (e.g., stock market transactions).
- Variety: Structured, semi-structured, and unstructured formats (e.g., databases, JSON, videos).
- Veracity: Accuracy and reliability of data (e.g., filtering false data from social media).
- Value: Extracting useful insights for decision making (e.g., personalized recommendations).
Additional Vs sometimes cited: Variability, Visualization.
Importance: Helps in designing systems, ensures accurate analytics, and provides business value.
Q2. Classify the types of data with examples
Types of Data:
1. Structured Data – Tabular, easily stored in an RDBMS (e.g., bank transactions).
2. Semi-Structured Data – Partially organized, self-describing (e.g., XML, JSON, emails).
3. Unstructured Data – No predefined structure (e.g., videos, social media posts).
Other classifications:
- Quantitative vs Qualitative
- Real-time vs Batch Data
Importance: Enables efficient storage, analysis, and better decision-making.
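The structured/semi-structured distinction can be illustrated with a short sketch (hypothetical record, Python standard library only): a JSON document is semi-structured because its fields are tagged but not bound to a fixed schema, and it can be flattened into a fixed row for RDBMS-style storage.

```python
import json

# A semi-structured JSON record: fields are self-describing, but nesting
# and optional keys mean it does not fit a fixed relational schema directly.
raw = '{"user": "alice", "age": 30, "tags": ["sports", "music"]}'

record = json.loads(raw)  # parse into a Python dict

# Flattening into a fixed (structured) row, as an RDBMS table would require:
row = (record["user"], record["age"], ",".join(record["tags"]))
print(row)  # ('alice', 30, 'sports,music')
```

Unstructured data (e.g., a video file) has no such tagged fields at all, which is why it needs specialized processing rather than simple parsing.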
Q3. Differentiate between Traditional Business Intelligence (TBI) & Big Data (BD)
Traditional BI vs Big Data:
- Data type: Structured only vs Structured + Unstructured
- Volume: MB–GB vs TB–PB and beyond
- Velocity: Batch processing vs Real-time
- Tools: RDBMS, OLAP vs Hadoop, Spark, NoSQL
- Flexibility: Rigid schema vs Highly flexible
- Cost: High (proprietary) vs Lower (open-source)
- Use cases: Standard reports vs Predictive analytics, fraud detection
Summary: TBI answers 'what happened', while Big Data answers 'what, why, and what next'.
Q4. Explain architecture of Hadoop environment with neat diagram
Hadoop Architecture:
- HDFS: Distributed storage layer (NameNode holds metadata, DataNodes hold data blocks).
- YARN: Resource management (ResourceManager, NodeManagers).
- MapReduce: Processing model (Map phase, Reduce phase).
- Ecosystem Tools: Hive, Pig, HBase, Sqoop, Flume.
Diagram: [NameNode + DataNodes for storage, YARN for resource allocation, MapReduce/Spark for processing, Hive/Pig for querying].
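The MapReduce model described above can be sketched in plain Python. This is a toy single-machine simulation, not real Hadoop: in a cluster, the map tasks run in parallel on many DataNodes, the framework performs the shuffle over the network, and reduce tasks aggregate the grouped results.

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit (key, 1) pairs for every word in an input split.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: group all emitted values by key before reducing.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate the grouped values (here, sum the counts).
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data big insights", "big data tools"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
print(counts)  # {'big': 3, 'data': 2, 'insights': 1, 'tools': 1}
```

Word count is the classic MapReduce example because each phase is independent: maps never see other lines, and reduces never see other keys, which is what makes the model scale across a cluster.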
Q5. Classify the types of analytics with examples
Types of Analytics:
1. Descriptive – What happened? (e.g., sales reports).
2. Diagnostic – Why did it happen? (e.g., reasons for a sales drop).
3. Predictive – What will happen? (e.g., churn prediction).
4. Prescriptive – What should be done? (e.g., choosing the best marketing strategy).
Summary: Descriptive = Past, Diagnostic = Cause, Predictive = Future, Prescriptive = Action.
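The descriptive/predictive contrast can be made concrete with a tiny numeric sketch (hypothetical monthly sales figures; real predictive analytics would use far richer models than this naive trend line).

```python
# Hypothetical monthly sales figures for the past four months.
sales = [100, 110, 120, 130]

# Descriptive analytics: summarize what happened.
average = sum(sales) / len(sales)  # mean of past sales

# Predictive analytics (naive sketch): extrapolate a linear trend
# to forecast next month's sales.
trend = (sales[-1] - sales[0]) / (len(sales) - 1)  # change per month
forecast = sales[-1] + trend

print(average, forecast)  # 115.0 140.0
```

Diagnostic and prescriptive analytics build on these: diagnostic asks why the trend exists, and prescriptive recommends an action given the forecast.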
Q6. Explain the importance of Big Data analytics for organizations
Importance:
- Better decision making.
- Customer insights and personalization.
- Operational efficiency and cost reduction.
- Real-time fraud detection.
- Risk management and product development.
- Competitive advantage.
Example: Walmart analyzes millions of transactions daily to optimize its supply chain.
Q7. List and briefly explain the technologies used in Big Data environment
Technologies:
- Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, HBase).
- Apache Spark (in-memory processing).
- NoSQL databases (MongoDB, Cassandra, HBase).
- Data ingestion tools (Kafka, Flume, Sqoop).
- Data warehousing (Redshift, BigQuery).
- Visualization (Tableau, Power BI).
- ML frameworks (MLlib, TensorFlow).
- Cloud platforms (AWS, Azure, GCP).
Q8. Explain CAP Theorem
CAP Theorem: A distributed system can guarantee at most two of the following three properties at the same time:
- Consistency: Every read receives the latest write.
- Availability: Every request receives a response.
- Partition Tolerance: The system keeps working despite network failures.
Since network partitions cannot be avoided in practice, the real trade-off during a partition is between consistency and availability.
Examples:
- CP: HBase, MongoDB
- AP: Cassandra, CouchDB
- CA: Rare in distributed systems, as partitions are inevitable.
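The CP-vs-AP trade-off can be sketched with a toy replica class (entirely hypothetical, not modeled on any real database): when the link to the primary is down, a CP-style replica refuses to answer rather than risk staleness, while an AP-style replica answers with whatever value it last saw.

```python
class Replica:
    """Toy read replica illustrating the CP vs AP choice during a partition."""

    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.value = "v1"         # last value replicated from the primary
        self.partitioned = False  # True when the link to the primary is down

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: sacrifice availability to avoid serving a possibly stale read.
            raise RuntimeError("unavailable: cannot confirm latest write")
        # AP: stay available, but the value may be stale during a partition.
        return self.value

cp, ap = Replica("CP"), Replica("AP")
cp.partitioned = ap.partitioned = True  # simulate a network partition

print(ap.read())  # 'v1' — possibly stale, but the request is served
try:
    cp.read()
except RuntimeError as e:
    print(e)      # the CP replica refuses to answer instead
```

Real systems are more nuanced (quorums, tunable consistency levels), but the core dilemma during a partition is exactly this binary choice.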
Q9. What are the terminologies used in Big Data?
Terminologies:
- Data Lake: Central repository for raw data of all types.
- ETL: Extract, Transform, Load process.
- Data Warehouse: Structured, schema-on-write storage for analytics.
- Cluster: Group of machines working together.
- Node: Single machine in a cluster.
- Schema-on-Read: Structure is applied when the data is read, not when it is stored.
- Data Mining: Discovering patterns in large datasets.
- Streaming Data: Continuous flow of data from IoT devices, sensors, etc.
- Machine Learning: Algorithms that learn from data to produce predictive insights.
Q10. Explain NoSQL in Big Data
NoSQL Databases:
Definition: Schema-less, horizontally scalable databases for structured and unstructured data.
Types:
- Document-based (MongoDB, CouchDB).
- Column-based (Cassandra, HBase).
- Key-Value (Redis, DynamoDB).
- Graph (Neo4j).
Advantages: High scalability, flexible schema, fast performance.
Use Cases: Social media analytics, recommendation engines, fraud detection.