Unit 1 BDA

unit 1 of big data analytics

Uploaded by

saisri.pentapati

Unit 1: Introduction to Big Data

(10-mark answers for each topic)

1. Big Data and Its Importance

• Definition: Big Data refers to datasets that are too large or complex to process using
traditional data-processing tools, such as a single-server relational database.

• Importance:

o Enables data-driven decision-making.

o Provides predictive insights in fields like healthcare, finance, and marketing.

o Drives innovation and operational efficiency.

• Key Applications:

o Healthcare: Personalized medicine and real-time monitoring.

o Retail: Enhanced customer personalization and inventory management.

2. Characteristics of Big Data (5 V's)

The key properties of Big Data are summarized as:

1. Volume: The size of data, measured in terabytes or petabytes.

2. Velocity: The speed at which data is generated and processed (e.g., social media).

3. Variety: Data in different formats like text, images, videos, etc.

4. Veracity: Ensuring accuracy and reliability of data despite inconsistencies.

5. Value: Deriving meaningful insights to enhance business operations.

3. Big Data Analytics

• Definition: The process of analyzing large and varied datasets to uncover hidden
patterns, correlations, and actionable insights.

• Steps in Big Data Analytics:

1. Data Collection: Gathering structured, semi-structured, and unstructured data.

2. Storage: Using distributed platforms like Hadoop (HDFS) to hold the data, with engines
such as Spark handling the processing.

3. Analysis: Employing algorithms for descriptive, predictive, and prescriptive insights.

• Real-World Example:

o In e-commerce, analytics is used to recommend products based on browsing history.
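
One simple way to picture the e-commerce example is a co-view count: products that often appear in the same browsing session as the ones a customer has already viewed are suggested first. The sketch below is only an illustration under that assumption; the function names and the session data are hypothetical, and real recommendation systems are usually far more elaborate.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_views(sessions):
    """Count how often each pair of products appears in the same session."""
    co_views = defaultdict(Counter)
    for session in sessions:
        # Use a set so a product viewed twice in one session counts once.
        for a, b in combinations(sorted(set(session)), 2):
            co_views[a][b] += 1
            co_views[b][a] += 1
    return co_views


def recommend(history, co_views, top_n=2):
    """Rank unseen products by how often they were co-viewed with the history."""
    scores = Counter()
    for product in history:
        for other, count in co_views.get(product, {}).items():
            if other not in history:
                scores[other] += count
    return [product for product, _ in scores.most_common(top_n)]


# Hypothetical browsing sessions, invented for illustration.
sessions = [
    ["laptop", "mouse", "laptop bag"],
    ["laptop", "mouse"],
    ["phone", "phone case"],
    ["laptop", "laptop bag", "mouse"],
]
co_views = build_co_views(sessions)
print(recommend(["laptop"], co_views))
```

Here a customer who has only viewed a laptop is shown the mouse first, because it was co-viewed with the laptop in more sessions than the laptop bag.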

4. Basic Requirements for Big Data Analytics

1. Hardware Requirements: High-performance servers and storage systems.

2. Frameworks: Tools like Hadoop and Spark for data storage and processing.

3. Scalable Algorithms: Efficient algorithms for handling large datasets.

4. Expertise: Skilled professionals to manage data pipelines.
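
As an illustration of what "scalable" can mean for an algorithm, the sketch below computes a mean and variance in a single pass with constant memory (Welford's method), so the dataset never has to fit in RAM. It is one possible example, not a technique named in these notes; the generator read_values is a hypothetical stand-in for a large data source.

```python
def running_mean_and_variance(values):
    """One-pass mean and population variance using Welford's method."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / count if count else 0.0
    return count, mean, variance


def read_values():
    """Hypothetical data source: yields one value at a time, never a full list."""
    for i in range(1, 1_000_001):
        yield float(i)


count, mean, variance = running_mean_and_variance(read_values())
print(count, mean, variance)
```

Because the function keeps only three numbers in memory, the same code works whether the source yields a thousand values or a billion.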

5. Big Data Applications

1. Healthcare: Disease outbreak prediction and real-time patient monitoring.

2. Finance: Fraud detection and algorithmic trading.

3. Retail: Targeted marketing and demand forecasting.

4. Transportation: Traffic prediction and route optimization.

6. MapReduce Framework

• Definition: A programming model for processing large-scale data in parallel.

• Phases:

1. Map Phase: Transforms each input record into intermediate key-value pairs.

2. Shuffle and Sort: Groups all values that share the same key and sorts the data by key.

3. Reduce Phase: Aggregates the values for each key to produce the final result.
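
The three phases can be sketched in a few lines of plain Python. This is a single-process simulation meant only to show the data flow, not Hadoop's actual API; run_mapreduce, the mapper and reducer functions, and the sales records are all hypothetical names and data chosen for illustration.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Run map, shuffle-and-sort, and reduce sequentially in one process."""
    # Map phase: each input record yields zero or more (key, value) pairs.
    intermediate = []
    for record in records:
        intermediate.extend(mapper(record))

    # Shuffle and sort: collect the values that share a key, then order the keys.
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Reduce phase: collapse each key's list of values into one result.
    return {key: reducer(key, groups[key]) for key in sorted(groups)}


def sales_mapper(record):
    category, amount = record
    yield category, amount


def sales_reducer(category, amounts):
    return sum(amounts)


# Hypothetical (category, amount) sales records, invented for illustration.
sales = [("books", 12.0), ("toys", 5.0), ("books", 8.0), ("toys", 7.5)]
totals = run_mapreduce(sales, sales_mapper, sales_reducer)
print(totals)
```

On a real cluster the map and reduce calls run in parallel on many machines and the shuffle moves data across the network; the sequence of steps is the same.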

Diagram: Refer to the MapReduce Workflow.

7. Algorithms Using MapReduce

• Examples:

1. Word Count: Counts the frequency of each word in a dataset.

2. Sorting: Arranges records in key order, relying on the sort performed during the shuffle
phase.
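
Both algorithms can be sketched with one small helper that plays the role of the shuffle step. This is an in-memory illustration, not Hadoop code; shuffle, word_count, mapreduce_sort, and the sample lines are hypothetical names and data. On a real cluster, a total order across several reducers also depends on how keys are partitioned, which this sketch does not model.

```python
from collections import defaultdict


def shuffle(pairs):
    """Group values by key and return the groups in sorted key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def word_count(lines):
    # Map: emit (word, 1) for every word in every line.
    pairs = [(word.lower(), 1) for line in lines for word in line.split()]
    # Reduce: sum the ones collected for each word.
    return {word: sum(ones) for word, ones in shuffle(pairs)}


def mapreduce_sort(records, key_of):
    # Map: emit (sort_key, record). The shuffle orders the keys, so the
    # reduce step only passes each group through unchanged.
    pairs = [(key_of(record), record) for record in records]
    return [record for _, group in shuffle(pairs) for record in group]


# Hypothetical input lines, invented for illustration.
lines = ["big data needs big tools", "data tools"]
counts = word_count(lines)
print(counts)
print(mapreduce_sort(list(counts.items()), key_of=lambda item: item[1]))
```

The second call sorts the (word, count) pairs by count, which shows how the output of one MapReduce job can feed another.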

8. NoSQL Databases

• Definition: Non-relational databases designed for horizontal scalability and flexible
schemas, which makes them well suited to Big Data.

• Types:

1. Key-Value Databases: Efficient for lookup operations (e.g., Redis).

2. Column-Family Databases: Store data in column families rather than fixed relational rows
(e.g., Cassandra).

3. Document Databases: Store data as JSON-like documents (e.g., MongoDB).

4. Graph Databases: Nodes and edges represent relationships (e.g., Neo4j).
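
To make the four data models concrete, the sketch below expresses one user record in each shape using plain Python structures. These are not the APIs of Redis, Cassandra, MongoDB, or Neo4j; the record, its field names, and the neighbors helper are hypothetical and only show how the data is organized.

```python
import json

# Key-value: an opaque value looked up by a single key.
key_value_store = {"user:42": '{"name": "Asha", "city": "Pune"}'}

# Column-family: row key -> column family -> column -> value.
column_family_store = {
    "user:42": {
        "profile": {"name": "Asha", "city": "Pune"},
        "activity": {"last_login": "2024-01-15"},
    }
}

# Document: one nested, JSON-like document per record.
document_store = [
    {
        "_id": 42,
        "name": "Asha",
        "address": {"city": "Pune"},
        "orders": [{"item": "laptop", "qty": 1}],
    }
]

# Graph: nodes plus labelled edges that represent relationships.
nodes = {"user:42": {"name": "Asha"}, "user:7": {"name": "Ravi"}}
edges = [("user:42", "FOLLOWS", "user:7")]


def neighbors(node_id, relation):
    """Return the nodes reachable from node_id over edges with this label."""
    return [dst for src, rel, dst in edges if src == node_id and rel == relation]


print(json.loads(key_value_store["user:42"])["name"])
print(column_family_store["user:42"]["profile"]["city"])
print(document_store[0]["orders"][0]["item"])
print(neighbors("user:42", "FOLLOWS"))
```

Note how the key-value store has to decode the whole value to read one field, while the other three models expose structure that queries can use directly.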

Diagram: Refer to the SQL vs NoSQL Comparison.
