BDA Lesson Plan Final
Course Plan
Semester: 7    Year: 2025-26
Course Title: Big Data and Analytics Course Code: 24ECSC404
Total Contact Hours: 58 Hrs Duration of ESA: 2 hrs
ISA Marks: 67 ESA Marks: 33
Lesson Plan Author: Prof. Channabasappa Muttal    Date: 28/7/2025
Checked By: Dr. Suvarna K Date: 28/7/2025
Prerequisites:
Programming skills and knowledge of object-oriented programming, database management systems, and exploratory data analysis.
Course Outcomes: At the end of the course the student will be able to:
1. Identify the key issues in big data management and its applications in business and scientific computing.
2. Analyse the efficiency of big data storage solutions and integration strategies.
3. Apply stream data processing tools and techniques for real-time data.
4. Develop big data processing models for a real-time application.
5. Apply data visualization techniques to interpret results.
E.g., 1.2.3 represents Program Outcome 1, Competency 2, and Performance Indicator 3.
Course Content
Content (Hrs)

Unit - 1
1. Introduction: Overview of Big Data, Big Data Characteristics, Different Types of Data, Data Analytics, Data Analytics Life Cycle. (05 hrs)
2. Big Data Storage: Clusters, File Systems and Distributed File Systems, NoSQL Databases: Document-oriented, Column-oriented, Graph-based, MongoDB; Sharding, Replication, Combining Sharding and Replication; On-Disk Storage Devices, In-Memory Storage Devices. (05 hrs)
3. Big Data Processing: Parallel Data Processing, Distributed Data Processing, Hadoop, MapReduce, Examples on MapReduce, Spark. (05 hrs)
Unit - 2
Text Books
1. Seema Acharya, Subhashini Chellappan, Big Data and Analytics, Second Edition, Wiley India Pvt Ltd, 2022.
2. Gerard Maas and François Garillot, Stream Processing with Apache Spark: Mastering Structured Streaming and Spark Streaming, O'Reilly, 2019.
References
1. Ümit Demirbaga, Gagangeet Singh Aujla, Anish Jindal, Oğuzhan Kalyon, Big Data Analytics: Theory, Techniques, Platforms, and Applications.
Evaluation Scheme
In-Semester Assessment Scheme
Assessment        Conducted for (marks)   Weightage in ISA (marks)
ISA-1 (Theory)    30                      33 (ISA-1 and ISA-2 combined)
ISA-2 (Theory)    30
Lab Activity      40                      34
Total                                     67
Note
1. Each question carries 15 marks and may consist of sub-questions.
2. Mixing of sub-questions from different chapters within a unit (Unit I and Unit II) is allowed in ISA-I, ISA-II, and ESA.
3. In the ESA, answer 4 full questions of 15 marks each (two full questions from Unit I and two full questions from Unit II) out of 6 questions.
Preamble:
Data is created constantly, and at an ever-increasing rate. Mobile phones, social media, and imaging technologies used for medical diagnosis all create new data that must be stored somewhere for some purpose. Devices and sensors automatically generate diagnostic information that needs to be stored and processed in real time. Merely keeping up with this huge influx of data is difficult; substantially more challenging is analyzing vast amounts of it, especially data that does not conform to traditional notions of structure, in order to identify meaningful patterns and extract useful information. These challenges of the data deluge present the opportunity to transform business, government, science, and everyday life.
Objective: The student should be able to use Big Data and Analytics Frameworks and tools for
handling, processing, and analyzing huge datasets.
Team size: 3-4 (only for the Hadoop implementation); other activities are individual assessments.
Type: Each batch will work on one distinct application area.
Sl. No.  Experiments                                  CO   Bloom's level  Timeline w.r.t. COE  PI code  Hrs  Marks
1        Hadoop Installation                          CO1  L3             1st & 2nd week       1.4.1    4    Nil
2        Implementation of Replication and Sharding   CO2  L3             3rd & 4th week       1.4.3    4    Nil
Assessment Rubric

Data preparation (PI-1.4.3)
Excellent: Data sources are expertly collected and highly relevant. Data is collected systematically and prepared with thorough cleaning, transformation, and integration.
Good: The majority of the data sources are relevant and appropriate, but minor issues exist with cleaning or transforming the data.
Fair: Data is poorly prepared, with major issues in cleaning, transformation, or handling missing data.
Poor: Sources of data are missing or irrelevant. Data preparation is inadequate, with significant problems in collecting and cleansing data.

Decision and design / Model Selection (PI-2.3.1)
Excellent: Decision and design recommendations have a strong base and justification.
Good: Design and model recommendations are reasonable.
Fair: Decision and design recommendations are inappropriate or poorly selected; the approach lacks structure.
Poor: Decision and design recommendations are not relevant.

Implementation of real-time application (PI-5.3.1, PI-13.1.2)
Excellent: Selection of appropriate analytical tools and techniques. Developed code follows the design specification.
Good: Selection of appropriate analytical tools and techniques. Developed code follows the design specification but can be further improved.
Fair: Significant technical problems or major functionality issues. The project may not work as intended or lacks core features. The developed code does not map to the design specification.
Poor: Needs improvement in the analytical tools and/or software engineering techniques.

Presentation (PI-10.2.2)
Excellent: The presentation is professionally delivered. The speaker exhibits strong confidence. The demo is delivered within the allotted time.
Good: The presentation is understandable but may lack engagement. The speaker shows some confidence but may struggle with articulation.
Fair: There is ambiguity in the presentation. The speaker is unconfident. There is clear time management.
Poor: Communication is ineffective. Time management may be inconsistent.

Documentation and Report (PI-10.1.2)
Excellent: Documentation is comprehensive and well organized.
Good: Documentation is ordered and comprehensive but could be structured better.
Fair: Documentation is unclear and poorly structured; key points lack clarity.
Poor: Documentation is disorganized, with significant problems with clarity.
Date: HOD
Chapter-wise Plan
Learning Outcomes:-
At the end of the topic the student should be able to:
Lesson Schedule
Class No. - Portion covered per hour
1. Overview of Big Data, Big Data Characteristics
2. Different Types of Data
3. Data Analytics
4. Data Analytics Life Cycle
5. Data Analytics Life Cycle
Review Questions
3. Write MongoDB query language statements to perform CRUD operations on a given dataset in MongoDB. CO2 L3 1.4
4. Explain how sharding provides partial tolerance toward failures and horizontal scalability. CO2 L3 2.2
5. Explain how replication provides scalability, data availability, and fault tolerance. CO2 L3 1.4
Lesson Schedule
Class No. - Portion covered per hour
1. Clusters; File Systems; Distributed File Systems
2. NoSQL Databases: Document-oriented, Column-oriented, Graph-based, MongoDB
3. Sharding
4. Replication; Combining Sharding and Replication
5. On Disk Storage Devices, In-memory Storage Devices
Review Questions
Sl.No. - Questions TLOs BL PI Code
1. Explain how a large file is stored on a distributed file system. TLO1 L2 1.4.3
2. How does sharding differ from replication? Illustrate with an example. TLO3 L3 1.4.3
3. Explain how sharding helps to achieve horizontal scalability. TLO3 L2 2.2.2
4. Explain how data availability and fault tolerance are achieved through replication in distributed data storage. Illustrate. TLO4 L3 2.2.2
5. Illustrate the use of NoSQL for ETL operations on a dataset stored on a DFS. TLO3 L3 1.4.3
6. Write MongoDB query language statements to create a "Library" collection and perform CRUD operations for the library system. TLO3 L3 1.4.3
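Question 6 above asks for basic CRUD operations on a "Library" collection. A minimal sketch using the MongoDB Java driver is given below; the database name, document fields, and connection string are illustrative assumptions, not part of the syllabus, and the same operations can equally be written in the mongo shell.

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;
import static com.mongodb.client.model.Filters.eq;
import static com.mongodb.client.model.Updates.set;

public class LibraryCrudSketch {
    public static void main(String[] args) {
        // Assumed: a local, unauthenticated mongod instance on the default port.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase db = client.getDatabase("library_db");          // hypothetical database
            MongoCollection<Document> books = db.getCollection("Library"); // collection from the question

            // Create: insert one document describing a book.
            books.insertOne(new Document("title", "Big Data and Analytics")
                    .append("author", "Seema Acharya")
                    .append("copies", 3));

            // Read: find the document just inserted and print it.
            Document found = books.find(eq("title", "Big Data and Analytics")).first();
            System.out.println("Found: " + found);

            // Update: change the number of available copies.
            books.updateOne(eq("title", "Big Data and Analytics"), set("copies", 5));

            // Delete: remove the document.
            books.deleteOne(eq("title", "Big Data and Analytics"));
        }
    }
}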
Learning Outcomes:-
At the end of the topic the student should be able to:
Lesson Schedule
Class No. - Portion covered per hour
1. Parallel Data Processing
2. Distributed Data Processing
3. Hadoop, MapReduce
4. Examples on MapReduce
5. Spark
Review Questions
Sl.No. - Questions TLOs BL PI Code
1. Illustrate the process of parallel data processing with an example. TLO1 L1 2.3.1
2. Explain distributed data processing with an example. TLO2 L2 2.3.1
3. Draw and explain the framework of Hadoop. TLO3 L3 13.1.2
4. Write mapper and reducer functions in Java to count the number of words in a chosen input file. TLO4 L3 13.1.2
5. How is the processing of continuous data carried out using Spark? Discuss a suitable technique. TLO5 L3 2.3.1
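Question 4 above asks for a word-count mapper and reducer. A minimal sketch using the classic Hadoop MapReduce Java API is shown below; the class names and the command-line input/output paths are illustrative assumptions.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountSketch {

    // Mapper: emits (word, 1) for every token in a line of the input file.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reducer: sums the counts emitted for each distinct word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountSketch.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Input and output HDFS paths are supplied on the command line, e.g. /input /output.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}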
Learning Outcomes:-
At the end of the topic the student should be able to:
Lesson Schedule
Class No. - Portion covered per hour
1. Introduction to Stream Processing, Batch Versus Stream Processing
2. Examples of Stream Processing, Scaling Up Data Processing
3. Distributed Stream Processing; Stream-Processing Model: Sources and Sinks
4. Immutable Streams Defined from One Another, Transformations and Aggregations
5. Window Aggregations, Stateless and Stateful Processing
Review Questions
Sl.No. - Questions TLOs BL PI Code
1. Explain how stream processing is different from batch processing. TLO1 L2 2.3.1
2. Illustrate how distributed processing of stream data helps scale up data processing, facilitating big data processing. TLO2 L3 13.1.2
3. Explain how tumbling windows are used for window aggregation. TLO3 L2 2.3.1
4. Differentiate between stateless and stateful processing. TLO4 L3 2.3.1
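Question 3 above concerns window aggregation over a stream. The sketch below shows one possible windowed word count using Spark Structured Streaming in Java; the socket source (fed, for example, by nc -lk 9999), the host and port, and the window and slide durations are illustrative assumptions.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;
import static org.apache.spark.sql.functions.split;
import static org.apache.spark.sql.functions.window;

public class WindowedWordCountSketch {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("WindowedWordCountSketch")
                .getOrCreate();

        // Source: lines of text read from a socket, with an ingestion timestamp attached.
        Dataset<Row> lines = spark.readStream()
                .format("socket")
                .option("host", "localhost")
                .option("port", 9999)
                .option("includeTimestamp", true)
                .load();

        // Transformation: split each line into words, keeping the timestamp.
        Dataset<Row> words = lines.select(
                explode(split(col("value"), " ")).alias("word"),
                col("timestamp"));

        // Window aggregation: count words over 10-minute windows sliding every 5 minutes.
        Dataset<Row> counts = words
                .groupBy(window(col("timestamp"), "10 minutes", "5 minutes"), col("word"))
                .count();

        // Sink: print updated window counts to the console.
        StreamingQuery query = counts.writeStream()
                .outputMode("update")
                .format("console")
                .start();

        query.awaitTermination();
    }
}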
Learning Outcomes:-
At the end of the topic the student should be able to:
Lesson Schedule
Class No. - Portion covered per hour
1. Pig: Introduction, Pig Primitive Data Types, Running Pig, Execution Modes of Pig, HDFS Commands, Relational Operators
2. Eval Functions, Complex Data Types
3. Piggy Bank, User-Defined Functions
4. Parameter Substitution, Diagnostic Operators, Word Count Example using Pig
5. Pig at Yahoo!, Pig versus Hive
Review Questions
Sl. No. - Questions TLOs BL PI Code
1. Explain the data types of Pig. TLO1 L2 13.1.1
2. Write the functions of Piggy Bank. TLO3 L2 1.4.3
3. Illustrate HDFS commands with examples. TLO2 L3 1.4.3
4. Write a program to count the number of words in Pig. TLO1 L3 13.1.1
5. Discuss the relational operators with examples. TLO1 L2 13.1.1
6. Why do we need MapReduce during Pig programming? TLO2 L3 1.4.3
7. Explain the architecture of Apache Pig. TLO3 L2 1.4.3
8. What are the complex data types in Pig? Illustrate with examples. TLO1 L2 13.1.1
9. What are the different execution modes available in Pig? TLO3 L2 1.4.3
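Question 4 above asks for a word count in Pig. One possible sketch, embedding the Pig Latin statements in Java through Pig's embedded PigServer API and running in local mode, is shown below; the input file name 'input.txt' and output directory 'wordcount_out' are illustrative assumptions, and the same statements can be run directly in the Grunt shell.

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class PigWordCountSketch {
    public static void main(String[] args) throws Exception {
        // Run Pig in local mode; use ExecType.MAPREDUCE to run on a Hadoop cluster instead.
        PigServer pigServer = new PigServer(ExecType.LOCAL);

        // Load each line of the (assumed) input file as a single chararray field.
        pigServer.registerQuery("lines = LOAD 'input.txt' AS (line:chararray);");
        // Split every line into words and flatten the resulting bag into one word per record.
        pigServer.registerQuery("words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;");
        // Group identical words together.
        pigServer.registerQuery("grouped = GROUP words BY word;");
        // Count the occurrences of each word.
        pigServer.registerQuery("counts = FOREACH grouped GENERATE group AS word, COUNT(words) AS cnt;");

        // Materialize the result; this triggers execution of the whole script.
        pigServer.store("counts", "wordcount_out");
    }
}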
Learning Outcomes:-
At the end of the topic the student should be able to:
Lesson Schedule
Class No. - Portion covered per hour
1. Hive – Introduction, Hive Architecture
2. Hive Data Types, Hive File Format
3. Hive Query Language (HQL)
4. RCFile Implementation
5. User-Defined Function (UDF). Serialization and Deserialization.
Review Questions
Sl. No. - Questions TLOs BL PI Code
1. Discuss the features of Hive. TLO1 L1 1.4.1
2. Explain with an example how Hive is a data warehousing tool. TLO1 L2 1.4.3
3. Explain with a neat diagram how data units are arranged in Hive. TLO2 L2 1.4.1
4. With a diagram, describe the Hive architecture. TLO3 L2 1.4.3
5. Explain with examples the different file formats supported by Hive. TLO3 L2 1.4.3
6. Explain DDL and DML in Hive. TLO2 L2 1.4.3
7. Write HiveQL statements for the schemas below: TLO4 L2 13.1.1
   Order: CustomerId, ItemID, ItemName, OrderDate, DeliveryDate
   Customer: CustomerId, CustomerName, Address, City, State, Country
   i. Create tables for the Order and Customer data.
   ii. Write HiveQL to find the number of items bought by each customer.
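Question 7 above asks for HiveQL over the Order and Customer schemas. A possible sketch is shown below, issuing HiveQL through Hive's JDBC interface from Java; the HiveServer2 URL, credentials, field delimiters, and column types are illustrative assumptions, and the Order table is named orders because ORDER is a reserved word in HiveQL.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveOrderQuerySketch {
    public static void main(String[] args) throws Exception {
        // Register the Hive JDBC driver (not needed with JDBC 4+ auto-loading, but harmless).
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Assumed: a local HiveServer2 instance with no authentication.
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "hive", "");
        Statement stmt = conn.createStatement();

        // Create tables for the Order and Customer schemas (column types are assumptions).
        stmt.execute("CREATE TABLE IF NOT EXISTS orders ("
                + "customerid INT, itemid INT, itemname STRING, "
                + "orderdate STRING, deliverydate STRING) "
                + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");
        stmt.execute("CREATE TABLE IF NOT EXISTS customer ("
                + "customerid INT, customername STRING, address STRING, "
                + "city STRING, state STRING, country STRING) "
                + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");

        // Number of items bought by each customer.
        ResultSet rs = stmt.executeQuery(
                "SELECT c.customername, COUNT(o.itemid) AS items_bought "
                + "FROM customer c JOIN orders o ON c.customerid = o.customerid "
                + "GROUP BY c.customername");
        while (rs.next()) {
            System.out.println(rs.getString("customername") + ": " + rs.getLong("items_bought"));
        }
        conn.close();
    }
}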
UNIT I
Q1.a Airlines collect a large volume of data in categories like customer flight preferences, traffic control, baggage handling, and aircraft maintenance. Airlines can optimize operations with the meaningful insights of big data analytics, covering everything from flight paths to which aircraft to fly on which routes. Analyze the types of data involved and suggest a suitable analytics technique. (8 marks, CO1, L3, PO1, PI 1.4.3)
Q1.b Explain how the following industries exploit their data flood for business promotion: (7 marks, CO1, L3, PO1, PI 1.4.1)
i. Social media
ii. Credit card companies
iii. Mobile companies
Q2.a Explain how data availability and fault tolerance are achieved through replication in distributed data storage. Illustrate with an example. (8 marks, CO2, L2, PO2, PI 2.2.2)
Q2.b Write MongoDB query language statements to create a "Books" collection and implement MapReduce to categorize different attributes of the Books collection. (7 marks, CO2, L3, PO1, PI 1.4.3)
Q3.b Write mapper and reducer Java classes to count the occurrences of words in a chosen input file. (7 marks, CO3, L3, PO13, PI 13.1.2)
UNIT II
Q6.b Write HiveQL statements for the schemas below: (7 marks, CO5, L3, PO13, PI 13.1.1)
Order: CustomerId, ItemID, ItemName, OrderDate, DeliveryDate
Customer: CustomerId, CustomerName, Address, City, State, Country
i. Create tables for the Order and Customer data.
ii. Write HiveQL to find the number of items bought by each customer.