
DAV Assignment Week-2

The document outlines a series of lab assignments for a B. Tech course in Data Analytics and Data Visualization for the academic year 2025-26. Each assignment involves analyzing various datasets related to retail, healthcare, e-commerce, and student performance, requiring students to perform data manipulations such as merging, grouping, and reshaping. The assignments include specific questions that guide students in extracting insights from the data, emphasizing practical applications of data analysis techniques.

SCHOOL OF COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE
DEPARTMENT OF COMPUTER SCIENCE ENGINEERING

Program Name: B. Tech
Assignment Type: Lab
Academic Year: 2025-26
Course Coordinator Name: Dr. K. Deepthi
Instructor(s) Name: Dr. J. Bhavana, Dr. M. Ranjeeth Kumar, Dr. N. Venkatesh, Dr. Sudersan Beheran, Dr. Hitesh Vijay Kumar P, Dr. B. Girirajan, Mr. D. Sravan Kumar, Mr. A. Vijay Kumar, Mr. K. Arunima, Ms. P. Nagalaxmi, Chandra Prakash, Mounika
Course Code: 24CS201PC210
Course Title: Data Analytics and Data Visualization
Year/Sem: II/I
Regulation: R24
Date and Day of Assignment: 04-08-2025
Time(s): 09:00AM - 05:00PM
Duration: 2 Hours
Applicable to Batches: 24CSBTB03, 24CSBTB22, 24CSBTB42, 24CSBTB31, 24CSBTB04, 24CSBTB37, 24CSBTB27

Assignment Number: 02/12 (Week 2) - Monday


Q. No.   Question   Expected Time to complete

1. You are working as a data analyst for a large retail company that
wants to analyse its sales and customer behaviour to optimize its
marketing strategies. You are given three datasets:
customers.csv:
customer_id, name, gender, city
101, Alice, F, New York
102, Bob, M, Chicago
103, Clara, F, Boston
104, Dan, M, New York
orders.csv:
order_id, customer_id, order_date, amount
1001, 101, 2023-11-01, 250
1002, 102, 2023-11-01, 180
1003, 101, 2023-11-03, 75
1004, 104, 2023-11-05, 400
products.csv:
product_id, order_id, product_name, category
P001, 1001, Laptop, Electronics
P002, 1002, Mouse, Electronics
P003, 1003, Book, Stationery
P004, 1004, Headphones, Electronics
Perform the following data manipulations and answer the
questions below:
Merging & Joining:
• Merge the orders and customers datasets to get a complete view of
  customer orders.
• Join the resulting dataset with products using appropriate keys to
  get full order details including product name and category.
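One possible approach to the two join steps is sketched below in Python with pandas. The choice of pandas is an assumption, since this sheet does not name a library. The frames are typed in from the sample rows rather than read from the CSV files, and the variable names (`order_customers`, `full_orders`) are illustrative only.

```python
import pandas as pd

# Sample rows copied from the sheet; in the lab these would come from pd.read_csv.
customers = pd.DataFrame({
    "customer_id": [101, 102, 103, 104],
    "name": ["Alice", "Bob", "Clara", "Dan"],
    "gender": ["F", "M", "F", "M"],
    "city": ["New York", "Chicago", "Boston", "New York"],
})
orders = pd.DataFrame({
    "order_id": [1001, 1002, 1003, 1004],
    "customer_id": [101, 102, 101, 104],
    "order_date": pd.to_datetime(
        ["2023-11-01", "2023-11-01", "2023-11-03", "2023-11-05"]
    ),
    "amount": [250, 180, 75, 400],
})
products = pd.DataFrame({
    "product_id": ["P001", "P002", "P003", "P004"],
    "order_id": [1001, 1002, 1003, 1004],
    "product_name": ["Laptop", "Mouse", "Book", "Headphones"],
    "category": ["Electronics", "Electronics", "Stationery", "Electronics"],
})

# Step 1: attach customer details to every order (left join keeps all orders).
order_customers = orders.merge(customers, on="customer_id", how="left")

# Step 2: attach product details using order_id, the key shared with products.
full_orders = order_customers.merge(products, on="order_id", how="left")

print(full_orders[["order_id", "name", "city", "product_name", "category", "amount"]])
```

Because the join starts from orders, a customer with no orders (Clara, 103) does not appear in the result; starting from customers or using `how="outer"` would keep her with missing order fields.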
Grouping:
• Group the merged dataset to find total sales (amount) by city.
• Group by customer name and product category to find how much
  each customer spent in each category.
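The two grouping tasks could be sketched as follows, assuming pandas. To keep the example self-contained, the joined table is typed in directly from the sample rows; the frame name `full_orders` and the result names are illustrative.

```python
import pandas as pd

# Joined order rows (orders + customers + products), typed in from the sample data.
full_orders = pd.DataFrame({
    "name": ["Alice", "Bob", "Alice", "Dan"],
    "city": ["New York", "Chicago", "New York", "New York"],
    "category": ["Electronics", "Electronics", "Stationery", "Electronics"],
    "amount": [250, 180, 75, 400],
})

# Total sales per city, largest first.
sales_by_city = (
    full_orders.groupby("city")["amount"].sum().sort_values(ascending=False)
)

# Spend per customer within each product category.
spend_by_customer_category = (
    full_orders.groupby(["name", "category"], as_index=False)["amount"].sum()
)

print(sales_by_city)
print(spend_by_customer_category)
```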
Reshaping:
• Pivot the grouped data to create a summary table with customer
  name as rows, product category as columns, and amount as values
  (fill missing with 0).
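A minimal sketch of the reshaping step, assuming pandas and the sample rows typed in as one joined table. `pivot_table` with `fill_value=0` is one way to get the requested layout; the names `full_orders` and `summary` are illustrative.

```python
import pandas as pd

# Joined order rows, typed in from the sample data.
full_orders = pd.DataFrame({
    "name": ["Alice", "Bob", "Alice", "Dan"],
    "category": ["Electronics", "Electronics", "Stationery", "Electronics"],
    "amount": [250, 180, 75, 400],
})

# Customers as rows, categories as columns, summed amount as values.
# fill_value=0 replaces customer/category pairs that have no orders.
summary = full_orders.pivot_table(
    index="name",
    columns="category",
    values="amount",
    aggfunc="sum",
    fill_value=0,
)

print(summary)
```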
Questions:
• Which city had the highest total sales?
• Which customer spent the most in the "Electronics" category?
• Provide a reshaped table that compares spending across product
  categories for each customer.
• How would the result change if one order contained multiple
  products? How would you handle such a scenario in your join?
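The multi-product scenario in the last question can be illustrated with a small sketch, assuming pandas. The second product on order 1001 (P005, "Laptop Bag", "Accessories") is invented for illustration and is not part of the given data. The sketch shows how a plain join repeats the order-level amount, and two possible ways to avoid double counting.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1001, 1002],
    "customer_id": [101, 102],
    "amount": [250, 180],
})
# Hypothetical: order 1001 now contains two products (P005 is invented).
products = pd.DataFrame({
    "product_id": ["P001", "P005", "P002"],
    "order_id": [1001, 1001, 1002],
    "product_name": ["Laptop", "Laptop Bag", "Mouse"],
    "category": ["Electronics", "Accessories", "Electronics"],
})

# A plain join repeats the order-level amount once per product row.
joined = orders.merge(products, on="order_id", how="left")
inflated_total = joined["amount"].sum()  # 250 is counted twice here

# Option 1: take order-level totals from orders alone, before joining.
true_total = orders["amount"].sum()

# Option 2: collapse products to one row per order, then join one-to-one.
products_per_order = (
    products.groupby("order_id")
    .agg(
        product_names=("product_name", ", ".join),
        n_products=("product_id", "count"),
    )
    .reset_index()
)
one_row_per_order = orders.merge(
    products_per_order, on="order_id", how="left", validate="one_to_one"
)

print(inflated_total, true_total)
print(one_row_per_order)
```

Neither option recovers spending per category for a mixed order, because `amount` is recorded per order. That would need a per-product price or line-item amount, which the given datasets do not contain.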

Date and Day of Assignment: 05-08-2025
Time(s): 09:00AM - 05:00PM
Duration: 2 Hours
Applicable to Batches: 24CSBTB02, 24CSBTB17, 24CSBTB18, 24CSBTB43, 24CSBTB01

Assignment Number: 02/12 (Week 2) - Tuesday


Q. No.   Question   Expected Time to complete

1. A hospital is analyzing patient data to study treatment
effectiveness. You are given the following datasets:
patients.csv:
patient_id, name, age, gender
P01, John, 45, M
P02, Maya, 30, F
P03, Alice, 55, F
treatments.csv:
treatment_id, patient_id, treatment_type, cost
T101, P01, Chemotherapy, 5000
T102, P02, Physiotherapy, 1500
T103, P01, Surgery, 8000
outcomes.csv:
treatment_id, recovery_days, success
T101, 30, Yes
T102, 10, Yes
T103, 60, No
• Merge all three datasets to get a unified view of each patient's
  treatment and outcome.
• Group by treatment_type to find the average recovery days and
  success rate.
• Pivot the data to show each patient's total cost per treatment type.
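The three steps above might be sketched as follows, assuming pandas and the sample rows typed in inline. The `success_flag` column and the result names are introduced here for illustration; converting "Yes"/"No" to a boolean is one convenient way to get a success rate as a mean.

```python
import pandas as pd

patients = pd.DataFrame({
    "patient_id": ["P01", "P02", "P03"],
    "name": ["John", "Maya", "Alice"],
    "age": [45, 30, 55],
    "gender": ["M", "F", "F"],
})
treatments = pd.DataFrame({
    "treatment_id": ["T101", "T102", "T103"],
    "patient_id": ["P01", "P02", "P01"],
    "treatment_type": ["Chemotherapy", "Physiotherapy", "Surgery"],
    "cost": [5000, 1500, 8000],
})
outcomes = pd.DataFrame({
    "treatment_id": ["T101", "T102", "T103"],
    "recovery_days": [30, 10, 60],
    "success": ["Yes", "Yes", "No"],
})

# Unified view: one row per treatment, with patient and outcome columns attached.
unified = (
    treatments.merge(patients, on="patient_id", how="left")
              .merge(outcomes, on="treatment_id", how="left")
)

# Encode success as a boolean so that its mean is the success rate.
unified["success_flag"] = unified["success"].eq("Yes")

by_treatment = unified.groupby("treatment_type").agg(
    avg_recovery_days=("recovery_days", "mean"),
    success_rate=("success_flag", "mean"),
)

# Total cost per patient (rows) and treatment type (columns).
cost_pivot = unified.pivot_table(
    index="name", columns="treatment_type", values="cost",
    aggfunc="sum", fill_value=0,
)

print(by_treatment)
print(cost_pivot)
```

Patient P03 (Alice) has no treatment rows, so she is absent from this treatment-based view; starting the merge from patients would keep her with empty treatment fields.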
Questions:
Which treatment type has the highest average cost?
Which patient had the longest recovery time?
How would you analyze the success rate by gender?
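For the gender question, one possible approach is to group the merged treatment rows by gender and report the success rate together with the number of treatments behind it. The sketch below assumes pandas and types in the already-joined sample rows; `success_flag` and the result names are illustrative.

```python
import pandas as pd

# Treatment rows already joined with patient gender and outcome (sample data).
unified = pd.DataFrame({
    "name": ["John", "Maya", "John"],
    "gender": ["M", "F", "M"],
    "treatment_type": ["Chemotherapy", "Physiotherapy", "Surgery"],
    "success": ["Yes", "Yes", "No"],
})

# Boolean flag: the mean of a boolean column is the share of True values.
unified["success_flag"] = unified["success"].eq("Yes")

success_by_gender = unified.groupby("gender").agg(
    treatments=("success_flag", "count"),
    success_rate=("success_flag", "mean"),
)

print(success_by_gender)
```

Showing the treatment count next to the rate matters here: with only three rows, a rate rests on one or two treatments per group.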

Date and Day of Assignment: 06-08-2025
Time(s): 09:00AM - 05:00PM
Duration: 2 Hours
Applicable to Batches: 24CSBTB14, 24CSBTB15, 24CSBTB16, 24CSBTB25, 24CSBTB26, 24CSBTB35, 24CSBTB39, 24CSBTB13

Assignment Number: 02/12 (Week 2) - Wednesday

Q. No.   Question   Expected Time to complete

1. You are analyzing shopping patterns for an e-commerce platform.
The datasets available are:
users.csv:
user_id, name, membership
U001, Neha, Gold
U002, Ravi, Silver
U003, Anu, Gold
transactions.csv:
transaction_id, user_id, item_id, amount
T01, U001, I001, 300
T02, U002, I002, 150
T03, U001, I003, 450
items.csv:
item_id, category
I001, Clothing
I002, Electronics
I003, Groceries
Tasks:
• Join all datasets to analyze the full transaction details.
• Group by membership and category to find total spending.
• Reshape the data to display membership levels as rows
  and spending by category as columns.
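The tasks above could be approached as in the following sketch, assuming pandas and the sample rows typed in inline. Here the reshape uses `unstack` on the grouped result; `pivot_table` would be an equally valid choice. Variable names are illustrative.

```python
import pandas as pd

users = pd.DataFrame({
    "user_id": ["U001", "U002", "U003"],
    "name": ["Neha", "Ravi", "Anu"],
    "membership": ["Gold", "Silver", "Gold"],
})
transactions = pd.DataFrame({
    "transaction_id": ["T01", "T02", "T03"],
    "user_id": ["U001", "U002", "U001"],
    "item_id": ["I001", "I002", "I003"],
    "amount": [300, 150, 450],
})
items = pd.DataFrame({
    "item_id": ["I001", "I002", "I003"],
    "category": ["Clothing", "Electronics", "Groceries"],
})

# Full transaction details: each transaction with its user and item category.
details = (
    transactions.merge(users, on="user_id", how="left")
                .merge(items, on="item_id", how="left")
)

# Total spending per (membership, category) pair.
spending = details.groupby(["membership", "category"])["amount"].sum()

# Move the category level into columns; absent combinations become 0.
spending_wide = spending.unstack("category", fill_value=0)

print(spending_wide)
```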

Questions:
• Which membership tier spends the most in each category?
• Which item category generates the most revenue?
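Both questions can be read off a membership-by-category table. The sketch below, assuming pandas, types that table in from the sample totals and uses `idxmax` to pick the largest entry; the variable names are illustrative.

```python
import pandas as pd

# Membership-by-category spending totals from the sample data.
spending_wide = pd.DataFrame(
    {"Clothing": [300, 0], "Electronics": [0, 150], "Groceries": [450, 0]},
    index=pd.Index(["Gold", "Silver"], name="membership"),
)

# Row label of the largest value in each column: the top tier per category.
top_tier_per_category = spending_wide.idxmax(axis=0)

# Column sums give revenue per category; idxmax picks the largest.
revenue_by_category = spending_wide.sum(axis=0)
top_category = revenue_by_category.idxmax()

print(top_tier_per_category)
print(top_category)
```

Note that `idxmax` returns the first label when values tie, so ties would need a separate check on larger data.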

Date and Day of Assignment: 07-08-2025
Time(s): 09:00AM - 05:00PM
Duration: 2 Hours
Applicable to Batches: 24CSBTB07, 24CSBTB08, 24CSBTB10, 24CSBTB23, 24CSBTB29

Assignment Number: 02/12 (Week 2) - Thursday

Q. No.   Question   Expected Time to complete

1. As part of a university analytics project, you're analyzing student
performance across departments.
students.csv:
student_id, name, department
S001, Arjun, CSE
S002, Kavya, ECE
S003, Rahul, CSE
grades.csv:
grade_id, student_id, course, score
G01, S001, DBMS, 85
G02, S002, DSP, 78
G03, S003, DBMS, 92
G04, S001, OS, 88
Tasks:
• Merge the datasets to view student performance by course
  and department.
• Group by department and course to calculate average
  scores.
• Create a pivot table with student name as rows and courses
  as columns with scores.
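A minimal sketch of these tasks, assuming pandas and the sample rows typed in inline. The names `performance`, `avg_scores` and `score_table` are illustrative. The pivot deliberately leaves untaken courses as NaN rather than filling them.

```python
import pandas as pd

students = pd.DataFrame({
    "student_id": ["S001", "S002", "S003"],
    "name": ["Arjun", "Kavya", "Rahul"],
    "department": ["CSE", "ECE", "CSE"],
})
grades = pd.DataFrame({
    "grade_id": ["G01", "G02", "G03", "G04"],
    "student_id": ["S001", "S002", "S003", "S001"],
    "course": ["DBMS", "DSP", "DBMS", "OS"],
    "score": [85, 78, 92, 88],
})

# One row per grade, with the student's name and department attached.
performance = grades.merge(students, on="student_id", how="left")

# Average score for each (department, course) pair.
avg_scores = (
    performance.groupby(["department", "course"])["score"].mean().reset_index()
)

# One row per student, one column per course; untaken courses stay NaN.
score_table = performance.pivot_table(
    index="name", columns="course", values="score", aggfunc="mean"
)

print(avg_scores)
print(score_table)
```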

Questions:
• Which department has the highest average in DBMS?
• How would you handle missing course scores when
  reshaping?
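One way to think about the missing-score question is to compare leaving gaps as NaN with filling them with 0. The sketch below assumes pandas and types in the merged sample rows; it is an illustration of the trade-off, not a prescribed answer.

```python
import pandas as pd

# Student/course/score rows from the sample data.
performance = pd.DataFrame({
    "name": ["Arjun", "Kavya", "Rahul", "Arjun"],
    "course": ["DBMS", "DSP", "DBMS", "OS"],
    "score": [85, 78, 92, 88],
})

# Default: a course the student did not take shows up as NaN.
scores = performance.pivot(index="name", columns="course", values="score")

# NaN is skipped by mean(), so averages use only the courses actually taken.
avg_taken = scores.mean(axis=1)

# Filling with 0 treats "not taken" as "scored zero" and drags averages down.
avg_zero_filled = scores.fillna(0).mean(axis=1)

# Counting non-missing cells shows how many courses back each average.
courses_taken = scores.notna().sum(axis=1)

print(scores)
print(avg_taken)
print(avg_zero_filled)
```

Under these assumptions, keeping NaN is usually the safer default for scores, while `fill_value=0` suits quantities such as spending where "no record" really means zero.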

Date and Day of Assignment: 08-08-2025
Time(s): 09:00AM - 05:00PM
Duration: 2 Hours
Applicable to Batches: 24CSBTB05, 24CSBTB11, 24CSBTB12, 24CSBTB19, 24CSBTB20, 24CSBTB28, 24CSBTB27

Assignment Number: 02/12 (Week 2) - Friday

Q. No.   Question   Expected Time to complete

1. A hospital maintains two datasets:
One contains patient demographics (ID, name, age, gender).
Another contains visit logs (ID, date, department visited, doctor,
wait time).
The hospital management wants a summary report of average wait
time per department, segmented by gender.
• Merge both datasets using merge() on Patient ID.
• Group by Department and Gender.
• Calculate average Wait Time.
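The steps above might look like the following sketch, assuming pandas. The sheet describes the columns but provides no rows, so every value below, the exact column names, and the use of minutes for wait time are invented for illustration.

```python
import pandas as pd

# Invented rows for illustration only; the sheet gives column descriptions, not data.
demographics = pd.DataFrame({
    "Patient ID": [1, 2, 3],
    "Name": ["A", "B", "C"],
    "Age": [34, 58, 41],
    "Gender": ["F", "M", "F"],
})
visits = pd.DataFrame({
    "Patient ID": [1, 1, 2, 3, 3],
    "Date": pd.to_datetime(
        ["2025-01-05", "2025-02-10", "2025-01-07", "2025-01-09", "2025-03-01"]
    ),
    "Department": ["Cardiology", "Orthopedics", "Cardiology",
                   "Cardiology", "Orthopedics"],
    "Doctor": ["Dr. X", "Dr. Y", "Dr. X", "Dr. Z", "Dr. Y"],
    "Wait Time": [20, 35, 50, 40, 15],  # minutes (assumed unit)
})

# Each visit row picks up the patient's demographics via Patient ID.
merged = pd.merge(visits, demographics, on="Patient ID", how="left")

# Average wait per department, with one column per gender.
report = (
    merged.groupby(["Department", "Gender"])["Wait Time"]
          .mean()
          .unstack("Gender")
)

print(report)
```

A department/gender pair with no visits appears as NaN in the report, which keeps "no data" distinct from a zero-minute wait.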
