Institute of Engineering & Technology
Department of Information Technology
Subject Code: 6ITRE1
Subject Name: Data Analytics
ASSIGNMENT
Q 1. Choose a dataset (e.g., population statistics, weather data, stock prices, or student
scores).
a) Define data analytics and explain its role in extracting insights from this dataset.
b) Describe how data visualization can help interpret the data better.
c) Calculate the mean and standard deviation of at least two numeric variables in the dataset,
and the covariance and correlation between them.
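For reference, the statistics in part (c) can be sketched with NumPy as follows. The two variables (hours studied and exam scores) and their values are hypothetical, made up for illustration only, and the sample (n - 1) convention is assumed for the standard deviation and covariance.

```python
import numpy as np

# Hypothetical data: hours studied and exam scores for five students.
hours = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
scores = np.array([50.0, 60.0, 65.0, 80.0, 95.0])

# Mean and sample standard deviation (ddof=1 divides by n - 1).
mean_hours, mean_scores = hours.mean(), scores.mean()
std_hours, std_scores = hours.std(ddof=1), scores.std(ddof=1)

# np.cov returns a 2x2 covariance matrix; entry [0, 1] is cov(hours, scores).
cov = np.cov(hours, scores)[0, 1]

# Pearson correlation = covariance / (std_hours * std_scores).
corr = np.corrcoef(hours, scores)[0, 1]

print(f"means: {mean_hours:.2f}, {mean_scores:.2f}")
print(f"std devs: {std_hours:.2f}, {std_scores:.2f}")
print(f"covariance: {cov:.2f}, correlation: {corr:.3f}")
```

A correlation close to +1 or -1 indicates a strong linear relationship; a value near 0 indicates a weak one.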
Q 2. Using Python, complete the following tasks on a CSV dataset (e.g., sales or exam scores):
a) Load the data and identify missing values, outliers, and inconsistencies.
b) Use NumPy to compute basic statistics (mean, median, standard deviation).
c) Use lambda functions, map, and list comprehensions to process a numeric column (e.g., apply a
transformation to prices or marks).
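As one possible starting point for Q 2, the sketch below works on a small in-memory CSV. The column names, the values, and the 1.5 x IQR outlier rule are assumptions chosen for illustration; a real file would be opened with open("your_file.csv") instead of io.StringIO.

```python
import csv
import io
import numpy as np

# Hypothetical CSV content standing in for a real file.
raw = """student,marks
A,72
B,
C,65
D,300
E,58
F,70
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# (a) Empty cells become NaN so that missing values can be counted.
marks = np.array([float(r["marks"]) if r["marks"].strip() else np.nan
                  for r in rows])
missing = int(np.isnan(marks).sum())
valid = marks[~np.isnan(marks)]

# (a) Flag outliers with the 1.5 * IQR rule (one common convention).
q1, q3 = np.percentile(valid, [25, 75])
low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
outliers = valid[(valid < low) | (valid > high)]

# (b) Basic statistics on the non-missing values.
mean, median, std = np.mean(valid), np.median(valid), np.std(valid, ddof=1)

# (c) List comprehension to drop outliers, then lambda + map to rescale
# marks from the 0-100 range to the 0-1 range.
kept = [m for m in valid if low <= m <= high]
scaled = list(map(lambda m: m / 100, kept))

print("missing:", missing, "outliers:", outliers.tolist())
print("mean:", mean, "median:", median, "std:", round(float(std), 2))
print("scaled:", scaled)
```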
Q 3. Design and execute SQL queries based on the following scenario:
You are building a simple database for an online bookstore with tables: Books, Authors, and
Sales.
a) Create the tables with appropriate primary and foreign keys, and identify the candidate keys
and super keys of each table.
b) Insert multiple sample records into each table.
c) Write queries to:
   i) Update book prices based on author ID.
   ii) Delete books with zero sales.
   iii) Retrieve all books with sales greater than a given threshold using WHERE, GROUP BY,
   and HAVING.
d) Explain the logical order in which the SELECT, WHERE, GROUP BY, and HAVING clauses of your
query are evaluated.
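A minimal sketch of the Q 3 scenario is given below, run through Python's built-in sqlite3 module so that it is self-contained. The column names, the sample rows, the 10% price increase, and the threshold of 10 are all hypothetical; the SQL uses the SQLite dialect, and other database systems may need small syntax changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Authors (
    author_id INTEGER PRIMARY KEY,          -- primary key
    name      TEXT NOT NULL,
    email     TEXT NOT NULL UNIQUE          -- candidate key (unique, minimal)
);
CREATE TABLE Books (
    book_id   INTEGER PRIMARY KEY,          -- primary key
    isbn      TEXT NOT NULL UNIQUE,         -- candidate key
    title     TEXT NOT NULL,                -- (book_id, title) is a super key
    category  TEXT NOT NULL,
    price     REAL NOT NULL,
    author_id INTEGER NOT NULL REFERENCES Authors(author_id)  -- foreign key
);
CREATE TABLE Sales (
    sale_id   INTEGER PRIMARY KEY,
    book_id   INTEGER NOT NULL REFERENCES Books(book_id),     -- foreign key
    quantity  INTEGER NOT NULL
);
INSERT INTO Authors VALUES (1, 'Asha', 'asha@example.com'),
                           (2, 'Ravi', 'ravi@example.com');
INSERT INTO Books VALUES (1, 'ISBN-001', 'SQL Basics',   'Tech',    20.0, 1),
                         (2, 'ISBN-002', 'Data Stories', 'Tech',    30.0, 1),
                         (3, 'ISBN-003', 'Quiet Hills',  'Fiction', 15.0, 2),
                         (4, 'ISBN-004', 'Unsold Tales', 'Fiction', 10.0, 2);
INSERT INTO Sales (book_id, quantity) VALUES (1, 5), (1, 7), (2, 3), (3, 20);
""")

# Update book prices based on author ID (10% increase for author 1).
conn.execute("UPDATE Books SET price = price * 1.10 WHERE author_id = ?", (1,))

# Delete books with zero sales (no Sales rows with a positive total).
conn.execute("""
    DELETE FROM Books
    WHERE book_id NOT IN (SELECT book_id FROM Sales
                          GROUP BY book_id HAVING SUM(quantity) > 0)
""")

# Books whose total sales exceed a threshold.
# Logical evaluation order: FROM/JOIN -> WHERE -> GROUP BY -> HAVING
# -> SELECT -> ORDER BY.
threshold = 10
rows = conn.execute("""
    SELECT b.title, SUM(s.quantity) AS total_sold
    FROM Books b JOIN Sales s ON s.book_id = b.book_id
    WHERE s.quantity > 0                -- filters individual rows
    GROUP BY b.book_id, b.title         -- forms one group per book
    HAVING SUM(s.quantity) > ?          -- filters whole groups
    ORDER BY total_sold DESC
""", (threshold,)).fetchall()
print(rows)
```

WHERE cannot reference SUM(s.quantity) because it is evaluated before the groups exist; that is why the threshold test is placed in HAVING.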
Q 4. Using the bookstore database from question 3:
a) Perform different types of JOINs (INNER, LEFT, RIGHT, SELF, EQUI, CROSS, NATURAL)
between Books, Authors, and Sales.
b) Write a correlated subquery to find authors whose average book sales are above the overall
average.
c) Use analytical functions (RANK, DENSE_RANK, LEAD, LAG) to compare book sales within each
category.
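The sketch below illustrates a few of the Q 4 techniques (INNER and LEFT JOIN, a correlated subquery, and the RANK and LAG window functions) on a reduced, hypothetical version of the bookstore schema. It assumes SQLite 3.25 or later for window-function support, and it treats each Sales row as one book's total sales, which is only one possible reading of "average book sales".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Authors (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Books (
    book_id   INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    category  TEXT NOT NULL,
    author_id INTEGER NOT NULL REFERENCES Authors(author_id)
);
CREATE TABLE Sales (
    book_id  INTEGER PRIMARY KEY REFERENCES Books(book_id),
    quantity INTEGER NOT NULL
);
INSERT INTO Authors VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meera');
INSERT INTO Books VALUES (1, 'SQL Basics',   'Tech',    1),
                         (2, 'Data Stories', 'Tech',    1),
                         (3, 'Quiet Hills',  'Fiction', 2),
                         (4, 'River Songs',  'Fiction', 2);
INSERT INTO Sales VALUES (1, 12), (2, 3), (3, 20), (4, 9);
""")

# INNER JOIN keeps only authors that have at least one book.
inner = conn.execute("""
    SELECT a.name, b.title
    FROM Authors a INNER JOIN Books b ON b.author_id = a.author_id
""").fetchall()

# LEFT JOIN keeps every author; authors without books get NULL titles.
left = conn.execute("""
    SELECT a.name, b.title
    FROM Authors a LEFT JOIN Books b ON b.author_id = a.author_id
""").fetchall()

# Correlated subquery: the inner query refers to a.author_id from the
# outer row, so it is re-evaluated for each author.
above_avg = conn.execute("""
    SELECT a.name
    FROM Authors a
    WHERE (SELECT AVG(s.quantity)
           FROM Books b JOIN Sales s ON s.book_id = b.book_id
           WHERE b.author_id = a.author_id)
        > (SELECT AVG(quantity) FROM Sales)
""").fetchall()

# Window functions: rank books inside each category and show the
# quantity of the book ranked immediately above (LAG).
ranked = conn.execute("""
    SELECT b.category, b.title, s.quantity,
           RANK() OVER (PARTITION BY b.category
                        ORDER BY s.quantity DESC) AS rnk,
           LAG(s.quantity) OVER (PARTITION BY b.category
                                 ORDER BY s.quantity DESC) AS prev_qty
    FROM Books b JOIN Sales s ON s.book_id = b.book_id
    ORDER BY b.category, rnk
""").fetchall()

print(len(inner), len(left))
print(above_avg)
for row in ranked:
    print(row)
```

DENSE_RANK and LEAD follow the same OVER (PARTITION BY ... ORDER BY ...) pattern; DENSE_RANK differs from RANK only in leaving no gaps after ties, and LEAD looks at the following row instead of the preceding one.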