Student Data Analysis Report

Practical Of Data Science & Visualization.


DATA SCIENCE AND VISUALIZATION 12202080501060

202046707

Practical: 3

Consider a student dataset containing the following information: student name,
gender, enrollment number, 4-semester results with marks for each subject, mobile
number, and city. Perform descriptive analysis, identify the data type, and implement
a method to find out the variation in the data.

Introduction
In the era of digital data, analyzing educational datasets is vital for understanding
student performance and identifying academic trends. This practical demonstrates
the use of descriptive statistics and variation analysis techniques using a synthetic
student dataset. The goal is to evaluate how students perform across semesters and
subjects while managing missing data.

Theory
Descriptive statistics summarize and describe the main features of a dataset. These
include measures like mean, median, mode, standard deviation, and range.
Understanding the data types (categorical or numerical) is essential for choosing the
right analytical methods.
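
The measures listed above can be illustrated with a minimal sketch. The marks below are made up for illustration and are not taken from the practical's dataset.

```python
import pandas as pd

# Hypothetical marks for one subject; values are made up for illustration.
marks = pd.Series([72, 85, 85, 60, 91, 78])

summary = {
    "mean": marks.mean(),                 # arithmetic average
    "median": marks.median(),             # middle value of the sorted marks
    "mode": marks.mode().tolist(),        # most frequent value(s); can be more than one
    "std": marks.std(),                   # sample standard deviation (ddof=1)
    "range": marks.max() - marks.min(),   # highest mark minus lowest mark
}
print(summary)
```

Note that `mode()` returns a Series because a dataset can have several equally frequent values.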

Variation in the dataset is commonly measured using standard deviation, which
shows how spread out the values are from the mean. High variation indicates
inconsistency in student performance, while low variation indicates consistent
scores across subjects or semesters.
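
As a rough illustration of this idea, the sketch below compares two hypothetical sets of marks that share the same mean but differ in spread. It also computes the sample standard deviation by hand to show what `Series.std()` returns by default (it divides by n - 1). All values are assumptions made for the example.

```python
import pandas as pd

consistent = pd.Series([70, 72, 71, 69, 73])     # hypothetical marks, tightly grouped
inconsistent = pd.Series([40, 95, 60, 88, 72])   # same mean, much wider spread

for label, s in [("consistent", consistent), ("inconsistent", inconsistent)]:
    mean = s.mean()
    # Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))
    manual_std = (((s - mean) ** 2).sum() / (len(s) - 1)) ** 0.5
    print(label, "mean:", mean, "manual std:", round(manual_std, 2),
          "pandas std:", round(s.std(), 2))
```

Both series average 71 marks, so the mean alone cannot separate them; the standard deviation does.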

Dataset Description
The dataset consists of the following fields:

- Name (Categorical)
- Gender (Categorical)
- Enrollment Number (Identifier)
- Mobile Number (Identifier, stored as a string)
- City (Categorical)
- Marks for 4 semesters, each with 5 subjects (Numerical, with possible missing
values)
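
One possible way to make these data types explicit in pandas is sketched below. The column names and rows are hypothetical stand-ins, since the actual CSV layout is not shown in this report.

```python
import pandas as pd

# Hypothetical rows and column names; the real CSV may be laid out differently.
sample = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Meera"],
    "Gender": ["F", "M", "F"],
    "Enrollment": [1001, 1002, 1003],
    "Mobile": [9000000001, 9000000002, 9000000003],
    "City": ["Anand", "Vadodara", "Anand"],
    "Sem1_Sub1": [78.0, None, 64.0],   # None becomes NaN (a missing mark)
})

# Identifiers are labels, not quantities, so store them as strings.
# Text fields with few distinct values fit the category dtype.
sample = sample.astype({
    "Enrollment": str,
    "Mobile": str,
    "Gender": "category",
    "City": "category",
})

print(sample.dtypes)

# Only the marks column remains numeric after the conversion.
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```

Converting identifiers to strings keeps them out of numeric summaries such as the mean or variance.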


Descriptive Analysis
Code:
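import pandas as pd

# Load the student dataset (path as given in the original report).
df = pd.read_csv(
    "/content/drive/MyDrive/DSV"
    "/Dataset_(12202080501060)/student_dataset_with_missing_values.csv"
)

# Preview the first five rows.
print(df.head())

# Summary statistics for every column, numerical and categorical.
print(df.describe(include='all'))

# Column data types and non-null counts; info() prints directly and returns None.
df.info()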
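
Because the dataset contains missing values, it can help to count them before computing statistics. The following is a minimal sketch on a small stand-in DataFrame; the column names and values are hypothetical.

```python
import pandas as pd

# Small stand-in for the marks columns; names and values are hypothetical.
marks = pd.DataFrame({
    "Sem1_Sub1": [78.0, None, 64.0, 81.0],
    "Sem1_Sub2": [55.0, 67.0, None, None],
})

# Count of missing (NaN) cells in each column.
missing_per_column = marks.isna().sum()
print(missing_per_column)

# pandas skips NaN by default (skipna=True), so each mean uses only observed marks.
print(marks.mean())
```

The same `isna().sum()` call could be applied to `df` once the real file is loaded.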

Variation in Data
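# Select the numeric columns. Note: identifier columns such as the enrollment or
# mobile number are also selected if pandas reads them as integers, so their
# mean and variance should be ignored.
marks_columns = df.select_dtypes(include=['float64', 'int64'])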

average_marks = marks_columns.mean()

print("Average Marks per Subject:\n", average_marks)


std_deviation = marks_columns.std()

print("Standard Deviation:\n", std_deviation)

variance = marks_columns.var()

print("Variance:\n", variance)
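
Standard deviation and variance are not the only ways to describe variation. As a hedged sketch, the range, interquartile range (IQR), and coefficient of variation can be computed in the same style. The data below is hypothetical and the column names are assumptions.

```python
import pandas as pd

# Hypothetical marks for two subjects; one spread out, one tightly grouped.
marks = pd.DataFrame({
    "Sem1_Sub1": [60.0, 70.0, 80.0, 90.0],
    "Sem1_Sub2": [74.0, 75.0, 76.0, 75.0],
})

spread = pd.DataFrame({
    "range": marks.max() - marks.min(),                    # max minus min
    "iqr": marks.quantile(0.75) - marks.quantile(0.25),    # middle 50% of the marks
    "cv_percent": marks.std() / marks.mean() * 100,        # std relative to the mean
})
print(spread.round(2))
```

The coefficient of variation is useful when subjects have different average marks, because it expresses the spread relative to the mean.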


Conclusion
From this analysis, we observed that there are varying levels of performance across
subjects and semesters. Descriptive statistics provided insight into central
tendencies and data distribution, while standard deviation helped identify the
variability in scores. This kind of analysis is instrumental in educational planning
and identifying students who may need academic support.


Practical: 4

Plot a graph showing the results of students in each semester (using the
Practical 3 dataset).

Introduction:

In the field of education, analysing student performance data across semesters helps
institutions identify trends, evaluate academic progress, and implement timely
interventions. In this analysis, we visualize how students perform in each semester
using a bar graph. The dataset includes marks for five subjects per semester over
four semesters.

Code:

import pandas as pd

import matplotlib.pyplot as plt

# Load the same dataset used in Practical 3 (path as given in the original report).
df = pd.read_csv(
    "/content/drive/MyDrive/DSV"
    "/Dataset_(12202080501060)/student_dataset_with_missing_values.csv"
)

sem1_cols = [col for col in df.columns if 'Sem1' in col]

sem2_cols = [col for col in df.columns if 'Sem2' in col]

sem3_cols = [col for col in df.columns if 'Sem3' in col]

sem4_cols = [col for col in df.columns if 'Sem4' in col]

# Average marks per semester: mean across subjects for each student,
# then mean across students (missing values are skipped by default).
sem_avg = {
    'Semester 1': df[sem1_cols].mean(axis=1).mean(),
    'Semester 2': df[sem2_cols].mean(axis=1).mean(),
    'Semester 3': df[sem3_cols].mean(axis=1).mean(),
    'Semester 4': df[sem4_cols].mean(axis=1).mean(),
}

plt.figure(figsize=(10, 6))

plt.bar(list(sem_avg.keys()), list(sem_avg.values()),
        color=['skyblue', 'salmon', 'lightgreen', 'violet'])

plt.title('Average Student Performance in Each Semester', fontsize=14)

plt.xlabel('Semester')

plt.ylabel('Average Marks')

plt.ylim(0, 100)

plt.grid(axis='y', linestyle='--', alpha=0.7)

plt.tight_layout()

plt.show()
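
The bar chart shows only the semester averages. As a possible extension, the sketch below also computes the spread of student averages within each semester. It uses a hypothetical two-semester DataFrame that follows the 'SemN' column naming the code above relies on; the names and values are assumptions.

```python
import pandas as pd

# Hypothetical data using the 'SemN' naming pattern; values are made up.
df_demo = pd.DataFrame({
    "Sem1_Sub1": [70.0, 80.0, None],
    "Sem1_Sub2": [60.0, 90.0, 75.0],
    "Sem2_Sub1": [65.0, None, 85.0],
    "Sem2_Sub2": [75.0, 95.0, 80.0],
})

rows = {}
for sem in (1, 2):
    cols = [c for c in df_demo.columns if f"Sem{sem}" in c]
    # Each student's average for the semester; missing marks are skipped.
    per_student = df_demo[cols].mean(axis=1)
    rows[f"Semester {sem}"] = {
        "mean": per_student.mean(),   # height of the bar
        "std": per_student.std(),     # spread of student averages
    }

summary = pd.DataFrame.from_dict(rows, orient="index")
print(summary.round(2))
```

The `std` column could be passed to `plt.bar` through its `yerr` parameter to draw error bars on each semester's bar.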


Conclusion:

From the graph, we can observe the average performance across semesters. This
visualization aids in identifying whether student performance improves, declines, or
remains consistent. Such insights are valuable for curriculum planning and
academic counseling.
