Data Visualization Ex2 - Ex4

The document outlines three exercises focused on data manipulation using Python and pandas. Exercise 2 involves exploring a dataset with various DataFrame methods, Exercise 3 focuses on extracting important variables and removing unnecessary ones, and Exercise 4 addresses identifying and filling missing values in a dataset. Each exercise includes a clear aim, procedure, and program code, demonstrating practical applications of data handling in Python.

Uploaded by

mohammedthasleem304


Ex:2

Aim:
To explore a dataset using DataFrame, info, shape, head, tail, dtypes, describe,
and grouping of data in Python.

Procedure:
● Go to Start, search for Python IDLE, and open it.
● Open a new Python file and create a dataset.
● Type the program given below and save it as [Link]
● Run the program.

Program:

import pandas as pd

print("\n" + "="*50)
print(" DATA EXPLORATION REPORT")
print("="*50)

# 1. Create DataFrame
data = {
    "Name": ["Arun", "Beema", "Charles", "Divya", "Elango", "Farhan", "Guna", "Hari"],
    "Age": [21, 22, 23, 21, 22, 23, 21, 22],
    "City": ["Chennai", "Coimbatore", "Chennai", "Madurai", "Coimbatore", "Chennai",
             "Madurai", "Coimbatore"],
    "Score": [88, 92, 85, 90, 75, 89, 95, 78]
}
df = pd.DataFrame(data)

# Function to print section title
def section(title):
    print("\n" + "-"*50)
    print(title)
    print("-"*50)

section("DATAFRAME")
print(df)

# Row and column counts, read from df.shape
print("\n=== INFO ===")
print("Rows :", df.shape[0])
print("Cols :", df.shape[1])

section("SHAPE (ROWS, COLUMNS)")
print(df.shape)

section("HEAD (FIRST 5 ROWS)")
print(df.head())

section("TAIL (LAST 5 ROWS)")
print(df.tail())

section("DATA TYPES")
print(df.dtypes)

section("DESCRIBE")
print(df.describe())

# numeric_only=True restricts the mean to Age and Score and skips the Name column
section("GROUP BY CITY (MEAN AGE & SCORE)")
print(df.groupby("City").mean(numeric_only=True))

Result:

Thus, a dataset has been explored using DataFrame, info, shape, head, tail,
dtypes, describe, and grouping of data in Python successfully.
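
The aim lists info, while the program reports row and column counts under its INFO heading. As a minimal sketch, assuming a four-row subset of the exercise data, the following shows how df.info() could be called directly and how the grouping step could return several statistics at once. The name city_summary is introduced here for illustration only.

```python
import pandas as pd

# Illustrative subset of the exercise dataset (assumption: four rows are enough to show the idea)
df = pd.DataFrame({
    "Name": ["Arun", "Beema", "Charles", "Divya"],
    "Age": [21, 22, 23, 21],
    "City": ["Chennai", "Coimbatore", "Chennai", "Madurai"],
    "Score": [88, 92, 85, 90],
})

# df.info() writes its summary to standard output and returns None,
# so it is called on its own rather than wrapped in print().
df.info()

# Several aggregations of Score per city in one groupby call
city_summary = df.groupby("City")["Score"].agg(["mean", "min", "max", "count"])
print(city_summary)
```

Selecting the Score column before aggregating keeps the text column Name out of the calculation, which serves the same purpose as numeric_only=True in the program above.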

Ex:3

Aim:
To extract important variables and remove useless variables from the dataset.

Procedure:
● Go to Start, search for Python IDLE, and open it.
● Open a new Python file and create a dataset.
● Type the program given below and save it as [Link]
● Run the program.

Program:

import pandas as pd

# Dataset
data = {
    'Name': ['Arun', 'Balu', 'Charan', 'Deepa'],
    'Age': [21, 22, 20, 23],
    'Marks': [85, 90, 78, 88],
    'City': ['Erode', 'Karur', 'Coimbatore', 'Chennai'],
    'Extra': ['x', 'y', 'z', 'p']
}
df = pd.DataFrame(data)
print("Original Dataset:")
print(df)

# IMPORTANT VARIABLES
# Selecting with a list of column names keeps only those columns
important_columns = ['Name', 'Age', 'Marks']
df_important = df[important_columns]
print("\nImportant Variables Dataset:")
print(df_important)

# REMOVAL OF USELESS VARIABLES
# drop() with axis=1 removes columns and returns a new DataFrame; df is unchanged
useless_columns = ['Extra']
df_cleaned = df.drop(useless_columns, axis=1)
print("\nDataset After Removing Useless Variables:")
print(df_cleaned)

Result:

Thus, the important variables have been extracted and the useless variables
removed from the dataset successfully.
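
The removal step can also be written with the columns keyword of drop, which is equivalent to passing the labels with axis=1. The sketch below assumes a two-row subset of the exercise data. The column name 'Remarks' and the variable df_safe are hypothetical and are used only to show how errors="ignore" behaves when a label is absent.

```python
import pandas as pd

# Illustrative two-row subset of the exercise dataset
df = pd.DataFrame({
    "Name": ["Arun", "Balu"],
    "Age": [21, 22],
    "Marks": [85, 90],
    "City": ["Erode", "Karur"],
    "Extra": ["x", "y"],
})

# columns= is equivalent to passing the labels together with axis=1
df_cleaned = df.drop(columns=["Extra"])

# drop() returns a new DataFrame; the original still contains 'Extra'
print(list(df.columns))
print(list(df_cleaned.columns))

# errors="ignore" skips labels that are not present instead of raising KeyError.
# 'Remarks' is a hypothetical column that does not exist in this dataset.
df_safe = df.drop(columns=["Extra", "Remarks"], errors="ignore")
print(list(df_safe.columns))
```

Note that the two approaches in the exercise give different results: selecting the important columns leaves out City as well, while dropping only 'Extra' keeps it.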

Ex:4

Aim:
To identify and fill missing values within the dataset.

Procedure:
● Go to Start, search for Python IDLE, and open it.
● Open a new Python file and create a dataset with missing values.
● Type the program given below and save it as [Link]
● Run the program.

Program:

import pandas as pd
import numpy as np

# Dataset with missing values
data = {
    'Name': ['Arun', 'Balu', 'Cheran', 'Deepa'],
    'Age': [21, np.nan, 20, 23],
    'Marks': [85, 90, np.nan, 88],
    'City': ['Tirupur', 'Erode', None, 'Chennai']
}
df = pd.DataFrame(data)
print("Original Dataset:")
print(df)

# 1. IDENTIFY MISSING VALUES
# isnull() marks each missing cell as True; sum() counts them per column
print("\nMissing Values Count in Each Column:")
print(df.isnull().sum())

# 2. FILL MISSING VALUES

# Fill numeric columns with mean
df['Age'] = df['Age'].fillna(df['Age'].mean())
df['Marks'] = df['Marks'].fillna(df['Marks'].mean())

# Fill text columns with a placeholder
df['City'] = df['City'].fillna("Unknown")
print("\nDataset After Filling Missing Values:")
print(df)

Result:

Thus, missing values in the dataset were successfully identified and filled.
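
As a possible variation on the program above, the sketch below lists the rows that contain a missing value, fills all columns in one fillna call using a dictionary, and then checks that nothing is left unfilled. Using the median for Age is an assumption made here for illustration (the exercise uses the mean), and the names rows_with_missing and filled are not part of the original program.

```python
import numpy as np
import pandas as pd

# Same dataset as in the exercise
df = pd.DataFrame({
    "Name": ["Arun", "Balu", "Cheran", "Deepa"],
    "Age": [21, np.nan, 20, 23],
    "Marks": [85, 90, np.nan, 88],
    "City": ["Tirupur", "Erode", None, "Chennai"],
})

# Rows that contain at least one missing value
rows_with_missing = df[df.isnull().any(axis=1)]
print(rows_with_missing)

# One fillna call with a per-column dictionary of fill values.
# Median for Age is an illustrative choice; the exercise itself uses the mean.
filled = df.fillna({
    "Age": df["Age"].median(),
    "Marks": df["Marks"].mean(),
    "City": "Unknown",
})
print(filled)

# Total number of missing cells remaining after filling
print(filled.isnull().sum().sum())
```

The median is less sensitive to extreme values than the mean, which can matter when a numeric column contains outliers. With only three known ages here, either choice gives a similar result.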
