Experiment No. 10

Program to implement functions of Pandas Library

Name/Roll No. : shruti

Class: SE-2

Date of Performance: 21/3

Date of Submission: 28/3
Experiment No. 10

Title: Program to implement functions of Pandas Library

Aim: To write a program that implements functions of the Pandas library

Objective: To introduce the Pandas package for Python

Theory:

Pandas is a powerful and flexible Python library used for data manipulation, analysis, and
cleaning. It is built on top of NumPy and provides high-performance data structures and
functions for working with structured data.

The two primary data structures in Pandas are:

Series: A one-dimensional labeled array capable of holding any data type, similar to a single
column in a spreadsheet or a database table.

DataFrame: A two-dimensional labeled data structure, akin to a table with rows and columns. It
is the most commonly used structure in Pandas for handling tabular data.
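The two structures described above can be sketched as follows. This is a minimal illustration, and the names and scores are made-up sample values, not data from the experiment.

```python
import pandas as pd

# A Series: one-dimensional, with a label (index) attached to each value.
# The labels and marks here are invented sample data.
marks = pd.Series([82, 91, 77], index=["asha", "ravi", "meena"])

# A DataFrame: two-dimensional; each column is itself a Series.
students = pd.DataFrame(
    {
        "Name": ["Asha", "Ravi", "Meena"],
        "Score": [82, 91, 77],
    }
)

print(marks["ravi"])            # label-based lookup on the Series
print(students.shape)           # (number of rows, number of columns)
print(type(students["Score"]))  # selecting one column returns a Series
```

Selecting a single column from a DataFrame returns a Series, which is why the two structures are usually learned together.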

Pandas provides a variety of functions and tools, including:

Data Import/Export: Supports reading from and writing to file formats such as CSV, Excel,
and JSON, as well as SQL databases.

Data Cleaning: Offers methods for handling missing data, removing duplicates, and applying
transformations.

Data Manipulation: Includes operations like filtering, grouping, merging, and pivoting datasets.

Data Analysis: Facilitates statistical analysis, descriptive statistics, and time-series operations.
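A short sketch of the cleaning, manipulation, and analysis categories listed above is given below. The `sales` table and its column names are hypothetical sample data chosen only to give each method something to act on.

```python
import numpy as np
import pandas as pd

# Made-up sample data with one duplicate row and one missing value.
sales = pd.DataFrame(
    {
        "region": ["North", "South", "North", "South", "South"],
        "units": [10.0, np.nan, 30.0, 20.0, 20.0],
    }
)

# Data cleaning: drop exact duplicate rows, then fill the missing value
# in the "units" column with that column's mean.
cleaned = sales.drop_duplicates()
cleaned = cleaned.fillna({"units": cleaned["units"].mean()})

# Data manipulation: filter rows, then group and aggregate.
large_orders = cleaned[cleaned["units"] >= 20]
units_by_region = cleaned.groupby("region")["units"].sum()

# Data analysis: descriptive statistics for the numeric column.
summary = cleaned["units"].describe()

print(cleaned)
print(units_by_region)
print(summary)
```

Each step returns a new object rather than modifying `sales` in place, so the original data stays available for comparison.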

Pandas is widely used in data science, machine learning, and analytics for its ability to handle
large datasets efficiently and provide intuitive tools for data exploration and preprocessing. Its
seamless integration with libraries like Matplotlib and NumPy makes it a cornerstone of the
Python data analysis ecosystem.
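The NumPy integration mentioned above can be illustrated with a small sketch. The `readings` table is an invented example; only the NumPy side is shown here, since plotting with Matplotlib would need a display.

```python
import numpy as np
import pandas as pd

# Invented sample values for illustration.
readings = pd.DataFrame({"x": [1.0, 4.0, 9.0], "y": [2.0, 3.0, 4.0]})

# NumPy functions apply element-wise to a Series and return a Series.
roots = np.sqrt(readings["x"])

# to_numpy() exposes the DataFrame's values as a plain NumPy array.
matrix = readings.to_numpy()

print(roots.tolist())
print(matrix.shape)
```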

Program:

import pandas as pd
import numpy as np

# 1. Creating a Pandas Series from user input
print("1. Creating a Pandas Series:")
n = int(input("Enter number of elements in the Series: "))
series_data = []
series_index = []
for i in range(n):
    # input() returns strings, so the Series holds text values
    val = input(f"Enter value {i+1}: ")
    idx = input(f"Enter index for value {val}: ")
    series_data.append(val)
    series_index.append(idx)

data_series = pd.Series(series_data, index=series_index)
print("\nGenerated Series:")
print(data_series, "\n")

# 2. Creating a DataFrame from user input
print("2. Creating a Pandas DataFrame:")
rows = int(input("Enter number of rows: "))
columns = int(input("Enter number of columns: "))

# Collect the column names first
column_names = []
for i in range(columns):
    col = input(f"Enter name of column {i+1}: ")
    column_names.append(col)

# Collect one list of values per row (all values are read as strings)
data = []
for i in range(rows):
    row_data = []
    print(f"Enter data for row {i+1}:")
    for col in column_names:
        value = input(f" {col}: ")
        row_data.append(value)
    data.append(row_data)

df = pd.DataFrame(data, columns=column_names)
print("\nGenerated DataFrame:")
print(df, "\n")

# 3. Exporting and Importing to/from CSV
# Note: this path is machine-specific; change it to a folder that exists on your system.
csv_path = r"C:\Users\shruti\OneDrive\Desktop\shruti\DM.csv"
df.to_csv(csv_path, index=False)
print(f"Data exported successfully to:\n{csv_path}")

df_from_csv = pd.read_csv(csv_path)
print("\nData read back from CSV:")
print(df_from_csv, "\n")

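# 4. Data Cleaning Example (optional)
print("4. Handling Missing Data (Simulating a missing value):")
df_with_nan = df.copy()
if len(df_with_nan) > 0 and 'Score' in df_with_nan.columns:
    # The scores were typed in as strings, so convert them to floats first;
    # entries that are not numbers become NaN.
    df_with_nan['Score'] = pd.to_numeric(df_with_nan['Score'], errors='coerce').astype(float)
    df_with_nan.loc[0, 'Score'] = np.nan
    print("Before cleaning:\n", df_with_nan)
    # Fill only the 'Score' column so that other columns are left untouched.
    score_mean = df_with_nan['Score'].mean()
    df_filled = df_with_nan.fillna({'Score': score_mean})
    print("After filling NaN with mean:\n", df_filled)
else:
    print("No 'Score' column to demonstrate missing data handling.\n")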

# 5. Data Analysis
print("5. Basic Statistics (if numeric columns exist):")
try:
    print(df.describe(include='all'), "\n")
except Exception as e:
    print(f"Could not compute statistics: {e}")

# 6. Exporting to Excel
# Note: writing .xlsx files requires an Excel engine such as openpyxl to be installed.
excel_path = r"C:\Users\shruti\OneDrive\Desktop\shruti kothari\user_data.xlsx"
df.to_excel(excel_path, index=False)
print(f"Data exported to Excel at:\n{excel_path}")
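Because every value in the program above is read with input(), the DataFrame columns hold text rather than numbers. The sketch below is a non-interactive illustration of that point, not part of the submitted program: the `rows` list is a hypothetical stand-in for typed input, and the CSV round trip uses an in-memory buffer instead of a file on disk.

```python
import io

import pandas as pd

# Stand-in for values typed at the prompts: input() always returns strings.
rows = [["Asha", "82"], ["Ravi", "91"], ["Meena", "77"]]
df = pd.DataFrame(rows, columns=["Name", "Score"])
dtype_before = df["Score"].dtype  # a text dtype, not a numeric one

# Convert the text scores to numbers; invalid entries would become NaN.
df["Score"] = pd.to_numeric(df["Score"], errors="coerce")
mean_score = df["Score"].mean()

# Round-trip through CSV text in memory instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
df_back = pd.read_csv(buffer)

print(dtype_before, "->", df["Score"].dtype)
print(mean_score)
print(df_back["Score"].tolist())
```

Converting with pd.to_numeric before computing statistics means that describe() and mean() operate on numbers instead of text.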

Output:

Conclusion: Comment on the functional areas where the Pandas library is used

The Pandas library is widely used for data analysis, data manipulation, and basic data
visualization across various domains. It is essential in data science for handling large datasets
and for operations such as filtering, grouping, and aggregation. In finance and business analytics,
Pandas supports time-series analysis, risk assessment, and performance tracking. It is also
extensively used in machine learning and research, where it enables efficient preprocessing and
structuring of raw data for model training and insights.
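As a small illustration of the time-series use mentioned above, the following sketch computes a moving average and two-day totals. The dates and values are invented for the example.

```python
import pandas as pd

# Invented daily values indexed by date.
dates = pd.date_range("2024-01-01", periods=6, freq="D")
daily = pd.Series([10, 12, 11, 13, 15, 14], index=dates)

# Three-day moving average; the first two entries are NaN because
# a full window of three values is not yet available.
moving_avg = daily.rolling(window=3).mean()

# Aggregate the daily values into two-day totals.
two_day_totals = daily.resample("2D").sum()

print(moving_avg)
print(two_day_totals)
```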