Uploaded by

benti


Pandas

Handout (Approx. 5 Pages)

1. Introduction to Pandas
Pandas is a powerful open-source Python library used for data manipulation, analysis,
and cleaning. It provides high-level data structures and tools that make working with
structured data easy and efficient. Pandas is built on top of NumPy and is a core library in
data science, statistics, machine learning, and business analytics.

Pandas is especially useful when working with tabular data such as CSV files, Excel
spreadsheets, SQL tables, and time-series data.

2. Installing and Importing Pandas

2.1 Installation

Pandas can be installed using the Python package manager pip:

pip install pandas

2.2 Importing Pandas

The standard convention is to import Pandas with the alias pd:

import pandas as pd

This alias is commonly used in almost all Pandas-based programs.

3. Core Data Structures in Pandas

3.1 Series

A Series is a one-dimensional labeled array capable of holding data of any type.

data = [10, 20, 30, 40]

series = pd.Series(data)
print(series)

Key features of Series:

• One-dimensional
• Indexed
• Supports different data types
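
The features listed above can be seen in a small sketch. The labels 'a' to 'd' are an assumption chosen for illustration; they do not come from the handout.

```python
import pandas as pd

# Illustrative labels; any hashable values can serve as an index.
series = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

print(series['b'])     # access by label
print(series.iloc[0])  # access by position
print(series.dtype)    # one dtype for the whole Series
```

Without an explicit index, Pandas assigns the default integer labels 0, 1, 2, and so on.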

3.2 DataFrame

A DataFrame is a two-dimensional data structure similar to a table in a database or an
Excel spreadsheet.

data = {
'Name': ['Ali', 'Sara', 'John'],
'Age': [20, 22, 21],
'Department': ['CS', 'IT', 'SE']
}

df = pd.DataFrame(data)
print(df)

DataFrames are the most commonly used Pandas structure.
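
As a rough sketch of the table analogy, the attributes below report the dimensions, column labels, and per-column types of the same example data.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ali', 'Sara', 'John'],
    'Age': [20, 22, 21],
    'Department': ['CS', 'IT', 'SE'],
})

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column labels
print(df.dtypes)         # each column has its own dtype
```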

4. Loading and Saving Data

4.1 Reading Data from Files

df = pd.read_csv('[Link]')
df = pd.read_excel('[Link]')

4.2 Writing Data to Files

df.to_csv('[Link]', index=False)
df.to_excel('[Link]', index=False)

Pandas also supports reading from and writing to SQL databases.
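
A minimal sketch of a CSV round trip follows. It assumes an in-memory text buffer in place of a file on disk, so the example runs without creating any files; a file path string can be passed in the same position.

```python
import io

import pandas as pd

df = pd.DataFrame({'Name': ['Ali', 'Sara'], 'Age': [20, 22]})

buffer = io.StringIO()          # stands in for a file path (assumption)
df.to_csv(buffer, index=False)  # index=False leaves out the row labels
buffer.seek(0)                  # rewind before reading back

restored = pd.read_csv(buffer)
print(restored)
```

Writing with index=False avoids an extra unnamed column appearing when the file is read back.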

5. Exploring and Inspecting Data
Common methods for exploring data:

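print(df.head())      # first five rows
print(df.tail())      # last five rows
df.info()             # prints column types and non-null counts itself
print(df.describe())  # summary statistics for numeric columns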

These methods help understand the structure, data types, and summary statistics of datasets.
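
A short sketch of typical inspection calls on the example data is given here. The choice of head, tail, info, and describe is an assumption about which methods are meant; all four are standard DataFrame methods.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ali', 'Sara', 'John'],
    'Age': [20, 22, 21],
    'Department': ['CS', 'IT', 'SE'],
})

print(df.head(2))  # first two rows
print(df.tail(1))  # last row
df.info()          # prints dtypes and non-null counts; returns None

stats = df.describe()  # count, mean, std, min, quartiles, max
print(stats)
```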

6. Data Selection and Indexing

6.1 Selecting Columns

print(df['Name'])

6.2 Selecting Rows

print(df.loc[0])   # row with index label 0
print(df.iloc[1])  # row at integer position 1
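
Label-based and position-based row selection only look the same when the index is the default 0, 1, 2. The sketch below uses hypothetical student IDs as the index to make the difference visible.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Ali', 'Sara', 'John'], 'Age': [20, 22, 21]},
    index=[101, 102, 103],  # hypothetical IDs used as row labels
)

by_label = df.loc[102]    # row whose label is 102
by_position = df.iloc[0]  # first row, whatever its label

print(by_label['Name'])
print(by_position['Name'])
```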

6.3 Conditional Selection

print(df[df['Age'] > 20])
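
Conditions can be combined with & (and) and | (or). Each condition needs its own parentheses because these operators bind more tightly than comparisons. A small sketch on the example data:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ali', 'Sara', 'John'],
    'Age': [20, 22, 21],
    'Department': ['CS', 'IT', 'SE'],
})

# Older than 20 and not in the IT department.
mask = (df['Age'] > 20) & (df['Department'] != 'IT')
selected = df[mask]
print(selected)
```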

7. Data Cleaning and Handling Missing Values

Real-world data often contains missing or incorrect values.

7.1 Checking for Missing Values

print(df.isnull().sum())

7.2 Handling Missing Values

df.fillna(0, inplace=True)
df.dropna(inplace=True)

Data cleaning is one of the most important uses of Pandas.
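
The sketch below shows the same operations without inplace=True, so the original DataFrame is left unchanged. The sample data and the choice to fill Age with the column mean are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ali', 'Sara', None],
    'Age': [20, np.nan, 21],
})

missing_counts = df.isnull().sum()  # missing values per column
print(missing_counts)

# Fill only the Age column, using its mean (20.5 for this data).
filled = df.fillna({'Age': df['Age'].mean()})

# Drop every row that still contains a missing value.
dropped = df.dropna()
print(filled)
print(dropped)
```

Filling and dropping are alternatives: filling keeps every row, while dropping keeps only complete rows.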

8. Data Analysis and Operations
8.1 Sorting Data

df.sort_values(by='Age', ascending=True)
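
Note that sort_values returns a new DataFrame and leaves the original untouched, so the result usually needs to be assigned. A brief sketch, with descending order chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ali', 'Sara', 'John'],
    'Age': [20, 22, 21],
})

# Sort oldest first and renumber the rows from 0.
sorted_df = df.sort_values(by='Age', ascending=False).reset_index(drop=True)
print(sorted_df)
print(df)  # original order is unchanged
```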

8.2 Grouping Data

df.groupby('Department')['Age'].mean()
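
Grouping splits the rows by a key column, applies an aggregation to each group, and combines the results. The sketch below uses made-up data with two students per department so that each group mean differs from the individual values.

```python
import pandas as pd

# Illustrative data: two rows per department.
df = pd.DataFrame({
    'Department': ['CS', 'IT', 'CS', 'IT'],
    'Age': [20, 22, 24, 26],
})

# Several aggregations at once; the result is indexed by Department.
summary = df.groupby('Department')['Age'].agg(['mean', 'count'])
print(summary)
```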

8.3 Applying Functions

df['Age_plus_1'] = df['Age'].apply(lambda x: x + 1)
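
For simple arithmetic like this, a vectorized expression gives the same result as apply and is usually faster on large columns. A small sketch comparing the two:

```python
import pandas as pd

df = pd.DataFrame({'Age': [20, 22, 21]})

via_apply = df['Age'].apply(lambda x: x + 1)  # calls the function per element
vectorized = df['Age'] + 1                    # operates on the whole column

print(via_apply.equals(vectorized))
```

apply remains useful when the logic cannot be written as a column-wise expression.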

9. Pandas with Databases and Other Libraries

Pandas integrates well with:

• NumPy – numerical computing
• Matplotlib / Seaborn – data visualization
• SQL databases – data storage and retrieval
• Scikit-learn – machine learning

Example: Reading data from MySQL

pd.read_sql(query, connection)
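
The sketch below shows what query and connection might look like. It swaps MySQL for an in-memory SQLite database from the Python standard library so that it runs without a database server; the table name and columns are hypothetical.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for MySQL (assumption).
connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE students (name TEXT, age INTEGER)')
connection.executemany(
    'INSERT INTO students VALUES (?, ?)',
    [('Ali', 20), ('Sara', 22)],
)

query = 'SELECT name, age FROM students WHERE age > 20'
result = pd.read_sql(query, connection)  # query result as a DataFrame
connection.close()

print(result)
```

For MySQL, the connection would instead come from a MySQL driver or an SQLAlchemy engine, with the read_sql call itself unchanged.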

10. Applications of Pandas

Pandas is widely used in:

• Data analysis and reporting
• Business intelligence
• Financial analysis
• Scientific research
• Machine learning preprocessing
• Database reporting systems
