Chapter 1 – Data Handling Using Pandas - I
Prepared by
Sanjay Kumar Parmar
BCA, MSc(IT)
PGT - Computer Science,
Sardar Patel Vidyalaya, Sewagram
Downloaded from : www.tutorialaicsip.com
REFERENCE - New CBSE Curriculum 2020-21.
2021-22
Data Handling using Pandas -I
Introduction to Python libraries- Pandas, Matplotlib.
Data structures in Pandas - Series and Data Frames.
Series: creation of Series from ndarray, dictionary, scalar value; mathematical operations; head and tail functions; selection, indexing and slicing.
Data Frames: creation from dictionary of Series, list of dictionaries, Text/CSV files; display; iteration; operations on rows and columns (add, select, delete, rename); head and tail functions; indexing using labels; Boolean indexing; joining, merging and concatenation.
Importing/Exporting data between CSV files and Data Frames.
Introduction to Python libraries- Pandas
Pandas is a library for data analysis
Its name is derived from "panel data" (PANel DAta)
It has become a popular choice for data analysis
It provides highly optimized performance because its back-end source code is written in C or Python
It makes data analysis simple and easy
The main author of Pandas is Wes McKinney
Basic data structures: Series and DataFrames
Data structures in Pandas
A data structure is a particular way of storing data to serve a specific purpose
Pandas offers two basic data structures:
i. Series ii. DataFrames
To work with Pandas, import the pandas and numpy libraries:
import pandas as pd
import numpy as np
Series
An important data structure of pandas
Represents a one-dimensional array containing data of any NumPy data type
A Series has two components:
i. An array of actual data ii. An associated array of indexes (data labels)
Creating Series Objects
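A Series is created with pandas Series() (the S must be capital)
Creating an empty Series object
Syntax: <Series Object> = pandas.Series()
Example:
An empty Series had the default data type float64 in older pandas releases; since pandas 2.0 the default is object, so pass dtype explicitly when the type matters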
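The original example for this slide is not available here, so the following is a minimal sketch of creating an empty Series. Passing dtype explicitly is a choice made for this sketch so that the result does not depend on the installed pandas version.

```python
import pandas as pd

# Create an empty Series; dtype is given explicitly so the result
# is the same on old and new pandas versions.
s = pd.Series(dtype="float64")

print(s)
print(s.empty)   # True: the Series holds no elements
```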
Creating a non-empty Series (a Series with some values)
Syntax: <Series Object> = pandas.Series(data, index=idx)
where data can be one of the following:
A Python sequence
An ndarray
A dictionary
A scalar value
Creating Series with Sequence
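A short sketch of building a Series from Python sequences. The values are made up for illustration; when no index is given, pandas assigns the default index 0, 1, 2, ...

```python
import pandas as pd

s1 = pd.Series([10, 20, 30, 40])   # from a list
s2 = pd.Series(range(5))           # from a range: 0, 1, 2, 3, 4
s3 = pd.Series(("a", "b", "c"))    # from a tuple

print(s1)
print(s2)
print(s3)
```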
Creating Series with ndarray
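A possible illustration using NumPy arrays as the data. The arrays below are sample values chosen for this sketch, not taken from the original slides.

```python
import numpy as np
import pandas as pd

arr = np.arange(1, 10, 2)               # array([1, 3, 5, 7, 9])
s = pd.Series(arr)                      # Series built from the ndarray

lin = pd.Series(np.linspace(0, 1, 5))   # 0.0, 0.25, 0.5, 0.75, 1.0

print(s)
print(lin)
```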
Creating Series with dictionary
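When a dictionary is passed, its keys become the index and its values become the data. A minimal sketch, with a hypothetical marks dictionary:

```python
import pandas as pd

# Hypothetical sample data: subject -> marks
marks = {"Maths": 90, "Science": 85, "English": 78}

s = pd.Series(marks)   # keys become the index, values become the data

print(s)
print(s["Science"])    # 85
```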
Creating Series with scalar value
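With a scalar value, the index decides how many times the value is repeated. The sketch below uses an illustrative value and labels.

```python
import pandas as pd

# The scalar 7 is repeated once for every label in the index
s = pd.Series(7, index=["a", "b", "c"])

print(s)
```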
Creating Series Objects: Additional Functionality
Specifying/Adding NaN values in a Series
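Missing values can be written as np.nan (or None in numeric data, which pandas converts to NaN). A small sketch with made-up numbers:

```python
import numpy as np
import pandas as pd

s = pd.Series([6.5, np.nan, 2.34])   # the middle value is missing

print(s)
print(s.hasnans)        # True: at least one NaN is present
print(s.isna().sum())   # 1: number of NaN values
```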
Specify index as well as data with Series():
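One way to pass both data and index, assuming two lists of equal length (the month data here is only an example):

```python
import pandas as pd

months = ["Jan", "Feb", "Mar"]   # labels to use as the index
days = [31, 28, 31]              # data values

s = pd.Series(data=days, index=months)

print(s)
print(s["Feb"])   # 28
```

The data and index must have the same number of elements, otherwise pandas raises a ValueError.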
Using a mathematical function
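The data can also be computed from an expression or function applied to an array. A minimal sketch, assuming the index array itself is used as the input:

```python
import numpy as np
import pandas as pd

a = np.arange(9, 13)                         # array([9, 10, 11, 12])

doubled = pd.Series(index=a, data=a * 2)     # each value is index * 2
squared = pd.Series(index=a, data=a ** 2)    # each value is index squared

print(doubled)
print(squared)
```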
Common Series attributes
Attribute        Description
Series.index     Retrieves the index of a Series
Series.values    Returns the Series data as an ndarray
Series.dtype     Returns the data type of the Series
Series.shape     Returns the shape as a tuple (number of rows,)
Series.nbytes    Returns the number of bytes used by the data
Series.ndim      Returns the number of dimensions
Series.size      Returns the number of elements
Series.hasnans   Returns True if there are any NaN values, else False
Series.empty     Returns True if the Series is empty, else False
Common Series attributes Example
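The attributes listed above can be tried on a small Series. The sample values below are assumptions made for this sketch.

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 20, np.nan, 40], index=["a", "b", "c", "d"])

print(s.index)     # the index labels
print(s.values)    # the data as an ndarray
print(s.dtype)     # float64 (NaN forces a float type)
print(s.shape)     # (4,)
print(s.nbytes)    # 32: 4 elements x 8 bytes each
print(s.ndim)      # 1
print(s.size)      # 4
print(s.hasnans)   # True
print(s.empty)     # False
```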
Operations on Series
Accessing specific Elements
All Elements
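A sketch of reading elements from a Series, using illustrative labels and values. A single label, a position (through iloc) or a list of labels can be used; printing the object shows all elements.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

print(s["b"])          # by label: 20
print(s.iloc[2])       # by position: 30
print(s[["a", "d"]])   # several labels at once: 10 and 40
print(s)               # all elements
```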
Modifying Elements
Series objects are value-mutable but size-immutable objects.
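Values and index labels of an existing Series can be changed by assignment. The following is a minimal sketch with made-up data; the number of elements stays the same throughout.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

s["b"] = 25                     # change one value by label
s.iloc[0] = 5                   # change one value by position
s.loc["c":"d"] = 0              # change a slice (label slices include the end)
s.index = ["w", "x", "y", "z"]  # rename all index labels

print(s)
```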
Extracting Slices from Series
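Slices work on positions (start:stop:step, stop excluded) or on labels (end label included). The sketch below uses iloc and loc to make the two cases explicit; the data is illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5, 6], index=list("abcdef"))

print(s.iloc[1:4])      # positions 1, 2, 3 -> values 2, 3, 4
print(s.iloc[::2])      # every second element -> 1, 3, 5
print(s.iloc[::-1])     # reversed order
print(s.loc["b":"d"])   # labels b to d, end included -> 2, 3, 4
```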
head() and tail() Functions
head(n): fetches the first n rows of a Series
tail(n): fetches the last n rows of a Series
If no value is provided for n, head() and tail() return the first 5 and last 5 rows respectively of the pandas object.
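A short sketch of head() and tail() on a ten-element Series (sample data chosen for illustration):

```python
import pandas as pd

s = pd.Series(range(1, 11))   # values 1 to 10

print(s.head())     # first 5 rows: 1, 2, 3, 4, 5
print(s.tail())     # last 5 rows: 6, 7, 8, 9, 10
print(s.head(3))    # first 3 rows: 1, 2, 3
print(s.tail(2))    # last 2 rows: 9, 10
```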
Vector & Arithmetic Operations on Series
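Operations on a Series apply to every element at once, and arithmetic between two Series matches elements by index label. A minimal sketch with two small, made-up Series:

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

print(s1 + 2)    # vector operation: 3, 4, 5
print(s1 * 3)    # 3, 6, 9
print(s1 > 1)    # element-wise comparison: False, True, True

total = s1 + s2  # aligned on labels; unmatched labels give NaN
print(total)     # a: NaN, b: 12.0, c: 23.0, d: NaN
```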
reindex() and drop() Methods
reindex(): creates a similar object but with a different order of the same indexes.
drop(): removes an entry from a Series.
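Both methods return a new Series and leave the original unchanged. A sketch with illustrative labels; the label "z" is included only to show what happens when a label does not exist in the original.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=["a", "b", "c"])

reordered = s.reindex(["c", "a", "b"])      # same labels, new order
with_missing = s.reindex(["a", "b", "z"])   # "z" is not in s, so it gets NaN
dropped = s.drop("b")                       # new Series without label "b"

print(reordered)
print(with_missing)
print(dropped)
print(s)   # the original Series is unchanged
```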