Pandas Series - Notes For PA3

Uploaded by

thinkinboutyouch

Data Handling Using Pandas – I

Data and Handling Data:

➢Data is a raw fact (the input to a process).
➢Data is stored in different types [data types].
➢Data is handled and manipulated, i.e., stored / saved in different
✓Structures – queue, list, array, dictionary, etc.
✓Formats – CSV file, Excel file, HTML file, etc.
➢Data in different structures and formats is converted into a single format and stored [the storage place is called a Data Warehouse].
Python Libraries:
A Python library is a collection of modules that allow us to perform many actions without writing the code ourselves.
Each library in Python contains a large number of modules that one can import and use.

For scientific and analytical use there are three widely used libraries:

1. Pandas – PANel DAta
2. NumPy – Numerical Python
3. Matplotlib
These libraries allow us to manipulate, transform and visualise data easily and efficiently.
Pandas:
➢Pandas is an open-source Python library providing high-performance data manipulation and analysis tools built on its powerful data structures.
➢The name Pandas is derived from Panel Data, an econometrics term for multidimensional data sets.
➢Python with Pandas is used in a wide range of academic and commercial domains, including finance, economics, statistics, analytics, etc.
➢It is a package useful for data analysis and manipulation.
➢Pandas provides an easy way to create, manipulate and wrangle data regardless of its origin, in five typical steps: load, prepare, manipulate, model, and analyze.
Key Features of Pandas
• Fast and efficient Data Frame object with default and customized indexing.
• Tools for loading data into in-memory data objects from different file formats.
• Data alignment and integrated handling of missing data.
• Reshaping and pivoting of data sets.
• Label-based slicing, indexing and sub-setting of large data sets.
• Columns from a data structure can be deleted or inserted.
• Group by data for aggregation and transformations.
• High performance merging and joining of data.
• Time series functionality, such as date range generation, resampling and shifting of dated observations [time series data is often used to predict future values from previously observed values].
NumPy
NumPy, which stands for ‘Numerical Python’, can be used for numerical data analysis and scientific
computing. NumPy uses a multidimensional array object and has functions and tools for working with these
arrays. Elements of an array stay together in memory, hence, they can be quickly accessed.
Two commonly used functions are arange() and array(); the constant nan (not a number) represents a missing or undefined value.
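The two functions and the nan constant mentioned above can be tried out as follows. This is only a small sketch; the variable names evens and marks are made up for illustration.

```python
import numpy as np

# arange(start, stop, step): the stop value itself is excluded
evens = np.arange(0, 10, 2)

# array() builds an ndarray from a list; nan marks a missing value
marks = np.array([45.5, np.nan, 78])

print(evens)             # the generated array
print(marks.dtype)       # all elements share one data type
print(np.isnan(marks))   # True where a value is nan
```

Because nan is a float, an array that contains it is stored with a float data type.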
Matplotlib:
The Matplotlib library in Python is used for plotting graphs and visualisation. Using Matplotlib, with just a few
lines of code we can generate publication quality plots, histograms, bar charts, scatterplots, etc
Following are some of the differences between Pandas and NumPy:
1. A NumPy array requires homogeneous data, while a Pandas DataFrame can have a different data type (float, int, string, datetime, etc.) in each column.
2. Pandas has a simpler interface for operations like file loading, plotting, selection, joining and GROUP BY, which come in very handy in data-processing applications.
3. Pandas DataFrames (with column names) make it very easy to keep track of data.
4. Pandas is used when data is in tabular format, whereas NumPy is used for numeric array based data manipulation.
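The first difference can be checked with a short sketch. The names and sample values used here (mixed_array, frame, "Asha", "Ravi") are assumptions made only for this illustration.

```python
import numpy as np
import pandas as pd

# A NumPy array is homogeneous: the integers are upcast to float
mixed_array = np.array([1, 2.5, 3])

# A DataFrame keeps a separate data type for each column
frame = pd.DataFrame({
    "name": ["Asha", "Ravi"],
    "marks": [91.5, 78.0],
    "roll": [1, 2],
})

print(mixed_array.dtype)
print(frame.dtypes)
```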
DATA STRUCTURE IN PANDAS
• A data structure is a way of arranging data so that it can be accessed quickly and operations such as retrieval, deletion and modification can be performed on it.
• Pandas deals with 3 data structures:
1. Series
2. DataFrame
3. Panel [deprecated and removed from pandas since version 0.25; a DataFrame with a MultiIndex is used instead]
• These data structures are built on top of the NumPy array, which means they are fast.
Dimension & Description:
The best way to think of these data structures is that the higher dimensional data structure is a container of its
lower dimensional data structure.
For example, DataFrame is a container of Series, and Panel is a container of DataFrame.

Mutability
• All Pandas data structures are value mutable (their values can be changed). All except Series are also size mutable; a Series is size immutable.
• Note: DataFrame is widely used and is one of the most important data structures. Panel is used much less.

Series
• A Series is a one-dimensional, array-like structure with homogeneous data. It contains a sequence of values of any data type (int, float, list, string, etc.), which by default have numeric data labels starting from zero.
The data label associated with a particular value is called its index.
For example, a series holding the integers 10, 23, 56, … has the default index labels 0, 1, 2, … for those values.

Key Points
• Homogeneous data
• Size Immutable
• Values of Data Mutable
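The axis labels are collectively called the index.
▪ A pandas Series can be created using the Series() constructor of the pandas module.
Syntax:
pandas.Series(data, index, dtype, copy)
The parameters of the constructor are as follows:
• data – the values to store (list, ndarray, dict, scalar, etc.)
• index – the labels for the values; defaults to 0, 1, 2, … when omitted
• dtype – the data type of the values; inferred from data when omitted
• copy – whether to copy the input data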

Series Generation:
A series can be created using various inputs like
• A sequence [list]
• An array – ndarray
• A Python dictionary – dict
• A scalar value or constant
• A mathematical expression / function
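The last kind of input in the list above, a mathematical expression, can be sketched as follows. The names labels and squares are illustrative only.

```python
import numpy as np
import pandas as pd

labels = np.arange(1, 5)    # 1, 2, 3, 4

# The expression labels ** 2 is evaluated first; its result becomes the data
squares = pd.Series(labels ** 2, index=labels)
print(squares)
```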
Create an Empty Series
A basic series that can be created is an empty Series.
Sample Program:
# import the pandas library with the alias pd
import pandas as pd
# dtype is given explicitly because the default dtype of an empty Series differs between pandas versions
s = pd.Series(dtype="float64")
print(s)
print("Data Type:", type(s))
Output:
Series([], dtype: float64)
Data Type: <class 'pandas.core.series.Series'>
Here in the sample program
• s is the object name.
• Series() creates the series; with no data it is empty, and printing it shows an empty list along with its data type.
• type() displays the type of the object s.
Create Series: using the Series() argument data with a sequence [list type] as input.
A list is a heterogeneous sequence (it may mix data types), but a pandas Series holds homogeneous data, hence the values are converted to one common data type.
import pandas as pd
# Generate a series using the parameter data with a sequence [list] as input
print("Generates series using Parameter data and input -Sequence [list]")
print()
ser1 = pd.Series([10,20,30,40,50])
print("Series is:")
print(ser1)
Create Series: using the Series() arguments data and index, with the input list generated using range().
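A minimal sketch of this case is given here. The name ser2 and the labels 'a' to 'e' are assumptions chosen for illustration, not taken from the notes.

```python
import pandas as pd

# range(10, 60, 10) generates 10, 20, 30, 40, 50
values = list(range(10, 60, 10))

# One index label is supplied for each value
ser2 = pd.Series(values, index=["a", "b", "c", "d", "e"])
print("Series is:")
print(ser2)
```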

2. Creation of Series from NumPy Arrays:

import numpy as np # import NumPy with alias np
import pandas as pd
array1 = np.array([1,2,3,4])
series3 = pd.Series(array1)
print(series3)
series4 = pd.Series(array1, index = ["Jan", "Feb", "Mar", "Apr"])
print(series4)
Note: When index labels are passed with the array, the length of the index and the length of the array must be the same, else it will result in a ValueError.
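The length rule can be seen in a short sketch. The variable mismatch_error is a made-up name used only to hold the error for display.

```python
import numpy as np
import pandas as pd

values = np.array([1, 2, 3, 4])

try:
    # 3 labels are given for 4 values, so pandas refuses to build the series
    pd.Series(values, index=["Jan", "Feb", "Mar"])
    mismatch_error = None
except ValueError as err:
    mismatch_error = err

print(type(mismatch_error).__name__)
print(mismatch_error)
```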
3. Creation of Series from Dictionary:
Dictionary keys can be used to construct an index for a Series.
import pandas as pd
dict1 = {'India': 'NewDelhi', 'UK': 'London', 'Japan': 'Tokyo'}
print(dict1)    # displays the dictionary
# {'India': 'NewDelhi', 'UK': 'London', 'Japan': 'Tokyo'}
series8 = pd.Series(dict1)
print(series8)  # displays the series
# India    NewDelhi
# UK         London
# Japan       Tokyo
# dtype: object
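When an index is passed together with a dictionary, only the matching keys are picked up. The sketch below assumes an extra label 'France' that is not a key of the dictionary, to show what happens in that case.

```python
import pandas as pd

dict1 = {'India': 'NewDelhi', 'UK': 'London', 'Japan': 'Tokyo'}

# The index decides which keys are used and in which order;
# a label with no matching key gets the value NaN
series9 = pd.Series(dict1, index=['Japan', 'India', 'France'])
print(series9)
```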
4. Creation of Series from Scalar Values:
A scalar value is repeated for every label given in index.
import pandas as pd
series2 = pd.Series(50, index=[3,5,1])
print(series2)
Accessing Elements of a Series:
There are two common ways of accessing the elements of a series: Indexing and Slicing.
(A) Indexing:
Indexes are of two types:
1. positional index and
2. labelled index.
• A positional index takes an integer value that corresponds to its position in the series, starting from 0. iloc is the indexer that uses position, i.e., row number or column number. This is referred to as position-based indexing.
• A labelled index takes any user-defined label as index. loc is the indexer that uses names, i.e., row name or column name. This is referred to as name-based indexing.
• Using index values, the elements can be accessed [one element at a time or a range].
• A range of elements can be accessed with the indexers iloc[] and loc[], which take square brackets rather than parentheses.
EXAMPLE:
import pandas as pd
seriesNum = pd.Series([10,20,30])
print(seriesNum[2])
seriesMnths = pd.Series([2,3,4], index=["Feb","Mar","Apr"])
print(seriesMnths["Mar"])
seriesCapCntry = pd.Series(['NewDelhi', 'WashingtonDC', 'London', 'Paris'], index=['India', 'USA', 'UK', 'France'])
print(seriesCapCntry['India'])
print(seriesCapCntry[['UK','USA']])
# loc takes a list of labels inside the square brackets
print(seriesCapCntry.loc[['UK','USA']])
print(seriesCapCntry.iloc[0:2])
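The difference between name-based and position-based indexing is easiest to see when the labels are themselves integers. The series stock below is a hypothetical example.

```python
import pandas as pd

stock = pd.Series([50, 60, 70], index=[3, 5, 1])

by_label = stock.loc[1]       # the value whose label is 1
by_position = stock.iloc[1]   # the value at position 1 (the second value)

print(by_label, by_position)
print(stock.loc[[3, 1]])      # several labels at once
print(stock.iloc[0:2])        # positions 0 and 1
```

Here loc[1] and iloc[1] return different values because the label 1 is attached to the value at position 2.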

(B) Slicing:
We can define which part of the series is to be sliced by specifying the start and end parameters [start:end] with the series name.
When we use positional indices for slicing, the value at the end index position is excluded, i.e., only (end - start) data values of the series are extracted.
We can also use slicing to modify the values of series elements. Updating the values in a series using positional slicing also excludes the value at the end index position, but slicing with labels does change the value at the end index label.
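import numpy as np
import pandas as pd
seriesAlph = pd.Series(np.arange(10,16,1), index = ['a', 'b', 'c', 'd', 'e', 'f'])
seriesAlph[1:3] = 50        # positions 1 and 2 ('b' and 'c') are updated
seriesAlph['c':'e'] = 500   # labels 'c', 'd' and 'e' are updated
print(seriesAlph)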
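The same rule applies when slices are only read, not updated. A minimal sketch, using an illustrative series named letters and the explicit iloc and loc indexers:

```python
import pandas as pd

letters = pd.Series([10, 11, 12, 13, 14, 15],
                    index=['a', 'b', 'c', 'd', 'e', 'f'])

by_position = letters.iloc[1:3]     # end position 3 is excluded
by_label = letters.loc['b':'d']     # end label 'd' is included

print(by_position)
print(by_label)
```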

Attributes of Series:

Example:
import pandas as pd
sr = pd.Series(range(1,15,3), index = [x for x in 'abcde'])
print("ATTRIBUTES IN SERIES")
print("Is Series Empty:")
print("sr.empty:", sr.empty)
print()
print("sr.index:", sr.index)
print()
print("sr.values:", sr.values)
print()
print("sr.shape:", sr.shape)
print()
print("sr.size:", sr.size)
print()
print("sr.nbytes:", sr.nbytes)
print()
print("sr.ndim:", sr.ndim)
print()
# item() is a method, not an attribute; without parentheses this prints the bound method
print("sr.item:", sr.item)
print()
print("sr.hasnans:", sr.hasnans)

output:
ATTRIBUTES IN SERIES
Is Series Empty:
sr.empty: False

sr.index: Index(['a', 'b', 'c', 'd', 'e'], dtype='object')

sr.values: [ 1 4 7 10 13]

sr.shape: (5,)

sr.size: 5

sr.nbytes: 40

sr.ndim: 1

sr.item: <bound method IndexOpsMixin.item of a     1
b     4
c     7
d    10
e    13
dtype: int64>

sr.hasnans: False

Retrieving values from a series:-

The values are retrieved using the head(), tail() and count() functions.
Using the index, the values of the series items can also be accessed (retrieved).
head() – returns the first 5 elements of the series.
head(n) – returns the first n elements of the series.
tail() – returns the last 5 elements of the series.
tail(n) – returns the last n elements of the series.
count() – returns the number of non-NaN values in the series.
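These functions can be tried on a small series. The series readings and its values are assumed here purely for illustration.

```python
import numpy as np
import pandas as pd

readings = pd.Series([12, np.nan, 15, 18, np.nan, 21, 24])

print(readings.head())     # first 5 elements
print(readings.head(3))    # first 3 elements
print(readings.tail(2))    # last 2 elements
print(readings.count())    # NaN values are not counted
```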
DATA STRUCTURE IN PANDAS – SERIES [ONE DIMENSION] Operations:
➢The Series structure supports various operations like
✓Basic arithmetic operations [ +, -, *, / ]
✓Vector operations
✓Retrieving values based on conditions
✓Deletion of elements
Basic Arithmetic operations [ +, -, *, / ]
• The operation is done between two series objects as operands.
• Values with the same index label are matched, and the operation is performed on each matched pair.
• If the indexes differ, the result contains the union of the labels of both series, and a label present in only one series gets the value NaN.
Example:
import pandas as pd
sr1 = pd.Series([10,20,30],index = [1,2,3])
sr2 = pd.Series([5,10,15], index=[1,2,3])
sr3 = pd.Series(range(100,150,10), index= [x for x in range(1,6)])
print("Addition:")
print("sr1 + sr2", sr1 + sr2)
print()
print("sr1 + sr3", sr1 + sr3)
print()
print("Subtraction:")
print("sr2 -sr1", sr2 -sr1)
print()
print("Multiplication:")
print("sr1 * sr2", sr1 * sr2)
print()
print("Division:")
print("sr1 / sr2", sr1 / sr2)
print()
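Output (values only):
sr1 + sr2 gives 15, 30, 45
sr1 + sr3 gives 110.0, 130.0, 150.0, NaN, NaN (labels 4 and 5 exist only in sr3)
sr2 - sr1 gives -5, -10, -15
sr1 * sr2 gives 50, 200, 450
sr1 / sr2 gives 2.0, 2.0, 2.0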
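When the two series do not share all their index labels, the add() method with its fill_value parameter is one way to avoid NaN in the result. This is a sketch, with plain and filled as illustrative names; the notes themselves use only the + operator.

```python
import pandas as pd

sr1 = pd.Series([10, 20, 30], index=[1, 2, 3])
sr3 = pd.Series(range(100, 150, 10), index=range(1, 6))

plain = sr1 + sr3                      # labels 4 and 5 become NaN
filled = sr1.add(sr3, fill_value=0)    # a missing value is treated as 0

print(plain)
print(filled)
```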

Vector operations: [ +, -, *, /, <, >, <=, >=, !=, == ]

• The operation is implemented at element level.
• All arithmetic and relational operations can be done.
• One operand is a series object and the other is a numeric / string literal (constant).
Example:
import pandas as pd
s=pd.Series([11,12,13,14])
print(s+2)
print(s>13)
print(s**2)
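Output (values only):
s + 2 gives 13, 14, 15, 16
s > 13 gives False, False, False, True
s ** 2 gives 121, 144, 169, 196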
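Retrieving values based on conditions, listed among the Series operations, builds on a relational vector operation: the True/False series is placed inside the square brackets. The names mask and selected are illustrative.

```python
import pandas as pd

s = pd.Series([11, 12, 13, 14])

mask = s > 12        # a series of True / False values
selected = s[mask]   # keeps only the elements where mask is True

print(mask)
print(selected)
```

The selected elements keep their original index labels.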
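Deleting elements from a series:
➢Using the drop() function, an element can be deleted from a series by passing its index label to the function.
➢Syntax:
seriesname.drop(index, inplace=True/False)
Note: inplace=True removes the element permanently from the series; with inplace=False (the default) drop() returns a new series and leaves the original unchanged.
Example:
import pandas as pd
s=pd.Series([11,12,13,14])
s.drop(2)                 # returns a new series; s itself is unchanged
print(s)                  # still shows 11, 12, 13, 14
s.drop(2,inplace=True)    # removes the element with label 2 from s
print(s)                  # shows 11, 12, 14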
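drop() also accepts a list of labels, and its errors parameter controls what happens when a label is not present. A short sketch, with trimmed and untouched as made-up names:

```python
import pandas as pd

s = pd.Series([11, 12, 13, 14])

# Several labels can be dropped in one call; s is not modified
trimmed = s.drop([0, 3])

# Label 99 does not exist; errors="ignore" skips it instead of raising KeyError
untouched = s.drop(99, errors="ignore")

print(trimmed)
print(untouched)
```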
