Python-Unit 4 Notes

The document outlines a course on Python Programming focusing on data analysis using libraries such as NumPy and Pandas. It covers topics including matrix operations, data structures, and visualization techniques, along with practical examples and case studies. The document also highlights the key features and functionalities of Pandas for data manipulation and analysis.

Uploaded by

THANGA SELVI R


SCHOOL OF COMPUTING

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

Academic Year 2025- 26 : Summer Semester
10211CS213 / PYTHON PROGRAMMING
Faculty Name: Dr. R. Thanga Selvi
Slot: S2-1L14 & S10-1L7
Unit IV Data Analysis using Python libraries 3
NumPy: Introduction, ndarray object, data types, array attributes, indexing and slicing,
array manipulation, mathematical functions, Matplotlib; Pandas: Introduction to pandas data
structures (Series, DataFrame, Panel), basic functions, descriptive statistics functions,
iterating over DataFrames, statistical functions, aggregations, visualization.
Case Study: Sales Forecasting

Python Matrices and NumPy Arrays

A matrix is a two-dimensional data structure in which numbers are arranged in rows and columns.

For example, a matrix with 3 rows and 4 columns is a 3x4 matrix.

Python Matrix

a = [[1, 2, 3], [4, 5, 6]]   # matrix stored as a list of rows
print(a)
print(len(a))      # 2 rows
print(a[1])        # second row
print(a[0][2])     # first row, third column
for i in a:        # i is each row in turn
    print(i)
column = []        # empty list
for row in a:
    column.append(row[2])   # index 2 is the third column
print("3rd column =", column)
o/p:
[[1, 2, 3], [4, 5, 6]]
2
[4, 5, 6]
3
[1, 2, 3]
[4, 5, 6]
3rd column = [3, 6]
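
The column loop above can also be written as a list comprehension. The following is a minimal sketch; get_column and transpose are hypothetical helper names introduced only for illustration.

```python
# Matrix stored as a list of rows (same data as in the example above)
a = [[1, 2, 3], [4, 5, 6]]


def get_column(matrix, index):
    """Return the column at position `index` (0-based) as a list."""
    return [row[index] for row in matrix]


def transpose(matrix):
    """Swap rows and columns; zip(*matrix) groups the i-th item of every row."""
    return [list(col) for col in zip(*matrix)]


third_column = get_column(a, 2)
print("3rd column =", third_column)
print(transpose(a))
```

Both helpers assume every row has the same length, which plain nested lists do not enforce.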

Matrix Addition using Nested Loop

X = [[12, 7],
     [4, 5],
     [7, 8]]

Y = [[5, 8],
     [6, 7],
     [4, 5]]

result = [[0, 0],
          [0, 0],
          [0, 0]]

# iterate through rows
for i in range(len(X)):
    # iterate through columns
    for j in range(len(X[0])):
        result[i][j] = X[i][j] + Y[i][j]

for r in result:
    print(r)
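
The nested loop can be condensed with zip and a list comprehension. This is only a sketch: add_matrices is a hypothetical helper name, and the shape check is an added assumption about how mismatched inputs should be handled.

```python
def add_matrices(x, y):
    """Add two matrices given as lists of rows, element by element."""
    # Assumption: inputs with different shapes are rejected instead of truncated.
    if len(x) != len(y) or any(len(rx) != len(ry) for rx, ry in zip(x, y)):
        raise ValueError("matrices must have the same shape")
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


X = [[12, 7], [4, 5], [7, 8]]
Y = [[5, 8], [6, 7], [4, 5]]

result = add_matrices(X, Y)
for r in result:
    print(r)
```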

However, there is a better way of working with matrices in Python: the NumPy package.

NumPy Array

NumPy is a package for scientific computing that provides a powerful N-dimensional array
object (ndarray).

Create a NumPy array

import numpy as np

A = np.array([[1, 2, 3], [3, 4, 5]])
print(A)

'''
Output:
[[1 2 3]
 [3 4 5]]
'''

y = np.zeros((2, 3))   # 2x3 array filled with zeros (float by default)
print(y)

'''
Output:
[[0. 0. 0.]
 [0. 0. 0.]]
'''

A = np.arange(4)
print('A =', A)

B = np.arange(6).reshape(2, 3)
print('B =', B)

'''
Output:
A = [0 1 2 3]
B = [[0 1 2]
 [3 4 5]]
'''
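
The syllabus lists array attributes and data types, so a short sketch of the most common ones may help here. The variable names Z and F are illustrative only.

```python
import numpy as np

B = np.arange(6).reshape(2, 3)

print(B.ndim)    # number of dimensions (axes)
print(B.shape)   # tuple of (rows, columns)
print(B.size)    # total number of elements
print(B.dtype)   # element type; the default integer width depends on the platform

Z = np.zeros((2, 3))                   # zeros() produces floats by default
F = np.array([1, 2, 3], dtype=float)   # dtype can be requested explicitly
print(Z.dtype, F.dtype)
```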

We use the + operator to add the corresponding elements of two NumPy arrays.

A = np.array([[2, 4], [5, -6]])

B = np.array([[9, -3], [3, 6]])
C = A + B # element wise addition
print(C)

'''
Output:
[[11 1]
[ 8 0]]
'''
Multiplication of Two Matrices
To multiply two matrices, we use the dot() method.

Note: the * operator performs element-wise array multiplication, not matrix multiplication.

A = np.array([[3, 6, 7], [5, -3, 0]])

B = np.array([[1, 1], [2, 1], [3, -3]])
C = A.dot(B)
print(C)

'''
Output:
[[ 36 -12]
[ -1 2]]
'''
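
To make the difference between * and dot() concrete, the sketch below applies both to the same pair of small arrays. P and Q are illustrative names; the @ operator is the standard Python matrix-multiplication operator, which NumPy arrays support.

```python
import numpy as np

P = np.array([[1, 2], [3, 4]])
Q = np.array([[5, 6], [7, 8]])

elementwise = P * Q         # multiplies values at matching positions
matrix_product = P.dot(Q)   # rows of P combined with columns of Q
same_product = P @ Q        # operator form of the matrix product

print(elementwise)
print(matrix_product)
```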

Transpose of a Matrix
We use the transpose() method to compute the transpose of a matrix.

A = np.array([[1, 1], [2, 1], [3, -3]])

print(A.transpose())

'''
Output:
[[ 1 2 3]
[ 1 1 -3]]
'''

Accessing Rows and Columns of a Matrix

import numpy as np

A = np.array([[1, 4, 5, 12],
              [-5, 8, 9, 0],
              [-6, 7, 11, 19]])

print("A[0] =", A[0])       # first row
print("A[:,0] =", A[:,0])   # first column

'''
Output:
A[0] = [ 1  4  5 12]
A[:,0] = [ 1 -5 -6]
'''

Slicing of a Matrix

import numpy as np

letters = np.array([1, 3, 5, 7, 9, 7, 5])

print(letters[5:])   # elements from index 5 to the end; output: [7 5]

A = np.array([[1, 4, 5, 12, 14],
              [-5, 8, 9, 0, 17],
              [-6, 7, 11, 19, 21]])
print(A[:1,])   # first row, all columns

''' Output:
[[ 1  4  5 12 14]]
'''
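
A few more slices of the same 3x5 array, shown as a sketch. The general form is A[row_slice, column_slice], and each slice follows the usual start:stop:step rule with stop excluded.

```python
import numpy as np

A = np.array([[1, 4, 5, 12, 14],
              [-5, 8, 9, 0, 17],
              [-6, 7, 11, 19, 21]])

first_two_rows = A[:2]      # rows 0 and 1, all columns
last_column = A[:, -1]      # every row, last column only
sub = A[:2, 1:4]            # rows 0-1, columns 1-3
every_other = A[:, ::2]     # every row, columns 0, 2 and 4

print(sub)
```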

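Matplotlib

matplotlib.pyplot is a plotting interface used for 2D graphics in Python.

# Importing pyplot
from matplotlib import pyplot as plt

# Plotting to our canvas
plt.plot([1, 2, 3], [4, 5, 1])   # x values, y values

# The title and axis labels must be set before show() is called
plt.title('Epic Info')
plt.ylabel('Y axis')
plt.xlabel('X axis')

# Showing what we plotted
plt.show()

The line width can be set with the linewidth argument, for example plt.plot(x, y, linewidth=5), where x and y are sequences of equal length.
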
Bar Graph

from matplotlib import pyplot as plt
from matplotlib import style

style.use('ggplot')

x = [5,8,10]
y = [12,16,6]

x2 = [6,9,11]
y2 = [6,15,7]

plt.bar(x, y, align='center')

plt.bar(x2, y2, color='g', align='center')

plt.title('Epic Info')
plt.ylabel('Y axis')
plt.xlabel('X axis')

plt.show()

Plotting Data from a File

Contents of example.csv (one x,y pair per line):

1,5
2,7
3,8
4,3
5,5
6,6
7,3
8,7
9,2
10,12
11,5
12,7
13,2
14,6
15,9
16,2

from matplotlib import pyplot as plt

from matplotlib import style
import numpy as np

style.use('ggplot')

x,y = np.loadtxt('example.csv',
unpack=True,
delimiter = ',')

plt.plot(x,y)

plt.title('Epic Info')
plt.ylabel('Y axis')
plt.xlabel('X axis')

plt.show()
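
To see what unpack=True does in np.loadtxt, the sketch below reads the first four rows of the same data from an in-memory string instead of a file. Using io.StringIO is an assumption made to keep the example self-contained; with a real file, the file name is passed as in the code above.

```python
import io

import numpy as np

# First four lines of example.csv, held in memory for the demonstration
csv_text = "1,5\n2,7\n3,8\n4,3\n"

# unpack=True returns one array per column
x, y = np.loadtxt(io.StringIO(csv_text), delimiter=',', unpack=True)

# Without unpack, a single 2D array with one row per line is returned
rows = np.loadtxt(io.StringIO(csv_text), delimiter=',')

print(x)
print(y)
print(rows.shape)
```

The values are read as floats by default, so x holds 1.0, 2.0, 3.0, 4.0 rather than integers.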

Pandas
Pandas is an open-source library that provides high-performance data manipulation in Python.
It is used for data analysis, and its development was started by Wes McKinney in 2008.
Data analysis requires a lot of processing, such as restructuring, cleaning, and merging. Several
tools are available for fast data processing, such as NumPy, SciPy, Cython, and Pandas. We
prefer Pandas because working with it is fast, simple, and more expressive than the other
tools.
Pandas is built on top of the NumPy package, which means NumPy is required for Pandas to work.
Key Features of Pandas:
- Used for reshaping data sets.
- Processes a variety of data sets in different formats, such as matrix data and time series.
- Provides fast performance; to speed it up even further, you can use Cython (a programming
  language in which Python-like code can also be written with C-style syntax).

Pandas Data Structure:

Pandas deals with the following three data structures:

- Series
- DataFrame
- Panel

Note: Panel was deprecated in pandas 0.20 and removed in pandas 0.25, so it is not available in current releases.

Dimension & Description

DataFrame is a container of Series, and Panel is a container of DataFrame.

Data Structure   Dimensions   Description
Series           1            1D labeled homogeneous array, size-immutable.
DataFrame        2            General 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns.
Panel            3            General 3D labeled, size-mutable array.

Series
A Series is a one-dimensional array-like structure with homogeneous data. For example, the following
Series is a collection of integers 10, 23, 56, …
10  23  56  17  52  61  73  90  26  72

DataFrame
A DataFrame is a two-dimensional array with heterogeneous data. For example,
Name    Age   Gender   Rating
Steve   32    Male     3.45
Lia     28    Female   4.6
Vin     45    Male     3.9
Katie   38    Female   2.78
Each column represents an attribute and each row represents a person.

Panel
A Panel is a three-dimensional data structure with heterogeneous data.

Series:
Create a Series from ndarray
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd'])
s = pd.Series(data)
print(s)
Its output is as follows (the default index 0 to 3 is assigned automatically):
0    a
1    b
2    c
3    d

s = pd.Series(data, index=[100, 101, 102, 103])
print(s)
Its output is as follows:
100    a
101    b
102    c
103    d

Create a Series from dict

A dict can be passed as input. If no index is specified, the dictionary keys are used to construct
the index (older pandas versions sorted the keys; current versions keep the insertion order). If an
index is passed, the values in data corresponding to the labels in the index are pulled out.
import pandas as pd
import numpy as np
data = {'a': 0., 'b': 1., 'c': 2.}
s = pd.Series(data)
print(s)
Its output is as follows:
a    0.0
b    1.0
c    2.0

Observe: Dictionary keys are used to construct the index.

import pandas as pd
import numpy as np
data = {'a': 0., 'b': 1., 'c': 2.}
s = pd.Series(data, index=['b', 'c', 'd', 'a'])
print(s)
Its output is as follows:
b    1.0
c    2.0
d    NaN
a    0.0

Observe: The index order is preserved, and the missing element is filled with NaN (Not a Number).
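
Once a Series contains NaN, the missing entries usually need to be detected or handled. The following is a minimal sketch using the same data; the replacement value -1.0 is an arbitrary choice for illustration.

```python
import pandas as pd

data = {'a': 0., 'b': 1., 'c': 2.}
s = pd.Series(data, index=['b', 'c', 'd', 'a'])

missing = s.isna()        # True where the value is NaN
filled = s.fillna(-1.0)   # replace NaN with a chosen value
present = s.dropna()      # keep only the labels that have a value

print(missing)
print(filled)
print(present)
```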
Create a Series from Scalar
The scalar value is repeated to match the length of the index.
import pandas as pd
import numpy as np
s = pd.Series(5, index=[0, 1, 2, 3])
print(s)
Its output is as follows:
0    5
1    5
2    5
3    5

Accessing Data from a Series
import pandas as pd
import numpy as np
x = ['a', 'b', 'c', 'd']
s = pd.Series(x, index=[0, 1, 2, 3])
print(s[1])    # the value with index label 1 is printed
o/p:
b
print(s[:2])   # first two elements
o/p:
0    a
1    b
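
When the index labels are integers that do not start at 0, s[1] can be ambiguous to a reader. The sketch below uses the explicit accessors .loc (by label) and .iloc (by position) on a Series with labels 100 to 103; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'], index=[100, 101, 102, 103])

by_label = s.loc[101]      # look up the index label 101
by_position = s.iloc[1]    # look up the second element, whatever its label
first_two = s.iloc[:2]     # positional slice, stop excluded

print(by_label, by_position)
print(first_two)
```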

DataFrame

Create DataFrame

A pandas DataFrame can be created using various inputs, such as:

- Lists
- dict
- Series
- NumPy ndarrays
- Another DataFrame

Create a DataFrame from Lists

import pandas as pd
data = [1, 2, 3, 4, 5]
df = pd.DataFrame(data)
print(df)
Its output is as follows:
   0
0  1
1  2
2  3
3  4
4  5

import pandas as pd
data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
print(df)
Its output is as follows:
     Name  Age
0    Alex   10
1     Bob   12
2  Clarke   13

DataFrame from Dict of ndarrays
All the ndarrays (or lists) must be of the same length.
import pandas as pd
data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}
df = pd.DataFrame(data)
print(df)
Its output is as follows:
    Name  Age
0    Tom   28
1   Jack   34
2  Steve   29
3  Ricky   42

Observe the values 0, 1, 2, 3. They are the default index, assigned to the rows using range(n). The
columns appear in the order of the dictionary keys (older pandas versions sorted them alphabetically).
import pandas as pd
data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}
df = pd.DataFrame(data, index=['rank1', 'rank2', 'rank3', 'rank4'])
print(df)
Its output is as follows:
        Name  Age
rank1    Tom   28
rank2   Jack   34
rank3  Steve   29
rank4  Ricky   42
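
If a particular column order is wanted, it can be requested explicitly instead of relying on the dictionary. The sketch below passes the columns argument of pd.DataFrame together with a custom index and then reads a single cell with .loc; the variable name jack_age is illustrative.

```python
import pandas as pd

data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}

# columns= selects and orders the dictionary keys used as columns
df = pd.DataFrame(data,
                  columns=['Age', 'Name'],
                  index=['rank1', 'rank2', 'rank3', 'rank4'])

jack_age = df.loc['rank2', 'Age']   # row label first, then column label

print(df)
print(df.shape)    # (number of rows, number of columns)
print(jack_age)
```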

Create a DataFrame from List of Dicts

A list of dictionaries can be passed as input data to create a DataFrame. The dictionary keys are
taken as column names by default.

import pandas as pd
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data)
print(df)
Its output is as follows:
   a   b     c
0  1   2   NaN
1  5  10  20.0

Observe that NaN (Not a Number) is filled in where a value is missing.

The following example shows how to create a DataFrame by passing a list of dictionaries and the row
indices.
import pandas as pd
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data, index=['first', 'second'])
print(df)
Its output is as follows:
        a   b     c
first   1   2   NaN
second  5  10  20.0

Create a DataFrame from Dict of Series

A dictionary of Series can be passed to form a DataFrame. The resultant index is the union of all the
Series indexes passed.
import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(d)
print(df)
Its output is as follows:
   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4
Note: For the Series one, no label 'd' is passed, so in the result the value for label d in column
one is NaN.
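
The output above also shows a side effect of alignment: column one is printed as floats because NaN is a floating-point value, while column two stays integer. The sketch below checks this; the variable names are illustrative.

```python
import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

index_labels = list(df.index)                   # union of both indexes
one_is_float = df['one'].dtype.kind == 'f'      # converted to float to hold NaN
two_is_int = df['two'].dtype.kind == 'i'        # no missing values, stays integer
missing_count = int(df['one'].isna().sum())     # number of NaN entries in 'one'

print(index_labels, one_is_float, two_is_int, missing_count)
```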

Column Selection
import pandas as pd
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
print(df['one'])
Its output is as follows:
a    1.0
b    2.0
c    3.0
d    NaN

Column Addition
df['three'] = df['one'] + df['two']
print(df)
Its output is as follows:
   one  two  three
a  1.0    1    2.0
b  2.0    2    4.0
c  3.0    3    6.0
d  NaN    4    NaN

Column Deletion
del df['one']
# using pop function
df.pop('two')

Row Selection, Addition, and Deletion
The selection examples below use the DataFrame df with the columns one and two, as created under Column Selection.

Selection
print(df.loc['b'])
Its output is as follows:
one    2.0
two    2.0

Slice Rows
print(df[2:4])
Its output is as follows:
   one  two
c  3.0    3
d  NaN    4

Addition of Rows
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
df = pd.concat([df, df2])   # DataFrame.append was removed in pandas 2.0
print(df)
Its output is as follows:
   a  b
0  1  2
1  3  4
0  5  6
1  7  8

Deletion of Rows
df = df.drop(0)
print(df)
Its output is as follows:
   a  b
1  3  4
1  7  8
Two rows are dropped because both carry the index label 0.
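
DataFrame.append was removed in pandas 2.0, so on current releases rows are added with pd.concat. The sketch below also shows how ignore_index=True avoids duplicate labels, which changes what drop(0) removes. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])

stacked = pd.concat([df, df2])                        # keeps labels 0, 1, 0, 1
renumbered = pd.concat([df, df2], ignore_index=True)  # new labels 0, 1, 2, 3

dropped_both = stacked.drop(0)     # removes every row labelled 0
dropped_one = renumbered.drop(0)   # removes only the first row

print(dropped_both)
print(dropped_one)
```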
