Unit 1: Data Handling using
Pandas and Data Visualization
(25 marks)
Chapter 2: Data Visualisation
Designed by: Umesh Pun (PGT IP) APS Yol Cantt
REVISED
Introduction
Data visualization is the graphical or visual
representation of information and data using
visual elements such as charts, graphs and maps.
Data visualization is immensely useful in decision
making.
Using Pyplot of Matplotlib Library
For data visualization in Python, the Pyplot interface of
the matplotlib library (package) is used.
matplotlib is a high-quality plotting library for Python
that provides a quick way to visualize data and to
produce publication-quality figures in many
formats.
Pyplot is a collection of functions within matplotlib
that allows the user to construct 2D plots easily and
interactively.
Installing and importing matplotlib
Install matplotlib from the command prompt:
pip install matplotlib
Importing PyPlot
import matplotlib.pyplot
This requires you to refer to every command of pyplot as
matplotlib.pyplot.<command>
import matplotlib.pyplot as plt
This lets you refer to every command of pyplot as
plt.<command>
Working with Pyplot methods
The PyPlot interface provides many methods for 2D
plotting of data.
The Pyplot interface lets you plot data in multiple
ways, such as line charts, bar charts, pie charts and
scatter charts.
You can easily plot data available in the form of
NumPy arrays (ndarrays), dataframes, etc.
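As a minimal sketch of the last point, the same plot() call accepts both NumPy arrays and the columns of a pandas DataFrame. The data values and the column names week and price are made up for illustration, and the Agg backend is chosen only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Data held in NumPy arrays (illustrative values)
week = np.array([1, 2, 3, 4])
price = np.array([40, 100, 60, 80])
plt.plot(week, price, label="from ndarray")

# Similar data held in a DataFrame; its columns are passed just like arrays
df = pd.DataFrame({"week": week, "price": price + 10})
plt.plot(df["week"], df["price"], label="from DataFrame")

plt.legend()
plt.savefig("array_vs_dataframe.png")
```

In an interactive session you would normally drop the matplotlib.use("Agg") line and call plt.show() instead of saving a file.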
Matplotlib - PyPlot features
The matplotlib library provides the following features for data
visualization:
• Drawing: Plots can be drawn from the data passed to
specific functions.
• Customization: Plots can be customized through the
arguments of the plotting functions, for example
color, line style (dashed, dotted) and width.
Labels, a title and a legend can also be added.
• Saving: After drawing and customization, plots can be
saved for future use.
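The three features can be sketched together in one short script. The month and sales values and the file name sales.png are illustrative assumptions; the Agg backend is used only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

month = [1, 2, 3, 4, 5]          # illustrative data
sales = [12, 18, 15, 22, 30]

# Drawing and customization: color, line style and width are arguments of plot()
lines = plt.plot(month, sales, color="green", linestyle="dashed",
                 linewidth=2.5, label="Sales")

# Labels, title and legend are added with separate calls
plt.xlabel("Month")
plt.ylabel("Units sold")
plt.title("Monthly sales")
plt.legend(loc="upper left")

# Saving: the file type is taken from the file extension
plt.savefig("sales.png")
```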
Steps to plot in matplotlib
1. Install matplotlib by running the pip command
pip install matplotlib at the command prompt.
2. Create a .py file and import the matplotlib library in it
using the statement import matplotlib.pyplot as plt
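3. Pass the data points to the plot() function of pyplot (plt)
4. Customize the plot by changing different parameters
5. Save the plot/graph with savefig(), if required (do this
before show(), otherwise the saved file may be blank)
6. Call the show() function to display the plot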
Basics of Simple Plotting
Data visualization essentially means graphical
representation of compiled data.
Graphs and Charts are effective tools for data
visualization.
You can create many different types of graphs and
charts using PyPlot.
Basic nomenclature of a Plot
Pyplot provides a convenient interface to the plotting
library in matplotlib.
With it, figures and axes are created implicitly and
automatically to achieve the desired plot.
A matplotlib figure can be broken down into several
parts:
1. Figure: It is the canvas that contains the plot. It may
contain one or more axes (plots).
2. Axes: An Axes is a single plot region; a figure can
contain many axes. Each Axes contains two (or, for 3D
plots, three) Axis objects and has a title, an x-label
and a y-label.
3. Axis: These are number-line-like objects that take
care of setting the graph limits and generating the ticks.
4. Artist: Everything that one can see on the
figure is an artist, such as Text objects, Line2D objects
and collection objects.
5. Labels: They specify what kind of data is plotted
along each axis.
6. Title: The title of the chart describes what the chart shows.
7. Legend: It explains what each line
means in the current figure.
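The parts listed above can be inspected directly. The sketch below is an illustration only: it uses plt.subplots() to create the Figure and Axes explicitly (rather than letting Pyplot create them implicitly), the data and label text are made up, and the Agg backend is chosen so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

# One Figure (canvas) holding two Axes (plots) side by side
fig, (ax1, ax2) = plt.subplots(1, 2)

line, = ax1.plot([1, 2, 3], [2, 4, 8], label="growth")  # plot() returns Line2D artists
ax1.set_title("Left plot")    # title of this Axes
ax1.set_xlabel("x value")     # labels describe the data along each axis
ax1.set_ylabel("y value")
ax1.legend()                  # legend explains what the line means

print(len(fig.axes))               # number of Axes in the Figure
print(type(ax1.xaxis).__name__)    # each Axes owns an x Axis and a y Axis
print(type(line).__name__)         # the plotted line is an Artist
```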
Basics of Simple Plotting
Some common chart types are:
1. Line chart
It is a type of chart which displays
information as a series of data points called ‘markers’
connected by straight line segments.
With Pyplot, a line chart is created using plot()
function.
Example (Line chart)
import matplotlib.pyplot as plt
week=[1,2,3,4]
prices=[40, 100, 60, 80]
plt.plot(week, prices)
plt.xlabel("Week")
plt.ylabel("Onion prices (Rs.)")
plt.show()
Example (Line chart)
import matplotlib.pyplot as plt
overs=[0, 10, 20, 30, 40, 50]
ind =[0, 90, 190, 280, 350, 394]
aus=[0, 100, 200, 290, 340, 370]
plt.plot(overs, ind, 'b', linewidth=2)
plt.plot(overs, aus, 'r', linewidth=2, linestyle='dashed')
plt.xlabel("Overs")
plt.ylabel("Runs scored")
plt.title("T50 World Cup")
plt.show()
Example (Line chart)
import matplotlib.pyplot as plt
year=[2015,2016,2017,2018,2019,2020]
x=[90,92,94,95,97,85]
xii=[89,90,93,85,95,98]
plt.plot(year,x,color='g')
plt.plot(year,xii,color='orange')
plt.xlabel('Year')
plt.ylabel('Pass percentage')
plt.title('Pass% till 2020')
plt.legend(('x','xii'), loc='upper right')
plt.savefig('line_plot.pdf')  # save before show(); afterwards the figure may be blank
plt.show()
2. Bar chart
It is a type of chart that presents categorical
data with rectangular bars whose heights or lengths are
proportional to the values that they represent.
The bars can be plotted vertically or horizontally.
With Pyplot, a bar chart is created using bar() or
barh() functions.
Example (Bar chart)
import matplotlib.pyplot as plt
week=[1,2,3,4]
prices=[40, 100, 60, 80]
plt.bar(week, prices)
plt.title("Price per Week")
plt.xlabel("Week")
plt.ylabel("Onion prices (Rs.)")
plt.show()
Example (Bar chart)
import matplotlib.pyplot as plt
medal=['Gold', 'Silver', 'Bronze']
india=[4,10,35]
plt.bar(medal, india, color=['r', 'b', 'g'])
plt.title("Olympics")
plt.xlabel("Medal type")
plt.ylabel("Medal count")
plt.show()
Example (Bar chart)
import matplotlib.pyplot as plt
medal=['Gold', 'Silver', 'Bronze']
india=[4,10,35]
plt.barh(medal, india, color=['gold', 'silver', 'brown'])
plt.title("Olympics")
plt.xlabel("Medal count")  # barh() puts the values on the x-axis
plt.ylabel("Medal type")
plt.show()
Example (Bar chart)
import matplotlib.pyplot as plt
medal=['Gold', 'Silver', 'Bronze']
india=[4,10,35]
china=[2,1,10]
plt.bar(medal, india, label='India')
plt.bar(medal, china, label='China')  # drawn over India's bars at the same positions
plt.legend()
plt.title("Olympics")
plt.xlabel("Medal type")
plt.ylabel("Medal count")
plt.show()
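The example above draws both countries' bars at the same x positions, so the second set is drawn on top of the first. One common way to place the bars side by side is to use numeric positions and shift each set by half a bar width. This is a sketch of that idea; the width of 0.4 and the output file name are assumptions, and the Agg backend is used only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

medal = ['Gold', 'Silver', 'Bronze']
india = [4, 10, 35]
china = [2, 1, 10]

pos = np.arange(len(medal))   # numeric bar positions 0, 1, 2
width = 0.4                   # assumed bar width, so two bars fit per medal

plt.bar(pos - width / 2, india, width, label='India')  # shifted left
plt.bar(pos + width / 2, china, width, label='China')  # shifted right
plt.xticks(pos, medal)        # show medal names instead of 0, 1, 2
plt.title("Olympics")
plt.xlabel("Medal type")
plt.ylabel("Medal count")
plt.legend()
plt.savefig("medals_grouped.png")
```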
Example (Bar chart)
import matplotlib.pyplot as plt
import numpy as np
label = ['Anil', 'Vikas', 'Dharma', 'Mahen', 'Manish', 'Rajesh']
per = [94,85,95,85,77,80]
index = np.arange(len(label))
plt.bar(index, per, label='Percentage')
plt.xlabel('Student Name', fontsize=5)
plt.ylabel('Percentage', fontsize=5)
plt.xticks(index, label, fontsize=5, rotation=30)
plt.title('Percentage of Marks Achieved by Class XII Students')
plt.legend(loc='upper right')
plt.show()
3. Histogram plot
It is a type of graph that provides a visual
interpretation of numerical data by showing the
number of data points that lie within each range of
values (called a “bin”).
It is similar to a vertical bar graph but without gaps
between the bars.
With Pyplot, a histogram is created using hist()
function.
Example (Histogram chart)
import matplotlib.pyplot as plt
x = [1,1,2,3,3,5,7,8,9,10,
10,11,11,13,13,15,16,17,18,18,
18,19,20,21,21,23,24,24,25,25,
25,25,26,26,26,27,27,27,27,27,
29,30,30,31,33,34,34,34,35,36,
36,37,37,38,38,39,40,41,41,42,
43,44,45,45,46,47,48,48,49,50,
51,52,53,54,55,55,56,57,58,60,
61,63,64,65,66,68,70,71,72,74,
75,77,81,83,84,87,89,90,90,91]
plt.hist(x, bins=10, edgecolor="red")
plt.show()
Example (Histogram chart)
import matplotlib.pyplot as plt
x=[5,15,25,35,45,55,55,65,75]
plt.hist(x, bins=3,edgecolor="red")
plt.show()
Example (Histogram chart)
import matplotlib.pyplot as plt
x=[5,15,25,25,55,55,55,65,75]
plt.hist(x, bins=3,edgecolor="red")
plt.show()
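To see why the last two histograms look different, it helps to print what hist() computes. The sketch below reuses the two lists from the examples above; the variable names a and b are chosen here for illustration, and the Agg backend is used only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

a = [5, 15, 25, 35, 45, 55, 55, 65, 75]
b = [5, 15, 25, 25, 55, 55, 55, 65, 75]

# hist() returns the count in each bin, the bin edges, and the drawn bars
counts_a, edges_a, _ = plt.hist(a, bins=3, edgecolor="red")
plt.figure()  # start a new figure for the second list
counts_b, edges_b, _ = plt.hist(b, bins=3, edgecolor="red")

print(edges_a)   # 3 equal-width bins between the minimum (5) and the maximum (75)
print(counts_a)  # number of values per bin for the first list
print(counts_b)  # number of values per bin for the second list
```

With bins=3 the range 5 to 75 is split at roughly 28.3 and 51.7. The first list gives counts of 3, 2 and 4, while the second gives 4, 0 and 5, because none of its values fall in the middle bin.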