Data Visualization with Python
Data visualization is the process of presenting data in the form of graphs or charts. It
makes large and complex data sets easier to understand, helps decision-makers act on the
data efficiently, and helps them identify new trends and patterns. It is also used in
high-level data analysis for Machine Learning and Exploratory Data Analysis (EDA). Data
visualization can be done with various tools such as Tableau, Power BI, and Python.
Python provides several libraries for visualizing data. Each comes with its own features
and supports various types of graphs. Commonly used packages include Matplotlib,
Seaborn (statistical plots), Bokeh (interactive, browser-based plots), and geoplotlib
(geographical data).
Matplotlib
Matplotlib is a low-level Python library used for data visualization. It is easy
to use and emulates MATLAB-like graphs and visualizations. The library is built on
top of NumPy arrays and provides several plot types, such as line charts, bar charts,
and histograms. It offers a lot of flexibility, but at the cost of writing more code.
To install Matplotlib, type the following command in the terminal:
pip install matplotlib
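A quick way to confirm that the installation worked is to import the package and print its version. This is only a minimal check, not part of the original notes.

```python
import matplotlib

# If the import succeeds, Matplotlib is installed; the version string
# shows which release pip selected.
print(matplotlib.__version__)
```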
Pyplot
Pyplot is a Matplotlib module that provides a MATLAB-like interface. Matplotlib is
designed to be as usable as MATLAB, with the ability to use Python and the advantage
of being free and open-source. Each pyplot function makes some change to a figure: e.g.,
creates a figure, creates a plotting area in a figure, plots some lines in a plotting area,
decorates the plot with labels, etc. The plot types available through Pyplot include Line
Plot, Histogram, Scatter, 3D Plot, Image, Contour, and Polar.
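The way each pyplot call changes the current figure can be sketched as follows. The data values, labels, and output file name are made up for illustration, and the "Agg" backend is an assumption so that the sketch also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a GUI window
import matplotlib.pyplot as plt

fig = plt.figure()                 # creates a figure
ax = plt.subplot(1, 1, 1)          # creates a plotting area (axes) in the figure
plt.plot([0, 1, 2], [0, 1, 4])     # plots a line in the current plotting area
plt.xlabel("x")                    # decorates the plot with labels
plt.ylabel("y")
plt.title("Pyplot state sketch")
plt.savefig("pyplot_state.png")    # hypothetical file name; plt.show() works interactively
```

Every call after plt.figure() acts on the "current" figure and axes, which is what gives pyplot its MATLAB-like feel.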
Line chart in Matplotlib – Python:
Matplotlib is a data visualization library in Python. Its pyplot submodule is a collection of
functions that help in creating a variety of charts.
Line charts are used to represent the relation between two variables X and Y, each on its own axis. Here we will see
some examples of line charts in Python:
Simple line plots
First import the matplotlib.pyplot library for plotting functions. Optionally, import the
NumPy library if it is needed.
Then define the data values x and y.
import matplotlib.pyplot as plt   # importing the plotting library
plt.plot([1, 2, 3])               # plotting the data set; x defaults to 0, 1, 2
plt.ylabel("y axis")              # labelling y axis
plt.xlabel("x axis")              # labelling x axis
plt.show()                        # displaying the plot
Output: a straight line through the points (0, 1), (1, 2), and (2, 3), with both axes labelled.
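When both x and y are defined explicitly, they are passed to the plotting function together. The sketch below uses NumPy to build the values; the specific numbers and the file name are illustrative assumptions, and the "Agg" backend is chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a GUI window
import matplotlib.pyplot as plt
import numpy as np

plt.figure()
x = np.arange(1, 6)          # x values: 1, 2, 3, 4, 5
y = x ** 2                   # y values: 1, 4, 9, 16, 25
(line,) = plt.plot(x, y)     # plot y against x; plot() returns a list of lines
plt.xlabel("x axis")
plt.ylabel("y axis")
plt.savefig("simple_line.png")  # hypothetical file name; plt.show() works interactively
```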
import matplotlib.pyplot as plt
plt.title('Line graph')
pt = [10, 20, 30]
plt.plot(pt, linestyle='dotted', color='r', linewidth=3, marker='o')
plt.show()
Any color can be used for the line: a hexadecimal color value (such as #081f00), a color name such as 'red',
or a one-letter abbreviation such as 'r'.
Markers highlight the individual data points. Examples: 'o' circle, '*' star, ',' pixel, 'x' X, '+' plus, 's' square.
Different line styles can be used to plot a line. Some of them are listed in the table below.
Line style           Short form
'solid' (default)    '-'
'dotted'             ':'
'dashed'             '--'
'dashdot'            '-.'
'None'               '' or ' '
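The color, marker, and line style options described above can be combined in one figure. The sketch below draws one line per named style; the data, the pairing of styles with markers and colors, and the file name are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a GUI window
import matplotlib.pyplot as plt

plt.figure()
styles = ["solid", "dotted", "dashed", "dashdot"]
markers = ["o", "*", "s", "+"]
colors = ["r", "green", "#081f00", "blue"]   # abbreviation, name, hex value, name

x = [0, 1, 2]
for offset, (style, marker, color) in enumerate(zip(styles, markers, colors)):
    y = [value + offset for value in x]      # shift each line up so they do not overlap
    plt.plot(x, y, linestyle=style, marker=marker, color=color, label=style)

plt.legend()                                 # uses the label given to each line
plt.savefig("line_styles.png")               # hypothetical file name
```

Matplotlib stores each named style under its short form, so 'dotted' and ':' produce the same line.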
Bar Plot in Matplotlib:
A bar plot or bar chart is a graph that represents categories of data with rectangular
bars whose lengths or heights are proportional to the values they represent. Bar plots
can be drawn horizontally or vertically. A bar chart shows comparisons
between discrete categories. One axis of the plot represents the specific
categories being compared, while the other axis represents the measured values
corresponding to those categories.
Creating a bar plot
The matplotlib API in Python provides the bar() function for creating bar plots.
The syntax of the bar() function is as follows:
plt.bar(x, height, width, bottom, align)
This call draws one rectangular bar per entry of x, sized and positioned according to the given parameters.
Here x and height are required, while width, bottom, and align are optional when creating a simple bar chart.
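The effect of the optional parameters can be sketched as follows. The numeric values and file name are made up; the point is only to show how width, bottom, and align change the rectangles that bar() returns.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a GUI window
import matplotlib.pyplot as plt

plt.figure()
x = [0, 1, 2]
height = [10, 20, 30]

# width:  thickness of each bar along the x axis
# bottom: y coordinate at which each bar starts
# align:  "edge" puts the left edge of the bar at x instead of centering it
bars = plt.bar(x, height, width=0.5, bottom=5, align="edge")

first = bars[0]
print(first.get_x(), first.get_y(), first.get_width(), first.get_height())
plt.savefig("bar_options.png")  # hypothetical file name
```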
Example:
import matplotlib.pyplot as plt              # importing library
plt.title('Bar graph')                       # creating a title
x = ["c", "c++", "java", "python"]           # creating datasets as x and y
y = [10, 20, 30, 40]
plt.xlabel("courses")                        # labelling x axis
plt.ylabel("enrolled students number")       # labelling y axis
plt.bar(x, y, color='r')                     # plotting bar graph for the datasets in red
plt.show()                                   # displaying bar graph
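Since bar plots can also be drawn horizontally, the same data can be passed to pyplot's barh() function. This is a sketch reusing the course data from the example above; the "Agg" backend and the file name are assumptions so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a GUI window
import matplotlib.pyplot as plt

plt.figure()
courses = ["c", "c++", "java", "python"]
students = [10, 20, 30, 40]

plt.title("Horizontal bar graph")
plt.xlabel("enrolled students number")   # the measured values now run along the x axis
plt.ylabel("courses")                    # the categories now run along the y axis
bars = plt.barh(courses, students, color="r")
plt.savefig("bar_horizontal.png")        # hypothetical file name
```

With barh(), the value of each category is the width of its bar rather than its height.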
Plotting Histogram in Python using Matplotlib:
A histogram represents data that has been grouped into ranges.
It is an accurate method for the graphical representation of a numerical data distribution.
It is a type of bar plot where the X-axis represents the bin ranges while the Y-axis gives
information about frequency.
Creating a Histogram
To create a histogram, the first step is to create bins of the ranges: divide the
whole range of values into a series of intervals, then count the values that fall into
each interval. Bins are consecutive, non-overlapping intervals of the variable.
The matplotlib.pyplot.hist() function is used to compute and
create the histogram of x.
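The binning and counting step can be sketched on its own, without drawing anything, using NumPy's histogram() function. The sample values and bin edges below are illustrative assumptions.

```python
import numpy as np

data = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]
edges = [0, 25, 50, 75, 100]      # four consecutive, non-overlapping bins

# counts[i] is the number of values falling between edges[i] and edges[i + 1]
counts, edges_out = np.histogram(data, bins=edges)

for low, high, count in zip(edges_out[:-1], edges_out[1:], counts):
    print(f"{low}-{high}: {count}")
```

Every value lands in exactly one bin, so the counts add up to the number of data points.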
The following table shows the parameters accepted by the matplotlib.pyplot.hist() function:
Parameter    Description
x            array or sequence of arrays
bins         optional parameter; an integer, a sequence, or a string
density      optional parameter; a boolean value
range        optional parameter; the lower and upper range of the bins
histtype     optional parameter; the type of histogram to draw [bar, barstacked, step, stepfilled], default is "bar"
align        optional parameter; controls the plotting of the histogram [left, right, mid]
weights      optional parameter; an array of weights having the same dimensions as x
bottom       location of the baseline of each bin
rwidth       optional parameter; the relative width of the bars with respect to the bin width
color        optional parameter; a color or sequence of color specs
label        optional parameter; a string or sequence of strings to match with multiple datasets
log          optional parameter; sets the histogram axis on a log scale
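A few of the parameters in the table can be combined as in the sketch below. The data, the chosen values for bins, range, rwidth, color, and label, and the file name are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a GUI window
import matplotlib.pyplot as plt

plt.figure()
data = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]

# bins=4 with range=(0, 100) splits 0..100 into four equal-width bins;
# rwidth=0.8 draws each bar at 80% of its bin width, leaving visible gaps.
n, bins, patches = plt.hist(
    data, bins=4, range=(0, 100), rwidth=0.8,
    color="green", label="sample", align="mid",
)
plt.legend()                        # shows the text passed as label
plt.savefig("hist_options.png")     # hypothetical file name
```

hist() returns the counts per bin, the bin edges, and the drawn bar objects, which is useful when the numbers are needed as well as the picture.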
Example:
from matplotlib import pyplot as plt
# Creating dataset
a = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]
# Creating histogram with explicit bin edges, normalized to unit area
plt.hist(a, bins=[0, 25, 50, 75, 100], histtype='stepfilled', density=True)
# Show plot
plt.show()
Output: a step-filled histogram with four bins of width 25, scaled so that the total area is 1.
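The effect of the density option can be checked numerically from the values that hist() returns. The sketch below reuses the dataset from the example; the variable names and file name are illustrative, and the "Agg" backend is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a GUI window
import matplotlib.pyplot as plt
import numpy as np

plt.figure()
a = [22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27]

# With density=True each bar height is count / (number of values * bin width),
# so the bar areas (height * width) sum to 1.
heights, edges, _ = plt.hist(a, bins=[0, 25, 50, 75, 100], density=True)
widths = np.diff(edges)
total_area = float(np.sum(heights * widths))
print(heights, total_area)
plt.savefig("hist_density.png")  # hypothetical file name
```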
Pie Chart in Python using Matplotlib:
A pie chart is a circular statistical plot that can display only one series of data. The
area of the whole chart represents the total (100%) of the given data. The area of each slice
represents that part's percentage of the data. The slices of the pie are called wedges.
The area of a wedge is determined by the length of its arc, so it reflects
the relative share of that part with respect to the whole data. Pie
charts are commonly used in business presentations (sales, operations, survey
results, resources, etc.) because they provide a quick summary.
Creating Pie Chart
The Matplotlib API has a pie() function in its pyplot module, which creates a pie chart
representing the data in an array.
Syntax: matplotlib.pyplot.pie(data, explode=None, labels=None, colors=None,
autopct=None, shadow=False)
Parameters:
data is the array of data values to be plotted. The fractional area of each slice
is data/sum(data). Recent Matplotlib versions normalize the values this way by default; passing
normalize=False with values whose sum is at most 1 uses them directly as fractional areas.
explode is an array giving, for each wedge, the fraction of the radius by which to offset it from the center.
labels is a sequence of strings that sets the label of each wedge.
colors is a sequence of colors applied to the wedges.
autopct is a format string used to label each wedge with its percentage value.
shadow draws a shadow beneath the wedges when set to True.
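How the data values turn into wedge fractions and percentage labels can be sketched as follows. The data and labels are sample values, the "%.1f%%" format string is just one possible choice for autopct, and the file name is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a GUI window
import matplotlib.pyplot as plt

plt.figure()
data = [3, 8, 1, 10]
names = ["101", "102", "103", "104"]

# Fractional area of each slice: value / sum of all values
fractions = [value / sum(data) for value in data]

# autopct formats each slice's percentage; "%%" prints a literal percent sign.
wedges, texts, autotexts = plt.pie(data, labels=names, autopct="%.1f%%")
percent_labels = [t.get_text() for t in autotexts]
print(fractions, percent_labels)
plt.savefig("pie_fractions.png")  # hypothetical file name
```

When autopct is given, pie() returns a third list holding the percentage text objects in addition to the wedges and the label texts.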
Example:
import matplotlib.pyplot as plt
plt.title('pie chart')
x = [3, 8, 1, 10]
ids = ["101", "102", "103", "104"]          # wedge labels
c = ["red", "yellow", "green", "blue"]      # wedge colors
a = [0.1, 0.2, 0.3, 0.4]                    # offset of each wedge from the center
plt.pie(x, labels=ids, colors=c, explode=a, autopct="%.5f", shadow=True)
plt.show()
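Output: a pie chart with four offset wedges labelled 101 to 104, each annotated with its percentage to five decimal places.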
Simple pie chart example.
import matplotlib.pyplot as plt
plt.title('pie chart')
x = [3, 8, 1, 10]
ids = ["101", "102", "103", "104"]          # wedge labels
c = ["red", "yellow", "green", "blue"]      # wedge colors
plt.pie(x, labels=ids, colors=c)
plt.legend(title='areas')                   # legend built from the wedge labels
plt.show()
Output: a pie chart with four wedges labelled 101 to 104 and a legend titled "areas".