
Data Visualization

Uploaded by

Syed Rafiq
DATA VISUALIZATION

Data visualization means the graphical or pictorial representation of data
using graphs, charts, etc. The purpose of plotting data is to visualize
variation or show relationships between variables.

Plotting using Matplotlib: The Matplotlib library is used for creating static,
animated, and interactive 2D plots or figures in Python. It can be installed
using the following pip command from the command prompt:
C:\> pip install matplotlib
For plotting using Matplotlib, we need to import its pyplot module using the
following command:
>>> import matplotlib.pyplot as plt
The pyplot module of Matplotlib contains a collection of functions that can
be used to work on a plot. The plot() function of the pyplot module is used to
create a figure.

A figure contains a plotting area, legend, axis labels, ticks, title, etc. To plot x
versus y, we can write plt.plot(x, y). The show() function is used to display the
figure created using the plot() function. By default, plot() draws a line
chart. A figure can also be saved by using the savefig() function.
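As a minimal sketch of the plot() and savefig() steps described above, the following script draws a line chart and writes it to a file. The data values, the file name squares.png and the choice of the temporary directory are assumptions made for illustration. The Agg backend is selected only so that the script runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]       # illustrative values
y = [1, 4, 9, 16]

# plot() draws a line chart by default and returns the line objects it created
lines = plt.plot(x, y)

# Hypothetical output location: a file in the system's temporary directory
out_path = os.path.join(tempfile.gettempdir(), "squares.png")
plt.savefig(out_path)  # write the current figure to disk
plt.close()
```

In an interactive session, the matplotlib.use("Agg") line can be dropped and plt.show() called before plt.close() to display the figure on screen.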

Components of a plot
LINE CHART
Use when: You want to show trends over time (e.g., days, months, years).
Best for: Stock prices over days; temperature change over a week; student
marks growth in tests.
Key Feature: Points are connected by lines to show movement or progress.

BAR CHART
Use when: You want to compare categories or individual items.
Best for: Comparing marks of students; sales of different products; number of
boys and girls in a class.
Key Feature: Bars are separate (no touching); each bar represents a different
category.

HISTOGRAM
Use when: You want to show the distribution of data across intervals (bins).
Best for: Frequency of student scores (0–10, 11–20, etc.); age group
distribution; rainfall in ranges.
Key Feature: Bars are touching, and each represents a range of values, not a
single category.

Customisation of Plots: The pyplot module gives us numerous functions which
can be used to customise charts, such as adding titles or legends.

Marker: A marker is any symbol that represents a data value in a line chart.

Color: It is also possible to format the plot further by changing the colour of
the plotted data. We can use either character codes or colour names as
values for the color parameter of plot().

Linewidth and Line Style: The linewidth and linestyle parameters can be
used to change the width and the style of the line in a line chart. The
linestyle parameter can take a string such as "solid", "dotted", "dashed" or
"dashdot".

Python Programme using customisation of plots: marker and colour

import matplotlib.pyplot as plt
date = ["25/12", "26/12", "27/12"]
temp = [8.5, 10.5, 6.8]
plt.grid(True, color='gray', linewidth=0.7)
plt.plot(date, temp, label="Temperature", marker='o')
plt.xlabel("Date")
plt.ylabel("Temperature")
plt.title("Date wise Temperature")
plt.yticks(temp)
plt.legend(loc='upper right')
plt.tight_layout()
plt.savefig("d:\\ex.png")  # Saves the figure as d:\ex.png
plt.show()
Programme using Linestyle & linewidth using DataFrame
import matplotlib.pyplot as plt
import pandas as pd
height=[121.9,124.5,129.5,134.6,139.7,147.3,152.4,157.5,162.6]
weight=[19.7,21.3,23.5,25.9,28.5,32.1,35.7,39.6,43.2]
df=pd.DataFrame({"height":height,"weight":weight})
plt.xlabel('Weight in kg')
plt.ylabel('Height in cm')
plt.title('Average weight with respect to average height')
# plot using marker '*', green line colour and a dash-dot line style
plt.plot(df.weight, df.height, marker='*', markersize=10, color='green',
         linewidth=2, linestyle='dashdot')
plt.show()
The Pandas Plot function (Pandas Visualisation): The plot() method of a
Pandas DataFrame accepts a considerable number of arguments that can be used
to plot a variety of graphs.
It allows customising different plot types by supplying the kind keyword
argument.
Syntax: df.plot(kind='...'), where kind accepts a string indicating the type of
plot, such as 'line', 'bar', 'barh' or 'hist'.
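The effect of the kind argument can be sketched with a small DataFrame built in memory. The column names and sales figures below are made up for illustration and do not come from any file used in these notes. The Agg backend is an assumption so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sales figures, for illustration only
df = pd.DataFrame({
    "Day": ["Mon", "Tue", "Wed"],
    "Week 1": [5000, 5900, 6500],
    "Week 2": [4000, 3000, 5000],
})

# Same data, two chart types: only the kind argument changes
line_ax = df.plot(kind="line", x="Day")  # one line per numeric column
bar_ax = df.plot(kind="bar", x="Day")    # one group of bars per day

plt.close("all")
```

Each call to df.plot() returns the Matplotlib Axes it drew on, so pyplot functions such as plt.title() or plt.xlabel() can still be used afterwards for further customisation.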

Plotting a Line chart: A line plot is a graph that shows the frequency of
data along a number line. It is used to show a continuous dataset, and to
visualise growth or decline in data over a time interval.
Python Programme
Smile NGO has participated in a three-week cultural mela. Using Pandas,
they have stored the sales (in Rs) made day-wise for every week in a CSV
file named “MelaSales.csv”. The following program plots these sales as a
line chart.
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv("D:\\MelaSales.csv")
#one line per week, each in its own colour
df.plot(kind='line', color=['r','b','brown'])
plt.title('Mela Sales Report')
plt.xlabel('Days')
plt.ylabel('Sales in Rs')
#Save the figure, then display it
plt.savefig("d:\\ex3.png")
plt.show()
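Customising Line Plot: We can substitute the ticks on the x axis with a list of
values of our choice by using plt.xticks(ticks, labels).
Program 4-5 Assuming the same sales data stored in mela2.csv with a Day
column, plot the line chart with the following customisations: a '*' marker of
size 10, a dashed line of width 3, and the day names as x-axis tick labels.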
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv("D:\\mela2.csv")
#creates plot of different color for each week
df.plot(kind='line',
color=['red','blue','brown'],marker="*",markersize=10,linewidth=3,linestyle="--")
plt.title('Mela Sales Report')
plt.xlabel('Days')
plt.ylabel('Sales in Rs')
#store converted index of DataFrame to a list
ticks = df.index.tolist()
#displays corresponding day on x axis
plt.xticks(ticks,df['Day'])
plt.show()
Plotting Bar Chart:
• A bar chart displays data graphically using vertical (bar) or
horizontal (barh) bars of different heights/widths.
• By default, all the bars are drawn with equal width; the default width is
0.8 units.
• To plot a bar chart, we specify kind='bar'. We can also specify the
DataFrame columns to be used as the x and y axes.
• Best suited for data comparisons.
• Pyplot offers the bar() function to create a bar chart.
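The bar() function of pyplot mentioned in the last point can be sketched as follows. The product names and sales values are made up for illustration. The width of 0.8 is passed explicitly only to make the default visible, and the Agg backend is an assumption so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

products = ["Pens", "Books", "Bags"]  # hypothetical categories
sales = [120, 85, 40]                 # made-up values

# One separate bar per category; 0.8 is the default bar width
bars = plt.bar(products, sales, width=0.8, color="skyblue", edgecolor="black")
plt.title("Sales of different products")
plt.xlabel("Product")
plt.ylabel("Units sold")

# Each bar is a rectangle whose height equals the plotted value
heights = [bar.get_height() for bar in bars]
plt.close()
```

For horizontal bars, plt.barh() takes the same categories and values, with the bar thickness controlled by height instead of width.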

Program 4-6 This program displays a bar plot for the “Mela3.csv” file with
the column Day on the x axis.
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv("D:\\Mela3.csv")
#creates plot of different color for each week
df.plot(kind='bar',x='Day',title='Mela Sales Report')
plt.title('Mela Sales Report')
plt.xlabel('Days')
plt.ylabel('Sales in Rs')
plt.show()
Customising Bar Chart: We can also customise the bar chart by adding
certain parameters to the plot function. We can control the edgecolor of
the bars, and the linestyle, linewidth and color of their outlines.
Program 4-7 This program displays a customised bar plot for the “Mela3.csv”
file with the column Day on the x axis.
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv("D:\\Mela3.csv")
#creates plot of different color for each week
df.plot(kind='bar',x='Day',title='Mela Sales Report',
color=['red','yellow','purple'], edgecolor='green',linewidth=2,linestyle='--')
plt.title('Mela Sales Report')
plt.xlabel('Days')
plt.ylabel('Sales in Rs')
plt.show()
Histogram

• A histogram is a summarisation tool for continuous data.
• A histogram provides a visual interpretation of numerical data by
showing the number of data points that fall within a specified range of
values (called bins).
• In a histogram, bins are the class intervals or ranges into which
the continuous data is divided. Each bin groups a set of data values.
• It is similar to a vertical bar graph. However, a histogram, unlike a
vertical bar graph, shows no gaps between the bars.
• In a histogram, frequency refers to the number of times a
particular value or range of values occurs in the dataset.
• The df.plot(kind='hist') function automatically selects the size of
the bins based on the spread of values in the data.
• So in a histogram, the x-axis represents bins (or class intervals),
and the y-axis represents frequency.
• The histogram was introduced by Karl Pearson.
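The relationship between bins and frequency described above can be checked without drawing anything. The sketch below counts how many values fall into each class interval using numpy.histogram(), which follows the same binning rule as Matplotlib's hist(). The marks and bin edges are the ones used in the hist() programme later in these notes.

```python
import numpy as np

# Marks of 20 students in a test (out of 40)
marks = [12, 25, 18, 30, 22, 35, 28, 14, 10, 26,
         32, 19, 15, 38, 21, 17, 24, 29, 27, 36]
bin_edges = [0, 10, 20, 30, 40]  # four class intervals

# counts[i] is the frequency of bin i
counts, _ = np.histogram(marks, bins=bin_edges)

# Each bin includes its left edge and excludes its right edge,
# except the last bin, which includes both edges
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(left, "to", right, ":", count)

frequencies = counts.tolist()  # [0, 7, 8, 5]
```

The frequencies add up to 20, the number of students, because every value falls into exactly one bin.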

Note:
Sno | Continuous data                          | Discrete data
1   | Measured                                 | Counted (countable)
2   | Height, weight, time, temperature, speed | Students, cars, books, phones
3   | Histogram                                | Bar chart
Program 4-8
import pandas as pd
import matplotlib.pyplot as plt
data = {
'Name': ['Arnav', 'Sheela', 'Azhar', 'Bincy', 'Yash','Nazar'],
'Height': [60,61,63,65,61,60],
'Weight': [47,89,52,58,50,47]
}
df = pd.DataFrame(data)
# Plot histogram for numeric columns
df[['Height','Weight']].plot(kind='hist', bins=5,edgecolor='black')
plt.title('Histogram of Height and Weight')
plt.xlabel('Values(Bins)')
plt.ylabel('Frequency')
plt.show()
Customising Histogram: We can also customise the histogram by adding
certain parameters to the hist() function or to df.plot(kind='hist'). We can
change the edgecolor, the linewidth, the hatch (a fill pattern such as '-',
'+', 'x', '\\', '*', 'o', 'O', '.') and the fill (which takes the Boolean
values True or False) of the histogram.
Program 4-9
import pandas as pd
import matplotlib.pyplot as plt
data = {'Name':['Arnav', 'Sheela', 'Azhar','Bincy','Yash',
'Nazar'],
'Height' : [60,61,63,65,61,60],
'Weight' : [47,89,52,58,50,47]}
df=pd.DataFrame(data)
df.plot(kind='hist', edgecolor='Green', linewidth=2, linestyle=':',
        fill=False, hatch='o')
plt.show()
Histogram Parameters:
Histogram Programme using hist() parameters
import matplotlib.pyplot as plt
# Marks of 20 students in a test (out of 40)
marks = [12, 25, 18, 30, 22, 35, 28, 14, 10, 26,
32, 19, 15, 38, 21, 17, 24, 29, 27, 36]
# Plot Histogram (Normal frequency distribution)
plt.hist(
    marks,
    bins=[0, 10, 20, 30, 40],   # class intervals (0–10, 10–20, 20–30, 30–40)
    cumulative=False,           # normal frequency (not cumulative)
    histtype='bar',             # default bar style
    align='mid',                # bars centred between the bin edges
    orientation='vertical',     # vertical histogram
    color='lightcoral', edgecolor='black'
)
# Add labels and title
plt.xlabel("Marks (out of 40)")
plt.ylabel("Frequency")
plt.title("Histogram of Student Marks (20 Students)")
# Save figure as PNG file
plt.savefig("student_marks_histogram_normal.png", dpi=300)
# Show plot
plt.show()
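The programme above sets cumulative=False. As a sketch of what the opposite setting does, the script below plots the same marks with cumulative=True, so that each bar shows the running total of all students up to and including that bin. The Agg backend is an assumption so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Marks of 20 students in a test (out of 40)
marks = [12, 25, 18, 30, 22, 35, 28, 14, 10, 26,
         32, 19, 15, 38, 21, 17, 24, 29, 27, 36]

# hist() returns the bar heights, the bin edges and the drawn bars
heights, edges, bars = plt.hist(
    marks,
    bins=[0, 10, 20, 30, 40],
    cumulative=True,       # each bar adds the counts of all earlier bins
    color='lightcoral', edgecolor='black'
)
plt.xlabel("Marks (out of 40)")
plt.ylabel("Cumulative frequency")

cumulative_counts = [int(h) for h in heights]  # [0, 7, 15, 20]
plt.close()
```

The last bar always equals the total number of values, here 20, which is a quick way to confirm that no mark fell outside the chosen bins.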
Using Open Data: There are many websites that provide data freely for
anyone to download and do analysis, primarily for educational purposes.
These are called Open Data as the data source is open to the public.
“Open Government Data (OGD) Platform India” (data.gov.in) is a
platform for supporting the Open Data initiative of the Government of
India.
LINE CHART PARAMETERS

BAR CHART PARAMETERS
