Class 12 IP Ch-1, 2 3

Class: XII Commerce

SUBJECT: INFORMATICS PRACTICES

SUBJECT CODE: 065

STUDY MATERIAL FOR: PANDAS - SERIES, DATAFRAME & VISUALIZATION

PANDAS IN PYTHON

Pandas is an open-source Python library used to analyze data. It provides high-performance, easy-to-use data structures and data analysis tools. We will discuss two of its data structures in detail: Series and DataFrame.

SERIES

It is a one-dimensional, homogeneous collection of data elements. Homogeneous means every element in a Series must be of the same data type.

DATAFRAMES

It is a two-dimensional collection of data arranged in the form of rows and columns.

PANEL

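It is a three-dimensional data structure that was available in older versions of pandas. It has been removed from recent versions and is not discussed further here.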

SERIES

Series is a one-dimensional set of values.

Key Points:

1. Homogeneous data
2. Size immutable
3. Values of data mutable

It can contain elements of the same data type, which can be integer, string, float, Python object etc. The axis labels are called the index and are used to access values in a series.

CREATING A SERIES

A series can be created using the constructor / command:

Seriesname = pandas.Series(data, index, dtype)

Here data is normally passed as the first parameter; index and dtype are optional parameters.

Data can be a list, an array, a dictionary or even a scalar value.

Eg.1
import pandas as pd
s=pd.Series()
print(s)

Output

Will create an empty Series, thus output will be Series([], dtype: float64)
* In older versions of pandas an empty Series is given the data type float64 by default; recent versions (pandas 2.0 onwards) show dtype: object instead.

The number of elements in data and index must be the same. If no index is given, it will be taken as 0, 1, 2, ... and so on up to size-1.

Creating a series from Lists

Eg.2
import pandas as pd
s=pd.Series([30,50,60])
print(s)
Output
0    30
1    50
2    60
dtype: int64

Eg.3
import pandas as pd
s=pd.Series([30,50,60],index=['A','B','C'])
print(s)
Output
A    30
B    50
C    60
dtype: int64

Creating a Series from numpy array

Eg.4
import pandas as pd
import numpy as np
data=np.array(['a','b','c','d'])
s=pd.Series(data)
print(s)
Output

0    a
1    b
2    c
3    d
dtype: object

Values of type string are shown with the data type object.

Once created, a Series cannot be modified in size, but its values are mutable.

Creating a Series using dictionary

When a series is created using a dictionary, the keys are taken as the index if no other index is specified.

Eg.5
import pandas as pd
data={'A':'Anjali','B':'Bharti','C':'Charu'}
s=pd.Series(data)
print(s)
Output

A    Anjali
B    Bharti
C     Charu
dtype: object
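
What happens when an index is given along with a dictionary can be seen in a small sketch. It reuses the dictionary of Eg.5; the index list ['A','C','D'] is made up for illustration. Only the keys that appear in the index are used, and an index label that is not a key gets NaN.

```python
import pandas as pd

data = {'A': 'Anjali', 'B': 'Bharti', 'C': 'Charu'}

# No index given: the dictionary keys become the index
s1 = pd.Series(data)
print(s1)

# Index given (illustrative labels): only matching keys are picked up,
# and the label 'D', which is not a key, gets NaN
s2 = pd.Series(data, index=['A', 'C', 'D'])
print(s2)
```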
Creating a Series using range

We can use the arange function from numpy or the normal range function to generate values for a series.
Eg.6
import pandas as pd
s=pd.Series(range(1,6))    # will create a series of numbers from 1 to 5
print(s)
Output

0    1
1    2
2    3
3    4
4    5
dtype: int64

To create a series which contains the first 5 even numbers, we can use range with a gap of 2.

Eg.7
s=pd.Series(range(2,11,2))    # will generate a series for [2,4,6,8,10]
print(s)

Output

0     2
1     4
2     6
3     8
4    10
dtype: int64
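
The text mentions numpy's arange but shows only range. A minimal sketch of the arange version follows; the second series, with a fractional step, is an added illustration of something range cannot do.

```python
import numpy as np
import pandas as pd

# np.arange(start, stop, step) works like range but returns a numpy array
s = pd.Series(np.arange(2, 11, 2))
print(s)

# Unlike range, arange also accepts fractional steps
t = pd.Series(np.arange(0, 1, 0.25))
print(t)
```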

Creating a series from Scalar value:

In order to create a series from scalar value, an index must be provided. The scalar value will
be repeated to match the length of index.

Eg.8
import pandas as pd
import numpy as np

s = pd.Series(10, index =[0, 1, 2, 3, 4, 5]) # giving a scalar value with index


print(s)
Output

0 10
1 10
2 10
3 10
4 10
5 10
dtype: int64

A Series can be created in different ways, but it is always homogeneous and one-dimensional.
SERIES ATTRIBUTES

Attributes are used to describe the features / behaviour of a data structure. The attributes of a Series are listed below, and each is then explained with an example.

Attribute         Description

Series.index      The index (axis labels) of the Series.
Series.values     Returns the Series as an ndarray or ndarray-like, depending upon the dtype.
Series.dtype      Returns the dtype object of the underlying data.
Series.shape      Returns a tuple of the shape of the underlying data.
Series.nbytes     Returns the number of bytes in the underlying data.
Series.ndim       The number of dimensions of the underlying data, by definition 1.
Series.size       Returns the number of elements in the underlying data.
Series.hasnans    Returns True if the Series has any NaN values.
Series.empty      Returns True if the Series is empty.
at, iat           To access a single value from a Series.
loc, iloc         To access slices from a Series.

Examples:

import pandas as pd

s=pd.Series(range(2,11,2))

1. index: This shows the row labels (index) of the series.
Eg.: print(s.index)
Will print RangeIndex(start=0, stop=5, step=1), i.e. the labels 0, 1, 2, 3, 4 are the index of Series s.

2. values: It will fetch only the values stored in the series as an array.
Eg.: print(s.values)
Will print [ 2  4  6  8 10], the values as a numpy array.

3. dtype: It can be used to print the data type of values in the series. The attribute itself cannot be assigned to; to change the data type, use the astype() method.
Eg.: print(s.dtype)
Will print int64.

s=s.astype('float64')
will change the datatype to float and values will change from [2,4,6,8,10] to [2.0,4.0,6.0,8.0,10.0]

4. shape: It returns the shape of a series as a tuple.
Eg.: print(s.shape)
Will print (5,), since there are 5 rows. No second number is given since a series is one-dimensional.

5. nbytes: It returns the number of bytes required by the elements of a series.
Eg.: print(s.nbytes)
Will print 40.
Since int64 means 8 bytes, one element takes 8 bytes and there are 5 elements. Thus the total number of bytes required is 40.

6. ndim: It returns the number of dimensions of the data structure.
Eg.: print(s.ndim)
Will print 1, since a series is a one-dimensional data structure.

7. size: It returns the count, i.e. the number of elements in the series.
Eg.: print(s.size)
Will print 5 since there are 5 elements in the series.

8. itemsize: It returns the size (memory required) of a single item in the series. In older versions of pandas this was available as s.itemsize; in current versions it is read from the dtype.
Eg.: print(s.dtype.itemsize)
Will print 8 since dtype int64 takes 8 bytes in memory.

9. hasnans: It gives True if there is any NaN (Not a Number) value in the series. In case there is no NaN, it gives False.
Eg.: print(s.hasnans)
Will print False, since we don't have any NaN.

10. empty: It gives True if there is no element in the series. If there are values, it gives False.
Eg.: print(s.empty)
Will print False, since Series s has 5 values.
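
The attributes above can be tried together in one short script. This is a sketch on the same series as the examples; the name f for the converted series is introduced only for illustration, and the item size is read through s.dtype.itemsize on the assumption that a current version of pandas is installed.

```python
import pandas as pd

s = pd.Series(range(2, 11, 2))

print(s.index)           # RangeIndex(start=0, stop=5, step=1)
print(s.values)          # the values as a numpy array
print(s.dtype)           # int64
print(s.shape)           # (5,)
print(s.nbytes)          # 40, i.e. 5 elements x 8 bytes
print(s.ndim)            # 1
print(s.size)            # 5
print(s.dtype.itemsize)  # 8
print(s.hasnans)         # False
print(s.empty)           # False

# astype() returns a new Series with the converted data type;
# the original series s keeps its int64 values
f = s.astype('float64')
print(f.dtype)           # float64
```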

SERIES HEAD AND TAIL METHOD.

head() and tail() are predefined methods to show a selected number of values from the top or bottom of a Series respectively.

head(): s.head() shows the top 5 rows by default.
We can pass how many values should be shown in ().
Example: s.head(3)
Will only show the top 3 values.
Similarly s.tail() shows the bottom 5 values; in case we need some other number, we provide it in ().

print(s.head(3))
will print
0    2
1    4
2    6
dtype: int64

print(s.tail(3))
will print
2     6
3     8
4    10
dtype: int64
ACCESSING /INDEXING A SERIES:

Indexing can be used to fetch as well as modify the data values in a series.

Series is size immutable but data mutable.

Thus data in series can be modified.

print(s[4]) will print 10.

but s[4]=12 will change the value at index 4.

Now if we print(s)

0     2
1     4
2     6
3     8
4    12
dtype: int64

The value at index 4 is modified.

SLICING A SERIES:

A series can also be fetched as a slice, as we do with lists and tuples.

Seriesname[startindex:endindex:step]

startindex, if not given, is taken as 0 by default.
endindex, if not given, is taken as the size of the series.
step, if not given, is taken as 1 by default.

It starts from startindex and goes up to endindex-1, increasing the index by the given value of step.
Eg.:
s=pd.Series([10,20,30,40,50,60,70])
s has 7 values

print(s[2:5]) will print

2    30
3    40
4    50
dtype: int64
Will exclude index 5. Only index 2 to 4 are printed; step is taken as 1.

print(s[:3]) will print

0    10
1    20
2    30
dtype: int64
startindex is taken as 0 by default and it goes up to endindex-1, i.e. 2.

print(s[1:6:2]) will print

1    20
3    40
5    60
dtype: int64
Will print from index 1, at a gap of 2, up to index 5.

print(s[::-1]) will print

6    70
5    60
4    50
3    40
2    30
1    20
0    10
dtype: int64
Since the step is -1, the whole series is fetched in reverse order, from the last element to the first.

print(s[::2]) will print

0    10
2    30
4    50
6    70
dtype: int64
Will fetch from 0 to size with a step of 2. Thus it will skip 1 element in between.

print(s[4:]) will print

4    50
5    60
6    70
dtype: int64
Will fetch from 4 as start till the end, since no end position is given.

s[2:4]=100

Will update the series. Values from index 2 to 3 will be changed to 100.

print(s)

0     10
1     20
2    100
3    100
4     50
5     60
6     70
dtype: int64

Thus slicing can be used to modify a set of values together in a series.

In case the indexes are strings (labels), the last index will also be included while fetching a slice.
print(s['A':'C']) will print three values

A    10
B    20
C    30
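
The difference between slicing by label and slicing by position can be checked with a short sketch. The series below, with labels 'A' to 'D', is an assumed example that matches the three values shown above and adds a fourth.

```python
import pandas as pd

# Assumed example series with string labels
s = pd.Series([10, 20, 30, 40], index=['A', 'B', 'C', 'D'])

by_label = s['A':'C']        # label slice: the end label 'C' is included
by_position = s.iloc[0:2]    # position slice: position 2 is excluded
by_loc = s.loc['B':'D']      # loc also slices by label, end included

print(by_label)
print(by_position)
print(by_loc)
```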
USING ARITHMETIC OPERATIONS WITH SERIES:

Operation between a series and a scalar value

Eg.: s=pd.Series([10,20,30,40])
We can perform arithmetic operations with a series and a scalar value. In that case, the operation is applied to each element of the series.
For example: print(s*3)
0     30
1     60
2     90
3    120
dtype: int64
Here each element of the series is multiplied by 3.

print(s+50)
0    60
1    70
2    80
3    90
dtype: int64
Here 50 is added to each element of the series.

Operation between two series


Two series can also be added, subtracted, multiplied etc. Values at the same index are operated on; for an index that is present in only one of the series, NaN is recorded as the result.
Eg.: s1=pd.Series([10,20,30])
s2=pd.Series([5,6,7])
print(s1-s2)
Output
0     5
1    14
2    23
dtype: int64
Here both the series have the same index 0, 1 and 2. Thus the corresponding values will be subtracted.
Eg.: s1=pd.Series([10,20,30],index=['A','B','C'])
s2=pd.Series([5,6,7],index=['A','D','C'])

print(s1)
A    10
B    20
C    30
dtype: int64

print(s2)
A    5
D    6
C    7
dtype: int64

print(s1+s2)
Output
A    15.0
B     NaN
C    37.0
D     NaN
dtype: float64
Here values at the same index, i.e. A and C, are added, but the non-matching indexes are assigned NaN. Since NaN is a float value, the data type of the result becomes float64.
The result can also be stored in another series object,
i.e. s3=s1+s2
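
If NaN is not wanted for the non-matching labels, the add() method of a Series can be used with its fill_value parameter. This goes slightly beyond the text above and is shown only as a sketch; the names plain and filled are illustrative.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30], index=['A', 'B', 'C'])
s2 = pd.Series([5, 6, 7], index=['A', 'D', 'C'])

plain = s1 + s2                     # B and D become NaN
filled = s1.add(s2, fill_value=0)   # a missing value is treated as 0

print(plain)
print(filled)
```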
FILTERING SERIES DATA:

We can put conditions to fetch the selected data from a series.

Eg.:
import pandas as pd
s = pd.Series([20,66,40,22,48,56])
print(s[s>40])
Output

1    66
4    48
5    56
dtype: int64
Values which are above 40 are displayed.
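
The condition inside the brackets first produces a series of True/False values, and only the True positions are kept. A brief sketch on the same data shows this, along with combined conditions using & (and) and | (or); the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([20, 66, 40, 22, 48, 56])

mask = s > 40                        # a Series of True/False values
above_40 = s[mask]                   # same as s[s>40]

# Each condition needs its own brackets when combined
between = s[(s > 20) & (s < 60)]     # both conditions must hold
extremes = s[(s < 22) | (s > 60)]    # either condition may hold

print(mask)
print(above_40)
print(between)
print(extremes)
```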

SORTING SERIES DATA:


Sorting means arranging the values in increasing or decreasing order. There are two
functions available to sort a series.
sort_values() which arranges series as per values.
sort_index() which sorts as per index

Eg.:
import pandas as pd
s = pd.Series([20,66,40,22,48,56])
print(s.sort_values())
Output
0    20
3    22
2    40
4    48
5    56
1    66
dtype: int64

By default values are arranged in increasing order, i.e. ascending=True. To get decreasing order, write: s.sort_values(ascending=False)

In the same way we can sort as per index using s.sort_index().
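
Both sorting methods can be seen in a short sketch on the same series. Note that they return a new series and leave s unchanged; the names desc and back are illustrative.

```python
import pandas as pd

s = pd.Series([20, 66, 40, 22, 48, 56])

desc = s.sort_values(ascending=False)   # largest value first
back = desc.sort_index()                # arranged by index 0,1,2,... again

print(desc)
print(back)
```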


DATAFRAMES IN PYTHON PANDAS

DataFrame is a data structure in Python pandas which stores data in 2 dimensions, i.e. rows and columns. Different columns can hold different types of values, like integer, float, string etc.

➢ 2-dimensional data structure


➢ Rows and columns can be labeled
➢ Mutable data structure

CREATING A DATAFRAME

To create a dataframe, we need to import pandas. The following statement can be written to
create dataframe.

DFname=pd.DataFrame(data, index, columns)

Here, we specify the data to be used to create a dataframe, which can be a list of lists, a list of series, a dictionary of lists, a dictionary of series etc. (any 2-dimensional data).

Index labels and column labels can also be given.

If no row or column labels are given, they are taken as 0,1,2 and so on by default.

CREATING A DATAFRAME USING LIST OF LISTS

We can create a dataframe using a nested list.

Eg.: A=[[5,7,9],[8,2,4],[9,2,6]]

Here A is a list containing 3 more lists, each having 3 elements.

import pandas as pd

df=pd.DataFrame(A)

print(df)

Output

   0  1  2
0  5  7  9
1  8  2  4
2  9  2  6

Here labels were taken as 0, 1, 2, since no labels were given.

import pandas as pd

df=pd.DataFrame(A,index=['R1','R2','R3'],columns=['X','Y','Z'])

print(df)
Output

    X  Y  Z
R1  5  7  9
R2  8  2  4
R3  9  2  6

Here the given labels were used.

Printing a Dataframe

print(df): prints the dataframe in the form of a table with the row and column labels as specified.

CREATING A DATAFRAME USING DICTIONARY OF LISTS

This is the most commonly used method to create a DataFrame. The keys in the dictionary are used as column headings, and the list given in the value part makes the data of that column.

Eg.: if we have the Admno., Name and Class of 4 students in the following dictionary:

studata={'Admno.':[123,124,125,126],'Name':['Raj','Ram','Ravi','Rose'],'Class':['X','XII','XI','X']}

Now, we can create a dataframe using the dictionary

dfstu=pd.DataFrame(studata)

print(dfstu)

Output

   Admno.  Name Class
0     123   Raj     X
1     124   Ram   XII
2     125  Ravi    XI
3     126  Rose     X

Dictionary keys are taken as column headings. Row labels are taken as 0, 1, 2... by default.

We can change the row index labels after the DataFrame is created

dfstu.index=['S1','S2','S3','S4']

Index labels can also be given while creating the dataframe.

dfstu=pd.DataFrame(studata,index=['S1','S2','S3','S4'])
CREATING A DATAFRAME USING DICTIONARY OF SERIES

A dataframe can be created using a dictionary of Series. It is similar to the way we create one from a dictionary of lists. For example, suppose we need to create a dataframe to store data of employees, including their empid, name and salary.

import pandas as pd

eid=pd.Series([1001,1002,1003])

name=pd.Series(['Raj','Ram','Sam'])

salary=pd.Series([5000,6000,10000])

studata={'Empid':eid,'Name':name,'Salary':salary}

studf=pd.DataFrame(studata)

print(studf)

Output

   Empid Name  Salary
0   1001  Raj    5000
1   1002  Ram    6000
2   1003  Sam   10000

Here, the Series are created first, then a dictionary is created using the series, and finally a dataframe is created using the dictionary of series.

If an index had been given to the series, it would be taken as the row index labels in the dataframe.
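
How the index of each Series carries over to the dataframe can be seen in a small sketch. The labels E1, E2, E3 and the shortened salary series are made up for illustration; a label missing from one of the series gets NaN in that column.

```python
import pandas as pd

# Illustrative data: the two Series share only some index labels
name = pd.Series(['Raj', 'Ram', 'Sam'], index=['E1', 'E2', 'E3'])
salary = pd.Series([5000, 6000], index=['E1', 'E3'])

# The row labels of the dataframe are taken from the Series indexes;
# E2 has no salary, so that cell becomes NaN
df = pd.DataFrame({'Name': name, 'Salary': salary})
print(df)
```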

DATAFRAME ATTRIBUTES

Attributes are used to describe the features / behaviour of a dataframe. All required information about the dataframe can be fetched using its attributes.

Following are the most commonly used attributes:

1. index: It fetches the row labels of a dataframe.
Eg.: if df is
    Name Class  Marks
S1   Ram    XI     80
S2   Raj   XII     90
S3  Ravi     X     75
Here print(df.index) will print Index(['S1', 'S2', 'S3'], dtype='object')

2. columns: It is used to fetch the column labels of the above dataframe.
print(df.columns) will print Index(['Name', 'Class', 'Marks'], dtype='object')

3. axes: It represents both axes: rows, i.e. axis=0, and columns, i.e. axis=1. It fetches both the row labels and the column labels.
print(df.axes) will print
[Index(['S1', 'S2', 'S3'], dtype='object'),
Index(['Name', 'Class', 'Marks'], dtype='object')]

4. dtypes: It gives the data type of each of the columns.
Eg.
print(df.dtypes) will print
Name     object
Class    object
Marks     int64
dtype: object
Marks is shown as int64 because the marks are whole numbers. The exact text shown for string columns may vary between pandas versions.

5. size: It returns the number of elements present in the dataframe.
print(df.size) will print 9, i.e. 3*3.
It multiplies the number of rows by the number of columns in the dataframe.

6. shape: It returns a tuple stating the number of rows and number of columns in the dataframe.
print(df.shape)
will print (3, 3) since there are 3 students in 3 rows and 3 columns.

7. values: It returns the contents of the dataframe as a 2-dimensional numpy array.

8. ndim: It returns the number of dimensions in the dataframe, which is always 2.

9. empty: It returns True or False depending on whether the dataframe has any content. If the dataframe has no values it returns True, but if it contains data it returns False.

10. T: Capital T is used to get the transpose of a dataframe.

Eg. df=pd.DataFrame([[9,7,3],[2,8,4]])
print(df)
   0  1  2
0  9  7  3
1  2  8  4
print(df.T)
   0  1
0  9  2
1  7  8
2  3  4
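
The attributes listed above can be tried on the sample dataframe of three students. This is a sketch that builds the dataframe from a dictionary of lists, since the text does not show how df was created.

```python
import pandas as pd

# Sample dataframe from the attribute examples (construction is assumed)
df = pd.DataFrame({'Name': ['Ram', 'Raj', 'Ravi'],
                   'Class': ['XI', 'XII', 'X'],
                   'Marks': [80, 90, 75]},
                  index=['S1', 'S2', 'S3'])

print(df.index)     # row labels
print(df.columns)   # column labels
print(df.axes)      # both of the above, as a list
print(df.dtypes)    # data type of each column
print(df.size)      # 9
print(df.shape)     # (3, 3)
print(df.ndim)      # 2
print(df.empty)     # False
print(df.values)    # 2-dimensional numpy array
print(df.T)         # rows and columns interchanged
```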
USING LEN WITH DATAFRAME

len() is used with a dataframe to get the number of rows in the dataframe, e.g. len(df).

USING COUNT WITH DATAFRAME

count() is used with a dataframe to count the non-NaN values in the rows or columns of the dataframe.

df.count()

will count the non-NaN values down the rows for each column. By default axis is taken as 0.

Eg.

    C0   C1  C2
R1   8    3   4
R2   9  NaN   6

df.count()

C0    2
C1    1
C2    2

This is the count of non-NaN values in each column. We can also write df.count(axis='index').

When axis is 0, the values are counted down the rows, giving one count per column. When axis is 1, the values are counted across the columns, giving one count per row.

df.count(axis=1)

R1    3
R2    2
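
The two counting directions can be checked with a short sketch that rebuilds the small dataframe above; np.nan from numpy is used to enter the missing value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'C0': [8, 9], 'C1': [3, np.nan], 'C2': [4, 6]},
                  index=['R1', 'R2'])

print(len(df))            # number of rows
print(df.count())         # non-NaN values in each column (axis=0)
print(df.count(axis=1))   # non-NaN values in each row
```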

SELECTING AND ACCESSING DATA

We can specify the names of the columns and the index labels to select/fetch specific values in a dataframe.

Eg.: df is
    Name Class  Marks
S1   Ram    XI     60
S2   Raj   XII     80
S3  Ravi     X     90

SELECTING AND ACCESSING COLUMN

DataFrameName['ColumnName']    using []

Or

DataFrameName.ColumnName    using . (dot); this form works only when the column name is a valid Python name

df.Name
S1     Ram
S2     Raj
S3    Ravi

df['Name']
S1     Ram
S2     Raj
S3    Ravi

FETCHING MULTIPLE COLUMNS

Multiple columns can be fetched by giving a list of their names, separated by commas, inside [].

df[['Name','Class']]

    Name Class
S1   Ram    XI
S2   Raj   XII
S3  Ravi     X

FETCHING VALUES FROM A DATAFRAME (From a particular set of rows/columns)

To fetch a subset of values from a dataframe, we use loc with the dataframe.

Syntax:

Nameofdataframe.loc[startrow:endrow,startcolumn:endcolumn]

Here the labels of rows/columns are used to fetch the values.

The endrow/endcolumn is also included in the values fetched.

df.loc['S2',:]    Will display all columns from row S2

Name     Raj
Class    XII
Marks     80

df.loc[:,'Name':'Marks']    Will display all rows for columns from Name to Marks

    Name Class  Marks
S1   Ram    XI     60
S2   Raj   XII     80
S3  Ravi     X     90

df.loc['S1':'S3','Name':'Class']    Will display rows S1 to S3 and columns Name to Class
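
The loc forms above can be tried in a short sketch. It assumes the dataframe with columns Name, Class and Marks (marks 60, 80, 90) that the outputs above show; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ram', 'Raj', 'Ravi'],
                   'Class': ['XI', 'XII', 'X'],
                   'Marks': [60, 80, 90]},
                  index=['S1', 'S2', 'S3'])

row = df.loc['S2', :]                        # one row, returned as a Series
cols = df.loc[:, 'Name':'Class']             # all rows, columns Name to Class
block = df.loc['S1':'S2', 'Class':'Marks']   # both end labels are included

print(row)
print(cols)
print(block)
```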


USING ILOC TO FETCH ROWS AND COLUMNS FROM A DATAFRAME

If we do not have row or column labels assigned, or don't remember them, we can use the numeric index/position with iloc to fetch rows and columns.

dataframe.iloc[startrowindex:endrowindex,startcolumnindex:endcolumnindex]

Here the ending indexes are exclusive.

df.iloc[0:2,1:3]

Will display the first and second rows, and the second and third columns.

   Class  Marks
S1    XI     60
S2   XII     80

Display the rows at positions 2 to 4 (both inclusive)

df.iloc[2:5] or df.iloc[2:5,:]

Display the first two rows and first two columns

df.iloc[0:2,0:2]

Fetching a single value in a row or column

Displaying the marks of S2

df.Marks['S2'] will print 80

Values can also be modified using this.

df.Marks['S2']=75

Note that assigning through df.Marks[...] is a chained assignment, which pandas may warn about and which may not update the dataframe in newer versions. Writing df.loc['S2','Marks']=75 is the more reliable form.
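
A short sketch of position-based selection with iloc, followed by reading and changing a single value. It assumes the same three-student dataframe as in the loc examples; the single value is changed through loc, which avoids the chained assignment.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ram', 'Raj', 'Ravi'],
                   'Class': ['XI', 'XII', 'X'],
                   'Marks': [60, 80, 90]},
                  index=['S1', 'S2', 'S3'])

part = df.iloc[0:2, 1:3]    # row positions 0-1, column positions 1-2
first = df.iloc[0:2, 0:2]   # first two rows and first two columns
print(part)
print(first)

marks_s2 = df.loc['S2', 'Marks']   # read a single value by labels
print(marks_s2)

df.loc['S2', 'Marks'] = 75         # modify the same cell
print(df)
```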

ADDING NEW COLUMN

Dataframe['columnname']=[new values]

The above statement is used to add or modify a column. If a column with that name already exists, its values will be overwritten with the new values specified.

If the column name is not there, a new column is created with the given set of values.

df["Grade"]='A+'    Will put A+ in the new column for all the students.

df["Grade"]=['A','B','C']    Will create a new column Grade and put the respective grades.

There are some more methods to add a new column.

df.at[:,'columnname']=[values]

Or

df.loc[:,'columnname']=[values]

or

df2=df.assign(columnname=values)

Thus the above column can be added using the following statements as well:

df.at[:,'Grade']=['A','B','C']

Or

df.loc[:,'Grade']=['A','B','C']

or

df2=df.assign(Grade=['A','B','C'])

Here assign() does not change df; it returns a new dataframe with the extra column. Also, at is meant for accessing a single value, so the at form with : may not be accepted by every version of pandas; loc is the safer choice for a whole column.
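
The ways of adding a column can be compared in one sketch. It assumes the three-student dataframe used earlier; the columns Remark and Rank and their values are hypothetical, added only to show the loc and assign forms.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ram', 'Raj', 'Ravi'],
                   'Class': ['XI', 'XII', 'X'],
                   'Marks': [60, 80, 90]},
                  index=['S1', 'S2', 'S3'])

df['Grade'] = 'A+'               # one value repeated for every row
df['Grade'] = ['C', 'B', 'A']    # column exists now, so it is overwritten

# Hypothetical extra column added through loc
df.loc[:, 'Remark'] = ['Good', 'Very good', 'Excellent']

# assign() returns a new dataframe; df itself does not get the Rank column
df2 = df.assign(Rank=[3, 2, 1])

print(df)
print(df2)
```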

ADDING NEW ROW

df.at[rowname/index,:]=[new values]

df.loc[rowname/index,:]=[new values]

Both at and loc can be used to add/modify a row in a dataframe. If the row label already exists, the existing row data will be modified; if it does not exist, a new row is created.

df.at['S4',:]=['Rose','X',95]

will add a new row with the label S4 and the respective values

can also be written as

df.loc['S4',:]=['Rose','X',95]

Since at is meant for accessing a single value, the at form with : may not be accepted by every version of pandas; loc is the safer choice for a whole row.
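
Adding and overwriting a row through loc can be sketched as follows, assuming the three-student dataframe used earlier. The new marks given to S1 are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ram', 'Raj', 'Ravi'],
                   'Class': ['XI', 'XII', 'X'],
                   'Marks': [60, 80, 90]},
                  index=['S1', 'S2', 'S3'])

df.loc['S4'] = ['Rose', 'X', 95]   # new label: a row is added
df.loc['S1'] = ['Ram', 'XI', 65]   # existing label: the row is overwritten

print(df)
```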

MODIFY A SINGLE CELL

df.columnname[rowlabel]=value

or

df.at['rowlabel','columnlabel']=value
or

df.iat[rowindex,columnindex]=value

df.Marks['S2']=90

or

df.at['S2','Marks']=90

or

df.iat[1,2]=90

Any of the above will modify the marks of the student with label S2. In iat, S2 is at row position 1 and Marks is at column position 2, since positions are counted from 0.
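
A minimal sketch of changing single cells with at (labels) and iat (positions), assuming the three-student dataframe used earlier. The value 70 for S1 is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ram', 'Raj', 'Ravi'],
                   'Class': ['XI', 'XII', 'X'],
                   'Marks': [60, 80, 90]},
                  index=['S1', 'S2', 'S3'])

df.at['S2', 'Marks'] = 90   # row label S2, column label Marks
df.iat[0, 2] = 70           # row position 0 (S1), column position 2 (Marks)

print(df)
```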

DELETING A ROW/COLUMN

To delete a column from a dataframe, the del command is used.

del df[columnname]

del df['Marks'] will delete the Marks column

Another option is to use drop to delete either rows or columns.

df.drop(rowlabels)

By default the given labels are taken as row labels, and the respective rows are removed. drop returns a new dataframe and leaves the original unchanged, unless inplace=True is given.

Eg.

df.drop([1,2])

Will delete the rows labelled 1 and 2, i.e. the second and third rows when the default labels 0, 1, 2... are in use.

If we need to delete columns using drop, we need to mention axis=1.

df.drop([columnlabel],axis=1)

Eg.:

df.drop(['Marks','Grade'],axis=1)

will delete the columns labelled Marks and Grade.
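
The difference between drop, which returns a new dataframe, and del, which changes the dataframe itself, can be seen in a short sketch. The dataframe with a Grade column is assumed for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ram', 'Raj', 'Ravi'],
                   'Class': ['XI', 'XII', 'X'],
                   'Marks': [60, 80, 90],
                   'Grade': ['C', 'B', 'A']},
                  index=['S1', 'S2', 'S3'])

fewer_rows = df.drop(['S1', 'S3'])                  # rows removed in the copy
fewer_cols = df.drop(['Marks', 'Grade'], axis=1)    # columns removed in the copy
print(fewer_rows)
print(fewer_cols)

# df still has all rows and columns at this point;
# del removes the column from df itself
del df['Grade']
print(df)
```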

RENAMING A ROW/COLUMN

To change a row label or column name, the rename function can be used with a dataframe.

df.rename(index={names dictionary},columns={names dictionary},inplace=True/False)

Here we need to give a dictionary for index and/or columns, which takes the old name of the row/column as key and the new name as value.

Eg.:

df.rename(columns={'Name':'StudentName'},inplace=True)

will change the column name from Name to StudentName

inplace=True makes sure the changes are done in the current dataframe itself. With inplace=False (the default), a new dataframe is returned and the original is left unchanged.

df.rename(index={'S1':'R1','S2':'R2','S3':'R3'},inplace=True)

will change the row labels from S1 to R1, from S2 to R2 and from S3 to R3
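
The effect of inplace can be checked with a brief sketch, assuming the three-student dataframe used earlier; the name renamed is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ram', 'Raj', 'Ravi'],
                   'Class': ['XI', 'XII', 'X'],
                   'Marks': [60, 80, 90]},
                  index=['S1', 'S2', 'S3'])

# Without inplace=True a new dataframe is returned and df is untouched
renamed = df.rename(columns={'Name': 'StudentName'})
print(renamed.columns)
print(df.columns)

# With inplace=True the row labels of df itself are changed
df.rename(index={'S1': 'R1', 'S2': 'R2', 'S3': 'R3'}, inplace=True)
print(df)
```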

BOOLEAN INDEXING ON DATAFRAME

Boolean indexing means indexing a DataFrame using Boolean values, which can be True or False. The rows with a True index are displayed, and the rows with a False index are skipped.

Eg.: To show all students who scored above 90

df['Marks']>90 will generate a list of Boolean values after checking the given condition for each row.

Eg.: df is

   Name  Marks
S1    A     80
S2    B     93
S3    C     76
S4    D     91

df['Marks']>90 will give [False, True, False, True]

df[df['Marks']>90]

will be evaluated internally as df[[False,True,False,True]]

Thus it will show only those records where the Boolean index is True.

   Name  Marks
S2    B     93
S4    D     91
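
The two steps described above, building the Boolean values and then using them as the index, can be written out separately. This is a sketch on the same four students; the names mask and toppers are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C', 'D'],
                   'Marks': [80, 93, 76, 91]},
                  index=['S1', 'S2', 'S3', 'S4'])

mask = df['Marks'] > 90   # True/False for each row
toppers = df[mask]        # keeps only the rows where mask is True

print(mask)
print(toppers)
```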
DATA VISUALIZATION IN PYTHON

Data visualization refers to the graphical or visual representation of information and data using visual elements like charts, graphs and maps.
➢ It is immensely useful in decision making.
➢ Saves time
➢ Saves energy and effort
➢ Easily understandable
➢ Visualized data is retained in memory as a picture for a longer time as compared to bulky textual data.

PURPOSE OF PLOTTING

Plotting means generating graphs from available data. Python supports 2D graphs and 3D graphs, and there are different libraries available in Python for this purpose.

According to CBSE Class 12 IP, you need to learn only 2D plotting.

The main reasons why plotting is more useful than raw data:

1. Using graphs, patterns in data can be understood easily.
2. Large data can be easily represented by using a small graph.
3. In any organization, decision making is most important. Graphical representation of data is used for taking decisions.
4. The same data can be represented by different types of graphs, so all the aspects of the data can be summarized.
5. Using graphical plotting, comparative analysis can be done very easily.
6. Since graphical representation of data involves figures, lines or bars, there is less chance of error and mistake.
Since Python provides various libraries to develop graphs from a large data set, we can develop many types of graphs very easily.

LIBRARY USED

Using Pyplot of the Matplotlib library

Matplotlib is a Python library that provides many interfaces and functionality for 2D graphics.

The Matplotlib library offers many different named collections of methods; Pyplot is one such interface.

Pyplot: a collection of methods which allows the user to construct 2D plots easily and interactively.

LINE CHART

Line chart or line graph is a type of chart which displays information as a series of data points called 'markers' connected by straight line segments.
To create a line chart the following functions are used:
➢ plot(x,y,color,others): Draws lines through the specified points
➢ xlabel("label"): Sets the label of the x-axis
➢ ylabel("label"): Sets the label of the y-axis
➢ title("Title"): Sets the title of the chart
➢ legend(): Displays the legend
➢ show(): Displays the graph

Eg.:

import matplotlib.pyplot as mpp


mpp.plot(['English','Maths','Hindi'],[88,90,94],'Red')
mpp.xlabel('Subjects')
mpp.ylabel('Marks')
mpp.title('Progress Report Chart')
mpp.show()

In the above code, the marks of 3 subjects are plotted on the figure.

The following code plots multiple lines on the graph.

import matplotlib.pyplot as mpp

o=[5,10,15,20]
r_india=[30,80,120,200]
mpp.plot(o,r_india,'Red')
r_aust=[25,85,100,186]
mpp.plot(o,r_aust,'Yellow')
mpp.xlabel('Overs')
mpp.ylabel('Runs')
mpp.title('Match Summary')
mpp.show()
You can change the colour using abbreviations, and the line style by using the linestyle parameter. Just make the following changes in the above code and see the output:

mpp.plot(o,r_india,'m',linestyle=':')
mpp.plot(o,r_aust,'y',linestyle='-.')
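
The legend() function listed earlier is not used in the examples above. A sketch of the match summary chart with a legend follows. The label texts, the file name match_summary.png and the use of the non-interactive Agg backend with savefig() are assumptions made so that the script also runs where no window can be opened; on a normal desktop the matplotlib.use line can be removed and mpp.show() used instead.

```python
import matplotlib
matplotlib.use('Agg')   # assumption: draw without opening a window
import matplotlib.pyplot as mpp

overs = [5, 10, 15, 20]
r_india = [30, 80, 120, 200]
r_aust = [25, 85, 100, 186]

# label gives the text shown for each line in the legend
mpp.plot(overs, r_india, 'm', linestyle=':', label='India')
mpp.plot(overs, r_aust, 'y', linestyle='-.', label='Australia')
mpp.xlabel('Overs')
mpp.ylabel('Runs')
mpp.title('Match Summary')
mpp.legend()                        # draws the legend box
mpp.savefig('match_summary.png')    # hypothetical file name
```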

BAR CHART

The bar graph represents data in horizontal or vertical bars. The bar() function is used to create
bar graph. It is most commonly used for 2D data representation.
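
A minimal sketch of bar() follows, reusing the subject marks from the line chart example. The colour, the file name marks_bar.png and the Agg backend with savefig() are assumptions for illustration; mpp.show() can be used instead on a desktop.

```python
import matplotlib
matplotlib.use('Agg')   # assumption: draw without opening a window
import matplotlib.pyplot as mpp

subjects = ['English', 'Maths', 'Hindi']
marks = [88, 90, 94]

mpp.figure()                                  # start a fresh figure
bars = mpp.bar(subjects, marks, color='g')    # one vertical bar per subject
mpp.xlabel('Subjects')
mpp.ylabel('Marks')
mpp.title('Progress Report Chart')
mpp.savefig('marks_bar.png')                  # hypothetical file name

# mpp.barh(subjects, marks) would draw the same data as horizontal bars
```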
HISTOGRAM CHART

It is a bar chart showing a FREQUENCY DISTRIBUTION.

In this case, the data is grouped into ranges, such as "100 to 199", "200 to 299", etc., and then plotted as bars based on the frequency values. Each range is also called a "bin".

The width of the bars shows the bins and the y-axis shows the frequency.

It is similar to a bar graph, but with the difference that in a histogram each bar is for a range of data.

The width of the bars corresponds to the class intervals, while the height of each bar corresponds to the frequency of the class it represents.

A histogram is quite similar to a vertical bar graph with no space in between the vertical bars. When you have data whose data points fall within particular ranges, you can use a histogram to visualize this data. It is helpful to display statistical data or data measured in quantities, e.g. marks, scores, units etc. It was first introduced by Karl Pearson.

A histogram is a graphical display of data using bars of different heights. In a histogram, each bar groups numbers into ranges. Taller bars show that more data falls in that range.
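
A histogram can be drawn with the hist() function of Pyplot. The sketch below uses made-up marks out of 50 and bins of width 10; the data, the file name marks_hist.png and the Agg backend with savefig() are all assumptions for illustration.

```python
import matplotlib
matplotlib.use('Agg')   # assumption: draw without opening a window
import matplotlib.pyplot as mpp

# Made-up marks of 15 students, out of 50
marks = [12, 18, 25, 27, 31, 33, 35, 38, 41, 42, 44, 47, 49, 22, 36]

mpp.figure()   # start a fresh figure
# bins gives the edges of the ranges: 10-20, 20-30, 30-40, 40-50
counts, edges, patches = mpp.hist(marks, bins=[10, 20, 30, 40, 50],
                                  edgecolor='black')
mpp.xlabel('Marks')
mpp.ylabel('Number of students')
mpp.title('Frequency Distribution of Marks')
mpp.savefig('marks_hist.png')   # hypothetical file name

print(counts)   # number of students in each bin
```

Here counts holds the height of each bar, i.e. the frequency of each range, and the heights add up to the total number of students.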

CONCEPT OF FREQUENCY DISTRIBUTION :

Let’s consider a test given to students out of 50 marks. Following are the scores they get.

As per the scores, let's see how many students scored in the different ranges of scores. Like,
