
Final Ipl Project 1

The document describes an analysis of cricket match data from the Indian Premier League (IPL) spanning 2008-2018. The analysis uses Python and its data processing and visualization libraries to find insights such as the most successful teams, highest run/wicket wins, and most matches played in a season.

Uploaded by

aethenaethu
Copyright © All Rights Reserved

INFORMATICS PRACTICES PROJECT

ON
IPL ANALYSIS

Submitted by
Aethen Paul Mathew
Class XII C
INDEX
• DECLARATION
• ACKNOWLEDGMENT
• HEADER FILES USED
• INTRODUCTION ABOUT PYTHON
• INTRODUCTION ABOUT MYSQL
• SOFTWARE AND HARDWARE REQUIREMENTS
• WORKING DESCRIPTION
• DATA COLLECTION
• DATA VISUALIZATION
• SOURCE CODE
• OUTPUT
• CONCLUSION
• BIBLIOGRAPHY
DECLARATION

I declare that the project work entitled "IPL ANALYSIS", submitted to the Department of Informatics Practices, St. Philomena's Public School, Elanji, was prepared by me. All the code is the result of my personal efforts.

Submitted by
Aethen Paul Mathew
Class XII C
ACKNOWLEDGMENT
First of all, I thank God for enabling me to complete this project successfully. I would then like to thank my Informatics Practices teacher, Mrs. Jasmine Jacob, whose valuable guidance helped me refine the project and make it a success. Her suggestions and instructions were the major contribution towards the completion of this project.

I also express my gratitude to our senior principal, Rev. Dr. John Erniakulathil, and our principal, Joju Joseph, for their encouragement and for all the facilities provided for the completion of this project.

Finally, I would like to thank my parents and friends, whose valuable suggestions and guidance were helpful in the various phases of this project.


HEADER FILES USED

• CSV Connectivity
INTRODUCTION ABOUT PYTHON

Python is a high-level, general-purpose, open-source programming language. It is both object-oriented and procedural. Python is an extremely powerful language.

FEATURES OF PYTHON
• Python is a high-level, open-source, general-purpose programming language.
• It is object-oriented, procedural and functional.
• It has libraries to support GUI development.
• It is extremely powerful and easy to learn.
• It is open source, so it is freely available to everyone.
• It runs on Windows, Linux and Mac OS.
• Python enables us to write clear, logical applications for small and large tasks.
• It has high-level built-in data types: strings, lists, dictionaries, etc.
• It encourages us to write clear and well-structured code.
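As a small illustration of the built-in data types listed above, the following sketch uses a string, a list and a dictionary together. The team names and numbers are sample values chosen only for this example, not results of the project.

```python
# Sample values, used only to illustrate the built-in types.
team = "Mumbai Indians"                                   # str: immutable text
seasons = [2008, 2009, 2010]                              # list: ordered, mutable
wins = {"Mumbai Indians": 11, "Chennai Super Kings": 9}   # dict: key -> value

seasons.append(2011)             # a list grows in place
wins[team] = wins[team] + 1      # a dict value is updated through its key

summary = f"{team.upper()} has {wins[team]} wins across {len(seasons)} seasons"
print(summary)
```

Each type brings its own operations (append for lists, key lookup for dictionaries, upper for strings), which is what keeps short programs like this readable.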

APPLICATIONS OF PYTHON

• Machine Learning

• Data Analysis

• Web Development

• Console based authentication

• 3D CAD Applications
INTRODUCTION ABOUT MYSQL

MySQL is an open-source, freely available Relational Database Management System (RDBMS) that uses Structured Query Language (SQL). It provides excellent features for creating, storing, maintaining and accessing data stored in the form of databases and their respective tables.

The MySQL database system works on a client-server architecture. It consists of a MySQL server, which runs on the machine containing the databases, and MySQL clients, which connect to the server machine over a network.
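The idea of creating, storing and accessing data in tables can be sketched from Python. MySQL itself needs a running server, so this example swaps in the standard-library sqlite3 module as a stand-in. It is not MySQL, but the SQL statements are of the same kind. The table name, columns and rows are assumptions made for this illustration.

```python
import sqlite3

# sqlite3 is a stand-in for MySQL here: no server is needed, the SQL idea is the same.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create a table and store a few hypothetical match rows in it.
cur.execute("CREATE TABLE matches (id INTEGER PRIMARY KEY, season INTEGER, winner TEXT)")
rows = [
    (1, 2008, "Rajasthan Royals"),
    (2, 2008, "Kings XI Punjab"),
    (3, 2009, "Rajasthan Royals"),
]
cur.executemany("INSERT INTO matches VALUES (?, ?, ?)", rows)
conn.commit()

# Access the stored data: number of wins per team, highest first.
cur.execute(
    "SELECT winner, COUNT(*) FROM matches "
    "GROUP BY winner ORDER BY COUNT(*) DESC, winner"
)
wins_by_team = cur.fetchall()
print(wins_by_team)
conn.close()
```

With a real MySQL server, the connection step would go through a MySQL client library instead, while the CREATE, INSERT and SELECT statements would stay largely the same.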

ADVANTAGES OF MYSQL

• Reliability and performance

• Modifiable

• Multi platform support

• Powerful Processing Capabilities

• Integrity

• Authorization
SOFTWARE AND HARDWARE REQUIREMENTS

SOFTWARE REQUIREMENTS:

• Python 3.6.x or a higher version
• Pandas library preinstalled
• Matplotlib library preinstalled

HARDWARE REQUIREMENTS:

• A computer or laptop running Windows 7 or above
• x86 64-bit CPU (Intel/AMD architecture)
• 4 GB RAM
• 5 GB free disk space


WORKING DESCRIPTION

INTRODUCTION ABOUT PROJECT

Cricket is one of the most popular games in India. After the start of the IPL, Indian cricket standards rose to a new level, and many talented players got a chance to prove themselves on a platform where many international cricketers play together. The IPL is one of the leading cricket tournaments in the world.

The Indian Premier League (IPL) is a professional Twenty20 cricket league in India. It was initiated by the Board of Control for Cricket in India (BCCI), headquartered in Mumbai, and is supervised by BCCI Vice President Rajeev Shukla, who serves as the league's chairman and commissioner. The IPL works on a franchise system based on the American style of hiring players and transfers.

Cricket, as you can imagine, is rich in data points. It is a battle between bat and ball played across different formats and at different levels. Ball-by-ball analysis of matches can produce some surprising hidden insights, such as the most productive batting partnerships and who the best batting partner is.

THE MAIN OBJECTIVES OF THIS PROJECT ARE:

• To find the team that won by the maximum number of runs.
• To find the team that won by the maximum number of wickets.
• To find the team that won by the minimum number of runs.
• To find the team that won by the minimum number of wickets.
• To find the season that had the most matches.
• To find the most successful IPL team.
• To find the players who won Man of the Match the most times.
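The logic behind these objectives can be sketched in plain Python on a handful of made-up match records. The project itself uses pandas for this. The field names below mirror the columns used in the source code, while the teams, players and margins are hypothetical.

```python
from collections import Counter

# Hypothetical match records; field names mirror the columns used in the project.
matches = [
    {"season": 2008, "winner": "Team A", "win_by_runs": 140, "win_by_wickets": 0, "player_of_match": "P1"},
    {"season": 2008, "winner": "Team B", "win_by_runs": 0, "win_by_wickets": 9, "player_of_match": "P2"},
    {"season": 2009, "winner": "Team A", "win_by_runs": 1, "win_by_wickets": 0, "player_of_match": "P1"},
    {"season": 2009, "winner": "Team C", "win_by_runs": 0, "win_by_wickets": 2, "player_of_match": "P3"},
    {"season": 2009, "winner": "Team A", "win_by_runs": 25, "win_by_wickets": 0, "player_of_match": "P1"},
]

# A margin of 0 means the match was not won that way, so filter those rows out first.
by_runs = [m for m in matches if m["win_by_runs"] >= 1]
by_wickets = [m for m in matches if m["win_by_wickets"] >= 1]

max_runs_win = max(by_runs, key=lambda m: m["win_by_runs"])
min_runs_win = min(by_runs, key=lambda m: m["win_by_runs"])
max_wickets_win = max(by_wickets, key=lambda m: m["win_by_wickets"])
min_wickets_win = min(by_wickets, key=lambda m: m["win_by_wickets"])

# Counting questions: busiest season, most successful team, most awards.
busiest_season = Counter(m["season"] for m in matches).most_common(1)[0][0]
top_team = Counter(m["winner"] for m in matches).most_common(1)[0][0]
top_player = Counter(m["player_of_match"] for m in matches).most_common(1)[0][0]

print(max_runs_win["winner"], min_runs_win["winner"])
print(max_wickets_win["winner"], min_wickets_win["winner"])
print(busiest_season, top_team, top_player)
```

Filtering out zero margins before taking the minimum matters: without it, every match won by wickets would count as the "closest" win by runs.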


DATA COLLECTION

Data has been collected from www.iplt20.com and www.cricsheet.org. It consists of ball-by-ball details for a total of 696 matches from 2008 to 2018. Ball-by-ball data gives an in-depth record of every ball bowled in each over: a ball could be a wide, a dead ball or a no-ball, or the batsman could have scored a single, a double, a triple, a four or a six off it. The dataset comes as two CSV files.

Matches.csv gives the details of each match: venue, location, season, the contesting teams, toss winner and toss decision, match result, whether the win was by runs or by wickets, player of the match, all three umpires, the match winner, etc.

Deliveries.csv is the ball-by-ball data, combining all the deliveries from 2008 to 2018. It consists of attributes such as Match_id, bowling team, batting team, batsman, bowler, non-striker, no-ball runs, penalty runs, extra runs, over, total runs, etc. Innings indicates whether a delivery belongs to the first or the second innings of the match. Over gives the current over number, and Ball gives the number of the current ball within that over.
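How the two files relate can be sketched with the standard csv module. The in-memory text below only stands in for Matches.csv and Deliveries.csv: the exact column names and the rows are assumptions for this illustration, and the real files contain many more columns.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-ins for the two files (hypothetical rows and column names).
matches_csv = io.StringIO(
    "id,season,team1,team2,winner\n"
    "1,2008,Team A,Team B,Team A\n"
)
deliveries_csv = io.StringIO(
    "match_id,inning,over,ball,batsman,bowler,extra_runs,total_runs\n"
    "1,1,1,1,P1,B1,0,4\n"
    "1,1,1,2,P1,B1,1,1\n"
    "1,2,1,1,P2,B2,0,6\n"
    "1,2,1,2,P2,B2,0,0\n"
)

# Index the match-level rows by id so each delivery can be linked to its match.
matches = {row["id"]: row for row in csv.DictReader(matches_csv)}

# Add up total_runs ball by ball for every (match, innings) pair.
innings_totals = defaultdict(int)
for d in csv.DictReader(deliveries_csv):
    innings_totals[(d["match_id"], int(d["inning"]))] += int(d["total_runs"])

for (match_id, inning), runs in sorted(innings_totals.items()):
    season = matches[match_id]["season"]
    print(f"match {match_id} ({season}), innings {inning}: {runs} runs")
```

The match id is the link between the files: match-level facts such as season and winner live in one file, and every delivery in the other points back to them.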


DATA VISUALIZATION

The most important part of data visualization and predictive analysis is representing the data as charts and graphs so that it can be understood visually. The collected data is visualized to give a clearer understanding of all the parameters of the season, the teams, the all-rounders, the batsmen and the bowlers, so that it is helpful to team selectors, captains and managers at the next auction. Different packages are used to analyse and visualize the data for players and teams.
SOURCE CODE

import numpy as np  # numerical computing
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt  # visualization
import seaborn as sns  # modern visualization

plt.rcParams['figure.figsize'] = (14, 8)
sns.set_style("darkgrid")

# Raw string so the backslash in the Windows path is not read as an escape
df = pd.read_csv(r"E:\ipl1.csv")

print('--------------------------------------------')
print('--------------------------------------------')
df.info()  # prints the summary itself; wrapping it in print() would add a stray "None"
print()

print('--------------------------------------------')
print('--------------------------------------------')
print('Total Matches are::::', df['id'].max())
print()

print('--------------------------------------------')
print('--------------------------------------------')
# Double quotes outside, because the message itself contains an apostrophe
print("How many seasons data we've got in the dataset?")
print(df['season'].unique())

print()

print('--------------------------------------------')
print('--------------------------------------------')
print('Which Team had won by maximum runs?')
# idxmax() returns an index label, so the row is looked up with .loc
print(df.loc[df['win_by_runs'].idxmax()])
print()

print('--------------------------------------------')
print('--------------------------------------------')
print('Which Team had won by maximum wickets?')
print(df.loc[df['win_by_wickets'].idxmax()]['winner'])
print()

print('--------------------------------------------')
print('--------------------------------------------')
print('Which Team had won by (closest margin) minimum runs?')
# Keep only matches actually won by runs (margin >= 1) before taking the minimum
print(df.loc[df[df['win_by_runs'].ge(1)].win_by_runs.idxmin()]['winner'])
print()

print('--------------------------------------------')
print('--------------------------------------------')
print('Which Team had won by minimum wickets?')
# Same idea: ignore matches that were not won by wickets
print(df.loc[df[df['win_by_wickets'].ge(1)].win_by_wickets.idxmin()])
print()

print('--------------------------------------------')
print('--------------------------------------------')
print('Which season had most number of matches?')
sns.countplot(x='season', data=df)
plt.show()
print()

print('--------------------------------------------')
print('--------------------------------------------')
print('The Most Successful IPL Team is:::')
data = df.winner.value_counts()
sns.barplot(y=data.index, x=data.values, orient='h')
plt.show()  # show this chart before the next figure is created
print()

print('--------------------------------------------')
print('--------------------------------------------')
print('Players who got max times Man of Match are:::')
top_players = df.player_of_match.value_counts()[:10]
fig, ax = plt.subplots()
sns.barplot(x=top_players.index, y=top_players.values, orient='v', ax=ax)
# Limits and labels are set after plotting so that seaborn does not overwrite them
ax.set_ylim([0, 20])
ax.set_ylabel("Count")
ax.set_title("Top player of the match Winners")
plt.show()
Output
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 637 entries, 0 to 636
Data columns (total 17 columns):
Match_SK        637 non-null int64
match_id        637 non-null int64
Team1           637 non-null object
Team2           637 non-null object
match_date      637 non-null object
Season_Year     637 non-null int64
Venue_Name      636 non-null object
City_Name       637 non-null object
Country_Name    637 non-null object
Toss_Winner     636 non-null object
match_winner    634 non-null object
Toss_Name       636 non-null object
Win_Type        635 non-null object
Outcome_Type    637 non-null object
ManOfMach       633 non-null object
Win_Margin      628 non-null float64
Country_id      637 non-null int64
dtypes: float64(1), int64(4), object(12)
memory usage: 84.7+ KB

df.groupby('Season_Year')['match_winner'].value_counts()

Season_Year  match_winner
2008         Rajasthan Royals               13
             Kings XI Punjab                10
             Chennai Super Kings             9
             Delhi Daredevils                7
             Mumbai Indians                  7
             Kolkata Knight Riders           6
             Royal Challengers Bangalore     4
             Deccan Chargers                 2
2009         Delhi Daredevils               10
             Deccan Chargers                 9
             Royal Challengers Bangalore     9
             Chennai Super Kings             8
             Kings XI Punjab                 7
             Rajasthan Royals                6
             Mumbai Indians                  5
             Kolkata Knight Riders           3
2010         Mumbai Indians                 11
             Chennai Super Kings             9
             Deccan Chargers                 8
             Royal Challengers Bangalore     8
             Delhi Daredevils                7
             Kolkata Knight Riders           7
             Rajasthan Royals                6

df['Season_Year'].value_counts()

2013    76
2012    74
2011    73
2017    60
2016    60
2014    60
2010    60
2015    59
2008    58
2009    57
Name: Season_Year, dtype: int64

Which season had most number of matches?


-----------------------------------

-----------------------------------

Most successful IPL Teams is:::


CONCLUSION
In this paper, the performance of IPL players (batsmen) and toss-related statistics from the 2008-2018 seasons have been visualized. Finding the hidden parameters, patterns and attributes that lead to the outcome of a cricket match helps team owners and selectors recognize better players. The salary of an IPL player is decided through the auction process, so deciding which player to bid for, and at what cost, based on the player's past IPL performance is an important matter for every franchise. Every selector needs young and dynamic players who can handle pressure calmly and take the team over the winning line.

This paper highlights player performance, especially that of batsmen, and covers the analysis of Maximum Man of the Match awards, Maximum Centuries Scored by Batsmen, Top Batsmen, Batsmen with Top Strike Rate, and Top 10 Players with Maximum Runs. Statistics from 696 matches were used in this experiment, including for toss-related analysis such as Count of Toss Wins, Decision Taken by Each Team after Winning the Toss, Toss Decision Season Wise, and Toss Decision Team Wise. SK Raina is considered the finest batsman: he is second in the list of batsmen with maximum runs and also features in the lists for maximum Man of the Match awards and maximum centuries scored. V Kohli is in first position for maximum runs and is also in the list for maximum centuries. The other Indian star batsmen, MS Dhoni (best captain, maximum runs and maximum Man of the Match awards), Rishabh Pant (second-best strike rate and maximum centuries), RG Sharma, S Dhawan, G Gambhir, YK Pathan and M Vijay, performed very well in the last five overs. Selectors have a clear reason to give preference to Indian players first, as they performed very well in the seasons from 2008 to 2018.
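The batting measures named above can be computed from ball-by-ball data. The sketch below uses the usual definition of strike rate, runs scored per 100 balls faced, with wides not counted as balls faced. The rows and player labels are hypothetical, and this is only an illustration of the calculation, not the code used to obtain the results quoted in this conclusion.

```python
from collections import defaultdict

# Hypothetical ball-by-ball rows: (batsman, runs off the bat, is_wide).
deliveries = [
    ("P1", 4, False), ("P1", 0, False), ("P1", 6, False), ("P1", 0, True),
    ("P2", 1, False), ("P2", 1, False), ("P2", 0, False), ("P2", 2, False),
]

runs = defaultdict(int)
balls = defaultdict(int)
for batsman, batsman_runs, is_wide in deliveries:
    runs[batsman] += batsman_runs
    if not is_wide:              # a wide is not counted as a ball faced
        balls[batsman] += 1

# Strike rate = 100 * runs / balls faced; skip anyone with no legal balls faced.
strike_rate = {b: 100 * runs[b] / balls[b] for b in runs if balls[b] > 0}

# Rank batsmen by strike rate and by total runs, highest first.
by_strike_rate = sorted(strike_rate, key=strike_rate.get, reverse=True)
by_runs = sorted(runs, key=runs.get, reverse=True)

print(by_strike_rate, by_runs)
```

On real data it is common to also require a minimum number of balls faced before ranking by strike rate, so that a player who faced only a few balls does not top the list.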

We also presented toss-related analysis, in which MS Dhoni is the best captain, for CSK, having won the toss the most times (a count of 77) and elected to bat first. Their choice to bat first mostly resulted in a win. Most of the time, captains elect to field first so that they can plan and perform well while chasing. RCB, KKR, MI and KXIP elected to field first most of the time, with counts of 57 and 49. Selectors have a clear choice to select batsmen from Mumbai Indians and Kings XI Punjab, as these two teams handled pressure very well during all the seasons from 2008 to 2018. By considering all these visualizations and the toss-related analysis, team management can select the right players and the right teams at the time of auction. A good and strong cricket team, with the highest chance of winning, can be formed within a given budget.
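The toss-related counts described above (toss wins per team, the decision taken after winning the toss, and how often the toss winner also won the match) can be sketched as follows. The rows are hypothetical and the tuple layout is an assumption for this example. It shows the counting technique only and does not reproduce the figures quoted above.

```python
from collections import Counter

# Hypothetical match rows: (toss_winner, toss_decision, match_winner).
matches = [
    ("Team A", "bat", "Team A"),
    ("Team A", "bat", "Team B"),
    ("Team B", "field", "Team B"),
    ("Team C", "field", "Team C"),
    ("Team A", "bat", "Team A"),
]

# Count of toss wins per team.
toss_wins = Counter(team for team, _, _ in matches)

# Decision taken by each team after winning the toss.
decisions = Counter((team, decision) for team, decision, _ in matches)

# How often the toss winner went on to win the match.
toss_and_match = sum(1 for team, _, winner in matches if team == winner)
share = toss_and_match / len(matches)

print(toss_wins.most_common())
print(decisions)
print(f"Toss winner also won the match in {share:.0%} of matches")
```

Adding the season to each row and to the Counter key would give the season-wise version of the same decision counts.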

BIBLIOGRAPHY

1. Informatics Practices with Python by Preeti Arora

2. http://en.wikipedia.org

3. http://www.botskOOl.com
