9718/23, 10:20 AM Unile422-Copy7 - Jupyter Notebook
Analyzing PUBG Data with Python
PlayerUnknown’s Battlegrounds (PUBG) has taken the gaming world by storm, offering an
immersive battle royale experience. Beyond its gaming aspect, PUBG also provides a wealth of
data that can be mined and analyzed to gain insights into player behavior, strategies, and
performance.
In this Jupyter Notebook project, we will delve into the exciting world of PUBG data analysis
using Python. We will be working with a dataset containing a treasure trove of information,
including player statistics, match details, and in-game events. Our goal is to harness the power
of Python libraries such as Pandas, NumPy, and Matplotlib to extract meaningful insights from
this dataset.
Import Libraries
In [29]: import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
Loading the CSV File
In [30]: df = pd.read_csv(r"Users\Syed Arif\Desktop\Pubg_stats.csv")
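The "Unnamed: 0" column that shows up later in this notebook is typical of a CSV that was saved together with its row index. As a minimal sketch, assuming the file has that layout, the index can be absorbed at load time with index_col=0. The CSV text and the player names below are hypothetical stand-ins, not rows from Pubg_stats.csv.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file: the first header cell is empty
# because the original row index was written out as a column.
csv_text = """,Player_Name,Matches_Played,Kills
0,PlayerA,250,587
1,PlayerB,312,611
2,PlayerC,198,492
"""

# Default read: the saved index becomes a regular column named "Unnamed: 0".
raw = pd.read_csv(io.StringIO(csv_text))
print(raw.columns.tolist())

# index_col=0 uses that first column as the row index instead.
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(clean.columns.tolist())
```

With this option there is no extra column to drop afterwards.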
Data Preprocessing
.head()
head() shows the first rows of the dataset; by default it returns 5 rows.
localhost B888/notebooks/Untilee2-Copy7. py
In [31]: df.head()
Out[31]: [Table: the first five rows (index 0 to 4) for StealthMaster, SniperLion, NinjaGamer, ThunderStrike, and SpeedDemon; visible columns: Unnamed: 0, Player_Name, Matches_Played, Kills, Deaths, Assists, Damage_Dealt, Headshots]
.tail()
tail() shows the last rows of the dataset (5 by default). It does not sort the data.
In [32]: df.tail()
Out[32]: [Table: the last five rows (index 216 to 220) for CrimsonRider, BlazingSorcerer, FrozenFlare, AbyssGuardian, and SpectralPhantom; visible columns: Unnamed: 0, Player_Name, Matches_Played, Kills, Deaths, Assists, Damage_Dealt, Headshots]
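Since head() and tail() only slice rows in their stored order, a descending view needs an explicit sort. The following is a small sketch of the difference; the DataFrame is a hypothetical stand-in with made-up values.

```python
import pandas as pd

# Hypothetical mini-dataset; the values are made up for illustration.
df = pd.DataFrame({
    "Player_Name": ["PlayerA", "PlayerB", "PlayerC", "PlayerD"],
    "Kills": [587, 611, 492, 923],
})

first_two = df.head(2)  # rows at positions 0 and 1
last_two = df.tail(2)   # rows at positions 2 and 3, still in stored order

# Descending order requires sort_values; tail() alone never reorders rows.
by_kills_desc = df.sort_values(by="Kills", ascending=False)

print(last_two["Player_Name"].tolist())       # ['PlayerC', 'PlayerD']
print(by_kills_desc["Player_Name"].tolist())  # ['PlayerD', 'PlayerB', 'PlayerA', 'PlayerC']
```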
.shape
It shows the total number of rows and columns in the dataset.
In [33]: df.shape
Out[33]: (221, 15)
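Because shape is a plain tuple, it can be unpacked directly into a row count and a column count. A brief sketch on a hypothetical three-row DataFrame:

```python
import pandas as pd

# Hypothetical mini-dataset; the values are made up for illustration.
df = pd.DataFrame({
    "Player_Name": ["PlayerA", "PlayerB", "PlayerC"],
    "Kills": [587, 611, 492],
})

# shape is an attribute (no parentheses) holding (rows, columns).
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")

# len(df) returns the row count on its own.
row_count = len(df)
```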
.columns
It shows the name of each column.
In [34]: df.columns
Out[34]: Index(['Unnamed: 0', 'Player_Name', 'Matches_Played', 'Kills', 'Deaths',
       'Assists', 'Damage_Dealt', 'Headshots', 'Wins', 'Top_10s', 'Revives',
       'Distance_Traveled', 'Weapons_Used', 'Time_Survived', 'Rank'],
      dtype='object')
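The column index can also be used to check that the columns an analysis depends on are present before any plotting starts. This is only a sketch: the list of required names is an assumption chosen for illustration, and the empty DataFrame stands in for the real one.

```python
import pandas as pd

# Hypothetical empty frame with a subset of the column names shown above.
df = pd.DataFrame(columns=["Unnamed: 0", "Player_Name", "Matches_Played", "Kills", "Rank"])

# Columns the later cells would need (an illustrative choice).
required = ["Player_Name", "Kills", "Wins"]

# Membership tests against df.columns work like tests against a list.
missing = [name for name in required if name not in df.columns]
print(missing)  # ['Wins']
```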
.dtypes
This attribute shows the data type of each column.
In [35]: df.dtypes
Out[35]: Unnamed: 0            int64
Player_Name          object
Matches_Played        int64
Kills                 int64
Deaths                int64
Assists               int64
Damage_Dealt          int64
Headshots             int64
Wins                  int64
Top_10s               int64
Revives               int64
Distance_Traveled     int64
Weapons_Used          int64
Time_Survived         int64
Rank                 object
dtype: object
.unique()
It shows the unique values of a specific column.
In [36]: df["Player_Name"].unique()
Out[36]: array(['StealthMaster', 'SniperLion', 'NinjaGamer', 'ThunderStrike',
       'SpeedDemon', 'BlazeFury', 'RapidShadow', 'Frostbite',
       'SavageQueen', 'SwiftStriker', 'VenomousViper', 'PhoenixFury',
       'SteelStorm', 'BlazingSlade', 'StormChaser', 'Nightmare',
       'CrimsonTide', 'SilentShadow', 'VengefulViper', 'SolarFlare',
       'SkyDancer', 'RogueWraith', 'LethalLynx', 'FrostFang',
       'ScarletStrider', 'RagingRaptor', 'ShadowWisp', 'VenomStrike',
       'FireFury', 'BlazingSun', 'ShadowStrike', 'SteelGuardian',
       'WickedWitch', 'RuthlessRaptor', 'FrostyFox', 'ViperVenom',
       'CrimsonReaper', 'PhantomGhost', 'StormStrider', 'StormBreaker',
       'SapphireSword', 'ShadowReign', 'DragonSlayer', 'SilverShadow',
       'EagleEye', 'BlazingStorm', 'MidnightSage', 'RapidBlaze',
       'Frostfire', 'ScarletWitch', 'RagingTiger', 'SpectralRogue',
       'BlazingRaptor', 'EternalShadow', 'WickedStrider', 'CrimsonStorm',
       'RuthlessReaper', 'FrostFury', 'ShadowBlade', 'RapidPhantom',
       'Viperstrike', 'EternalBlaze', 'Vengeance', 'LunarShadow',
       'Deathstrike', 'AzureBlade', 'RavenHeart', 'SerpentFury',
       'CrimsonRogue', 'VoidSeeker', 'AstralSword', 'FrozenFlame',
       'TwilightWarden', 'ShadowPhoenix', 'PhantomStrider', 'EternalFire',
       'NebulaBlade', 'SilverHawk', 'SolarSword', 'EclipseShadow',
       'StarBlade', 'LethalWraith', 'RadiantBlaze', 'FrostGuardian',
       'MysticSerpent', 'InfernoStorm', 'BlazeRanger', 'RagingFire',
       'ShadowDancer', 'PhoenixWings', 'IceStorm', 'MoonlitSorcerer',
       'DarkReaper', 'CosmicGhost', 'StormRider', 'FlareRogue',
       'RadiantBlade', 'TempestPhantom', 'SapphireViper', 'EternalFlame',
       'StarlightBlade', 'CrimsonRider', 'BlazingSorcerer', 'FrozenFlare',
       'AbyssGuardian', 'SpectralPhantom'], dtype=object)
.nunique()
It shows the number of unique values in each column of the DataFrame.
In [37]: df.nunique()
Out[37]: Unnamed: 0           221
Player_Name          106
Matches_Played        70
Kills                 90
Deaths                82
Assists               65
Damage_Dealt         102
Headshots             78
Wins                  28
Top_10s               59
Revives               39
Distance_Traveled    105
Weapons_Used           9
Time_Survived        107
Rank                   4
dtype: int64
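With 221 rows but only 106 distinct values of Player_Name, 221 - 106 = 115 rows must repeat a name that has already appeared. The sketch below shows the same arithmetic on a hypothetical five-row DataFrame and cross-checks it with duplicated().

```python
import pandas as pd

# Hypothetical mini-dataset in which PlayerA appears three times.
df = pd.DataFrame({
    "Player_Name": ["PlayerA", "PlayerB", "PlayerA", "PlayerC", "PlayerA"],
    "Kills": [587, 611, 492, 923, 604],
})

n_rows = len(df)
n_unique = df["Player_Name"].nunique()

# Rows beyond the first occurrence of each name.
repeated_rows = n_rows - n_unique

# duplicated() flags every occurrence of a name after its first one.
flagged = int(df["Player_Name"].duplicated().sum())
print(n_rows, n_unique, repeated_rows, flagged)  # 5 3 2 2
```

Repeated names matter later on: any "top 5" taken row by row can list the same player more than once.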
.describe()
It shows the count, mean, standard deviation, minimum, quartiles (the 50% row is the median), and maximum of each numeric column.
In [38]: df.describe()
Out[38]:
       Unnamed: 0  Matches_Played       Kills      Deaths     Assists  Damage_Dealt  Headshots
count  221.000000      221.000000  221.000000  221.000000  221.000000    221.000000   221.0000
mean   110.000000      234.624434  612.674208  142.579186    2.615385  14801.004525   207.3619
std     63.941979       37.178429   89.311216   92.882564   21.423045   1902.947975    29.7759
min      0.000000      143.000000  368.000000   68.000000   42.000000    865.000000   123.0000
25%     55.000000      206.000000  543.000000  117.000000   76.000000  13589.000000   193.0000
50%    110.000000      224.000000  604.000000  138.000000   92.000000  14894.000000   210.0000
75%    165.000000      257.000000  674.000000  167.000000  117.000000  15987.000000   226.0000
max    220.000000      409.000000  923.000000  267.000000  139.000000  21037.000000   312.0000
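As a quick check of how to read this summary, the 50% row equals the median, and calling describe() on a text column reports count, unique, top, and freq instead. The sketch uses a hypothetical five-row DataFrame; its values are made up.

```python
import pandas as pd

# Hypothetical mini-dataset; the values are made up for illustration.
df = pd.DataFrame({
    "Kills": [368, 543, 604, 674, 923],
    "Rank": ["Gold", "Silver", "Gold", "Platinum", "Gold"],
})

# Numeric summary: the "50%" row is the median of the column.
summary = df.describe()
median_from_summary = summary.loc["50%", "Kills"]

# Text column summary: count, unique, top (most frequent value), freq.
rank_summary = df["Rank"].describe()
print(median_from_summary, rank_summary["top"], rank_summary["freq"])
```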
.value_counts()
It shows each unique value together with its count.
In [39]: df["Player_Name"].value_counts()
Out[39]: VengefulViper      7
Frostbite          5
VenomousViper      5
LethalLynx         4
Nightmare          4
                  ..
MidnightSage       1
BlazingStorm       1
EagleEye           1
SilverShadow       1
SpectralPhantom    1
Name: Player_Name, Length: 106, dtype: int64
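value_counts() sorts by count in descending order by default, and normalize=True turns the counts into shares of all rows. A short sketch on a hypothetical Series of made-up names:

```python
import pandas as pd

# Hypothetical names; PlayerA appears 3 times, PlayerB twice, PlayerC once.
names = pd.Series(
    ["PlayerA", "PlayerB", "PlayerA", "PlayerC", "PlayerA", "PlayerB"],
    name="Player_Name",
)

counts = names.value_counts()                # absolute counts, largest first
shares = names.value_counts(normalize=True)  # fractions that sum to 1

print(counts.head(2))
print(shares.round(2))
```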
.isnull()
It marks each cell as True if the value is missing and False otherwise.
In [40]: df.isnull()
Out[40]: [Table: Boolean DataFrame with the same shape as df; every displayed cell is False]
221 rows × 15 columns
In [41]: sns.heatmap(df.isnull())
plt.show()
[Figure: heatmap of df.isnull() across all rows and columns]
In [42]: df.isna().sum()
Out[42]: Unnamed: 0           0
Player_Name          0
Matches_Played       0
Kills                0
Deaths               0
Assists              0
Damage_Dealt         0
Headshots            0
Wins                 0
Top_10s              0
Revives              0
Distance_Traveled    0
Weapons_Used         0
Time_Survived        0
Rank                 0
dtype: int64
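The dataset here has no gaps, so the counts are all zero. To show what the same call reports when values are missing, here is a sketch on a hypothetical DataFrame with two deliberately missing cells.

```python
import pandas as pd

# Hypothetical mini-dataset with one missing name and one missing kill count.
df = pd.DataFrame({
    "Player_Name": ["PlayerA", "PlayerB", None],
    "Kills": [587, float("nan"), 492],
})

# isna() gives a True/False table; sum() counts the True cells per column.
missing_per_column = df.isna().sum()
total_missing = int(missing_per_column.sum())

# dropna() keeps only rows without any missing value.
complete_rows = df.dropna()
print(missing_per_column.to_dict(), total_missing, len(complete_rows))
```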
Drop the Unnamed Column
In [43]: df.drop(["Unnamed: 0"], axis=1, inplace=True)
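Running a drop cell a second time raises a KeyError, because the column no longer exists. One possible variant, sketched on a hypothetical DataFrame, reassigns the result and passes errors="ignore" so the cell can be re-run safely.

```python
import pandas as pd

# Hypothetical mini-dataset that still carries the saved index column.
df = pd.DataFrame({
    "Unnamed: 0": [0, 1],
    "Player_Name": ["PlayerA", "PlayerB"],
    "Kills": [587, 611],
})

# Returns a new DataFrame without the column.
df = df.drop(columns=["Unnamed: 0"], errors="ignore")

# A second call is a no-op instead of a KeyError.
df = df.drop(columns=["Unnamed: 0"], errors="ignore")
print(df.columns.tolist())  # ['Player_Name', 'Kills']
```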
Show the Rank Counts in a Bar Plot
In [44]: df.Rank.value_counts().plot(kind="bar")
Out[44]: [Figure: bar chart of the number of rows per rank; x-axis categories as plotted: Platinum, Diamond, Gold, Silver]
Top 10 players By Matches Played
In [48]: bar_top_10_players = px.bar(df, x="Player_Name", y="Matches_Played", titl
bar_top_10_players.show()
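Passing the whole DataFrame to the plotting call draws every row, not only the ten largest. A minimal sketch of selecting the top ten rows first with nlargest follows; the DataFrame and the name top_10 are hypothetical and only illustrate the selection step.

```python
import pandas as pd

# Hypothetical dataset of 15 players with made-up match counts (150, 160, ..., 290).
df = pd.DataFrame({
    "Player_Name": [f"Player{i}" for i in range(15)],
    "Matches_Played": [150 + 10 * i for i in range(15)],
})

# Keep the 10 rows with the highest Matches_Played, largest first.
top_10 = df.nlargest(10, "Matches_Played")
print(top_10["Matches_Played"].tolist())
```

The resulting frame can then be handed to the plot in place of df, for example px.bar(top_10, x="Player_Name", y="Matches_Played").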
In [56]: # Sort the DataFrame by "Matches_Played" in descending order
df = df.sort_values(by="Matches_Played", ascending=False)
# Select the top 5 players with the highest matches played
top_5_players = df.head(5)
# Create a bar plot for the top 5 players
plt.figure(figsize=(10, 6))
plt.bar(top_5_players["Player_Name"], top_5_players["Matches_Played"])
plt.xlabel("Player_Name")
plt.ylabel("Matches_Played")
plt.title("Top 5 Players by Matches Played")
plt.xticks(rotation=45)
Out[56]: ([0, 1, 2, 3],
 [Text(0, 0, ''), Text(0, 0, ''), Text(0, 0, ''), Text(0, 0, '')])
[Figure: bar chart "Top 5 Players by Matches Played"; x-axis: Player_Name, y-axis: Matches_Played]
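The output above lists four tick positions for five selected rows, which is what happens when the same Player_Name occurs twice among the top rows. One way to rank distinct players is to aggregate per player first. The sketch below uses the maximum per player, which is an assumption about what the repeated rows mean; the data is hypothetical.

```python
import pandas as pd

# Hypothetical dataset in which PlayerA has two rows.
df = pd.DataFrame({
    "Player_Name": ["PlayerA", "PlayerB", "PlayerA", "PlayerC",
                    "PlayerD", "PlayerE", "PlayerF"],
    "Matches_Played": [409, 312, 400, 250, 198, 305, 290],
})

# Row-wise top 5: PlayerA takes two of the five slots.
top_rows = df.sort_values(by="Matches_Played", ascending=False).head(5)
print(top_rows["Player_Name"].nunique())  # 4 distinct names

# Per-player top 5: one value per player (max is an assumed choice).
best_per_player = (
    df.groupby("Player_Name")["Matches_Played"].max().nlargest(5)
)
print(best_per_player.index.tolist())
```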
Top 5 players By Kills
In [59]: # Sort the DataFrame by "Kills" in descending order
df = df.sort_values(by="Kills", ascending=False)
# Select the top 5 players with the highest kills
top_5_players = df.head(5)
# Create a bar plot for the top 5 players
plt.figure(figsize=(10, 6))
plt.bar(top_5_players["Player_Name"], top_5_players["Kills"])
plt.xlabel("Player_Name")
plt.ylabel("Kills")
plt.title("Top 5 Players by Kills")
plt.xticks(rotation=45)
Out[59]: ([0, 1, 2, 3],
 [Text(0, 0, ''), Text(0, 0, ''), Text(0, 0, ''), Text(0, 0, '')])
[Figure: bar chart "Top 5 Players by Kills"; x-axis: Player_Name, y-axis: Kills]
Top 5 players By Wins
In [60]: # Sort the DataFrame by "Wins" in descending order
df = df.sort_values(by="Wins", ascending=False)
# Select the top 5 players with the highest wins
top_5_players = df.head(5)
# Create a bar plot for the top 5 players
plt.figure(figsize=(10, 6))
plt.bar(top_5_players["Player_Name"], top_5_players["Wins"])
plt.xlabel("Player_Name")
plt.ylabel("Wins")
plt.title("Top 5 Players by Wins")
plt.xticks(rotation=45)
Out[60]: ([0, 1, 2, 3],
 [Text(0, 0, ''), Text(0, 0, ''), Text(0, 0, ''), Text(0, 0, '')])
[Figure: bar chart "Top 5 Players by Wins"; x-axis: Player_Name, y-axis: Wins]
In [67]: # Sort the DataFrame by "Player_Name" in descending order
df = df.sort_values(by="Player_Name", ascending=False)
# Select the first 5 rows of the sorted DataFrame
top_5_players = df.head(5)
plt.figure(figsize=(8, 6))
plt.bar(top_5_players["Player_Name"], top_5_players["Rank"])
plt.xlabel('Player_Name')
plt.ylabel('Rank')
plt.title('Player_Name by Rank')
plt.show()
[Figure: bar chart "Player_Name by Rank"; x-axis: Player_Name (WickedWitch, WickedStrider, VoidSeeker), y-axis: Rank]
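Rank is stored as text, so sorting or plotting it follows alphabetical or first-seen order, not tier order. A sketch of giving the four rank labels an explicit order with an ordered categorical is shown below. The tier order Silver < Gold < Platinum < Diamond is an assumption and is not stated in the dataset; the rows are hypothetical.

```python
import pandas as pd

# Assumed tier order, lowest to highest (not confirmed by the dataset).
rank_order = ["Silver", "Gold", "Platinum", "Diamond"]

# Hypothetical mini-dataset; the values are made up for illustration.
df = pd.DataFrame({
    "Player_Name": ["PlayerA", "PlayerB", "PlayerC", "PlayerD"],
    "Rank": ["Gold", "Diamond", "Silver", "Platinum"],
})

# Ordered categorical: sorting and comparisons now follow rank_order.
df["Rank"] = pd.Categorical(df["Rank"], categories=rank_order, ordered=True)
sorted_df = df.sort_values("Rank")

# Integer codes (0 = Silver ... 3 = Diamond) are usable as numeric bar heights.
df["Rank_Level"] = df["Rank"].cat.codes
high_tier = int((df["Rank"] >= "Platinum").sum())
print(sorted_df["Rank"].tolist(), df["Rank_Level"].tolist(), high_tier)
```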
How Many Times Each Rank Occurs
In [69]: rank_counts = df["Rank"].value_counts().reset_index()
rank_counts.columns = ["Rank", "Count"]
rank_counts = rank_counts.sort_values(by="Count", ascending=False)
plt.figure(figsize=(10, 6))
plt.bar(rank_counts["Rank"], rank_counts["Count"])
plt.xlabel("Rank")
plt.ylabel("Count")
plt.title("Rank Distribution Among Players")
plt.xticks(rotation=45)
plt.show()
[Figure: bar chart "Rank Distribution Among Players"; x-axis: Rank, y-axis: Count]
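The column names produced by value_counts().reset_index() differ between pandas versions (older releases give "index" and the Series name, pandas 2.x gives the Series name and "count"), which is why the cell overwrites them by position. A sketch of a variant that names both columns explicitly, on a hypothetical Series:

```python
import pandas as pd

# Hypothetical rank labels; Gold appears 3 times, Silver twice, Platinum once.
ranks = pd.Series(
    ["Gold", "Silver", "Gold", "Platinum", "Gold", "Silver"], name="Rank"
)

# rename_axis names the index, reset_index(name=...) names the count column,
# so the result has the same column names regardless of pandas version.
rank_counts = ranks.value_counts().rename_axis("Rank").reset_index(name="Count")
print(rank_counts)
```

Since value_counts() already returns the counts in descending order, a further sort by Count is optional.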
How Many Times Each Rank Was Achieved by Each Player
In [74]: # Create a cross-tabulation to count how many times each rank was achieved by each player
cross_tab = pd.crosstab(df["Player_Name"], df["Rank"]).head(20)
# Plot the bar chart
cross_tab.plot(kind="bar", stacked=True, figsize=(15, 8))
# Customize the plot
plt.xlabel("Player Name")
plt.ylabel("Count")
plt.title("Rank Achievements by Player")
# Show the plot
plt.show()
[Figure: stacked bar chart "Rank Achievements by Player"; x-axis: Player Name, y-axis: Count]
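pd.crosstab returns one row per player and one column per rank, with rows sorted by name, so head(20) keeps the first 20 names alphabetically and not the 20 most frequent players. The sketch below builds a small cross-tabulation and, as one possible alternative, selects the players with the most rows. The data is hypothetical.

```python
import pandas as pd

# Hypothetical mini-dataset; PlayerA has 3 rows, PlayerB 2, PlayerC 1.
df = pd.DataFrame({
    "Player_Name": ["PlayerA", "PlayerA", "PlayerB", "PlayerA", "PlayerB", "PlayerC"],
    "Rank": ["Gold", "Gold", "Silver", "Diamond", "Silver", "Gold"],
})

# Rows: players, columns: ranks, cells: number of occurrences.
cross_tab = pd.crosstab(df["Player_Name"], df["Rank"])
print(cross_tab)

# Keep the 2 players with the highest total count across all ranks.
most_frequent = cross_tab.sum(axis=1).nlargest(2).index
top_cross_tab = cross_tab.loc[most_frequent]
print(top_cross_tab.index.tolist())  # ['PlayerA', 'PlayerB']
```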
In [ ]: