24/08/2020 Aggregating Pokémon Data with Python and Pandas
Share this
START CODING TODAY CAREERS IN CODING LEARNING TO CODE
21 JANUARY 2019 / INSIGHTS
Aggregating Pokémon
Data with Python and
Pandas
Most of the time, high-level decision-
makers require aggregated data. For
example, to understand sales trends,
business analysts need to aggregate
individual sales transactions by month,
quarter, or scal year. Data aggregation is
a key skill that can drive value for many
organizations.
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 1/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
Share this
Supermarket sales aggregated by category
Pokémon is a video game where creatures (known
as Pokémon) of different types battle each other for
glory. To commemorate the release of the newest
Pokémon games for the Nintendo Switch (Let’s Go
Pikachu and Let’s Go Eevee), we will aggregate and
analyze Pokémon data in order to answer the
following questions:
1. How many Pokémon of each type are there?
2. Which Pokémon type is the most powerful?
First, we will complete our analysis using
spreadsheets because spreadsheets are the most
widely used tool for data analysis. However, Python
programming provides more exible and more
scalable analysis options than spreadsheets, so we
will complete the analysis using Python and the
Pandas library.
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 2/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
Share this
Starting with spreadsheets
Let's start with a dataset of all Pokémon from
Pokémon Database. To follow along, download the
data here (right click and select "Save As...").
We will use Google Sheets, a free spreadsheet
application, for our analysis. Regardless of the
speci c spreadsheet tool you use, the underlying
concepts will be the same. Below is a preview of the
data.
Each row represents a single Pokémon
Each Pokémon belongs to one or two types, and
certain types are strong against other types. For
example, Charizard is a ying, re-breathing lizard,
so it is both flying and fire types, and it is weak
against water types.
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 3/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
Share this
Pokémon that belong to two types occupy two rows
in our spreadsheet. In addition, every Pokémon has
multiple stats to determine how it performs in battle.
A description of each stat can be found in the
Pokémon Database. For example, Blastoise has a
higher defense stat than Charizard, so it will better
withstand physical attacks. For our analysis, we will
look at the type with the highest number for each
stat.
A pivot table is a tool designed speci cally to
aggregate data, and it will be the easiest way to
aggregate Pokémon by type in our spreadsheet. To
create a pivot table, select your data, and select the
Pivot Table option. Since we are aggregating by
Pokémon type, we will add “Type” to rows in the
pivot table options.
The resulting pivot table has a row for each unique type
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 4/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
Right now, our pivot table is blank, and weShare
need thisto
add values to it. Since we want to count the number
of unique Pokémon in each type, we would add it to
values in our pivot table options.
We are counting the number of unique Pokémon names for each type
Since there is no type that is de nitively the
“strongest”, we will look at the strongest type for
each stat. This would be useful for Pokémon players
who are building balanced teams with both
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 5/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
offensive;y- and defensively-inclined Pokémon. To
Share this
see which type has the highest median values for
each stat, we will add additional options to values in
the pivot table options.
Calculating the median of each stat for each type
At this point, we can see our results in the pivot
table:
The highest values in each column are highlighted in green, and lowest values are
highlighted in yellow
Here are some interesting observations from this
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 6/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
initial analysis: Share this
There are a lot of water -type Pokémon and
very few ice -type Pokémon. Clearly, most
Pokémon aren't living in winter conditions! 🌴
Dragon -type Pokémon seem to be the
strongest while bug -type Pokémon are the
weakest. 🐉 > 🐞
Fairy -type Pokémon have low attack. Guess
we don’t have to worry about being attacked
by the tooth fairy. 🧚
Steel -type Pokémon have the strongest
defense. Does the aluminum industry have
something to say about that? 🤔
Rocks are slow. Even the ground and grass
are faster...🗿
Now to Python
Spreadsheets are great, and we were able to glean
some fun insights from our pivot table analysis.
However, using Python with the Pandas library is far
superior to spreadsheet analysis.
Writing code with Pandas is signi cantly quicker
than interacting with a spreadsheet’s GUI interface
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 7/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
(did you see all of those screenshots above?).
Share this
To aggregate data with Pandas, you will need to
complete the following steps:
1. Import the Pandas library
2. Upload your data to a Pandas DataFrame
3. Complete the aggregation
For our Pokémon analysis, our commented code and
output are below:
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 8/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
Unlike the spreadsheet analysis, there are no
Share this
intermediate steps when aggregating data with
Python and Pandas.
Modifying our analysis
The dataset we used contained all Pokémon.
However, only a subset of Pokémon are available in
each "Let's Go" game. Download the data with only
the subset here (right click and select "Save As...").
To aggregate this new data with spreadsheets, we
would have to repeat all of the manual steps
involved in making a pivot table. However, with
Python, we only need to modify a single line of code:
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 9/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
Share this
We only needed to change one line of code to modify our analysis
When only Pokémon available in "Let’s Go Pikachu"
and "Let’s Go Eevee" are included, Dragon -type
Pokémon still have the highest overall stats.
However, they do not dominate as much as they did
before in each of the stats, and overall, the stats are
distributed more evenly across types.
What’s next
High-level decision makers often require analysts to
make minor adjustments to view data in a slightly
different format. In these cases, Python will save
signi cantly more time when compared to traditional
spreadsheet analysis.
I’d encourage you to try analyzing the data yourself.
Below are other modi cations you can apply to our
Pokémon analysis:
Include only nal evolved Pokémon
Exclude legendary Pokémon that are
ineligible for Pokémon competitions
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 10/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
Insights Share this
7 Tips for Learning from Home
While Social Distancing INSIGHTS
5 Skills Developers
Need Beyond Writing
Learn From Home: A Guide for
Teachers, Students, and Parents
Code
During Covid-19 Learning to program tends to
center around, well,
MySpace and the Coding programming. But developers
Legacy it Left Behind spend most of their time on
tasks that require a different
set of skills.
See all 44 posts → KARL LYKKEN
UPDATES
Livestream: Getting Started with C++
Whether you're new to coding or trying to pick up a new language, our
C++ livestream is one that you won't want to miss.
CODECADEMY TEAM
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 12/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
Codecademy News © 2020 Share this
Latest Posts Facebook Twitter
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 13/13
24/08/2020 Aggregating Pokémon Data with Python and Pandas
To learn more about data wrangling with Python
Share this and
Pandas, take a look at Codecademy’s Data Analysis
with Pandas course.
Get more practice, more projects, and more guidance.
Upgrade to Pro
Albert Lee Read More
Read more posts by this author.
2 Comments Codecademy News 🔒 Disqus' Privacy Policy
1 Login
Recommend 7 t Tweet f Share Sort by Best
Join the discussion…
LOG IN WITH
OR SIGN UP WITH DISQUS ?
Name
Benjamin van der Kwaak • 8 months ago
So funny!
△ ▽ • Reply • Share ›
Mayowa Daniel • a year ago
Nice!
△ ▽ • Reply • Share ›
✉ Subscribe d Add Disqus to your siteAdd DisqusAdd ⚠ Do Not Sell My Data
— Codecademy News —
https://news.codecademy.com/aggregating-pokemon-data-python-pandas/ 11/13