Pandas Cheat Sheet
by Justin1209 (Justin1209) via [Link]/101982/cs/21202/
Import the Pandas Module

import pandas as pd


Create a DataFrame

# Method 1
df1 = pd.DataFrame({
    'name': ['John Smith', 'Jane Doe'],
    'address': ['13 Main St.', '46 Maple Ave.'],
    'age': [34, 28]
})
# Method 2
df2 = pd.DataFrame([
    ['John Smith', '123 Main St.', 34],
    ['Jane Doe', '456 Maple Ave.', 28],
    ['Joe Schmo', '9 Broadway', 51]
    ],
    columns=['name', 'address', 'age'])


Loading and Saving CSVs

# Load a CSV File into a DataFrame
df = pd.read_csv('my-csv-file.csv')
# Save a DataFrame to a CSV File
df.to_csv('new-csv-file.csv')
# Load a DataFrame in Chunks (for large Datasets)
# Initialize reader object: urb_pop_reader
urb_pop_reader = pd.read_csv('ind_pop_data.csv', chunksize=1000)
# Get the first DataFrame chunk: df_urb_pop
df_urb_pop = next(urb_pop_reader)


Inspect a DataFrame

df.head(5)   First 5 rows
df.info()    Statistics of columns (row count, null values, datatype)


DataFrame for Select Columns / Rows

df = pd.DataFrame([
    ['January', 100, 100, 23, 100],
    ['February', 51, 45, 145, 45],
    ['March', 81, 96, 65, 96],
    ['April', 80, 80, 54, 180],
    ['May', 51, 54, 54, 154],
    ['June', 112, 109, 79, 129]],
    columns=['month', 'east', 'north', 'south', 'west']
)


Select Columns

# Select one Column
clinic_north = df.north
# Select multiple Columns
clinic_north_south = df[['north', 'south']]
Make sure that you have a double set of brackets [[ ]], or this command won't work!


Reshape (for Scikit)

nums = np.array(range(1, 11))
-> [ 1  2  3  4  5  6  7  8  9 10]
nums = nums.reshape(-1, 1)
-> [[ 1]
    [ 2]
    [ 3]
    [ 4]
    [ 5]
    [ 6]
    [ 7]
    [ 8]
    [ 9]
    [10]]
You can think of reshape() as rotating this array. Rather than one big row of numbers, nums is now a big column of numbers - there's one number in each row.
--> Reshape values for Scikit-learn: clinic_north.values.reshape(-1, 1)
(see the scikit-learn sketch after the Converting Datatypes entry below)


Converting Datatypes

# Convert argument to numeric type
pd.to_numeric(arg, errors="raise")
errors:
"raise"  -> raise an exception
"coerce" -> invalid parsing will be set as NaN
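A minimal sketch of the errors="coerce" behaviour, using a made-up Series (the values are only illustrative):

import pandas as pd

s = pd.Series(['1', '2', 'three', '4'])
# 'three' cannot be parsed, so it becomes NaN instead of raising
print(pd.to_numeric(s, errors='coerce'))
# 0    1.0
# 1    2.0
# 2    NaN
# 3    4.0
# dtype: float64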
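Tying the Reshape and Select Columns entries together, a self-contained sketch of feeding a reshaped column to scikit-learn. This assumes scikit-learn is installed; LinearRegression and the toy numbers are illustrative and not part of the original sheet.

import pandas as pd
from sklearn.linear_model import LinearRegression  # assumption: scikit-learn is available

df = pd.DataFrame({'north': [100, 45, 96], 'east': [100, 51, 81]})
X = df.north.values.reshape(-1, 1)   # 2-D feature array: one column, one sample per row
y = df.east.values                   # 1-D target array
model = LinearRegression().fit(X, y)
print(model.predict([[60]]))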
Select Rows

# Select one Row
march = df.iloc[2]
# Select multiple Rows
jan_feb_march = df.iloc[:3]
feb_march_april = df.iloc[1:4]
may_june = df.iloc[-2:]
# Select Rows with Logic
january = df[df.month == 'January']
-> <, >, <=, >=, !=, ==
march_april = df[(df.month == 'March') | (df.month == 'April')]
-> &, |
january_february_march = df[df.month.isin(['January', 'February', 'March'])]
-> column_name.isin([" ", " "])

Selecting a subset of a DataFrame often results in non-consecutive indices.
Using .reset_index() will create a new DataFrame and move the old indices into a new column called index.
Use .reset_index(drop=True) if you don't need the index column.
Use .reset_index(inplace=True) to prevent a new DataFrame from being created.
(see the reset_index() sketch further below)


Adding a Column

df = pd.DataFrame([
    [1, '3 inch screw', 0.5, 0.75],
    [2, '2 inch nail', 0.10, 0.25],
    [3, 'hammer', 3.00, 5.50],
    [4, 'screwdriver', 2.50, 3.00]
    ],
    columns=['Product ID', 'Description', 'Cost to Manufacture', 'Price']
)
# Add a Column with specified row-values
df['Sold in Bulk?'] = ['Yes', 'Yes', 'No', 'No']
# Add a Column with the same value in every row
df['Is taxed?'] = 'Yes'
# Add a Column with a calculation
df['Revenue'] = df['Price'] - df['Cost to Manufacture']


Performing Column Operation

df = pd.DataFrame([
    ['JOHN SMITH', 'john.smith@gmail.com'],
    ['Jane Doe', 'jdoe@yahoo.com'],
    ['joe schmo', 'joeschmo@hotmail.com']
    ],
    columns=['Name', 'Email'])
# Changing a column with an Operation
df['Name'] = df.Name.apply(str.lower)
-> str.lower, str.upper
# Perform a lambda Operation on a Column
get_last_name = lambda x: x.split(" ")[-1]
df['last_name'] = df.Name.apply(get_last_name)


Performing an Operation on Multiple Columns

df = pd.DataFrame([
    ["Apple", 1.00, "No"],
    ["Milk", 4.20, "No"],
    ["Paper Towels", 5.00, "Yes"],
    ["Light Bulbs", 3.75, "Yes"]
    ],
    columns=["Item", "Price", "Is taxed?"])
# Lambda Function
df['Price with Tax'] = df.apply(lambda row:
    row['Price'] * 1.075
    if row['Is taxed?'] == 'Yes'
    else row['Price'],
    axis=1
)
We apply a lambda to rows, as opposed to columns, when we want to perform functionality that needs to access more than one column at a time.
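Not from the original sheet, but worth noting: the same multi-column logic can usually be vectorized with np.where, which tends to be faster than a row-wise apply on large frames. The data below is illustrative.

import numpy as np
import pandas as pd

df = pd.DataFrame({'Item': ['Apple', 'Milk'],
                   'Price': [1.00, 4.20],
                   'Is taxed?': ['No', 'Yes']})
# np.where(condition, value if True, value if False)
df['Price with Tax'] = np.where(df['Is taxed?'] == 'Yes',
                                df['Price'] * 1.075,
                                df['Price'])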
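Picking up the .reset_index() notes from the Select Rows entry, a minimal sketch with made-up data:

import pandas as pd

df = pd.DataFrame({'month': ['January', 'February', 'March', 'April']})
subset = df[df.month.isin(['February', 'April'])]
print(subset.index.tolist())                          # [1, 3]  <- non-consecutive
print(subset.reset_index().columns.tolist())          # ['index', 'month']  <- old indices kept as a column
print(subset.reset_index(drop=True).index.tolist())   # [0, 1]  <- old indices dropped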
Rename Columns

# Method 1
df.columns = ['NewName_1', 'NewName_2', 'NewName_3', '...']
# Method 2
df.rename(columns={
    'OldName_1': 'NewName_1',
    'OldName_2': 'NewName_2'
    }, inplace=True
)
Using inplace=True lets us edit the original DataFrame.


Column Statistics

Mean = Average          df.column_name.mean()
Median                  df.column_name.median()
Minimal Value           df.column_name.min()
Maximum Value           df.column_name.max()
Number of Values        df.column_name.count()
Unique Values           df.column_name.nunique()
Standard Deviation      df.column_name.std()
List of Unique Values   df.column_name.unique()


Calculating Aggregate Functions

# Group By
grouped = df.groupby(['col1', 'col2']).col3.measurement().reset_index()
# -> group by column1 and column2, calculate values of column3
# Percentile
high_earners = df.groupby('category').wage.apply(lambda x: np.percentile(x, 75)).reset_index()
# np.percentile can calculate any percentile over an array of values
Don't forget reset_index()
(a concrete sketch appears further below)


Pivot Tables

orders = pd.read_csv('orders.csv')
shoe_counts = orders.groupby(['shoe_type', 'shoe_color']).id.count().reset_index()
shoe_counts_pivot = shoe_counts.pivot(
    index = 'shoe_type',
    columns = 'shoe_color',
    values = 'id').reset_index()
Don't forget reset_index() at the end of a groupby operation.
We have to build a temporary table where we group by the columns we want to include in the pivot table.


Merge (Same Column Name)

sales = pd.read_csv('sales.csv')
targets = pd.read_csv('targets.csv')
men_women = pd.read_csv('men_women_sales.csv')
# Method 1
sales_targets = pd.merge(sales, targets, how=" ")
# how: "inner" (default), "outer", "left", "right"
# Method 2 (Method Chaining)
all_data = sales.merge(targets).merge(men_women)


Series vs. Dataframes

# Dataframe and Series
print(type(clinic_north))
# <class 'pandas.core.series.Series'>
print(type(df))
# <class 'pandas.core.frame.DataFrame'>
print(type(clinic_north_south))
# <class 'pandas.core.frame.DataFrame'>

In Pandas
- a series is a one-dimensional object that contains any type of data.
- a dataframe is a two-dimensional object that can hold multiple columns of different types of data.
A single column of a dataframe is a series, and a dataframe is a container of two or more series objects.
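A quick way to see the difference (column names are illustrative): single brackets return a Series, double brackets return a DataFrame.

import pandas as pd

df = pd.DataFrame({'north': [100, 45], 'south': [23, 145]})
print(type(df['north']))     # <class 'pandas.core.series.Series'>
print(type(df[['north']]))   # <class 'pandas.core.frame.DataFrame'>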
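The .measurement() placeholder in Calculating Aggregate Functions stands for any aggregate (mean, max, count, ...). A concrete sketch with made-up columns:

import pandas as pd

df = pd.DataFrame({'shoe_type': ['boot', 'boot', 'sandal'],
                   'shoe_color': ['black', 'brown', 'red'],
                   'price': [80, 100, 30]})
# average price per shoe_type, back to a flat DataFrame
avg_price = df.groupby('shoe_type').price.mean().reset_index()
print(avg_price)
#   shoe_type  price
# 0      boot   90.0
# 1    sandal   30.0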
Inner Merge (Different Column Name)

orders = pd.read_csv('orders.csv')
products = pd.read_csv('products.csv')
# Method 1: Rename Columns
orders_products = pd.merge(orders,
    products.rename(columns={'id': 'product_id'}),
    how=" ").reset_index()
# how: "inner" (default), "outer", "left", "right"
# Method 2:
orders_products = pd.merge(orders, products,
    left_on="product_id",
    right_on="id",
    suffixes=["_orders", "_products"])

Method 2:
If we use this syntax, we'll end up with two columns called id. Pandas won't let you have two columns with the same name, so it will change them to id_x and id_y. We can help make them more useful by using the keyword suffixes.


Concatenate

bakery = pd.read_csv('bakery.csv')
ice_cream = pd.read_csv('ice_cream.csv')
menu = pd.concat([bakery, ice_cream])


Melt

pd.melt(DataFrame, id_vars, value_vars, var_name, value_name='value')
id_vars: Column(s) to use as identifier variables.
value_vars: Column(s) to unpivot. If not specified, uses all columns that are not set as id_vars.
var_name: Name to use for the 'variable' column.
value_name: Name to use for the 'value' column.
Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.
(see the worked example below)


Assert Statements

# Test if country is of type object
assert gapminder.country.dtypes == np.object
# Test if year is of type int64
assert gapminder.year.dtypes == np.int64
# Test if life_expectancy is of type float64
assert gapminder.life_expectancy.dtypes == np.float64
# Assert that country does not contain any missing values
assert pd.notnull(gapminder.country).all()
# Assert that year does not contain any missing values
assert pd.notnull(gapminder.year).all()
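A worked example of the Melt signature above, reusing the month/region column names from the DataFrame for Select Columns / Rows entry for illustration:

import pandas as pd

df = pd.DataFrame([['January', 100, 23], ['February', 51, 145]],
                  columns=['month', 'east', 'south'])
long_df = pd.melt(df, id_vars='month', value_vars=['east', 'south'],
                  var_name='region', value_name='sales')
print(long_df)
#       month region  sales
# 0   January   east    100
# 1  February   east     51
# 2   January  south     23
# 3  February  south    145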
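One caveat for the Concatenate entry (not in the original sheet): pd.concat keeps each frame's original indices by default, so index values can repeat; ignore_index=True renumbers them. The frames below are illustrative.

import pandas as pd

bakery = pd.DataFrame({'item': ['bagel', 'croissant']})
ice_cream = pd.DataFrame({'item': ['vanilla', 'chocolate']})
print(pd.concat([bakery, ice_cream]).index.tolist())                     # [0, 1, 0, 1]
print(pd.concat([bakery, ice_cream], ignore_index=True).index.tolist())  # [0, 1, 2, 3]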