0% found this document useful (0 votes)
5 views3 pages

Pandas Complete Cheatsheet

Uploaded by

amirpawar09
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
5 views3 pages

Pandas Complete Cheatsheet

Uploaded by

amirpawar09
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

■ Pandas AIML Complete Cheat Sheet

1. Data Loading & Saving (I/O Functions)


Function Description Example

pd.read_csv() CSV file se data read karta hai pd.read_csv('data.csv')


pd.read_excel() Excel file se data read karta hai pd.read_excel('data.xlsx')
pd.read_json() JSON file se data read karta hai pd.read_json('data.json')
pd.read_sql() SQL query/database se data read karta hai pd.read_sql('SELECT * FROM table', conn)
pd.read_parquet() Parquet format data read karta hai pd.read_parquet('data.parquet')
df.to_csv() DataFrame ko CSV me save karta hai df.to_csv('clean.csv', index=False)
df.to_excel() Excel file me save karta hai df.to_excel('clean.xlsx')
df.to_json() JSON format me save karta hai df.to_json('data.json')
df.to_numpy() NumPy array me convert karta hai df[['col1']].to_numpy()

2. Data Inspection & Summary


Function Description Example

df.head(n) First n rows dikhata hai df.head(5)


df.tail(n) Last n rows dikhata hai df.tail(3)
df.shape Rows, Columns return karta hai df.shape
df.info() Summary (types, null counts) df.info()
df.describe() Numerical summary df.describe()
df.dtypes Column data types df.dtypes
df.columns Column names df.columns
df.index Index info df.index
df.isnull().sum() Missing values count df.isnull().sum()
df.memory_usage() Memory usage check df.memory_usage(deep=True)
df.value_counts() Unique value counts df['col'].value_counts()
df.unique() Unique values list df['col'].unique()

3. Data Selection & Indexing


Method Description Example

df['col'] Single column select df['Age']


df[['c1','c2']] Multiple columns select df[['Age','Salary']]
df.loc[row, col] Label-based indexing df.loc[2,'Age']
df.iloc[row, col] Position-based indexing df.iloc[0,2]
Method Description Example

df.at[row, col] Fast scalar access df.at[2,'Age']


df.iat[row, col] Fast scalar access df.iat[2,3]
df.sample(n) Random n rows df.sample(10)
df.filter(like='str') Filter cols by name df.filter(like='Age')
df.xs() Cross-section (multi-index) df.xs(key=2020, level='Year')

4. Data Cleaning & Preprocessing


Method Description Example

df.dropna() Drop NaN values df.dropna()


df.fillna() Replace NaN df.fillna(0)
df.interpolate() Interpolate missing values df.interpolate()
df.drop() Delete rows/cols df.drop('col', axis=1)
df.rename() Rename columns df.rename(columns={'old':'new'})
df.replace() Replace values df['Gender'].replace({'M':1,'F':0})
pd.get_dummies() One-hot encoding pd.get_dummies(df, columns=['City'])
df.astype() Change type df['Age'].astype(int)
df.duplicated() Check duplicates df.duplicated().sum()
df.drop_duplicates() Remove duplicates df.drop_duplicates()
df.where() Conditional keep df['col'].where(df['col']>0)
df.mask() Conditional replace df['col'].mask(df['col']<0, 0)

5. Data Transformation & Feature Engineering


Method Description Example

df.assign() Add new col df.assign(Age2=df['Age']**2)


df.apply() Apply function df['log'] = df['Salary'].apply(np.log)
df.applymap() Element-wise apply df.applymap(str)
df.map() Series transformation df['Gender'].map({'M':1,'F':0})
df.groupby() Group summary df.groupby('Dept')['Salary'].mean()
df.agg() Multiple aggregations df.agg({'col1':'sum','col2':'mean'})
df.transform() Group-wise transform df.groupby('Dept')['Salary'].transform('mean')
df.sort_values() Sort values df.sort_values('Salary')
df.rank() Assign rank df['rnk'] = df['Score'].rank()
pd.cut() Bin numeric data pd.cut(df['Age'], bins=[0,18,35,60])
pd.qcut() Quantile binning pd.qcut(df['Age'], q=4)
df.corr() Correlation matrix df.corr()
Method Description Example

df.melt() Wide → Long df.melt(id_vars='ID')


df.pivot() Long → Wide df.pivot(index='ID', columns='Year')
df.stack()/unstack() Reshape multi-index df.unstack()
df.eval() Optimized computation df.eval('new=col1+col2')

6. Combining & Merging Datasets


Method Description Example

pd.concat() Concatenate pd.concat([df1,df2],axis=0)


pd.merge() SQL-style merge pd.merge(df1,df2,on='id')
df.join() Index join df1.join(df2)
df.update() Overwrite values df1.update(df2)
df.combine_first() Fill NaN from another df1.combine_first(df2)
pd.merge(..., indicator=True) Merge with flag pd.merge(df1,df2,on='id',indicator=True)

7. Time Series & Window Functions


Method Description Example

pd.to_datetime() Convert to datetime pd.to_datetime(df['date'])


df['date'].dt.year Extract year/month/day df['date'].dt.month
df.resample('M').mean() Monthly resample df.resample('M').mean()
df.rolling(3).mean() Moving average df['col'].rolling(3).mean()
df.expanding().sum() Expanding window df['col'].expanding().sum()

8. Misc / Utilities
Method Description Example

df.cumsum() Cumulative sum df['col'].cumsum()


df.cumprod() Cumulative product df['col'].cumprod()
df.clip() Value bounding df['col'].clip(0,100)
df.equals(df2) Check equality df.equals(df2)
df.empty Empty check df.empty
df.explode() Expand lists df.explode('tags')
df.style.highlight_max() Styling df.style.highlight_max(axis=0)

You might also like