■ Pandas AIML Complete Cheat Sheet
1. Data Loading & Saving (I/O Functions)
Function Description Example
pd.read_csv() CSV file se data read karta hai pd.read_csv('data.csv')
pd.read_excel() Excel file se data read karta hai pd.read_excel('data.xlsx')
pd.read_json() JSON file se data read karta hai pd.read_json('data.json')
pd.read_sql() SQL query/database se data read karta hai pd.read_sql('SELECT * FROM table', conn)
pd.read_parquet() Parquet format data read karta hai pd.read_parquet('data.parquet')
df.to_csv() DataFrame ko CSV me save karta hai df.to_csv('clean.csv', index=False)
df.to_excel() Excel file me save karta hai df.to_excel('clean.xlsx')
df.to_json() JSON format me save karta hai df.to_json('data.json')
df.to_numpy() NumPy array me convert karta hai df[['col1']].to_numpy()
2. Data Inspection & Summary
Function Description Example
df.head(n) First n rows dikhata hai df.head(5)
df.tail(n) Last n rows dikhata hai df.tail(3)
df.shape Rows, Columns return karta hai df.shape
df.info() Summary (types, null counts) df.info()
df.describe() Numerical summary df.describe()
df.dtypes Column data types df.dtypes
df.columns Column names df.columns
df.index Index info df.index
df.isnull().sum() Missing values count df.isnull().sum()
df.memory_usage() Memory usage check df.memory_usage(deep=True)
df.value_counts() Unique value counts df['col'].value_counts()
df.unique() Unique values list df['col'].unique()
3. Data Selection & Indexing
Method Description Example
df['col'] Single column select df['Age']
df[['c1','c2']] Multiple columns select df[['Age','Salary']]
df.loc[row, col] Label-based indexing df.loc[2,'Age']
df.iloc[row, col] Position-based indexing df.iloc[0,2]
Method Description Example
df.at[row, col] Fast scalar access df.at[2,'Age']
df.iat[row, col] Fast scalar access df.iat[2,3]
df.sample(n) Random n rows df.sample(10)
df.filter(like='str') Filter cols by name df.filter(like='Age')
df.xs() Cross-section (multi-index) df.xs(key=2020, level='Year')
4. Data Cleaning & Preprocessing
Method Description Example
df.dropna() Drop NaN values df.dropna()
df.fillna() Replace NaN df.fillna(0)
df.interpolate() Interpolate missing values df.interpolate()
df.drop() Delete rows/cols df.drop('col', axis=1)
df.rename() Rename columns df.rename(columns={'old':'new'})
df.replace() Replace values df['Gender'].replace({'M':1,'F':0})
pd.get_dummies() One-hot encoding pd.get_dummies(df, columns=['City'])
df.astype() Change type df['Age'].astype(int)
df.duplicated() Check duplicates df.duplicated().sum()
df.drop_duplicates() Remove duplicates df.drop_duplicates()
df.where() Conditional keep df['col'].where(df['col']>0)
df.mask() Conditional replace df['col'].mask(df['col']<0, 0)
5. Data Transformation & Feature Engineering
Method Description Example
df.assign() Add new col df.assign(Age2=df['Age']**2)
df.apply() Apply function df['log'] = df['Salary'].apply(np.log)
df.applymap() Element-wise apply df.applymap(str)
df.map() Series transformation df['Gender'].map({'M':1,'F':0})
df.groupby() Group summary df.groupby('Dept')['Salary'].mean()
df.agg() Multiple aggregations df.agg({'col1':'sum','col2':'mean'})
df.transform() Group-wise transform df.groupby('Dept')['Salary'].transform('mean')
df.sort_values() Sort values df.sort_values('Salary')
df.rank() Assign rank df['rnk'] = df['Score'].rank()
pd.cut() Bin numeric data pd.cut(df['Age'], bins=[0,18,35,60])
pd.qcut() Quantile binning pd.qcut(df['Age'], q=4)
df.corr() Correlation matrix df.corr()
Method Description Example
df.melt() Wide → Long df.melt(id_vars='ID')
df.pivot() Long → Wide df.pivot(index='ID', columns='Year')
df.stack()/unstack() Reshape multi-index df.unstack()
df.eval() Optimized computation df.eval('new=col1+col2')
6. Combining & Merging Datasets
Method Description Example
pd.concat() Concatenate pd.concat([df1,df2],axis=0)
pd.merge() SQL-style merge pd.merge(df1,df2,on='id')
df.join() Index join df1.join(df2)
df.update() Overwrite values df1.update(df2)
df.combine_first() Fill NaN from another df1.combine_first(df2)
pd.merge(..., indicator=True) Merge with flag pd.merge(df1,df2,on='id',indicator=True)
7. Time Series & Window Functions
Method Description Example
pd.to_datetime() Convert to datetime pd.to_datetime(df['date'])
df['date'].dt.year Extract year/month/day df['date'].dt.month
df.resample('M').mean() Monthly resample df.resample('M').mean()
df.rolling(3).mean() Moving average df['col'].rolling(3).mean()
df.expanding().sum() Expanding window df['col'].expanding().sum()
8. Misc / Utilities
Method Description Example
df.cumsum() Cumulative sum df['col'].cumsum()
df.cumprod() Cumulative product df['col'].cumprod()
df.clip() Value bounding df['col'].clip(0,100)
df.equals(df2) Check equality df.equals(df2)
df.empty Empty check df.empty
df.explode() Expand lists df.explode('tags')
df.style.highlight_max() Styling df.style.highlight_max(axis=0)