numpynotes.md 2025-05-17
Awesome, you're now learning both NumPy and Pandas — killer combo for Data Science ⚡
Here are your clean, structured Pandas course notes so far (formatted in Markdown, and convertible to PDF anytime):
🐼 Pandas Full Course Notes
📦 1. Getting Started
Install with pip install pandas
Import with import pandas as pd
Start in Google Colab, JupyterLab, or VS Code
🧱 2. DataFrames: The Core of Pandas
data = [[1, 2, 3], [4, 5, 6]]
df = pd.DataFrame(data, columns=['A', 'B', 'C'])
📋 DataFrame Attributes
df.shape # Tuple of (rows, columns)
df.columns # Column names
df.index # Row index
df.dtypes # Data types
df.info() # Summary info
df.describe() # Stats summary
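To see what these attributes return in practice, here is a minimal sketch that reuses the 2x3 frame from this section. The variable names `shape`, `column_names`, and `row_labels` are just illustrative.

```python
import pandas as pd

# Same toy data as in the notes: two rows, three columns.
data = [[1, 2, 3], [4, 5, 6]]
df = pd.DataFrame(data, columns=['A', 'B', 'C'])

shape = df.shape                 # tuple of (n_rows, n_cols)
column_names = list(df.columns)  # column labels as a plain list
row_labels = list(df.index)      # default RangeIndex gives 0, 1

print(shape)
print(column_names)
print(row_labels)
print(df.dtypes)       # one dtype per column
df.info()              # prints a summary and returns None
print(df.describe())   # count, mean, std, min, quartiles, max per numeric column
```

Note that `shape`, `columns`, `index`, and `dtypes` are attributes (no parentheses), while `info()` and `describe()` are methods.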
📂 3. Loading Data
pd.read_csv('file.csv')
pd.read_excel('file.xlsx')
pd.read_parquet('file.parquet')
These readers also accept a URL in place of a local file path.
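A small sketch of `read_csv` in action. Since no real file is available here, it reads from an in-memory string instead; the CSV content and the names `people` and `ages` are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content, used in place of a real file on disk.
csv_text = "name,age,city\nAnn,34,NY\nBob,28,LA\n"

# read_csv accepts a path, a URL, or any file-like object.
people = pd.read_csv(io.StringIO(csv_text))

# Optional: load only some columns and fix a dtype at read time.
ages = pd.read_csv(
    io.StringIO(csv_text),
    usecols=['name', 'age'],
    dtype={'age': 'int64'},
)

print(people)
print(ages.dtypes)
```

With a real dataset you would pass the file path or URL directly as the first argument.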
🔍 4. Accessing Data
df.head(), df.tail(), df.sample()
df['column'] # Access column
df[['col1', 'col2']] # Multiple columns
df.loc[rows, cols] # By label
df.iloc[rows, cols] # By integer position
df.at[row, 'col'] # Single value by label
df.iat[row, col] # Single value by integer position (fastest)
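The difference between label-based and position-based access is easiest to see when the row labels are not 0, 1, 2. Here is a rough sketch with a made-up frame that uses string row labels:

```python
import pandas as pd

# Hypothetical frame with string row labels so label and position differ.
df = pd.DataFrame(
    {'age': [34, 28, 41], 'city': ['NY', 'LA', 'SF']},
    index=['ann', 'bob', 'cy'],
)

by_label = df.loc['bob', 'age']      # row label 'bob', column label 'age'
by_position = df.iloc[1, 0]          # second row, first column

label_slice = df.loc['ann':'bob']    # label slices INCLUDE the end label
position_slice = df.iloc[0:1]        # position slices EXCLUDE the end

single_label = df.at['cy', 'city']   # scalar lookup by label
single_position = df.iat[2, 1]       # scalar lookup by position

print(by_label, by_position, single_label, single_position)
```

The slice behavior is the usual gotcha: `loc` slices are inclusive on both ends, while `iloc` follows normal Python slicing.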
✏ 5. Modifying Data
df.loc[0, 'column'] = 99
df['new'] = df['a'] + df['b']
df.drop(columns=['old_col'], inplace=True)
df.rename(columns={'old': 'new'}, inplace=True)
df.copy() # Independent copy, so edits don't affect the original
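Here is a short sketch of why `copy()` matters and how the non-inplace forms of `rename` and `drop` behave. The frame and the column names (`a`, `b`, `total`, `b_new`) are invented for the example.

```python
import pandas as pd

original = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# An explicit copy is independent of the original frame.
working = original.copy()
working.loc[0, 'a'] = 99                          # edit a single cell
working['total'] = working['a'] + working['b']    # derived column

# Without inplace=True, rename and drop return a new frame
# and leave the input untouched, so they can be chained.
renamed = working.rename(columns={'b': 'b_new'}).drop(columns=['total'])

print(original)
print(working)
print(renamed)
```

Chaining the non-inplace versions is a common alternative to `inplace=True`; both styles give the same result.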
🧠 6. Filtering Data
df[df['age'] > 30]
df[(df['age'] > 30) & (df['city'] == 'NY')]
df.query("age > 30 and city == 'NY'")
df[df['name'].str.contains('John', case=False, na=False)] # na=False treats missing names as non-matches
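The filters above can be tried on a small made-up frame. This sketch assumes columns named `name`, `age`, and `city`, and includes one missing name to show why `na=False` is useful.

```python
import pandas as pd

# Hypothetical data; one name is missing on purpose.
df = pd.DataFrame({
    'name': ['John Smith', 'johnny Ray', None, 'Mary Jones'],
    'age': [45, 31, 52, 29],
    'city': ['NY', 'LA', 'NY', 'NY'],
})

# Boolean mask: wrap each condition in parentheses and combine with & or |.
mask = (df['age'] > 30) & (df['city'] == 'NY')
with_mask = df[mask]

# query() expresses the same filter as a string.
with_query = df.query("age > 30 and city == 'NY'")

# na=False makes rows with a missing name count as "no match"
# instead of producing NaN in the mask.
johns = df[df['name'].str.contains('John', case=False, na=False)]

print(with_mask)
print(johns)
```

Use `&`, `|`, and `~` (not `and`, `or`, `not`) when combining masks outside of `query()`.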
📅 7. String & Date Functions
df['first_name'] = df['name'].str.split().str[0]
df['birth_date'] = pd.to_datetime(df['birth_date'])
df['year'] = df['birth_date'].dt.year
df['is_leap'] = df['birth_date'].dt.is_leap_year
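A quick sketch of the `.str` and `.dt` accessors on sample data. The names and dates are only illustrative, and the `errors='coerce'` line is an extra not covered above: it turns unparseable values into `NaT` instead of raising.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    'name': ['Ada Lovelace', 'Alan Turing'],
    'birth_date': ['1815-12-10', '1912-06-23'],
})

df['first_name'] = df['name'].str.split().str[0]      # first word of each name
df['birth_date'] = pd.to_datetime(df['birth_date'])   # strings -> datetime64
df['year'] = df['birth_date'].dt.year
df['is_leap'] = df['birth_date'].dt.is_leap_year

# Extra: coerce bad values to NaT rather than raising an error.
messy = pd.Series(['2024-01-15', 'not a date'])
parsed = pd.to_datetime(messy, errors='coerce')

print(df)
print(parsed)
```

The `.dt` accessor only works after the column has been converted with `pd.to_datetime`.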
🛠 8. Handling Missing Data
df.isna(), df.notna()
df.fillna(value)
df.interpolate()
df.dropna(subset=['column'], inplace=True)
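Here is a minimal sketch of the missing-data tools on a tiny made-up frame with one gap in each column (`temp` and `city` are hypothetical column names).

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column.
df = pd.DataFrame({
    'temp': [20.0, np.nan, 24.0],
    'city': ['NY', None, 'LA'],
})

missing_per_column = df.isna().sum()   # count of missing values per column

# fillna accepts a dict to use a different fill value per column.
filled = df.fillna({'temp': df['temp'].mean(), 'city': 'unknown'})

# interpolate() fills numeric gaps linearly by default.
interpolated = df['temp'].interpolate()

# Drop only the rows where 'city' is missing.
dropped = df.dropna(subset=['city'])

print(missing_per_column)
print(filled)
```

Checking `df.isna().sum()` first is a handy way to decide whether to fill, interpolate, or drop.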
📊 9. Aggregating Data
df['col'].value_counts()
df.groupby('col').sum(numeric_only=True) # Sum the numeric columns per group
df.groupby('col').agg({'price': 'mean', 'qty': 'sum'})
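A small worked example of the aggregation calls, using made-up `price` and `qty` values. The last line shows named aggregation, an alternative `agg` syntax that lets you choose the output column names (`avg_price` and `total_qty` are arbitrary).

```python
import pandas as pd

# Hypothetical sales data with two groups, 'a' and 'b'.
df = pd.DataFrame({
    'col': ['a', 'b', 'a', 'b', 'a'],
    'price': [10.0, 20.0, 30.0, 40.0, 50.0],
    'qty': [1, 2, 3, 4, 5],
})

counts = df['col'].value_counts()    # rows per distinct value

# Different aggregation per column, passed as a dict.
summary = df.groupby('col').agg({'price': 'mean', 'qty': 'sum'})

# Named aggregation: output_name=(input_column, function).
named = df.groupby('col').agg(
    avg_price=('price', 'mean'),
    total_qty=('qty', 'sum'),
)

print(counts)
print(summary)
print(named)
```

The group key becomes the index of the result; call `.reset_index()` if you want it back as a regular column.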
🧾 10. Pivot Tables
df.pivot(index='day', columns='type', values='sales') # Reshape only; each (day, type) pair must be unique
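One thing worth knowing: `pivot` only reshapes, so it fails when the same index/column pair appears more than once. `pivot_table` aggregates duplicates instead. A rough sketch with made-up sales data, where `('Tue', 'store')` appears twice:

```python
import pandas as pd

# Hypothetical data; the ('Tue', 'store') pair is duplicated on purpose.
df = pd.DataFrame({
    'day': ['Mon', 'Mon', 'Tue', 'Tue', 'Tue'],
    'type': ['online', 'store', 'online', 'store', 'store'],
    'sales': [100, 80, 120, 90, 10],
})

# pivot cannot decide which duplicate to keep, so it raises ValueError.
try:
    df.pivot(index='day', columns='type', values='sales')
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table combines duplicates with an aggregation function.
table = df.pivot_table(index='day', columns='type', values='sales', aggfunc='sum')

print(pivot_failed)
print(table)
```

If the data has no duplicate pairs, `pivot` and `pivot_table` produce the same layout.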
🔄 11. Merging & Concatenating
pd.merge(df1, df2, on='id', how='left')
pd.concat([df1, df2], axis=0)
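To make the `how` argument concrete, here is a sketch with two small made-up frames that share an `id` column. The `indicator=True` and `ignore_index=True` options are extras that are often useful alongside the calls above.

```python
import pandas as pd

# Hypothetical frames: id 3 exists only in df1, id 4 only in df2.
df1 = pd.DataFrame({'id': [1, 2, 3], 'name': ['Ann', 'Bob', 'Cy']})
df2 = pd.DataFrame({'id': [1, 2, 4], 'score': [90, 85, 70]})

# Left join keeps every row of df1; unmatched scores become NaN.
# indicator=True adds a _merge column showing where each row came from.
left = pd.merge(df1, df2, on='id', how='left', indicator=True)

# Inner join keeps only ids present in both frames.
inner = pd.merge(df1, df2, on='id', how='inner')

# axis=0 stacks rows; ignore_index=True renumbers them 0..n-1.
stacked = pd.concat([df1, df1], axis=0, ignore_index=True)

print(left)
print(inner)
print(stacked)
```

Use `axis=1` in `pd.concat` to place frames side by side instead of stacking them.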
🚨 12. Advanced Functions
df['shifted'] = df['sales'].shift(1)
df['cumsum'] = df['sales'].cumsum()
df['rank'] = df['sales'].rank()
df['rolling_mean'] = df['sales'].rolling(3).mean()
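Running these on four made-up sales values shows what each function produces. The `change` column is an added illustration of a typical use for `shift`.

```python
import pandas as pd

# Hypothetical sales figures.
df = pd.DataFrame({'sales': [10, 20, 15, 30]})

df['shifted'] = df['sales'].shift(1)          # previous row's value; first is NaN
df['change'] = df['sales'] - df['shifted']    # row-over-row difference
df['cumsum'] = df['sales'].cumsum()           # running total
df['rank'] = df['sales'].rank()               # 1 = smallest by default
df['rolling_mean'] = df['sales'].rolling(3).mean()  # NaN until 3 values are available

print(df)
```

The first row of `shifted` and the first two rows of `rolling_mean` are NaN because there is not enough history yet.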
🧪 13. Apply & Lambda
df['category'] = df['height'].apply(lambda x: 'Tall' if x > 180 else 'Short')
Or with a custom function:
def classify(row):
    if row['height'] > 180 and row['weight'] > 80:
        return 'Heavy'
    return 'Light'
df['type'] = df.apply(classify, axis=1)
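Row-wise `apply` is flexible but can be slow on large frames. As a sketch, the same rule can be written with `np.where`, which works on whole columns at once. The sample heights and weights are made up, and `type_vec` is just an illustrative column name.

```python
import numpy as np
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({'height': [185, 170, 190], 'weight': [90, 85, 70]})

def classify(row):
    # Same rule as in the notes: both conditions must hold.
    if row['height'] > 180 and row['weight'] > 80:
        return 'Heavy'
    return 'Light'

# Row-by-row version.
df['type'] = df.apply(classify, axis=1)

# Vectorized version of the same rule.
is_heavy = (df['height'] > 180) & (df['weight'] > 80)
df['type_vec'] = np.where(is_heavy, 'Heavy', 'Light')

print(df)
```

Both columns should match; the vectorized form is usually the better choice when the rule is a simple condition.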
💾 14. Save Your Work
df.to_csv('data.csv', index=False)
df.to_excel('data.xlsx', index=False)
df.to_parquet('data.parquet')
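Here is a sketch of a CSV round trip that shows what `index=False` changes. It writes to a temporary folder so nothing is left behind; the file names are arbitrary. Note that `to_excel` and `to_parquet` need an extra package installed (for example openpyxl for Excel, pyarrow or fastparquet for Parquet), so they are left out of this example.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': ['x', 'y']})

with tempfile.TemporaryDirectory() as tmp:
    # index=False: only the data columns are written.
    clean_path = os.path.join(tmp, 'data.csv')
    df.to_csv(clean_path, index=False)
    restored = pd.read_csv(clean_path)

    # Default (index=True): the row index is saved as an extra unnamed column.
    indexed_path = os.path.join(tmp, 'data_with_index.csv')
    df.to_csv(indexed_path)
    with_index = pd.read_csv(indexed_path)

print(list(restored.columns))
print(list(with_index.columns))
```

If you do save the index, `pd.read_csv(path, index_col=0)` restores it on load.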
If you want this in PDF, I can:
1. Convert it to HTML or LaTeX (for offline editing)
2. Send you a Python script that auto-generates the PDF
3. Help you paste it into a markdown-to-pdf tool
Which way works best for you? Want me to do the same style for your Pandas + NumPy combo PDF?