0% found this document useful (0 votes)
426 views2 pages

Pandas Notes

This document provides notes on data handling and analysis using Pandas, covering key operations such as reading, cleaning, filtering, analyzing, grouping, merging, and exporting data. It includes code examples for each operation, demonstrating how to manipulate DataFrames effectively. The notes serve as a quick reference guide for performing common data tasks in Python with Pandas.

Uploaded by

mehul garje
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
426 views2 pages

Pandas Notes

This document provides notes on data handling and analysis using Pandas, covering key operations such as reading, cleaning, filtering, analyzing, grouping, merging, and exporting data. It includes code examples for each operation, demonstrating how to manipulate DataFrames effectively. The notes serve as a quick reference guide for performing common data tasks in Python with Pandas.

Uploaded by

mehul garje
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 2

Pandas Notes - Data Handling & Analysis

Reading Data

- Use `pd.read_csv('file.csv')` to read CSV files.


- Use `pd.read_excel('file.xlsx')` to read Excel files.
- Use `df.head()` to view the first 5 rows.
- Use `df.tail()` to view the last 5 rows.
- Use `df.info()` to see data types and non-null counts.
Example:
df = pd.read_csv('data.csv')

Cleaning Data

- Use `df.isnull().sum()` to check missing values.


- Fill missing data: `df.fillna(value)`.
- Drop missing rows: `df.dropna()`.
- Rename columns: `df.rename(columns={'old':'new'})`.
Example:
df['age'].fillna(df['age'].mean(), inplace=True)

Filtering Data

- Single condition: `df[df['age'] > 30]`


- Multiple conditions: `df[(df['age'] > 25) & (df['marks'] > 80)]`
- Equality: `df[df['name'] == 'Bob']`
- `isin()`: `df[df['name'].isin(['Alice', 'David'])]`
- String match: `df[df['name'].str.startswith('A')]`
Example:
df.query('age > 30 and marks < 90')

Analyzing Data

- `df.describe()` gives statistical summary.


- Column stats: `mean()`, `max()`, `min()`, `mode()`.
- `value_counts()` for frequency count.
- `df.groupby('col')['val'].mean()` for grouped mean.
- `df.corr()` for correlation.
Example:
df.groupby('department')['marks'].agg(['min', 'max', 'mean'])

Grouping Data

- Use `groupby()` to group and aggregate.


- Average marks: `df.groupby('department')['marks'].mean()`
- Multiple stats: `agg(['min', 'max'])`
- Group by multiple: `df.groupby(['dept', 'name'])`
Pandas Notes - Data Handling & Analysis
- Reset index: `reset_index()` to flatten result.
Example:
df.groupby('department')['marks'].sum().reset_index()

Merging Data

- `pd.merge(df1, df2, on='id')` for inner join.


- `how='left'`, `'right'`, `'outer'` for other joins.
- Merge on multiple keys: `on=['id', 'name']`.
Example:
pd.merge(students, marks, on='id', how='left')

Exporting Data

- To CSV: `df.to_csv('file.csv', index=False)`


- To Excel: `df.to_excel('file.xlsx', index=False)`
- To JSON: `df.to_json('file.json')`
Example:
df.to_csv('cleaned_data.csv', index=False)

You might also like