Python Pandas – DataFrame (Notes)
1. Introduction
A DataFrame is a two-dimensional labeled data structure in Pandas.
It looks like a table with rows and columns (similar to an Excel sheet or an SQL table).
Each column is a Pandas Series, and all columns share the same row index; together they form a DataFrame.
import pandas as pd
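A quick check of the "each column is a Series" point. This is only a sketch; the small table below is illustrative sample data using the same names as the examples in section 2.

```python
import pandas as pd

# Small illustrative table (same names as the dictionary example in section 2)
df = pd.DataFrame({
    'Name': ['Anu', 'Binu', 'Manu'],
    'Age': [20, 21, 19],
})

name_col = df['Name']                   # selecting one column gives a Series
print(type(df).__name__)                # DataFrame
print(type(name_col).__name__)          # Series
print(name_col.index.equals(df.index))  # True: the column shares the row index
```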
2. Creating a DataFrame
(a) From a Dictionary of Lists
data = {
    'Name': ['Anu', 'Binu', 'Manu'],
    'Age': [20, 21, 19],
    'Marks': [85, 90, 88]
}
df = pd.DataFrame(data)
print(df)
(b) From a List of Dictionaries
data = [
    {'Name':'Anu', 'Age':20},
    {'Name':'Binu', 'Age':21}
]
df = pd.DataFrame(data)
(c) From a 2D List / Nested List
data = [[1,'Anu',85],[2,'Binu',90]]
df = pd.DataFrame(data, columns=['RollNo','Name','Marks'])
3. Accessing Data
Columns
print(df['Name']) # single column (returns a Series)
print(df[['Name','Age']]) # multiple columns (returns a DataFrame); df here is the one from 2(a)
Rows
loc → access by label (the values shown in df.index)
iloc → access by integer position (counting from 0)
print(df.loc[0]) # row whose index label is 0
print(df.iloc[1]) # 2nd row (position 1)
Specific Value
print(df.loc[0, 'Name']) # element at row 0, column 'Name'
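With the default index (0, 1, 2, ...) loc and iloc look the same, so the difference is easier to see with custom row labels. A minimal sketch, assuming hypothetical roll numbers 101 to 103 as the index (not part of the notes above):

```python
import pandas as pd

# Hypothetical roll numbers used as row labels, purely for illustration
df = pd.DataFrame(
    {'Name': ['Anu', 'Binu', 'Manu'], 'Marks': [85, 90, 88]},
    index=[101, 102, 103],
)

by_label = df.loc[102]       # row whose label is 102
by_position = df.iloc[1]     # second row, whatever its label
print(by_label['Name'])      # Binu
print(by_position['Name'])   # Binu

print(df.loc[103, 'Marks'])  # 88 (label 103, column 'Marks')
print(df.iloc[2, 1])         # 88 (third row, second column)
# df.loc[1] would raise KeyError here, because no row is labelled 1
```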
4. Attributes of DataFrame
df.shape → returns (rows, columns)
df.size → total number of elements (rows × columns)
df.ndim → number of dimensions (always 2 for a DataFrame)
df.dtypes → data type of each column
df.columns → column labels (an Index object; use list(df.columns) for a plain list)
df.index → row index values
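The attributes above can be tried on the three-student table from 2(a). A short sketch; note that these are attributes, so they are written without parentheses.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anu', 'Binu', 'Manu'],
    'Age': [20, 21, 19],
    'Marks': [85, 90, 88],
})

print(df.shape)          # (3, 3)
print(df.size)           # 9
print(df.ndim)           # 2
print(list(df.columns))  # ['Name', 'Age', 'Marks']
print(list(df.index))    # [0, 1, 2]
print(df.dtypes)         # one dtype per column
```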
5. Basic Functions
head(n) → first n rows
tail(n) → last n rows
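info() → concise summary (column names, non-null counts, dtypes, memory usage)
describe() → statistical summary of numeric columns (count, mean, std, min, quartiles, max)
sum(), max(), min(), mean() → aggregate operations on columns, e.g. df['Marks'].max()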
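A short sketch of these functions on the table from 2(a). The aggregates are applied to single numeric columns here, which avoids any trouble with the text column 'Name'.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anu', 'Binu', 'Manu'],
    'Age': [20, 21, 19],
    'Marks': [85, 90, 88],
})

print(df.head(2))      # first 2 rows
print(df.tail(1))      # last row
df.info()              # prints its summary directly, no print() needed
print(df.describe())   # statistics for the numeric columns

highest = df['Marks'].max()       # 90
average_age = df['Age'].mean()    # 20.0
total_marks = df['Marks'].sum()   # 263
print(highest, average_age, total_marks)
```

In recent pandas versions, calling df.mean() on the whole DataFrame may raise an error when text columns are present; df.mean(numeric_only=True) restricts it to the numeric columns.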
6. Adding & Deleting Data
Adding a New Column
df['Grade'] = ['A','B','A'] # list length must equal the number of rows
Deleting a Column
df = df.drop('Age', axis=1)
Adding a Row
df.loc[3] = ['Sinu', 75, 'B'] # one value per current column (Name, Marks, Grade); 'Age' was dropped above
Deleting a Row
df = df.drop(1) # removes row with index 1
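The four steps above can be run in order on the table from 2(a). This is a sketch under that assumption; because 'Age' is dropped before the row is added, the new row carries three values, one per remaining column.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anu', 'Binu', 'Manu'],
    'Age': [20, 21, 19],
    'Marks': [85, 90, 88],
})

df['Grade'] = ['A', 'B', 'A']   # new column: one value per existing row
df = df.drop('Age', axis=1)     # drop() returns a new DataFrame
df.loc[3] = ['Sinu', 75, 'B']   # new row: one value per remaining column
df = df.drop(1)                 # remove the row labelled 1

print(list(df.columns))  # ['Name', 'Marks', 'Grade']
print(list(df.index))    # [0, 2, 3]
```

After the last step the index labels are 0, 2, 3: dropping a row does not renumber the rest.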
7. Selection & Slicing
print(df[1:3]) # rows at positions 1 and 2 (stop position 3 is excluded)
print(df.loc[:, 'Name']) # all rows, only Name column
print(df.iloc[0:2, 1:3]) # rows at positions 0-1, columns at positions 1-2
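One detail worth trying out: a slice with iloc excludes the stop position, while a slice with loc includes the stop label. A minimal sketch, assuming a four-row version of the student table with the default index:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anu', 'Binu', 'Manu', 'Sinu'],
    'Age': [20, 21, 19, 22],
    'Marks': [85, 90, 88, 75],
})

by_position = df.iloc[1:3]   # positions 1 and 2; the stop position is excluded
by_label = df.loc[1:3]       # labels 1, 2 and 3; the stop label is included
print(len(by_position))      # 2
print(len(by_label))         # 3

subset = df.loc[0:1, 'Name':'Age']   # label slices work on columns too
print(list(subset.columns))          # ['Name', 'Age']
```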
8. Applications of DataFrame
Handling student marksheets
Analysing sales data
Storing employee records
Data analysis & statistics
9. Practice Questions
1. Create a DataFrame with details of 5 students (RollNo, Name, Marks).
2. Display only the Name and Marks columns.
3. Find the maximum marks from the DataFrame.
4. Add a new column "Grade" based on marks.
5. Delete a column and a row from the DataFrame.
6. Show the first 3 rows and the last 2 rows of the DataFrame.
7. Write a program to print summary statistics of student marks.