
Pandas Data Manipulation Techniques

This document discusses various techniques for manipulating data in Pandas such as filtering, selecting, sorting, aggregating, handling missing values, grouping, pivoting, combining, and applying functions to DataFrames. Key operations include filtering rows based on conditions, selecting columns, sorting by column values, calculating summary statistics, dropping rows with missing values, merging DataFrames, grouping and aggregating data, creating pivot tables, concatenating DataFrames horizontally and vertically, and applying custom functions to columns.

Uploaded by

Manan Sharma

day09-pandas-data-manipulation

February 2, 2024

Pandas Data Manipulation –by Punith V T


[1]: import pandas as pd

[58]: #sample data


data = { "A" : [1,2,3,4,5,10],
"B": ["bengaluru","channai","delhi","tumkur","coimbator","bengaluru"]}

df=pd.DataFrame(data)
df

[58]: A B
0 1 bengaluru
1 2 channai
2 3 delhi
3 4 tumkur
4 5 coimbator
5 10 bengaluru

1. Filtering Data: Filtering rows based on a condition.


[12]: # Keep only the rows where column A is greater than 2
gr2 = df[df["A"] > 2]
gr2

[12]: A B
2 3 delhi
3 4 tumkur
4 5 coimbator
5 10 bengaluru
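
Filters can also combine several conditions. The sketch below rebuilds the same sample data and shows two common patterns; the variable names big_bengaluru and selected are made up for illustration and do not appear in the original notebook.

```python
import pandas as pd

# Same sample data as in the notebook
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 10],
    "B": ["bengaluru", "channai", "delhi", "tumkur", "coimbator", "bengaluru"],
})

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
big_bengaluru = df[(df["A"] > 2) & (df["B"] == "bengaluru")]

# isin() keeps the rows whose value appears in the given list
selected = df[df["B"].isin(["channai", "tumkur"])]

print(big_bengaluru)
print(selected)
```

Python's plain `and` / `or` do not work element-wise on a Series, which is why the bitwise operators are used here.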

2. Selecting Columns:
Selecting specific columns from a DataFrame.
[20]: sc = df[["B"]]

print(sc)

type(sc)

B
0 bengaluru
1 channai
2 delhi
3 tumkur
4 coimbator
5 bengaluru

[20]: pandas.core.frame.DataFrame
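
The double brackets are what make the result a DataFrame. As a small sketch of the difference (the variable names here are illustrative only), single brackets return a Series, and .loc can select rows and columns in one step:

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 10],
    "B": ["bengaluru", "channai", "delhi", "tumkur", "coimbator", "bengaluru"],
})

col_series = df["B"]        # single brackets: one column as a Series
col_frame = df[["B"]]       # double brackets: a one-column DataFrame
two_cols = df[["B", "A"]]   # a list of names selects (and reorders) several columns

# .loc takes a row condition and a list of columns together
subset = df.loc[df["A"] > 2, ["B"]]

print(type(col_series).__name__)
print(type(col_frame).__name__)
print(subset)
```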

3. Sorting Data:
Sorting DataFrame by one or more columns.
[29]: sort = df.sort_values(by="B")
sort

[29]: A B
0 1 bengaluru
5 10 bengaluru
1 2 channai
4 5 coimbator
2 3 delhi
3 4 tumkur

[31]: # Descending order: pass ascending=False
desort = df.sort_values(by="B", ascending=False)
desort

[31]: A B
3 4 tumkur
2 3 delhi
4 5 coimbator
1 2 channai
0 1 bengaluru
5 10 bengaluru
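
Sorting by more than one column works the same way. A minimal sketch, assuming we want cities in alphabetical order and, within each city, the largest A first (the variable name by_city_then_a is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 10],
    "B": ["bengaluru", "channai", "delhi", "tumkur", "coimbator", "bengaluru"],
})

# One ascending flag per sort column: B ascending, then A descending within ties
by_city_then_a = df.sort_values(by=["B", "A"], ascending=[True, False])

# reset_index(drop=True) renumbers the rows 0..n-1 after sorting
renumbered = by_city_then_a.reset_index(drop=True)

print(by_city_then_a)
print(renumbered)
```

The second column only matters for rows that tie on the first, which here is the two "bengaluru" rows.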

4. Aggregating Data:
Calculating summary statistics like mean, sum, count, etc.
[45]: meanA =df["A"].mean()
meanA

[45]: 4.166666666666667

[46]: valueCountA =df["A"].value_counts()


valueCountA

[46]: A
1 1
2 1
3 1
4 1
5 1
10 1
Name: count, dtype: int64
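
Several statistics can be computed in one call. The following is a small sketch on the same sample data using Series.agg with a list of function names; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 10],
    "B": ["bengaluru", "channai", "delhi", "tumkur", "coimbator", "bengaluru"],
})

# agg() with a list of names returns one value per statistic
stats = df["A"].agg(["sum", "mean", "min", "max", "count"])

# value_counts() is most informative on a column with repeated values
city_counts = df["B"].value_counts()

print(stats)
print(city_counts)
```

df["A"].describe() gives a similar overview, including quartiles and standard deviation.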

5. Handling Missing Data:
Dealing with missing values in your DataFrame.
[51]: import pandas as pd
# sample
data_with_missing ={ "A": [1,2,3,None,4],
"B": ["a","b",None,"d","e"]}
df_miss=pd.DataFrame(data_with_missing)

df_miss.dropna()

[51]: A B
0 1.0 a
1 2.0 b
4 4.0 e
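
Dropping rows is not the only option. As a hedged sketch on the same df_miss data, missing values can be counted, filled in, or dropped only when a particular column is empty. The fill values chosen here (the column mean and the string "unknown") are assumptions for illustration, not a recommendation from the notebook.

```python
import pandas as pd

df_miss = pd.DataFrame({
    "A": [1, 2, 3, None, 4],
    "B": ["a", "b", None, "d", "e"],
})

# Count the missing values in each column
missing_per_column = df_miss.isna().sum()

# Fill each column with its own replacement value (illustrative choices)
filled = df_miss.fillna({"A": df_miss["A"].mean(), "B": "unknown"})

# Drop a row only when column A is missing
only_a = df_miss.dropna(subset=["A"])

print(missing_per_column)
print(filled)
print(only_a)
```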

6. Merging DataFrames:
Combining two DataFrames on a shared key column.
[52]: # Create two DataFrames
df1 = pd.DataFrame({'key': ['A', 'B', 'C'], 'value1': [10, 20, 30]})
df2 = pd.DataFrame({'key': ['B', 'C', 'D'], 'value2': [40, 50, 60]})

# Inner merge on the 'key' column: only keys present in both DataFrames are kept
merged_df = pd.merge(df1, df2, on='key', how='inner')
print(merged_df)

key value1 value2
0 B 20 40
1 C 30 50
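
The inner merge above drops keys 'A' and 'D' because each appears in only one DataFrame. A minimal sketch of the other join types, using the same df1 and df2 (the variable names outer and left are illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C'], 'value1': [10, 20, 30]})
df2 = pd.DataFrame({'key': ['B', 'C', 'D'], 'value2': [40, 50, 60]})

# Outer merge keeps every key from both sides; indicator=True adds a
# '_merge' column saying where each row came from
outer = pd.merge(df1, df2, on='key', how='outer', indicator=True)

# Left merge keeps every row of df1 and fills unmatched value2 with NaN
left = pd.merge(df1, df2, on='key', how='left')

print(outer)
print(left)
```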
7. Grouping and Aggregating Data:
Grouping data by one or more columns and applying aggregate functions.
[62]: # Group by 'B' and calculate the sum of 'A' for each group
group_df = df.groupby('B')["A"].sum().reset_index()
print(group_df)

B A
0 bengaluru 11
1 channai 2
2 coimbator 5
3 delhi 3
4 tumkur 4
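
More than one aggregate can be computed per group. The sketch below uses named aggregation on the same sample data; the output column names total_A, mean_A and rows are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 10],
    "B": ["bengaluru", "channai", "delhi", "tumkur", "coimbator", "bengaluru"],
})

# Named aggregation: new_column=(source_column, function)
summary = (
    df.groupby("B")
      .agg(total_A=("A", "sum"), mean_A=("A", "mean"), rows=("A", "count"))
      .reset_index()
)

print(summary)
```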
8. Pivot Tables:
Creating pivot tables to summarize and reshape data.
[63]: # Create a pivot table to show the mean 'A' for each 'B' category
pivot_table = df.pivot_table(values='A', index='B', aggfunc='mean')
print(pivot_table)

A
B
bengaluru 5.5
channai 2.0
coimbator 5.0
delhi 3.0
tumkur 4.0
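
With only an index, a pivot table gives the same numbers as a groupby. The reshaping becomes visible once a second category is spread across the columns. The following sketch uses a small made-up DataFrame (sales, with hypothetical columns city, year and amount) that is not part of the original notebook.

```python
import pandas as pd

# Hypothetical data, invented for this illustration
sales = pd.DataFrame({
    "city": ["bengaluru", "bengaluru", "delhi", "delhi", "tumkur"],
    "year": [2023, 2024, 2023, 2024, 2024],
    "amount": [10, 20, 30, 40, 5],
})

# One row per city, one column per year; combinations with no data become 0
wide = sales.pivot_table(
    values="amount", index="city", columns="year",
    aggfunc="sum", fill_value=0,
)

print(wide)
```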
9. Combining Data:
Concatenating or appending multiple DataFrames vertically or horizontally.
[65]: # Concatenate two DataFrames horizontally (axis=1 aligns rows by index)
df_concatenated = pd.concat([df1, df2], axis=1)
print(df_concatenated)

key value1 key value2


0 A 10 B 40
1 B 20 C 50
2 C 30 D 60

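[69]: # Append one DataFrame to another (vertical concatenation).
# DataFrame.append was removed in pandas 2.0 and _append is a private method,
# so pd.concat is used here instead.
df_appended = pd.concat([df1, df2], ignore_index=True)
print(df_appended)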

key value1 value2


0 A 10.0 NaN
1 B 20.0 NaN
2 C 30.0 NaN
3 B NaN 40.0
4 C NaN 50.0
5 D NaN 60.0
10. Applying Functions:
Applying a custom function to the values of a column.
[70]: def square(x):
    return x*x

# Apply the custom function to 'A' column
df["sq_A"] = df["A"].apply(square)
df

[70]: A B sq_A
0 1 bengaluru 1
1 2 channai 4
2 3 delhi 9
3 4 tumkur 16
4 5 coimbator 25
5 10 bengaluru 100
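
For simple arithmetic, apply is not strictly needed. As a sketch under the same sample data, the vectorized expression below gives the same result as square, and the other lines show a lambda and a string method. The column names sq_A_vec, size and B_upper are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 10],
    "B": ["bengaluru", "channai", "delhi", "tumkur", "coimbator", "bengaluru"],
})

def square(x):
    return x * x

df["sq_A"] = df["A"].apply(square)   # element by element, as in the notebook
df["sq_A_vec"] = df["A"] ** 2        # vectorized equivalent, usually faster

# A lambda works for short one-off functions
df["size"] = df["A"].apply(lambda x: "big" if x > 3 else "small")

# String columns have vectorized methods under .str
df["B_upper"] = df["B"].str.upper()

print(df)
```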
