Module 4

The document provides an introduction to Python libraries, emphasizing their importance in data analysis and visualization. It covers key libraries like Pandas, NumPy, Matplotlib, and their functionalities, including data manipulation and visualization techniques. Additionally, it outlines installation steps and basic usage examples for these libraries.

Uploaded by

kiransam1709

Introduction to Python Libraries for Data Analysis and Visualization
Introduction to Python Libraries

• A Python library is a collection of related modules.

• It contains bundles of code that can be used repeatedly in different programs.
• It makes Python programming simpler and more convenient, because we don't need to
write the same code again and again for different programs.
• Python libraries play a very vital role in fields of Machine Learning, Data Science,
Data Visualization, etc.
Introduction to Python Libraries

• The Python Standard Library is the collection of modules distributed with Python
itself. The language's syntax, semantics, and tokens are defined by the language
reference, not by the library.
• It contains built-in modules that provide access to basic system functionality like I/O
and some other core modules.
• Many standard library modules are written in Python; in CPython, the
performance-critical ones are written in the C programming language.
• The Python standard library consists of more than 200 core modules, which together
supply much of the functionality programmers rely on day to day.
• The Python Standard Library plays a very important role. Without it, programmers
would have to write or install this functionality themselves.
Introduction to Python Libraries

Some of the commonly used libraries are:

1. TensorFlow
2. Matplotlib
3. Pandas
4. NumPy
5. Scikit-learn
6. math (part of the standard library)
And many more.
Introduction to Python Libraries

Example:

In this example, we import the math module and use one of its functions, sqrt
(square root), without writing the actual code to calculate the square root of a
number. That is how a library makes the programmer's job easier.
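A minimal sketch of the kind of example described here. The input value 16 is an arbitrary choice for illustration and is not taken from the original slide:

```python
import math

# 16 is an arbitrary illustrative input.
number = 16

# math.sqrt does the work; no hand-written square-root algorithm is needed.
root = math.sqrt(number)
print(root)  # 4.0
```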
How to install Python libraries

Step 1:

Step 2: Press Shift + Right Click

Step 3: In the command line, type: pip install library-name

Here library-name is the name of the package on the Python Package Index, for example
pandas. Standard-library modules such as math ship with Python and do not need to be
installed.
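Before or after running pip, it can be handy to check from Python whether a library is importable. The sketch below is one possible way to do that with the standard importlib module; the helper name is_installed and the made-up module name are assumptions for illustration only:

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


print(is_installed("math"))                 # standard library, always present
print(is_installed("no_such_library_xyz"))  # made-up name, expected to be missing
```

If the check returns False for a third-party package, pip install with that package's name is the usual next step.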
Python Libraries for Data Analysis and Visualization

Data Analysis:
• Pandas: The cornerstone of data manipulation and analysis. It provides powerful data structures like
DataFrames, enabling you to efficiently clean, transform, and analyze data.
• NumPy: The foundation for numerical computing in Python. It offers high-performance
multidimensional arrays and mathematical functions, crucial for handling large datasets and
performing complex calculations.
• SciPy: Built on top of NumPy, SciPy provides advanced scientific and technical computing capabilities,
including statistical analysis, optimization, linear algebra, and more.
Python Libraries for Data Analysis and Visualization

Data Visualization:
• Matplotlib: The granddaddy of Python visualization libraries. It offers a comprehensive set of plotting
functions for creating a wide variety of static, animated, and interactive visualizations.
• Seaborn: Built on top of Matplotlib, Seaborn simplifies the creation of visually appealing statistical
graphics. It provides a high-level interface for common statistical plots and integrates seamlessly with
Pandas DataFrames.
• Plotly: A powerful library for creating interactive and web-based visualizations. It supports a wide
range of chart types, including 3D plots, and allows you to easily embed visualizations in web
applications.
PANDAS and MATPLOTLIB

• Pandas is a Python library used for working with data sets.

• It has functions for analyzing, cleaning, exploring, and manipulating data.

• The name "Pandas" is derived from "panel data" and also plays on "Python data
analysis". The library was created by Wes McKinney in 2008.

• Pandas allows us to analyze large data sets and draw conclusions based on statistical
methods.

• Pandas can clean messy data sets and make them readable and relevant.

Installation of PANDAS: pip install pandas

Checking PANDAS Version:
import pandas
print(pandas.__version__)

Importing PANDAS: import pandas

Importing PANDAS as ALIAS: import pandas as pd

PANDAS: Series and Dataframes

Series:
A Pandas Series is like a column in a table.
It is a one-dimensional array holding data of any type.
PANDAS: Series and Dataframes

Create a simple Pandas Series from a list

If nothing else is specified, the values are labeled with their index number. First value has
index 0, second value has index 1 etc.
This label can be used to access a specified value.
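A short sketch of creating a Series from a list and reading a value by its default label. The numbers are made up for illustration:

```python
import pandas as pd

# Illustrative values only.
values = [1, 7, 2]
series = pd.Series(values)

print(series)     # values shown next to their default labels 0, 1, 2
print(series[0])  # access the first value through its label 0
```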
PANDAS: Series and Dataframes

With the index argument, you can name your own labels.
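For instance, assuming the labels "x", "y", and "z" (chosen here purely for illustration), custom labels might be used like this:

```python
import pandas as pd

values = [1, 7, 2]

# The index argument replaces the default labels 0, 1, 2.
series = pd.Series(values, index=["x", "y", "z"])

print(series["y"])  # 7
```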
PANDAS: Series and Dataframes

Dataframe:
A Pandas DataFrame is a 2 dimensional data structure, like a 2 dimensional array, or a
table with rows and columns.
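One common way to build a DataFrame is from a dictionary of equal-length lists, where each key becomes a column. The column names and values below are assumptions for illustration:

```python
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values.
data = {"calories": [420, 380, 390], "duration": [50, 40, 45]}
df = pd.DataFrame(data)

print(df)
print(df.shape)  # (3, 2): three rows, two columns
```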
PANDAS: Series and Dataframes

Loc and iloc in Dataframe:

loc is used to access a group of rows and columns by labels; iloc does the same by
integer position. Both are indexed as [row, column].
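The difference between label-based and position-based access can be sketched as follows. The row labels "day1" to "day3" and the column names are invented for this example:

```python
import pandas as pd

df = pd.DataFrame(
    {"calories": [420, 380, 390], "duration": [50, 40, 45]},
    index=["day1", "day2", "day3"],
)

by_label = df.loc["day2", "duration"]  # row label, column label
by_position = df.iloc[1, 1]            # row position, column position

# Label slices include the end label; position slices exclude the end position.
first_two = df.loc["day1":"day2", ["calories"]]
same_rows = df.iloc[0:2, [0]]

print(by_label, by_position)
print(first_two)
```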
PANDAS: Importing and Exporting dataset

Reading and Writing Data to/from CSV Files

CSV (Comma-Separated Values) is a lightweight, easy-to-read format that is widely used for
storing data. Pandas provides robust functions for reading and writing CSV files.
Reading CSV Files
To load data from a CSV file into a Pandas DataFrame, use the pd.read_csv() function.
PANDAS: Importing and Exporting dataset

Writing Data to CSV Files

To export a DataFrame back to a CSV file, call the to_csv() method on the DataFrame itself, for example df.to_csv().
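A small round trip, writing a DataFrame to CSV and reading it back, might look like the sketch below. The file name scores.csv and the data are hypothetical; a temporary directory is used only so the example cleans up after itself:

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["Asha", "Ben"], "score": [91, 78]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "scores.csv")  # hypothetical file name
    df.to_csv(path, index=False)  # index=False leaves the row labels out of the file
    loaded = pd.read_csv(path)    # read the same file back into a new DataFrame

print(loaded)
```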
PANDAS: Dataset Data Manipulation

Pandas data manipulation is the process of cleaning, transforming, and aggregating data using the
Pandas library. Pandas provides a variety of functions for performing these tasks, making it a
powerful and versatile tool for data analysis.

Here are some of the most common Pandas data manipulation tasks:
• Data selection: Pandas provides a variety of tools for selecting data, such as the head() and
tail() methods and the iloc[] and loc[] indexers. These allow you to select specific rows,
columns, or subsets of data from a DataFrame.
• Data filtering: Pandas provides a variety of functions for filtering data, such as query(),
drop(), and dropna(). These functions allow you to filter data based on specific criteria, such as
values, data types, or missing values.
PANDAS: Dataset Data Manipulation

• Data aggregation: Pandas provides a variety of functions for aggregating data, such as
mean(), median(), sum(), mode(), and count(). These functions allow you to calculate
summary statistics for groups of data. For example, you can calculate the total sales for each
product category or the average order value for each customer region.
• Data transformation: Pandas provides a variety of functions for transforming data, such as
map(), apply(), and replace(). These functions allow you to create new columns, modify
existing columns, and perform other transformations on data.
• Data Sorting: Pandas can be used to sort data by any column or index. For example, you can
sort a DataFrame by the customer's name or by the order date.
• Data Grouping: Pandas can be used to group data by any column or index. For example, you
can group a DataFrame by product category or by customer region.
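The tasks listed so far can be sketched on a tiny, made-up sales table. The column names and values are assumptions for illustration, and each line shows only one of several ways to do the task:

```python
import pandas as pd

# Made-up sales records; one amount is missing on purpose.
sales = pd.DataFrame({
    "category": ["toys", "books", "toys", "books"],
    "region": ["north", "north", "south", "south"],
    "amount": [120.0, 80.0, None, 60.0],
})

first_rows = sales.head(2)          # selection: first two rows
clean = sales.dropna()              # filtering: drop rows with missing values
large = clean.query("amount > 70")  # filtering: keep rows matching a condition
total = clean["amount"].sum()       # aggregation over one column

# Transformation: derive a new column from an existing one.
clean = clean.assign(amount_k=clean["amount"].apply(lambda x: x / 1000))

ordered = clean.sort_values("amount")                     # sorting by a column
per_category = clean.groupby("category")["amount"].sum()  # grouping plus aggregation

print(per_category)
```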
PANDAS: Dataset Data Manipulation

• Merging: Pandas can be used to merge two or more DataFrames together. For example, you
can merge a DataFrame of customer data with a DataFrame of order data to create a single
DataFrame that contains all of the information for each customer.
• Joining: The join() method combines two or more DataFrames on their index by default, or
on a key column given through the on parameter. For example, you can join a DataFrame of
customer data with a DataFrame of product data to create a single DataFrame that contains
all of the information for each customer and the products they have ordered.
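A minimal sketch of both operations, assuming two small tables that share a customer_id column (all names and values are invented for illustration):

```python
import pandas as pd

customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Asha", "Ben", "Chen"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3], "product": ["pen", "ink", "pad"]})

# merge() matches rows on a shared column; how="inner" keeps only customers with orders.
merged = pd.merge(customers, orders, on="customer_id", how="inner")

# join() matches on the index, so the key column is moved into the index first.
# how="left" keeps every customer, with missing values where no order exists.
joined = customers.set_index("customer_id").join(
    orders.set_index("customer_id"), how="left"
)

print(merged)
print(joined)
```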
PANDAS: Dataset Data Manipulation - Extras

• Use the head() and tail() functions to preview the data before you start manipulating it. This will help you to
identify any errors or inconsistencies in the data.
• Use the info() function to get information about the DataFrame, such as the data types of the columns and the
number of rows and columns in the DataFrame. This information can be helpful when choosing the
appropriate functions to use for data manipulation.
• Use the describe() function to calculate summary statistics for the data. This can help you to understand the
distribution of the data and to identify any outliers.
• Use the groupby() function to group the data by one or more columns. This can be useful for performing
aggregate operations on the data, such as calculating summary statistics or finding the most common values
in a column.
• Use the apply() function to apply a function to each row or column of the DataFrame. This can be useful for
performing transformations on the data, such as creating new columns or modifying existing columns.
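These inspection and summary helpers can be tried on any small DataFrame. The sketch below uses an invented table; the column names and the revenue calculation are assumptions for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "units": [10, 4, 6, 8],
    "price": [2.5, 3.0, 2.5, 3.0],
})

print(df.head(2))     # first two rows
print(df.tail(2))     # last two rows
df.info()             # column dtypes and non-null counts (prints directly)
print(df.describe())  # count, mean, std, min, quartiles, max of numeric columns

# groupby: total units per region.
units_by_region = df.groupby("region")["units"].sum()

# apply with axis=1 calls the function once per row to build a new column.
df["revenue"] = df.apply(lambda row: row["units"] * row["price"], axis=1)

print(units_by_region)
print(df)
```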
MATPLOTLIB: Use of Data Visualization

Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility.
Matplotlib was created by John D. Hunter.
Matplotlib is open source and we can use it freely.
Matplotlib is mostly written in Python; a few segments are written in C, Objective-C, and
JavaScript for platform compatibility.

Installation: pip install matplotlib

Import: import matplotlib

MATPLOTLIB: Use of Data Visualization

Most of the Matplotlib utilities lie under the pyplot submodule, and are usually imported under
the plt alias:
import matplotlib.pyplot as plt
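With the alias in place, a first line plot might look like the sketch below. The data points are made up, and the Agg backend line is an assumption added so the sketch runs without a display; it can be dropped when working interactively:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a window with plt.show()
import matplotlib.pyplot as plt

x = [0, 6]
y = [0, 250]

plt.figure()
lines = plt.plot(x, y)  # draws a line from (0, 0) to (6, 250)
# plt.show() would display the figure when an interactive backend is in use.
```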
MATPLOTLIB

Markers: You can use the keyword argument marker to emphasize each point with a specified marker:
MATPLOTLIB

Format Strings: You can also use the shortcut string notation parameter fmt, written as
marker|line|color, to set the marker, line style, and color at once (for example, 'o:r' for
circle markers, a dotted line, and red).

MATPLOTLIB

Marker Size: You can use the keyword argument markersize or the shorter version, ms to set the
size of the markers:
MATPLOTLIB

Marker Color: You can use the keyword argument markeredgecolor or the shorter mec to set the
color of the edge of the markers, and markerfacecolor or the shorter mfc to set the color inside
the edge. Use both the mec and mfc arguments to color the entire marker.
MATPLOTLIB

Linestyle: You can use the keyword argument linestyle, or shorter ls, to change the style of the
plotted line:
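The marker, format-string, marker-size, marker-color, and linestyle options described above can be combined in one call. The sketch below uses made-up data and arbitrary style choices, and selects the Agg backend only so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for running headless
import matplotlib.pyplot as plt

y = [3, 8, 1, 10]

plt.figure()

# Keyword arguments: circle markers, size 12, red edge, yellow face, dashed line.
(kw_line,) = plt.plot(y, marker="o", ms=12, mec="r", mfc="y", ls="--")

# Format string: circle markers, dotted line, green.
(fmt_line,) = plt.plot([v + 2 for v in y], "o:g")
```

The format string is the compact form; the keyword arguments are easier to read when several marker properties are set at once.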
MATPLOTLIB

Create Labels for a Plot: With Pyplot, you can use the xlabel() and ylabel() functions to set a
label for the x- and y-axis.

Create a Title for a Plot: With Pyplot, you can use the title() function to set a title for the plot.

Set Font Properties for Title and Labels: You can use the fontdict parameter in xlabel(), ylabel(),
and title() to set font properties for the title and labels.

Position the Title: You can use the loc parameter in title() to position the title. Legal values are:
'left', 'right', and 'center'. Default value is 'center'.
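A sketch of these labeling functions together. The data, label text, and font settings are illustrative assumptions, and the Agg backend is chosen only so the code runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for running headless
import matplotlib.pyplot as plt

plt.figure()
plt.plot([80, 90, 100, 110], [240, 260, 290, 310])

# fontdict takes a dictionary of font properties.
font = {"family": "serif", "color": "darkred", "size": 14}

plt.title("Sports Watch Data", fontdict=font, loc="left")  # title aligned to the left
plt.xlabel("Average Pulse", fontdict=font)
plt.ylabel("Calorie Burnage")

ax = plt.gca()  # current axes, used here only to read the labels back
```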
MATPLOTLIB

Display Multiple Plots: With the subplot() function you can draw multiple plots in one figure:
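The subplot() function takes the number of rows, the number of columns, and the index of the plot being drawn. A minimal sketch with made-up data (the Agg backend is an assumption so that no display is required):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for running headless
import matplotlib.pyplot as plt

plt.figure()

plt.subplot(1, 2, 1)  # 1 row, 2 columns, first plot
plt.plot([0, 1, 2, 3], [3, 8, 1, 10])

plt.subplot(1, 2, 2)  # 1 row, 2 columns, second plot
plt.plot([0, 1, 2, 3], [10, 20, 30, 40])

n_axes = len(plt.gcf().axes)  # number of plots in the current figure
print(n_axes)  # 2
```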
MATPLOTLIB: Basic Plots and Customizing for
effective visualization

Creating Scatter Plots: With Pyplot, you can use the scatter() function to draw a scatter plot.
The scatter() function plots one dot for each observation. It needs two arrays of the same length,
one for the values of the x-axis, and one for values on the y-axis:
MATPLOTLIB: Basic Plots and Customizing for
effective visualization

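Providing Colors in Scatter Plots: You can set a color for each dot by passing an array of colors as the c argument; the array must have the same length as the arrays for the x- and y-axis.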

MATPLOTLIB: Basic Plots and Customizing for
effective visualization
Size: You can change the size of the dots with the s argument. Just like colors, make sure the array
for sizes has the same length as the arrays for the x- and y-axis:
MATPLOTLIB: Basic Plots and Customizing for
effective visualization

Alpha: You can adjust the transparency of the dots with the alpha argument, a value between
0 (fully transparent) and 1 (fully opaque).
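Putting the scatter options together, a sketch with per-point colors and sizes and a single transparency value might look like this. All values are invented for illustration, and the Agg backend is an assumption so the code runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for running headless
import matplotlib.pyplot as plt

x = [5, 7, 8, 7, 2]
y = [99, 86, 87, 88, 111]

# One color and one size per point, so both lists match the length of x and y.
colors = ["red", "green", "blue", "orange", "purple"]
sizes = [20, 50, 100, 200, 500]

plt.figure()
points = plt.scatter(x, y, c=colors, s=sizes, alpha=0.5)  # alpha applies to every dot
```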
MATPLOTLIB: Basic Plots and Customizing for
effective visualization

Creating Bars: With Pyplot, you can use the bar() function to draw bar graphs:
MATPLOTLIB: Basic Plots and Customizing for
effective visualization
Horizontal Bars: If you want the bars to be displayed horizontally instead of vertically, use
the barh() function:
MATPLOTLIB: Basic Plots and Customizing for
effective visualization
Bar Width: The bar() function takes the keyword argument width to set the width of the bars.
The default width value is 0.8.
The barh() function takes the keyword argument height to set the height of the bars. The default height value is also 0.8.
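Vertical and horizontal bars, each with a custom thickness, can be sketched as follows. The categories and values are made up, and the Agg backend is an assumption so no display is needed:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for running headless
import matplotlib.pyplot as plt

categories = ["A", "B", "C", "D"]
values = [3, 8, 1, 10]

plt.figure()
vertical = plt.bar(categories, values, width=0.5)     # thinner than the 0.8 default

plt.figure()
horizontal = plt.barh(categories, values, height=0.5)  # height plays the role of width
```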
MATPLOTLIB: Basic Plots and Customizing for
effective visualization
Create Histogram: In Matplotlib, we use the hist() function to create histograms.
The hist() function takes an array of numbers as an argument and uses it to create the
histogram.
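A minimal histogram sketch. The values are invented measurements, the choice of 4 bins is arbitrary, and the Agg backend is an assumption so the code runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for running headless
import matplotlib.pyplot as plt

values = [150, 152, 155, 160, 161, 165, 168, 170, 171, 175, 180, 182]

plt.figure()
# hist() returns the count per bin, the bin edges, and the drawn bar patches.
counts, bin_edges, patches = plt.hist(values, bins=4)

print(counts)
print(bin_edges)
```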
MATPLOTLIB: Basic Plots and Customizing for
effective visualization

Creating Pie Charts: With Pyplot, you can use the pie() function to draw pie charts:
MATPLOTLIB: Basic Plots and Customizing for
effective visualization

Explode: Maybe you want one of the wedges to stand out? The explode parameter allows you to do
that. The explode parameter, if specified, and not None, must be an array with one value for each
wedge. Each value represents how far from the center each wedge is displayed.
MATPLOTLIB: Basic Plots and Customizing for
effective visualization
Legend: To add a list of explanations for each wedge, use the legend() function:
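The pie chart, explode, and legend features can be combined as in the sketch below. The fruit data and the legend title are invented for illustration, and the Agg backend is an assumption so no display is needed:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for running headless
import matplotlib.pyplot as plt

sizes = [35, 25, 25, 15]
labels = ["Apples", "Bananas", "Cherries", "Dates"]

# One explode value per wedge; only the first wedge is pulled away from the center.
explode = [0.2, 0, 0, 0]

plt.figure()
result = plt.pie(sizes, labels=labels, explode=explode)
wedges = result[0]  # the drawn wedge patches

legend = plt.legend(title="Fruits")  # uses the wedge labels as legend entries
```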
Thank You!
