Complete Matplotlib Pyplot Guide For Data Visualization

This document is a comprehensive guide to using Matplotlib's Pyplot for data visualization in Python, covering setup, basic and advanced plot types, customization options, and interactive features. It includes detailed syntax, parameters, and implementation examples for various plot types such as line plots, scatter plots, bar charts, histograms, and pie charts. Additionally, it discusses customizing plots with colors, markers, labels, legends, and axes management.

Uploaded by

iron pump

Complete Matplotlib Pyplot Guide for Data Visualization
Table of Contents
1. Introduction and Setup
2. Basic Plot Types
3. Customizing Plots
4. Axes and Subplots
5. Advanced Plot Types
6. Styling and Themes
7. Annotations and Text
8. 3D Plotting
9. Interactive Features
10. Saving and Exporting

1. Introduction and Setup


What is Matplotlib?

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in
Python. Pyplot is its state-based interface that provides a MATLAB-like plotting framework.

Basic Setup
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Enable inline plotting in Jupyter notebooks
%matplotlib inline

Key Concepts

● Figure: The top-level container (the entire window or page) that everything is drawn on
● Axes: A single plotting area within the figure, with its own x- and y-axis; each subplot is one Axes
● Artist: Everything visible on the figure (lines, text, patches, etc.)
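
The relationship between these three objects can be checked directly. The following is a minimal sketch, not part of the original guide; it assumes the non-interactive Agg backend so that it runs without a display (drop that line in a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.artist import Artist

fig, ax = plt.subplots()                       # One Figure holding one Axes
(line,) = ax.plot([0, 1, 2], [0, 1, 4], label="demo")

# The line is an Artist drawn on the Axes, and the Axes belongs to the Figure
print(isinstance(line, Artist))
print(line.axes is ax)
print(ax.figure is fig)
print(fig.axes == [ax])
plt.close(fig)
```

Keeping explicit `fig` and `ax` handles like this is what the object-oriented interface in Section 4 builds on.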
2. Basic Plot Types
2.1 Line Plots

Basic Syntax
plt.plot(x, y, format_string, **kwargs)

Parameters

●​ x: Array-like, x-axis data


●​ y: Array-like, y-axis data
●​ format_string: String specifying color, marker, linestyle (e.g., 'ro-')
●​ label: String, legend label
●​ linewidth or lw: Float, line width
●​ linestyle or ls: String, line style ('-', '--', '-.', ':')
●​ color or c: Color specification
●​ marker: Marker style ('o', 's', '^', etc.)
●​ markersize or ms: Float, marker size
●​ alpha: Float (0-1), transparency
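
As an illustration of how the format string relates to the keyword parameters listed above, the sketch below draws the same style twice, once with `'ro-'` and once spelled out. It is an added example under the assumption of a headless Agg backend; the variable names are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 20)
fig, ax = plt.subplots()

# Format string: color 'r', marker 'o', linestyle '-'
(short,) = ax.plot(x, np.sin(x), "ro-")
# The same style written with keyword arguments
(long_,) = ax.plot(x, np.cos(x), color="r", marker="o", linestyle="-")

for prop in ("color", "marker", "linestyle"):
    getter = f"get_{prop}"
    print(prop, getattr(short, getter)(), getattr(long_, getter)())
plt.close(fig)
```

The keyword form is longer but easier to read when more than two or three properties are set.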

Implementation Examples
# Simple line plot
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.plot(x, y)
plt.title('Basic Line Plot')
plt.xlabel('X values')
plt.ylabel('Y values')
plt.show()

# Multiple lines with customization


plt.figure(figsize=(10, 6))
plt.plot(x, np.sin(x), 'b-', label='sin(x)', linewidth=2)
plt.plot(x, np.cos(x), 'r--', label='cos(x)', linewidth=2)
plt.plot(x, np.tan(x), 'g:', label='tan(x)', alpha=0.7)
plt.legend()
plt.grid(True)
plt.ylim(-2, 2)
plt.show()

2.2 Scatter Plots

Basic Syntax
plt.scatter(x, y, s=None, c=None, **kwargs)

Parameters

●​ s: Float or array-like, marker sizes


●​ c: Array-like or color, marker colors
●​ marker: Marker style
●​ cmap: Colormap name
●​ alpha: Transparency
●​ edgecolors: Edge color
●​ linewidths: Edge line width

Implementation
# Basic scatter plot
np.random.seed(42)
x = np.random.randn(100)
y = np.random.randn(100)
colors = np.random.rand(100)
sizes = 1000 * np.random.rand(100)

plt.figure(figsize=(10, 6))
plt.scatter(x, y, c=colors, s=sizes, alpha=0.6, cmap='viridis')
plt.colorbar()
plt.title('Scatter Plot with Variable Colors and Sizes')
plt.xlabel('X values')
plt.ylabel('Y values')
plt.show()

2.3 Bar Charts

Basic Syntax
plt.bar(x, height, width=0.8, **kwargs)
plt.barh(y, width, height=0.8, **kwargs) # Horizontal bars

Parameters

●​ x: Array-like, bar positions


●​ height: Array-like, bar heights
●​ width: Float or array-like, bar widths
●​ bottom: Float or array-like, bottom positions (for stacking)
●​ color: Color specification
●​ edgecolor: Edge color
●​ linewidth: Edge line width
●​ align: Alignment ('center', 'edge')

Implementation
# Vertical bar chart
categories = ['A', 'B', 'C', 'D', 'E']
values = [23, 45, 56, 78, 32]

plt.figure(figsize=(10, 6))
bars = plt.bar(categories, values, color=['red', 'blue', 'green', 'orange', 'purple'])
plt.title('Bar Chart Example')
plt.xlabel('Categories')
plt.ylabel('Values')
# Add value labels on bars
for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height + 1,
             f'{height}', ha='center', va='bottom')
plt.show()

# Horizontal bar chart


plt.figure(figsize=(10, 6))
plt.barh(categories, values, color='skyblue')
plt.title('Horizontal Bar Chart')
plt.xlabel('Values')
plt.ylabel('Categories')
plt.show()

# Stacked bar chart


categories = ['Q1', 'Q2', 'Q3', 'Q4']
values1 = [20, 35, 30, 35]
values2 = [25, 25, 15, 30]

plt.figure(figsize=(10, 6))
plt.bar(categories, values1, label='Product A')
plt.bar(categories, values2, bottom=values1, label='Product B')
plt.title('Stacked Bar Chart')
plt.xlabel('Quarters')
plt.ylabel('Sales')
plt.legend()
plt.show()

2.4 Histograms

Basic Syntax
plt.hist(x, bins=None, **kwargs)

Parameters

●​ x: Array-like, data values


●​ bins: Integer or array-like, number of bins or bin edges
●​ range: Tuple, range of values to include
●​ density: Boolean, normalize to show probability density
●​ cumulative: Boolean, cumulative histogram
●​ histtype: String, histogram type ('bar', 'step', 'stepfilled')
●​ orientation: String, 'horizontal' or 'vertical'
●​ color: Color specification
●​ alpha: Transparency
●​ edgecolor: Edge color
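
To make the effect of `density=True` concrete, the sketch below checks that the bar areas sum to 1. It is an added illustration that assumes a headless Agg backend and uses NumPy's `default_rng` rather than the seeded global generator used elsewhere in this guide.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(100, 15, 1000)

fig, ax = plt.subplots()
# hist returns the bar heights, the bin edges, and the drawn patches
counts, edges, patches = ax.hist(data, bins=30, density=True)

# With density=True, height * bin width summed over all bins equals 1
area = np.sum(counts * np.diff(edges))
print(f"total area = {area:.6f}")
plt.close(fig)
```

Note that the bar heights themselves are densities, so they can exceed 1 when the bins are narrower than one unit.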

Implementation
# Basic histogram
np.random.seed(42)
data = np.random.normal(100, 15, 1000)

plt.figure(figsize=(12, 4))

# Basic histogram
plt.subplot(1, 3, 1)
plt.hist(data, bins=30, color='skyblue', edgecolor='black')
plt.title('Basic Histogram')
plt.xlabel('Values')
plt.ylabel('Frequency')

# Normalized histogram (probability density)


plt.subplot(1, 3, 2)
plt.hist(data, bins=30, density=True, color='lightgreen', alpha=0.7)
plt.title('Normalized Histogram')
plt.xlabel('Values')
plt.ylabel('Density')

# Cumulative histogram
plt.subplot(1, 3, 3)
plt.hist(data, bins=30, cumulative=True, color='coral', alpha=0.7)
plt.title('Cumulative Histogram')
plt.xlabel('Values')
plt.ylabel('Cumulative Frequency')

plt.tight_layout()
plt.show()

2.5 Pie Charts

Basic Syntax
plt.pie(x, labels=None, **kwargs)

Parameters

●​ x: Array-like, wedge sizes


●​ labels: List, wedge labels
●​ colors: List, wedge colors
●​ autopct: String or function, label format
●​ startangle: Float, starting angle
●​ explode: Array-like, wedge separation
●​ shadow: Boolean, drop shadow
●​ textprops: Dict, text properties
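
Since `autopct` accepts a function as well as a format string, a label can combine the percentage with the underlying count. The sketch below is an added example; `pct_and_count` is a hypothetical helper name, and the Agg backend is assumed so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

sizes = [30, 25, 20, 15, 10]
labels = ["A", "B", "C", "D", "E"]
total = sum(sizes)

def pct_and_count(pct):
    """Hypothetical helper: format a wedge as 'percent (absolute count)'."""
    count = int(round(float(pct) / 100 * total))
    return f"{pct:.0f}% ({count})"

fig, ax = plt.subplots()
# pie passes each wedge's percentage (0-100) to the autopct callable
wedges, texts, autotexts = ax.pie(sizes, labels=labels,
                                  autopct=pct_and_count, startangle=90)
print([t.get_text() for t in autotexts])
plt.close(fig)
```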

Implementation
# Basic pie chart
sizes = [30, 25, 20, 15, 10]
labels = ['A', 'B', 'C', 'D', 'E']
colors = ['gold', 'lightcoral', 'lightskyblue', 'lightgreen', 'pink']
explode = (0.1, 0, 0, 0, 0) # explode first slice
plt.figure(figsize=(10, 8))
plt.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%',
        startangle=90, explode=explode, shadow=True)
plt.title('Pie Chart Example')
plt.axis('equal') # Equal aspect ratio ensures circular pie
plt.show()
3. Customizing Plots
3.1 Colors and Colormaps

Color Specifications
# Different ways to specify colors
plt.plot(x, y, color='red') # Named colors
plt.plot(x, y, color='r') # Single letter
plt.plot(x, y, color='#FF0000') # Hex codes
plt.plot(x, y, color=(1, 0, 0)) # RGB tuple
plt.plot(x, y, color=(1, 0, 0, 0.5)) # RGBA tuple

Colormaps
# Common colormaps
colormaps = ['viridis', 'plasma', 'inferno', 'magma',
'Blues', 'Reds', 'Greens', 'coolwarm', 'seismic']

# Using colormaps (values must be numeric, with one entry per point in x and y)
plt.scatter(x, y, c=values, cmap='viridis')
plt.colorbar(label='Color Scale')
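
A colormap can also be sampled directly to color a family of lines. The following is an added sketch, assuming a headless Agg backend; the phase-shifted sine curves are only placeholder data.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

cmap = plt.get_cmap("viridis")              # Look up a colormap by name
line_colors = cmap(np.linspace(0, 1, 5))    # Five evenly spaced RGBA colors

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()
for k, color in enumerate(line_colors):
    ax.plot(x, np.sin(x + 0.5 * k), color=color, label=f"shift {k}")
ax.legend()
print(line_colors.shape)  # One RGBA row per sampled color
plt.close(fig)
```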

3.2 Markers and Line Styles

Marker Styles
markers = ['o', 's', '^', 'v', '<', '>', 'd', 'p', 'h', '*', '+', 'x']

# Line styles
linestyles = ['-', '--', '-.', ':']

# Format strings combine color, marker, and line style


plt.plot(x, y, 'ro-') # Red circles with solid line
plt.plot(x, y, 'b^--') # Blue triangles with dashed line

3.3 Labels and Titles

Comprehensive Labeling
plt.figure(figsize=(10, 6))
plt.plot(x, y)

# Titles and labels with customization


plt.title('Main Title', fontsize=16, fontweight='bold', pad=20)
plt.xlabel('X-axis Label', fontsize=12, fontweight='bold')
plt.ylabel('Y-axis Label', fontsize=12, fontweight='bold')

# Figure-level title using suptitle (placed above the axes title)
plt.suptitle('Figure Title', fontsize=18, y=0.98)

# Custom text positioning


plt.text(0.5, 0.5, 'Custom Text', transform=plt.gca().transAxes,
fontsize=12, ha='center', va='center')
plt.show()

3.4 Legends

Legend Customization
plt.figure(figsize=(10, 6))
plt.plot(x, np.sin(x), label='sin(x)')
plt.plot(x, np.cos(x), label='cos(x)')

# Legend with customization


plt.legend(loc='upper right', # Location
frameon=True, # Frame on/off
fancybox=True, # Rounded corners
shadow=True, # Drop shadow
ncol=1, # Number of columns
fontsize=12, # Font size
title='Functions', # Legend title
title_fontsize=14) # Title font size

plt.show()

3.5 Grid Customization

Grid Options
plt.figure(figsize=(10, 6))
plt.plot(x, y)

# Grid customization
plt.grid(True, # Enable grid
linestyle='-', # Line style
linewidth=0.5, # Line width
alpha=0.7, # Transparency
color='gray') # Color

# Fine control over major and minor grids


plt.grid(True, which='major', linestyle='-', alpha=0.7)
plt.grid(True, which='minor', linestyle=':', alpha=0.4)
plt.minorticks_on()

plt.show()
4. Axes and Subplots
4.1 Figure and Axes Management

Creating Figures
# Method 1: pyplot interface
plt.figure(figsize=(12, 8))
plt.plot(x, y)
plt.show()

# Method 2: Object-oriented interface


fig, ax = plt.subplots(figsize=(12, 8))
ax.plot(x, y)
plt.show()

# Figure parameters
fig = plt.figure(figsize=(12, 8), # Size in inches
dpi=100, # Dots per inch
facecolor='white', # Background color
edgecolor='black') # Edge color

4.2 Subplots

Basic Subplots
# Method 1: plt.subplot()
plt.figure(figsize=(15, 10))

plt.subplot(2, 2, 1) # 2 rows, 2 columns, position 1


plt.plot(x, np.sin(x))
plt.title('sin(x)')

plt.subplot(2, 2, 2)
plt.plot(x, np.cos(x))
plt.title('cos(x)')

plt.subplot(2, 2, 3)
plt.plot(x, np.tan(x))
plt.title('tan(x)')

plt.subplot(2, 2, 4)
plt.scatter(x[::10], np.sin(x[::10]))
plt.title('Scatter')

plt.tight_layout()
plt.show()

# Method 2: plt.subplots()
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

axes[0, 0].plot(x, np.sin(x))


axes[0, 0].set_title('sin(x)')
axes[0, 1].plot(x, np.cos(x))
axes[0, 1].set_title('cos(x)')

axes[1, 0].plot(x, np.tan(x))


axes[1, 0].set_title('tan(x)')
axes[1, 0].set_ylim(-5, 5)

axes[1, 1].scatter(x[::10], np.sin(x[::10]))


axes[1, 1].set_title('Scatter')

plt.tight_layout()
plt.show()

Advanced Subplot Layouts


# Using GridSpec for complex layouts
from matplotlib.gridspec import GridSpec

fig = plt.figure(figsize=(15, 10))


gs = GridSpec(3, 3, figure=fig)

# Different sized subplots


ax1 = fig.add_subplot(gs[0, :]) # Top row, all columns
ax1.plot(x, np.sin(x))
ax1.set_title('Full width plot')

ax2 = fig.add_subplot(gs[1, 0]) # Middle left


ax2.plot(x, np.cos(x))
ax2.set_title('cos(x)')

ax3 = fig.add_subplot(gs[1:, 1:]) # Bottom right (2x2)


ax3.scatter(np.random.randn(100), np.random.randn(100))
ax3.set_title('Large scatter plot')

plt.tight_layout()
plt.show()
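
The same kind of uneven layout can be described by name with `plt.subplot_mosaic`, available in Matplotlib 3.3 and later. This is an added sketch rather than part of the original guide; the axes names `top`, `left`, and `big` are arbitrary labels chosen here, and the Agg backend is assumed.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# Each string names an axes; repeating a name makes it span those cells.
# A '.' marks a cell that is left empty.
layout = [
    ["top",  "top", "top"],
    ["left", "big", "big"],
    [".",    "big", "big"],
]
fig, axd = plt.subplot_mosaic(layout, figsize=(15, 10))

axd["top"].plot(x, np.sin(x))
axd["top"].set_title("Full width plot")
axd["left"].plot(x, np.cos(x))
axd["left"].set_title("cos(x)")
axd["big"].scatter(np.random.randn(100), np.random.randn(100))
axd["big"].set_title("Large scatter plot")

fig.tight_layout()
plt.close(fig)
```

The returned `axd` is a dictionary, so each axes is addressed by its label instead of by grid indices.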

4.3 Axis Customization

Axis Limits and Scales


plt.figure(figsize=(15, 5))

# Linear scale
plt.subplot(1, 3, 1)
plt.plot(x, np.exp(x/10))
plt.xlim(0, 10)
plt.ylim(0, 10)
plt.title('Linear Scale')

# Log scale
plt.subplot(1, 3, 2)
plt.plot(x, np.exp(x/10))
plt.yscale('log')
plt.title('Log Y Scale')

# Both axes log scale


plt.subplot(1, 3, 3)
plt.loglog(x[1:], x[1:]**2)
plt.title('Log-Log Scale')

plt.tight_layout()
plt.show()

Tick Customization
plt.figure(figsize=(12, 6))
plt.plot(x, np.sin(x))

# Custom tick locations and labels


plt.xticks(np.arange(0, 11, 2), # Tick positions
['Zero', 'Two', 'Four', 'Six', # Custom labels
'Eight', 'Ten'],
rotation=45, # Rotation
fontsize=12) # Font size

plt.yticks(np.arange(-1, 1.2, 0.5))

# Alternatively, hide x-axis ticks and labels (this overrides the custom labels set above)
plt.tick_params(axis='x', # Which axis
which='both', # Major and minor ticks
bottom=False, # Tick marks on bottom
top=False, # Tick marks on top
labelbottom=False) # Labels on bottom

plt.show()
5. Advanced Plot Types
5.1 Heatmaps

Using imshow()
# Create sample data
data = np.random.rand(10, 10)

plt.figure(figsize=(12, 5))

# Basic heatmap
plt.subplot(1, 2, 1)
plt.imshow(data, cmap='hot', interpolation='nearest')
plt.colorbar(label='Values')
plt.title('Basic Heatmap')

# Customized heatmap
plt.subplot(1, 2, 2)
im = plt.imshow(data, cmap='coolwarm', aspect='auto')
plt.colorbar(im, shrink=0.8)
plt.title('Customized Heatmap')

# Add value annotations
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        plt.text(j, i, f'{data[i, j]:.2f}',
                 ha='center', va='center', color='black')

plt.tight_layout()
plt.show()

5.2 Contour Plots

Contour and Contourf


# Create meshgrid
x = np.linspace(-3, 3, 50)
y = np.linspace(-3, 3, 50)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X**2 + Y**2))

plt.figure(figsize=(15, 5))

# Contour lines
plt.subplot(1, 3, 1)
contour = plt.contour(X, Y, Z, levels=10)
plt.clabel(contour, inline=True, fontsize=8)
plt.title('Contour Lines')

# Filled contour
plt.subplot(1, 3, 2)
plt.contourf(X, Y, Z, levels=20, cmap='viridis')
plt.colorbar(label='Values')
plt.title('Filled Contour')

# Combined
plt.subplot(1, 3, 3)
plt.contourf(X, Y, Z, levels=20, cmap='viridis', alpha=0.7)
contour = plt.contour(X, Y, Z, levels=10, colors='black', alpha=0.4)
plt.clabel(contour, inline=True, fontsize=8)
plt.colorbar(label='Values')
plt.title('Combined')

plt.tight_layout()
plt.show()

5.3 Box Plots

Box Plot Syntax


plt.boxplot(x, labels=None, **kwargs)  # Matplotlib 3.9+ renames labels to tick_labels

Parameters and Implementation


# Generate sample data
np.random.seed(42)
data1 = np.random.normal(100, 10, 200)
data2 = np.random.normal(90, 20, 200)
data3 = np.random.normal(80, 5, 200)
data = [data1, data2, data3]

plt.figure(figsize=(12, 6))

# Basic box plot


plt.subplot(1, 2, 1)
bp = plt.boxplot(data, labels=['Group A', 'Group B', 'Group C'])
plt.title('Basic Box Plot')
plt.ylabel('Values')

# Customized box plot


plt.subplot(1, 2, 2)
bp = plt.boxplot(data,
                 labels=['Group A', 'Group B', 'Group C'],
                 patch_artist=True,  # Fill boxes
                 notch=True,         # Notched boxes
                 showmeans=True,     # Show means
                 meanline=True)      # Mean as line

# Color the boxes
colors = ['lightblue', 'lightgreen', 'lightcoral']
for patch, color in zip(bp['boxes'], colors):
    patch.set_facecolor(color)

plt.title('Customized Box Plot')


plt.ylabel('Values')

plt.tight_layout()
plt.show()
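
The numbers a box plot draws (median, quartiles, whiskers, outliers) can be inspected without plotting through `matplotlib.cbook.boxplot_stats`. The sketch below is an added illustration on synthetic data; it uses the default whisker rule of 1.5 times the interquartile range.

```python
import numpy as np
from matplotlib import cbook

rng = np.random.default_rng(42)
group = rng.normal(100, 10, 200)

# boxplot_stats returns one dictionary of statistics per dataset
stats = cbook.boxplot_stats(group)[0]

print(f"median = {stats['med']:.2f}")
print(f"quartiles = ({stats['q1']:.2f}, {stats['q3']:.2f}), IQR = {stats['iqr']:.2f}")
print(f"whiskers = ({stats['whislo']:.2f}, {stats['whishi']:.2f})")
print(f"outliers drawn as fliers: {len(stats['fliers'])}")
```

This can help when deciding whether notches or mean markers add information for a given dataset.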

5.4 Violin Plots

Violin Plot Implementation


plt.figure(figsize=(10, 6))

# Violin plot using Matplotlib's built-in violinplot (seaborn is not required)
# Reuses the three-group `data` list from the box plot example
positions = [1, 2, 3]
parts = plt.violinplot(data, positions=positions, showmeans=True, showmedians=True)

# Customize colors
for pc in parts['bodies']:
    pc.set_facecolor('lightblue')
    pc.set_alpha(0.7)

plt.xticks(positions, ['Group A', 'Group B', 'Group C'])


plt.title('Violin Plot')
plt.ylabel('Values')
plt.show()

5.5 Error Bars

Error Bar Implementation


# Sample data with errors
x = np.arange(0, 10, 1)
y = np.exp(-x/10.0)
yerr = 0.1 * y
xerr = 0.1

plt.figure(figsize=(12, 6))

# Basic error bars


plt.subplot(1, 2, 1)
plt.errorbar(x, y, yerr=yerr, xerr=xerr, fmt='o-')
plt.title('Basic Error Bars')
plt.xlabel('X values')
plt.ylabel('Y values')

# Customized error bars


plt.subplot(1, 2, 2)
plt.errorbar(x, y, yerr=yerr, xerr=xerr,
fmt='s-', # Square markers, solid line
capsize=5, # Error bar cap size
capthick=2, # Cap thickness
ecolor='red', # Error bar color
elinewidth=2, # Error bar width
alpha=0.7) # Transparency

plt.title('Customized Error Bars')


plt.xlabel('X values')
plt.ylabel('Y values')

plt.tight_layout()
plt.show()
6. Styling and Themes
6.1 Built-in Styles

Available Styles
# List available styles
print(plt.style.available)

# Using styles
plt.style.use('seaborn-v0_8') # Seaborn style
plt.style.use('ggplot') # ggplot style
plt.style.use('classic') # Classic matplotlib
plt.style.use('dark_background') # Dark theme

# Temporary style context
with plt.style.context('seaborn-v0_8'):
    plt.plot(x, y)
    plt.show()

6.2 Custom Styling

RC Parameters
# Modify global parameters
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['font.size'] = 14
plt.rcParams['lines.linewidth'] = 2
plt.rcParams['grid.alpha'] = 0.3

# Or use rc_context for temporary changes
with plt.rc_context({'font.size': 16, 'lines.linewidth': 3}):
    plt.plot(x, y)
    plt.show()
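
The difference between assigning to `plt.rcParams` and using `plt.rc_context` is that the latter restores the previous values on exit. The sketch below is an added check of that behavior, assuming a headless Agg backend.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

before = plt.rcParams["lines.linewidth"]

with plt.rc_context({"lines.linewidth": 3}):
    fig, ax = plt.subplots()
    (thick,) = ax.plot([0, 1], [0, 1])        # Picks up the temporary setting
    inside = plt.rcParams["lines.linewidth"]

after = plt.rcParams["lines.linewidth"]        # Restored when the block exits
print(before, inside, after, thick.get_linewidth())
plt.close(fig)
```

Artists created inside the block keep the style they were created with, so only the global defaults revert.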

Font Customization
# Font properties
font_title = {'family': 'serif',
              'weight': 'bold',
              'size': 16}

font_label = {'family': 'sans-serif',
              'weight': 'normal',
              'size': 12}

plt.figure(figsize=(10, 6))
plt.plot(x, np.sin(x))
plt.title('Custom Font Title', fontdict=font_title)
plt.xlabel('X Label', fontdict=font_label)
plt.ylabel('Y Label', fontdict=font_label)
plt.show()
7. Annotations and Text
7.1 Text and Annotations

Adding Text
plt.figure(figsize=(10, 6))
plt.plot(x, np.sin(x))

# Simple text
plt.text(5, 0.5, 'Simple Text', fontsize=12)

# Text with box


plt.text(7, -0.5, 'Boxed Text', fontsize=12,
bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.8))

# Annotation with arrow


plt.annotate('Maximum', xy=(np.pi/2, 1), xytext=(3, 1.2),
arrowprops=dict(arrowstyle='->', color='red'),
fontsize=12, ha='center')

plt.title('Text and Annotations')


plt.show()

Advanced Annotations
plt.figure(figsize=(12, 8))
x = np.linspace(0, 2*np.pi, 100)
y = np.sin(x)
plt.plot(x, y)

# Different arrow styles
arrow_styles = ['->', '-|>', '<->', '<|-|>']
positions = [(1, 0.8), (2, 0.5), (3, 0.2), (4, -0.2)]

for i, (pos, style) in enumerate(zip(positions, arrow_styles)):
    plt.annotate(f'Style {i+1}', xy=pos, xytext=(pos[0], pos[1]+0.3),
                 arrowprops=dict(arrowstyle=style, color=f'C{i}'),
                 fontsize=10, ha='center')

plt.title('Different Arrow Styles')


plt.show()

7.2 Mathematical Expressions

LaTeX in Matplotlib
plt.figure(figsize=(10, 6))

# Mathematical expressions
plt.plot(x, np.sin(x), label=r'$y = \sin(x)$')
plt.plot(x, np.cos(x), label=r'$y = \cos(x)$')

plt.title(r'Trigonometric Functions: $f(x) = \sin(x)$ and $g(x) = \cos(x)$')


plt.xlabel(r'$x$ (radians)')
plt.ylabel(r'$f(x)$')

# Complex mathematical expression


plt.text(5, 0.7, r'$\int_0^{2\pi} \sin(x) dx = 0$', fontsize=14,
bbox=dict(boxstyle='round', facecolor='lightblue'))

plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
8. 3D Plotting
8.1 3D Setup
from mpl_toolkits.mplot3d import Axes3D  # Only required on Matplotlib < 3.2; newer versions register '3d' automatically

# Creating 3D subplot
fig = plt.figure(figsize=(12, 9))
ax = fig.add_subplot(111, projection='3d')

8.2 3D Plot Types

3D Line and Scatter Plots


fig = plt.figure(figsize=(15, 5))

# 3D Line plot
ax1 = fig.add_subplot(1, 3, 1, projection='3d')
t = np.linspace(0, 4*np.pi, 100)
x = np.cos(t)
y = np.sin(t)
z = t
ax1.plot(x, y, z)
ax1.set_title('3D Line Plot')

# 3D Scatter plot
ax2 = fig.add_subplot(1, 3, 2, projection='3d')
n = 100
x = np.random.randn(n)
y = np.random.randn(n)
z = np.random.randn(n)
colors = np.random.rand(n)
ax2.scatter(x, y, z, c=colors, marker='o')
ax2.set_title('3D Scatter Plot')

# 3D Surface plot
ax3 = fig.add_subplot(1, 3, 3, projection='3d')
x = np.linspace(-5, 5, 50)
y = np.linspace(-5, 5, 50)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X**2 + Y**2))
surf = ax3.plot_surface(X, Y, Z, cmap='viridis', alpha=0.7)
ax3.set_title('3D Surface Plot')

plt.tight_layout()
plt.show()
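
The viewing angle of a 3D axes can be set with `view_init`. The sketch below is an added example that renders the same surface from two camera positions; the specific angles are arbitrary choices, and a headless Agg backend is assumed.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-5, 5, 50)
X, Y = np.meshgrid(x, x)
Z = np.sin(np.sqrt(X**2 + Y**2))

fig = plt.figure(figsize=(10, 5))
for idx, (elev, azim) in enumerate([(30, -60), (60, 30)], start=1):
    ax = fig.add_subplot(1, 2, idx, projection="3d")
    ax.plot_surface(X, Y, Z, cmap="viridis")
    ax.view_init(elev=elev, azim=azim)   # Camera elevation and azimuth in degrees
    ax.set_title(f"elev={elev}, azim={azim}")
plt.close(fig)
```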
9. Interactive Features
9.1 Interactive Backend
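# Enable an interactive backend (run one of these in a notebook cell;
# keep comments off the magic line, since they may be parsed as arguments)
# Jupyter widgets backend (requires the ipympl package)
%matplotlib widget
# or the classic Jupyter Notebook backend
%matplotlib notebook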

9.2 Event Handling


def onclick(event):
    # xdata and ydata are None when the click lands outside the axes
    if event.inaxes is None:
        return
    print(f'Button: {event.button}, x: {event.xdata:.2f}, y: {event.ydata:.2f}')

fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# Connect event handler


cid = fig.canvas.mpl_connect('button_press_event', onclick)
plt.show()

# Disconnect when done


# fig.canvas.mpl_disconnect(cid)

10. Saving and Exporting


10.1 Saving Figures

Save Methods
plt.figure(figsize=(10, 6))
plt.plot(x, np.sin(x))
plt.title('Sample Plot for Saving')

# Different formats and options


plt.savefig('plot.png', # Filename
dpi=300, # Resolution
bbox_inches='tight', # Tight bounding box
facecolor='white', # Background color
transparent=False, # Transparent background
format='png') # File format

# Other formats
plt.savefig('plot.pdf', format='pdf', bbox_inches='tight')
plt.savefig('plot.svg', format='svg', bbox_inches='tight')
plt.savefig('plot.eps', format='eps', bbox_inches='tight')
plt.show()

Format-Specific Options
# PNG with high DPI for publications
plt.savefig('high_res.png', dpi=600, bbox_inches='tight',
facecolor='white', edgecolor='none')

# PDF with metadata


plt.savefig('plot_with_metadata.pdf',
bbox_inches='tight',
metadata={'Title': 'My Plot', 'Author': 'Data Scientist'})

# SVG for web use


plt.savefig('web_plot.svg', format='svg', bbox_inches='tight')
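
`savefig` also accepts a file-like object, which is useful when the image should be sent over a network or embedded rather than written to disk. The following is an added sketch under that assumption, using an in-memory `io.BytesIO` buffer and a headless Agg backend.

```python
import io
import matplotlib
matplotlib.use("Agg")  # Assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 1, 2], [0, 1, 4])

# Write the PNG into memory instead of a file; format must be given explicitly
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=100, bbox_inches="tight")
png_bytes = buf.getvalue()

print(len(png_bytes), "bytes, signature:", png_bytes[:4])
plt.close(fig)
```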

10.2 Multiple Figures Management


# Create multiple figures
fig1 = plt.figure(figsize=(8, 6))
plt.plot(x, np.sin(x))
plt.title('Figure 1')

fig2 = plt.figure(figsize=(8, 6))


plt.plot(x, np.cos(x))
plt.title('Figure 2')

# Save specific figures


fig1.savefig('sine_plot.png', dpi=300, bbox_inches='tight')
fig2.savefig('cosine_plot.png', dpi=300, bbox_inches='tight')

# Make fig1 the current figure (note that plt.show() displays all open figures)
plt.figure(fig1.number)
plt.show()

# Close figures to save memory


plt.close(fig1)
plt.close(fig2)
# Or close all
plt.close('all')
11. Performance and Optimization
11.1 Large Dataset Handling

Efficient Plotting Techniques


# For large datasets, use sampling or aggregation
large_x = np.random.randn(1000000)
large_y = np.random.randn(1000000)

# Method 1: Sample data


sample_size = 10000
indices = np.random.choice(len(large_x), sample_size, replace=False)
plt.scatter(large_x[indices], large_y[indices], alpha=0.5)
plt.title('Sampled Large Dataset')
plt.show()

# Method 2: Use hexbin for density plots


plt.figure(figsize=(10, 6))
plt.hexbin(large_x, large_y, gridsize=50, cmap='Blues')
plt.colorbar(label='Count')
plt.title('Hexbin Plot of Large Dataset')
plt.show()

# Method 3: 2D histogram
plt.figure(figsize=(10, 6))
plt.hist2d(large_x, large_y, bins=100, cmap='Blues')
plt.colorbar(label='Count')
plt.title('2D Histogram of Large Dataset')
plt.show()
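
When the same large dataset is plotted repeatedly, one option is to aggregate once with NumPy and draw only the grid of counts. The sketch below is an added illustration of that idea with `np.histogram2d` and `pcolormesh`; the dataset size and bin count are arbitrary, and a headless Agg backend is assumed.

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
xs = rng.standard_normal(n)
ys = rng.standard_normal(n)

# Aggregate once; afterwards only a 100 x 100 grid has to be drawn
counts, xedges, yedges = np.histogram2d(xs, ys, bins=100)

fig, ax = plt.subplots(figsize=(6, 5))
# histogram2d indexes counts as [x, y]; pcolormesh expects rows along y
mesh = ax.pcolormesh(xedges, yedges, counts.T, cmap="Blues")
fig.colorbar(mesh, ax=ax, label="Count")
ax.set_title("Pre-aggregated 2D histogram")
plt.close(fig)
```

The `counts` array can be cached or reused for several figures without touching the raw points again.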

11.2 Animation Basics

Simple Animation
from matplotlib.animation import FuncAnimation

# Set up figure and axis


fig, ax = plt.subplots(figsize=(10, 6))
ax.set_xlim(0, 2*np.pi)
ax.set_ylim(-1.5, 1.5)

line, = ax.plot([], [], 'b-')


ax.set_title('Animated Sine Wave')
ax.set_xlabel('x')
ax.set_ylabel('sin(x)')

# Animation function
def animate(frame):
    x = np.linspace(0, 2*np.pi, 1000)
    y = np.sin(x + frame/10)
    line.set_data(x, y)
    return line,

# Create animation
anim = FuncAnimation(fig, animate, frames=200, interval=50, blit=True)
plt.show()

# Save animation as a GIF (the 'pillow' writer requires the Pillow package;
# saving to MP4 would require ffmpeg instead)
# anim.save('sine_wave.gif', writer='pillow', fps=20)
12. Working with DataFrames
12.1 Pandas Integration

Plotting from DataFrames


# Create sample DataFrame
dates = pd.date_range('2023-01-01', periods=100)
df = pd.DataFrame({
    'date': dates,
    'value1': np.cumsum(np.random.randn(100)) + 100,
    'value2': np.cumsum(np.random.randn(100)) + 50,
    'category': np.random.choice(['A', 'B', 'C'], 100)
})

# Direct plotting from DataFrame


fig, axes = plt.subplots(2, 2, figsize=(15, 10))

# Time series plot


axes[0, 0].plot(df['date'], df['value1'], label='Value 1')
axes[0, 0].plot(df['date'], df['value2'], label='Value 2')
axes[0, 0].set_title('Time Series')
axes[0, 0].legend()
axes[0, 0].tick_params(axis='x', rotation=45)

# Histogram
axes[0, 1].hist(df['value1'], bins=20, alpha=0.7, label='Value 1')
axes[0, 1].hist(df['value2'], bins=20, alpha=0.7, label='Value 2')
axes[0, 1].set_title('Histograms')
axes[0, 1].legend()

# Box plot by category


categories = df['category'].unique()
data_by_cat = [df[df['category'] == cat]['value1'].values for cat in categories]
axes[1, 0].boxplot(data_by_cat, labels=categories)
axes[1, 0].set_title('Box Plot by Category')

# Scatter plot
scatter = axes[1, 1].scatter(df['value1'], df['value2'],
c=df.index, cmap='viridis', alpha=0.7)
axes[1, 1].set_xlabel('Value 1')
axes[1, 1].set_ylabel('Value 2')
axes[1, 1].set_title('Scatter Plot')
plt.colorbar(scatter, ax=axes[1, 1])

plt.tight_layout()
plt.show()

12.2 Grouped Data Visualization


# Grouped data analysis
grouped_data = df.groupby('category')['value1'].agg(['mean', 'std'])

# Bar plot with error bars


fig, ax = plt.subplots(figsize=(10, 6))
x_pos = np.arange(len(grouped_data.index))

bars = ax.bar(x_pos, grouped_data['mean'], yerr=grouped_data['std'],
              capsize=5, alpha=0.7, color=['red', 'green', 'blue'])

ax.set_xlabel('Category')
ax.set_ylabel('Mean Value')
ax.set_title('Mean Values by Category with Error Bars')
ax.set_xticks(x_pos)
ax.set_xticklabels(grouped_data.index)

# Add value labels on bars
for i, (bar, mean_val) in enumerate(zip(bars, grouped_data['mean'])):
    ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 1,
            f'{mean_val:.1f}', ha='center', va='bottom')

plt.show()
13. Statistical Plots
13.1 Distribution Plots

Q-Q Plots and Probability Plots


from scipy import stats

# Generate sample data


np.random.seed(42)
normal_data = np.random.normal(0, 1, 1000)
uniform_data = np.random.uniform(0, 1, 1000)
exponential_data = np.random.exponential(1, 1000)

fig, axes = plt.subplots(2, 3, figsize=(18, 12))

# Histograms with theoretical distributions


axes[0, 0].hist(normal_data, bins=50, density=True, alpha=0.7, color='blue')
x = np.linspace(-4, 4, 100)
axes[0, 0].plot(x, stats.norm.pdf(x, 0, 1), 'r-', linewidth=2, label='Theoretical')
axes[0, 0].set_title('Normal Distribution')
axes[0, 0].legend()

axes[0, 1].hist(uniform_data, bins=50, density=True, alpha=0.7, color='green')


axes[0, 1].axhline(y=1, color='red', linestyle='-', linewidth=2, label='Theoretical')
axes[0, 1].set_title('Uniform Distribution')
axes[0, 1].legend()

axes[0, 2].hist(exponential_data, bins=50, density=True, alpha=0.7, color='orange')


x = np.linspace(0, 6, 100)
axes[0, 2].plot(x, stats.expon.pdf(x, scale=1), 'r-', linewidth=2, label='Theoretical')
axes[0, 2].set_title('Exponential Distribution')
axes[0, 2].legend()

# Q-Q plots
stats.probplot(normal_data, dist="norm", plot=axes[1, 0])
axes[1, 0].set_title('Q-Q Plot: Normal Data vs Normal Distribution')

stats.probplot(uniform_data, dist="norm", plot=axes[1, 1])


axes[1, 1].set_title('Q-Q Plot: Uniform Data vs Normal Distribution')

stats.probplot(exponential_data, dist="norm", plot=axes[1, 2])


axes[1, 2].set_title('Q-Q Plot: Exponential Data vs Normal Distribution')

plt.tight_layout()
plt.show()

13.2 Correlation and Regression

Correlation Heatmap and Regression Plots


# Create correlated data
np.random.seed(42)
n_vars = 5
n_obs = 200

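# Generate a valid correlation matrix (a random symmetric matrix is not
# guaranteed to be positive semi-definite, so build one from A @ A.T)
A = np.random.rand(n_vars, n_vars)
cov = A @ A.T                        # Symmetric and positive semi-definite
d = np.sqrt(np.diag(cov))
corr_matrix = cov / np.outer(d, d)   # Rescale so the diagonal is 1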

# Generate multivariate normal data


data = np.random.multivariate_normal(np.zeros(n_vars), corr_matrix, n_obs)
df_corr = pd.DataFrame(data, columns=[f'Var{i+1}' for i in range(n_vars)])

# Calculate correlation matrix


corr = df_corr.corr()

fig, axes = plt.subplots(1, 2, figsize=(15, 6))

# Correlation heatmap
im = axes[0].imshow(corr, cmap='coolwarm', vmin=-1, vmax=1)
axes[0].set_xticks(range(len(corr.columns)))
axes[0].set_yticks(range(len(corr.columns)))
axes[0].set_xticklabels(corr.columns, rotation=45)
axes[0].set_yticklabels(corr.columns)
axes[0].set_title('Correlation Heatmap')

# Add correlation values to cells
for i in range(len(corr.columns)):
    for j in range(len(corr.columns)):
        text = axes[0].text(j, i, f'{corr.iloc[i, j]:.2f}',
                            ha="center", va="center", color="black")

plt.colorbar(im, ax=axes[0])

# Regression plot
x = df_corr['Var1']
y = df_corr['Var2']
axes[1].scatter(x, y, alpha=0.6)

# Add regression line


slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
line = slope * x + intercept
axes[1].plot(x, line, 'r-', linewidth=2,
label=f'y = {slope:.2f}x + {intercept:.2f}\nR² = {r_value**2:.3f}')

axes[1].set_xlabel('Var1')
axes[1].set_ylabel('Var2')
axes[1].set_title('Regression Plot')
axes[1].legend()

plt.tight_layout()
plt.show()
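
The fitted line and the R² value in the legend can also be computed with NumPy alone, which makes the definition of R² explicit. This is an added sketch on synthetic data (true slope 2, intercept 1, chosen here for illustration), not the method used in the listing above.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=200)

# Least-squares straight line: polyfit returns [slope, intercept] for degree 1
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

# R^2 = 1 - SS_res / SS_tot
ss_res = np.sum((y - y_hat) ** 2)      # Residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)   # Total sum of squares
r_squared = 1 - ss_res / ss_tot

print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r_squared:.3f}")
```

For a straight-line fit with an intercept, this R² equals the squared Pearson correlation between `x` and `y`, which is what `r_value**2` reports.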
14. Advanced Customization Techniques
14.1 Custom Color Schemes

Creating Custom Colormaps


from matplotlib.colors import LinearSegmentedColormap, ListedColormap

# Create custom colormap from colors


colors = ['#FF0000', '#FFFF00', '#00FF00', '#00FFFF', '#0000FF']
custom_cmap = LinearSegmentedColormap.from_list('custom', colors, N=256)

# Create discrete colormap


discrete_colors = ['red', 'blue', 'green', 'orange', 'purple']
discrete_cmap = ListedColormap(discrete_colors)

# Test custom colormaps


fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# Continuous custom colormap


x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X**2 + Y**2))

im1 = axes[0].contourf(X, Y, Z, levels=20, cmap=custom_cmap)


axes[0].set_title('Custom Continuous Colormap')
plt.colorbar(im1, ax=axes[0])

# Discrete colormap
categories = np.random.randint(0, 5, (20, 20))
im2 = axes[1].imshow(categories, cmap=discrete_cmap)
axes[1].set_title('Custom Discrete Colormap')
plt.colorbar(im2, ax=axes[1], ticks=range(5))

# Color cycle for line plots
# (set on the axes directly: changing rcParams['axes.prop_cycle'] at this
# point would not affect axes that already exist)
axes[2].set_prop_cycle(color=discrete_colors)
for i in range(5):
    axes[2].plot(np.random.randn(50).cumsum(), label=f'Series {i+1}')
axes[2].set_title('Custom Color Cycle')
axes[2].legend()

plt.tight_layout()
plt.show()
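
A colormap object is callable, so a custom map can be checked by sampling it at a few positions. The sketch below is an added illustration that rebuilds the continuous colormap from the listing above and inspects its endpoints and midpoint.

```python
import numpy as np
from matplotlib.colors import LinearSegmentedColormap, to_rgba

colors = ["#FF0000", "#FFFF00", "#00FF00", "#00FFFF", "#0000FF"]
custom_cmap = LinearSegmentedColormap.from_list("custom", colors, N=256)

# Calling a colormap with values in [0, 1] returns one RGBA row per value
start, middle, end = custom_cmap([0.0, 0.5, 1.0])

print("start :", np.round(start, 2))    # First color in the list (red)
print("middle:", np.round(middle, 2))   # Close to the third color (green)
print("end   :", np.round(end, 2))      # Last color in the list (blue)
```

The midpoint is only approximately pure green because the 256-entry lookup table does not contain an entry at exactly 0.5.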

14.2 Advanced Text and Annotation Formatting

Rich Text and Mathematical Expressions


fig, ax = plt.subplots(figsize=(12, 8))

# Sample plot
x = np.linspace(0, 10, 100)
y = np.exp(-x/5) * np.cos(2*x)
ax.plot(x, y, 'b-', linewidth=2)

# Various text formatting options
ax.text(2, 0.8, r'$e^{-x/5}\cos(2x)$', fontsize=20,
        bbox=dict(boxstyle='round,pad=0.3', facecolor='yellow', alpha=0.7))

# Multi-line text with different styles
multiline_text = r'''This is a multi-line annotation
$\alpha = \frac{\beta}{\gamma}$
Bold: $\mathbf{vector}$
Italic: $\mathit{variable}$'''

ax.text(6, 0.5, multiline_text, fontsize=12, verticalalignment='top',
        bbox=dict(boxstyle='square,pad=0.5', facecolor='lightblue', alpha=0.8))

# Fancy arrow annotation (the curve has a local maximum near x = 3.09, y = 0.54)
ax.annotate('Local Maximum', xy=(3.09, 0.54), xytext=(3, 0.9),
            fontsize=12, ha='center',
            arrowprops=dict(arrowstyle='->', lw=2, color='red',
                            connectionstyle='arc3,rad=0.2'))

# Custom annotation box


ax.annotate('Exponential Decay', xy=(8, 0.1), xytext=(5, -0.3),
fontsize=12, ha='center',
bbox=dict(boxstyle='round,pad=0.3', facecolor='lightgreen'),
arrowprops=dict(arrowstyle='-|>', lw=2, color='green'))

ax.set_title(r'Function: $f(x) = e^{-x/5}\cos(2x)$', fontsize=16)


ax.set_xlabel('x', fontsize=14)
ax.set_ylabel('f(x)', fontsize=14)
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

14.3 Custom Tick Formatters

Advanced Tick Formatting


from matplotlib.ticker import FuncFormatter, MultipleLocator
from matplotlib.dates import DateFormatter, MonthLocator
import datetime as dt

fig, axes = plt.subplots(2, 2, figsize=(16, 12))

# Custom number formatting
def currency_formatter(x, pos):
    return f'${x:.0f}K'

def percentage_formatter(x, pos):
    return f'{x:.1f}%'

# Example 1: Financial data
x = np.arange(2020, 2025)
revenue = [120, 135, 150, 175, 200]
axes[0, 0].plot(x, revenue, 'o-', linewidth=2, markersize=8)
axes[0, 0].xaxis.set_major_locator(MultipleLocator(1))  # one tick per year, no fractional years
axes[0, 0].yaxis.set_major_formatter(FuncFormatter(currency_formatter))
axes[0, 0].set_title('Revenue Growth')
axes[0, 0].set_ylabel('Revenue')
axes[0, 0].grid(True, alpha=0.3)

# Example 2: Percentage data


categories = ['Q1', 'Q2', 'Q3', 'Q4']
growth_rates = [5.2, 7.8, 12.1, 15.6]
axes[0, 1].bar(categories, growth_rates, color='lightgreen')
axes[0, 1].yaxis.set_major_formatter(FuncFormatter(percentage_formatter))
axes[0, 1].set_title('Quarterly Growth Rates')
axes[0, 1].set_ylabel('Growth Rate')

# Example 3: Scientific notation


x = np.logspace(1, 6, 50)
y = 1/x
axes[1, 0].loglog(x, y)
axes[1, 0].set_title('Power Law Relationship')
axes[1, 0].set_xlabel('Input (log scale)')
axes[1, 0].set_ylabel('Output (log scale)')
axes[1, 0].grid(True, which="both", ls="-", alpha=0.3)

# Example 4: Date formatting


dates = [dt.datetime(2023, i, 1) for i in range(1, 13)]
values = np.random.randn(12).cumsum() + 100
axes[1, 1].plot(dates, values, 'o-')
axes[1, 1].xaxis.set_major_locator(MonthLocator(interval=2))
axes[1, 1].xaxis.set_major_formatter(DateFormatter('%b\n%Y'))
axes[1, 1].set_title('Time Series with Date Formatting')
axes[1, 1].tick_params(axis='x', rotation=0)

plt.tight_layout()
plt.show()
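
For simple cases like the currency and percentage labels above, Matplotlib's built-in formatters can replace a hand-written `FuncFormatter`. The following is a minimal sketch under stated assumptions: the data is made up, and the `Agg` backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, not needed in a notebook
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter, StrMethodFormatter

fig, (ax_rev, ax_growth) = plt.subplots(1, 2, figsize=(10, 4))

# Currency labels from a format string; x is the tick value
ax_rev.plot([2020, 2021, 2022, 2023, 2024], [120, 135, 150, 175, 200], 'o-')
currency = StrMethodFormatter('${x:,.0f}K')
ax_rev.yaxis.set_major_formatter(currency)

# Percentage labels; xmax=100 means the data is already in percent units
ax_growth.bar(['Q1', 'Q2', 'Q3', 'Q4'], [5.2, 7.8, 12.1, 15.6])
percent = PercentFormatter(xmax=100, decimals=1)
ax_growth.yaxis.set_major_formatter(percent)

fig.tight_layout()
fig.canvas.draw()
plt.close(fig)
```

`FuncFormatter` remains the better fit when the label depends on logic that a format string cannot express, such as switching units by magnitude.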

15. Best Practices and Common Patterns
15.1 Creating Publication-Ready Figures

Professional Figure Setup


def setup_publication_figure():
    """Set up Matplotlib for publication-quality figures."""
    plt.rcParams.update({
        'font.size': 12,
        'font.family': 'serif',
        # The second entry is used if Times New Roman is not installed
        'font.serif': ['Times New Roman', 'DejaVu Serif'],
        'axes.linewidth': 1.2,
        'axes.spines.top': False,
        'axes.spines.right': False,
        'xtick.direction': 'out',
        'ytick.direction': 'out',
        'xtick.major.size': 6,
        'xtick.minor.size': 3,
        'ytick.major.size': 6,
        'ytick.minor.size': 3,
        'legend.frameon': False,
        'figure.dpi': 300
    })

# Apply settings
setup_publication_figure()

# Create publication figure


fig, ax = plt.subplots(figsize=(8, 6))
x = np.linspace(0, 10, 100)
ax.plot(x, np.sin(x), label='sin(x)', linewidth=2)
ax.plot(x, np.cos(x), label='cos(x)', linewidth=2)

ax.set_xlabel('Time (s)')
ax.set_ylabel('Amplitude')
ax.set_title('Trigonometric Functions')
ax.legend(loc='upper right')
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('publication_figure.pdf', dpi=300, bbox_inches='tight')
plt.show()

# Reset to default
plt.rcdefaults()
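
The listing above changes `rcParams` globally and then resets everything with `plt.rcdefaults()`. A scoped alternative is `plt.rc_context`, which restores the previous settings when the block exits. This is a minimal sketch: the `publication_style` dictionary is an illustrative subset of the settings above, and the `Agg` backend is assumed only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

# Illustrative subset of the publication settings
publication_style = {
    'font.size': 12,
    'axes.spines.top': False,
    'axes.spines.right': False,
    'legend.frameon': False,
}

font_size_before = plt.rcParams['font.size']

with plt.rc_context(publication_style):
    # Settings apply only to figures created inside this block
    fig, ax = plt.subplots(figsize=(8, 6))
    x = np.linspace(0, 10, 100)
    ax.plot(x, np.sin(x), label='sin(x)')
    ax.legend()
    size_inside = plt.rcParams['font.size']
    top_spine_visible = ax.spines['top'].get_visible()

# Outside the block the earlier values are back
font_size_after = plt.rcParams['font.size']
plt.close(fig)
```

This avoids wiping unrelated customizations, which `plt.rcdefaults()` does as a side effect.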

15.2 Error Handling and Debugging

Common Issues and Solutions


# Common matplotlib issues and solutions

# 1. Memory management for multiple figures
def create_and_save_plots():
    for i in range(10):
        fig, ax = plt.subplots()
        ax.plot(np.random.randn(100))
        ax.set_title(f'Plot {i}')
        fig.savefig(f'plot_{i}.png')  # save this figure explicitly
        plt.close(fig)  # Important: close figure to free memory

# 2. Handling different data types
def robust_plotting(x_data, y_data):
    try:
        # Convert to float arrays so np.isnan works for lists and integer input
        x = np.asarray(x_data, dtype=float)
        y = np.asarray(y_data, dtype=float)

        # Check for valid data
        if len(x) != len(y):
            raise ValueError("x and y must have the same length")

        # Handle NaN values
        mask = ~(np.isnan(x) | np.isnan(y))
        x_clean = x[mask]
        y_clean = y[mask]

        if len(x_clean) == 0:
            raise ValueError("No valid data points")

        plt.figure()
        plt.plot(x_clean, y_clean)
        plt.show()

    except (TypeError, ValueError) as e:
        print(f"Plotting error: {e}")

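# 3. Backend issues
def check_backend():
    import matplotlib
    print(f"Current backend: {plt.get_backend()}")
    try:
        # Matplotlib 3.9+ lists built-in backends through the backend registry
        from matplotlib.backends import backend_registry
        available = backend_registry.list_builtin()
    except ImportError:
        # Older releases keep the list in rcsetup
        available = matplotlib.rcsetup.all_backends
    print(f"Available backends: {available}")

    # Switch backend if needed
    # plt.switch_backend('Agg') # For non-interactive use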
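
A quick way to confirm that figures are actually being released is to count the figures pyplot still tracks. This is a minimal sketch, assuming the `Agg` backend so it runs without a display; `leaked`, `open_before`, and `open_after` are illustrative names.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

plt.close('all')  # start from a clean slate

leaked = []
for i in range(3):
    fig, ax = plt.subplots()
    ax.plot(np.random.randn(100))
    leaked.append(fig)  # figures stay registered with pyplot until closed

open_before = len(plt.get_fignums())  # figures pyplot is still holding

plt.close('all')
open_after = len(plt.get_fignums())
```

If `plt.get_fignums()` keeps growing inside a loop, a `plt.close(fig)` call is missing somewhere.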

15.3 Performance Tips

Optimizing Matplotlib Performance


# Performance optimization techniques

# 1. Use appropriate plot types for data size
def efficient_plotting(data_size):
    x = np.random.randn(data_size)
    y = np.random.randn(data_size)

    if data_size < 10000:
        # Use scatter for small datasets
        plt.scatter(x, y, alpha=0.6)
    else:
        # Use hexbin or hist2d for large datasets
        plt.hexbin(x, y, gridsize=50)
        plt.colorbar()

# 2. Batch operations instead of looping
def batch_plotting():
    # Bad: Multiple plot calls
    # for i in range(n):
    #     plt.plot(x[i], y[i])

    # Good: Single call with 2D array
    data = np.random.randn(5, 100)
    plt.plot(data.T)  # plot draws one line per column, so transpose the rows

# 3. Use blitting for animations
from matplotlib.animation import FuncAnimation

def fast_animation():
    fig, ax = plt.subplots()
    line, = ax.plot([], [])
    # Fix the limits up front: blitting redraws only the line, not the axes
    ax.set_xlim(0, 2*np.pi)
    ax.set_ylim(-1.1, 1.1)
    x = np.linspace(0, 2*np.pi, 100)

    def animate(frame):
        # Update data
        y = np.sin(x + frame/10)
        line.set_data(x, y)
        return line,  # Return artists for blitting

    # Enable blitting for better performance
    anim = FuncAnimation(fig, animate, frames=200, blit=True, interval=50)
    return anim  # Keep a reference, or the animation is garbage-collected
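
When the number of lines grows into the hundreds, another batching option is a single `LineCollection` artist instead of one `Line2D` per series. This is a minimal sketch with made-up random-walk data, assuming the `Agg` backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection

rng = np.random.default_rng(0)
n_lines, n_points = 500, 100
x = np.linspace(0, 1, n_points)
ys = rng.standard_normal((n_lines, n_points)).cumsum(axis=1)

# Shape (n_lines, n_points, 2): one (x, y) polyline per series
segments = np.stack([np.broadcast_to(x, ys.shape), ys], axis=-1)

fig, ax = plt.subplots()
collection = LineCollection(segments, linewidths=0.5, alpha=0.3)
ax.add_collection(collection)
ax.autoscale()  # rescale explicitly; not every release does this for collections
n_artists = len(ax.collections)
n_line2d = len(ax.lines)
plt.close(fig)
```

All 500 series are drawn by one artist, which typically reduces per-artist overhead compared with 500 separate `plot` calls.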

16. Integration with Other Libraries
16.1 Seaborn Integration

Using Matplotlib with Seaborn


import seaborn as sns

# Set seaborn style but use matplotlib for plotting


sns.set_style("whitegrid")
sns.set_palette("husl")

# Load sample data (load_dataset downloads the file on first use,
# so it needs network access)
tips = sns.load_dataset("tips")

# Use matplotlib with seaborn styling


fig, axes = plt.subplots(2, 2, figsize=(15, 12))

# Matplotlib plots with seaborn styling


axes[0, 0].hist(tips['total_bill'], bins=20, alpha=0.7)
axes[0, 0].set_title('Total Bill Distribution')
axes[0, 0].set_xlabel('Total Bill ($)')

axes[0, 1].scatter(tips['total_bill'], tips['tip'], alpha=0.6)


axes[0, 1].set_title('Tips vs Total Bill')
axes[0, 1].set_xlabel('Total Bill ($)')
axes[0, 1].set_ylabel('Tip ($)')

# Box plot by category (one boxplot call, one box per day)
days = list(tips['day'].unique())
day_data = [tips.loc[tips['day'] == day, 'total_bill'].to_numpy() for day in days]
axes[1, 0].boxplot(day_data, positions=list(range(len(days))), widths=0.6)
axes[1, 0].set_xticks(range(len(days)))
axes[1, 0].set_xticklabels(days)
axes[1, 0].set_title('Total Bill by Day')

# Combined seaborn and matplotlib


sns.scatterplot(data=tips, x='total_bill', y='tip', hue='time', ax=axes[1, 1])
axes[1, 1].set_title('Tips by Time of Day')

plt.tight_layout()
plt.show()

16.2 Plotly Integration

Converting Between Matplotlib and Plotly


# Create matplotlib figure
fig_mpl, ax = plt.subplots()
x = np.linspace(0, 10, 100)
ax.plot(x, np.sin(x), label='sin(x)')
ax.plot(x, np.cos(x), label='cos(x)')
ax.set_title('Matplotlib Figure')
ax.legend()
# Note: Plotly integration would require additional setup
# This is conceptual code showing the pattern

Summary and Quick Reference


Essential Functions Quick Reference

# Basic plotting
plt.plot(x, y) # Line plot
plt.scatter(x, y) # Scatter plot
plt.bar(x, height) # Bar chart
plt.hist(x) # Histogram
plt.pie(x, labels=labels) # Pie chart

# Customization
plt.title('Title') # Set title
plt.xlabel('X Label') # X-axis label
plt.ylabel('Y Label') # Y-axis label
plt.legend() # Add legend
plt.grid(True) # Add grid
plt.xlim(0, 10) # Set x limits
plt.ylim(0, 10) # Set y limits

# Subplots
fig, axes = plt.subplots(2, 2) # Create subplots
plt.subplot(2, 2, 1) # Select subplot

# Saving
plt.savefig('plot.png') # Save figure
plt.show() # Display figure
plt.close() # Close figure
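
Most of the pyplot calls above have a direct equivalent on the `Figure` and `Axes` objects returned by `plt.subplots()`. This is a minimal sketch of that mapping, assuming the `Agg` backend and an in-memory buffer so it runs without a display or a writable directory; a filename such as 'plot.png' works the same way.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 50)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label='sin(x)')  # plt.plot
ax.set_title('Title')                  # plt.title
ax.set_xlabel('X Label')               # plt.xlabel
ax.set_ylabel('Y Label')               # plt.ylabel
ax.legend()                            # plt.legend
ax.grid(True)                          # plt.grid
ax.set_xlim(0, 10)                     # plt.xlim
ax.set_ylim(-1.5, 1.5)                 # plt.ylim
title, xlim = ax.get_title(), ax.get_xlim()

buffer = io.BytesIO()
fig.savefig(buffer, format='png')      # plt.savefig
plt.close(fig)                         # plt.close
```

The object-oriented form is usually easier to follow once a figure has more than one subplot, because every call names the axes it acts on.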

Common Parameters

●​ figsize: Figure size (width, height) in inches
●​ dpi: Resolution in dots per inch
●​ alpha: Transparency (0-1)
●​ color or c: Color specification
●​ linewidth or lw: Line width
●​ linestyle or ls: Line style ('-', '--', '-.', ':')
●​ marker: Marker style ('o', 's', '^', etc.)
●​ markersize or ms: Marker size
●​ label: Legend label
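
The parameters listed above can be combined in a single call. This is a minimal sketch with made-up data, assuming the `Agg` backend so it runs without a display; the short aliases `lw`, `ls`, and `ms` are used in place of the long names.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 20)

# figsize is in inches, so 6 x 4 inches at 100 dpi gives a 600 x 400 pixel canvas
fig, ax = plt.subplots(figsize=(6, 4), dpi=100)
line, = ax.plot(x, np.sin(x), color='tab:blue', lw=2, ls='--',
                marker='o', ms=5, alpha=0.8, label='sin(x)')
ax.legend()  # picks up the label given to plot()
width_px, height_px = fig.canvas.get_width_height()
plt.close(fig)
```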

This guide covers the main parts of Matplotlib's pyplot interface for data visualization, from basic
plot types through styling, tick formatting, and integration with other libraries. Work through the
examples and add the more advanced techniques as you become comfortable with the basics.
