PCA & Image Processing

The document outlines a Python code implementation for handwritten digit classification using the MNIST dataset, employing Principal Component Analysis (PCA) for dimensionality reduction and Logistic Regression for classification. It includes steps for importing libraries, loading and visualizing the dataset, reducing dimensions, splitting data, training the model, and evaluating accuracy, achieving around 91-92% accuracy. Additionally, it describes image preprocessing techniques to convert custom images into the MNIST format for potential machine learning applications.

Uploaded by

dagiabi51

PCA

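from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

# Download MNIST: 70,000 images, each flattened to 784 pixel values
data = fetch_openml("mnist_784", version=1)
x, y = data.data, data.target

# Show the image at row index 50 as a 28x28 grayscale picture
plt.imshow(x.iloc[50].values.reshape(28, 28), cmap="gray")
plt.show()

# Reduce the 784 pixel features to 50 principal components
pca = PCA(n_components=50)
x_reduced = pca.fit_transform(x)

# Hold out 20% of the samples for testing
x_train, x_test, y_train, y_test = train_test_split(
    x_reduced, y, random_state=42, test_size=0.2
)

# Train multinomial logistic regression
# (multi_class is deprecated in scikit-learn >= 1.5 and can be dropped there)
lg = LogisticRegression(max_iter=1000, solver="lbfgs", multi_class="multinomial")
lg.fit(x_train, y_train)

# Evaluate: fraction of test digits classified correctly
predicted = lg.predict(x_test)
accuracy = accuracy_score(y_test, predicted)
print(accuracy)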

This code demonstrates how to perform handwritten digit classification
using the MNIST dataset with Principal Component Analysis (PCA) for
dimensionality reduction and Logistic Regression for classification. Here's
a detailed breakdown:

1. Importing Libraries
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

• fetch_openml: Downloads datasets from OpenML
• train_test_split: Splits data into training and test sets
• LogisticRegression: The classification model we'll use
• accuracy_score: Evaluates model performance
• PCA: Performs dimensionality reduction
• matplotlib.pyplot: Visualizes the digits

2. Loading the MNIST Dataset

data = fetch_openml("mnist_784", version=1)
x, y = data.data, data.target

• MNIST contains 70,000 handwritten digit images (0-9)
• Each image is 28×28 pixels, flattened into a 784-dimensional feature vector
• x contains the pixel data and y contains the digit labels; with recent scikit-learn defaults x is a pandas DataFrame and the labels in y are strings ("0" to "9"), not integers
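To make the 784-dimensional layout concrete, here is a minimal sketch that uses a synthetic 28×28 array rather than the real download. The array `image` and its contents are made up for illustration; only the reshaping logic mirrors what the MNIST loader has already done to each digit.

```python
import numpy as np

# Hypothetical stand-in for one digit: a 28x28 grid of 8-bit intensities.
image = (np.arange(28 * 28) % 256).astype(np.uint8).reshape(28, 28)

# Flatten row by row into a single 784-length feature vector.
flat = image.reshape(-1)

# Reshaping back recovers the original grid exactly.
restored = flat.reshape(28, 28)

# With row-major order, the pixel at (row, col) sits at index row * 28 + col.
row, col = 3, 5
same_pixel = flat[row * 28 + col] == image[row, col]

print(flat.shape, restored.shape, same_pixel)
```

The same `reshape(28, 28)` call is what turns a row of the dataset back into a picture for plotting.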

3. Visualizing a Sample Digit

plt.imshow(x.iloc[50].values.reshape(28, 28), cmap="gray")
plt.show()

• Selects the image at row index 50 (the 51st image, since indexing starts at 0)
• Reshapes the 784-length vector back to 28×28
• Displays it in grayscale
• Serves as a quick check that the data loaded correctly

4. Dimensionality Reduction with PCA

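pca = PCA(n_components=50)
x_reduced = pca.fit_transform(x)

• PCA projects the 784 pixel dimensions onto 50 principal components
• This speeds up training while keeping most of the variance; the exact fraction retained can be read from pca.explained_variance_ratio_.sum()
• The reduced dataset (x_reduced) has 50 features per image instead of 784
• PCA is fitted here on all 70,000 images before the split, so the test images influence the components; fitting on the training set only gives a stricter evaluation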
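One way to check how much variance a given number of components keeps is sketched below. As an assumption made so the example runs offline and quickly, it uses scikit-learn's small bundled `load_digits` dataset (8×8 images, 64 features) instead of MNIST, and the choice of 20 components is arbitrary.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Offline stand-in for MNIST: 1,797 digit images of 8x8 pixels (64 features).
digits = load_digits()

# Keep 20 of the 64 dimensions (an arbitrary choice for this sketch).
pca_demo = PCA(n_components=20)
reduced = pca_demo.fit_transform(digits.data)

# Each entry is the share of total variance captured by one component.
retained = pca_demo.explained_variance_ratio_.sum()

print(reduced.shape)
print(f"variance retained: {retained:.3f}")
```

Running the same check on the MNIST `pca` object shows what the 50-component choice actually preserves.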

5. Splitting Data into Train/Test Sets

x_train, x_test, y_train, y_test = train_test_split(x_reduced, y, random_state=42, test_size=0.2)

• 80% of the data (56,000 samples) is used for training
• 20% of the data (14,000 samples) is held out for testing
• random_state=42 makes the split reproducible
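The split sizes and the effect of `random_state` can be verified on placeholder data of the same length. The arrays below are synthetic, and `stratify=labels` is an optional extra not used in the original code; it keeps the class proportions equal in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

n_samples = 70_000
features = np.zeros((n_samples, 1))   # placeholder feature matrix
labels = np.arange(n_samples) % 10    # ten evenly sized classes

def split(seed):
    return train_test_split(
        features, labels, test_size=0.2, random_state=seed, stratify=labels
    )

f_train, f_test, l_train, l_test = split(42)
_, _, _, l_test_again = split(42)

print(len(f_train), len(f_test))
print("same split with same seed:", np.array_equal(l_test, l_test_again))
print("test samples per class:", np.bincount(l_test))
```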

6. Training Logistic Regression Model

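lg = LogisticRegression(max_iter=1000, solver="lbfgs", multi_class="multinomial")
lg.fit(x_train, y_train)

• Uses multinomial (softmax) logistic regression for multi-class classification
• max_iter=1000: Allows up to 1000 solver iterations before stopping
• solver="lbfgs": scikit-learn's default solver, which supports the multinomial loss
• multi_class="multinomial": Fits one joint model over all 10 classes. The parameter is deprecated from scikit-learn 1.5 onward, where lbfgs already uses the multinomial formulation for multi-class targets, so on recent versions it can be omitted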

7. Making Predictions and Evaluating Accuracy

predicted = lg.predict(x_test)
accuracy = accuracy_score(y_test, predicted)
print(accuracy)

• Predicts labels for the test set
• Compares predictions to the true labels
• Prints the accuracy score (the fraction of correct predictions)
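The accuracy score is simply the share of matching labels, which a tiny hand-made example can confirm. The two label arrays below are invented for illustration and use strings, as the MNIST labels do.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Invented labels: four of the five predictions match.
true_labels = np.array(["3", "7", "1", "0", "7"])
predicted_labels = np.array(["3", "1", "1", "0", "7"])

# Fraction of positions where prediction equals truth.
manual = float(np.mean(true_labels == predicted_labels))

# accuracy_score computes the same quantity.
library = accuracy_score(true_labels, predicted_labels)

print(manual, library)
```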

Expected Output
The code will:

1. Display a sample digit image

2. Print an accuracy score around 0.91-0.92 (91-92% accuracy) on the
test set

Key Concepts Illustrated

1. Dimensionality Reduction: PCA helps reduce computation time
while maintaining performance
2. Multi-class Classification: Logistic regression can handle multiple
classes
3. Model Evaluation: Using a held-out test set to measure real-world
performance
4. Image Data Handling: Working with flattened image vectors

The accuracy could potentially be improved by:

• Using more PCA components (at the cost of speed)
• Trying more complex models like neural networks
• Performing hyperparameter tuning
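A possible way to tune the number of PCA components is sketched below. It is not the original author's method: it assumes the small bundled `load_digits` dataset so that it runs offline, adds a `StandardScaler` step to help the solver converge, and searches an arbitrary grid of component counts. Wrapping the steps in a `Pipeline` also means PCA is fitted on training folds only.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Offline stand-in for MNIST (8x8 digits, 64 features).
X, labels = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.2, random_state=42
)

# Scaling, PCA and the classifier are fitted together on training data only.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Try a few component counts with 3-fold cross-validation.
search = GridSearchCV(pipe, {"pca__n_components": [10, 20, 40]}, cv=3)
search.fit(X_tr, y_tr)

test_accuracy = search.score(X_te, y_te)
print(search.best_params_, round(test_accuracy, 3))
```

The same pattern applies to MNIST with a larger grid, at a correspondingly higher training cost.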

Image Processing Code

import pandas as pd
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

# Specify the image path
image_path = "C:/Users/GeeKs/Desktop/dagi/pictures/dagi.jpg"

# Open the image using PIL
image = Image.open(image_path)

# Resize the image to 28x28
resized_image = image.resize((28, 28))

# Convert the image to grayscale
grayscale_image = resized_image.convert("L")

# Convert to a NumPy array of shape (28, 28)
image_array = np.array(grayscale_image)

# Create a pandas DataFrame from the 2D array
image_df = pd.DataFrame(image_array)

# Display the original image, then the resized grayscale version
plt.title("Original Image")
plt.imshow(image)
plt.show()

plt.title("Grayscale Resized Image (28x28)")
plt.imshow(grayscale_image, cmap="gray")
plt.show()

This code demonstrates how to load, preprocess, and visualize an image
using Python's PIL (Pillow), NumPy, pandas, and matplotlib libraries. The
code prepares an image for potential machine learning applications (like
digit classification) by converting it to the MNIST dataset format (28×28
grayscale).

Full Code Breakdown

1. Importing Required Libraries
import pandas as pd
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

• pandas (pd): For creating DataFrames (not strictly necessary for this operation)
• numpy (np): For array operations and conversions
• PIL.Image: From the Pillow library, for image loading and processing
• matplotlib.pyplot (plt): For image visualization

2. Specifying Image Path

image_path = "C:/Users/GeeKs/Desktop/dagi/pictures/dagi.jpg"

• Defines the path to the image file. A single assignment is enough; the doubled form image_path = image_path = "..." is valid Python but redundant
• Uses a Windows path written with forward slashes; a raw string (r"C:\...") or doubled backslashes work as well

3. Loading the Image

image = Image.open(image_path)

• Opens the image file using PIL's Image.open()
• Creates an Image object that can be manipulated

4. Resizing the Image

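resized_image = image.resize((28, 28))

• Resizes the image to 28×28 pixels (the standard MNIST image size)
• Does not preserve the aspect ratio, so a non-square source image is stretched or squashed
• Uses bicubic resampling by default in Pillow 7.0 and later (older versions defaulted to nearest-neighbor); pass the resample argument to choose a filter explicitly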
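If the source image is not square, one option is to pad it to a square before shrinking and to name the resampling filter explicitly. The sketch below is an illustration under assumptions: the 200×100 white image is synthetic, and padding with white and using the Lanczos filter are choices made for this example, not part of the original code.

```python
from PIL import Image

# Pillow >= 9.1 exposes filters on Image.Resampling; older versions on Image.
filters = getattr(Image, "Resampling", Image)

# Synthetic non-square stand-in for a photo.
source = Image.new("RGB", (200, 100), color=(255, 255, 255))

# Pad to a square canvas so the content is not distorted by resizing.
side = max(source.size)
square = Image.new("RGB", (side, side), color=(255, 255, 255))
offset = ((side - source.width) // 2, (side - source.height) // 2)
square.paste(source, offset)

# Shrink to the MNIST size with an explicitly chosen filter.
small = square.resize((28, 28), resample=filters.LANCZOS)

print(square.size, small.size)
```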

5. Converting to Grayscale
grayscale_image = resized_image.convert("L")

• Converts the color image to 8-bit grayscale (mode "L")
• Each pixel holds a single value from 0 (black) to 255 (white)
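Pillow's documentation describes the RGB to "L" conversion as a weighted sum of the channels using the ITU-R 601-2 luma weights. The sketch below compares that formula with what `convert("L")` returns for a few single-pixel images; the helper `luma` and the sample colors are introduced here for illustration, and small rounding differences between the two are expected.

```python
from PIL import Image

def luma(r, g, b):
    """Weighted channel sum with the ITU-R 601-2 luma weights."""
    return (r * 299 + g * 587 + b * 114) / 1000

samples = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
}

comparison = {}
for name, rgb in samples.items():
    # A 1x1 image of a single color, converted to 8-bit grayscale.
    pixel = Image.new("RGB", (1, 1), color=rgb).convert("L").getpixel((0, 0))
    comparison[name] = (luma(*rgb), pixel)
    print(f"{name}: formula={luma(*rgb):.1f}, Pillow={pixel}")
```

Green contributes the most and blue the least, which is why a pure blue pixel comes out much darker than a pure green one.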

6. Converting to NumPy Array

image_array = np.array(grayscale_image)

• Converts the PIL Image object to a NumPy array
• Produces a 28×28 2D array of dtype uint8 in which each element is a pixel intensity

7. Creating a pandas DataFrame (Optional)

image_df = pd.DataFrame(image_array)

• Converts the NumPy array to a pandas DataFrame
• Can be useful when you need tabular manipulation of the pixel data
• Not strictly necessary for most image processing pipelines

8. Visualizing the Images

plt.title("Original Image")
plt.imshow(image)
plt.show()

plt.title("Grayscale Resized Image (28x28)")
plt.imshow(grayscale_image, cmap="gray")
plt.show()

• First block:
   o Shows the original color image with a title
   o Uses plt.imshow() without a colormap argument (colormaps are ignored for RGB images)
• Second block:
   o Shows the processed grayscale image
   o Uses the "gray" colormap for proper grayscale display
• Both use plt.show() to render the figures

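Expected Output
When you run this code as a script, two figures are shown one after the other, because plt.show() normally blocks until the current window is closed (in a notebook both appear inline):

1. The original image in full color at its original size

2. The processed version as a 28×28 grayscale image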

Key Processing Steps

1. Resizing: Standardizes the image dimensions to match common ML
datasets

2. Grayscale Conversion: Reduces color information to single-channel intensity

3. Array Conversion: Prepares the image for numerical processing

4. Visualization: Verifies each transformation step

Potential Use Cases

This preprocessing pipeline is particularly useful for:

• Preparing custom images for MNIST-style digit classification
• Creating input for neural networks that expect 28×28 grayscale images
• Image processing workflows that require standardized input sizes

Possible Improvements
1. Normalization: Scale pixel values to the 0-1 range by dividing by 255

2. Inversion: MNIST digits are white on a black background, so dark-on-light images should be inverted

3. Error Handling: Wrap file operations in try/except

4. Binarization: Optionally threshold the image to pure black and white
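Several of these improvements can be combined into one small helper, sketched below under stated assumptions. The function names `load_image` and `to_mnist_vector` are hypothetical, the demo image is synthetic, and the 0-1 scaling only makes sense if the downstream model was trained on data scaled the same way (the PCA example earlier was fitted on raw 0-255 values).

```python
import numpy as np
from PIL import Image, ImageOps

def load_image(path):
    """Return the image at `path`, or None if it cannot be opened."""
    try:
        with Image.open(path) as img:
            return img.copy()  # copy() forces the pixel data to load
    except OSError as exc:     # covers missing files and unreadable images
        print(f"Could not open {path}: {exc}")
        return None

def to_mnist_vector(image, invert=True):
    """Convert a PIL image to a (1, 784) float array scaled to 0-1."""
    gray = image.convert("L").resize((28, 28))
    if invert:
        gray = ImageOps.invert(gray)  # dark-on-light becomes light-on-dark
    pixels = np.asarray(gray, dtype=np.float32) / 255.0
    return pixels.reshape(1, 784)

# Synthetic demo: a black square on a white background.
demo = Image.new("L", (56, 56), color=255)
demo.paste(0, (20, 20, 36, 36))

vector = to_mnist_vector(demo)
missing = load_image("does_not_exist.jpg")

print(vector.shape, float(vector.min()), float(vector.max()), missing)
```

The `(1, 784)` shape matches one row of the MNIST feature matrix, so the result can be passed through the same transformations as the training data.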

