Module 06
Dimensionality Reduction
Introduction
• The Curse of Dimensionality describes the explosive growth in difficulty as the number of data dimensions increases: the computational effort required to process and/or analyse the data grows exponentially with the dimensionality.
• The term was first introduced by Richard E. Bellman, in the context of dynamic programming, to describe the increase in the volume of Euclidean space associated with adding extra dimensions.
• This phenomenon does not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience.
• Today, this phenomenon is observed in domains such as numerical analysis, machine learning, data analysis, and data mining.
• An increase in dimensionality means an increase in the number of features used to describe the data.
• In theory, adding dimensions can add more information to the data and thereby improve its quality; in practice, it increases the noise and redundancy encountered during analysis.
Problems
• High-dimensional data is difficult to work with because
✓ Adding more features can increase the noise, and hence the human error
✓ Good estimates cannot be obtained if there are not enough observations
• This causes
✓ Increased running time and hence reduced efficiency
✓ Overfitting, which reduces classification performance
✓ A large number of samples to be recorded, which means more storage cost, more computation cost, and more measurement cost
✓ Complexity in interpretation/modelling
Behavior of Machine Learning Algorithms: Need for Data Points and Accuracy of the Model
• Each feature represents a dimension, and a group of dimensions creates a data point.
• Together these form a feature vector that defines the data point to be used by a machine learning algorithm.
• As the dimensionality increases, the number of data points required for good
performance of any machine learning algorithm increases exponentially.
Let’s say that for a model to perform well, we need at least 10 data points for each
combination of feature values.
✓ If we assume that we have one binary feature, then for its 2¹ unique values (0 and 1) we would need 2¹ × 10 = 20 data points.
✓ For 2 binary features, we would have 2² unique values and need 2² × 10 = 40 data points.
✓ Thus, for k binary features we would need 2ᵏ × 10 data points.
In general, if n samples (observations) are dense enough in one dimension, then in d dimensions we need n^d samples, and n^d grows very quickly as a function of d.
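As a quick illustration of this growth, here is a minimal Python sketch; the "10 data points per combination" figure is simply the assumption from the example above, not a general rule.

# Samples needed if we want 10 data points per combination of
# k binary feature values (2**k combinations in total).
POINTS_PER_COMBINATION = 10

for k in range(1, 11):
    combinations = 2 ** k
    print(f"k = {k:2d} binary features -> {combinations * POINTS_PER_COMBINATION:6d} data points")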
IN GENERAL …
• If we can't solve a problem with a few features, adding more features seems like a good idea.
• But as the dimensionality increases, the number of data samples required for good performance of any machine learning algorithm increases exponentially.
• However, the number of samples available usually stays the same.
• As a result, the method with more features is likely to perform worse rather than better, contrary to expectation.
OVERFITTING
The curse of dimensionality is also known as the Hughes phenomenon or Hughes effect.
• Hughes (1968) concluded in his study that, with a fixed number of training samples, the predictive power of any classifier first increases as the number of dimensions increases, but beyond a certain number of dimensions the performance deteriorates.
[Figure: classification error/accuracy plotted against the number of dimensions, showing the Hughes rise and subsequent deterioration]
• Thus, dimensionality can be considered a "curse" in many algorithms.
• Paradox: if n < d, we are better off assuming that the features are uncorrelated, even if we know this assumption is wrong, because we are then likely to avoid overfitting.
Effect of the Curse of Dimensionality on Distance Functions
• For any point A, let's assume distₘᵢₙ(A) is the minimum distance between A and its nearest neighbor and distₘₐₓ(A) is the maximum distance between A and its farthest neighbor.
• In a d-dimensional space with n random points, as d grows large, distₘᵢₙ(A) ≈ distₘₐₓ(A); in other words, any given pair of points becomes nearly equidistant.
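This effect can be checked empirically with a short sketch like the one below (the sample sizes and dimensions are arbitrary illustrative choices): draw random points, pick a reference point A, and compare its nearest and farthest Euclidean distances as the dimensionality grows.

import numpy as np

rng = np.random.default_rng(0)

# Compare nearest vs. farthest Euclidean distance from one reference point
# as the dimensionality d grows; the ratio approaches 1 in high dimensions.
for d in [2, 10, 100, 1000]:
    points = rng.random((1000, d))     # 1000 random points in [0, 1]^d
    a = rng.random(d)                  # reference point A
    dists = np.linalg.norm(points - a, axis=1)
    print(f"d = {d:4d}  dist_min/dist_max = {dists.min() / dists.max():.3f}")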
Solutions to Curse of Dimensionality
• One way to reduce the impact of high dimensionality is to use a different distance measure in the vector space. One could explore cosine similarity as a replacement for Euclidean distance, since it can be less affected by higher dimensions. However, whether such a measure is appropriate depends on the problem being solved.
• Other methods involve reducing the number of dimensions. Some of the techniques that can be used are:
1) Forward feature selection: this method involves picking the most useful subset of features from all given features (see the sketch after this list).
2) PCA/t-SNE: although these methods help reduce the number of features, they do not necessarily preserve the class labels and can therefore make the interpretation of results difficult.
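A minimal sketch of forward feature selection, assuming scikit-learn is available; the breast cancer dataset, the logistic regression estimator, and the choice of five features are illustrative assumptions rather than anything prescribed by the slides.

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Greedily add features one at a time, keeping the subset that most
# improves cross-validated performance (direction="forward").
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=5000),
    n_features_to_select=5,
    direction="forward",
)
selector.fit(X, y)
print("Selected feature indices:", selector.get_support(indices=True))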
Feature Selection and Feature Extraction
• Every machine learning process depends on feature engineering, which mainly consists of two processes: Feature Selection and Feature Extraction.
• Both feature selection and feature extraction are used for dimensionality reduction, which is key to reducing model complexity and overfitting.
Feature Selection
• "It is a process of automatically or manually identifying and selecting the subset
of most appropriate and relevant features to be used in model building."
• The goal of feature selection is to improve model performance by reducing the number of irrelevant or redundant features that may introduce noise or bias into the model.
• This can result in faster training times, improved accuracy, reduced generalization error caused by the noise of irrelevant features, and more robust models that are less prone to overfitting.
• We need to do feature selection when the dataset contains a large number of features, or when the features are highly correlated, redundant, or irrelevant.
Feature selection techniques:
1. Regularization techniques such as L1 norm regularization
• L1 norm regularization, also known as Lasso regularization, is a common
regularization technique used in feature selection.
• It works by adding a penalty term that encourages the model to select only
the most important features, while reducing the weights of irrelevant or
redundant features to zero.
• L1 norm regularization introduces sparsity into the feature weights,
meaning that only a subset of the features have non-zero weights.
• The other features are effectively ignored by the model, resulting in a form
of automatic feature selection.
• L1 norm (Lasso) regularization can be particularly useful in cases where the dataset contains many features, and some of them are irrelevant or redundant.
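A minimal sketch of L1 (Lasso) feature selection with scikit-learn; the diabetes dataset and the regularization strength alpha are illustrative assumptions, with larger alpha values driving more coefficients to exactly zero.

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X = StandardScaler().fit_transform(X)      # put all features on a comparable scale

# The L1 penalty shrinks the weights of uninformative features toward zero,
# which acts as an automatic form of feature selection.
lasso = Lasso(alpha=1.0).fit(X, y)
print("Coefficients:", lasso.coef_.round(2))
print("Selected feature indices:", np.flatnonzero(lasso.coef_))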
2. Feature importance technique
• Feature importance techniques use an estimator such as the Random Forest algorithm to fit a model and select features based on the value of an attribute such as feature_importances_.
• The feature_importances_ attribute of the Random Forest estimator provides a score for each feature in the dataset, indicating how important that feature is for making predictions, and can therefore be used to obtain the relative importance of each feature.
• These scores are calculated based on the reduction in impurity (e.g.,
Gini impurity or entropy) achieved by splitting the data on that feature.
• The feature with the highest score is considered the most important,
while features with low scores can be considered less important or even
irrelevant.
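A minimal sketch of this technique using scikit-learn's RandomForestClassifier; the dataset and the number of trees are illustrative assumptions.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
X, y = data.data, data.target

# Fit a forest and rank features by their impurity-based importance scores.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
importances = forest.feature_importances_

ranked = sorted(zip(data.feature_names, importances), key=lambda t: t[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name:25s} {score:.3f}")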
[Figure: choosing a feature-selection statistical measure depending on whether the input and output variables are both numerical, both categorical, or one numerical and one categorical]
Feature Extraction
• This method is used to extract, from datasets in formats such as text and images, features in a format supported by machine learning algorithms.
• Feature Extraction transforms arbitrary data, such as text or images, into numerical features that machine learning algorithms can understand.
• It is intended to modify the features and generate new ones (reducing the number of features) by combining the raw/provided features and then discarding the original features.
• Feature Extraction projects a data set with higher dimensionality onto a
smaller number of dimensions. As such it is useful for data visualization,
since a complex data set can be effectively visualized when it is reduced to
two or three dimensions.
Feature Extraction continued…
• A feature extraction pipeline varies a lot depending on the primary data and the algorithm to be used, which makes it difficult to describe abstractly.
• Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Multidimensional Scaling are the most often used methodologies for Feature Extraction.
Conclusion
• Feature selection is a very complicated and vast field of machine learning, and many studies have already been conducted to discover the best methods.
• There is no fixed rule for the best feature selection method. Choosing a method depends on the machine learning engineer, who can combine and innovate approaches to find the best one for a specific problem.
• One should try a variety of model fits on different subsets of features selected through different statistical measures. There are no limits to the ways of creating features.
• Deriving these meaningful features often requires a great deal of extensive exploration that involves expertise, creativity, intuition and time, lots of time. This is the reason why, although automatic feature selection is already here, there has not been as much development in feature extraction.
What Is Principal Component Analysis?
• Principal component analysis, or PCA, is a dimensionality reduction
method that is often used to reduce the dimensionality of large data
sets, by transforming a large set of variables into a smaller one that
still contains most of the information in the large set.
• Reducing the number of variables of a data set naturally comes at the expense of accuracy, but the trick in dimensionality reduction is to trade a little accuracy for simplicity. Smaller data sets are easier to explore and visualize, and they make analyzing data points much easier and faster for machine learning algorithms, since there are no extraneous variables to process.
• So, to sum up, the idea of PCA is simple — reduce the number of
variables of a data set, while preserving as much information as
possible.
HOW DO YOU DO A
PRINCIPAL COMPONENT ANALYSIS?
1. Standardize the range of continuous initial variables
2. Compute the covariance matrix to identify correlations
3. Compute the eigenvectors and eigenvalues of the covariance
matrix to identify the principal components
4. Create a feature vector to decide which principal components
to keep
5. Recast the data along the principal components axes
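Before stepping through these five stages by hand, here is a minimal end-to-end sketch using scikit-learn's PCA class; the iris dataset and the choice of two components are illustrative assumptions.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

X_std = StandardScaler().fit_transform(X)   # Step 1: standardization
pca = PCA(n_components=2).fit(X_std)        # Steps 2-4: covariance, eigenvectors, component selection
X_pca = pca.transform(X_std)                # Step 5: recast the data onto the PC axes

print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Reduced shape:", X_pca.shape)        # (150, 2)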
STEP 1: STANDARDIZATION
• The aim of this step is to standardize the range of the continuous initial variables so that
each one of them contributes equally to the analysis.
• More specifically, the reason it is critical to perform standardization prior to PCA is that, if there are large differences between the ranges of the initial variables, the variables with larger ranges will dominate over those with small ranges (for example, a variable that ranges between 0 and 100 will dominate over a variable that ranges between 0 and 1), which will lead to biased results.
• So, transforming the data to comparable scales can prevent this problem.
• Mathematically, this can be done by subtracting the mean and dividing by the standard
deviation for each value of each variable.
• Once the standardization is done, all the variables will be transformed to the same scale.
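A minimal NumPy sketch of this standardization step (the small data matrix is an illustrative assumption): each variable is transformed as z = (value - mean) / standard deviation.

import numpy as np

# Illustrative data: 5 observations of 3 variables with very different ranges.
X = np.array([[10., 0.2, 1000.],
              [12., 0.5,  800.],
              [ 9., 0.1, 1200.],
              [11., 0.4,  950.],
              [13., 0.3, 1100.]])

X_std = (X - X.mean(axis=0)) / X.std(axis=0)   # z-score each variable (column)

print("Column means after standardization:", X_std.mean(axis=0).round(6))
print("Column std devs after standardization:", X_std.std(axis=0).round(6))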
STEP 2: COVARIANCE MATRIX COMPUTATION
• The aim of this step is to understand how the variables of the
input data set are varying from the mean with respect to each
other, or in other words, to see if there is any relationship
between them.
• Because sometimes, variables are highly correlated in such a
way that they contain redundant information. So, in order to
identify these correlations, we compute the covariance matrix.
• e.g. For a 3-dimensional data set with 3 variables x, y, and z, the covariance matrix is a 3×3 matrix of this form:
  Cov(x,x)  Cov(x,y)  Cov(x,z)
  Cov(y,x)  Cov(y,y)  Cov(y,z)
  Cov(z,x)  Cov(z,y)  Cov(z,z)
• Since the covariance of a variable with itself is its variance
(Cov(a,a)=Var(a)), in the main diagonal (Top left to bottom
right) we actually have the variances of each initial variable.
• And since the covariance is commutative (Cov(a,b)=Cov(b,a)),
the entries of the covariance matrix are symmetric with respect
to the main diagonal, which means that the upper and the lower
triangular portions are equal.
• It’s actually the sign of the covariance that matters:
✓ If positive then: the two variables increase or decrease together (correlated)
✓ If negative then: one increases when the other decreases (Inversely correlated)
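A minimal sketch of the covariance computation in NumPy (the random data are an illustrative assumption); rowvar=False tells np.cov that each column is a variable.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # illustrative data: 100 samples, 3 variables
X_std = (X - X.mean(axis=0)) / X.std(axis=0)  # Step 1: standardization

cov = np.cov(X_std, rowvar=False)             # 3x3 covariance matrix
print(cov.round(3))
# The diagonal holds the variances; the matrix is symmetric, since Cov(a,b) = Cov(b,a).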
STEP 3: COMPUTE THE EIGENVECTORS AND
EIGENVALUES OF THE COVARIANCE MATRIX TO
IDENTIFY THE PRINCIPAL COMPONENTS
• Eigenvectors and eigenvalues are the linear algebra concepts that we
need to compute from the covariance matrix in order to determine
the principal components of the data.
• Principal components are new variables that are constructed as
linear combinations or mixtures of the initial variables. These
combinations are done in such a way that the new variables (i.e.,
principal components) are uncorrelated and most of the information
within the initial variables is squeezed or compressed into the first
components.
• So, the idea is that 10-dimensional data gives 10 principal components, but PCA tries to put the maximum possible information in the first component, then the maximum of the remaining information in the second, and so on, until we have something like what is shown in the scree plot on the next slide.
• Organizing information in principal components allows us to reduce dimensionality without losing much information, by discarding the components with low information and treating the remaining components as the new variables.
• Principal components are less
interpretable and don’t have
any real meaning since they
are constructed as linear
combinations of the initial
variables.
• Geometrically speaking, principal components represent the
directions of the data that explain a maximal amount of
variance, that is to say, the lines that capture most information
of the data.
• The relationship between variance and information here, is that,
the larger the variance carried by a line, the larger the
dispersion of the data points along it, and the larger the
dispersion along a line, the more information it has.
• To put all this simply, just think of principal components as new
axes that provide the best angle to see and evaluate the data, so
that the differences between the observations are better visible.
How PCA Constructs the Principal Components
[Figure: the first principal component is approximately the line that matches the purple marks]
• The first principal component accounts for the largest possible variance in the data set. Mathematically speaking, it is the line that maximizes the variance, i.e., the average of the squared distances from the projected points to the origin.
• The second principal component is calculated in the same way,
with the condition that it is uncorrelated with (i.e.,
perpendicular to) the first principal component and that it
accounts for the next highest variance.
• This continues until a total of p principal components have been
calculated, equal to the original number of variables.
• It is the eigenvectors and eigenvalues that are behind all the magic explained above: the eigenvectors of the covariance matrix are the directions of the axes where there is the most variance (most information), and these are what we call the Principal Components. The eigenvalues are simply the coefficients attached to the eigenvectors, giving the amount of variance carried by each Principal Component.
• By ranking your eigenvectors in order of their eigenvalues, highest to
lowest, you get the principal components in order of significance.
• After having the principal components, to compute the percentage of
variance (information) accounted for by each component, we divide
the eigenvalue of each component by the sum of eigenvalues.
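A minimal NumPy sketch of this step on illustrative data: eigendecompose the covariance matrix with np.linalg.eigh (suitable for symmetric matrices), rank the eigenvectors by their eigenvalues, and compute the percentage of variance each component carries.

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))   # illustrative, correlated data
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

cov = np.cov(X_std, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)            # eigh: for symmetric matrices

order = np.argsort(eigenvalues)[::-1]                      # rank by eigenvalue, largest first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

explained = eigenvalues / eigenvalues.sum()                # fraction of variance per component
print("Explained variance ratio:", explained.round(3))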
STEP 4: FEATURE VECTOR
• In this step, what we do is, to choose whether to keep all these
components or discard those of lesser significance (of low
eigenvalues), and form with the remaining ones a matrix of
vectors that we call Feature vector.
• So, the feature vector is simply a matrix that has as columns the
eigenvectors of the components that we decide to keep.
• This makes it the first step towards dimensionality reduction,
because if we choose to keep only p eigenvectors (components)
out of n, the final data set will have only p dimensions.
STEP 5: RECAST THE DATA ALONG THE
PRINCIPAL COMPONENTS AXES
• In the previous steps, apart from standardization, you do not make any
changes on the data, you just select the principal components and form the
feature vector, but the input data set remains always in terms of the
original axes (i.e., in terms of the initial variables).
• In this last step, the aim is to use the feature vector formed from the eigenvectors of the covariance matrix to reorient the data from the original axes to the ones represented by the principal components (hence the name Principal Component Analysis). This can be done by multiplying the transpose of the feature vector by the transpose of the standardized original data set: FinalDataSet = FeatureVectorᵀ × StandardizedOriginalDataSetᵀ.
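A minimal NumPy sketch covering steps 4 and 5 together on illustrative data: keep the top-2 eigenvectors as the feature vector and project the standardized data onto the new axes (X_std @ feature_vector is equivalent to the transposed product written above).

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))   # illustrative correlated data
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

cov = np.cov(X_std, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvectors = eigenvectors[:, order]

# Step 4: feature vector = matrix whose columns are the kept eigenvectors.
feature_vector = eigenvectors[:, :2]

# Step 5: recast the data onto the principal component axes.
# Equivalent to (feature_vector.T @ X_std.T).T
X_projected = X_std @ feature_vector
print("Projected data shape:", X_projected.shape)          # (200, 2)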
PCA
• Intuition: find the axis that shows the greatest variation, and project all points onto this axis.
[Figure: data points in the (f1, f2) feature plane with the principal axes e1 and e2 overlaid]
SVD: The mathematical formulation
• Normalize the dataset by moving the origin to the center of the dataset
• Find the eigenvectors of the data (or covariance) matrix
• These define the new space
• Sort the eigenvalues in “goodness” order
[Figure: the same (f1, f2) data after centering, with the new axes e1 and e2]
SVD Cont’d
• Advantages:
• Optimal dimensionality reduction (for linear projections)
• Disadvantages:
• Computationally hard, but this can be improved with random sampling
• Sensitive to outliers and non-linearities
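A minimal NumPy sketch of the SVD formulation on illustrative data: center the dataset, take the SVD, and read the principal directions from the right singular vectors; the squared singular values divided by (n - 1) equal the eigenvalues of the covariance matrix.

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))   # illustrative data

X_centered = X - X.mean(axis=0)                            # move the origin to the center

# Economy-size SVD: rows of Vt are the principal directions (eigenvectors),
# already sorted by decreasing singular value ("goodness" order).
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

eigenvalues = S**2 / (len(X_centered) - 1)                 # eigenvalues of the covariance matrix
print("Principal directions (rows):\n", Vt.round(3))
print("Variance along each direction:", eigenvalues.round(3))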
Applications of Principal Component Analysis
• PCA is mainly used as the dimensionality reduction technique in
various AI applications such as computer vision, image compression,
etc.
• It can also be used for finding hidden patterns if data has high
dimensions. Some fields where PCA is used are Finance, data mining,
Psychology, etc.
Example
• Final covariance matrix
• Finding the final eigenvectors (principal components)
[Figure: worked numerical example showing the covariance matrix and its eigenvectors]