Lecture 6: Feature Selection and Extraction
Machine Learning
Ivan Smetannikov
15.06.2016
Lecture plan
• Dimensionality Reduction
• Feature Selection
• Feature Extraction
Dimensionality Reduction
Why should we look at dimensionality
reduction?
• Speeds up learning algorithms
• Reduces the memory needed to store the data
Dimensionality Reduction
What is dimensionality reduction?
• You’ve collected many features – maybe
more than you need. Can you “simplify”
your data set in a rational and useful way?
Dimensionality Reduction
Example:
• Redundant data set – two features record
the same attribute in different units
• Reduce the data to 1D
(2D -> 1D)
Dimensionality Reduction
Another Example
• Helicopter flying – do a survey of pilots
(x1 = skill, x2 = enjoyment). These
features may be highly correlated
• Such correlated features can be combined
into a single attribute, called aptitude
for example
Dimensionality Reduction
So what does
dimensionality
reduction mean?
• Fit a line through the data
• Project each example onto that line
and record its position along the line
• Each example can now be represented
by a single 1D number
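Concretely, “recording the position on that line” is just a dot product with the line’s unit direction vector. A tiny illustrative sketch (the data and direction below are made up):

import numpy as np

# Toy 2D data: each row is one example (say, the same length in two units)
X = np.array([[1.0, 1.1],
              [2.0, 1.9],
              [3.0, 3.2]])
u = np.array([1.0, 1.0]) / np.sqrt(2)  # unit direction of the fitted line
z = X @ u                              # 1D position of each example on the line
print(z)                               # three numbers instead of three 2D points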
Dimensionality Reduction
Another example: 3D -> 2D
Dimensionality Reduction
Motivation:
Collect a large data set (50 dimensions)
Dimensionality Reduction
Using dimensionality reduction, come up with
a different, lower-dimensional feature representation
Feature Selection
Goals of feature selection:
• Avoiding overfitting and improving
classification quality
• Better understanding of the models
• Speeding up the classifying models
Feature Selection
Types of attributes to eliminate:
• Redundant attributes – carry no
additional information
• Irrelevant attributes – are not
informative at all
Feature Selection
How to evaluate feature selection methods:
• On various datasets
• With different classifiers (if possible)
• By adding noise features and copies of the
target vector to the datasets
Feature Selection
Feature selection types:
• Filter methods
a. Univariate
b. Multivariate
• Wrapper methods
a. Deterministic
b. Randomized
• Embedded methods
Feature Selection
Filter methods:
Evaluate the quality of individual attributes and
remove the worst of them.
+ Simple to compute, easy to scale
- Ignore the relationships between attributes
and the features actually used by the classifier
Feature Selection
Examples of filter methods:
• Univariate:
o Euclidean distance
o Information gain (see the sketch below)
o Spearman correlation coefficient
• Multivariate:
o CFS (correlation-based feature selection)
o MBF (Markov blanket filter)
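As a hedged sketch of the univariate information-gain idea: scikit-learn’s mutual_info_classif scores each feature independently against the class; the dataset and k below are illustrative choices, not from the slides.

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_iris(return_X_y=True)
# Score every feature on its own and keep the k best
selector = SelectKBest(score_func=mutual_info_classif, k=2)
X_reduced = selector.fit_transform(X, y)
print(selector.scores_)   # one score per original feature
print(X_reduced.shape)    # (150, 2)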
Feature Selection
Spearman correlation coefficient

In Python (SciPy) the correlation tests share one
interface; the call shown here is Pearson’s:

scipy.stats.pearsonr(x, y)

Parameters:
x : (N,) array_like – input
y : (N,) array_like – input
Returns:
(Pearson’s correlation coefficient, 2-tailed p-value)

For Spearman’s rank correlation, the analogous call is
scipy.stats.spearmanr(x, y).
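A minimal sketch of using spearmanr as a univariate filter, scoring each feature against the class labels (the toy data are ours, for illustration only):

import numpy as np
from scipy.stats import spearmanr

X = np.array([[1, 2], [5, 8], [1.5, 1.8], [8, 8], [1, 0.6], [9, 11]])
y = np.array([0, 1, 0, 1, 0, 1])

for j in range(X.shape[1]):
    rho, p = spearmanr(X[:, j], y)   # rank correlation with the target
    print("feature %d: rho=%.3f, p=%.3f" % (j, rho, p))

Features would then be ranked by |rho| and the worst ones dropped.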
Feature Selection
Weka:
import weka.attributeSelection.ASEvaluation;
import weka.attributeSelection.CorrelationAttributeEval;
import weka.attributeSelection.Ranker;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.supervised.attribute.AttributeSelection;

// Rank attributes by correlation with the class, then filter the dataset
ASEvaluation evaluator = new CorrelationAttributeEval();
Ranker ranker = new Ranker();
// ranker.setThreshold(0.05); or ranker.setNumToSelect(10);
AttributeSelection selection = new AttributeSelection();
selection.setEvaluator(evaluator);
selection.setSearch(ranker);
selection.setInputFormat(heavyInstances);
Instances lightInstances = Filter.useFilter(heavyInstances, selection);
Feature Selection
Wrapper methods:
Search for a good subset of the source attributes,
evaluating candidate subsets with the classifier itself.
+ Higher accuracy than filtering
+ Consider the relationships between
attributes
+ Direct interaction with the classifier
- Long computing time
- Risk of overfitting
Feature Selection
Examples of Wrapper methods:
• Deterministic:
o SFS (sequential forward selection; sketch below)
o SBE (sequential backward elimination)
o SVM-RFE (recursive feature elimination)
• Randomized:
o Randomized hill climbing
o Genetic algorithms
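A rough sketch of SFS, using cross-validated accuracy as the wrapper criterion; the classifier and dataset are illustrative choices, not prescribed by the slides:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
clf = KNeighborsClassifier()
selected, remaining = [], list(range(X.shape[1]))
while remaining:
    # Greedily add the feature that most improves CV accuracy
    score, best = max((cross_val_score(clf, X[:, selected + [j]], y, cv=5).mean(), j)
                      for j in remaining)
    selected.append(best)
    remaining.remove(best)
    print("added feature %d, CV accuracy %.3f" % (best, score))

In practice the loop stops at a target number of features or when the score stops improving; SBE runs the same loop in reverse, removing one feature at a time.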
Feature Selection
SVM-RFE
• Train an SVM on the training subset
• Rank the features by the weights they received
• Remove the lowest-ranked features
• Repeat until the required number of
features remains
Feature Selection
SVM-RFE (Python example)
# Coordinates of six 2D points (combined into X on the next slide)
x = [1, 5, 1.5, 8, 1, 9]
y = [2, 8, 1.8, 8, 0.6, 11]
Feature Selection
SVM-RFE (Python example)
import numpy as np
from sklearn import svm

X = np.array([[1, 2], [5, 8], [1.5, 1.8], [8, 8],
              [1, 0.6], [9, 11]])
y = [0, 1, 0, 1, 0, 1]
Let's use an SVM:
clf = svm.SVC(kernel='linear', C=1.0)
Let's fit our model:
clf.fit(X, y)
Let's predict something (note the 2D input):
print(clf.predict([[0.58, 0.76]]))
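The snippet above only trains a single SVM. The actual elimination loop can be sketched with scikit-learn’s RFE wrapper (a standard library class; its use here, reusing X, y and the svm import from above, is our illustration rather than the slides’):

from sklearn.feature_selection import RFE

# Recursively drop the feature with the smallest |weight| until one remains
rfe = RFE(estimator=svm.SVC(kernel='linear', C=1.0),
          n_features_to_select=1, step=1)
rfe.fit(X, y)
print(rfe.ranking_)   # rank 1 = most useful feature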
Feature Selection
Embedded
• Feature selection is built into the training
of a particular classifier
• Each classifier requires its own method
Feature Selection
Random Forest:
• For each tree, select a bootstrap subsample
of size N (with replacement)
• Build a decision tree on it; at each split,
only a random subset of features is considered
• Choose the best of them by a given criterion;
averaged over the forest, these choices yield
a feature importance ranking
Feature Selection
Random Forest (Python example):
# Import the random forest package
from sklearn.ensemble import RandomForestClassifier

# Create the random forest object which will include all the
# parameters for the fit
forest = RandomForestClassifier(n_estimators=100)

# Fit the training data to the Survived labels (column 0) and
# create the decision trees; train_data/test_data are assumed
# to be numpy arrays of the Titanic data
forest = forest.fit(train_data[0::, 1::], train_data[0::, 0])

# Take the same decision trees and run them on the test data
output = forest.predict(test_data)
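For feature selection specifically, the trained forest exposes impurity-based importance scores; a brief follow-up reusing forest from the snippet above:

import numpy as np

importances = forest.feature_importances_   # one score per feature
order = np.argsort(importances)[::-1]       # best features first
print(order, importances[order])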
Feature Selection
Random Forest (Weka):
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;

int numFolds = 10;
BufferedReader br = new BufferedReader(new FileReader("data.arff"));
Instances trainData = new Instances(br);
trainData.setClassIndex(trainData.numAttributes() - 1);
RandomForest rf = new RandomForest();
rf.setNumTrees(100);
rf.buildClassifier(trainData);
Evaluation evaluation = new Evaluation(trainData);
evaluation.crossValidateModel(rf, trainData, numFolds, new Random(1));
System.out.println("F-measure = " + evaluation.fMeasure(0));
Feature Selection
IG and IG
Feature Selection
Redundancy
Feature Selection
Regularization
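The slide presumably refers to sparsity-inducing penalties. As one standard embedded example (our illustration, not taken from the slides), an L1-regularized linear model drives the weights of uninformative features to exactly zero:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 10)
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # only the first two features matter

model = LogisticRegression(penalty='l1', solver='liblinear', C=0.5)
model.fit(X, y)
print(np.nonzero(model.coef_[0])[0])        # indices of the surviving features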
Feature Extraction
• Reduces the amount of resources required
to describe a large set of data
• Constructs new features from the original ones
• Methods can be linear or nonlinear
Feature Extraction
Linear and nonlinear
Examples: Manifold Sculpting (nonlinear) and PCA (linear)
Feature Extraction
PCA
We have a 2D dataset which we wish to
reduce to 1D
Feature Extraction
PCA tries to find the surface (a straight line
in this case) onto which projecting the data
gives the minimum projection error
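To make “minimum projection error” concrete: the best line is the top eigenvector of the data covariance matrix. A minimal hand-rolled sketch on made-up points:

import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0]])
Xc = X - X.mean(axis=0)                  # center the data
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc.T))
u = eigvecs[:, -1]                       # eigh sorts ascending: last = top component
z = Xc @ u                               # 1D projections with minimal squared error
print(z)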
Feature Extraction
PCA (Python example)
Let's use the Iris data and import PCA:
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA as sklearnPCA

X_std = StandardScaler().fit_transform(load_iris().data)  # standardized features
sklearn_pca = sklearnPCA(n_components=2)
Y_sklearn = sklearn_pca.fit_transform(X_std)
Feature Extraction
PCA (Python example)
Let's plot the PCA results:
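A hedged plotting snippet, reusing Y_sklearn from the previous slide (the styling choices are ours):

import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

labels = load_iris().target              # color points by Iris class
plt.scatter(Y_sklearn[:, 0], Y_sklearn[:, 1], c=labels)
plt.xlabel('principal component 1')
plt.ylabel('principal component 2')
plt.show()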
Feature Extraction
PCA (Weka)
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.PrincipalComponents;

PrincipalComponents pca = new PrincipalComponents();
pca.setMaximumAttributes(100);
pca.setInputFormat(trainingData);
Instances newData = Filter.useFilter(trainingData, pca);