Terminologies in
Machine Learning
By,
Mrs. P. Aileen Chris,
AP(SS)/CSE/HITS
Introduction to ML
Terminology
Machine learning terminology includes concepts used to
describe how machines learn from data to make
predictions or decisions.
Algorithm
A step-by-step procedure used for data analysis and
prediction.
In machine learning, algorithms are used to create
models that can make predictions or decisions based on
data.
Example: Decision trees, support vector machines,
neural networks.
Accuracy
A metric used to evaluate classification models,
representing the ratio of correctly predicted instances to
the total instances.
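The ratio above can be sketched in a few lines of Python. The function name and the sample labels are illustrative assumptions, not part of the original slides.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one instance is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels: 1 = spam, 0 = not spam
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
print(accuracy(y_true, y_pred))  # 4 of the 5 predictions are correct
```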
Model
A trained representation of real-world relationships in
data.
Built using an algorithm and training data to make
predictions.
Anomaly Detection
The process of identifying rare or unusual data points
that deviate significantly from the majority of the data.
Useful in fraud detection and network security.
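One simple way to flag such points is a z-score rule: measure how many standard deviations each value lies from the mean. This is only a minimal sketch of one possible approach; the function name, the threshold of 2.0, and the transaction amounts are assumptions made for illustration.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts with one unusually large payment
amounts = [20, 22, 19, 21, 20, 23, 18, 500]
print(find_anomalies(amounts))
```

The threshold is a design choice: a lower value flags more points, a higher value flags fewer.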
Features & Labels
Feature: Input variable used for making predictions.
Label: Output or target variable in supervised learning.
Example: Email text (feature), spam/not spam (label).
Bias
In machine learning, bias refers to the error introduced by
approximating a real-world problem, which may be too
complex, with a simplified model.
Classification
A type of supervised learning where the goal is to predict
discrete labels or categories for given inputs, such as
determining whether an email is spam or not.
Clustering
An unsupervised learning technique used to group similar
data points together into clusters based on their features,
without predefined labels.
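As an illustration, k-means is one widely used clustering algorithm. The sketch below assumes 2-D points and a naive initialization (the first k points become the starting centroids); the names and sample data are hypothetical.

```python
import math


def kmeans(points, k, iterations=10):
    """Group 2-D points into k clusters; returns (centroids, assignments)."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, assignments


points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments)  # cluster index for each point, no labels were supplied
```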
Confusion Matrix
A table used to evaluate the performance of a classification
model by showing the number of true positives, true
negatives, false positives, and false negatives.
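The four counts can be tallied directly from true and predicted labels. This is a minimal sketch for the binary case; the function name and sample labels are assumptions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif t == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels: 1 = spam, 0 = not spam
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
print(confusion_counts(y_true, y_pred))
# {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
```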
Cross-Validation
A technique for assessing how well a model generalizes to
new data by splitting the dataset into training and testing
subsets multiple times.
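A common form of this is k-fold cross-validation, where each fold serves as the test set exactly once. The sketch below only produces the index splits; the function name is an assumption, and in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train, test in k_fold_indices(6, 3):
    print("train:", train, "test:", test)
```

A model would be trained and scored once per fold, and the scores averaged to estimate how well it generalizes.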
Data Preprocessing
The steps taken to clean, normalize, and transform raw
data before feeding it into a machine learning model to
improve accuracy and performance.
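Two typical steps, filling missing values (cleaning) and min-max scaling (normalizing), can be sketched as follows. The function names and the sample ages are assumptions chosen for illustration; real pipelines often involve more steps.

```python
def fill_missing(values):
    """Replace None entries with the mean of the known values."""
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("no known values to compute a mean from")
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]


raw_ages = [25, None, 35, 45]            # hypothetical raw feature column
clean_ages = fill_missing(raw_ages)      # [25, 35.0, 35, 45]
scaled_ages = min_max_scale(clean_ages)  # [0.0, 0.5, 0.5, 1.0]
print(scaled_ages)
```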
Supervised vs Unsupervised
Learning
Supervised: Learns from labeled data (e.g.,
classification, regression).
Unsupervised: Finds patterns in unlabeled data (e.g.,
clustering, dimensionality reduction).
Reinforcement Learning
An agent learns to make decisions by receiving rewards
or penalties from interacting with an environment.
Example: Game-playing AI, robotics.
Deep Learning
A subset of machine learning involving neural networks
with many layers (deep neural networks) that can
automatically learn features from data.
Effective for tasks like image recognition and speech
processing.
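Overfitting & Underfitting
Overfitting: An overly complex model memorizes the training
data, including its noise, and performs poorly on new data.
Underfitting: An overly simple model misses important
patterns and performs poorly even on the training data.
Goal: Find the right balance between the two.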
Hyperparameters
Settings chosen before training that control learning
behavior.
Examples: Learning rate, number of trees in a forest,
number of layers in a neural network.
Evaluation Metrics
Used to measure model performance.
Examples:
Accuracy: Correct predictions / total predictions
Precision, Recall, F1-score for classification tasks
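Precision, recall, and F1-score follow from the true positive (TP), false positive (FP), and false negative (FN) counts. A minimal sketch, with the function name and sample counts as assumptions:

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1-score from raw counts."""
    # Precision: of everything predicted positive, how much was right
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much was found
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 3 true positives, 1 false positive, 1 false negative
print(precision_recall_f1(tp=3, fp=1, fn=1))
```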
Artificial Neural Networks
(ANNs)
Model inspired by the brain's structure.
Composed of nodes (neurons) in input, hidden, and
output layers.
Used in deep learning models.
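The layered structure can be illustrated with a forward pass through a tiny network. This sketch assumes a sigmoid activation and hand-picked weights, both chosen only for illustration; a real network learns its weights during training.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """One fully connected layer: each row of weights feeds one neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Pass inputs through the hidden layer, then the output layer."""
    hidden = dense(inputs, hidden_w, hidden_b)
    return dense(hidden, output_w, output_b)


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron
hidden_w = [[0.5, -0.4], [0.3, 0.8]]
hidden_b = [0.1, -0.2]
output_w = [[1.2, -0.7]]
output_b = [0.05]
print(forward([1.0, 2.0], hidden_w, hidden_b, output_w, output_b))
```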
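Gradient Descent & NLP
Gradient Descent: Optimization algorithm that minimizes
error during training by repeatedly adjusting model
parameters in the direction that reduces the loss.
NLP (Natural Language Processing): Enabling machines to
understand human language (e.g., chatbots, translation,
sentiment analysis).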
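Gradient descent can be sketched on a one-parameter problem. The example assumes a toy error function f(w) = (w - 3)^2, whose gradient is 2(w - 3); the function name, learning rate, and step count are illustrative choices.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the error."""
    w = start
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w


# Toy error f(w) = (w - 3)^2 has its minimum at w = 3
w_min = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
print(round(w_min, 4))
```

The learning rate is a hyperparameter: too large and the steps overshoot the minimum, too small and convergence is slow.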
K-Nearest Neighbors (KNN)
A classification algorithm that assigns a label to a new data
point based on the majority vote of its k nearest neighbors in
the feature space.
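The majority vote can be sketched as follows, assuming Euclidean distance and a small hypothetical training set with two classes.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in neighbors[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical 2-D training data with two classes
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train_points, train_labels, (2, 2)))  # A
print(knn_predict(train_points, train_labels, (6, 5)))  # B
```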
ROC Curve
A graphical representation of a classification model’s
performance across different thresholds, plotting the true
positive rate against the false positive rate.
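The points of the curve can be computed by sweeping a threshold over the model's scores. A minimal sketch, assuming binary labels (1 = positive, 0 = negative) and hypothetical scores:

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, from the strictest threshold to the loosest."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical true labels and predicted scores
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
for fpr, tpr in roc_points(y_true, scores):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting TPR against FPR for these points gives the ROC curve; a curve closer to the top-left corner indicates a better classifier.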
Principal Component Analysis
(PCA)
A dimensionality reduction technique that transforms data
into a lower-dimensional space by finding the directions
(principal components) of maximum variance.
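The first principal component can be approximated in plain Python with power iteration on the covariance matrix. This is a rough sketch, not a production approach: the starting vector of all ones, the iteration count, and the sample data are assumptions, and libraries normally use a full eigendecomposition or SVD instead.

```python
import math


def first_principal_component(data, iterations=100):
    """Return (direction, projections) for the direction of maximum variance."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix of the centered data
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d  # assumed starting vector
    for _ in range(iterations):
        v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0:
            raise ValueError("starting vector is orthogonal to the data variance")
        v = [x / norm for x in v]
    # Project each centered point onto the component: 2-D becomes 1-D
    projections = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, projections


# Hypothetical 2-D data lying close to a diagonal line
data = [[2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]]
direction, projections = first_principal_component(data)
print(direction)
```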
Decision Tree
A supervised learning algorithm that splits data into
branches based on feature values to make predictions,
visualized as a tree-like structure.
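A single split, the building block of a tree, can be sketched as follows. The example assumes Gini impurity as the split criterion (one common choice) and one numeric feature; the names and data are hypothetical.

```python
def gini(labels):
    """Gini impurity: 0.0 when all labels in the group are the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity."""
    best_threshold, best_score = None, float("inf")
    for threshold in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # a split must send data down both branches
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical feature values and labels
values = [1, 2, 3, 10, 11, 12]
labels = ["no", "no", "no", "yes", "yes", "yes"]
print(best_split(values, labels))  # (3, 0.0): "value <= 3" separates the classes
```

A full tree repeats this search recursively on each branch until the groups are pure or a depth limit is reached.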
Random Forest
An ensemble learning method that combines multiple
decision trees to improve performance and reduce
overfitting, by averaging their predictions (regression) or
taking a majority vote (classification).
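The combining step can be sketched on its own. Averaging applies to numeric predictions; for class labels the usual counterpart is a majority vote. The trees below are hypothetical stand-in functions, not trained models, and training each tree on a random sample of the data is omitted.

```python
from collections import Counter
from statistics import mean


def forest_classify(trees, x):
    """Majority vote over the class predicted by each tree."""
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


def forest_regress(trees, x):
    """Average of the numeric predictions made by each tree."""
    return mean(tree(x) for tree in trees)


# Hypothetical stand-ins for three trained classification trees
class_trees = [
    lambda x: "spam" if x > 5 else "ham",
    lambda x: "spam" if x > 7 else "ham",
    lambda x: "spam" if x > 3 else "ham",
]
print(forest_classify(class_trees, 6))  # two of three trees vote "spam"

# Hypothetical stand-ins for three trained regression trees
value_trees = [lambda x: 10.0, lambda x: 12.0, lambda x: 14.0]
print(forest_regress(value_trees, 0))  # 12.0
```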
Generative Model
A model that learns to generate new samples from the
same distribution as the training data, often used for
data synthesis and augmentation.
Types – ML