
Unit 1_Intro_ML

Describe Machine Learning and differentiate it from traditional programming. Compare Machine Learning with traditional programming.

1. What is Machine Learning (ML)?


Machine Learning is a branch of Artificial Intelligence (AI) that enables computers to learn
patterns from data and make predictions without being explicitly programmed for each
task.

Key Points

 Learns from Data: ML algorithms use historical data to improve performance over
time.
 No Fixed Rules: Instead of hardcoding instructions, the machine finds rules and
relationships on its own.
 Predictive Ability: Can predict outcomes for new, unseen data.
 Applications:
o Image & speech recognition (e.g., Siri, Google Photos)
o Natural Language Processing (e.g., chatbots, translation)
o Fraud detection (bank transactions)
o Recommendation systems (YouTube, Netflix, Amazon)
o Autonomous systems (self-driving cars, drones)

2. Comparison: Machine Learning vs Traditional Programming


| Aspect | Machine Learning | Traditional Programming |
|---|---|---|
| Definition | A subset of AI that creates algorithms to learn from data and make predictions. | Writing rule-based, deterministic code to solve a specific problem. |
| Approach | Data-driven: learns patterns from past data. | Rule-based: follows explicit instructions given by the programmer. |
| Learning Ability | Improves automatically with more data and training. | Cannot improve unless a human modifies the code. |
| Data Handling | Can handle large, complex datasets and find hidden patterns. | Limited to processing according to predefined rules. |
| Adaptability | Can adapt to new, unseen situations. | Works only for known, predefined scenarios. |
| Applications | Predictive analytics, chatbots, autonomous vehicles, medical diagnosis. | Building calculators, databases, websites, fixed-function software. |
| Dependency | Performance depends on quality and quantity of data. | Performance depends on quality of programmer’s logic. |
💡 Simple Analogy:

• Traditional Programming = A chef following a recipe exactly, no matter how the ingredients change.
• Machine Learning = A chef who learns from experience, adjusts the recipe based on taste tests, and improves over time.

Explain Principal Component Analysis used in Machine Learning.
1. What is PCA?

Principal Component Analysis is a dimensionality reduction technique in machine learning that reduces the number of features (variables) in a dataset while keeping the most important information.

 It transforms the original features into new, uncorrelated features called principal
components.
 These components capture the maximum variance (differences) in the data.

2. Why Use PCA?

 Too Many Features Problem: Large datasets with many features can cause:
o Slow computation
o Overfitting (model memorizes training data instead of generalizing)
 PCA solves this by removing redundant or less useful features.

3. How PCA Works (Step-by-Step)

1. Standardize the Data
o Scale all features so they have the same importance (mean = 0, variance = 1).
2. Compute the Covariance Matrix
o Shows how features vary together.
3. Find Eigenvectors & Eigenvalues
o Eigenvectors → directions of new features (principal components).
o Eigenvalues → importance (amount of variance) captured by each component.
4. Sort Components
o Keep the top components with the highest variance (information).
5. Project Data
o Convert original data into the new reduced feature space.
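The five steps above can be traced directly in NumPy. A minimal sketch (the toy data and variable names are illustrative, not from the notes):

```python
import numpy as np

# Toy dataset: 5 samples, 3 features (illustrative values)
X = np.array([[2.5, 2.4, 0.5],
              [0.5, 0.7, 1.9],
              [2.2, 2.9, 0.4],
              [1.9, 2.2, 0.8],
              [3.1, 3.0, 0.2]])

# Step 1: standardize (mean = 0, variance = 1 per feature)
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 2: covariance matrix of the standardized features
cov = np.cov(X_std, rowvar=False)

# Step 3: eigenvectors (directions) and eigenvalues (variance captured)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh suits symmetric matrices

# Step 4: sort by eigenvalue (largest first) and keep the top k components
order = np.argsort(eigvals)[::-1]
k = 2
top_vecs = eigvecs[:, order[:k]]

# Step 5: project the data into the reduced k-dimensional space
X_reduced = X_std @ top_vecs
print(X_reduced.shape)  # (5, 2)
```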

4. Advantages of PCA

• Removes Multicollinearity: Creates uncorrelated new features.
• Noise Reduction: Ignores components with little variance.
• Data Compression: Uses fewer features → faster processing.
• Outlier Detection: Unusual points stand out in reduced space.

5. Disadvantages of PCA

• Hard to Interpret: New components are combinations of old features.
• Scaling Sensitive: Must standardize data before applying PCA.
• Possible Information Loss: If too few components are kept.
• Assumes Linearity: Works best when relationships are linear.

6. Applications of PCA

• Image Compression: Reducing image size while keeping quality.
• Data Visualization: Showing high-dimensional data in 2D or 3D.
• Preprocessing for ML Models: Removing noise & redundancy before training.
• Finance: Reducing economic indicators into fewer key trends.

Explain the relationship between Artificial Intelligence, Machine Learning and Data Science.

1. Artificial Intelligence (AI)


 Definition:
AI is the broad field of making machines think and act like humans — solving
problems, making decisions, and understanding language.
 Goal:
Enable machines to perform complex tasks intelligently.
 Examples:
o Chatbots (e.g., Siri, Alexa)
o Self-driving cars
o Game-playing AI (e.g., AlphaGo)

2. Machine Learning (ML)

 Definition:
A subset of AI that focuses on teaching machines to learn from data rather than
being explicitly programmed with rules.
 Goal:
Find patterns in historical data and predict outcomes for new data.
 Examples:
o Email spam filtering
o Movie recommendations (Netflix, YouTube)
o Credit card fraud detection

3. Data Science

 Definition:
A multidisciplinary field focused on extracting insights from raw data using
statistics, mathematics, programming, and domain knowledge.
 Goal:
Understand and interpret data to help make better decisions.
 Examples:
o Market trend forecasting
o Customer segmentation
o Business performance dashboards

4. Relationship

Think of them as nested circles:

Data Science → uses ML models → which are part of AI.

• AI: The big picture — making machines act smart.
• ML: A way to achieve AI — by learning from data.
• Data Science: The process of collecting, cleaning, analyzing, and visualizing data, often using ML techniques to generate insights.

5. Comparison Table

| Aspect | Artificial Intelligence (AI) | Machine Learning (ML) | Data Science |
|---|---|---|---|
| Focus | Simulating human intelligence | Learning from data | Extracting insights from data |
| Goal | Decision-making & problem-solving | Accurate predictions | Data-driven decisions |
| Techniques Used | ML, NLP, Robotics, Computer Vision | Regression, Classification, Clustering | Statistics, ML, Data Visualization |
| Example | Voice assistants, robots | Predicting house prices | Sales trend analysis |

💡 Simple Analogy:

• AI = The concept of making a robot act like a human.
• ML = The method where the robot learns from past experiences.
• Data Science = The detective work of finding clues (insights) in data that can help the robot make better decisions.

Explain types of Machine Learning.


1. Supervised Learning

• Definition:
Works like a teacher guiding the machine — the model is trained using labeled data (data with correct answers).
• Process:
1. Feed the model a dataset where inputs have known outputs.
2. The model learns the relationship between inputs and outputs.
3. It predicts results for new, unseen data.
• Example:
o Dataset: Images of animals labeled “Elephant,” “Camel,” or “Cow.”
o Task: Predict the correct animal label for a new image.
 Applications:
o Image classification (animals, objects, scenes)
o Medical diagnosis (disease detection from scans)
o Fraud detection (flagging suspicious transactions)
o Natural Language Processing (sentiment analysis, translation)
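As a concrete illustration of this labeled-data workflow, here is a minimal scikit-learn sketch (assuming scikit-learn is available; the iris dataset stands in for any labeled data):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Labeled data: flower measurements (inputs) with known species (outputs)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The model learns the input-to-output mapping from labeled examples
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# It then predicts labels for new, unseen data
print(accuracy_score(y_test, model.predict(X_test)))
```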

2. Unsupervised Learning

• Definition:
No teacher — the machine learns from unlabeled data by finding patterns or groups.
• Process:
1. Data has no predefined labels.
2. The model groups data points based on similarities.
3. It discovers hidden structures in the data.
• Example:
o Given many photos of animals with no labels, the model groups similar-looking animals together without knowing their names.
 Applications:
o Customer segmentation (grouping similar customers)
o Anomaly detection (detecting fraud or system failures)
o Recommendation systems (finding similar items or users)
o Scientific discovery (finding hidden patterns in research data)
 Types:
o Clustering (e.g., K-Means, Hierarchical Clustering)
o Association (finding relationships between variables)
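A minimal K-Means sketch of this idea (the two-blob data is made up for illustration; assumes scikit-learn):

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: two visually obvious groups of points (illustrative)
X = np.array([[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
              [8.0, 8.0], [8.2, 7.8], [7.9, 8.3]])

# No labels are given; the algorithm groups points by similarity
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(kmeans.labels_)           # discovered group for each point
print(kmeans.cluster_centers_)  # centers of the discovered groups
```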

3. Reinforcement Learning

• Definition:
The machine learns by trial and error, receiving rewards or penalties based on its actions.
• Process:
1. An agent interacts with an environment.
2. It takes actions and receives feedback (reward or punishment).
3. The agent learns to maximize rewards over time.
• Example:
o Training a robot to walk — it gets points for moving forward, loses points if it falls.
 Applications:
o Game AI (chess, Go)
o Self-driving cars (navigating roads safely)
o Robotics (optimizing movement and tasks)

4. Summary Table

| Type | Data Type | Goal | Examples |
|---|---|---|---|
| Supervised | Labeled | Predict outcomes for new data | Spam detection, disease diagnosis |
| Unsupervised | Unlabeled | Find patterns or groups | Customer segmentation, anomaly detection |
| Reinforcement | Environment feedback | Learn best strategy through rewards | Game AI, self-driving cars |

💡 Easy Analogy:

• Supervised → Learning with a teacher.
• Unsupervised → Learning without a teacher, just exploring.
• Reinforcement → Learning by trial and error, like training a pet.
Write a note on Reinforcement Learning.
1. What is Reinforcement Learning?

 Definition:
A type of machine learning where an agent learns to make decisions by interacting
with an environment and receiving rewards or penalties based on its actions.
 Learning Style:
Trial and error — the agent tries actions, learns from the outcomes, and improves
over time.

2. Key Components

| Term | Meaning |
|---|---|
| Agent | The learner/decision-maker (e.g., a robot, software bot). |
| Environment | Everything the agent interacts with (e.g., a game world, real roads for a self-driving car). |
| Action (A) | A choice the agent can make at any point. |
| State (S) | The current situation the agent is in. |
| Reward (R) | Feedback from the environment after an action (positive for good actions, negative for bad). |
| Policy (π) | The strategy the agent follows to choose actions. |
| Value Function | Estimates how good a state or action is in terms of future rewards. |

3. How Reinforcement Learning Works (Step-by-Step)

1. Agent observes the current state of the environment.
2. Agent takes an action based on its policy.
3. Environment changes to a new state and gives a reward or penalty.
4. Agent updates its strategy to maximize future rewards.
5. Process repeats until the agent learns the best strategy.
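These five steps map directly onto tabular Q-learning, a classic RL algorithm. A minimal sketch on a made-up five-state corridor (the states, rewards, and hyperparameters are illustrative):

```python
import numpy as np

# Toy environment: states 0..4 on a line; reaching state 4 yields reward 1
n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))   # value estimate for each (state, action)
alpha, gamma, epsilon = 0.1, 0.9, 0.2

rng = np.random.default_rng(0)
for episode in range(500):
    s = 0                                         # 1. observe the starting state
    while s != 4:
        # 2. pick an action (epsilon-greedy: usually the best known, sometimes random)
        a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[s]))
        s_next = max(s - 1, 0) if a == 0 else s + 1   # 3. environment moves to a new state...
        r = 1.0 if s_next == 4 else 0.0               # ...and returns a reward
        # 4. update the strategy toward actions that maximize future rewards
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next                                    # 5. repeat from the new state

print(np.argmax(Q, axis=1))  # learned policy: states 0-3 should prefer "right" (1)
```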

4. Types of Reinforcement Learning

1. Positive Reinforcement:
o Rewards good actions to encourage them.
o Example: Giving points in a game for completing a level.
2. Negative Reinforcement:
o Removes something bad to encourage certain actions.
o Example: Stopping an annoying sound when a correct button is pressed.
5. Applications of RL

• Robotics: Teaching robots to walk, pick objects, or navigate.
• Self-driving Cars: Learning safe driving by trial and error in simulations.
• Game AI: Beating humans in games like Chess, Go, and video games.
• Industrial Automation: Optimizing production processes.

💡 Simple Analogy:
Training a dog: You reward it with treats for good behavior and withhold treats or give
mild penalties for bad behavior. Over time, the dog learns the actions that get rewards.

Differentiate supervised and unsupervised learning techniques.

Supervised vs Unsupervised Learning


| Aspect | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Definition | Model learns from labeled data (input + correct output given). | Model learns from unlabeled data (only inputs, no correct outputs). |
| Goal | Predict the output for new data based on learned mapping. | Find patterns, structures, or relationships in the data. |
| Data Type | Labeled dataset. | Unlabeled dataset. |
| Examples of Algorithms | Linear Regression, Logistic Regression, Decision Trees, Support Vector Machines. | K-Means Clustering, Hierarchical Clustering, DBSCAN, PCA. |
| Examples of Applications | Image classification (e.g., cat vs dog), medical diagnosis, spam detection. | Customer segmentation, market basket analysis, anomaly detection. |
| Training Process | Uses known outputs to compare predictions and adjust. | No known outputs — model organizes data on its own. |
| Accuracy Measurement | Can be directly measured using metrics like accuracy, precision, recall. | Harder to measure; often evaluated using clustering quality metrics (e.g., silhouette score). |
| Complexity | Generally simpler if data is labeled. | Can be more complex because patterns are unknown. |

💡 Easy Analogy:

 Supervised Learning → Like learning with a teacher who gives both questions and
correct answers.
 Unsupervised Learning → Like exploring without a teacher, finding your own
groups and patterns.
Explain Linear Discriminant Analysis (LDA) used in Machine Learning.
1. What is LDA?

 Definition:
Linear Discriminant Analysis is a supervised dimensionality reduction and
classification technique in machine learning.
 Goal:
To separate two or more classes by projecting data into a lower-dimensional space
where the classes are as distinct as possible.
 Nature:
Works well when classes are linearly separable.

2. Key Idea

 LDA finds a linear combination of features that maximizes the distance between
class means while minimizing the variation within each class.
 This makes the data points of different classes more distinguishable in the reduced
space.

3. Key Assumptions of LDA

1. Gaussian Distribution: Data for each class follows a normal distribution.
2. Equal Covariance Matrices: All classes share the same covariance structure.
3. Linear Separability: Data can be separated by a straight line (or plane in higher dimensions).

4. How LDA Works (Step-by-Step)

1. Calculate the mean for each class.
2. Compute the within-class scatter (variation inside each class) and between-class scatter (variation between class means).
3. Find the projection direction that maximizes the ratio of between-class scatter to within-class scatter.
4. Project the data onto this direction to reduce dimensions.
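A minimal scikit-learn sketch showing both roles, dimensionality reduction and classification (the iris dataset is used purely for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)  # 4 features, 3 classes

# Supervised: LDA uses the class labels to find the projection directions
lda = LinearDiscriminantAnalysis(n_components=2)
X_proj = lda.fit_transform(X, y)

print(X_proj.shape)     # (150, 2): reduced from 4 features to 2
print(lda.score(X, y))  # classification accuracy in the projected space
```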

5. Applications of LDA

 Face Recognition: Reduces high-dimensional pixel data to fewer features for faster
recognition.
 Medical Diagnosis: Classifies diseases (e.g., mild, moderate, severe) based on patient
parameters.
 Customer Segmentation: Identifies target groups most likely to purchase a product.

6. Advantages

• Improves classification accuracy when assumptions hold.
• Reduces computation time by lowering dimensions.
• Works well with small datasets compared to some ML methods.

7. Disadvantages

• Performs poorly if classes are not linearly separable.
• Assumes normal distribution and equal covariance, which may not hold for real-world data.

💡 Simple Analogy:
Imagine you have a crowd of people wearing red or blue shirts mixed together in a room.
LDA finds the best direction to look from so that red shirts appear on one side and blue
shirts on the other.
Differentiate Grouping and Grading models of Machine Learning.
Grouping vs Grading in Machine Learning

| Aspect | Grouping | Grading |
|---|---|---|
| Definition | Clustering similar data points into categories without prior labels. | Assigning labels or scores to data points or clusters based on predefined criteria. |
| Learning Type | Unsupervised Learning | Supervised Learning |
| Goal | Discover hidden patterns or natural groupings in data. | Classify or rank data into known categories. |
| Input Data | Unlabeled data. | Labeled data. |
| Output | Groups (clusters) where items inside each group are similar. | Class labels or grades assigned to each item or group. |
| Examples of Algorithms | K-Means Clustering, Hierarchical Clustering, DBSCAN | Logistic Regression, Decision Trees, Support Vector Machines |
| Applications | Customer segmentation, market basket analysis, anomaly detection | Spam detection, image classification, fraud detection |
| Example Scenario | Group customers into segments based on buying behavior. | Assign loyalty levels (Gold, Silver, Bronze) to each customer segment. |
| Relationship | Often done first to discover structure in the data. | Often follows grouping to label or score the discovered groups. |

💡 Easy Analogy:

 Grouping = Sorting fruits into baskets based on similarity (shape, color) without
knowing their names.
 Grading = Labeling each basket as "Apples," "Oranges," or "Bananas" after
grouping.
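The grouping-then-grading workflow from the table can be sketched in a few lines (all data, feature meanings, and segment labels are invented for illustration; assumes scikit-learn):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

# Unlabeled customer features: [monthly spend, visits per month] (illustrative)
X = np.array([[500, 12], [520, 10], [80, 2], [90, 3], [300, 6], [310, 7]])

# Grouping: discover segments without any labels (unsupervised)
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Grading: learn to assign the discovered segment labels to customers (supervised)
grader = DecisionTreeClassifier(random_state=0).fit(X, segments)
print(grader.predict([[450, 11]]))  # segment predicted for a new customer
```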
What is Dimensionality Reduction? Explain any one Dimensionality Reduction technique.
1. What is Dimensionality Reduction?

• Definition:
The process of reducing the number of input features (variables) in a dataset while keeping the most important information.
• Goal:
o Remove noise and redundancy.
o Speed up computation.
o Avoid overfitting in machine learning models.
o Make high-dimensional data easier to visualize and analyze.
• When Needed:
o When datasets have too many features (high-dimensional data).
o When features are highly correlated.

2. Benefits of Dimensionality Reduction

• Faster Computation: Fewer features mean quicker training.
• Less Overfitting: Reduces the chance of memorizing noise in the data.
• Better Visualization: Easier to plot in 2D/3D.
• Removes Multicollinearity: Creates uncorrelated new features.

3. Example Technique – Principal Component Analysis (PCA)

Definition:

A linear dimensionality reduction method that transforms the original features into a new set
of uncorrelated variables called principal components, arranged so the first few keep most
of the variance in the data.

How PCA Works (Step-by-Step)

1. Standardize the Data: Make all features comparable in scale.
2. Compute the Covariance Matrix: Shows how features vary together.
3. Find Eigenvalues & Eigenvectors:
o Eigenvectors = directions of maximum variance (principal components).
o Eigenvalues = amount of variance each principal component explains.
4. Sort & Select Components: Keep top components that explain the most variance.
5. Project Data: Transform the dataset into the new reduced feature space.
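In practice all five steps are one call in scikit-learn, and the explained variance ratio shows how much information the kept components retain. A minimal sketch (the digits dataset and the choice of 10 components are illustrative):

```python
from sklearn.datasets import load_digits
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)        # 64 pixel features per image
X_std = StandardScaler().fit_transform(X)  # step 1: standardize

pca = PCA(n_components=10)            # steps 2-4: covariance, eigen, sort, select
X_reduced = pca.fit_transform(X_std)  # step 5: project

print(X_reduced.shape)                      # (1797, 10): 64 features -> 10
print(pca.explained_variance_ratio_.sum())  # variance share kept by 10 components
```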

Advantages of PCA

• Removes redundant information.
• Makes datasets smaller and easier to handle.
• Helps in visualization of large datasets.

Disadvantages of PCA

• Harder to interpret transformed components.
• Assumes linear relationships.
• Can lose some information if too few components are chosen.
💡 Simple Analogy:
Think of PCA as summarizing a 500-page book into 20 key pages that still tell the main
story without unnecessary details.

Explain parametric & nonparametric models in machine learning.


1. Parametric Models

Definition

 Models that assume a specific functional form for the relationship between input
and output.
 Have a fixed number of parameters that don’t change with dataset size.

Key Characteristics

• Assumptions: Strong assumptions about data distribution (e.g., linear, Gaussian).
• Parameters: Fixed in number, determined during training.
• Simplicity: Easier to interpret and train.
• Data Requirement: Works well with smaller datasets.
• Flexibility: Less flexible; may fail if the assumed model form is wrong.

Examples

 Linear Regression
 Logistic Regression
 Naive Bayes
 Support Vector Machine (with linear kernel)

When to Use

• When the data is well-understood and follows a simple pattern.
• When speed and interpretability are important.

2. Non-Parametric Models

Definition
 Models that do not assume a fixed functional form.
 Number of parameters can grow with the dataset size.

Key Characteristics

• Assumptions: Few or no assumptions about data distribution.
• Parameters: Flexible; adapt to data complexity.
• Simplicity: More complex to interpret.
• Data Requirement: Needs more data for good performance.
• Flexibility: Can model complex, non-linear relationships.

Examples

• K-Nearest Neighbors (KNN)
• Decision Trees
• Random Forests
• Support Vector Machines (with non-linear kernels)

When to Use

• When the data pattern is unknown or complex.
• When accuracy is more important than interpretability.

3. Summary Table

| Feature | Parametric | Non-Parametric |
|---|---|---|
| Assumptions | Strong assumptions about distribution | Few or no assumptions |
| Parameters | Fixed number | Flexible number |
| Complexity | Simpler | More complex |
| Data Needed | Less | More |
| Interpretability | Higher | Lower |
| Flexibility | Less flexible | More flexible |
| Examples | Linear Regression, Logistic Regression | KNN, Decision Trees, Random Forest |
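A minimal sketch contrasting the two families on the same data: Linear Regression commits to a straight line (fixed parameters), while K-Nearest Neighbors lets the training data dictate the shape (the sine-wave data and seed are illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

# Non-linear ground truth: y = sin(x) plus noise (illustrative)
rng = np.random.default_rng(0)
X = np.linspace(0, 6, 100).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 100)

# Parametric: assumes y = w*x + b, two parameters regardless of data size
linear = LinearRegression().fit(X, y)

# Non-parametric: keeps the training points; complexity grows with the data
knn = KNeighborsRegressor(n_neighbors=5).fit(X, y)

print(linear.score(X, y))  # low R^2: the linear assumption is wrong here
print(knn.score(X, y))     # much higher: no fixed functional form assumed
```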

💡 Easy Analogy:

 Parametric Model = A pre-shaped cookie cutter — you assume the cookie will be
round.
 Non-Parametric Model = A freehand dough shaper — you adapt the shape to the
dough each time.

Elaborate on the models of machine learning.


1. Geometric Models

 Definition:
Represent data using geometric concepts like points, lines, surfaces, and volumes.
 Goal:
Use spatial relationships and shapes to separate or classify data.
 Applications:
1. Computer Vision – Object detection, image segmentation, 3D reconstruction.
2. 3D Modeling – CAD design, virtual environments.
3. Molecular Modeling – Analyzing molecule shapes and properties.
4. Graph Analysis – Representing data as networks for relationship analysis.

2. Probabilistic Models

• Definition:
Statistical models that capture uncertainty in data and use probability for predictions.
• Types:
1. Generative Models – Model the joint distribution of inputs & outputs, can generate new data (e.g., Naive Bayes, GANs).
2. Discriminative Models – Model the boundary between classes for prediction (e.g., Logistic Regression, SVM).
3. Graphical Models – Use graphs to represent variable dependencies (e.g., Bayesian Networks).
• Applications:
o Image and speech recognition
o Natural language processing
o Recommendation systems

3. Logical Models

• Definition:
Use logical expressions (IF–THEN rules) to divide data and make decisions.
• Types:
1. Tree Models – Decision Trees where each node is a rule.
2. Rule Models – Explicit IF–THEN statements for classification.
3. Neural Logic Networks – Combine neural networks with logic reasoning.
• Applications:
o Medical diagnosis
o Fraud detection
o Financial modeling
o Expert systems
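Tree models make the IF–THEN structure visible. A minimal sketch printing the learned rules of a small decision tree (scikit-learn's export_text; the iris data is illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Each root-to-leaf path is one IF-THEN rule over the features
print(export_text(tree, feature_names=list(iris.feature_names)))
```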

4. Grouping (Clustering) Models

 Definition:
Unsupervised methods that group similar data points into clusters.
 Examples:
o K-Means Clustering – Groups based on distance to cluster centers.
o Hierarchical Clustering – Builds a hierarchy of clusters.
o DBSCAN – Groups dense areas, marks outliers separately.
 Applications:
o Customer segmentation
o Anomaly detection
o Document clustering

5. Grading (Classification) Models

 Definition:
Assign labels or scores to data points based on predefined categories (supervised
learning).
 Examples:
o Logistic Regression
o Decision Trees
o Support Vector Machines
 Applications:
o Spam detection
o Image classification
o Fraud detection

💡 Relationship Between Grouping & Grading:

• Often grouping comes first to discover patterns.
• Then grading labels these groups for prediction.
