
Machine Learning Process

By,
Mrs. P. Aileen Chris
AP(SS), CSE, HITS
Introduction to Machine Learning Process

• Machine learning involves building systems that learn from data.

• This process is iterative and includes several key steps to ensure models perform well and stay relevant.



How???



Raw Data Golden Model



1. Defining the Problem

• Clearly state the real-world problem.

• Identify what predictions or insights are needed.

• Consider trade-offs: false positives vs. false negatives.

• Set success criteria (accuracy, precision, etc.).
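The trade-off between false positives and false negatives can be made concrete by attaching a cost to each error type. The sketch below is only an illustration: the error counts, the cost values, and the function name expected_cost are all hypothetical and are not taken from the slides.

```python
def expected_cost(false_positives, false_negatives, fp_cost, fn_cost):
    """Total cost of errors, given a per-error cost for each error type."""
    return false_positives * fp_cost + false_negatives * fn_cost


# Hypothetical error counts for two candidate spam filters.
filter_a = {"false_positives": 5, "false_negatives": 40}
filter_b = {"false_positives": 20, "false_negatives": 10}

# Assumed costs: blocking a real email (false positive) is treated as
# ten times worse than letting one spam message through (false negative).
cost_a = expected_cost(**filter_a, fp_cost=10.0, fn_cost=1.0)
cost_b = expected_cost(**filter_b, fp_cost=10.0, fn_cost=1.0)

print(cost_a, cost_b)  # 90.0 210.0
```

Under these assumed costs, filter A is preferable even though it makes more errors in total (45 vs. 30), which is why the success criteria should be chosen before modeling starts.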



2. Data Collection and Preparation

• Gather data from relevant sources.

• Clean and preprocess: handle missing values, remove noise.

• Perform feature engineering to improve model input quality.
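As a minimal sketch of the cleaning and feature engineering steps, the example below fills missing numeric values with the mean of the observed ones and derives a word-count feature from raw text. Mean imputation is just one common choice, and the function names and the sample records are hypothetical.

```python
from statistics import mean


def impute_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def add_word_count(records):
    """Feature engineering: add a 'word_count' field derived from 'text'."""
    return [{**r, "word_count": len(r["text"].split())} for r in records]


# Hypothetical raw data with one missing measurement.
ages = impute_missing([1.0, None, 3.0])
emails = add_word_count([{"text": "win a prize"}, {"text": "see you monday at noon"}])

print(ages)                               # [1.0, 2.0, 3.0]
print([e["word_count"] for e in emails])  # [3, 5]
```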



3. Model Selection and Training

• Choose a model based on the problem and data.

• Split data into training, validation, and testing sets.

• Train the model by minimizing error on training data.

• Use validation data for hyperparameter tuning.
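The three-way split can be sketched in a few lines of standard-library Python. The 60/20/20 proportions, the fixed seed, and the function name train_val_test_split are assumptions chosen for illustration; the slides do not prescribe specific ratios.

```python
import random


def train_val_test_split(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly, then cut them into three disjoint sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    n_val = int(len(rows) * val_fraction)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test


train, val, test = train_val_test_split(range(10))
print(len(train), len(val), len(test))  # 6 2 2
```

The test set is set aside and used only once, for the final evaluation. Tuning hyperparameters against it would make the reported performance overly optimistic.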



4. Model Evaluation and Tuning

• Evaluate using metrics like accuracy, precision, recall, F1-score.

• Analyze performance and identify improvements.

• Tune hyperparameters or try new models if needed.
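The four metrics named above follow directly from the counts of true positives, false positives, and false negatives. The sketch below computes them for a binary problem. The labels are made-up sample data, and returning 0.0 when a denominator is zero is a convention assumed here.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = spam, 0 = not spam.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics["accuracy"])  # 0.75
```

Here precision and recall are both 2/3 (two true positives, one false positive, one false negative), so F1 is also 2/3, while accuracy is 0.75.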



5. Deployment and Monitoring

• Deploy the model to a production environment.

• Monitor performance in real time.

• Retrain periodically with new data to maintain relevance.
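One simple way to monitor a deployed model is to track accuracy over a sliding window of recent predictions and flag the model for retraining when it drops below a threshold. The class below is a hypothetical sketch of that idea, not a production monitoring setup. The window size and the threshold are assumed values.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent predictions (hypothetical helper)."""

    def __init__(self, window_size=100, min_accuracy=0.9):
        self.outcomes = deque(maxlen=window_size)  # oldest entries drop off
        self.min_accuracy = min_accuracy

    def record(self, prediction, actual):
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        # Only decide once the window is full, to avoid reacting to noise.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy


monitor = AccuracyMonitor(window_size=4, min_accuracy=0.75)
for prediction, actual in [(1, 1), (0, 0), (1, 1), (0, 0)]:
    monitor.record(prediction, actual)
healthy = monitor.needs_retraining()   # False: 4 of 4 correct

for prediction, actual in [(1, 0), (0, 1)]:
    monitor.record(prediction, actual)
degraded = monitor.needs_retraining()  # True: only 2 of the last 4 correct
```

This assumes the true label eventually becomes available for each prediction, which in practice may arrive with a delay (for example, when a user marks a message as spam).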



ML: The Process



Things to know about ML

• The machine learning process isn't linear.

• You may need to revisit previous steps:

    • Poor performance? Collect more data.

    • New patterns? Retrain the model.

    • Model drift? Monitor and update.



Common Pitfalls to Avoid

• Using biased or incomplete data.

• Overfitting the model to training data.

• Ignoring real-world deployment constraints.

• Lack of continuous monitoring.
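Overfitting, from the list above, can be illustrated with a deliberately extreme case: a "model" that simply memorizes its training examples. It scores perfectly on the training data but does no better than guessing on unseen data. The data set (label is 1 for odd numbers) and the helper names are hypothetical.

```python
def accuracy(model, rows):
    """Fraction of (input, label) rows the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)


def make_memorizer(train_rows, default=0):
    """A model that looks up training inputs and guesses `default` otherwise."""
    table = dict(train_rows)
    return lambda x: table.get(x, default)


# Hypothetical task: label is 1 when the number is odd.
train_rows = [(1, 1), (2, 0), (3, 1), (4, 0)]
val_rows = [(5, 1), (6, 0), (7, 1), (8, 0)]

memorizer = make_memorizer(train_rows)
train_acc = accuracy(memorizer, train_rows)
val_acc = accuracy(memorizer, val_rows)

print(train_acc, val_acc)  # 1.0 0.5
```

A large gap between training and validation accuracy, as here, is the usual warning sign that a model has fit the training data rather than the underlying pattern.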



Real-World Example

Example: Spam Email Detection


Problem: Identify spam emails.
Data: Email text, sender, subject.
Model: Naive Bayes classifier.
Evaluation: Accuracy, false positive rate.
Deployment: Filter in email service.
Monitoring: Update with new spam trends.
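To make the example concrete, here is a minimal from-scratch sketch of a multinomial Naive Bayes spam filter with add-one (Laplace) smoothing, using only the email text. The class name, the four training messages, and the whitespace tokenization are all assumptions made for illustration. In practice a library implementation, such as the one in scikit-learn, would normally be used.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes over words, with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        return self

    def _log_score(self, words, label):
        # log P(label) + sum of log P(word | label), smoothed by adding one.
        total_docs = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total_docs)
        counts = self.word_counts[label]
        denominator = sum(counts.values()) + len(self.vocabulary)
        for word in words:
            if word in self.vocabulary:  # skip words never seen in training
                score += math.log((counts[word] + 1) / denominator)
        return score

    def predict(self, text):
        words = text.lower().split()
        return max(("spam", "ham"), key=lambda label: self._log_score(words, label))


# Tiny hypothetical training set; "ham" means a legitimate email.
texts = [
    "win a free prize now",
    "free money click now",
    "meeting agenda for monday",
    "lunch with the team on monday",
]
labels = ["spam", "spam", "ham", "ham"]

spam_filter = NaiveBayesSpamFilter().fit(texts, labels)
print(spam_filter.predict("free prize money"))     # spam
print(spam_filter.predict("team meeting monday"))  # ham
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing keeps a single unseen word from forcing a class probability to zero.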



Tools and Technologies

• Languages: Python, R

• Libraries: scikit-learn, TensorFlow, PyTorch

• Platforms: AWS, GCP, Azure

• Tools: Jupyter Notebooks, MLflow, Airflow



Group Activity Idea

Task: Design an ML system for predicting student performance.

Groups define the problem, identify the data needed, and choose a model.

Present the approach to the class and get feedback.



Q&A / Wrap-up

Questions to discuss:

Which step do you think is most important?

Have you encountered ML in your daily life?

Summary:

Define ➜ Prepare Data ➜ Train ➜ Evaluate ➜ Deploy ➜ Monitor

