MSBA 315
Machine Learning
&
Predictive Analytics
Wael Khreich
[email protected]
Content
• Machine Learning versus Artificial Intelligence
• Machine Learning versus Statistics
• Machine Learning Applications
• Machine Learning Approaches
• The Machine Learning Pipeline
• First Activities
Artificial Intelligence
• Artificial Intelligence aims to create intelligent machines that are able
to perform intellectual tasks, which normally require human
intelligence:
• Reasoning, Planning and Solving Problems
• Understanding languages/abstract concepts
• Learning from few examples
• Interacting with the world
• Generalizing acquired concepts
• AGI, general AI, strong AI, full AI, etc.
• Today, general AI remains in science fiction novels and movies
• When should we expect (general) AI?
• The answer ranges from (hopefully) never to a couple of years
Beyond AI
Machine Learning vs. AI
• Narrow AI, Applied AI, or Weak AI focuses on performing a single task
extremely well, which may even exceed human performance
• Practical applications are everywhere today, ranging from speech and
image recognition, to the prediction of health outcomes, stock prices,
and consumer behavior
• Machine learning is a central pillar of narrow AI that focuses on
learning from data …
• Other fields include Robotics, Natural Language Processing, and
Computer Vision
What is Machine Learning?
Science of getting computer systems to:
• Learn from data without being explicitly
programmed
• Generalize to unseen examples
• Improve from past experience
Machine Learning Scope
Machine Learning Versus Statistics
Machine Learning
• Build software systems that make predictions based on:
• large amounts of data
• high-dimensional problems
• Often few assumptions
• future data will be generated by the same process
• Performance evaluation on data
• validation sets, cross validation, test sets, etc.
Statistics
• Formal statistical inference (tests, confidence intervals, etc.) relies on:
• moderate amounts of data
• low-dimensional problems
• Many restrictive assumptions
• linear relationships, normally distributed errors, independence
• Mathematically rigorous conclusions
• confidence intervals, hypothesis tests, etc.
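Evaluating performance on held-out data, as in cross validation, can be sketched in a few lines. This is a minimal illustration in plain Python, not course material: the toy labels, the `k_fold_indices` helper, and the majority-class "model" are hypothetical and chosen only to show the mechanics of training on some folds and scoring on the remaining one.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def majority_class(labels):
    """A deliberately trivial 'model': always predict the most common training label."""
    return max(sorted(set(labels)), key=labels.count)


def cross_validate(labels, k):
    """Return the held-out accuracy of the majority-class model on each fold."""
    scores = []
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        # Train only on the examples that are NOT in the current test fold.
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        prediction = majority_class(train_labels)
        correct = sum(1 for i in test_idx if labels[i] == prediction)
        scores.append(correct / len(test_idx))
    return scores


# Hypothetical labels (A = apple, P = pear), used only for illustration.
labels = ["A", "A", "P", "A", "P", "A", "A", "P", "A", "A"]
scores = cross_validate(labels, k=5)
mean_score = sum(scores) / len(scores)
print(scores, mean_score)
```

Each example is used for testing exactly once, so the mean of the fold scores estimates how the model would do on data it has not seen.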
What about Machine Learning Engineer Versus:
• Data Scientist
• Data Analyst
• Data Engineer
• MLOps
• See videos by codebasics:
• Data Scientist vs Machine Learning Engineer | DS vs ML
• Data Analyst vs Data Engineer vs Data Scientist
Why do we Need Machine Learning?
Machine learning is particularly useful for automating tasks
• Where it is difficult to describe the solution
• Where each example needs a different rule
• e.g., given a picture, determine whether there is a dog in the image
• Where the desired outputs keep changing over time
Get insights about complex problems and large amounts of data
The Exponential Growth of Data
[Figure: bar chart of the estimated volume of data created worldwide from 2010 to 2017, projected to 2025. Vertical axis: data volume in zettabytes, 0 to 200.]
Bar values in chart order (zettabytes): 2, 5, 6.5, 9, 12.5, 15.5, 18, 26, 33, 41, 50.5, 64.5, 79.5, 101, 129.5, 175
1 zettabyte = 1 billion terabytes (1,000,000,000,000,000,000,000 bytes)
Source(s): IDC; Seagate; Statista estimates; ID 871513
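The chart's endpoints allow a quick check of how fast the volume grows. The sketch below assumes that the smallest value (2 zettabytes) belongs to 2010 and the largest (175 zettabytes) to 2025, as the caption suggests; the variable names are illustrative.

```python
# Endpoints read from the chart (assumption: 2 ZB is 2010, 175 ZB is 2025).
start_year, start_zb = 2010, 2.0
end_year, end_zb = 2025, 175.0

# Unit check: 1 zettabyte = 10**21 bytes and 1 terabyte = 10**12 bytes.
BYTES_PER_ZB = 10**21
BYTES_PER_TB = 10**12
terabytes_per_zb = BYTES_PER_ZB // BYTES_PER_TB  # one billion

# Compound annual growth rate: the constant yearly factor that turns
# start_zb into end_zb over the given number of years.
years = end_year - start_year
cagr = (end_zb / start_zb) ** (1 / years) - 1

print(f"1 ZB = {terabytes_per_zb:,} TB")
print(f"Implied average growth: {cagr:.1%} per year over {years} years")
```

Under these assumptions the implied growth is roughly a third per year, which is what makes the curve look exponential rather than linear.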
This is what happened in one internet minute
Handwritten Digit Classification
Handwriting Recognition
Biometric Recognition
Face and Facial Expression Recognition
Automatic Lip Reading
Information Retrieval
Search and ranking
• less about listing products that match keywords
• more about contextual prediction of what customers might want to see now
Search by image
• decreasing the time/effort to find specific items
Recommendation Systems
Current recommendation systems can recommend items based on
similarity, recency, popularity, profitability, availability,
expiration date, need, etc.
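Recommending by similarity can be sketched as ranking items by the cosine similarity of their feature vectors. This is only an illustration: the catalog, the three feature scores, and the `most_similar` helper are hypothetical, and production systems combine many more signals than this.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(target, catalog):
    """Return the other catalog items, sorted from most to least similar to target."""
    others = [name for name in catalog if name != target]
    return sorted(
        others,
        key=lambda name: cosine_similarity(catalog[target], catalog[name]),
        reverse=True,
    )


# Hypothetical items described by made-up feature scores
# (sporty, formal, outdoor); not data from the course.
catalog = {
    "running shoes": [0.9, 0.1, 0.7],
    "hiking boots": [0.6, 0.0, 1.0],
    "dress shoes": [0.1, 1.0, 0.0],
}

ranking = most_similar("running shoes", catalog)
print(ranking)
```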
Google DeepMind: AlphaGo (2016)
Approximately:
• 10^120 possible Chess games
• 10^170 possible Go games
• 10^80 atoms in the universe
Sophia by Hanson Robotics (2020)
https://youtu.be/Sq36J9pNaEo
IBM’s Watson: Question Answering 12 Years Ago
• Question: When hit by electrons, a phosphor gives off
electromagnetic energy in this form……?
• Answer: Light (or Photons)
http://www.ibm.com/smarterplanet/us/en/ibmwatson/assets/img/tech/img-video-jeopardy.jpg
Watson won Jeopardy! in 2011
OpenAI - 2022
MetaAI - 2022
Sora - 2024
Autonomous Navigation Systems
Autonomous Military Systems
Many Other Applications
• Anomaly Detection and Fraud Detection:
• Credit card transactions
• Computer security: System call sequences, network packets, etc.
• Healthcare
• Finance
•…
How Machine Learning Solves Such Problems
Traditional Programming
Data + Program → Output
Machine Learning
Data + Output → Program
By:
▪ Learning the underlying structures of the training data
▪ Generalizing beyond training examples
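The contrast between the two paradigms can be sketched with a one-feature classifier. In the first function a person writes the rule; in the second the rule (a threshold) is derived from labeled examples. The data, the hand-picked threshold, and the function names are hypothetical, and the midpoint search is just one simple way to learn a threshold.

```python
def rule_based_is_apple(weight_g):
    """Traditional programming: a person writes the rule (threshold chosen by hand)."""
    return weight_g >= 75


def learn_threshold(weights, labels):
    """Machine learning: derive the rule from labeled examples.

    Tries each midpoint between neighboring sorted weights and keeps the
    threshold that classifies the most training examples correctly.
    """
    values = sorted(set(weights))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_correct = None, -1
    for t in candidates:
        correct = sum((w >= t) == is_apple for w, is_apple in zip(weights, labels))
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


# Hypothetical training data: weight in grams, True = apple, False = pear.
weights = [80, 85, 90, 60, 65, 70]
labels = [True, True, True, False, False, False]

# Data + Output -> Program: the learned threshold defines the classifier.
threshold = learn_threshold(weights, labels)


def learned_is_apple(weight_g):
    """Classifier built from the learned threshold."""
    return weight_g >= threshold


# Generalizing beyond training examples: 88 g and 62 g were never seen in training.
print(threshold, learned_is_apple(88), learned_is_apple(62))
```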
ML Approaches
• Supervised learning:
• Class label is known (apple or pear)
• Learn a function from input features (X) to output labels (y)
X     W (gr)   L (cm)   Y
X1    80       2.5      A
X2    70       3        P
…
[Figure: plot of the examples with axes Weight and Length]
• Unsupervised learning:
• Class label is unknown
• Learn to group data with similar patterns (discover structures)
X     W (gr)   L (cm)
X1    80       2.5
X2    70       3
…
[Figure: plot of the examples with axes Weight and Length]
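Both approaches can be sketched on the same weight and length features. The example below uses a nearest-neighbor rule for the supervised case and k-means with two clusters for the unsupervised case; these are illustrative choices, not methods prescribed by the slides. The first two rows are the X1 and X2 examples from the tables, and the other two rows are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_neighbor_predict(train_X, train_y, x):
    """Supervised: return the label of the closest labeled training example."""
    best = min(range(len(train_X)), key=lambda i: squared_distance(train_X[i], x))
    return train_y[best]


def two_means(points, iterations=10):
    """Unsupervised: k-means with k=2, seeded with the first and last point."""
    centers = [points[0], points[-1]]
    assignment = []
    for _ in range(iterations):
        # Assign each point to its nearest center.
        assignment = [
            min((0, 1), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Move each center to the mean of the points assigned to it.
        for c in (0, 1):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment


# (weight in grams, length in cm). Rows 1-2 come from the tables above;
# rows 3-4 are hypothetical additions for illustration.
X = [(80, 2.5), (70, 3.0), (82, 2.6), (68, 3.2)]
y = ["A", "P", "A", "P"]  # labels are available only in the supervised setting

prediction = nearest_neighbor_predict(X, y, (79, 2.4))  # uses X and y
clusters = two_means(X)                                 # uses X only
print(prediction, clusters)
```

The cluster numbers carry no meaning by themselves: the unsupervised method only says which examples belong together, not which group is the apples.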
ML Approaches
Reinforcement Learning
• Learning a policy (mapping of situations to actions)
• Learning through rewards
• Rewards are (usually) delayed
• Can be combined with supervised deep learning techniques
• From predictive to prescriptive (making decisions)
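Learning a policy from delayed rewards can be sketched with tabular Q-learning. The corridor environment, the hyperparameter values, and the function names below are hypothetical and chosen for illustration: the agent starts at state 0 and receives a reward only when it reaches the last state.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical corridor environment: reward 1 only on reaching the goal."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)          # explore
            else:
                best = max(q[state])                  # exploit, random tie-break
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # The target propagates the delayed reward back one step at a time.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Policy: the mapping from each non-goal state to its highest-valued action.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The learned policy picks "move right" in every state, even though only the final step is ever rewarded directly; the discount factor `gamma` is what makes states closer to the goal more valuable.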
Machine Learning Pipeline
First Activities
• Read and watch additional materials (posted on Moodle)
• Practice lab material with Colab
• Start forming groups and think about project ideas