Introduction to Machine Learning
(CSCI-UA.0480-002)
David Sontag
New York University
Slides adapted from Luke Zettlemoyer, Pedro Domingos, and
Carlos Guestrin
Logistics
• Class webpage:
– [Link]
– Sign up for mailing list!
• Office hours:
– Tuesday 3:30-4:30pm and by appointment.
– 715 Broadway, 12th floor, Room 1204
• Grader: Jinglun Dong
– Email: jinglundong@[Link]
Evaluation
• About 7 homeworks (50%)
– Both theory and programming
– See collaboration policy on class webpage
• Midterm & final exam (45%)
• Course participation (5%)
Prerequisites
• Basic algorithms (CS 310)
– Dynamic programming, algorithmic analysis
• Linear algebra (Math 140)
– Matrices, vectors, systems of linear equations
– Eigenvectors, matrix rank
– Singular value decomposition
• Multivariable calculus (Math 123)
– Derivatives, integration, tangent planes
– Lagrange multipliers
• Probability (Math 233 or 235)
Source Materials
Optional textbooks:
• C. Bishop, Pattern Recognition and Machine
Learning, Springer, 2007
• K. Murphy, Machine Learning: a Probabilistic
Perspective, MIT Press, 2012
A Few Quotes
• A breakthrough in machine learning would be worth ten
Microsofts (Bill Gates, Chairman, Microsoft)
• Machine learning is the next Internet (Tony Tether, former
director, DARPA)
Machine learning is the hot new thing (John Hennessy, President,
Stanford)
• Web rankings today are mostly a matter of machine
learning (Prabhakar Raghavan, former Dir. Research, Yahoo)
• Machine learning is going to result in a real revolution (Greg
Papadopoulos, former CTO, Sun)
• Machine learning is today’s discontinuity (Jerry Yang, former
CEO, Yahoo)
What is Machine Learning?
(by examples)
Classification
from data to discrete classes
Spam filtering
data → prediction: Spam vs. Not Spam
Object detection
Example training images for each orientation
Weather prediction
Regression
predicting a numeric value
Stock market
Weather prediction revisited
Temperature: 72° F
Ranking
comparing items
Web search
Given image, find similar images
[Link]
Collaborative Filtering
Recommendation systems
Machine learning competition with a $1 million prize
Clustering
discovering structure in data
Clustering Data: Group similar things
Clustering images
Set of Images
[Goldberger et al.]
Clustering web search results
Embedding
visualizing data
Embedding images
• Images have thousands or millions of pixels.
• Can we give each image a coordinate, such that similar images are near each other?
[Saul & Roweis ‘03]
Embedding words
[Joseph Turian]
Embedding words (zoom in)
[Joseph Turian]
Structured prediction
from data to structured outputs
Speech recognition
Natural language processing
"I need to hide a body" → noun, verb, preposition, …
Growth of Machine Learning
• Machine learning is the preferred approach to
– Speech recognition, Natural language processing
– Computer vision
– Medical outcomes analysis
– Robot control
– Computational biology
– Sensor networks
– …
• This trend is accelerating
– Improved machine learning algorithms
– Improved data capture, networking, faster computers
– Software too complex to write by hand
– New sensors / IO devices
– Demand for self-customization to user, environment
Supervised Learning: find f
• Given: Training set {(x_i, y_i) | i = 1 … n}
• Find: A good approximation to f : X → Y
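To make the definition concrete, here is a minimal Python sketch: learn consumes (x, y) pairs and returns an approximation f_hat of f. The 1-nearest-neighbor rule used here is just one illustrative choice of hypothesis space, not one the slides prescribe.

def learn(train):
    # train is a list of (x, y) pairs; returns f_hat approximating f.
    def f_hat(x):
        # 1-nearest neighbor: copy the label of the closest training
        # input under squared Euclidean distance.
        def dist(p):
            return sum((a - b) ** 2 for a, b in zip(x, p))
        _, nearest_y = min(train, key=lambda pair: dist(pair[0]))
        return nearest_y
    return f_hat

f_hat = learn([((0.0, 0.0), "ham"), ((5.0, 5.0), "spam")])
print(f_hat((4.0, 6.0)))  # -> spam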
Examples: what are X and Y?
• Spam Detection
– Map email to {Spam, Not Spam}
• Digit recognition
– Map pixels to {0,1,2,3,4,5,6,7,8,9}
• Stock Prediction
– Map new, historic prices, etc. to ℝ (the real numbers)
Example: Spam Filter
• Input: email
• Output: spam/ham
• Setup:
– Get a large collection of example emails, each labeled spam or ham
– Note: someone has to hand label all this data!
– Want to learn to predict labels of new, future emails
• Features: The attributes used to make the ham / spam decision
– Words: FREE!
– Text Patterns: $dd, CAPS
– Non-text: SenderInContacts
– …
Example emails:
"Dear Sir. First, I must solicit your confidence in this transaction, this is by virture of its nature as being utterly confidencial and top secret. …"
"TO BE REMOVED FROM FUTURE MAILINGS, SIMPLY REPLY TO THIS MESSAGE AND PUT "REMOVE" IN THE SUBJECT. 99 MILLION EMAIL ADDRESSES FOR ONLY $99"
"Ok, Iknow this is blatantly OT but I'm beginning to go insane. Had an old Dell Dimension XPS sitting in the corner and decided to put it to use, I know it was working pre being stuck in the corner, but when I plugged it in, hit the power nothing happened."
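A small Python sketch of how the three feature types above might be computed; the function and feature names are illustrative, not from the slides.

import re

def extract_features(text, sender, contacts):
    return {
        "word_FREE": "FREE" in text,                               # Words
        "pattern_dollars": bool(re.search(r"\$\d+", text)),        # $dd
        "pattern_caps": bool(re.search(r"\b[A-Z]{4,}\b", text)),   # CAPS
        "sender_in_contacts": sender in contacts,                  # Non-text
    }

print(extract_features("99 MILLION EMAIL ADDRESSES FOR ONLY $99",
                       sender="bulk@example.com", contacts=set()))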
Example: Digit Recognition
• Input: images / pixel grids
• Output: a digit 0-9
• Setup:
– Get a large collection of example images, each labeled with a digit
– Note: someone has to hand label all this data!
– Want to learn to predict labels of new, future digit images
• Features: The attributes used to make the digit decision
– Pixels: (6,8)=ON
– Shape Patterns: NumComponents, AspectRatio, NumLoops
– …
[Example images: handwritten digits labeled 0, 1, 2, 1, and a new image to label: ??]
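The pixel features can be read straight off the grid. A tiny Python sketch, with a toy 0/1 list-of-lists standing in for a real image:

def pixel_features(image):
    # One binary feature per grid cell: "pixel (r, c) is ON".
    return {(r, c): image[r][c] == 1
            for r in range(len(image))
            for c in range(len(image[0]))}

toy_image = [[0, 1], [1, 1]]              # 2x2 stand-in for a pixel grid
print(pixel_features(toy_image)[(0, 1)])  # True: pixel (0, 1) is ON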
Important Concepts
• Data: labeled instances, e.g. emails marked spam/ham
– Training set
– Held-out set (sometimes called validation set)
– Test set
• Features: attribute-value pairs which characterize each x
• Experimentation cycle
– Select a hypothesis f to best match training set
– (Tune hyperparameters on held-out or validation set)
– Compute accuracy on test set
– Very important: never peek at the test set!
• Evaluation
– Accuracy: fraction of instances predicted correctly
• Overfitting and generalization
– Want a classifier which does well on test data
– Overfitting: fitting the training data very closely, but not generalizing well
– We'll investigate overfitting and generalization formally in a few lectures
[Diagram: data split into Training Data, Held-Out Data, Test Data]
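The experimentation cycle in code form, as a minimal Python sketch; the 60/20/20 split fractions are an illustrative assumption.

import random

def split(data, seed=0):
    # Shuffle once, then carve off training / held-out / test sets.
    data = data[:]
    random.Random(seed).shuffle(data)
    n = len(data)
    return (data[:int(0.6 * n)],
            data[int(0.6 * n):int(0.8 * n)],
            data[int(0.8 * n):])

def accuracy(f, dataset):
    # Fraction of instances predicted correctly.
    return sum(f(x) == y for x, y in dataset) / len(dataset)

# Usage: choose f and its hyperparameters by accuracy on the held-out
# set; evaluate on the test set exactly once, at the very end.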
A Supervised Learning Problem
• Consider a simple, Boolean dataset:
– f : X → Y
– X = {0,1}^4
– Y = {0,1}
• Question 1: How should we pick the hypothesis space, the set of possible functions f?
• Question 2: How do we find the best f in the hypothesis space?
Most General Hypothesis Space
Consider all possible Boolean functions over four input features!
Dataset: [table of labeled examples]
• 2^16 possible hypotheses: each of the 2^4 = 16 possible inputs can be assigned either label
• 2^9 are consistent with our dataset: the training examples fix the output on the inputs they cover, and each remaining input can be labeled freely
• How do we choose the best one?
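Both counts can be verified by brute force. A Python sketch, with a hypothetical 7-example training set standing in for the slide's table; each hypothesis is encoded as a 16-bit integer, one bit per possible input.

from itertools import product

inputs = list(product([0, 1], repeat=4))   # all 2^4 = 16 possible inputs

# Hypothetical stand-in for the slide's dataset: 7 labeled examples.
train = {(0, 0, 1, 0): 1, (0, 1, 0, 1): 0, (1, 1, 0, 0): 1,
         (0, 0, 0, 1): 0, (1, 0, 1, 1): 1, (1, 1, 1, 0): 0,
         (0, 1, 1, 1): 1}

def consistent(h):
    # Bit i of h is the hypothesis's label for inputs[i].
    return all(((h >> inputs.index(x)) & 1) == y for x, y in train.items())

print(sum(consistent(h) for h in range(2 ** 16)))  # 2^(16-7) = 512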
A Restricted Hypothesis Space
Consider all conjunctive Boolean functions.
Dataset: [table of labeled examples]
• 16 possible hypotheses: each of the four features is either included in the conjunction or left out, giving 2^4 = 16
• None are consistent with our dataset
• How do we choose the best one?
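The same brute-force check for the restricted space, as a Python sketch: a conjunction predicts 1 iff every feature in some chosen subset is on, and the same hypothetical dataset stands in for the slide's table.

# Same hypothetical stand-in dataset as in the previous sketch.
train = {(0, 0, 1, 0): 1, (0, 1, 0, 1): 0, (1, 1, 0, 0): 1,
         (0, 0, 0, 1): 0, (1, 0, 1, 1): 1, (1, 1, 1, 0): 0,
         (0, 1, 1, 1): 1}

def conj(subset):
    # Predict 1 iff every feature index in `subset` is on.
    return lambda x: int(all(x[i] == 1 for i in subset))

subsets = [[i for i in range(4) if (mask >> i) & 1]
           for mask in range(2 ** 4)]      # 2^4 = 16 conjunctions

consistent = [s for s in subsets
              if all(conj(s)(x) == y for x, y in train.items())]
print(consistent)  # [] here: no conjunction fits, so we must settle
                   # for the best approximate fit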