Meta Learning:
Learn to learn
Hung-yi Lee
What does “meta” mean? meta-X = X about X
Source of image: https://medium.com/intuitionmachine/the-brute-force-method-of-deep-learning-innovation-58b497323ae5 (Denny Britz’s graphic)
What are the assignments of this course about?
(Thanks to 沈昇勳 for providing the figures.)
Industry: use 1000 GPUs to try 1000 sets of hyperparameters.
Academia: “telepathize” (通靈) a good set of hyperparameters.
Can machines automatically determine the hyperparameters?
Machine Learning 101
Dog-Cat Classification
Machine Learning ≈ looking for a function, e.g., 𝑓(image) = “cat”
• Step 1: function with unknown parameters, 𝑓𝜽
  (The weights and biases of the neurons are the unknown, learnable
  parameters; 𝜽 represents all of them.)
• Step 2: define the loss function
• Step 3: optimization
Machine Learning
• Step 1: function with unknown parameters, 𝑓𝜽
• Step 2: define the loss function
  For each training example, compare 𝑓𝜽's cat/dog prediction with the
  ground truth; 𝑒ₖ is the cross-entropy for the k-th example, and
  𝐿(𝜽) = Σₖ 𝑒ₖ (k = 1 … K)
• Step 3: optimization
Machine Learning 101
• Step 1: function with unknown parameters, 𝑓𝜽
• Step 2: define the loss function: 𝐿(𝜽) = Σₖ 𝑒ₖ (k = 1 … K, summed over
  the training examples)
• Step 3: optimization: 𝜽∗ = arg min_𝜽 𝐿(𝜽), done by gradient descent
𝑓𝜽∗ is the function learned by the learning algorithm from data.
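The three steps can be sketched in a few lines of Python. This is a toy illustration only: a one-parameter model and squared error stand in for the network and cross-entropy, and the data and learning rate are made-up assumptions, not from the course:

```python
# Toy illustration of the three ML steps with a 1-D model f_theta(x) = theta * x.

def f(theta, x):                       # Step 1: function with unknown parameter theta
    return theta * x

def loss(theta, examples):             # Step 2: L(theta) = sum of per-example errors e_k
    return sum((f(theta, x) - y) ** 2 for x, y in examples)

def grad(theta, examples):             # dL/dtheta for the squared error above
    return sum(2 * (f(theta, x) - y) * x for x, y in examples)

def optimize(examples, theta=0.0, lr=0.01, steps=200):   # Step 3: gradient descent
    for _ in range(steps):
        theta -= lr * grad(theta, examples)
    return theta

examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # made-up data; ground truth is y = 2x
theta_star = optimize(examples)        # theta* = arg min L(theta), close to 2.0
```

Here `optimize` plays the role of the hand-crafted learning algorithm 𝐹: it maps training examples to 𝑓𝜽∗.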
Introduction of
Meta Learning
What is Meta Learning?
A learning algorithm 𝐹 takes training examples (e.g., labeled cat and dog
images) as input and outputs a classifier 𝑓∗; at testing time, 𝑓∗ takes an
image as input and outputs “cat”. The algorithm 𝐹 is hand-crafted, while
𝑓∗ is learned from data.
Can we learn the function 𝐹 itself? Yes, by following the same three steps
as in ML!
Meta Learning – Step 1
• What is learnable in a learning algorithm?
Components of deep learning: network architecture, initial parameters,
learning rate, ……
In meta learning, we will try to learn some of them.
Meta Learning – Step 1
• What is learnable in a learning algorithm?
𝜙: the learnable components of 𝐹𝜙 (network architecture, initial
parameters, learning rate, ……)
Meta-learning approaches are categorized based on what is learnable.
Meta Learning – Step 2
• Define the loss function 𝐿(𝜙) for the learning algorithm 𝐹𝜙.
𝐿(𝜙) is computed from training tasks, each with its own train and test
split, e.g.:
Task 1 (apple & orange): train and test examples of apples and oranges
Task 2 (car & bike): train and test examples of cars and bikes
Meta Learning – Step 2
How to define 𝐿(𝜙)?
Feed the training examples of task 1 (apple & orange) into 𝐹𝜙 to obtain a
classifier 𝑓𝜽₁∗, where 𝜽₁∗ denotes the parameters of the classifier
learned by 𝐹𝜙 using the training examples of task 1.
How can we know whether this classifier is good or bad?
Evaluate the classifier on the task's testing set.
Meta Learning – Step 2
Apply 𝑓𝜽₁∗ to the testing examples of task 1, compute the difference
(cross-entropy) between each prediction and the ground truth, and sum them
to obtain the loss 𝑙¹ for task 1.
Meta Learning – Step 2
Do the same for task 2 (bike & car) to obtain 𝑙².
Total loss: 𝐿(𝜙) = 𝑙¹ + 𝑙² (sum over all the training tasks)
Meta Learning – Step 2
In general:
Total loss: 𝐿(𝜙) = Σₙ 𝑙ⁿ (n = 1 … N, where 𝑁 is the number of training tasks)
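The total loss above can be sketched directly in Python. Here a hypothetical `learn` function plays the role of 𝐹𝜙 and `task_loss` computes 𝑙ⁿ on a task's test split; the constant-predictor model and the toy tasks are illustrative assumptions, not from the slides:

```python
# Sketch of the meta loss L(phi) = sum over tasks of l^n.
# `learn` plays the role of F_phi: it maps phi plus a task's training
# examples to a classifier/predictor; all names here are hypothetical.

def meta_loss(phi, tasks, learn, task_loss):
    total = 0.0
    for train_examples, test_examples in tasks:
        predictor = learn(phi, train_examples)          # within-task training: theta_n*
        total += task_loss(predictor, test_examples)    # l^n: computed on the TEST split
    return total

# Toy usage: F_phi returns a constant predictor (the train mean shifted by
# phi), and l^n is the squared error on the task's test examples.
learn = lambda phi, train: sum(train) / len(train) + phi
task_loss = lambda pred, test: sum((pred - y) ** 2 for y in test)
tasks = [([1.0, 1.0], [2.0]), ([3.0, 3.0], [4.0])]
bad = meta_loss(0.0, tasks, learn, task_loss)    # each task off by 1 -> total 2.0
good = meta_loss(1.0, tasks, learn, task_loss)   # phi = 1 fixes both tasks -> 0.0
```

Minimizing `meta_loss` over `phi` is exactly Step 3 on the next slide.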
Meta Learning – Step 2
In typical ML, you compute the loss based on training examples.
In meta learning, you compute the loss based on testing examples.
Hold on! You use testing examples during training???
Meta Learning – Step 2
To be precise: in meta learning, you compute the loss based on the testing
examples of the training tasks, so no data from the testing task is used
during training.
Meta Learning – Step 3
• Loss function for the learning algorithm: 𝐿(𝜙) = Σₙ 𝑙ⁿ (n = 1 … N)
• Find the 𝜙 that minimizes 𝐿(𝜙): 𝜙∗ = arg min_𝜙 𝐿(𝜙)
• Use the optimization approach you know:
  If you know how to compute 𝜕𝐿(𝜙)/𝜕𝜙, gradient descent is your friend.
  What if 𝐿(𝜙) is not differentiable?
  Use reinforcement learning or an evolutionary algorithm.
Now we have a learned “learning algorithm” 𝐹𝜙∗.
Framework
Training tasks (e.g., task 1: apple & orange; task 2: car & bike) are not
related to the testing task. They are used to obtain the learned “learning
algorithm” 𝐹𝜙∗, which is then applied to the testing task we really care
about (cat & dog): train 𝑓𝜽∗ on its training examples, then test. The
testing task only needs a little labeled training data.
ML v.s. Meta
Goal
Machine Learning ≈ find a function 𝑓, e.g., dog-cat classification:
𝑓(image) = “cat”
Meta Learning ≈ find a function 𝐹 that finds a function 𝑓:
𝐹(training examples) = 𝑓
Training Data
Machine Learning: the training data of one task (e.g., labeled cat/dog
images).
Meta Learning: training tasks, each with its own train and test split
(task 1: apple & orange; task 2: car & bike).
The per-task train and test sets are called the support set and query set
in the literature of “learning to compare”.
Machine Learning: within-task training, where a hand-crafted learning
algorithm 𝐹 maps one task's training examples (e.g., cat/dog) to 𝑓𝜽∗.
Meta Learning: across-task training, where the training tasks (task 1:
apple & orange; task 2: car & bike, each with train and test splits) are
used to learn the learning algorithm 𝐹𝜙∗.
Machine Learning: within-task testing, applying the learned 𝑓𝜽∗ to the
test examples (e.g., output “cat”).
Meta Learning: across-task testing. Given the testing task (cat & dog),
run within-task training with the learned “learning algorithm” 𝐹𝜙∗ to get
𝑓𝜽∗, then run within-task testing. One round of within-task training plus
within-task testing is called an episode.
Loss
Machine Learning: 𝐿(𝜽) = Σₖ 𝑒ₖ (k = 1 … K), a sum over the training
examples in one task.
Meta Learning: 𝐿(𝜙) = Σₙ 𝑙ⁿ (n = 1 … N), a sum over the training tasks,
where each 𝑙ⁿ is itself a sum over the testing examples of one task.
𝐿(𝜙) = Σₙ 𝑙ⁿ (n = 1 … N)
If your optimization method needs to compute 𝐿(𝜙), then across-task
training (the outer loop in “learning to initialize”) includes within-task
training and within-task testing (the inner loop in “learning to
initialize”): for each task, feed its training examples into 𝐹𝜙 to obtain
𝑓𝜽∗ (within-task training), then evaluate 𝑓𝜽∗ on the task's testing
examples (within-task testing) to compute 𝑙ⁿ.
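The outer/inner loop structure can be sketched as follows. This is a first-order sketch in the spirit of “learning to initialize” (it treats d𝑙/d𝜙 ≈ d𝑙/d𝜽, as first-order MAML variants do); the 1-D model, the tasks, and the step sizes are toy assumptions:

```python
# Minimal first-order sketch of "learning to initialize": the outer
# (across-task) loop updates a shared init phi; the inner loop does
# within-task training from phi, then within-task testing to get l^n.

def inner_train(phi, train, inner_lr=0.1, steps=5):
    theta = phi                                # start from the learned init
    for _ in range(steps):                     # within-task training
        g = sum(2 * (theta - y) for y in train) / len(train)
        theta -= inner_lr * g
    return theta

def outer_step(phi, tasks, outer_lr=0.05):
    grad_phi = 0.0
    for train, test in tasks:
        theta = inner_train(phi, train)        # within-task training
        # Within-task testing: l^n is the squared error on the test split.
        # First-order shortcut: use dl/dtheta as the gradient w.r.t. phi.
        grad_phi += sum(2 * (theta - y) for y in test) / len(test)
    return phi - outer_lr * grad_phi           # across-task update of the init

# Two toy tasks whose optima are 1.0 and 3.0; a good shared init is 2.0.
tasks = [([1.0], [1.0]), ([3.0], [3.0])]
phi = 0.0
for _ in range(200):                           # across-task training
    phi = outer_step(phi, tasks)
```

Each `outer_step` call is one pass over the training tasks; each `inner_train` call is one episode's within-task training.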
Meta Learning v.s. ML
• What you know about ML usually applies to meta learning:
  • Overfitting on training tasks
  • Getting more training tasks to improve performance
  • Task augmentation
  • There are also hyperparameters when learning a learning algorithm ……
    use a development task to choose them ☺
What is learnable in a learning algorithm?
Review: Gradient Descent (function 𝐹)
Fix a network structure, initialize the parameters to 𝜽⁰, then repeatedly
compute the gradient on the training data and update:
𝜽⁰ → 𝜽′ → 𝜽′′ → ⋯ → 𝜽∗
Learning to initialize
• Model-Agnostic Meta-Learning (MAML, pronounced like “mammal”)
  Chelsea Finn, Pieter Abbeel, and Sergey Levine, “Model-Agnostic
  Meta-Learning for Fast Adaptation of Deep Networks”, ICML, 2017
• Reptile
  https://arxiv.org/abs/1803.02999
• How to train your MAML (the title puns on “How to Train Your Dragon”)
  Antreas Antoniou, Harrison Edwards, Amos Storkey, “How to train your
  MAML”, ICLR, 2019
MAML
Find a good init using training tasks (task 1, task 2, …), then apply it to
the testing task (cat & dog).
Pre-training (Self-supervised Learning)
Find a good init by training on proxy tasks (fill-in-the-blanks, etc.),
then apply it to the downstream task (cat & dog).
MAML: isn't it domain adaptation / transfer learning?
MAML: find a good init from training tasks (task 1, task 2).
Pre-training (more typical ways): find a good init by using data from
different tasks to train one model, also known as multi-task learning
(a baseline of meta learning).
MAML v.s. Pre-training
• https://youtu.be/vUwOA3SNb_E
(The video contains product placement that is hard to guard against; call
it “meta product placement”.)
MAML is good because ……
• ANIL (Almost No Inner Loop)
Aniruddh Raghu, Maithra Raghu, Samy Bengio, Oriol Vinyals, Rapid Learning or
Feature Reuse? Towards Understanding the Effectiveness of MAML, ICLR, 2020
More about MAML
• More mathematical details behind MAML
• https://youtu.be/mxqzGwP_Qys
• First order MAML (FOMAML)
• https://youtu.be/3z997JhL9Oo
• Reptile
• https://youtu.be/9jJe2AD35P8
Optimizer
Basic form: 𝜽𝒕+𝟏 ← 𝜽𝒕 − 𝜆𝒈𝒕
Adagrad, RMSprop, NAG, Adam ……
Is the optimizer learnable? (𝜙 would be components of the update rule.)
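One simple way to make the update rule learnable is to treat the step size 𝜆 as the component 𝜙 and pick the value that minimizes the loss reached by the inner optimization. A toy sketch (the quadratic loss and the candidate grid are assumptions for illustration; real work, such as Andrychowicz et al. 2016 cited below, learns a network that outputs the update instead):

```python
# The basic update theta_{t+1} = theta_t - lambda * g_t, with the step
# size treated as the learnable component phi. Evaluating one candidate
# phi means running the whole inner optimization on a toy quadratic loss.

def run_inner(lr, theta=10.0, steps=20):
    for _ in range(steps):
        g = 2 * theta              # gradient of the toy loss theta^2
        theta = theta - lr * g     # the basic update rule from the slide
    return theta ** 2              # L(phi): final loss reached with this lr

# Crude outer "learning": pick the phi with the lowest L(phi).
candidates = [0.01, 0.1, 0.4, 0.9]
best_lr = min(candidates, key=run_inner)   # 0.01 barely moves; 0.9 oscillates
```

Grid search over 𝜆 is just the simplest stand-in; the cited work replaces the whole update rule with a learned function.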
The initialization 𝜽⁰ can be learned by MAML. In the gradient-descent
diagram (network structure → init 𝜽⁰ → compute gradient on the training
data → update 𝜽′ → 𝜽′′ → ⋯ → 𝜽∗), the update rule itself, the optimizer,
can also be learned:
Marcin Andrychowicz, et al., Learning to learn by gradient descent by
gradient descent, NIPS, 2016
Network Architecture Search (NAS)
In the gradient-descent diagram, the network structure is the learnable
component 𝜙.
Network Architecture Search (NAS)
𝜙 = 𝑎𝑟𝑔 𝑚𝑖𝑛 𝐿 𝜙 ∇𝜙 𝐿 𝜙 =?
𝜙
Network
Architecture
• Reinforcement Learning
• Barret Zoph, et al., Neural Architecture Search with Reinforcement
Learning, ICLR 2017
• Barret Zoph, et al., Learning Transferable Architectures for Scalable Image
Recognition, CVPR, 2018
• Hieu Pham, et al., Efficient Neural Architecture Search via Parameter
Sharing, ICML, 2018
An agent uses a set of actions to −𝐿 𝜙
determine the network architecture. Reward to be
𝜙: the agent’s parameters maximized
Network Architecture Search (NAS)
Across-task training: update 𝜙 to maximize the reward −𝐿(𝜙).
The agent (an RNN with parameters 𝜙) outputs tokens that form a network;
the network is trained (within-task training), and its accuracy is the
reward −𝐿(𝜙).
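The loop above can be sketched with a REINFORCE-style policy-gradient update. Everything here is a toy assumption: the “architectures” are two labels, training a network is replaced by a fixed accuracy lookup table, and the exact expected policy gradient is used instead of sampling, so the sketch is deterministic:

```python
# Toy sketch of NAS via RL: phi is the agent's logits over candidate
# architectures; "train the network and measure accuracy" is replaced by
# a made-up lookup table standing in for within-task training.
import math

accuracy = {"small": 0.6, "large": 0.9}     # made-up rewards -L(phi)
arches = list(accuracy)
phi = [0.0, 0.0]                            # agent parameters (logits)

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

for _ in range(500):                        # across-task training: update phi
    p = softmax(phi)
    for i, arch in enumerate(arches):       # expected REINFORCE gradient
        reward = accuracy[arch]             # reward for picking this arch
        for j in range(len(phi)):           # d log p_i / d phi_j = 1[i=j] - p_j
            phi[j] += 0.1 * p[i] * reward * ((1.0 if j == i else 0.0) - p[j])

probs = softmax(phi)                        # the agent now prefers "large"
```

A real NAS agent samples one architecture per iteration and uses the sampled-gradient estimator; the expected gradient above keeps the toy example reproducible.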
Network Architecture Search (NAS)
𝜙∗ = arg min_𝜙 𝐿(𝜙), ∇𝜙 𝐿(𝜙) = ?
• Evolutionary Algorithms
  • Esteban Real, et al., Large-Scale Evolution of Image Classifiers, ICML 2017
  • Esteban Real, et al., Regularized Evolution for Image Classifier
    Architecture Search, AAAI, 2019
  • Hanxiao Liu, et al., Hierarchical Representations for Efficient
    Architecture Search, ICLR, 2018
Network Architecture Search (NAS)
𝜙∗ = arg min_𝜙 𝐿(𝜙), ∇𝜙 𝐿(𝜙) = ?
• DARTS (a differentiable relaxation of the architecture)
  Hanxiao Liu, et al., DARTS: Differentiable Architecture Search, ICLR, 2019
Data Processing?
In the gradient-descent diagram, the processing of the training data can
also be the learnable component 𝜙.
Data Augmentation
Yonggang Li, Guosheng Hu, Yongtao Wang, Timothy Hospedales, Neil M.
Robertson, Yongxin Yang, DADA: Differentiable Automatic Data Augmentation,
ECCV, 2020
Daniel Ho, Eric Liang, Ion Stoica, Pieter Abbeel, Xi Chen, Population Based
Augmentation: Efficient Learning of Augmentation Policy Schedules, ICML, 2019
Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, Quoc V. Le,
AutoAugment: Learning Augmentation Policies from Data, CVPR, 2019
Sample Reweighting
• Give different samples different weights
  • Larger weights, to focus on tough examples?
  • Smaller weights, because the labels are noisy?
The sample-weighting strategy can be the learnable component 𝜙:
Jun Shu, Qi Xie, Lixuan Yi, Qian Zhao, Sanping Zhou, Zongben Xu, Deyu Meng,
Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting, NeurIPS, 2019
Mengye Ren, Wenyuan Zeng, Bin Yang, Raquel Urtasun, Learning to Reweight Examples
for Robust Deep Learning, ICML, 2018
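A minimal sketch of the idea. The two hard-threshold weighting functions below are hand-written stand-ins for the learned weighting networks in the papers above, and the loss values are made up:

```python
# Sample reweighting: scale each example's loss by a weight that depends
# on the example (here, on its current loss value).

def weighted_loss(losses, weight_fn):
    """Weighted sum of per-example losses; weight_fn plays the role of phi."""
    return sum(weight_fn(l) * l for l in losses)

# Two opposite hand-written strategies (a learned network would replace these):
downweight_noisy = lambda l: 0.1 if l > 5.0 else 1.0   # large loss => noisy label?
focus_on_hard = lambda l: 2.0 if l > 5.0 else 1.0      # large loss => tough example?

losses = [0.5, 0.7, 9.0]   # the 9.0 example is either "tough" or "mislabeled"
noisy_view = weighted_loss(losses, downweight_noisy)   # 9.0 contributes only 0.9
hard_view = weighted_loss(losses, focus_on_hard)       # 9.0 contributes 18.0
```

Which strategy is right depends on the data, which is exactly why the cited papers learn the weighting function instead of hand-picking it.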
Beyond Gradient Descent
Andrei A. Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan
Pascanu, Simon Osindero, Raia Hadsell, Meta-Learning with Latent Embedding
Optimization, ICLR, 2019
Replace the whole gradient-descent procedure (function 𝐹) with a network
whose parameters are 𝜙: invent a new learning algorithm that is no longer
gradient descent.
Until now, learning and classification are separate: a learning algorithm
(function 𝐹) maps training data to 𝜽∗, and the classifier then labels the
testing data as “cat”. How about a single function 𝐹 that takes both the
training data and the testing data and directly outputs “cat”?
Learning to compare (metric-based approach):
https://youtu.be/yyKaACh_j3M
https://youtu.be/scK2EIT7klw
https://youtu.be/semSxPP2Yzg
https://youtu.be/ePimv_k-H24
Applications
Few-shot Image Classification
• Each class only has a few images. For example, in a 3-ways 2-shot task
  there are three classes with two images each, and the question is which
  class a new image belongs to.
• N-ways K-shot classification: in each task, there are N classes, each
  with K examples.
• In meta learning, you need to prepare many N-ways K-shot tasks as
  training and testing tasks.
Omniglot
https://github.com/brendenlake/omniglot
• 1623 characters, each with 20 examples
Demo: https://openai.com/blog/reptile/
Each character represents a class; a 20-ways 1-shot task consists of a
training set (support set) and a testing set (query set).
• Split your characters into training and testing characters
• Sample N training characters and K examples from each sampled character
  → one training task
• Sample N testing characters and K examples from each sampled character
  → one testing task
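The sampling procedure above can be sketched as follows; the class names, dataset sizes, and function names are all illustrative assumptions, not the real Omniglot loader:

```python
# Sample an N-way K-shot task: split classes (characters) into meta-train
# and meta-test, then draw N classes and K + Q examples per class.
import random

def sample_task(classes, examples_per_class, n_way, k_shot, q_query, rng):
    chosen = rng.sample(classes, n_way)                 # N classes for this task
    support, query = [], []
    for label, cls in enumerate(chosen):
        ex = rng.sample(examples_per_class[cls], k_shot + q_query)
        support += [(x, label) for x in ex[:k_shot]]    # support (train) set
        query += [(x, label) for x in ex[k_shot:]]      # query (test) set
    return support, query

rng = random.Random(0)
# Toy "Omniglot": 10 classes with 20 examples each (strings stand in for images).
data = {f"char{c}": [f"char{c}_img{i}" for i in range(20)] for c in range(10)}
train_classes, test_classes = list(data)[:7], list(data)[7:]  # split BY CLASS
support, query = sample_task(train_classes, data,
                             n_way=5, k_shot=1, q_query=2, rng=rng)
```

Note that the split is by class, not by example: the characters used to build testing tasks never appear in any training task.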
http://speech.ee.ntu.edu.tw/~tlkagk/meta_learning_table.pdf