AI-832 Reinforcement Learning
Instructor: Dr. Zuhair Zafar
Lecture # 34: Model Based Reinforcement Learning
Recap
• Actor Critic Methods
• What is Actor?
• What is Critic?
• Can we reduce variance in Actor Critic Methods? How?
Today’s Agenda
• Model Based Reinforcement Learning
Model-Based Reinforcement Learning
Model-Based and Model-Free RL
Model-Based RL
Advantages of Model-Based RL
What is a Model?
Model Learning
Table Lookup Model
AB Example
Planning with a Model
Sample-Based Planning
Back to the AB Example
Planning with an Inaccurate Model
Model Based Monte Carlo
• In model-based Monte Carlo, the idea is to estimate the transition
probabilities and rewards from data.
• Given thousands of sampled transitions, model-based Monte Carlo can
estimate the average transition probabilities and rewards fairly accurately.
• The estimates are not 100% accurate, because they are computed from a
finite set of samples.
• With the estimated model, policy evaluation and value iteration can be
used to compute the optimal utility (optimal for the estimated model).
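The steps above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own implementation: the function names `estimate_model` and `value_iteration`, the toy states `"in"` and `"end"`, the actions `"stay"` and `"quit"`, and the sample data are all assumptions made up for this example. The model is estimated by counting observed transitions and averaging observed rewards, and value iteration is then run on the estimated model.

```python
from collections import defaultdict


def estimate_model(transitions):
    """Estimate T(s, a, s') and R(s, a, s') from (s, a, r, s') samples by counting."""
    counts = defaultdict(lambda: defaultdict(int))
    reward_sums = defaultdict(float)
    for s, a, r, s_next in transitions:
        counts[(s, a)][s_next] += 1
        reward_sums[(s, a, s_next)] += r

    T, R = {}, {}
    for (s, a), successors in counts.items():
        total = sum(successors.values())
        for s_next, n in successors.items():
            T[(s, a, s_next)] = n / total                        # relative frequency
            R[(s, a, s_next)] = reward_sums[(s, a, s_next)] / n  # average reward
    return T, R


def value_iteration(states, actions, T, R, gamma=1.0, tol=1e-8, max_iters=10_000):
    """Run value iteration on the estimated model and return the utilities V."""
    V = {s: 0.0 for s in states}
    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            q_values = []
            for a in actions:
                succ = [(s2, p) for (s1, a1, s2), p in T.items()
                        if s1 == s and a1 == a]
                if succ:
                    q_values.append(
                        sum(p * (R[(s, a, s2)] + gamma * V.get(s2, 0.0))
                            for s2, p in succ)
                    )
            # Assumption: a state with no observed action is terminal (utility 0).
            new_v = max(q_values) if q_values else 0.0
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:
            break
    return V


# Hypothetical sample data: (state, action, reward, next_state).
samples = [
    ("in", "stay", 4, "in"),
    ("in", "stay", 4, "in"),
    ("in", "stay", 4, "end"),
    ("in", "quit", 10, "end"),
]

T, R = estimate_model(samples)
V = value_iteration(["in", "end"], ["stay", "quit"], T, R, gamma=1.0)
print(T)
print(V)
```

With only four samples the estimated probabilities are crude. With more data the relative frequencies approach the true transition probabilities, which is the point made in the bullets above. Note that the resulting utilities are optimal only with respect to the estimated model.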
Problem: Model Based Monte Carlo