Lecture 34 - Model Based Reinforcement Learning

The document covers Lecture #34 of AI-832 on Model Based Reinforcement Learning, led by Dr. Zuhair Zafar. It discusses the differences between model-based and model-free reinforcement learning, the advantages of model-based approaches, and the concept of model learning, including model-based Monte Carlo methods for estimating transition probabilities and rewards. The lecture also addresses the challenges of planning with inaccurate models.

Uploaded by Hadia Ramzan

AI-832 Reinforcement Learning

Instructor: Dr. Zuhair Zafar

Lecture # 34: Model Based Reinforcement Learning


Recap

• Actor-Critic Methods

• What is the Actor?

• What is the Critic?

• Can we reduce variance in Actor-Critic Methods? How?


Today’s Agenda

• Model Based Reinforcement Learning


Model-Based Reinforcement Learning
Model-Based and Model-Free RL
Model-Based RL
Advantages of Model-Based RL
What is a Model?
Model Learning
Table Lookup Model
AB Example
Planning with a Model
Sample-Based Planning
Back to the AB Example
Planning with an Inaccurate Model
Model Based Monte Carlo

• In model-based Monte Carlo, the idea is to estimate the transition probabilities and rewards from data.

• By looking at thousands of samples, model-based Monte Carlo can estimate the average transition probabilities and rewards fairly accurately.

• The estimates might not be 100% accurate, because they are computed from a finite sample of data.

• With the estimated model in hand, policy evaluation and value iteration can be used to calculate the optimal utility.
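
The bullets above can be illustrated with a minimal Python sketch. It is not the lecture's own code: the function names (estimate_model, value_iteration), the two-state toy problem, and the sample data are all hypothetical, chosen only to show the idea. The model is estimated by counting, so that the estimated probability of reaching s' from (s, a) is count(s, a, s') / count(s, a) and the estimated reward is the average observed reward for that transition. Value iteration is then run on the estimated model.

```python
from collections import defaultdict


def estimate_model(transitions):
    """Estimate T(s, a, s') and R(s, a, s') from (s, a, r, s') samples by counting."""
    counts = defaultdict(lambda: defaultdict(int))
    reward_sums = defaultdict(lambda: defaultdict(float))
    for s, a, r, s_next in transitions:
        counts[(s, a)][s_next] += 1
        reward_sums[(s, a)][s_next] += r

    T_hat, R_hat = {}, {}
    for sa, next_counts in counts.items():
        total = sum(next_counts.values())
        # Relative frequency of each successor state.
        T_hat[sa] = {s2: n / total for s2, n in next_counts.items()}
        # Average reward observed on each (s, a, s') transition.
        R_hat[sa] = {s2: reward_sums[sa][s2] / n for s2, n in next_counts.items()}
    return T_hat, R_hat


def value_iteration(T_hat, R_hat, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Run value iteration on the estimated model and return state utilities."""
    actions = defaultdict(list)
    states = set()
    for (s, a), successors in T_hat.items():
        actions[s].append(a)
        states.add(s)
        states.update(successors)

    V = {s: 0.0 for s in states}
    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            if not actions[s]:
                continue  # no observed actions: treated as terminal, utility stays 0
            new_v = max(
                sum(p * (R_hat[(s, a)][s2] + gamma * V[s2])
                    for s2, p in T_hat[(s, a)].items())
                for a in actions[s]
            )
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:
            break
    return V


# Hypothetical samples (state, action, reward, next_state) for a toy problem:
# "stay" pays 4 and usually keeps the agent in the game, "quit" pays 10 and ends it.
samples = (
    [("in", "stay", 4, "in")] * 3
    + [("in", "stay", 4, "end")]
    + [("in", "quit", 10, "end")] * 2
)

T_hat, R_hat = estimate_model(samples)
V_hat = value_iteration(T_hat, R_hat, gamma=0.9)
print(T_hat[("in", "stay")])
print(V_hat)
```

With only six samples the estimates are coarse; as the third bullet notes, they are only as good as the data, and any (state, action) pair that never appears in the samples is simply missing from the estimated model.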
Problem: Model Based Monte Carlo
