CS-866 Deep Reinforcement Learning

Introduction

Nazar Khan
Department of Computer Science

University of the Punjab


Introduction Supervised ML Unsupervised ML Reinforcement Learning

What is Deep Reinforcement Learning?

Deep RL studies how to solve complex problems that require making a
sequence of good decisions.

- These problems often live in high-dimensional state spaces:
  - Many variables must be considered simultaneously.
  - Example: In chess, the position of each piece defines the state; there are
    more possible states than atoms in the universe.
  - Example: In robotics, sensors may produce hundreds or thousands of
    readings per time step.

Examples of Sequential Decision-Making

- Making Tea: wait until the water is boiling, add tea leaves, adjust milk,
  control sweetness, simmer for flavor, strain before serving.
- Tic-Tac-Toe: sequences of moves, the opponent's responses, and planning
  ahead.
- Chess: a much more complex version of tic-tac-toe with an astronomical
  state space.
- Having a Conversation: listen to the other person, interpret context,
  choose a relevant response, maintain flow, achieve an agenda.
- Success comes from a sequence of decisions, not a single one. Each
  decision has an immediate consequence and a long-term consequence.
- An RL agent learns through trial-and-error.

State Spaces

Figure: an example Tic-Tac-Toe position.

Tic-Tac-Toe: 3^9 = 19,683 possible boards.
Chess: ≈ 10^47 possible states.
Go: ≈ 10^170 possible states.
Conversation: infinite possible states.
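The Tic-Tac-Toe count on this slide is a quick calculation: each of the 9 cells can hold X, O, or nothing, giving 3^9 raw board encodings (an overcount, since it includes unreachable positions).

```python
# Each of the 9 cells is X, O, or empty: 3 choices per cell, 9 cells.
boards = 3 ** 9
print(boards)  # 19683
```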

What is Deep Reinforcement Learning?


- Combination of deep learning + reinforcement learning
- Goal: learn optimal actions that maximize reward across all states
- Works in high-dimensional, interactive environments

Deep Learning
- Function approximation in high dimensions
- Uses deep neural networks
- Examples: speech recognition, image classification

Reinforcement Learning
- Learns from trial and error, not from fixed datasets
- Feedback comes from the environment (reward / punishment)
- Builds a policy: which action to take in each state

Where DRL Fits

                   Low-Dimensional       High-Dimensional
Static Dataset     Supervised Learning   Deep Supervised Learning
Interaction        Tabular RL            Deep RL

Applications of DRL
- Robotics: locomotion, manipulation, pancake flipping, helicopters
- Games: Chess, Go, Pac-Man, StarCraft
- Real-world: healthcare, finance, recommender systems, energy grids,
  ChatGPT

Four Related Fields


1. Psychology

- Conditioning: Pavlov's dog
- Operant conditioning (Skinner)
- Learning from reinforcement is a core AI idea

Four Related Fields


1. Psychology

Pavlov's dog: A natural reaction to food is that a dog salivates. By ringing a
bell whenever the dog is given food, the dog learns to associate the sound with
food, and after enough trials, the dog starts salivating as soon as it hears the
bell, presumably in anticipation of the food, whether it is there or not.

Four Related Fields


2. Mathematics

- Markov Decision Processes (MDPs)
- Optimization, planning, graph theory
- Symbolic AI: search, reasoning, theorem proving

Andrei Markov (1856-1922)

Four Related Fields


3. Engineering

- Known as optimal control in engineering.
- Focus on dynamical systems.
- Bellman and Pontryagin's work in optimal control laid the foundation of RL.

Two space vehicles docking | Richard Bellman (1920-1984) | Lev Pontryagin (1908-1988)

Four Related Fields


4. Biology

- Connectionism: swarm intelligence, neural networks
- Nature-inspired algorithms: ant colony, evolutionary algorithms

Biological Neuron | Artificial Neural Network | Hinton, LeCun, Bengio

Three Paradigms of Machine Learning

- Machine Learning studies how to approximate functions f : X → Y from
  data.
- Often, functions are not known analytically.
- Instead, we learn them from observations.
- Three main paradigms:
  1. Supervised Learning
  2. Unsupervised Learning
  3. Reinforcement Learning

Functions in AI
- A function transforms input x to output y: f(x) → y.
- More generally: f : X → Y, where X and Y can be discrete or continuous.
- Real-world functions may be stochastic: f : X → p(Y).
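The deterministic/stochastic distinction can be made concrete with a small sketch. The sensor model below is a hypothetical example, not from the lecture: a deterministic function returns one y per x, while a stochastic one samples from a distribution p(Y | x).

```python
import random

# Deterministic: each input x maps to exactly one output y.
def f_det(x):
    return 2 * x

# Stochastic: each input x maps to a distribution over outputs.
# Hypothetical noisy sensor: reads the true value x, but is off
# by +/-1 half of the time.
def f_stoch(x):
    return x + random.choice([-1, 0, 0, 1])

print(f_det(3))  # 6, every time
samples = [f_stoch(3) for _ in range(1000)]
print(set(samples) <= {2, 3, 4})  # True: samples drawn from p(Y | x=3)
```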

Given vs. Learned Functions


- Sometimes f is given exactly (laws of physics, explicit algorithms).
  Example: Newton's 2nd Law F = m · a.
- Often, f is unknown and must be approximated from data.
- This is the domain of machine learning.

Supervised Learning
- Data: example pairs (x, y).
- Goal: learn a function f̂ that predicts y from x.
- Common tasks:
  - Regression: predict a continuous value.
  - Classification: predict a discrete category.
- Loss function measures prediction error, e.g. MSE or cross-entropy.
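As a quick illustration of the loss idea, here is mean squared error written out directly from its definition (average of squared prediction errors); the data values are made up for the example.

```python
# Mean squared error: average of squared differences between
# true values and predictions.
def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [1.0, 2.0, 3.0]
y_pred = [1.5, 2.0, 2.0]
print(mse(y_true, y_pred))  # (0.25 + 0.0 + 1.0) / 3 ≈ 0.4167
```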

Example: Regression

Figure: Blue: data points. Red: learned linear function ŷ = ax + b.



Example: Classification

Figure: example images, each labeled Cat or Dog.

Unsupervised Learning
- No labels: only input data x.
- Goal: find structure in data (clusters, latent variables).
- Examples:
  - k-means clustering
  - Principal Component Analysis (PCA)
  - Autoencoders
- Learns p(x) instead of p(y|x).
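To show what "finding structure without labels" looks like, here is a minimal 1-D k-means sketch with k = 2, written in plain Python under simplifying assumptions (fixed iteration count, centroids initialized at the data extremes); practical code would use a library implementation such as scikit-learn's KMeans.

```python
# Minimal 1-D k-means with k = 2 clusters.
def kmeans_1d(xs, iters=20):
    c1, c2 = min(xs), max(xs)  # initialize centroids at the extremes
    for _ in range(iters):
        # Assign step: each point goes to its nearest centroid.
        a = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        b = [x for x in xs if abs(x - c1) > abs(x - c2)]
        # Update step: each centroid moves to its cluster's mean.
        c1 = sum(a) / len(a)
        c2 = sum(b) / len(b)
    return c1, c2

data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
print(kmeans_1d(data))  # two cluster centers, near 1.0 and 9.0
```

No point in `data` carries a label; the two clusters emerge purely from the geometry of the inputs, which is exactly the p(x) perspective above.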

Reinforcement Learning
- Third paradigm of machine learning
- Learns by interacting with the environment
- Data comes sequentially (one state at a time)
- Objective: learn a policy, i.e. a function mapping states to the best actions

Agent and Environment

Figure: Agent interacts with Environment to maximize reward.

- Agent: learner/decision-maker
- Environment: provides feedback and state transitions
- Goal: maximize long-term accumulated reward
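The agent-environment loop in the figure can be sketched in a few lines. The toy environment below is invented for illustration (it is not from any RL library): the agent walks on positions 0..4 and receives reward +1 on reaching position 4. The "policy" here is just random action selection, a stand-in for what RL would actually learn.

```python
import random

random.seed(0)  # reproducible episode

class ToyEnv:
    """Toy environment: positions 0..4 on a line; goal state is 4."""
    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action: -1 (left) or +1 (right)
        self.pos = max(0, min(4, self.pos + action))
        reward = 1 if self.pos == 4 else 0
        done = self.pos == 4
        return self.pos, reward, done

env = ToyEnv()
state = env.reset()
total_reward = 0
for t in range(10_000):                # the interaction loop
    action = random.choice([-1, 1])    # placeholder policy: act randomly
    state, reward, done = env.step(action)
    total_reward += reward             # accumulate long-term reward
    if done:
        break
print(total_reward)  # 1 once the goal state is reached
```

Each pass through the loop is one agent-environment exchange: the agent emits an action, the environment returns the next state and a reward, and the episode ends at the goal.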

Key Differences from Supervised/Unsupervised Learning

1. Interaction-based: No pre-collected dataset; data is generated dynamically
   via interaction between agent and environment
2. Reward signal: Partial numeric feedback, not full labels; UL has no
   labels, RL has reward, SL has complete labels
3. Sequential decision-making: Learns policies across multiple steps

- In RL there is no teacher or supervisor, and there is no static dataset.
- RL learns a policy for the environment by interacting with it and
  receiving rewards and punishments.
- SL can classify a set of images for you; UL can tell you which items
  belong together; RL can tell you the winning sequence of moves in a game
  of chess, or the action-sequence that robot-legs need to take in order
  to walk.

Supervised vs Reinforcement Learning

Concept     Supervised Learning     Reinforcement Learning
Inputs x    Full dataset            One state at a time
Labels y    Full (correct action)   Partial (numeric reward)

Table: Comparison of paradigms

Implications of RL Paradigm
- Data is generated step-by-step ⇒ suited for sequential problems
- Risk of circular feedback (the policy both selects and learns from actions)
- RL can continue to learn indefinitely if the environment is challenging
- Examples: Chess, Go, robotics, conversational agents

Deep Reinforcement Learning


- Traditional RL: works on small, low-dimensional state spaces
- Many real-world problems: large, high-dimensional state spaces
- Deep RL = RL + Deep Learning
  - Handles large state spaces
  - Scales to complex tasks
- Key driver of recent breakthroughs in AI

Summary
- Deep RL = deep learning + reinforcement learning
- Solves sequential decision problems in high dimensions
- Rooted in psychology, mathematics, engineering, and biology
- Applications: robotics, games, healthcare, finance, any interactive setting
