Reinforcement Learning Syllabus

Uploaded by

Husein Yusuf

Course Number

Course Title Reinforcement Learning

ECTS Credits 5 ECTS (3 Cr.)

Contact Hours (per week) Lectures: 2, Tutorial: 0, Practice or Laboratory: 3

Course Objectives & Competencies to be Acquired
The objectives of this course include:
●​ Understanding Reinforcement Learning Concepts: The
course aims to provide a solid understanding of the
fundamental concepts and principles of reinforcement
learning, including Markov Decision Processes, value
functions, policies, exploration, exploitation, and the
trade-off between exploration and exploitation.
●​ Mastering Reinforcement Learning Algorithms: The
course focuses on teaching various reinforcement learning
algorithms and techniques, such as Q-learning, policy
gradients, Monte Carlo methods, temporal difference
learning, and model-based approaches. Students learn how
these algorithms work, their strengths, limitations, and
how to apply them to solve different types of problems.
●​ Solving Real-World Problems: The course aims to equip
students with the skills to apply reinforcement learning to
real-world problems. It covers topics such as function
approximation, handling continuous state and action
spaces, dealing with high-dimensional inputs, and
incorporating deep neural networks into reinforcement
learning algorithms.
●​ Understanding Exploration and Exploitation: The course
explores the challenges of balancing exploration (gaining
new knowledge) and exploitation (using existing
knowledge) in reinforcement learning. Students learn
about exploration strategies, such as epsilon-greedy,
softmax, and UCB (Upper Confidence Bound), and how
to design effective exploration policies.
●​ Deep Reinforcement Learning: The course covers
advanced topics in deep reinforcement learning, which
involve combining deep neural networks with
reinforcement learning algorithms. Students learn about
Deep Q-Networks (DQN), actor-critic methods, policy
gradients with neural networks, and other state-of-the-art
techniques used in deep reinforcement learning.
●​ Evaluating and Analyzing Reinforcement Learning
Agents: The course teaches students how to evaluate and
analyze the performance of reinforcement learning agents.
This includes understanding performance metrics,
conducting experiments, analyzing learning curves, and
assessing the robustness and generalization capabilities of
learned policies.
●​ Applications and Case Studies: The course explores
various applications of reinforcement learning across
different domains, such as robotics, game playing,
recommendation systems, autonomous systems, and
resource management. Students learn about successful
case studies and gain insights into how reinforcement
learning can be used to solve complex problems.
●​ Ethical Considerations: The course addresses ethical
considerations and challenges in reinforcement learning,
such as fairness, bias, safety, and interpretability. Students
learn about the societal impact of reinforcement learning
algorithms and discuss ethical guidelines and responsible
use of these techniques.
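The exploration strategies named in the objectives above (epsilon-greedy, softmax, and UCB) can be sketched as action-selection rules over a list of estimated action values. This is a minimal illustration and not course material: the function names, the `temperature` parameter, and the exploration constant `c` are assumptions chosen for the example.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probs(q_values, temperature=1.0):
    """Boltzmann (softmax) action probabilities; lower temperature is greedier."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def ucb1(q_values, counts, t, c=2.0):
    """UCB1-style selection: estimated value plus an uncertainty bonus.

    counts[a] is how often action a was tried; t is the total number of pulls.
    The constant c (assumed here) scales the bonus.
    """
    for a, n in enumerate(counts):
        if n == 0:
            return a  # try every action once before using the bonus
    return max(
        range(len(q_values)),
        key=lambda a: q_values[a] + math.sqrt(c * math.log(t) / counts[a]),
    )
```

With `epsilon=0` the first rule is purely greedy, while the UCB bonus shrinks as an action's count grows, so rarely tried actions are revisited without any explicit randomness.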

Course Outcomes The outcomes of this course include:

● Knowledge Acquisition: Students will acquire a
comprehensive understanding of the theoretical
foundations, concepts, and principles of reinforcement
learning. They will develop a solid knowledge base of the
key components, such as Markov Decision Processes,
value functions, policies, exploration-exploitation
trade-offs, and various reinforcement learning algorithms.
●​ Algorithm Implementation: Students will gain practical
experience in implementing and applying reinforcement
learning algorithms. They will be able to write code and
develop software systems to simulate environments, train
agents, and evaluate their performance. Students will
become proficient in implementing algorithms like
Q-learning, policy gradients, and value iteration.
●​ Problem Solving: Students will develop problem-solving
skills specific to reinforcement learning. They will learn
how to analyze real-world problems and formulate them
as reinforcement learning tasks. They will be able to apply
appropriate algorithms, tune hyperparameters, and
evaluate the effectiveness of different approaches in
solving the given problems.
●​ Experimental Design and Evaluation: Students will learn
how to design experiments to evaluate the performance of
reinforcement learning agents. They will acquire skills in
collecting and analyzing data, interpreting experimental
results, and drawing meaningful conclusions. Students
will be able to assess the strengths and weaknesses of
different algorithms based on empirical evaluation.
●​ Application to Real-World Scenarios: Students will gain
the ability to apply reinforcement learning techniques to
real-world domains and scenarios. They will understand
how to adapt and extend reinforcement learning
algorithms to handle complex and practical challenges.
Students will be able to identify appropriate applications
for reinforcement learning and propose effective solutions.
●​ Critical Thinking and Analysis: Students will develop
critical thinking skills by analyzing and evaluating the
theoretical and practical aspects of reinforcement learning.
They will be able to assess the advantages, limitations,
and trade-offs associated with different algorithms and
approaches. Students will be encouraged to think
creatively and propose innovative solutions to
reinforcement learning problems.
●​ Communication and Collaboration: Students will enhance
their communication and collaboration skills through
group projects, presentations, and discussions. They will
be able to effectively convey their ideas, present their
findings, and engage in constructive discussions related to
reinforcement learning concepts, algorithms, and
applications.
●​ Ethical Considerations: Students will develop an
understanding of the ethical implications and societal
impact of reinforcement learning. They will be aware of
issues such as fairness, transparency, and accountability in
the deployment of reinforcement learning systems.
Students will be encouraged to think critically about the
ethical use of reinforcement learning techniques and
consider the broader implications.
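As an illustration of the "Algorithm Implementation" outcome above (simulating an environment, training an agent, and evaluating the result), here is a minimal tabular Q-learning sketch. The five-state chain environment, the hyperparameter values, and all names are hypothetical choices for this example, not part of the course.

```python
import random

N_STATES = 5      # states 0..4; state 4 is terminal (hypothetical toy chain)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Move along the chain; reaching the right end gives reward 1 and ends the episode."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy behaviour policy, with ties broken at random
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning target bootstraps from the greedy value of the next state
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

In this toy chain the learned greedy policy moves right in every non-terminal state, which gives a simple correctness check before moving to larger environments.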
Course Contents Lecture 1: Introduction to Reinforcement Learning
Lecture 2: Exploration & Control
Exploration
Epsilon-Greedy
Upper Confidence Bound (UCB)
Thompson Sampling
Optimistic Initialization
Boltzmann Exploration
Upper Confidence Trees (UCT)
Exploitation
Greedy Algorithm
Q-Learning
Policy Gradient Methods
Actor-Critic Methods
Deterministic Policy Optimization (DPO)
Lecture 3: MDPs & Dynamic Programming
Policy Evaluation
Policy Improvement
Value Iteration
Policy Iteration
Lecture 4: Theoretical Fundamentals of Dynamic Programming
Algorithms [reading]
Principle of Optimality
Bellman Equations
Value Function Iteration
Policy Iteration
Optimal Bellman Operator
Lecture 5: Model-free Prediction
Monte Carlo Methods
Temporal Difference (TD) Learning
TD(0) or One-Step TD
TD($\lambda$) or Multi-Step TD
Lecture 6: Model-free Control
Q-Learning
SARSA (State-Action-Reward-State-Action)
Deep Q-Networks (DQN)
Policy Gradient Methods
Actor-Critic Methods
Lecture 7: Function Approximation
Parametric Function
Training Data
Loss Function and Optimization
Generalization
Lecture 8: Planning & Models
Model-Based RL
Model-Based Planning
Model-Free RL
Exploration-Exploitation Tradeoff
Lecture 9: Policy-Gradient & Actor-Critic Methods
Policy
Policy + Value
Advantage Actor-Critic (A2C), Asynchronous Advantage
Actor-Critic (A3C), and Proximal Policy Optimization
(PPO)
Lecture 10: Approximate Dynamic Programming
Complex Sequential Decision Problems
Value Iteration, Policy Iteration, Q-Learning, SARSA,
Approximate Policy Iteration (API), and Dual Heuristic
Programming (DHP)
Lecture 11: Multi-Step & Off-Policy
n-step SARSA
n-step Q-learning
Expected SARSA
Q-learning with Experience Replay
TD($\lambda$)
Lecture 12: Deep Reinforcement Learning #1
Lecture 13: Deep Reinforcement Learning #2
Deep Q-Network (DQN)
Proximal Policy Optimization (PPO)
Deep Deterministic Policy Gradient (DDPG)
Trust Region Policy Optimization (TRPO)
Twin Delayed Deep Deterministic Policy Gradient (TD3)
Soft Actor-Critic (SAC)
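
For orientation, the core update rules behind Lectures 3 to 6 can be written compactly. The notation below follows the course textbook (Sutton and Barto) and is an assumption about how the lectures present them: $\alpha$ is a step size, $\gamma$ a discount factor, and $p(s', r \mid s, a)$ the environment dynamics.

```latex
% Bellman optimality equation (Dynamic Programming, Lectures 3-4)
v_*(s) = \max_{a} \sum_{s', r} p(s', r \mid s, a) \left[ r + \gamma \, v_*(s') \right]

% TD(0) prediction update (Lecture 5)
V(S_t) \leftarrow V(S_t) + \alpha \left[ R_{t+1} + \gamma \, V(S_{t+1}) - V(S_t) \right]

% SARSA, on-policy control (Lecture 6)
Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma \, Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t) \right]

% Q-learning, off-policy control (Lecture 6)
Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \left[ R_{t+1} + \gamma \max_{a} Q(S_{t+1}, a) - Q(S_t, A_t) \right]
```

The only difference between the last two rules is the bootstrap term: SARSA uses the action actually taken next, while Q-learning uses the greedy action.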

Pre-requisites Linear Algebra, Probability and Statistics, Fundamentals of
Machine Learning

Teaching & Learning Methods Lectures, assignments, projects, and exercises

Assessment/Evaluation & Grading System
● Mid Exam - 15
● Seminar - 15
○ Three Seminars
● Lab Work and Quizzes - 15
● Project - 30
● Final Exam - 25

Attendance Requirements 85% attendance is required.


References Textbook:
Richard S. Sutton and Andrew G. Barto, "Reinforcement
Learning: An Introduction"

Target Institute: DeepMind
