Course Lecturers: Wei Pan and Mingfei Sun
This is an MSc/CDT module offered by the Department of Computer Science at the University of Manchester.
This course explores how agents learn to make decisions through interaction with an environment, guided by rewards. It covers fundamental tabular methods (MDPs, value functions, model-free RL), function approximation & deep RL (linear function approximation, LSTD, deep RL), policy gradients & model-based RL (actor-critic, natural gradients, model-based exploration), and POMDPs & multi-agent RL (belief MDPs, deep RL for POMDPs, cooperative multi-agent RL). The course combines theoretical insights with practical implementations using modern RL frameworks.
| Date (COMP64202) | Lab Session | Lab Link | Location |
|---|---|---|---|
| Feb 12, 2025 | Markov Decision Processes | Lab1 | Kilburn_2.25 (A+B) |
| Feb 26, 2025 | Function Approximation | Lab2 | Kilburn_2.25 (A+B) |
| Mar 19, 2025 | Policy Gradients | Lab3 | Kilburn_2.25 (A+B) |
| Apr 2, 2025 | POMDP and Multi-Agent RL | Lab4 | Kilburn_2.25 (A+B) |