
mathias-madsen/reinforce_tutorial

REINFORCE tutorial

This repository contains a collection of scripts and notes that explain the basics of the so-called REINFORCE algorithm, a method for estimating the derivative of an expected value with respect to the parameters of a distribution.
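As a rough illustration of what "estimating the derivative of an expected value" means here, the sketch below applies the score-function identity, d/dθ E[f(x)] = E[f(x) · d/dθ log p(x; θ)], to a unit-variance Gaussian with mean `mu`. The function name `reinforce_gradient`, the choice of Gaussian, and the test function f(x) = x² are assumptions made for this example and are not taken from the scripts in the repository.

```python
import numpy as np


def reinforce_gradient(f, mu, num_samples=100_000, seed=0):
    """Score-function (REINFORCE) estimate of d/dmu E[f(x)] for x ~ N(mu, 1).

    Hypothetical helper written for illustration only.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=mu, scale=1.0, size=num_samples)
    # For a unit-variance Gaussian, d/dmu log p(x; mu) = x - mu.
    score = x - mu
    # Average f(x) weighted by the score; no derivative of f is needed.
    return np.mean(f(x) * score)


mu = 1.5
estimate = reinforce_gradient(lambda x: x ** 2, mu)
exact = 2.0 * mu  # E[x^2] = mu^2 + 1, so the exact derivative is 2 * mu
print(f"estimate: {estimate:.3f}, exact: {exact:.3f}")
```

The estimator only needs samples of `x` and the values `f(x)`, which is why the same idea carries over to settings where `f` is a non-differentiable reward. The price is variance: the estimate is noisy, and it typically takes many samples (or a baseline subtracted from `f`) to get a stable gradient.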

The method was introduced into the reinforcement learning literature by Ronald J. Williams in "Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning" (Machine Learning, 1992) but has earlier precedents.

This repository was created to provide some background material for a talk I gave on 6 March 2017 at the Berlin machine learning meet-up. The slides from the talk are also available here, although they are not completely self-explanatory.

I have also included a few theoretical notes which explain various aspects of REINFORCE, Trust Region Policy Optimization, and other policy gradient methods:

These papers were originally written for internal use in my company, the robot software company micropsi industries, but are now freely available.
