@tareknaser
Member

Description

This pull request implements the DDPG (Deep Deterministic Policy Gradient) algorithm, along with two test cases.

Implementation details

DDPG is an off-policy actor-critic algorithm designed for continuous action spaces. It trains a deterministic policy (the actor) using gradients supplied by a learned Q-function (the critic), with both represented by deep neural networks.
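
For reference, the standard DDPG updates can be written as below. This is the textbook formulation, not something taken from this PR's code: the notation is chosen here for illustration ($\mu_\theta$ for the actor, $Q_\phi$ for the critic, primed parameters for their target copies, $\gamma$ for the discount factor, $d_i$ for the terminal flag, $N$ for the batch size), and the PR description does not state its exact loss or hyperparameters.

```latex
\begin{aligned}
y_i &= r_i + \gamma\,(1 - d_i)\, Q_{\phi'}\bigl(s'_i,\ \mu_{\theta'}(s'_i)\bigr) \\
L(\phi) &= \frac{1}{N}\sum_{i=1}^{N}\bigl(y_i - Q_\phi(s_i, a_i)\bigr)^2 \\
\nabla_\theta J &\approx \frac{1}{N}\sum_{i=1}^{N}
  \nabla_a Q_\phi(s_i, a)\big|_{a=\mu_\theta(s_i)}\ \nabla_\theta \mu_\theta(s_i)
\end{aligned}
```

The critic regresses toward the target $y_i$, and the actor follows the critic's gradient with respect to the action.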

The implementation uses four networks:

  • policyNetwork (actor network)
  • targetPNetwork (target actor network)
  • learningQNetwork (critic network)
  • targetQNetwork (target critic network)
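
In DDPG the two target networks usually track their learning counterparts through a soft (Polyak) update. The sketch below shows that rule on flat parameter vectors. `SoftUpdate` and the parameter name `tau` are hypothetical names chosen for illustration, not mlpack's API, and the PR description does not say how its target networks are synchronized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not mlpack's API): blends the learning network's
// parameters into the target network's parameters in place, using
//   target <- tau * learning + (1 - tau) * target
void SoftUpdate(const std::vector<double>& learning,
                std::vector<double>& target,
                const double tau)
{
  // Both networks must have the same number of parameters.
  assert(learning.size() == target.size());
  // tau = 0 leaves the target unchanged; tau = 1 copies the learning network.
  assert(tau >= 0.0 && tau <= 1.0);

  for (std::size_t i = 0; i < target.size(); ++i)
    target[i] = tau * learning[i] + (1.0 - tau) * target[i];
}
```

Under this assumption, the helper would be called once for the actor pair (policyNetwork, targetPNetwork) and once for the critic pair (learningQNetwork, targetQNetwork) after each training step, typically with a small `tau` so the targets change slowly.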

How Has This Been Tested?

  • Added a Pendulum test that passes on my machine. The agent reaches an average reward of approximately -500 in the Pendulum environment, which indicates that it is learning.
  • Added a second test for continuous action spaces, which also passes.

The test configurations for DDPG are adapted from the SAC (Soft Actor-Critic) implementation, because both DDPG and SAC are off-policy policy-gradient algorithms. Reusing the configurations keeps the evaluation consistent and makes the two algorithms easier to compare.

Member

@zoq left a comment

No more comments from my side, awesome work.

@mlpack-bot (bot) left a comment

Second approval provided automatically after 24 hours. 👍

@shubham1206agra merged commit 590c1b1 into mlpack:master on Jun 20, 2023
@tareknaser deleted the ddpg branch on July 8, 2023, 12:41
@rcurtin mentioned this pull request on Sep 5, 2023