Noise in the environment 🐛  #385

@riccardopoiani

Description

Currently, the data generation process works as follows.
When the environment configuration includes noise and the seed is fixed, the data generation process is fixed as well, so it always produces the same "environment trajectory" (i.e., the same order distribution, the same vessel speeds, and the same vessel parking noise).
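
To make the symptom concrete, here is a minimal sketch of the kind of behavior I mean. It does not use MARO's actual API: `ToyEnv`, `reset`, and `sample_order_noise` are hypothetical names, and I am only assuming that the noise generator is effectively re-seeded with the same value on every reset.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a noisy environment (not MARO's API)."""

    def __init__(self, seed):
        self.seed = seed
        self.reset()

    def reset(self):
        # Re-seeding with the same value on every reset replays the same noise.
        self._rng = random.Random(self.seed)

    def sample_order_noise(self, n=5):
        # Draw n Gaussian noise values from the environment's generator.
        return [self._rng.gauss(0.0, 1.0) for _ in range(n)]


env = ToyEnv(seed=42)
first = env.sample_order_noise()
env.reset()
second = env.sample_order_noise()
print(first == second)  # True: both episodes see identical "noise"
```

In this toy version, every episode after a reset sees exactly the same noise values, which is the pattern I observe in the generated data.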

Expected Behavior

I would expect each "environment trajectory" to differ from the previous one (i.e., after a reset): a different noise realization should be applied each time the environment is reset.
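
One possible way to get this while keeping whole runs reproducible is sketched below. Again, this is only an illustration under my own assumptions, not a proposal for MARO's actual code: `ToyEnvFreshNoise` and `run_episodes` are hypothetical names. The idea is to seed a master generator once and draw a fresh per-episode seed from it at each reset.

```python
import random


class ToyEnvFreshNoise:
    """Hypothetical environment that draws new noise after every reset."""

    def __init__(self, seed):
        # The master generator is seeded once and never re-seeded.
        self._seed_source = random.Random(seed)
        self.reset()

    def reset(self):
        # Each reset derives a new episode seed from the master stream.
        episode_seed = self._seed_source.getrandbits(64)
        self._rng = random.Random(episode_seed)

    def sample_order_noise(self, n=5):
        # Draw n Gaussian noise values from the per-episode generator.
        return [self._rng.gauss(0.0, 1.0) for _ in range(n)]


def run_episodes(seed, episodes=3):
    # Collect the noise seen in each episode of a run with the given seed.
    env = ToyEnvFreshNoise(seed)
    trajectories = []
    for _ in range(episodes):
        trajectories.append(env.sample_order_noise())
        env.reset()
    return trajectories


run_a = run_episodes(seed=42)
run_b = run_episodes(seed=42)
```

With this scheme, two runs started from the same seed should yield the same sequence of episodes, while consecutive episodes within a run get different noise.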

This is also crucial for the reasons mentioned in both of your papers: otherwise, the environment is fully deterministic, and one of the main reasons to apply RL-based methods is their ability to handle uncertainty, as occurs in real scenarios.
If this is not done, the performance that any RL-based method achieves is flawed: the method is clearly overfitting the "noise" in that specific configuration (at this point, it is even misleading to call it noise, since every trajectory generates exactly the same data).

Environment

  • MARO version (e.g., v0.1.1a1): master
  • MARO scenario (CIM, Citi Bike): CIM
  • MARO component (Simulation, RL, Distributed Training): Simulation
  • Orchestration platform (GraSS on Azure, AKS on Azure):
  • How you installed MARO (pip, source): source
  • OS (Linux, Windows, macOS): Linux
  • Python version (3.6, 3.7): 3.7
  • Docker image (e.g., maro2020/maro:latest):
  • CPU/GPU:
  • Any other relevant information:

Metadata

    Labels

    🐛 bug: Something isn't working.
