TrainsSaver doesn't respect Checkpoint's n_saved #1056

@achigeor

🐛 Bug description

As the title says, TrainsSaver appears to ignore Checkpoint's n_saved parameter: every checkpoint is kept, and older ones are never replaced or deleted.

Consider this simple example:

        task.phases['train'].add_event_handler(
            Events.EPOCH_COMPLETED(every=1),
            Checkpoint(to_save, TrainsSaver(output_uri=output_uri), 'epoch', n_saved=1,
                       global_step_transform=global_step_from_engine(task.phases['train'])))

The above saves every checkpoint. You end up with:

epoch_checkpoint_1.pt
epoch_checkpoint_2.pt
epoch_checkpoint_3.pt
...

Now, if we do the same with DiskSaver:

        task.phases['train'].add_event_handler(
            Events.EPOCH_COMPLETED(every=1),
            Checkpoint(to_save, DiskSaver(dirname=dirname), 'epoch', n_saved=1,
                       global_step_transform=global_step_from_engine(task.phases['train'])))

We get only:

epoch_checkpoint_3.pt

as expected.
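
To make the expected behaviour concrete, here is a minimal pure-Python sketch of the retention policy I would expect n_saved to enforce regardless of which saver is used. It is not Ignite's actual implementation: `InMemorySaver` and `keep_last` are hypothetical names made up for illustration, and the checkpoint payload is a placeholder dict.

```python
from collections import deque


class InMemorySaver:
    """Hypothetical stand-in for a save handler; keeps checkpoints in a dict."""

    def __init__(self):
        self.files = {}

    def save(self, filename, checkpoint):
        self.files[filename] = checkpoint

    def remove(self, filename):
        del self.files[filename]


def keep_last(saver, n_saved, num_epochs):
    """Save one checkpoint per epoch, keeping only the n_saved most recent."""
    saved = deque()
    for epoch in range(1, num_epochs + 1):
        filename = f"epoch_checkpoint_{epoch}.pt"
        saver.save(filename, {"epoch": epoch})
        saved.append(filename)
        # Once the limit is exceeded, the oldest checkpoint is removed.
        while len(saved) > n_saved:
            saver.remove(saved.popleft())
    return sorted(saver.files)


print(keep_last(InMemorySaver(), n_saved=1, num_epochs=3))
# → ['epoch_checkpoint_3.pt']
```

If the removal step never takes effect, the same loop leaves all three files behind, which is what the TrainsSaver output above looks like.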

The same behaviour occurs when saving only the best models using score_function: TrainsSaver keeps every new best model instead of only the top n_saved.
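
For the score_function case, the retention I would expect can be sketched the same way. This is an illustration only, not Ignite's code: `keep_best`, the `best_checkpoint_N.pt` filenames, and the example scores are all hypothetical.

```python
import heapq


def keep_best(scores, n_saved):
    """Return the filenames that should remain when only the n_saved
    highest-scoring checkpoints are kept (one score per epoch)."""
    kept = []  # min-heap of (score, filename); kept[0] is the worst one kept
    for epoch, score in enumerate(scores, start=1):
        filename = f"best_checkpoint_{epoch}.pt"
        if len(kept) < n_saved:
            heapq.heappush(kept, (score, filename))
        elif score > kept[0][0]:
            # A better model arrived: drop the worst kept checkpoint.
            heapq.heapreplace(kept, (score, filename))
    return sorted(name for _, name in kept)


print(keep_best([0.5, 0.7, 0.6, 0.9], n_saved=1))
# → ['best_checkpoint_4.pt']
```

With these scores, new bests occur at epochs 1, 2 and 4. Keeping every new best would leave three files, whereas n_saved=1 should leave only the last one.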

Environment

  • PyTorch Version: 1.3.1
  • Ignite Version: 0.4.0.dev20200519 (EDIT: updated to the latest nightly, issue still exists)
  • OS: Linux
  • How you installed Ignite: pip nightly
  • Python version: 3.6
  • Any other relevant information: trains version: 0.14.3
