Buffers such as running_mean and running_std are essential for restoring the model's behavior, but they are not saved via parameter_dict.
I think there's real value in having a call that returns a dict which includes these buffers, and a matching call that restores them from that dict.
For example, in the dcgan example, I switched to checkpointing the model's parameter dict instead of the model itself. This is much cleaner than saving the model, because I have certain variables, like nGPU, that change from run to run and that I don't want to save as part of the model.
How should we solve this?
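
To make the proposal concrete, here is a rough sketch in plain Python of what I have in mind. Everything in it is hypothetical and only for illustration: ToyBatchNorm, the parameters/buffers attributes, and the state_dict / load_state_dict names are made up here and are not an existing API. The idea is that a single dict carries both the learnable parameters and the buffers, while run-specific settings such as nGPU stay out of the checkpoint.

```python
import copy


class ToyBatchNorm:
    """Hypothetical stand-in for a module that has parameters and buffers."""

    def __init__(self, num_features, n_gpu=1):
        # Learnable parameters (what a parameter dict already covers).
        self.parameters = {
            "weight": [1.0] * num_features,
            "bias": [0.0] * num_features,
        }
        # Non-learnable buffers that the forward pass still depends on.
        self.buffers = {
            "running_mean": [0.0] * num_features,
            "running_std": [1.0] * num_features,
        }
        # Run-specific setting that should not end up in the checkpoint.
        self.n_gpu = n_gpu

    def state_dict(self):
        """Return a copy of the parameters and buffers in one flat dict."""
        state = {}
        state.update(self.parameters)
        state.update(self.buffers)
        return copy.deepcopy(state)

    def load_state_dict(self, state):
        """Restore parameters and buffers from a dict made by state_dict()."""
        for store in (self.parameters, self.buffers):
            for name in store:
                store[name] = copy.deepcopy(state[name])


# Checkpoint a module that was "trained" on 4 GPUs ...
trained = ToyBatchNorm(2, n_gpu=4)
trained.buffers["running_mean"] = [0.5, -0.3]
checkpoint = trained.state_dict()

# ... and restore it into a fresh module configured for 1 GPU.
restored = ToyBatchNorm(2, n_gpu=1)
restored.load_state_dict(checkpoint)
```

With something along these lines, the checkpoint holds only weight, bias, running_mean and running_std, so the restored module gets the running statistics back but keeps its own nGPU setting.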