[jit] Remove torch.save-related logic from pickler
#25502
Conversation
zdevito
left a comment
The changes that move torch::save out are good. I don't think we should be removing tensor references at the same time. They are the most appropriate way for the RPC world to deal with tensor data. In particular, if the tensors are on GPUs there may be a more efficient pathway to copy them, but the logic inside the pickler is going to force a copy to the CPU and a copy back. It would be best to split the torch::save changes from the tensor-reference changes and revisit how the pickler should work for RPC.
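For context, a minimal sketch of the reference-based path being discussed (assuming the `jit::pickle` overload that takes an optional tensor table; `pickle_with_refs` is a hypothetical helper name, not from this diff):

```cpp
#include <torch/csrc/jit/pickle.h>  // header path assumed for this branch
#include <utility>
#include <vector>

// Pickle an IValue but keep tensor data out of the byte stream: each tensor is
// recorded in tensor_table and only a reference (its index) is written into the
// pickle, so an RPC layer can move the tensors with its own, possibly GPU-aware,
// transport instead of the pickler forcing a copy to CPU and back.
std::pair<std::vector<char>, std::vector<at::Tensor>> pickle_with_refs(
    const torch::IValue& ivalue) {
  std::vector<at::Tensor> tensor_table;
  std::vector<char> bytes = torch::jit::pickle(ivalue, &tensor_table);
  return {std::move(bytes), std::move(tensor_table)};
}
```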
```cpp
  archive.save_to(std::forward<SaveToArgs>(args)...);
}

TORCH_API std::vector<char> save(const torch::IValue& ivalue);
```
This is the C++ Module API. I am not sure this is the right place to put the save function. It might be more appropriately defined where torch::load for modules is defined.
+1, this API doesn't seem to belong here: it returns std::vector<char>, while all the other torch::save APIs return nothing and are meant to be used as torch::save(value, filepath). I'd suggest naming this function pickle_save or similar, so that C++ API end users won't be confused.
By the way, do we have this function in the Python API? Or is it an implementation detail that would only be used by the JIT?
The function (renamed to pickle_save) should produce the same output as torch.save() in Python, so there should be a similar API here too. A rename is fine given the clash with the existing torch::save/torch::load. It's intended to be used as an API function, e.g. in #25591.
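A rough usage sketch under that assumption (the header choice and the `pickle_load` inverse are assumptions for illustration, not part of this diff):

```cpp
#include <torch/torch.h>  // assuming torch::pickle_save is exposed via the C++ API umbrella header

int main() {
  // In-memory serialization, in contrast to torch::save(value, "file.pt"),
  // which writes an archive to disk and returns nothing.
  at::Tensor t = torch::ones({2, 3});
  std::vector<char> data = torch::pickle_save(t);  // bytes in torch.save() format

  // Hypothetical inverse shown only for symmetry; loading is out of scope here.
  // torch::IValue loaded = torch::pickle_load(data);
  return 0;
}
```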
Thanks for flagging, @zdevito. I agree it would be great if the copy to CPU is done as part of … What I don't yet fully get here is the tradeoff between storing a tensor instead of a storage object in … Also cc @lerks since we had a discussion about this.
This reverts commit 7a0dade.
@driazati Looking at the recent changes, do you now indeed plan to keep the tensor table around and always serialize the tensors' metadata (since …)? If so, the comment in the summary should be updated.
The Pickler previously had a distinction between tensors that would be inlined in one pickle binary (matching the format of torch.save()) and tensors that are saved elsewhere, with only a reference stored in the binary. This PR moves that distinction out to torch::pickle_save to match the eager Python interface. The change can be seen in register_prim_ops.cpp, where the call to jit::pickle is now torch::pickle_save.

Differential Revision: D17175215
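For illustration only, the shape of that call-site change might look roughly like this (a sketch with a hypothetical save_ivalue helper, not the literal diff from register_prim_ops.cpp):

```cpp
#include <torch/torch.h>
#include <fstream>
#include <string>
#include <vector>

// Sketch: instead of assembling the torch.save-style container on top of
// jit::pickle itself, the call site delegates the container format to
// torch::pickle_save and only handles writing the bytes out.
void save_ivalue(const torch::IValue& ivalue, const std::string& filename) {
  std::vector<char> data = torch::pickle_save(ivalue);  // torch.save()-compatible bytes
  std::ofstream out(filename, std::ios::binary);
  out.write(data.data(), data.size());
}
```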