
Conversation

@zdevito (Contributor) commented Jul 24, 2019

Stack from ghstack:

This allows tensors to be loaded directly from the pickle archive,
using the same encoding as the eager serializer. The one difference
is that the tensor data is still stored in a zip file rather
than appended to the archive, which still allows for aligning the tensor data.

With this change, we no longer rely on the json file for any information, so this PR also stops writing the json file and loads modules directly from the pickle.

New layout is:

  • code/ - all code goes in these folders with python names
  • constants{.pkl,/} - pickled code constants; these appear in the code, so they cannot go in data, since that would create a circular dependency.
  • data{.pkl,/} - pickled data, with tensors in the data/ folder.
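For illustration, the layout above can be mimicked and inspected with Python's zipfile module. This is only a sketch: the entry names below are hypothetical examples of the described structure, not taken from a real archive (a real one is produced by torch.jit.save):

```python
import io
import zipfile

# Build an in-memory zip mimicking the layout described above.
# Entry names are illustrative, not copied from a real torchscript archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("model/code/__torch__.py", "# compiled module code\n")
    zf.writestr("model/constants.pkl", b"")    # pickled code constants
    zf.writestr("model/data.pkl", b"")         # pickled module data
    zf.writestr("model/data/0", b"\x00" * 16)  # raw tensor storage

with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()

print(names)
```

Keeping the tensor payloads as separate zip entries (rather than inline in the pickle stream) is what allows the writer to pad and align them.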

Differential Revision: D16452815

[WIP] Load tensors directly from pickle archive

gh-metadata: pytorch pytorch 23281 gh/zdevito/78/head
result = autograd::make_variable(result, requires_grad);
stack_.push_back(std::move(result));
});
} else if (module_name == "collections" && class_name == "OrderedDict") {
Collaborator:
This is a bit scary. Is there a stricter condition we can enforce here (e.g. that this path is reached only during tensor deserialization)?
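For context on why the unpickler must recognize this pair of names at all: collections.OrderedDict is referenced by a GLOBAL opcode in ordinary Python pickles, so the literal module and class names appear in the stream. A quick check with the standard pickle module:

```python
import collections
import pickle

# At protocol 2, pickle names the class via a GLOBAL opcode, so the
# module and class names appear verbatim in the serialized bytes.
payload = pickle.dumps(collections.OrderedDict(a=1), protocol=2)
print(b"collections" in payload, b"OrderedDict" in payload)
```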

}
});
} else if (
module_name == "torch._utils" && class_name == "_rebuild_tensor_v2") {
Collaborator:
It would be nice to extract this into a subfunction and to document explicitly, in both the Python and C++ implementations, that the two functions must match semantically.
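The reason the two sides must match is the shape of the pickle REDUCE protocol: the stream records only the name of a rebuild function plus its arguments, and the loader resolves the name and calls it. A toy Python analogue (Point and _rebuild_point are hypothetical, standing in for torch._utils._rebuild_tensor_v2; the C++ unpickler instead pattern-matches the module/class names and reimplements the rebuild logic, which is why the implementations can drift apart):

```python
import pickle

def _rebuild_point(x, y):
    # Hypothetical rebuild helper: the loader resolves this function by
    # name and calls it with the pickled arguments.
    return Point(x, y)

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __reduce__(self):
        # (callable, args) is exactly the payload of the REDUCE opcode.
        return (_rebuild_point, (self.x, self.y))

p = pickle.loads(pickle.dumps(Point(1, 2)))
print(p.x, p.y)
```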

Load tensors directly from pickle archive
@pytorchbot pytorchbot added caffe2 oncall: distributed Add this issue/PR to distributed oncall triage queue labels Aug 20, 2019
@zdevito zdevito requested a review from suo August 20, 2019 03:59
@zdevito zdevito changed the title from "[WIP] Load tensors directly from pickle archive" to "Load tensors directly from pickle archive" Aug 20, 2019

ScriptModuleSerializer2(std::ostream* ofs) : ofs_(), writer_(ofs) {}

void writeArchive(const std::string& archive_name, const IValue& value) {
Member:
Doesn't look like this needs to be public?

/// `bounds_checker` is a function that returns `true` if the reader can read
/// more data, and `false` if it cannot (i.e. if a stream has hit its end of
/// file)
///
Member:
This comment is now out of date; it still references the bounds_checker arg.


public:
// tensors inside the pickle are references to the tensor_table
Unpickler(
Member:
Is this version legacy-only? If so, we should note it in the comments

Contributor (author):

It is possible that RPC will be using this version, and completely handling tensor serialization on its own. Not sure though.

module_name == "torch._utils" &&
(class_name == "_rebuild_tensor_v2" ||
class_name == "_rebuild_qtensor")) {
bool quantized = class_name == "_rebuild_qtensor";
Member:
Ugh, I don't like that quantized tensors are somehow a special case in the serialization format.

Contributor (author):

I agree, but this is how it works in eager serialization, so we are stuck with it if we want torch.save from TorchScript to work with torch.load.

/// file)
///
/// See `torch::pickle` for details.
TORCH_API IValue unpickle(
Member:
In general we should think a bit about how we want the Pickle C++ API to look at the end of this. @driazati, any thoughts? At least I think we should get rid of tensor_table if possible

@driazati (Contributor) commented Aug 20, 2019:

Any use of the tensor_table would be covered by pickle's persistent_id/persistent_load functions (which we could implement if we need to in the future). This is already how eager's serialization works anyway, so it's fine to get rid of the tensor_table.
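The persistent_id mechanism mentioned above can be sketched in pure Python. FakeTensor, tensor_table, and the pickler/unpickler subclasses below are hypothetical stand-ins, not PyTorch code; they show how bulk data can live outside the pickle stream while the stream carries only an index:

```python
import io
import pickle

class FakeTensor:
    """Stand-in for a tensor whose bulk data lives outside the pickle."""
    def __init__(self, data):
        self.data = data

tensor_table = []  # out-of-band storage, like the serializer's tensor table

class TablePickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Instead of pickling the tensor bytes inline, put them in the
        # side table and emit only the table index into the stream.
        if isinstance(obj, FakeTensor):
            tensor_table.append(obj.data)
            return len(tensor_table) - 1
        return None  # everything else is pickled normally

class TableUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolve the index back into the out-of-band data.
        return FakeTensor(tensor_table[pid])

buf = io.BytesIO()
TablePickler(buf).dump({"weight": FakeTensor(b"\x01\x02")})
buf.seek(0)
restored = TableUnpickler(buf).load()
print(restored["weight"].data)
```

In the PR's scheme the "side table" becomes named records in the zip archive, and persistent_load is the read_record callback that fetches them.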


void writeArchive(const std::string& archive_name, const IValue& value) {
std::vector<char> data;
Pickler data_pickle(
Contributor:
It looks like jit::pickle(...) would be fine here instead of the manual use of the Pickler

Contributor (author):
Right after this it uses data_pickle.tensorData, which is not exposed by those functions.

ss << archive_name << "/" << name;
return std::get<0>(reader_->getRecord(ss.str()));
};
Unpickler unpickler(
Contributor:
Same here, this can be jit::unpickle

Contributor (author):
The user-exposed pickle functions do not take the read_record or device_ arguments needed here to load the tensors.

@facebook-github-bot (Contributor)

@zdevito merged this pull request in e2cccce.


Labels

caffe2, Merged, oncall: distributed, oncall: jit

8 participants