[pytorch] In torch::save() avoid zip compressing small header records. #28180

jjlilley · 2019-10-16T23:52:33Z

Stack from ghstack:

- [pytorch] String optimizations related to serialization. #28230
- [pytorch] In torch::save() avoid zip compressing small header records. #28180

ScriptModuleSerializer::writeCode() is the only place during torch::save()
serialization where we attempt to zip compress records.

This change skips compression for these string records when they are
sufficiently small. In the example I looked at:

- the strings were 123 and 28 bytes, respectively.
- the cost in the compression routines was 16.5% of the torch::save() cost
  (we're building a Huffman table for a 28-byte string).

Making the compression conditional instead of unconditional is a 1-line
change at each call site. It saves time and does not significantly affect
the size of the output.
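A minimal sketch of the idea, not the code from this PR. The description does not state the cutoff, so `kMinBytesToCompress` and its value of 200 are assumptions chosen for illustration, and `FakeRecordWriter`, `shouldCompress`, and `writeStringRecord` are hypothetical names standing in for the real archive writer and call sites.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical cutoff, chosen only for illustration; the PR text does not
// state the actual value.
constexpr std::size_t kMinBytesToCompress = 200;

// Decide whether a record is large enough for compression to be worthwhile.
inline bool shouldCompress(std::size_t size) {
  return size > kMinBytesToCompress;
}

// Stand-in for an archive writer. It only records what was requested; it
// does not produce a zip file.
struct FakeRecordWriter {
  struct Entry {
    std::string name;
    std::size_t size;
    bool compressed;
  };
  std::vector<Entry> entries;

  void writeRecord(const std::string& name, const void* data,
                   std::size_t size, bool compress) {
    assert(data != nullptr || size == 0);
    entries.push_back({name, size, compress});
  }
};

// Write a string record, requesting compression only above the cutoff.
// The conditional in the last argument is the "1-line" part of the change.
inline void writeStringRecord(FakeRecordWriter& writer,
                              const std::string& name,
                              const std::string& src) {
  writer.writeRecord(name, src.data(), src.size(),
                     shouldCompress(src.size()));
}
```

With a cutoff like this, the 28-byte and 123-byte strings from the profile would be stored uncompressed, while larger source records would still be compressed.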

Differential Revision: D17967995

jjlilley · 2019-10-16T23:55:16Z

fwiw, there's a pdf of the profile under D17967995...

jjlilley · 2019-10-17T16:15:41Z

needed to rebase, to resolve merge conflicts.

facebook-github-bot · 2019-10-18T04:10:41Z

This pull request has been merged in d7ff34c.

…#28180)

Summary:
Pull Request resolved: pytorch#28180

ScriptModuleSerializer::writeCode() is the only place during torch::save()
serialization where we attempt to zip compress records.

This change avoids compressing these string records if they are
sufficiently small - e.g. in the example I looked at:

- the strings were 123 and 28 bytes, respectively.
- the cost in the compression routines was 16.5% of the torch::save() cost.
  (we're building a huffman table for a 28 byte string).

We'd save time and not significantly affect the space if we add these
1-line conditional compressions, rather than making it unconditional.

ghstack-source-id: 92104517

Test Plan:
Benchmark: experimental/jeremyl/c2:SerializationBench
Correctness: normal buck mode/dev-nosan caffe2/test/...

Differential Revision: D17967995

fbshipit-source-id: 7ff934388533645dc987e105c814ffe6324f4596

jjlilley requested a review from apaszke as a code owner October 16, 2019 23:52

facebook-github-bot added the oncall: jit label Oct 16, 2019

jjlilley requested review from suo and zdevito October 17, 2019 00:39

jjlilley mentioned this pull request Oct 17, 2019

[pytorch] String optimizations related to serialization. #28230

Closed

jjlilley requested a review from driazati October 17, 2019 20:56

zdevito approved these changes Oct 18, 2019

facebook-github-bot closed this in d7ff34c Oct 18, 2019

facebook-github-bot added the merged label Oct 18, 2019

facebook-github-bot deleted the gh/jjlilley/6/head branch October 28, 2019 22:16

mruberry added the Merged label Oct 28, 2020
