Skip to content

Conversation

@jjlilley
Copy link

@jjlilley jjlilley commented Oct 10, 2019

Summary:
This change adds a small fixed-size buffer to Pickler to
avoid calling writer_() and the associated downstream checks
on a per-opcode/per-byte basis.

We end up still doing a bounds check in the common case,
but the memcpy() is a fixed size. And we reduce the number
of backend calls.

In practice, this change speeds up the Pickle1MInts benchmark
for me locally from roughly 56msec to 22msec.

Additionally, in this change we convert a few pushIValue() on
typed lists, where we know the type to be double/int/boot to be
pushInt() to bypass a bit of logic.

We should additionally change the Unpickler, though keeping
this separate, since the std::function<> prototype needs to be
changed for this to work (i.e. return size_t rather than bool).

@jjlilley jjlilley requested a review from apaszke as a code owner October 10, 2019 22:43
@pytorchbot pytorchbot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Oct 10, 2019
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D17847174

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D17847174

Summary:
Pull Request resolved: pytorch#27720

This change adds a small fixed-size buffer to Pickler to
avoid calling writer_() and the associated downstream checks
on a per-opcode/per-byte basis.

We end up still doing a bounds check in the common case,
but the memcpy() is a fixed size. And we reduce the number
of backend calls.

In practice, this change speeds up the Pickle1MInts benchmark
for me locally from roughly 56msec to 22msec.

Additionally, in this change we convert a few pushIValue() on
typed lists, where we know the type to be double/int/boot to be
pushInt() to bypass a bit of logic.

We should additionally change the Unpickler, though keeping
this separate, since the std::function<> prototype needs to be
changed for this to work (i.e. return size_t rather than bool).

Test Plan:
buck test mode/dev-nosan caffe2/test:...
  Benchmark in experimental/jeremyl/c2/SerializationBench.cpp (run in mode/opt)

Differential Revision: D17847174

fbshipit-source-id: 30eace0187f71d9fff13a53013227b780f93c6c0
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D17847174

@jjlilley jjlilley requested review from resistor, suo and zdevito October 11, 2019 01:54
Copy link
Contributor

@zdevito zdevito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjlilley is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@jjlilley
Copy link
Author

Would you mind stamping #28043 ?

I used the web-based export-from-phabricator originally with this diff, which alas is suffering from an Infra failure, and apparently using the web tool makes ghexport not work.

The change 28043 is identical except that I addressed the two pending review comments:

  • I added the static assert.
  • I removed the TODO comment re: LONG4 so I could fix that in a followup change

Thanks!

@facebook-github-bot
Copy link
Contributor

@jjlilley merged this pull request in 3ed9a6e.

thiagocrepaldi pushed a commit to thiagocrepaldi/pytorch that referenced this pull request Feb 4, 2020
Summary:
This change adds a small fixed-size buffer to Pickler to
avoid calling writer_() and the associated downstream checks
on a per-opcode/per-byte basis.

We end up still doing a bounds check in the common case,
but the memcpy() is a fixed size. And we reduce the number
of backend calls.

In practice, this change speeds up the Pickle1MInts benchmark
for me locally from roughly 56msec to 22msec.

Additionally, in this change we convert a few pushIValue() on
typed lists, where we know the type to be double/int/boot to be
pushInt() to bypass a bit of logic.

We should additionally change the Unpickler, though keeping
this separate, since the std::function<> prototype needs to be
changed for this to work (i.e. return size_t rather than bool).
Pull Request resolved: pytorch#27720

Test Plan:
buck test mode/dev-nosan caffe2/test:...
  Benchmark in experimental/jeremyl/c2/SerializationBench.cpp (run in mode/opt)

Differential Revision: D17847174

Pulled By: jjlilley

fbshipit-source-id: 22e5e5fd33f1a369c124ea5aac7880538e2bf6a0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Merged oncall: jit Add this issue/PR to JIT oncall triage queue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants