Conversation


@didicout didicout commented May 6, 2022

Fixes #76950

Improve the performance of iteration on `BatchSampler`, especially when `batch_size` is large.

Python 3.6.8:

```
  batch_size  drop_last      origin(avg)    origin(std)   new(avg)   new(std)  speedup
------------  -----------  -------------  -------------  ---------  ---------  -------
           4  True                 0.026      0.0001183      0.032  0.0004839  -18.07%
           4  False                0.026      2.326e-05      0.022  3.317e-05  15.92%
           8  True                 0.025      0.0001441      0.023  0.0001204  9.43%
           8  False                0.025      4.457e-05      0.019  3.951e-05  30.90%
          64  True                 0.024      0.0005505      0.015  0.0002987  54.99%
          64  False                0.023      8.014e-05      0.016  1.999e-05  49.64%
         640  True                 0.024      0.0005598      0.015  2.327e-05  66.26%
         640  False                0.024      2.229e-05      0.016  2.358e-05  48.32%
        6400  True                 0.025      0.0005789      0.015  2.935e-05  69.06%
        6400  False                0.025      2.529e-05      0.017  6.737e-05  45.17%
```

Python 3.8.12:

```
  batch_size  drop_last      origin(avg)    origin(std)   new(avg)   new(std)  speedup
------------  -----------  -------------  -------------  ---------  ---------  -------
           4  True                 0.017      0.0001389      0.019  3.026e-05  -10.50%
           4  False                0.017      9.909e-06      0.017  2.37e-05   -0.78%
           8  True                 0.016      1.251e-05      0.013  2.748e-05  24.40%
           8  False                0.016      5.523e-05      0.015  1.156e-05  10.20%
          64  True                 0.015      2.483e-05      0.008  1.859e-05  90.96%
          64  False                0.015      2.622e-05      0.012  3.227e-05  26.09%
         640  True                 0.016      6.405e-05      0.007  1.836e-05  112.88%
         640  False                0.016      1.284e-05      0.013  0.0004079  20.09%
        6400  True                 0.016      8.298e-05      0.008  1.507e-05  111.80%
        6400  False                0.016      1.716e-05      0.014  0.0004602  18.37%
```

See the issue page for more details on the tests.
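The diff itself is not shown in this conversation. As a rough illustration of the kind of change described above (an editor's sketch, not necessarily the exact merged code; the `batched_indices` helper name is hypothetical), per-element `append` calls and `len()` checks in `BatchSampler.__iter__` can be replaced with batch-at-a-time construction:

```python
from typing import Iterable, Iterator, List

def batched_indices(sampler: Iterable[int], batch_size: int,
                    drop_last: bool) -> Iterator[List[int]]:
    # Hypothetical stand-in for BatchSampler.__iter__, for illustration.
    if drop_last:
        sampler_iter = iter(sampler)
        while True:
            try:
                # Build each full batch in one comprehension instead of
                # appending one element at a time with a len() check.
                yield [next(sampler_iter) for _ in range(batch_size)]
            except StopIteration:
                break
    else:
        batch = [0] * batch_size  # preallocate rather than grow a list
        idx_in_batch = 0
        for idx in sampler:
            batch[idx_in_batch] = idx
            idx_in_batch += 1
            if idx_in_batch == batch_size:
                yield batch
                idx_in_batch = 0
                batch = [0] * batch_size
        if idx_in_batch > 0:
            yield batch[:idx_in_batch]  # keep the short final batch
```

For example, `list(batched_indices(range(10), 4, drop_last=False))` yields `[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]`, while `drop_last=True` stops at `[[0, 1, 2, 3], [4, 5, 6, 7]]`.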

@facebook-github-bot
Contributor

Hi @didicout!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

Contributor

facebook-github-bot commented May 6, 2022

❌ 1 New Failure

As of commit 4844bb5 (more details on the Dr. CI page):

  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-xenial-py3.7-gcc5.4 / test (backwards_compat, 1, 1, linux.2xlarge) (1/1)

Step: "Test" (full log | diagnosis details | 🔁 rerun)

2022-05-10T14:50:31.5063431Z processing existing schema:  text(__torch__.torch.classes.profiling.SourceRef _0) -> (str _0)
2022-05-10T14:50:31.5065070Z processing existing schema:  count(__torch__.torch.classes.profiling.InstructionStats _0) -> (int _0)
2022-05-10T14:50:31.5066320Z processing existing schema:  duration_ns(__torch__.torch.classes.profiling.InstructionStats _0) -> (int _0)
2022-05-10T14:50:31.5067965Z processing existing schema:  source(__torch__.torch.classes.profiling.SourceStats _0) -> (__torch__.torch.classes.profiling.SourceRef _0)
2022-05-10T14:50:31.5069911Z processing existing schema:  line_map(__torch__.torch.classes.profiling.SourceStats _0) -> (Dict(int, __torch__.torch.classes.profiling.InstructionStats) _0)
2022-05-10T14:50:31.5071101Z processing existing schema:  __init__(__torch__.torch.classes.profiling._ScriptProfile _0) -> (NoneType _0)
2022-05-10T14:50:31.5072661Z processing existing schema:  enable(__torch__.torch.classes.profiling._ScriptProfile _0) -> (NoneType _0)
2022-05-10T14:50:31.5073876Z processing existing schema:  disable(__torch__.torch.classes.profiling._ScriptProfile _0) -> (NoneType _0)
2022-05-10T14:50:31.5075883Z processing existing schema:  _dump_stats(__torch__.torch.classes.profiling._ScriptProfile _0) -> (__torch__.torch.classes.profiling.SourceStats[] _0)
2022-05-10T14:50:31.5078593Z processing existing schema:  __init__(__torch__.torch.classes.dist_rpc.WorkerInfo _0, str _1, int _2) -> (NoneType _0)
2022-05-10T14:50:31.5079066Z The PR is introducing backward incompatible changes to the operator library. Please contact PyTorch team to confirm whether this change is wanted or not. 
2022-05-10T14:50:31.5079351Z 
2022-05-10T14:50:31.5079428Z Broken ops: [
2022-05-10T14:50:31.5079873Z 	aten::set_.source_Tensor_storage_offset(Tensor(a!) self, Tensor source, int storage_offset, int[] size, int[] stride=[]) -> (Tensor(a!))
2022-05-10T14:50:31.5080485Z 	aten::linalg_vander(Tensor x, *, int? N=None) -> (Tensor)
2022-05-10T14:50:31.5080709Z ]
2022-05-10T14:50:31.6152653Z + cleanup
2022-05-10T14:50:31.6153010Z + retcode=1
2022-05-10T14:50:31.6153321Z + set +x
2022-05-10T14:50:31.6185610Z ##[error]Process completed with exit code 1.
2022-05-10T14:50:31.6227907Z ##[group]Run pytorch/pytorch/.github/actions/get-workflow-job-id@master

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@facebook-github-bot
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@soulitzer soulitzer requested review from VitalyFedyunin and ejguan May 9, 2022 04:40
@soulitzer soulitzer added the triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) label May 9, 2022
@ejguan ejguan self-assigned this May 9, 2022
Contributor

@ejguan ejguan left a comment

Thank you for proactively trying to make `BatchSampler` better. Do you mind listing the performance gain in the PR summary along with your testing script? Besides, I also commented about adding a few tests with `batch_size` 4 or 8 in the issue.

Author

didicout commented May 9, 2022

Thank you for your reply. I have posted some new test results in the issue. I will list the performance numbers in the PR summary if that's okay.

Contributor

@ejguan ejguan left a comment

Please ping me when you have the PR summary ready! Really appreciate your work.

@didicout
Author

@ejguan Done. :)

@ejguan ejguan added the module: dataloader (Related to torch.utils.data.DataLoader and Sampler), release notes: dataloader (release notes category), and topic: improvements (topic category) labels May 10, 2022
@facebook-github-bot
Contributor

@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Contributor

@ejguan ejguan left a comment

LGTM

Contributor

ejguan commented May 10, 2022

@pytorchbot merge this please

Contributor

ejguan commented May 10, 2022

Do you mind also adding std to the table in summary? Thank you

@didicout
Author

> Do you mind also adding std to the table in summary? Thank you

Done. :)

facebook-github-bot pushed a commit that referenced this pull request May 13, 2022
Summary:
Fixes #76950

Improve the performance of iteration on `BatchSampler`, especially when `batch_size` is large.

Python 3.6.8:
```
  batch_size  drop_last     speedup
------------  -----------   -------
           4  True          -18.07%
           4  False         15.92%
           8  True          9.43%
           8  False         30.90%
          64  True          54.99%
          64  False         49.64%
         640  True          66.26%
         640  False         48.32%
        6400  True          69.06%
        6400  False         45.17%
```

Python 3.8.12:
```
  batch_size  drop_last    speedup
------------  -----------  --------
           4  True         -10.50%
           4  False        -0.78%
           8  True         24.40%
           8  False        10.20%
          64  True         90.96%
          64  False        26.09%
         640  True         112.88%
         640  False        20.09%
        6400  True         111.80%
        6400  False        18.37%
```

See the issue page for more details on the tests.

Pull Request resolved: #76951
Approved by: https://github.com/ejguan

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/d22d749a0e8436829e564796bf6cd67847199453

Reviewed By: samdow

Differential Revision: D36282565

Pulled By: samdow

fbshipit-source-id: cbdf554f5a9941a892e10b651c0dc709e25c7a87
pytorchmergebot pushed a commit that referenced this pull request Oct 13, 2024
Builds upon #76951.

Benchmarking code is the same as in #76950.

AMD Ryzen Threadripper PRO 3995WX:
```
  batch_size  drop_last      origin     new  speedup
------------  -----------  --------  ------  ---------
           4  True           0.94    0.5706  64.74%
           4  False          0.9745  0.9468  2.93%
           8  True           0.7423  0.3715  99.82%
           8  False          0.7974  0.5666  40.73%
          64  True           0.5394  0.2085  158.76%
          64  False          0.6083  0.2697  125.51%
         640  True           0.5448  0.1985  174.41%
         640  False          0.7085  0.2308  206.91%
        6400  True           0.5554  0.2028  173.88%
        6400  False          0.7711  0.2109  265.60%
       64000  True           0.556   0.2091  165.82%
       64000  False          0.7803  0.2078  275.58%
```

When `drop_last == True`, it uses `zip` to speed things up.
When `drop_last == False`, it uses `itertools` to speed things up.

`itertools` was the fastest way I could find that handles the last batch when it is smaller than `batch_size`. I have a pure-Python method too, but it is slower when `batch_size` is 4 or 8, so I have committed the `itertools` version for now. (A sketch of both paths appears after this commit message.)

Happy to chat further about this change :-) I understand you may not want to introduce the `itertools` module into [sampler.py](https://github.com/pytorch/pytorch/blob/main/torch/utils/data/sampler.py).
Pull Request resolved: #137423
Approved by: https://github.com/Skylion007
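The diff for #137423 is not shown here either, but the two paths described in the commit message are easy to sketch. The following is an editor's reconstruction from that description, not the merged code; the `batched_indices` helper name is hypothetical:

```python
import itertools
from typing import Iterable, Iterator, List

def batched_indices(sampler: Iterable[int], batch_size: int,
                    drop_last: bool) -> Iterator[List[int]]:
    # Hypothetical illustration of the zip/itertools approach.
    if drop_last:
        # All batch_size entries of args alias a single iterator, so each
        # tuple produced by zip advances it batch_size times; zip stops at
        # the first incomplete batch, dropping the remainder for free.
        args = [iter(sampler)] * batch_size
        for batch in zip(*args):
            yield list(batch)
    else:
        # islice pulls up to batch_size elements per call; the final call
        # returns a short batch, which is still yielded.
        sampler_iter = iter(sampler)
        while batch := list(itertools.islice(sampler_iter, batch_size)):
            yield batch
```

The `zip(*[iter(s)] * n)` trick is a well-known Python idiom for clustering an iterable into fixed-length groups; it moves per-element work out of Python-level bookkeeping, which is consistent with the larger speedups reported above for `drop_last == True`.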

Labels

cla signed · Merged · module: dataloader (Related to torch.utils.data.DataLoader and Sampler) · open source · release notes: dataloader (release notes category) · topic: improvements (topic category) · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Faster BatchSampler

6 participants