[BE] Remove dependency on `six` and `future` by XuehaiPan · Pull Request #94709 · pytorch/pytorch

XuehaiPan · 2023-02-12T18:42:27Z

Remove the Python 2 and 3 compatibility libraries `six` and `future`, along with `torch._six`. We only support Python 3.8+ now, so it's time to retire them.

cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @desertfire

pytorch-bot · 2023-02-12T18:42:30Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/94709

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 44ea65d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

.gitmodules

albanD · 2023-03-28T16:55:37Z

We should revisit all changes from string_classes to str to make sure to properly include bytes when relevant.
There are already 2 forward fixes for it: #97737 and #97789 but we should look through this diff and fix all the other ones pre-emptively!

@XuehaiPan would you have some time to look into that by any chance?

…es (#97737)

Slack thread: https://pytorch.slack.com/archives/GEEQ2K4MD/p1679962409906099

I was seeing some massive (~2x) slowdowns on a job after running it on PyTorch 2.0. From some profiling in `py-spy` it looked like the pin_memory thread was doing a lot more work than before. Looking at a trace in `nsys` I saw the thread doing the forward pass having a bunch of `pthread_cond_timedwait` with GIL reacquire calls in its call stack, and it seemed like the thread doing the forward pass was getting blocked (waiting for the GIL) by the pin memory thread (which was holding the GIL).

After some debugging I found out the issue. If a `bytes` was passed into `pin_memory`, previously in 1.13 (before #94709) it would short-circuit and return here

https://github.com/pytorch/pytorch/blob/d922c29a22e4bf0fba49526f7536395eb8cd66f4/torch/utils/data/_utils/pin_memory.py#L54-L55

since `bytes` was in `torch._six.string_classes`:

```
>>> from torch._six import string_classes
>>> string_classes
(<class 'str'>, <class 'bytes'>)
>>>
```

However after #94709, if a `bytes` was passed into `pin_memory` it would fall into here instead

https://github.com/pytorch/pytorch/blob/c263bd43e8e8502d4726643bc6fd046f0130ac0e/torch/utils/data/_utils/pin_memory.py#L68-L73

because the previous check is now doing `isinstance(data, str)` instead of `isinstance(data, (str, bytes))`!

https://github.com/pytorch/pytorch/blob/c263bd43e8e8502d4726643bc6fd046f0130ac0e/torch/utils/data/_utils/pin_memory.py#L56-L57

As a result, `pin_memory` gets called recursively for each element in the `bytes`, leading to a ton of wasted recursion. This also explains the slowdown / GIL contention I was seeing.

This PR simply changes `isinstance(data, str)` to `isinstance(data, (str, bytes))` to match the behavior before #94709.

Pull Request resolved: #97737
Approved by: https://github.com/albanD, https://github.com/NivekT
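The recursion cost described in the commit message above can be reproduced without PyTorch. The sketch below is a simplified stand-in, not the real `pin_memory` code: `count_visits` and its `leaf_types` parameter are hypothetical names, and the function only mirrors the order of checks the message describes (string short-circuit first, then containers).

```python
from collections.abc import Mapping, Sequence


def count_visits(data, leaf_types):
    """Count the objects a pin_memory-style traversal would touch.

    Hypothetical helper: nothing is pinned, only the recursion is modeled.
    """
    if isinstance(data, leaf_types):
        return 1  # short-circuit: string-like data is returned as-is
    if isinstance(data, Mapping):
        return 1 + sum(count_visits(v, leaf_types) for v in data.values())
    if isinstance(data, Sequence):
        # bytes ends up here when it is missing from leaf_types;
        # iterating over bytes yields one int per byte
        return 1 + sum(count_visits(item, leaf_types) for item in data)
    return 1  # any other object is returned unchanged


payload = bytes(1000)  # 1000 zero bytes

visits_before = count_visits(payload, (str, bytes))  # guard includes bytes
visits_after = count_visits(payload, str)            # guard checks str only

print(visits_before, visits_after)
```

With the `(str, bytes)` guard the payload is handled in a single call; with the `str`-only guard every byte triggers its own call, which matches the per-element recursion the author observed.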

XuehaiPan · 2023-03-29T06:28:25Z

@albanD I opened a PR to re-add relevant isinstance checks for bytes.

Revisit torch._six.string_classes removal (#94709) #97863

Revisit `torch._six.string_classes` (which is `(str, bytes)`) removal: `isinstance(obj, string_classes) -> isinstance(obj, str)`.

Both `str` and `bytes` are `Sequence` classes.

```python
In [1]: from typing import Sequence

In [2]: issubclass(bytes, Sequence)
Out[2]: True

In [3]: issubclass(str, Sequence)
Out[3]: True
```

Re-add `bytes` to type guards like:

```python
def is_seq(obj):
    return isinstance(obj, Sequence) and not isinstance(obj, (str, bytes))
```

Ref:

- #94709 (comment)
- #97737
- #97789

Pull Request resolved: #97863
Approved by: https://github.com/Skylion007, https://github.com/albanD
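To illustrate why the guard needs `bytes`, here is a small sketch that plugs both variants of the guard into a toy flattening routine. `flatten` and `is_seq_str_only` are hypothetical names introduced for this example, and `collections.abc.Sequence` is used in place of `typing.Sequence`; nothing here is taken from the PyTorch code base.

```python
from collections.abc import Sequence


def is_seq(obj):
    # Guard from the PR description: str and bytes are treated as leaves.
    return isinstance(obj, Sequence) and not isinstance(obj, (str, bytes))


def is_seq_str_only(obj):
    # Variant with bytes left out of the exclusion (hypothetical name).
    return isinstance(obj, Sequence) and not isinstance(obj, str)


def flatten(obj, guard):
    """Recursively flatten nested sequences, as decided by `guard`."""
    if guard(obj):
        out = []
        for item in obj:
            out.extend(flatten(item, guard))
        return out
    return [obj]


batch = ["label", b"\x00\x01", [1, 2]]

flat_ok = flatten(batch, is_seq)            # bytes object stays intact
flat_bad = flatten(batch, is_seq_str_only)  # bytes object is split into ints

print(flat_ok)
print(flat_bad)
```

In the second call the `bytes` value is silently expanded into its integer elements, which is the kind of behavior change the follow-up PR guards against.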

Torch does not support `torch._six` anymore, so removed `from torch._six import string_classes` and replaced `string_classes` with `str`. Reference: pytorch/pytorch#94709

…97789, #97863) (#98055)

* [DataLoader] Short circuit pin_memory recursion when operating on bytes (#97737)

Slack thread: https://pytorch.slack.com/archives/GEEQ2K4MD/p1679962409906099

I was seeing some massive (~2x) slowdowns on a job after running it on PyTorch 2.0. From some profiling in `py-spy` it looked like the pin_memory thread was doing a lot more work than before. Looking at a trace in `nsys` I saw the thread doing the forward pass having a bunch of `pthread_cond_timedwait` with GIL reacquire calls in its call stack, and it seemed like the thread doing the forward pass was getting blocked (waiting for the GIL) by the pin memory thread (which was holding the GIL).

After some debugging I found out the issue. If a `bytes` was passed into `pin_memory`, previously in 1.13 (before #94709) it would short-circuit and return here

https://github.com/pytorch/pytorch/blob/d922c29a22e4bf0fba49526f7536395eb8cd66f4/torch/utils/data/_utils/pin_memory.py#L54-L55

since `bytes` was in `torch._six.string_classes`:

```
>>> from torch._six import string_classes
>>> string_classes
(<class 'str'>, <class 'bytes'>)
>>>
```

However after #94709, if a `bytes` was passed into `pin_memory` it would fall into here instead

https://github.com/pytorch/pytorch/blob/c263bd43e8e8502d4726643bc6fd046f0130ac0e/torch/utils/data/_utils/pin_memory.py#L68-L73

because the previous check is now doing `isinstance(data, str)` instead of `isinstance(data, (str, bytes))`!

https://github.com/pytorch/pytorch/blob/c263bd43e8e8502d4726643bc6fd046f0130ac0e/torch/utils/data/_utils/pin_memory.py#L56-L57

As a result, `pin_memory` gets called recursively for each element in the `bytes`, leading to a ton of wasted recursion. This also explains the slowdown / GIL contention I was seeing.

This PR simply changes `isinstance(data, str)` to `isinstance(data, (str, bytes))` to match the behavior before #94709.

Pull Request resolved: #97737
Approved by: https://github.com/albanD, https://github.com/NivekT

* [DataLoader] Fix collation logic (#97789)

Similar to #97737, a previous auto-refactor changed how `bytes` are handled during collation, which can potentially lead to performance regression. This PR undoes that.

Pull Request resolved: #97789
Approved by: https://github.com/albanD

* Revisit `torch._six.string_classes` removal (#94709) (#97863)

Revisit `torch._six.string_classes` (which is `(str, bytes)`) removal: `isinstance(obj, string_classes) -> isinstance(obj, str)`.

Both `str` and `bytes` are `Sequence` classes.

```python
In [1]: from typing import Sequence

In [2]: issubclass(bytes, Sequence)
Out[2]: True

In [3]: issubclass(str, Sequence)
Out[3]: True
```

Re-add `bytes` to type guards like:

```python
def is_seq(obj):
    return isinstance(obj, Sequence) and not isinstance(obj, (str, bytes))
```

Ref:

- #94709 (comment)
- #97737
- #97789

Pull Request resolved: #97863
Approved by: https://github.com/Skylion007, https://github.com/albanD

---------

Co-authored-by: Eric Zhang <[email protected]>
Co-authored-by: Kevin Tse <[email protected]>
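The collation change mentioned for #97789 can be pictured with a toy model. This is a sketch under stated assumptions, not PyTorch's collate implementation: `toy_collate` and `string_types` are hypothetical names, and the function assumes a collate routine that checks for string-like samples first and otherwise transposes sequence samples position by position.

```python
from collections.abc import Sequence


def toy_collate(batch, string_types):
    """Toy collate: strings pass through, sequences are transposed."""
    elem = batch[0]
    if isinstance(elem, string_types):
        return batch  # string-like samples are returned untouched
    if isinstance(elem, Sequence):
        # One group per position across all samples in the batch.
        return [toy_collate(list(group), string_types) for group in zip(*batch)]
    return list(batch)  # stand-in for stacking numbers into a tensor


batch = [b"ab", b"cd"]

kept = toy_collate(batch, (str, bytes))  # bytes treated as a string type
split = toy_collate(batch, str)          # bytes treated as a generic sequence

print(kept)
print(split)
```

Under the `str`-only check the two `bytes` samples are regrouped byte by byte, which both changes the output shape and adds per-element work.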

This commit fixes a deprecated import in the codebase that uses `string_classes` from PyTorch. The import statement `from torch._six import string_classes` has been replaced with the simpler alternative `string_classes = str`. `string_classes` was a tuple of string types defined in `torch._six` and used for type checks; `torch._six` has been removed in more recent versions of PyTorch. This commit provides a cleaner solution that works with newer versions of PyTorch. See pytorch/pytorch#94709
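For downstream code that has to run on PyTorch versions both before and after the removal, one possible pattern is an import fallback. The sketch below is an assumption about how such a shim could look, not something prescribed in this thread; `is_string_like` is a hypothetical helper. The fallback uses `(str, bytes)` rather than `str` so that it matches the tuple shown earlier in the conversation.

```python
try:
    # Available only in older PyTorch releases (removed by pytorch/pytorch#94709).
    from torch._six import string_classes
except ImportError:
    # Reached on newer PyTorch, and also when torch is not installed at all.
    string_classes = (str, bytes)


def is_string_like(obj):
    """Hypothetical helper: True for both text and byte strings."""
    return isinstance(obj, string_classes)


print(is_string_like("text"), is_string_like(b"raw"), is_string_like(42))
```

Whether `bytes` belongs in the tuple depends on the call site; the follow-up fixes referenced above suggest it matters wherever the check guards recursion over sequences.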

See pytorch/pytorch#94709 (comment)

XuehaiPan requested review from a team, BowenBao, H-Huang, abock, albanD, awgu, fegin, jeffdaily, kulinseth, kwen2501, mrshenli, mruberry, ngimel, rohan-varma, wanchaol and zhaojuanmao as code owners February 12, 2023 18:42

XuehaiPan requested review from janeyx99, jbschlosser and soulitzer as code owners February 12, 2023 18:42

pytorch-bot bot added ciflow/mps Run MPS tests (subset of trunk) release notes: releng release notes category labels Feb 12, 2023

github-actions bot added ciflow/inductor module: dynamo labels Feb 12, 2023

XuehaiPan mentioned this pull request Feb 12, 2023

Consider upgrade to Python 3.8+ syntax with pyupgrade #94040

Closed

Skylion007 added better-engineering Relatively self-contained tasks for better engineering contributors ciflow/trunk Trigger trunk jobs on your pull request labels Feb 12, 2023

Skylion007 reviewed Feb 12, 2023

.gitmodules (resolved)

pytorchbot added the open source label Feb 12, 2023

XuehaiPan force-pushed the remove-six-future branch from 1e60afb to f3347dc Compare February 12, 2023 19:06

ezhang887 mentioned this pull request Mar 28, 2023

[DataLoader] Short circuit pin_memory recursion when operating on bytes #97737

Closed

XuehaiPan added a commit to XuehaiPan/pytorch that referenced this pull request Mar 29, 2023

Revisit torch._six.string_classes removal (pytorch#94709)

65ad82a

XuehaiPan mentioned this pull request Mar 29, 2023

Revisit torch._six.string_classes removal (#94709) #97863

Closed

koliaok mentioned this pull request Apr 1, 2023

Remove torch version 2.0.0 dependency in EleutherAI/oslo#174

Closed

miamannionx mentioned this pull request Apr 4, 2023

torch._six is depreciated, remove import CompVis/taming-transformers#202

Open

zahrakhanjani128 mentioned this pull request Apr 8, 2023

torch._six has been removed asvspoof-challenge/2021#21

Closed

ChiehYunChen mentioned this pull request Apr 10, 2023

packaging.version.InvalidVersion: Invalid version: '0.10.1,<0.11' CompVis/latent-diffusion#207

Open

This was referenced Apr 29, 2023

Fix for deprecated torch string_classes import salesforce/ULIP#25

Closed

Fix for deprecated torch string_classes import salesforce/ULIP#26

Open

nitinranjansharma mentioned this pull request Jun 14, 2023

Error in chapter 4 Apress/computer-vision-projects-with-pytorch#1

Open

davinnovation mentioned this pull request Jul 5, 2023

remove deprecated torch._six string_classes cvg/DeepLSD#21

Merged

fenglui mentioned this pull request Sep 17, 2023

fix Update grads.py Torch does not support torch._six anymore FlagAI-Open/FlagAI#535

Closed

4 tasks

mohanrajroboticist mentioned this pull request Nov 7, 2023

No module named 'torch._six' VincentSch4rf/torchtime#1

Closed

MlLearnerAkash mentioned this pull request Jan 16, 2024

torch2.0.1 No module named 'torch._six NVIDIA/apex#1724

Open

alex391 added a commit to alex391/bindsnet that referenced this pull request Feb 29, 2024

Fix import for depreciated torch._six

3e8d2bb

See pytorch/pytorch#94709 (comment)

alex391 mentioned this pull request Feb 29, 2024

Fix import for depreciated torch._six BINDS-LAB-UMASS/bindsnet#14

Open

alex391 added a commit to alex391/bindsnet that referenced this pull request Feb 29, 2024

Fix import for depreciated torch._six

f38bb1d

See pytorch/pytorch#94709 (comment)

alex391 added a commit to alex391/bindsnet that referenced this pull request Feb 29, 2024

Fix import for depreciated torch._six

63fe258

See pytorch/pytorch#94709 (comment)

nirgoren mentioned this pull request Mar 24, 2024

[BUG] ModuleNotFoundError: No module named 'torch._six' yoshitomo-matsubara/torchdistill#448

Closed

ps4vs mentioned this pull request Apr 5, 2024

[BUG] No module named 'torch._six' facebookresearch/AudioMAE#27

Open

XuehaiPan wants to merge 8 commits into pytorch:master from XuehaiPan:remove-six-future

8 participants
