[WIP] Revert "Revert "Redefine scheduler to set learning rate using recursive formula" #14010 (#21463)" and enable closed form with non-sequential epoch parameter #21800
Conversation
Revert "Revert "Redefine scheduler to set learning rate using recursive formula" #14010 (#21463)" and enable closed form with non-sequential epoch parameter

This reverts commit 3889855.

gh-metadata: pytorch pytorch 21800 gh/vincentqb/6/head
Since this isn't a "straight" revert, we should have a more detailed description in the PR body explaining the approach taken in this version of the patch. We should also mention which bug we fix.
torch/optim/lr_scheduler.py (outdated)

```
        return [group['lr'] * self.gamma
                for group in self.optimizer.param_groups]

    def get_lr_closed_form(self):
```
I suppose get_lr isn't a publicly documented function, but it would be nice to explain in English, somewhere, what exactly get_lr_closed_form does. You could add a note somewhere in this file: write your comments below a header `Note [LR closed form]` and then reference that note elsewhere.
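To make that suggestion concrete, here is a minimal, self-contained sketch (not the code from this PR; the function names and note text are illustrative) contrasting the recursive/chainable update with a closed-form computation for a StepLR-style schedule:

```
# Note [LR closed form]
# ~~~~~~~~~~~~~~~~~~~~~
# Chainable schedulers update the learning rate recursively from its previous
# value, e.g. a StepLR-style schedule multiplies the current rate by gamma
# every step_size epochs.  The "closed form" computes the rate for an
# arbitrary epoch directly from the initial rate, which is what step(epoch)
# needs when epoch numbers arrive out of order.

def step_lr_recursive(current_lr, epoch, step_size, gamma):
    # Chainable form: only needs the rate from the previous step.
    return current_lr * gamma if epoch > 0 and epoch % step_size == 0 else current_lr

def step_lr_closed_form(base_lr, epoch, step_size, gamma):
    # Closed form: needs only the initial rate and the (possibly
    # non-sequential) epoch number.
    return base_lr * gamma ** (epoch // step_size)

# The two agree when epochs are visited sequentially:
lr = 0.1
for epoch in range(7):
    lr = step_lr_recursive(lr, epoch, step_size=2, gamma=0.5)
    assert abs(lr - step_lr_closed_form(0.1, epoch, step_size=2, gamma=0.5)) < 1e-12
```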
ezyang
left a comment
This looks good! Just a little more documentation (esp. in the PR body) needed, and I'll stamp it.
@fmassa could you take a look too?
Enable chainable schedulers as requested in #13022 by implementing the changes mentioned below from [comment](#21800 (comment)).

* Changing the behavior of schedulers to the chainable formula when available
* Using the closed form whenever epoch is different from None, with a deprecation warning, until the next release
* Making `get_computed_values` the supported way of obtaining the last learning rate computed by the scheduler (see [comment](#21800 (comment)) for the new syntax)
* Emitting a deprecation warning, referring to `get_computed_values`, when the undocumented `get_lr` function is invoked (see [comment](#21800 (comment))), and deprecating it in the next release
* `CosineAnnealingWarmRestarts` still takes an epoch parameter, as it is the only scheduler whose mechanics rely on fractional epochs
* `MultiplicativeLR` consumes a function providing the multiplicative factor at each epoch. It mimics `LambdaLR` in its syntax.

# #20527

### Before

The user calls the scheduler with a constant epoch, either across loops or within the same loop.

```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3, 3, 3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)

# Scheduler with a sometimes-constant epoch number
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    lr_scheduler.step(epoch)
    print(optimizer.param_groups[0]['lr'])
```

### After

If the user wants to step:

```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3, 3, 3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)

last_epoch = -1
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    # Check manually whether the epoch number has changed
    if epoch - last_epoch > 0:
        lr_scheduler.step()
        last_epoch = epoch
    print(epoch, lr_scheduler.get_computed_values())
```

# #22107

### Before

```
import torch
from torchvision.models import resnet18

net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)

for i in range(10):
    # Scheduler computes and returns the new learning rate, leading to unexpected behavior
    print(i, scheduler.get_lr())
    scheduler.step()
```

### After

```
import torch
from torchvision.models import resnet18

net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)

for i in range(10):
    # Returns the last learning rate computed by the scheduler
    print(i, lr_scheduler.get_computed_values())
    lr_scheduler.step()
```

# ghstack

This contains the changes from #24352. Opening again since they were reverted. This reverts commit 1c477b7.

Differential Revision: [D17460427](https://our.internmc.facebook.com/intern/diff/D17460427)
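As a further illustrative sketch (not one of the examples above), the chainable formulation lets two schedulers drive the same optimizer, with their multiplicative updates composing; the exact values printed are an assumption about the post-change semantics:

```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3, 3, 3)
optimizer = optim.Adam(conv.parameters(), lr=1.0)

# Each scheduler applies its own multiplicative update to the current rate,
# so stepping both composes the two schedules.
step_scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.1)
multi_scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3], gamma=0.5)

for epoch in range(6):
    step_scheduler.step()
    multi_scheduler.step()
    print(epoch, optimizer.param_groups[0]['lr'])
```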
* C++ Average Pool Module (#25800)
Summary:
This PR adds Average Pool module to C++ front-end.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25800
Differential Revision: D17318094
Pulled By: yf225
fbshipit-source-id: c914c0e802bbe5f1d1f0a21a669c28bc956899db
* Better error messages in C2 ONNX backend (#25809)
Summary:
Just a tiny fix to make debugging easier (output errors to stderr and include in the exception message)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25809
Reviewed By: zrphercule
Differential Revision: D17329957
Pulled By: houseroad
fbshipit-source-id: 0d73dd9f62c735fbc5096e6a7c0e5f58e4cd90ae
* Add new API for Fully Connected and Convolution Operators in QNNPACK (#25862)
Summary:
This change adds new prepack and run functions for the FC and Convolution operators in QNNPACK.
The new functions are `PackBMatrix`, `qnnpackLinear`, `PrePackConvWeights`, and `qnnpackConv`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25862
Test Plan:
QNNPACK unit tests
fully-connected-test
convolution-test
Differential Revision: D17299260
Pulled By: supriyar
fbshipit-source-id: fdc4e2d5f1232675acd153f3efb9d17ed8628a54
* Enable more mGPU tests (#26055)
Summary:
Enable mGPU tests that pass on ROCm as of 2.7.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26055
Differential Revision: D17331484
Pulled By: bddppq
fbshipit-source-id: 51f956a84a6c14a1a41473d322950994fa29c25c
* remove verbose in pytorch_ci hypothesis profile (#26075)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26075
As titled: remove the verbose argument to reduce noise in the logs.
Test Plan:
ci
Imported from OSS
Differential Revision: D17335935
fbshipit-source-id: 2e4289e838bf4489dcad8d5533353eebcff0d481
* TorchScript Serialization for dynamic LSTM module
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25877
Test Plan: Imported from OSS
Reviewed By: jianyuh
Differential Revision: D17275746
Pulled By: jamesr66a
fbshipit-source-id: db2f38ddd99f02ccb4fb754fa1c1e6cad4425fa8
* Upgrade the naming for fbgemm quantized op (#26064)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26064
Just changing the names after https://github.com/pytorch/pytorch/pull/25678.
ghstack-source-id: 89944542
Test Plan: CI
Differential Revision: D17332068
fbshipit-source-id: 5e9febed7a2fcd10d44273e55643b277d33a3ad7
* Use BytesIO instead of tempfile (#25976)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25976
As recommended in https://github.com/pytorch/pytorch/pull/25877/files#r322956051:
> We should move more of these toward using BytesIO. Using files in tests is generally considered bad practice because it introduces syscalls and dependencies on the execution environment, and thus can cause test flakiness/instability.
ghstack-source-id: 89929947
Test Plan: CI
Differential Revision: D17310441
fbshipit-source-id: ba97cce4224225df45ff44062f1bc8ebefb25922
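A minimal sketch of the pattern this change moves tests toward, using an in-memory buffer instead of a temporary file (illustrative only, not the actual test code):

```
import io
import torch

tensor = torch.randn(3, 3)

# Serialize to an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
torch.save(tensor, buffer)

# Rewind and load it back; no filesystem access or temp-file cleanup involved.
buffer.seek(0)
restored = torch.load(buffer)
assert torch.equal(tensor, restored)
```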
* Revert "TorchScript Serialization for dynamic LSTM module" (#26079)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26079
This reverts commit e3039612d851d0fbd337546c8debc27ec7cfc4e4.
Test Plan: Imported from OSS
Differential Revision: D17337585
Pulled By: jamesr66a
fbshipit-source-id: 4b93a4c5ca2fe491d609da889a42d22be8e52889
* Add Runtime flag for quantized backend. (#25680)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25680
Add a runtime flag to choose between FBGEMM and QNNPACK when compiled with both.
The flag can be set by using torch.backends.quantized.engine = torch.fbgemm/torch.qnnpack or ctx::setPreferredQuantizedEngine(at::QEngine)
ghstack-source-id: 89935643
Test Plan: Verified torch.backends.quantized.engine works
Differential Revision: D17198233
fbshipit-source-id: e5449d06f4136385e0e6d18bd4237f8654a61672
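A hedged sketch of using the runtime flag described above from Python. The commit message gives the setter as `torch.backends.quantized.engine`; the exact accepted values (module attributes vs. strings such as `'fbgemm'`/`'qnnpack'`) and the `supported_engines` attribute are assumptions that may differ by PyTorch version:

```
import torch

# Inspect which quantized backends this build supports, then pick one.
print(torch.backends.quantized.supported_engines)

# Select QNNPACK (e.g. for mobile/ARM); 'fbgemm' targets x86 servers.
torch.backends.quantized.engine = 'qnnpack'
print(torch.backends.quantized.engine)
```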
* Dynamic registration of RPC backends (#25734)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25734
[pytorch] Dynamic registration of RPC backends
Allow non-ProcessGroup RPC backends to be plugged in as backends.
ghstack-source-id: 89938296
Differential Revision: D17183789
fbshipit-source-id: 885fed12d80b82b60f9a125f78302a161e708089
* Make regular softmax warp size aware (#25956)
Summary:
Enable one unit test that passes now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25956
Differential Revision: D17298150
Pulled By: bddppq
fbshipit-source-id: 8763e71ad7ef80be915fe93a3471b29f27f3f0a4
* Move NamedTensorMetaInterface definitions to TensorImpl.h (#26030)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26030
Test Plan:
- [namedtensor ci]
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26030
Differential Revision: D17322383
Pulled By: zou3519
fbshipit-source-id: d5b914d646b48a6f4e0104aceb435e694b72bd96
* Experimental warning for named tensors (#26050)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26050
Throws a warning once when someone attempts to attach names to a tensor.
This is guaranteed to happen at the callsite `set_named_tensor_meta`.
Test Plan: - run tests [namedtensor ci]
Differential Revision: D17331634
Pulled By: zou3519
fbshipit-source-id: 44f5e5c95acd9c7ba543c1210a3b1314aab348f0
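For context, the warning fires the first time names are attached to a tensor; a small illustrative example, assuming the named-tensor API of this period (the `names=` factory argument):

```
import torch

# Attaching names to a tensor is experimental; the first such call emits
# a one-time warning pointing users at the named-tensor documentation.
t = torch.zeros(2, 3, names=('N', 'C'))
print(t.names)
```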
* print source code when a function is executed (#25868)
Summary:
While this isn't ideal, as it might print out the same source every time a function is run, it's still easier to tweak Python code to reduce loop counts than to insert `std::cout` and recompile C++ code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25868
Differential Revision: D17318386
Pulled By: Krovatkin
fbshipit-source-id: 928ba6543204042924ab41a724635594709630de
* Disable test_cuda.test_stream_event_nogil on ROCm (#26087)
Summary:
This test was recently enabled in https://github.com/pytorch/pytorch/pull/26055, but it's flaky on master:
https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/37575
https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/37577
```
05:39:35 test_stream_event_nogil (__main__.TestCuda) ... Exception in thread Thread-3:
05:39:40 Traceback (most recent call last):
05:39:40 File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
05:39:40 self.run()
05:39:40 File "/usr/lib/python2.7/threading.py", line 754, in run
05:39:40 self.__target(*self.__args, **self.__kwargs)
05:39:40 File "test_cuda.py", line 1894, in _test_stream_event_nogil
05:39:40 c2p.put(sync_func(self, TestCuda.FIFTY_MIL_CYCLES))
05:39:40 File "test_cuda.py", line 1882, in _event_wait
05:39:40 self.assertTrue(s1.query())
05:39:40 File "/usr/lib/python2.7/unittest/case.py", line 422, in assertTrue
05:39:40 raise self.failureException(msg)
05:39:40 AssertionError: False is not true
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26087
Differential Revision: D17340891
Pulled By: bddppq
fbshipit-source-id: b2b70beb1b068db53197a5f9f6a80cb046e66ebd
* TorchScript Serialization for dynamic LSTM
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26084
Test Plan: Imported from OSS
Differential Revision: D17339315
Pulled By: jamesr66a
fbshipit-source-id: 03a2674edcf779becfe3b8ec96f1bae23c74b11c
* Automatic update of fbcode/onnx to 7988d8360b11e6003560076e9b1d4aa426db3244 (#25959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25959
Previous import was 28ca699b69b5a31892619defca2391044a9a6052
Included changes:
- **[7988d836](https://github.com/onnx/onnx/commit/7988d836)**: Supporting negative axes for all existing onnx ops (#2281) <Negin Raoof>
- **[5ca0a09e](https://github.com/onnx/onnx/commit/5ca0a09e)**: Update managingexperimentalops.md (#1981) <Joseph Spisak>
- **[bc0495c1](https://github.com/onnx/onnx/commit/bc0495c1)**: Fix link to community docs in readme (#2261) <Prasanth Pulavarthi>
- **[2fdb3ef6](https://github.com/onnx/onnx/commit/2fdb3ef6)**: move map and sequence types to onnx domain, (#2244) <Ke Zhang>
- **[568b65aa](https://github.com/onnx/onnx/commit/568b65aa)**: Improve compatiblity with proto3 and enable reading attributes (#2288) <Dmitri Smirnov>
- **[1f350f2c](https://github.com/onnx/onnx/commit/1f350f2c)**: Remove type info for loop variadic input in Loop op used to compose the Range op (#2287) <Hariharan Seshadri>
- **[eb139446](https://github.com/onnx/onnx/commit/eb139446)**: Add Foundation WG to working-groups.md (#2276) <Ryan Loney>
- **[4eabc4b3](https://github.com/onnx/onnx/commit/4eabc4b3)**: Fix testdata model for CumSum. Add exclusive attribute. (#2271) <jignparm>
- **[1a62afdb](https://github.com/onnx/onnx/commit/1a62afdb)**: Support GatherND operator in ONNX (#2106) <Hariharan Seshadri>
- **[0e330e9d](https://github.com/onnx/onnx/commit/0e330e9d)**: Support ScatterND operator in ONNX (#2220) <Bowen Bao>
- **[733f7a6a](https://github.com/onnx/onnx/commit/733f7a6a)**: Add Det to ONNX (#2233) <Bowen Bao>
- **[52187738](https://github.com/onnx/onnx/commit/52187738)**: Update the description of nearest_mode of resize op (#2257) <daquexian>
- **[64b4b686](https://github.com/onnx/onnx/commit/64b4b686)**: Adding sparse tensor to ONNX (#2019) <G. Ramalingam>
- **[c8a8b7cc](https://github.com/onnx/onnx/commit/c8a8b7cc)**: Support Range operator in ONNX (#2242) <Hariharan Seshadri>
- **[44b0d6d5](https://github.com/onnx/onnx/commit/44b0d6d5)**: Update resize op (#2057) <daquexian>
- **[7d907964](https://github.com/onnx/onnx/commit/7d907964)**: Add function to fuse dynamic quantization graph into 1 node (#2187) <Ashwini Khade>
- **[36f8e6d9](https://github.com/onnx/onnx/commit/36f8e6d9)**: Update logo_request.md (#2231) <Prasanth Pulavarthi>
- **[4eb737c8](https://github.com/onnx/onnx/commit/4eb737c8)**: Update Clip in opset 11 to support min/max as inputs instead of attributes (#2096) <Bowen Bao>
- **[a25e1388](https://github.com/onnx/onnx/commit/a25e1388)**: Fix segfault in tile shape inference (#2221) <daquexian>
- **[2dc273c7](https://github.com/onnx/onnx/commit/2dc273c7)**: update onehot shape inference to reflect the spec for depth input (#2224) <Ashwini Khade>
- **[665211c1](https://github.com/onnx/onnx/commit/665211c1)**: Add GatherElements Op and Rename ScatterElements (#2143) <Lara Haidar>
- **[3ba2e31a](https://github.com/onnx/onnx/commit/3ba2e31a)**: Unique (#2141) <liqunfu>
- **[5a5588ad](https://github.com/onnx/onnx/commit/5a5588ad)**: Clarify dimension variable scoping (#2211) <G. Ramalingam>
- **[fabe39d5](https://github.com/onnx/onnx/commit/fabe39d5)**: Liqun/topk sort (#2126) <liqunfu>
- **[453aa644](https://github.com/onnx/onnx/commit/453aa644)**: Update document for NMS (#2193) <Hector Li>
- **[34e28ec2](https://github.com/onnx/onnx/commit/34e28ec2)**: Handle negative 'axis' value in Split type and shape inferencing (#2177) <Scott McKay>
- **[28ec4583](https://github.com/onnx/onnx/commit/28ec4583)**: depth to space shuffle order (#2163) <Negin Raoof>
- **[98f72629](https://github.com/onnx/onnx/commit/98f72629)**: minor updates to fix links in readme (#2189) <Prasanth Pulavarthi>
- **[321d1467](https://github.com/onnx/onnx/commit/321d1467)**: Add check to disallow squeezing input axes which are not 1 (#2204) <Ashwini Khade>
- **[573f0dc9](https://github.com/onnx/onnx/commit/573f0dc9)**: fix a bug in fun shape inference (#2188) <Tang, Cheng>
- **[36dc7110](https://github.com/onnx/onnx/commit/36dc7110)**: Clarify ambiguity in gather spec regarding indices expectation (#2202) <Ashwini Khade>
- **[a2449673](https://github.com/onnx/onnx/commit/a2449673)**: Fix some minor issues in IR.md and Versioning.md (#2108) <edgchen1>
- **[349aff69](https://github.com/onnx/onnx/commit/349aff69)**: Skip install typing package for python >=3.5 (#2199) <bddppq>
Test Plan: ci
Reviewed By: bddppq, benoitsteiner
Differential Revision: D17296390
fbshipit-source-id: 9f9f5ce85d9694128008d756c2ea393bd4e0cb71
* Skip test_triangular_solve_batched (#26108)
Summary:
cc: gchanan zou3519
I will look into why this is failing spuriously.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26108
Differential Revision: D17348399
Pulled By: zou3519
fbshipit-source-id: aed4ccfc3f106692d4e32acc029740309570b0c3
* Exposing Fused8BitRowwiseQuantizedToFloat in PyTorch (#26080)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26080
Will be used in c2 ctr_mbl_feed model to PyTorch conversion
Test Plan: Unit test
Reviewed By: yinghai
Differential Revision: D17337604
fbshipit-source-id: a90d9f5dc38301608d1562c6f2418e7f4616e753
* make sure all out stringstreams start out empty in jit_log.hpp
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25863
Differential Revision: D17347386
Pulled By: Krovatkin
fbshipit-source-id: a42cf56680a27bc3e50fd945ab372a409225b875
* tracing with an opt-in by file name (#25895)
Summary:
This basically works as a simple filter, as you suggested ZolotukhinM.
`export PYTORCH_JIT_LOG_LEVEL=guard_elimination` will print all `GRAPH_DUMP` and `GRAPH_UPDATE` statements.
`export PYTORCH_JIT_LOG_LEVEL=>guard_elimination:>alias_analysis` will print all `GRAPH_DUMP`, `GRAPH_UPDATE` **and** `GRAPH_DEBUG` statements in `guard_elimination.cpp` **and** in `alias_analysis.cpp`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25895
Differential Revision: D17309090
Pulled By: Krovatkin
fbshipit-source-id: 8fa9e67cc9af566b084d66cc15223633fda08444
* Stop re-ordering TH(C)Blas arguments. (#25606)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25606
This just complicates the codegen for no benefit.
Test Plan: Imported from OSS
Differential Revision: D17172498
Pulled By: gchanan
fbshipit-source-id: d2f50e45400ac0336792422518e03dbae3a1bedc
* Kill TH(C)Blas kwarg_only declarations. (#25607)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25607
Since we don't generate these as end-user bindings, and we no longer reorder based on this property, we can just get rid of the property.
Test Plan: Imported from OSS
Differential Revision: D17172500
Pulled By: gchanan
fbshipit-source-id: f84fd8bb2b13598501897f56871b21339585d844
* simplify build_android_gradle.sh (#25897)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25897
It doesn't hurt to set all variables unconditionally.
We can also create a link to the lib directory instead of to specific files - this
way it's easier to switch between dynamic and static library names.
Test Plan:
- check android gradle CI;
- use stack diff to check all 4 architectures on PR;
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25897
Differential Revision: D17307240
Pulled By: ljk53
fbshipit-source-id: c975085ddda852ef7da1c29935c2f6a28d797e5a
* change gradle build to use static libtorch + gc-sections (#25984)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25984
Link static libtorch libraries into pytorch.so (API library for android)
with "-Wl,--gc-sections" flag to remove unused symbols in libtorch.
Test Plan:
- full gradle CI with stacked PR;
- will check final artifacts.tgz size change;
Differential Revision: D17312859
Pulled By: ljk53
fbshipit-source-id: 99584d15922867a7b3c3d661ba238a6f99f43db5
* remove "build_deps" arg from setup.py command in (#26113)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26113
After https://github.com/pytorch/pytorch/pull/16914, passing an
argument such as "build_deps" (i.e. `python setup.py build_deps develop`) no longer
works, since it gets picked up as an invalid argument.
ghstack-source-id: 90003508
Test Plan:
Before, this script would execute "python setup.py build_deps
develop", which errored. Now it executes "python setup.py develop" without an
error. Verified by successfully running the script on devgpu. In setup.py,
there is already a `RUN_BUILD_DEPS = True` flag.
Differential Revision: D17350359
fbshipit-source-id: 91278c3e9d9f7c7ed8dea62380f18ba5887ab081
* Stop reordering TH random function arguments.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25608
Test Plan: Imported from OSS
Differential Revision: D17172494
Pulled By: gchanan
fbshipit-source-id: 5a46889cc040297231e2473ae5b2879b39f8d60a
* fix base_lr overridden in cyclic lr (#26105)
Summary:
The `base_lr` parameter was being overridden by the parent class `__init__`; see https://github.com/pytorch/pytorch/issues/21965.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26105
Reviewed By: yf225
Differential Revision: D17346724
Pulled By: vincentqb
fbshipit-source-id: 4b146bd64f4f385c0a9c4f4df8eb8991312fb15c
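An illustrative, stripped-down sketch of the kind of bug fixed here (hypothetical class names, not the actual `CyclicLR` code): an attribute computed in a subclass `__init__` before calling the parent `__init__` gets silently clobbered when the parent assigns the same name.

```
class FakeOptimizer:
    def __init__(self, lr):
        self.param_groups = [{'lr': lr}]

class BaseScheduler:
    def __init__(self, optimizer):
        self.optimizer = optimizer
        # The parent derives base_lrs from the optimizer's param groups ...
        self.base_lrs = [group['lr'] for group in optimizer.param_groups]

class BuggyCyclic(BaseScheduler):
    def __init__(self, optimizer, base_lr):
        # ... so this value, computed from the user-supplied base_lr,
        # is silently overwritten by super().__init__ below.
        self.base_lrs = [base_lr for _ in optimizer.param_groups]
        super().__init__(optimizer)

class FixedCyclic(BaseScheduler):
    def __init__(self, optimizer, base_lr):
        # Fix: write the user-supplied base_lr into the param groups first,
        # so the parent constructor picks it up instead of clobbering it.
        for group in optimizer.param_groups:
            group['lr'] = base_lr
        super().__init__(optimizer)

print(BuggyCyclic(FakeOptimizer(0.1), base_lr=0.01).base_lrs)  # [0.1]  -- base_lr lost
print(FixedCyclic(FakeOptimizer(0.1), base_lr=0.01).base_lrs)  # [0.01]
```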
* Skip inserting duplicate observers (#25504)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25504
Skip inserting duplicate observers for values already observed
in the forward method of a child module or in other methods of
the current module.
Test Plan:
python test/test_jit.py -- 'TestJit.insert_observers'
python test/test_jit.py -- 'TestJit.insert_observers_child_qconfig'
python test/test_jit.py -- 'TestJit.insert_observers_skip_values'
Imported from OSS
Differential Revision: D17208888
fbshipit-source-id: e04f1c22ab1c4f410933a17a3ef31acf5f217323
* Implementation of ConstantThenLinearWarmupLRPolicy and CompositeCyclicalLRPolicy (#25970)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25970
ConstantThenLinearWarmupLRPolicy:
* first use a constant warm up
* then ramp up to the fixed learning rate linearly
CompositeCyclicalLRPolicy:
* first use a constant warm up
* then ramp up to the fixed learning rate linearly
* then use cyclical learning rates for the rest of time
Pull Request resolved: https://our.intern.facebook.com/intern/opensource/shipit/preview/D17302632/
Test Plan:
* buck test
* https://our.intern.facebook.com/intern/testinfra/testconsole/testrun/5910974518377039/
* https://our.intern.facebook.com/intern/testinfra/testrun/1407375027118303
* checked the consistency of learning rates w.r.t. iterations with offline simulations n143987
Reviewed By: swatirallapalli
Differential Revision: D17302632
fbshipit-source-id: 1098d4dd9109a48932b76e36d78239e49f8077a1
* Fix build warning in vec256_qint.h
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26121
Test Plan: Imported from OSS
Differential Revision: D17351960
Pulled By: jamesr66a
fbshipit-source-id: 12389729fe5fb8d863cf47288920ea375a3e74ab
* Kill kwarg_only declarations in Declarations.cwrap. (#25609)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25609
They don't do anything anymore.
Test Plan: Imported from OSS
Differential Revision: D17172497
Pulled By: gchanan
fbshipit-source-id: 5cf7fdcf7d2da0054ac1bd7d8d2b70a2264b8c93
* Support quantizing any methods called (#25505)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25505
Support for quantizing all the methods called by the forward method, including
child module methods and other methods in the current module.
This relies on module-level constant prop; we still need to figure out a way to do constant prop
for these methods as well. We can either do constant prop at the module level or do constant
prop in the quantization function, but this will need some discussion.
Test Plan:
python test/test_jit.py 'TestJit.insert_quant_dequant'
python test/test_quantizer.py
Imported from OSS
Differential Revision: D17208887
fbshipit-source-id: 21749457b21b00a6edada290c26324e2fb210b10
* C++ unregister_module function for Module (#26088)
Summary:
This PR adds `unregister_module` to `nn::Module` and an `erase` function to `OrderedDict`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26088
Differential Revision: D17360058
Pulled By: yf225
fbshipit-source-id: f1f375b4751317da85b8da1458e092fe2405ceec
* Port fuse_linear from pytorch/tvm (#25623)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25623
Port over the fuse_linear pass from the pytorch/tvm project; we'll need this
in the backend-specific quantization pass to match aten::linear and swap
it with quantized linear.
Test Plan:
python test/test_jit.py 'TestJit.test_fuse_linear'
Imported from OSS
Differential Revision: D17208890
fbshipit-source-id: f4ff3889ae4525797d3b986f46ae37e50ea49116
* Add device check before accessing data_ptr in PackLayer (#26056)
Summary:
fixes https://github.com/pytorch/xla/issues/927
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26056
Differential Revision: D17331859
Pulled By: ailzhang
fbshipit-source-id: bdc334f03c8dcbb4ef4f5e059a63ef188a0b8b61
* Create TensorBoard test classes in all cases (#26005)
Summary:
To give a better signal to the user, we will now always create the TensorBoard test classes and just disable the tests if TensorBoard is not installed.
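For illustration, a minimal sketch of that pattern (class and test names are illustrative, not the exact test code):
```
import unittest

try:
    import tensorboard  # noqa: F401
    HAS_TENSORBOARD = True
except ImportError:
    HAS_TENSORBOARD = False

class TestTensorBoardExample(unittest.TestCase):
    # The class always exists; individual tests are skipped when TensorBoard is missing.
    @unittest.skipIf(not HAS_TENSORBOARD, "TensorBoard not installed")
    def test_summary_writer(self):
        from torch.utils.tensorboard import SummaryWriter
        writer = SummaryWriter()
        writer.add_scalar("loss", 0.5, 0)
        writer.close()
```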
cc lanpa sanekmelnikov natalialunova pietern
[test macos]
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26005
Reviewed By: sanekmelnikov
Differential Revision: D17352430
Pulled By: orionr
fbshipit-source-id: 87a592064f4768ffded76a3d666a8e508a1ef164
* Automatic update of fbcode/onnx to 95252c2adec185e305e34486c6756ece9aa8f57f (#26137)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26137
Previous import was 7988d8360b11e6003560076e9b1d4aa426db3244
Included changes:
- **[95252c2a](https://github.com/onnx/onnx/commit/95252c2a)**: Fix shapeinference function (#2296) <jignparm>
- **[414285bb](https://github.com/onnx/onnx/commit/414285bb)**: fix the buffer overflow problem in shape inference logic of Squeeze op <Lu Fang>
- **[797cdd0f](https://github.com/onnx/onnx/commit/797cdd0f)**: Support for negative indices in 'Gather', 'GatherElements', 'ScatterElements', 'OneHot' (#2260) <Negin Raoof>
- **[7636978d](https://github.com/onnx/onnx/commit/7636978d)**: Fix collect_snippets warnings (#2277) <Lutz Roeder>
- **[fa70c33b](https://github.com/onnx/onnx/commit/fa70c33b)**: Update printable_graph in helper.py to output details of initializers that do not have matching graph inputs. (#2135) <Scott McKay>
- **[428d09b0](https://github.com/onnx/onnx/commit/428d09b0)**: test int64 input type for 'where' op (#2253) <Negin Raoof>
Test Plan: ci
Reviewed By: bddppq
Differential Revision: D17353795
fbshipit-source-id: 6d4f39754863a30f427f4512c7b228e45d3ce84f
* Add fusion for quantized linear (#25624)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25624
First fuse the split ops into aten::linear and then fuse
`dequant - aten::linear - quant` into the quantized linear op.
Test Plan:
python test/test_jit.py 'TestJit.quant_fusion'
Imported from OSS
Differential Revision: D17208891
fbshipit-source-id: 864b19fabab2e8e6f8f8ad35eb3dbbf2d5fdb8c4
* Implement tensor.refine_names (#25842)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25842
`tensor.refine_names(*names)` takes `tensor` and attempts to name its
dimensions `names` out-of-place. If a dimension `i` already had a name,
then it cannot be changed (so tensor.names[i] must equal names[i]);
if the original dimension did not have a name, then the new name
(names[i]) can be anything.
`tensor.refine_names(*names)` also accepts a glob '*' that greedily selects
names from `tensor`. Here are some examples:
- `Tensor[None].refine_names('N') -> Tensor[N]`
- `Tensor[N].refine_names('N') -> Tensor[N]`
- `Tensor[N].refine_names('D') -> Error!`
- `Tensor[N].refine_names(None) -> Error!`
- `Tensor[None, None].refine_names('*', D) -> Tensor[None, D]`
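A runnable sketch of the first behaviors above, assuming a build with named tensor support (names and shapes are illustrative):
```
import torch

# Unnamed dims can be given names; existing names must match.
x = torch.randn(2, 3)                 # names: (None, None)
y = x.refine_names('N', 'C')          # names: ('N', 'C')
z = y.refine_names('N', 'C')          # no-op: names already match

# Refining an already-named dim to a different name is an error.
try:
    y.refine_names('N', 'D')
except RuntimeError as e:
    print(e)
```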
Test Plan: - new tests [namedtensor ci]
Differential Revision: D17255548
Pulled By: zou3519
fbshipit-source-id: fdbdb3a12f24fbe37ce1e53ed09dc8a42589d928
* Implement tensor.align_as(other), change tensor.align_to(names) (#25843)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25843
`tensor.align_to(*names)` permutes the dimensions of `tensor` and adds
additional 1-sized dimensions such that the output tensor has dimensions
in the same order as `names`. All dimensions of `tensor` must be
present in `names`; in addition, this function requires that all dims of
`tensor` be named.
`tensor.align_as(other)` is equivalent to
`tensor.align_to(*other.names)`.
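A minimal sketch of the semantics described above (shapes and names are illustrative, assuming a build with named tensor support):
```
import torch

x = torch.randn(2, 3, names=('N', 'C'))

# Permute to the requested order and insert size-1 dims for missing names.
y = x.align_to('C', 'N', 'H')   # names ('C', 'N', 'H'), shape (3, 2, 1)

# align_as is shorthand for align_to(*other.names).
other = torch.randn(3, 2, 4, names=('C', 'N', 'H'))
z = x.align_as(other)           # dims ordered as in other.names
```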
I'm planning on changing `torch.align_tensors(*tensors)` to align closer
to these semantics because there didn't seem to be a clear use case for the old
semantics that preserve unnamed dimensions. That will come in a future
change.
Test Plan: - new tests [namedtensor ci]
Differential Revision: D17255549
Pulled By: zou3519
fbshipit-source-id: 1e437ad81e9359b4d5bd0e7e64c3a1be441fc3e3
* C++ API parity: at::Tensor::data
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26008
Test Plan: Imported from OSS
Differential Revision: D17343488
Pulled By: pbelevich
fbshipit-source-id: b9ba5e26cad621a428a14292446d7fb5a6e5535d
* Fix bug with named tensors and (no) tracer support (#26106)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26106
Previously, in the named tensors build, an operator was marked as
non-traceable if ANY of its overloads are named tensor overloads. This
breaks the tracer for things like torch.full (has a names= overload for
named tensor) and tensor.sum (has a Dimname overload for named tensor).
This PR fixes the problem by putting the "no tracer support" logic into
the location where the tracer attempts to construct a graph by adding a
Dimname/DimnameList argument to a node.
Test Plan:
- new test in test_jit.py to check if torch.full is traceable
- new test in test_namedtensor.py to check what happens when someone
tries to trace a function that uses named tensor APIs.
- [namedtensor ci]
Differential Revision: D17353452
Pulled By: zou3519
fbshipit-source-id: b0b843c8357ffe54baee6e8df86db914f0b1ece4
* Add data field to Tensor pyi. (#26093)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26093
Signed-off-by: Edward Z. Yang <[email protected]>
Test Plan: Imported from OSS
Reviewed By: vsiles
Differential Revision: D17366320
Pulled By: ezyang
fbshipit-source-id: 025f1c3d75d294fc1b51ddc540e542a05dc72b6a
* Change schedulers to chainable form (#24352)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24352
Enable chainable schedulers as requested in #13022 by implementing the changes mentioned below from [comment](https://github.com/pytorch/pytorch/pull/21800#issuecomment-513370208).
* Changing the behavior of schedulers to the chainable formula when available
* Using the closed form (with a deprecation warning) whenever epoch is different from None, until the next release
* Making `get_computed_values` the supported way of obtaining the last computed learning rate by the scheduler (see [comment](https://github.com/pytorch/pytorch/pull/21800#issuecomment-513940729) for new syntax)
* Returning a deprecation warning when invoking the undocumented get_lr function (see [comment](https://github.com/pytorch/pytorch/pull/21800#discussion_r294305485)) referring to `get_computed_values`, and deprecating it in the next release.
* `CosineAnnealingWarmRestarts` still takes an epoch parameter, as it is the only one with a mechanism relying on fractional epochs
* `MultiplicativeLR` consumes a function providing the multiplicative factor at each epoch. It mimics `LambdaLR` in its syntax; a short sketch follows below.
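As a minimal illustration (assuming `MultiplicativeLR` takes an `lr_lambda`-style argument like `LambdaLR`; names are illustrative, not verified against this exact revision):
```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3, 3, 3)
optimizer = optim.Adam(conv.parameters(), lr=0.01)
# Multiply the current learning rate by 0.95 at every step, mirroring LambdaLR's syntax.
scheduler = optim.lr_scheduler.MultiplicativeLR(optimizer, lr_lambda=lambda epoch: 0.95)
for epoch in range(5):
    optimizer.step()
    scheduler.step()
```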
# #20527
### Before
The user calls scheduler with a constant epoch either across loops or in the same loop.
```
import torch.optim as optim
from torch import nn
conv = nn.Conv2d(3,3,3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)
# Scheduler with sometimes-constant epoch number
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    lr_scheduler.step(epoch)
    print(optimizer.param_groups[0]['lr'])
```
### After
If the user wants to step only when the epoch number actually changes, the epoch can be tracked manually:
```
import torch.optim as optim
from torch import nn
conv = nn.Conv2d(3,3,3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)
last_epoch = -1
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    # Check if epoch number has changed manually
    if epoch - last_epoch > 0:
        lr_scheduler.step()
        last_epoch = epoch
    print(epoch, lr_scheduler.get_computed_values())
```
# #22107
### Before
```
import torch
from torchvision.models import resnet18
net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)
for i in range(10):
    # Scheduler computes and returns new learning rate, leading to unexpected behavior
    print(i, scheduler.get_lr())
    scheduler.step()
```
### After
```
import torch
from torchvision.models import resnet18
net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)
for i in range(10):
    # Returns last computed learning rate by scheduler
    print(i, lr_scheduler.get_computed_values())
    lr_scheduler.step()
```
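Since the schedulers now compose multiplicatively when chained, a minimal sketch of chaining two schedulers on the same optimizer (illustrative only; it assumes the chainable behavior described above and reads the rate from `param_groups`):
```
import torch
from torch import nn, optim

model = nn.Linear(2, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)
# Two schedulers chained on the same optimizer; each step() applies its own
# multiplicative factor on top of the current learning rate.
step_lr = optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
exp_lr = optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
for epoch in range(6):
    optimizer.step()
    step_lr.step()
    exp_lr.step()
    print(epoch, optimizer.param_groups[0]['lr'])
```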
Test Plan: Imported from OSS
Differential Revision: D17349760
Pulled By: vincentqb
fbshipit-source-id: 0a6ac01e2a6b45000bc6f9df732033dd81f0d89f
* Run PyTorch macOS CPU-only build/test on all PRs
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26096
Test Plan: Imported from OSS
Differential Revision: D17366419
Pulled By: pietern
fbshipit-source-id: 138659dae346aad3cde52d488cd1780614e7692f
* Use CircleCI commands for brew update/install (#26159)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26159
The snippets for working with Homebrew were duplicated across binary
builds, macOS builds, and iOS builds. In #25336, the CircleCI
configuration version was updated to version 2.1, which supports
parameterized commands. This means we no longer have to use YAML
tricks to duplicate stanzas and instead can natively define a series
of reusable steps.
Motivation for doing this is that the macOS binary builds were still
using the slow `brew update` instead of `git fetch` (see #25988).
[test macos]
[test wheel]
Test Plan: Imported from OSS
Differential Revision: D17366538
Pulled By: pietern
fbshipit-source-id: 194c0f37c1dc999705f3ba97fdabf4ff18728d93
* Turn should_run_job into command
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26160
Test Plan: Imported from OSS
Differential Revision: D17366539
Pulled By: pietern
fbshipit-source-id: a870d6da21925764986c6c748ad291440b78e6fd
* Turn setup_linux_system_environment into command
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26162
Test Plan: Imported from OSS
Differential Revision: D17366537
Pulled By: pietern
fbshipit-source-id: 98413daa344812f06578c3373d8516292d2f21f5
* Turn setup_ci_environment into command
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26163
Test Plan: Imported from OSS
Differential Revision: D17366536
Pulled By: pietern
fbshipit-source-id: 07181a77aaeba5457aa716ceac9cc404aacefe5f
* Kill most defaults in Declarations.cwrap. (#25610)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25610
They don't do anything anymore, since this isn't the end-user interface.
Test Plan: Imported from OSS
Differential Revision: D17172495
Pulled By: gchanan
fbshipit-source-id: a380d970f0836ed85eb9ac2aa42eb73655d775aa
* Get rid of more defaults in Declarations.cwrap.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25611
Test Plan: Imported from OSS
Differential Revision: D17172493
Pulled By: gchanan
fbshipit-source-id: 0f4319f8024ac4eca62576231214227b341f56c4
* Kill remaining defaults in Declarations.cwrap.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25612
Test Plan: Imported from OSS
Differential Revision: D17172499
Pulled By: gchanan
fbshipit-source-id: f99e813a4a90e8576541da317027e6f8ae76079b
* Remove requests as dependency (#26083)
Summary:
local build is slow... test in CI...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26083
Differential Revision: D17346949
Pulled By: ailzhang
fbshipit-source-id: f552d1a4be55ad4e2bd915af7c5a2c1b6667c446
* Fix 'in' return true incorrectly (#24156)
Summary:
Because of `return NotImplemented`, `__contains__` returns True when the element is not a number, since `bool(NotImplemented) == True`.
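A minimal sketch of the failure mode and the intended behavior (illustrative classes, not the actual implementation):
```
print(bool(NotImplemented))  # True, so `x in container` silently succeeds

class Broken:
    def __contains__(self, item):
        if not isinstance(item, (int, float)):
            return NotImplemented   # truthy, so `in` incorrectly reports True
        return item == 42

class Fixed:
    def __contains__(self, item):
        if not isinstance(item, (int, float)):
            raise RuntimeError("cannot check membership of a non-numeric element")
        return item == 42

print("a" in Broken())  # True (incorrect)
```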
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24156
Differential Revision: D16829895
Pulled By: zou3519
fbshipit-source-id: 9d3d58025b2b78b33a26fdfcfa6029d0d049f11f
* guard dyndep with a lock (#26153)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26153
I suspect that our multithreaded test system causes issues with dyndep if two places try to call InitOpsLibrary concurrently. So perhaps we just guard this with a lock. This is just a guess fix, as it is impossible to repro.
Test Plan: sandcastle
Reviewed By: bddppq
Differential Revision: D17361310
fbshipit-source-id: 596634a2098b18881abbd26a5a727a5ba0d03b6e
* Add documentation to logging
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26175
Differential Revision: D17371085
Pulled By: Krovatkin
fbshipit-source-id: ea06f4e16fc320940a299e8e1d4f4d7c76f5950a
* Fold quantize op into module (#25625)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25625
We want to fold the quantize op for weights/bias into module to avoid quantizing weights on the fly.
Test Plan:
python test/test_jit.py
Imported from OSS
Differential Revision: D17208889
fbshipit-source-id: 1854b8953b065855d210bc1166533c08ca264354
* Revert D17349760: Change schedulers to chainable form
Test Plan: revert-hammer
Differential Revision:
D17349760
Original commit changeset: 0a6ac01e2a6b
fbshipit-source-id: 41c2c136215dabc26cad5098a08eff2a2a29b715
* Use torch::from_blob instead of shareExternalPointer, nits (#25973)
Summary:
The main part is to switch at::Tensor creation from `torch::empty(torch::IntArrayRef(...))->ShareExternalPointer(...)` to `torch::from_blob(...)`.
Removed the explicit `device CPU` setting, as `at::TensorOptions` defaults to device CPU.
Also renamed local variables, removing the `input` prefix to make them shorter.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25973
Differential Revision: D17356837
Pulled By: IvanKobzarev
fbshipit-source-id: 679e099b8aebd787dbf8ed422dae07a81243e18f
* Make schema part of RegisterOperators::Options (#26114)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26114
With this diff, the operator schema or name can be specified as part of the options objects:
```
static auto registry = torch::RegisterOperators()
.op(torch::RegisterOperators::options().schema("my_op").kernel(&kernel))
.op(...);
```
This does not break backwards compatibility; all old APIs are kept as shorthands.
This (a) makes the API more consistent, accumulating all options into the options object and not treating schema specially anymore, and (b) is required for allowing the c10 dispatcher to forward registration calls to ATenDispatch for ops that are still on that dispatcher; see the plan in https://github.com/pytorch/pytorch/issues/24132
ghstack-source-id: 90049402
Test Plan: unit tests
Differential Revision: D17350383
fbshipit-source-id: cbb8f33a52dccb2a4522753e7b5ac8ba35b908fd
* Allow overwriting catch-all kernels (#25947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25947
Previously, the c10 dispatcher didn't allow having a catch-all kernel and backend specific kernels at the same time.
This is also the long term goal. But to make the current XLA implementation work, we need to allow them to overwrite these ops with XLA variants.
This diff changes that so that ops can have both catch-all and backend-specific kernels, and will call into the catch-all kernel if no more specific kernel is registered.
This is also the current behavior of globalATenDispatch.
ghstack-source-id: 90049398
Test Plan: unit tests
Differential Revision: D17293036
fbshipit-source-id: f2d5928e904c1dc9b6b89e9bb468debe48a4056c
* Register ATen ops with c10 (#26131)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26131
Changes in this PR:
- For each operator with use_c10_dispatcher: True, additionally generate a c10 registration line in TypeDefault.cpp, CPUType.cpp, and other backend files.
- This doesn't change globalATenDispatch yet, the c10 registration is purely additional and the operator calling path doesn't change. A diff further up the stack will change these things.
- Enable the use_c10_dispatcher: True flag for about ~70% of operators
- This also changes the c10->jit operator export because ATen ops are already exported to JIT directly and we don't want to export the registered c10 ops because they would clash
- For this, we need a way to recognize if a certain operator is already moved from ATen to c10, this is done by generating a OpsAlreadyMovedToC10.cpp file with the list. A diff further up in the stack will also need this file to make sure we don't break the backend extension API for these ops.
Reasons for some ops to be excluded (i.e. not have the `use_c10_dispatcher` flag set to true):
- `Tensor?(a!)` (i.e. optional tensor with annotations) not supported in c++ function schema parser yet
- `-> void` in native_functions.yaml vs `-> ()` expected by function schema parser
- out functions have different argument order in C++ as in the jit schema
- `Tensor?` (i.e. optional tensor) doesn't work nicely because an undefined tensor is sometimes represented as an undefined tensor and sometimes as None.
- fixed-size arrays like `int[3]` not supported in c10 yet
These will be fixed in separate diffs and then the exclusion tag will be removed.
ghstack-source-id: 90060748
Test Plan: a diff stacked on top uses these registrations to call these ops from ATen
Differential Revision: D16603131
fbshipit-source-id: 315eb83d0b567eb0cd49973060b44ee1d6d64bfb
* Updating submodules
Summary:
GitHub commits:
https://github.com/facebook/rocksdb/commit/83a6a614e9bf5f3f06abc265b736e868acee498b
https://github.com/pytorch/fbgemm/commit/c8cac64995d8d8af871e461affbf505ac7fce4d8
Test Plan: n/a
Reviewed By: 2d2d2d2d2d
fbshipit-source-id: 1f5bc1e065fe13d89eeb42539f21a8ab0ab8b8a1
* Nightly build for iOS (#26074)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26074
### Summary
This PR creates a nightly job for iOS builds. The job will generate a couple of static libraries that contain three architectures (x86, arm64, armv7s) and upload them to AWS S3.
### Note
The test phase in this job is missing right now, meaning that if there is a linking error, we won't know about it. To add the test jobs, we have to put a dummy test app in the repo and manually link the libraries to the app after the build finishes. This will be done in follow-up PRs.
Test Plan: Imported from OSS
Differential Revision: D17363066
Pulled By: xta0
fbshipit-source-id: 5beeb4263af5722f0a852297023f37aaea9ba4b1
* Change the source link in podspec (#26089)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26089
### Summary
A couple of changes
1. Replace the source link with the newly nightly build address
2. Remove module support for Swift and Objective-C
3. Expose all static libraries instead of archiving them into one single library. This is because those static libraries might contain object files that have the same name, e.g. `init.c.o` in both `libcpuinfo.a` and `libqnnpack.a`. If we archive them into one using the `libtool -static` command, by default it only picks one object file and discards the others, which could result in undefined symbols when linking the executable. The change here is to expose all the static libraries and let the linker decide which ones to use.
### Test Plan
- pod spec lint succeed
- `pod spec lint --verbose --allow-warnings --no-clean --use-libraries --skip-import-validation`
Test Plan: Imported from OSS
Differential Revision: D17363037
Pulled By: xta0
fbshipit-source-id: ba77b0001b58e6e2353d8379d932db598166d37d
* Updating submodules
Summary:
GitHub commits:
https://github.com/facebook/rocksdb/commit/97631357aa274d06a7ab09b3cde7b909262cc4dd
https://github.com/pytorch/fbgemm/commit/2f1477dfee9465c1e2dbdf21722970b3fa1baf86
Test Plan: n/a
Reviewed By: 2d2d2d2d2d
fbshipit-source-id: 33029d2e8c6a3664a35823829670f6ed9dfc3b44
* Tensor renaming to dtype, shape; support long, double (#26183)
Summary:
Applying dzhulgakov review comments
org.pytorch.Tensor:
- dims renamed to shape
- typeCode to dtype
- numElements to numel
newFloatTensor, newIntTensor... to newTensor(...)
Add support of dtype=long, double
Re-sorted dtypes in code as byte, int, float, long, double.
For if conditions, the order is float, int, byte, long, double, as I expect the float and int branches to be used more often.
Tensor.toString() does not include data, only numel (the data buffer capacity).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26183
Differential Revision: D17374332
Pulled By: IvanKobzarev
fbshipit-source-id: ee93977d9c43c400b6c054b6286080321ccb81bc
* use whitelist for selecting observed values (#25974)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25974
Previously we observed all the Tensor values, but what we actually want is to
observe only the ones that can be quantized.
Test Plan:
python test/test_jit.py
python test/test_quantizer.py
Imported from OSS
Differential Revision: D17348986
fbshipit-source-id: 55be0d73862a0e7eb1e7fd882d16e0d830618b63
* fix circle CI
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26225
Test Plan: Imported from OSS
Differential Revision: D17379899
Pulled By: xta0
fbshipit-source-id: 4077aa0149b23560f3a9e29531ca9bc612a2c09c
* Add histogram observer (#23959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23959
Add histogram observer that records the running histogram of tensor values along with min/max values.
ghstack-source-id: 90076996
Test Plan:
Added a test test_histogram_observer
buck test mode/dev caffe2/test:quantization -- 'test_histogram_observer'
buck test mode/dev caffe2/test:quantization -- 'test_observer_scriptable'
Differential Revision: D16692835
fbshipit-source-id: 0f047d3349cb9770fad4a2b6cb346c51d9e99cd4
* Add isBackwardCompatibleWith for Argument and FunctionSchema (#23409)
Summary:
we intend to be conservative, and will relax the checks in the future if necessary.
So far, we consider the following three conditions as backward compatible:
1) the two schemas are equal
2) the two schemas have the same number of arguments, and this schema's
arguments are backward compatible with the corresponding ones in the
argument list of old_schema.
3) this schema has m arguments, old_schema has n arguments, and m > n;
the first n arguments of this schema are backward compatible with
the corresponding arguments of old_schema, and the remaining arguments
must be either OptionalType or provide default values.
A simplified sketch of these rules follows.
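A minimal Python sketch of the rules above (illustrative only; the actual implementation lives in C++ on FunctionSchema/Argument, and the per-argument compatibility check here is simplified):
```
from typing import NamedTuple

class Arg(NamedTuple):
    name: str
    type: str          # e.g. "int", "int?", "Tensor"
    has_default: bool

def arg_compatible(new: Arg, old: Arg) -> bool:
    # Simplified stand-in for the real per-argument compatibility check.
    return new.name == old.name and new.type == old.type

def is_backward_compatible(new_args, old_args) -> bool:
    # Rules 1/2: every old argument must have a compatible counterpart, in order.
    if len(new_args) < len(old_args):
        return False
    if not all(arg_compatible(n, o) for n, o in zip(new_args, old_args)):
        return False
    # Rule 3: extra trailing arguments must be optional or have defaults.
    return all(a.has_default or a.type.endswith("?") for a in new_args[len(old_args):])
```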
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23409
ghstack-source-id: 90111021
Test Plan: buck test //caffe2/test:function_schema
Reviewed By: hl475
Differential Revision: D16505203
fbshipit-source-id: e4099537776a60e8945e5c3cd57fa861f3598a9b
* Creates generic device type testing framework (#25967)
Summary:
This PR addresses https://github.com/pytorch/pytorch/issues/24851 by...
1. lets device types easily register themselves for testing
2. lets tests be written to run on multiple devices and with multiple dtypes
3. provides a mechanism to instantiate those tests so they are discoverable and filterable by unittest and pytest
It refactors three tests from test_torch.py to demonstrate how to use it.
`test_diagonal` is the simplest example. Most tests just need to be modified to accept 'device' as an argument. The framework will then instantiate `test_diagonal_cpu` and `test_diagonal_cuda` (when CUDA is available) which call `test_diagonal` with the appropriate 'device' argument.
`test_neg` also has dtype variants. It accepts both 'device' and 'dtype' as arguments, and the dtypes it runs with are specified with the 'dtypes' decorator. Dtypes can be specified for all device types and particular device types. The framework instantiates tests like `test_neg_cpu_torch.float`.
`test_inverse` has device-specific dependencies. These dependencies are expressed with the sugary 'skipCUDAIfNoMagma' and 'skipCPUIfNoLapack' decorators. These decorators are device-specific, so CPU testing is not skipped if Magma is not installed, and their conditions may be checked before or after the test case has been initialized. This means that skipCUDAIfNoMagma does not initialize CUDA. In fact, CUDA is only initialized if a CUDA test is run.
These instantiated tests may be run as usual and with pytest filtering it's easy to run one test on all device types, run all the tests for a particular device type, or run a device type and dtype combination.
See the note "Generic Device-Type Testing" for more detail.
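As a hedged sketch of what a test written against this framework might look like (the import path, `instantiate_device_type_tests` helper, and class names here are illustrative, not necessarily the exact ones added by this PR):
```
import unittest
import torch

# Illustrative import; the real decorators and instantiation helper live in the
# test suite's device-type module and may be named differently.
from common_device_type import instantiate_device_type_tests, dtypes

class TestExample(unittest.TestCase):
    # Instantiated as test_diagonal_cpu and test_diagonal_cuda (when CUDA is available).
    def test_diagonal(self, device):
        x = torch.randn(5, 5, device=device)
        self.assertEqual(tuple(x.diagonal().shape), (5,))

    # Instantiated once per listed dtype, e.g. test_neg_cpu_torch.float.
    @dtypes(torch.float, torch.double)
    def test_neg(self, device, dtype):
        x = torch.ones(3, device=device, dtype=dtype)
        self.assertTrue(torch.equal(-x, x * -1))

# Generates the per-device (and per-dtype) test classes so unittest/pytest can discover them.
instantiate_device_type_tests(TestExample, globals())

if __name__ == '__main__':
    unittest.main()
```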
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25967
Differential Revision: D17381987
Pulled By: mruberry
fbshipit-source-id: 4a639641130f0a59d22da0efe0951b24b5bc4bfb
* adds sync to flaky test_events_multi_gpu_query (#26231)
Summary:
This test can sometimes fail in CI.
I suspect this flakiness is because the test asks a CUDA stream to record an event, fails to synchronize the CPU with that stream, then checks if the event is recorded on the CPU. There is no guarantee this will have happened.
This one-line change preserves the intent of the test while ensuring the GPU has recorded the event before the CPU queries it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26231
Differential Revision: D17382110
Pulled By: mruberry
fbshipit-source-id: 35b701f87f41c24b208aafde48bf10e1a54de059
* Added possible out of shared memory error message (#25730)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/5040
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25730
Differential Revision: D17226214
Pulled By: pbelevich
fbshipit-source-id: 92278272aab74e6690f14fc9597acfd1a98854b7
* Remove armv7s build from iOS (#26222)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26222
### Summary
The last generation of armv7s devices is the iPhone 5C. As discussed with David offline, we decided not to support iOS armv7s devices.
### Test plan
- CI finishes successfully
- Builds can be run only on X86_64 and arm64 devices
Test Plan: Imported from OSS
Differential Revision: D17385308
Pulled By: xta0
fbshipit-source-id: f883999aed18224ea3386b1f016964a33270fa34
* Back out "[quant][observer] Add histogram observer" (#26236)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26236
Original diff broke oss CI. Reverting.
Original commit changeset: 0f047d3349cb
ghstack-source-id: 90125990
Test Plan: testinprod
Reviewed By: hx89
Differential Revision: D17385490
fbshipit-source-id: 4258502bbc0e3a6dd6852c8ce01ed05eee618b1a
* Ports most of test_torch.py to generic device type framework (#26232)
Summary:
This PR moves many tests in test_torch.py to the generic device type framework. This means that many CUDA tests now run in test_torch.py and there is greater consistency in how tests for many device types are written.
One change is that all MAGMA tests are run on the default stream due to intermittent instability running MAGMA on the non-default stream. This is a known issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26232
Test Plan:
While this PR edits the tests itself, it was validated using two independent methods:
(1) The code was reviewed and it was verified that all deleted functions were actually moved.
(2) The output of the TestTorch CI was reviewed and test outputs were matched before and after this PR.
Differential Revision: D17386370
Pulled By: mruberry
fbshipit-source-id: 843d14911bbd52e8aac6861c0d9bc3d0d9418219
* Add type hint for cuda.set_rng_state (#26200)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/26199
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26200
Differential Revision: D17386885
Pulled By: soumith
fbshipit-source-id: 9da03aae29281b2ed691cbfdd7b85fde55e5b7ef
* Add a wrapper for inspect in JIT to produce better error message (#25415)
Summary:
If source code is not available due to packaging (e.g. sources are compiled to .pyc), TorchScript produces a very obscure error message. This tries to make it nicer and allows customizing the message by overriding _utils_internal.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25415
Test Plan: Really hard to unittest properly. Did one off testing by compiling to .pyc and checking the message.
Differential Revision: D17118238
Pulled By: dzhulgakov
fbshipit-source-id: 3cbfee0abddc8613000680548bfe0b8ed52a36b0
* Use MIOpen for transpose convolutions (#26172)
Summary:
Provides significant performance uplift where used.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26172
Differential Revision: D17374862
Pulled By: bddppq
fbshipit-source-id: 85d2df3c67b8935bc54f3a81a912a25c0102743a
* Call aten ops through c10 dispatcher (#23668)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23668
- The eager mode frontend now calls operators that are defined in native_functions.yaml with `use_c10_dispatcher: True` through the c10 dispatcher and no longer through globalATenDispatch().
- These operators aren't registered with globalAtenDispatch anymore, only on c10 now.
- Backend extensions calling globalATenDispatch().registerOp() to add their own kernels still work, this function will forward the registration to the c10 dispatcher for them.
ghstack-source-id: 90130455
Test Plan: benchmarks at https://docs.google.com/document/d/1gpzKZcFf1JJameY1vKxF7Cloul9s6D8HKIK2_Pp1hFo/edit#
Differential Revision: D16603133
fbshipit-source-id: 991f17b355e9c78c5e86fee4fa381df7ab98ac82
* Remove unboxedAutogradKernel from c10 (#26130)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26130
Since we now just use TensorTypeId::VariableTensorId, there's no need to treat autograd kernels any differently.
ghstack-source-id: 90130457
Test Plan: unit tests
Differential Revision: D17353873
fbshipit-source-id: d4468506a5366bc5e7429144b090b3e78af9de62
* Refines test_torch.py generic device testing (#26244)
Summary:
- Adds SkipCUDAIfRocm and skipCPUIfNoMkl decorators, ports corresponding tests
- Changes "SkipIf" input semantics for consistency
- Removes torchtest, which has been replaced with this new generic framework
- Refactors some common parts out of CUDA tests to TestTorchDeviceType
- Ensures all MAGMA tests run on default stream by putting the skipCUDANonDefaultStreamIf in the skipCUDAIfNoMagma decorator.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26244
Differential Revision: D17389060
Pulled By: mruberry
fbshipit-source-id: 1375774f24c2266049e6d4b899e7300ddf32eac8
* Fix Windows build (#26246)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26246
Broken due to https://github.com/pytorch/pytorch/issues/12117. Try fixing it.
ghstack-source-id: 90137033
Test Plan: waitforsandcastle
Reviewed By: zou3519
Differential Revision: D17387317
fbshipit-source-id: 705998c0b1608668d510b47f4fe20cecf5057c5f
* Fix CI (#26250)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26250
Exclude some ops from the c10 dispatcher that don't work with it yet.
ghstack-source-id: 90138046
Test Plan: waitforsandcastle
Reviewed By: zou3519
Differential Revision: D17390117
fbshipit-source-id: a87fb3048aeba2c3293b95d610ddb8e94369f8fe
* Back out "[pytorch][PR] Refines test_torch.py generic device testing" (#26252)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26252
Original commit changeset: 1375774f24c2
Testing to see if this is somehow the source of hangs on ROCm builds.
Test Plan: Change is to tests themselves. This diff is for testing the ROCm hang, however.
Differential Revision: D17390575
fbshipit-source-id: a6ffd5eb1df3971b99b6d42271a8d3d501ac79c6
* Fix namedtensor ci (#26257)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26257
In native_functions.yaml, all overloads must have unique overload names.
This PR fixes `flatten` to have unique names for the overloads.
Test Plan: - tested locally, but also [namedtensor ci]
Differential Revision: D17391243
Pulled By: zou3519
fbshipit-source-id: aaef654953b4275c43b9d7bd949c46bd011f6c73
* Switch to the new profiler infrastructure (#26174)
Summary:
The ones supported going forward are rocprofiler and roctracer.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26174
Differential Revision: D17387538
Pulled By: bddppq
fbshipit-source-id: 19d9828d9d07b5073ab5fa288e24fd65a8b18b52
* Fix binary size of OpsAlreadyMovedToC10.cpp (#26237)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26237
Calling a lot of `std::string` constructors is horrible for binary size, see t53997334.
Using `const char*` instead should make the binary size much smaller.
ghstack-source-id: 90145501
Test Plan: size checks on the diff
Differential Revision: D17386002
fbshipit-source-id: c5420adf225e535396e806a0df92419a7e2ad3e8
* Fix no auto batching bugs: cannot bulk load; not work with namedtuple (#26065)
Summary:
see title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26065
Differential Revision: D17392851
Pulled By: soumith
fbshipit-source-id: 468cd41c8e03d689ff2e0261d948e28daad6bfaf
* Upgrade MKLDNN to v0.20.5 (#25757)
Summary:
1. Fix issues exposed by below posts.
https://github.com/pytorch/pytorch/issues/25242
https://github.com/pytorch/pytorch/issues/25101
https://github.com/pytorch/pytorch/issues/23825
2. Fix RNN support issue in mkldnn-bridge
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25757
Differential Revision: D17367948
Pulled By: VitalyFedyunin
fbshipit-source-id: d8430d3909ecbf853afa0ce3d968735f86f1da31
* fix hypothesis timeout (#26280)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26280
ghstack-source-id: 90160270
Test Plan: testinprod
Differential Revision: D17396861
fbshipit-source-id: ee2348ffa7f6092e2c5647a42d0e17879dcfacd0
* Migrate away from using Variable( in test_nn.py (#26077)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26077
As per #26071, we would like to get rid of the calls to Variable(
where possible. This diff removes the calls in the test file test_nn.py. The
unit tests should all still pass as expected.
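For context, a minimal sketch of the kind of change this implies (illustrative, not a line from the actual diff):
```
import torch
from torch.autograd import Variable

# Before: wrapping tensors in Variable (redundant since Tensors and Variables merged).
x = Variable(torch.randn(3), requires_grad=True)

# After: construct the tensor directly with requires_grad.
y = torch.randn(3, requires_grad=True)
```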
ghstack-source-id: 90086624
Test Plan: tests in `test_nn.py` should all pass.
Differential Revision: D17336484
fbshipit-source-id: 43fc7bd0b0be835ae89d06162ce1cbe4e0056d91
* Enabled conv methods for the bfloat16
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26167
Differential Revision: D17367728
Pulled By: izdeby
fbshipit-source-id: 0a7bd9a6dbc15815af195d644c9372af2135e93a
* Move the CUDA implementation of round to ATen. (#25041)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25041
Fix #24617
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25041
Test Plan: Imported from OSS
Differential Revision: D17114368
Pulled By: VitalyFedyunin
fbshipit-source-id: 6ec6ef99b4451acd7e93491fd4b44fca9ce1809d
* Whitelist and fusion support for quantized::linear - addmm (#26208)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26208
Supporting `addmm` -> `quantized::linear` quant fusion
Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion'
Imported from OSS
Differential Revision: D17380074
fbshipit-source-id: fae88f118f85663d777648695768b0504ed7ccf9
* Whitelist and fusion support for quantized::linear - matmul(without bias) (#26209)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26209
Support quant fusion for `matmul`(without bias) -> `quantized::linear`
Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion'
Imported from OSS
Differential Revision: D17380075
fbshipit-source-id: 290caee7f7bcf94d2731c0ee9bd40054f0fb9b07
* Updating submodules
Summary:
GitHub commits:
https://github.com/facebook/mcrouter/commit/653434b898ea35810d7369d0911e3bdab9a1c3ac
https://github.com/facebook/proxygen/commit/b74fbefc1a69de78989f540d9d0d312945aeadeb
https://github.com/facebook/rocksdb/commit/9bd5fce6e89fcb294a1d193f32f3e4bb2e41d994
https://github.com/facebookincubator/mvfst/commit/6efcef720fac04011708840b89d1f174d3f290d0
https://github.com/facebookresearch/pytorch-biggraph/commit/cb7830b6b30d2d24b591178705eaf9e8209ecd09
https://github.com/pytorch/fbgemm/commit/53f0c0d175ae4283609a5b251052f9c6598b8aee
Test Plan: n/a
Reviewed By: yns88
fbshipit-source-id: 78d0e24f5601aa990391a2404ae9d23b325de93f
* Add ProcessGroupGloo::createDefaultDevice (#26166)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26166
There were 2 variants to create a new device: one based on the
name of a network interface, and one based on a hostname or
address. In the latter, if the address was not specified, it would
look up the local hostname and try to resolve that. If that failed, the
process would crash.
In this default path, we now try to look up and use the local hostname,
and if that fails we fall back to using the loopback address.
If the local hostname doesn't resolve to an address that we can bind
to, it is very likely that this process won't join other processes
over the network, and that the user is trying to run a local test.
If this assumption is wrong, the user can override the default
interface selection by setting the environment variable
`GLOO_SOCKET_IFNAME` to the name of the external network interface.
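For example, a minimal sketch of overriding the interface selection (interface name and rendezvous address are illustrative):
```
import os
import torch.distributed as dist

# Force Gloo to bind to a specific external interface instead of the default lookup.
os.environ["GLOO_SOCKET_IFNAME"] = "eth0"  # illustrative interface name
dist.init_process_group(
    backend="gloo",
    init_method="tcp://https://127.0.0.1:23456",  # illustrative rendezvous address
    rank=0,
    world_size=1,
)
```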
I tested this by changing the local hostname to a bogus name and
confirmed that default initialization works as expected.
Closes #26049.
Test Plan: Imported from OSS
Differential Revision: D17397898
Pulled By: pietern
fbshipit-source-id: 95a2467761d89df87b520d6e5837b92184b0dc12
* Disable broken unit tests (#26301)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26301
-
ghstack-source-id: 90176419
Test Plan: waitforsandcastle
Differential Revision: D17400971
fbshipit-source-id: b6f9cb27fe955b0200d62591300c70ba79a90e5f
* Kill defaults in nn.yaml. (#26282)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26282
Since this isn't the end-user API anymore, we shouldn't have defaults.
Test Plan: Imported from OSS
Differential Revision: D17397153
Pulled By: gchanan
fbshipit-source-id: d44040bec0ee9c70734a53ebcc10a96f12226a29
* Upgrade Caffe2 docker images to 306 to include roctracer and rocprofiler
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26260
Differential Revision: D17391902
Pulled By: bddppq
fbshipit-source-id: 89ab3dedf05ba398acb7300fac95f03cfb31f0ba
* Whitelist and fusion support for quantized::linear - matmul(with bias) (#26204)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26204
Support quant fusion for `matmul` with bias to `quantized::linear`.
Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion'
Imported from OSS
Differential Revision: D17380073
fbshipit-source-id: 00014469a852cc5d5b66469fc4b8d05eafba1e3e
* Add __s390x__ compiler define for s390 builds. (#26233)
Summary:
PyTorch builds fail on the s390 architecture because
in simd.h the ifdef macros default to an x86 asm instruction.
This patch adds an ifdef __s390x__ to be able to build on s390.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26233
Differential Revision: D17392714
Pulled By: soumith
fbshipit-source-id: 037672bfea64fc5e52da2390d93b973534137c12
* Clarified ambiguous docstring in NegativeBinomial
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25923
Differential Revision: D17392848
Pulled By: soumith
fbshipit-source-id: 2833e72fe449c74dfd8273a7b1eb46c05c63d999
* Dynamic quantization for bias. (#26057)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26057
bias is now unquantized (i.e. floating type) for qconv and qlinear. It is dynamically quantized by fbgemm.
TODO: Add some performance numbers.
Tests:
test:quantization
```
Summary (total time 8.41s):
PASS: 24
FAIL: 0
SKIP: 0
FATAL: 0
TIMEOUT: 0More details at https://our.intern.facebook.com/intern/buck/build/74d5f6f7-55c9-4350-a618-2013042fffd8
OMIT: 0
```
test:quantized
```
Summary (total time 13.21s):
PASS: 43
FAIL: 0
SKIP: 5
caffe2/test:quantized - test_qnnpack_maxpool2d (test_quantized.TestQNNPackOps)
caffe2/test:quantized - test_compare_tensor_scalar (test_quantized.TestComparatorOps)
caffe2/test:quantized - test_qnnpack_linear (test_quantized.TestQNNPackOps)
caffe2/test:quantized - test_qnnpack_relu (test_quantized.TestQNNPackOps)
caffe2/test:quantized - test_qnnpack_add (test_quantized.TestQNNPackOps)
FATAL: 0
TIMEOUT: 0
OMIT: 0
```
ghstack-source-id: 90166254
Test Plan:
buck test mode/dev caffe2/test:quantization
buck test mode/dev caffe2/test:quantized
Differential Revision: D17328028
fbshipit-source-id: d4a163d730d0f4a03e8e0faf7420710cf36eec09
* Use expected_wrapper only if CMAKE_{C,CXX}_COMPILER and/or is not set by user (#26306)
Summary:
This will honor the user's preference.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26306
Differential Revision: D17408030
Pulled By: soumith
fbshipit-source-id: 6841b805603d40cd7caf78dbb42405a0c931f052
* Add derivative of cholesky_solve (#26185)
Summary:
Changelog:
- Add derivative of cholesky_solve. The equations are derived akin to the derivative of solve methods using the technique detailed [here](https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=2ahUKEwiXrOjIyM7kAhWstlkKHRxqCDgQFjAAegQIAhAC&url=https%3A%2F%2Fpeople.maths.ox.ac.uk%2Fgilesm%2Ffiles%2FNA-08-01.pdf&usg=AOvVaw0BNISOvM_I9KjPrl0xv1R_)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26185
Test Plan:
- Added tests for cholesky_solve in test_autograd.py
Closes half of https://github.com/pytorch/pytorch/issues/4669.
Differential Revision: D17408123
Pulled By: soumith
fbshipit-source-id: f9668c8d4d758c0dc658941a8b730a17683091aa
* Kill 'default_init', which isn't needed anymore.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26281
Test Plan: Imported from OSS
Differential Revision: D17397097
Pulled By: gchanan
fbshipit-source-id: fb53e90637a3dfb2300fca78f414abe2d82832f3
* Export round (#26126)
Summary:
Added round export in opset 11
Pull Request resolved: https:…
Enable chainable schedulers as requested in #13022 by implementing the changes mentioned below from [comment](#21800 (comment)).
* Changing the behavior of schedulers to the chainable formula when available
* Using the closed form (with a deprecation warning) whenever epoch is different from None, until the next release
* Making `get_computed_values` the supported way of obtaining the last computed learning rate by the scheduler (see [comment](#21800 (comment)) for new syntax)
* Returning a deprecation warning when invoking the undocumented get_lr function (see [comment](#21800 (comment))) referring to `get_computed_values`, and deprecating it in the next release.
* `CosineAnnealingWarmRestarts` still takes an epoch parameter, as it is the only one with a mechanism relying on fractional epochs
* `MultiplicativeLR` consumes a function providing the multiplicative factor at each epoch. It mimics `LambdaLR` in its syntax.
# #20527
### Before
The user calls the scheduler with a constant epoch, either across loops or in the same loop.
```
import torch.optim as optim
from torch import nn
conv = nn.Conv2d(3,3,3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)
# Scheduler with sometimes-constant epoch number
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    lr_scheduler.step(epoch)
    print(optimizer.param_groups[0]['lr'])
```
### After
If the user wants to step only when the epoch number actually changes, the epoch can be tracked manually:
```
import torch.optim as optim
from torch import nn
conv = nn.Conv2d(3,3,3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)
last_epoch = -1
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    # Check if epoch number has changed manually
    if epoch - last_epoch > 0:
        lr_scheduler.step()
        last_epoch = epoch
    print(epoch, lr_scheduler.get_computed_values())
```
# #22107
### Before
```
import torch
from torchvision.models import resnet18
net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)
for i in range(10):
    # Scheduler computes and returns new learning rate, leading to unexpected behavior
    print(i, scheduler.get_lr())
    scheduler.step()
```
### After
```
import torch
from torchvision.models import resnet18
net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)
for i in range(10):
    # Returns last computed learning rate by scheduler
    print(i, lr_scheduler.get_computed_values())
    lr_scheduler.step()
```
# ghstack
This contains the changes from #24352. Opening again since they were reverted. This reverts commit 1c477b7.
Summary: Pull Request resolved: #26423
Test Plan: Imported from OSS
Differential Revision: [D17460427](https://our.internmc.facebook.com/intern/diff/D17460427)
Pulled By: vincentqb
fbshipit-source-id: 8c10f4e7246d6756ac91df734e8bed65bdef63c9
* Implement C++ API version of torch.nn.functional.one_hot (#27081) (#27177) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27177 Add support for F::one_hot C++ function. Test Plan: Added 3 new tests to verify API is working Imported from OSS Differential Revision: D17697934 fbshipit-source-id: a8127fb87c00daa119bb92a5702bc4bbba48290d * Refactor torch::jit::script::Module::register_* API. (#27189) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27189 Conceptually, Module is just a view over ClassType and ivalue::object. register_ methods are the only methods that are exception from this: they provide an API not available on ClassType or object directly. This PR ports this API to ClassType and makes Module truly just a view over those two. Test Plan: Imported from OSS Differential Revision: D17703533 Pulled By: ZolotukhinM fbshipit-source-id: 2cdb9fb486b3fb8527986483c7f34be7bd59fabf * Add c10_experimental ops to BC check white list (#27235) Summary: experimental ops doesn't provide bc guarantee. Pull Request resolved: https://github.com/pytorch/pytorch/pull/27235 Reviewed By: hl475 Differential Revision: D17723292 Pulled By: houseroad fbshipit-source-id: 644ae34d130418a810e0f9d802fa25f6e34c5ccf * Rename _intrinsic to intrinsic Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27194 Test Plan: Imported from OSS Differential Revision: D17704957 Pulled By: zafartahirov fbshipit-source-id: 46f02d129aa77c3047b2a6c606bfadd831a6b0fc * Allow set for qconfig for dynamic_quantize Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27181 Test Plan: Imported from OSS Differential Revision: D17717482 Pulled By: jamesr66a fbshipit-source-id: f3930fc87831cbdcf4390cd769c594bb13f5cd81 * Fix reprs for _intrinsic modules Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27184 Test Plan: Imported from OSS Differential Revision: D17717481 Pulled By: jamesr66a fbshipit-source-id: 4bd72bcd42191d9b21d03f5bb6698198dbffffda * skip all rpc and dist autograd spawn tests for <PY36 (#27191) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27191 skip rpc and distautograd spawns tests for <python 3.6 ghstack-source-id: 91231565 close #27157 Test Plan: unit tests Differential Revision: D17697368 fbshipit-source-id: bb8cf1f47de41f9d350fd60afe37fece293d8680 * Add send and recv backward functions for builtin operators RPC. (#25527) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25527 Master GH issue: https://github.com/pytorch/pytorch/issues/23110. This change builds upon https://github.com/pytorch/pytorch/pull/24876 and provides all the autograd hooks needed for a forward pass with distributed rpc for builtin operators. This change does not address distributed rpc for python UDFs and that will be addressed in follow up PRs. 
Summary of changes: 1. Attach send autograd functions when a request is sent from the client and response is sent from the server. 2. Attach receive autograd functions when a request is received on the server and a response is received on the client. 3. Generate a globally unique autograd_message_id for each send/recv autograd function pair to uniquely identify them. ghstack-source-id: 91240466 Test Plan: unit tests. Differential Revision: D17148077 fbshipit-source-id: 192d8a3f552ed7cc939f55dcca332965c9bd3233 * Rename jit Function to ScriptFunction Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27219 Test Plan: Imported from OSS Differential Revision: D17715306 Pulled By: albanD fbshipit-source-id: d11a7634dbee6a885c7177b240958e5aed2544f3 * Make cpp-backed jit classes appear as being in torch.jit Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27220 Test Plan: Imported from OSS Differential Revision: D17715305 Pulled By: albanD fbshipit-source-id: 574704ad23ece6da7aa2780b78867307bef523cc * Avoid configuring ROCm if USE_CUDA is on. (#26910) Summary: Move the resolution of conflict between `USE_CUDA` and `USE_ROCM` to CMake as to effectuate: - `USE_CUDA=ON` and CUDA is found, `USE_ROCM=ON` and ROCM is found --> fatal error - Either `USE_CUDA=ON` and CUDA is found or `USE_ROCM=ON` and ROCM is found --> The respective GPU feature is ON - Otherwise no GPU support Pull Request resolved: https://github.com/pytorch/pytorch/pull/26910 Differential Revision: D17738652 Pulled By: ezyang fbshipit-source-id: 8e07cc7e922e0abda24a6518119c28952276064e * Revert "Add std::variant backport as c10::variant (#26836)" (#27277) Summary: This reverts commit 0cd188035a27fc38ce1e8eee205f6d47cd7650e6. As reported by jerryzh168 and pritamdamania87, mpark::variant doesn’t compile with gcc 7.3.1 on fb devserver and throws error similar to https://github.com/mpark/variant/issues/43. (However, it doesn’t fail with gcc 7.3.1 in OSS CI, based on https://circleci.com/api/v1.1/project/github/pytorch/pytorch/2995606/output/107/0?file=true) A plausible workaround is to upgrade devserver to devtoolset-8, but that would in turn causes CUDA build to complain: ``` /usr/local/cuda/bin/../targets/x86_64-linux/include/crt/host_config.h:119:2: error: #error -- unsupported GNU version! gcc versions later than 7 are not supported! #error -- unsupported GNU version! gcc versions later than 7 are not supported! ``` (Thanks pritamdamania87 for the report!) The solution for now is to revert the mpark::variant addition, and I will find alternatives that will work with gcc 7.3.1 on fb devserver. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27277 Differential Revision: D17739804 fbshipit-source-id: ad945b3d86ab7ddbff58f4ecab95e0e1ac725ae9 * Implement LpNorm regularizer to be used on the inputs for feature importance (#26376) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26376 * Create the new dense_feature_reg (FCInputLpNorm) for feature importance to be applied to the fully-connected layer for feature-importance. Test Plan: * Unit test located in: `caffe2/caffe2/fb/dper/layer_models/tests/split_1/sparse_nn_test.py` Reviewed By: un-disclosed Differential Revision: D17360361 fbshipit-source-id: 1a0e119eeb17199a13dfffe58b3036ea4255e301 * Provide (but skip) 3.5 job by default on all PRs. (#27293) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27293 This doesn't turn on 3.5 signal, but it makes it so that [test all] will include it if you do request it. Signed-off-by: Edward Z. Yang <[email protected]> Test Plan: Imported from OSS Differential Revision: D17738741 Pulled By: ezyang fbshipit-source-id: 2b1af4d7bf26fd84a593fde292d6bfa2aabc1148 * more profiler changes in C++ before enabling checkScript changes Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26909 Differential Revision: D17683632 Pulled By: Krovatkin fbshipit-source-id: 5d36c3c4cf7411c56485ef19fe59262b9f8b45b2 * Fix segfault while printing value type for an error msg in emitListComprehension Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27261 Differential Revision: D17740159 Pulled By: Krovatkin fbshipit-source-id: 90439282aea14d8634eb41ffece5b6320d615fa7 * Factored out the default mappings Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27164 Test Plan: Imported from OSS Differential Revision: D17694475 Pulled By: zafartahirov fbshipit-source-id: df8df5f7d66062ed35da957064a31344e1d3c961 * Add memory format argument to the `clone` operator (#27106) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27106 Adds memory_format option to the `clone` operator. Introduce new `clone` behavior if used with `input_t.clone(memory_format=torch.preserve_format)`: 1) If tensor is non-overlapping and dense - output tensor will have the same strides as input tensor. 2) If not (1) and tensor is stored in the channels last format, output tensor going to have channels last format. 3) Output tensor is going to be contiguous in all other cases. --- Dense tensor is the tensor that store values in a contiguous block of memory. Non-overlapping tensor is the tensor in which elements occupy individual non-repetitive memory. 
Test Plan: Imported from OSS Differential Revision: D17699357 Pulled By: VitalyFedyunin fbshipit-source-id: 5ae1537c2aca1abf0bf1eec4416846129c156f66 * Extract version to version.txt (#27149) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27149 Extract version to version.txt and add reading version logic to setup.py and fb/torch_version.py ghstack-source-id: 91271883 Test Plan: N/A Reviewed By: gchanan, ezyang Differential Revision: D17689307 fbshipit-source-id: 21899502027cec71b63d9dc151e09ff5ff3f279d * add AutoNonVariableTypeMode for USE_STATIC_DISPATCH on JIT->ATen path (#27274) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27274 This is yet another fix to address #26764. PR #26908 toggles NonVariableTypeMode in ATen dispatcher, which is where USE_STATIC_DISPATCH takes place thus it's most logically sound place to do such tweaks. However, we observed nontrivial perf regression due to this fix. Turns out the numel() tensor method gets called in several for-loops thus incurs ~7M thread_local updates in a single forward call: ``` 7173330 numel 558 size 416 q_scale 302 _empty_affine_quantized 288 contiguous 257 q_zero_point 216 qscheme 173 empty 110 set_ 105 as_strided 104 permute ... ``` As numel() is not called from a single place so a natural workaround is to update function_wrapper.py so that it only adds the guard on gen_namespace_function() case and ignore the gen_tensor_method() case. But some tensor methods are actually being called from JIT side directly (e.g. "aten::eq_" -> "(self).eq_") so the only "band aid" left on the table is to insert guard on JIT->aten path as originally did on #26868 - this is a simplified version of it as it doesn't hurt to extend the NonVariableMode scope a little bit to also cover stack drop/pack calls. On Android we only expose JIT API so we don't need worry about TensorMethods being called directly. On iOS we don't provide a wrapper yet but we can mention this caveat in the doc. Hopefully by the time it's widely used we can finish Variable/Tensor unification and remove all these hacks. Test Plan: - Verified it runs quantized/fp32 MobileNetV2 models; - Verified it fixes the perf regression (revert #26908 separately); Differential Revision: D17732489 Pulled By: ljk53 fbshipit-source-id: c14ca66aebc6b6f17ad6efac7ca47f9487c98de5 * Updating submodules Summary: GitHub commits: https://github.com/pytorch/fbgemm/commit/8786c0819029c076b0e28320e880ba3ac192ea8b Test Plan: n/a Reviewed By: zpao fbshipit-source-id: 9c04a2ba7cc2166db0203f186ece261ca8b186dd * Avoid calling tensor.numel() in for loops (#27298) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27298 PR #26908 toggles NonVariableTypeMode in ATen dispatcher, which is where USE_STATIC_DISPATCH takes place. This causes an issue with numel() as it gets called through the dispatch mode and probably not getting inlined. Also the thread local state is expensive to read/write so many times and this kills perf. PR #27274 is another approach to fix this and has more details. Test Plan: Quantized mobilenetV2 perf before this change Main run finished. Milliseconds per iter: 28.6782. Iters per second: 34.8696 Perf after this change Main run finished. Milliseconds per iter: 22.2585. 
Iters per second: 44.9267 Imported from OSS Differential Revision: D17742565 fbshipit-source-id: 43c6045cc001c46916ba339555c9d809a2537eff * Fix circle CI Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27307 Test Plan: Imported from OSS Differential Revision: D17746444 Pulled By: xta0 fbshipit-source-id: ed37f91921f1ea7db6c63ba69f04883856341c39 * Update the link for iOS demo app in README.md (#27145) Summary: Update the link for iOS demo app in README.md Pull Request resolved: https://github.com/pytorch/pytorch/pull/27145 Differential Revision: D17746591 Pulled By: xta0 fbshipit-source-id: 6f49a0daddc8b79804e1b8487ba1db3807a3f481 * Allow use cpu_serial_kernel with void-lambda (#27271) Summary: Currently we use CPU_tensor_apply1 to loop through the tensor in single thread and aggregate data: ``` // compute variance per input accscalar_t var_sum = 0; CPU_tensor_apply1<scalar_t>(in, [&] (const scalar_t& i) { var_sum += (i - mean) * (i - mean); }); ``` and we don't have the ability to use TensorIterator for this. ``` accscalar_t var_sum = 0; auto iter = TensorIterator::unary_op(self, self); cpu_serial_kernel(iter, [&](scalar_t i) -> scalar_t { var_sum += (i - mean) * (i - mean); return a; //Unable to set value back, because self should be const }); ``` This PR should resolve this problem and allow to use void-lambda: ``` auto iter = at::TensorIterator(); iter.add_input(in); iter.build(); accscalar_t var_sum = 0; \ at::native::cpu_serial_kernel(iter, [&](scalar_t i) -> void { var_sum += (i - mean) * (i - mean); }); ``` In the future it make sense to change Reduction part and allow to reduce to a scalar, not just to a tensor Pull Request resolved: https://github.com/pytorch/pytorch/pull/27271 Differential Revision: D17743310 Pulled By: ifedan fbshipit-source-id: a149751f2d671aefd3ed84bd50b2c0543a63b701 * Move the CUDA implementation of log10 to ATen. (#26733) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26733 Close #24587 Test Plan: Imported from OSS Differential Revision: D17606981 Pulled By: VitalyFedyunin fbshipit-source-id: 732f07b981287da3ca235b272b7b6f78144f8ebe * Mention magma-cuda101 package in install instructions (#27325) Summary: There is a magma package for the newest CUDA verson (10.1), mention it here lest someone try to mistakenly use the version for CUDA 10.0. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27325 Differential Revision: D17749535 Pulled By: soumith fbshipit-source-id: 2d34a7af1218e6157935bfd5e03f4d2c0f00f200 * C++ API parity: TensorTest.BackwardNonScalarOutputs Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27314 Test Plan: Imported from OSS Differential Revision: D17746371 Pulled By: pbelevich fbshipit-source-id: 246fae22a60ed9a6d7b9843239b4b3391cc9dc3e * Fix build (#27318) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27318 Fix TBB build USE_TBB=1 ATEN_THREADING=TBB python setup.py develop install --cmake Test Plan: Imported from OSS Differential Revision: D17747449 Pulled By: ilia-cher fbshipit-source-id: 421f362bd10f3be34bffe86ae4f26e8f1c15f1a4 * Relax restrictions on set_num_threads (#27190) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27190 Allow set_num_threads to be called multiple times in case of TBB parallel backend Test Plan: BUILD_BINARY=1 USE_TBB=1 ATEN_THREADING=TBB python setup.py develop install --cmake ./build/bin/test_parallel ./build/bin/thread_init_test Reviewed By: kostmo Differential Revision: D17704236 Pulled By: ilia-cher fbshipit-source-id: 274380795e78ba417301c5faa18c9e9d3198bd5e * Migrate the cpu and gpu implementations of resize nearest 3D from vision to caffe2 Summary: As title. Fix the build failures in unicorn-build-restrictions as discussed in D17330625 Test Plan: buck test mode/opt caffe2/caffe2/quantization/server:resize_nearest_3d_dnnlowp_op_test In vision libs, no need to explicitly add dep to resize 3d op as the caffe2_cpu dep is added by default. Reviewed By: stephenyan1231 Differential Revision: D17676082 fbshipit-source-id: c034ab67a9078f72077b396991ffb9e54e6ab40b * Add method add_hparams to API doc (#27344) Summary: Adds the method `add_hparams` to `torch.utils.tensorboard` API docs. Will want to have this in PyTorch 1.3 release. cc sanekmelnikov lanpa natalialunova Pull Request resolved: https://github.com/pytorch/pytorch/pull/27344 Differential Revision: D17753689 Pulled By: orionr fbshipit-source-id: cc8636e0bdcf3f434444cd29471c62105491039d * Support interface python assignment as an attribute (#26734) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26734 This PR added the python assignment for interface as an attribute in the module, it enables any object that implicitly inheriting the specific interface to be able to be assigned to the interface type in python. 
Serialization support for interface/class assignment will be done in the follow up PR Test Plan: Imported from OSS Differential Revision: D17742708 Pulled By: wanchaol fbshipit-source-id: a0a2d8c74b60ed3fa6c05e1b0d49b7ad1abc670b * Skip tests that use numpy if it's not present Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27165 Pulled By: driazati Differential Revision: D17695078 fbshipit-source-id: d25c920f4c43285028537f88761d47a2c9db7b8f * Add Python RRef as args and return value (#25499) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25499 See #23110 for model parallel design details, and #26759 for the RRef protocol. This commit add support for using RRef as Python UDF arguments and return value. RRefs can now be shared from owner to user, from user to owner, or from user to user. Limitations: 1. No implicit type conversion yet. (#27099) 2. No failure handling and retry. (#26116) 3. UDF is not yet blocked until all RRefs are confirmed. (#27098) 4. Internal RRef control messages are not idempotent yet. (#26116) 5. Cannot delete RRefs correctly when there are circular dependencies. (#27096) Main changes: 1. Added `SCRIPT_REMOTE_CALL` and `PYTHON_REMOTE_CALL` to `Message.h` to represent `dist.remote` invocations. 2. Added `SCRIPT_RREF_FETCH_CALL`, `PYTHON_RREF_FETCH_CALL`, `RREF_USER_ACCEPT`, `RREF_USER_DELETE`, `RREF_CHILD_ACCEPT`, and `RREF_FORK_REQUEST` to `Message.h` as internal RRef control messages. 3. New message request handling code is added to `functions.cpp`, and message format is added in `script_remote_call.h`, `python_remote_call.h`, and `rref_proto.h`. 4. Added a `PyRRef` type in `py_rref.h` and `py_rref.cpp` which holds a shared pointer to C++ `RRef` type. `PyRRef` wraps the C++ API and also implements RRef pickling and unpickling. RRef fork related control messages will be sent during RRef pickling/unpickling procedure. 5. Update `RRef.h` and `RRef.cpp` accordingly to support `py::object` RRefs. 6. RRef context (reference count, etc.) are tracked in `rref_context.h` and `rref_context.cpp`. Test Plan: Imported from OSS buck test mode/dev-nosan //caffe2/test:rpc_fork Differential Revision: D17184146 Pulled By: mrshenli fbshipit-source-id: a3a268efc087ac1ef489136ab957080382629265 * Set MINIZ_NO_TIME to avoid computing localtime on each pickle/unpickle (#27268) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27268 For small pickle/unpickle, we spend a disproportionate amount of time in time functions - roughly 23% in __tzset() for unpickle case. We're currently not using the .m_time currently, though we can add this feature back if it's ever needed. An alternative would be to -DMINIZ_NO_TIME in compiler_flags, but we would need to also consistently # define MINIZ_NO_TIME in any .cpp including this .h, since this # define modifies the struct length in an unfortunate manner. Test Plan: buck test mode/dev-nosan caffe2/test/... 
Run benchmark: buck-out/opt/gen/caffe2/torch/fb/distributed/thriftRpcBackend/test/ThriftRpcAgentBench Differential Revision: D17724198 fbshipit-source-id: b44a0217b1d9f8ce6c0f24297f59045c7cadf4b1 * Add a test case to RpcTest, check src/dst (#27322) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27322 # Problem Existing test cases are too symmetric, so that didn't detect this error, request sent to the wrong worker. Because of wrong `worker_names` setup, worker0 sends request to itself, while it should had sent to worker1. # Solution Add a test case, letting the dst side to check if it's an request from the expected src. ghstack-source-id: 91299312 Reviewed By: satgera Differential Revision: D17069062 fbshipit-source-id: ef7a532dd497bfc0f0ee8446fcd5d29656aaf175 * Update to ROCm 2.8 (#27337) Summary: New docker images built with tag 324. Related jenkins changes: https://github.com/pytorch/ossci-job-dsl/commit/83ec81335742e66b02af90b7c74021b8792fc63f https://github.com/pytorch/ossci-job-dsl/commit/aa235a14c82db69d0544cd8fc1da03ef9a50096e Triggered CI runs: https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-devtoolset7-rocmrpm-centos7.5-trigger-test/48682/ https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-trigger/55638/ Pull Request resolved: https://github.com/pytorch/pytorch/pull/27337 Differential Revision: D17753827 Pulled By: bddppq fbshipit-source-id: 2c3f77b0b7c680013c7cc6d7953fe0da4922fe48 * add sdk support for xcodebuild script Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27358 Test Plan: Imported from OSS Differential Revision: D17757389 Pulled By: xta0 fbshipit-source-id: ed8e470b9c6329b96297ee7c65ba08759251baad * export remainder (#24410) Summary: Added ONNX export support for torch.remainder and torch.fmod Pull Request resolved: https://github.com/pytorch/pytorch/pull/24410 Reviewed By: hl475 Differential Revision: D17466791 Pulled By: houseroad fbshipit-source-id: afe6519e5f370824e3b4a45b69036a7260fb72cf * Replacing the skip_list with white_list in the qconfig propagation Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27183 Test Plan: Imported from OSS Differential Revision: D17700548 Pulled By: zafartahirov fbshipit-source-id: 18e6ffbda496b14ac1da1783f928ad539cdb1d16 * Show a warning that not all dir members of quantized work. (#27339) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27339 This PR just shows a warning message. 
Eventually we will show a correct __dir__ Test Plan: Imported from OSS Differential Revision: D17751333 Pulled By: zafartahirov fbshipit-source-id: e9bc62fd8dd0147979291d0aac3f1afe5b8c7a9f * improve error messages when a method or attribute is missing (#27110) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27110 Previously missing methods on some types like tensors would talk about 'builtins' which are only a thing inside of the compiler. Furthermore, the error would only occur when the builtin was applied and it was discovered that no builtin existed. This changes the error message so that it discovers that method on our builtin types does not exist on attribute lookup. Test Plan: Imported from OSS Differential Revision: D17677616 Pulled By: zdevito fbshipit-source-id: 2f7cf6c6093a9c832569c44f4b1044a2e56fe205 * refactor extra sugared values (#26270) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26270 We've accumulated a lot of sugared values whose only purpose is to be instanced-checked against in emitApplyExpr. I need to add another one to insert an unchecked_cast, and do not want to continue the pattern. This creates an abstraction for this concept (SpecialFormValue), and removes all the unneeded sugared values. There is no functionality change here just a bunch of code movement in compiler.cpp Test Plan: Imported from OSS Differential Revision: D17412854 Pulled By: zdevito fbshipit-source-id: 15877c91decaea5a00d1fe737ed2d0f0f8a79a28 * Minor readability fixes to C++ documentation (#27338) Summary: Changed `yieldings` to `yielding`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/27338 Differential Revision: D17758406 Pulled By: yf225 fbshipit-source-id: 1633834a6ad80449c061ebc330ac24f3e42f5506 * Choose num_threads in parallel_for based on GRAIN_SIZE (#26963) Summary: Fixes https://github.com/pytorch/pytorch/issues/24080, Continuation of https://github.com/pytorch/pytorch/issues/26886 What soumith said in https://github.com/pytorch/pytorch/pull/26886#issuecomment-535760635 seems plausible > I wonder if it has to do with `#pragma omp parallel num_threads(num_threads)` which has unintended consequences, where even if `num_threads=1`, entering an omp block inside an omp block results in bad behavior. I know for a fact that gcc's openmp doesn't start the thread pool when given `num_threads(1)` but it seems clang behaves differently. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26963 Differential Revision: D17626981 Pulled By: soumith fbshipit-source-id: 484ffe6cc172382bb5ff49ce1fceda7eba20a512 * Enable Python3.6 PyTorch ROCm CI Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27353 Differential Revision: D17758495 Pulled By: bddppq fbshipit-source-id: 95e329bc30f092e4093a33c408f1647b803d9983 * Fixes PackedSequence.to (and unifies PackedSequence conversions) (#27245) Summary: PackedSequence.to(device) incorrectly places one of three tensors on the device and leaves the other two tensors where they are. If these devices are distinct then further operations on PackedSequence will fail. This behavior is inconsistent with Tensor.to and PackedSequence's behavior when .cuda() is called. Additionally, PackedSequence defines multiple other conversion functions that were independently and inconsistently implemented. This PR unifies all implementations and makes the PackedSequence.to behavior more consistent with Tensor.to. It is not completely consistent per comments. test_device_mask in test_nn.py is updated to validate the new functionality. Pull Request resolved: https://github.com/pytorch/pytorch/pull/27245 Differential Revision: D17757850 Pulled By: mruberry fbshipit-source-id: 58f0bd40f1aa300fb0a91ee743483d645f977dc5 * Makes test_cuda.py's generated tensor op tests generic (#27210) Summary: - The tensor op tests generated in test_cuda.py are now generic and appear in test_torch,py - Data previously held in auxiliary data structures and files, like test_cuda_ignores.txt, is inlined Previously the tensor op tests used several auxiliary data structures, a file, and exception handling to filter the test suite. If a function wasn't implemented, for example, that exception would be caught. This let functions like trigamma, which isn't callable, appear to be tested. See https://github.com/pytorch/pytorch/issues/27230. Filtering from additional data stores is error prone, too. It requires developers understand what data stores are used and how they're used. The existing sources are also sometimes incorrect. The txt file claims that dist_ doesn't work on half tensors, for example, but the updated tests verify it does. In addition to making these tests generic, this PR removes those auxiliary data structures and does not catch any exceptions. Exceptions are errors. (This also means that if something implemented breaks it will now report as an error. Previously the test suite would have reported a pass.) The test infrastructure was also simplified to not perform computations with CPU half tensors since they do not support many operations. This introduces a float<->half conversion quirk but eliminates awkward functions that would first convert cpu tensors to float, perform an operation, and convert them back. With this change test_cuda.py is almost entirely CUDA-specific. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27210 Differential Revision: D17757907 Pulled By: mruberry fbshipit-source-id: b3c191c379667b1a7d5361087bdf82f397f77f65 * Remove six dependency (#27282) Summary: https://github.com/pytorch/pytorch/pull/27136 added a dependency on `six`, which is not available by default and is not marked as a dependency on PyTorch binaries, causing torchvision CI to break, see https://circleci.com/gh/pytorch/vision/20778?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link for example. This PR use `torch._six` instead of `six` as a replacement. Pull Request resolved: https://github.com/pytorch/pytorch/pull/27282 Reviewed By: lerks Differential Revision: D17737561 Pulled By: fmassa fbshipit-source-id: 7dcd0cc2c8bab27b8f4535f664f60388818d3497 * Make `align_to` method-only. (#27304) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27304 The ellipsis version of `align_to` only works if it is called as a method. To prevent any confusion, this PR disables `torch.align_to` (but keeps `Tensor.align_to`. Test Plan: - [namedtensor ci] Differential Revision: D17743809 Pulled By: zou3519 fbshipit-source-id: cf5c53dcf45ba244f61bb1e00e4853de5db6c241 * Remove CUDA_VERSION from Python script (which has already been detected in CMake) (#27316) Summary: (Intentionally left blank) Pull Request resolved: https://github.com/pytorch/pytorch/pull/27316 Differential Revision: D17762715 Pulled By: ezyang fbshipit-source-id: 044c0ea6e8c2d12912c946a9a50b934b5253d8c8 * Revert D17743310: [pytorch][PR] Allow use cpu_serial_kernel with void-lambda Test Plan: revert-hammer Differential Revision: D17743310 Original commit changeset: a149751f2d67 fbshipit-source-id: 043240201d67966dd08b7b1bc2f9bf4897923e00 * Implement pickle support for sparse tensors and torch.layout instances (#27062) Summary: Resolves issue https://github.com/pytorch/pytorch/issues/16667 and https://github.com/OpenMined/PySyft/issues/2326 Pull Request resolved: https://github.com/pytorch/pytorch/pull/27062 Differential Revision: D17762932 Pulled By: ezyang fbshipit-source-id: dd99c1f4ac8eb2286eb55aa20ce973f60ce7b7e1 * move new_zeros to core from THP (#26511) Summary: Fix for issue https://github.com/pytorch/pytorch/issues/25831 ezyang can you please have a look? 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26511 Differential Revision: D17763037 Pulled By: ezyang fbshipit-source-id: 3596c01c4ab421e7785d6055cc813806f840a5c7 * autograd: double backwards function for binary_cross_entropy loss Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26983 Reviewed By: albanD Differential Revision: D17714357 Pulled By: anjali411 fbshipit-source-id: cebfe09a9048c4be457b7f2718bc396c06ecabee * Change schedulers to chainable form (#26423) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26423 Enable chainable schedulers as requested in #13022 by implementing the changes mentioned below from [comment](https://github.com/pytorch/pytorch/pull/21800#issuecomment-513370208). * Changing the behavior of schedulers to the chainable formula when available * Using the closed form whenever epoch is different from None until the next release with a deprecation warning * Making `get_computed_values` the supported way of obtaining the last computed learning rate by the scheduler (see [comment](https://github.com/pytorch/pytorch/pull/21800#issuecomment-513940729) for new syntax) * Returning a deprecation warning when invoking the undocumented get_lr function (see [comment](https://github.com/pytorch/pytorch/pull/21800#discussion_r294305485)) referring to `get_computed_values`, and deprecating it in the next release. * `CosineAnnealingWarmRestart` still takes an epoch parameter as it is the only one with a mechanic relying on fractional epoch * `MultiplicativeLR` is consumes a function providing the multiplicative factor at each epoch. It mimics `LambdaLR` in its syntax. # #20527 ### Before The user calls scheduler with a constant epoch either across loops or in the same loop. 
``` import torch.optim as optim from torch import nn conv = nn.Conv2d(3,3,3) optimizer = optim.Adam(conv.parameters()) lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2) # Scheduler with sometimes-constant epoch number for epoch in [0, 0, 1, 1, 2, 2, 3, 3]: lr_scheduler.step(epoch) print(optimizer.param_groups[0]['lr']) ``` ### After If the user wants to step ``` import torch.optim as optim from torch import nn conv = nn.Conv2d(3,3,3) optimizer = optim.Adam(conv.parameters()) lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2) last_epoch = -1 for epoch in [0, 0, 1, 1, 2, 2, 3, 3]: # Check if epoch number has changed manually if epoch-last_epoch > 0: lr_scheduler.step() last_epoch = epoch print(epoch, scheduler.get_computed_values()) ``` # #22107 ### Before ``` import torch from torchvision.models import resnet18 net = resnet18() optimizer = torch.optim.SGD(net.parameters(), 0.1) scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1) scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1) for i in range(10): # Scheduler computes and returns new learning rate, leading to unexpected behavior print(i, scheduler.get_lr()) scheduler.step() ``` ### After ``` import torch from torchvision.models import resnet18 net = resnet18() optimizer = torch.optim.SGD(net.parameters(), 0.1) lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1) lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1) for i in range(10): # Returns last computed learning rate by scheduler print(i, lr_scheduler.get_computed_values()) lr_scheduler.step() ``` # ghstack This contains the changes from #24352. Opening again since they were reverted. This reverts commit 1c477b7e1f378e9c1f8efed296241f68a8a4372b. Test Plan: Imported from OSS Differential Revision: D17460427 Pulled By: vincentqb fbshipit-source-id: 8c10f4e7246d6756ac91df734e8bed65bdef63c9 * Make RpcTest re-usable by other RPC backends by using init_method to initialize a RPC backend (#27320) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27320 https://github.com/pytorch/pytorch/pull/27208/ # Problem Other RPC backends take init_method. # Solution Set up init_method in rpc tests. ghstack-source-id: 91335127 Differential Revision: D17709219 fbshipit-source-id: 3184c6e9b922a6ff9f4d1cb9abfa118b23f43eeb * Add OPN instruction and vararg operator table (#27104) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27104 * The use case here is to replace prim::ListConstruct, which requires Node, but Node is not available in mobile lite interpreter. * (OPN, X, N), X is the index to the vararg operator-name and operator tables. N is number of inputs. For ListConstruct example, operator name can be "aten::listconstruct" and the overloaded name is the output type ("int", "float", "bool", "tensor" and "generic"). * A vararg operator table is built with void(int input_size, Stack& stack) functions. ## Unit test LiteInterpreterConv covers OPN instruction and conv operator. 
Test Plan: Imported from OSS Differential Revision: D17762853 fbshipit-source-id: 475aa0c6678e3760cec805862a78510913a89c83 * Allow use cpu_serial_kernel with void-lambda (#27370) Summary: https://github.com/pytorch/pytorch/pull/27271 Pull Request resolved: https://github.com/pytorch/pytorch/pull/27370 Differential Revision: D17763265 Pulled By: ifedan fbshipit-source-id: d670560dfc555db529b18c01aa42f0ccb2127889 * From docs of scatter_add_() removed erroneous comment on uniqueness of indices. (#27132) Summary: Fixes https://github.com/pytorch/pytorch/issues/27080 Pull Request resolved: https://github.com/pytorch/pytorch/pull/27132 Differential Revision: D17765307 Pulled By: soumith fbshipit-source-id: b0892ff442f3b49f8e3cdf029e2a08b51fa88f28 * Reduce error context from 10 -> 3 (#26765) Summary: 10 lines of error context (on both sides) is overkill, especially now that we have line numbers. With a compilation stack of a couple functions, it becomes a pain to scroll to the top of the stack to see the real error every time. This also fixes class names in the compilation stack to a format of `ClassName.method_name` instead of the the full qualified name Old output ``` clip_boxes_to_image(Tensor boxes, (int, int) size) -> (Tensor): Expected a value of type 'Tuple[int, int]' for argument 'size' but instead found type 'Tuple[int, int, int]'. : at /home/davidriazati/dev/vision/torchvision/models/detection/rpn.py:365:20 top_n_idx = self._get_top_n_idx(objectness, num_anchors_per_level) batch_idx = torch.arange(num_images, device=device)[:, None] objectness = objectness[batch_idx, top_n_idx] levels = levels[batch_idx, top_n_idx] proposals = proposals[batch_idx, top_n_idx] final_boxes = [] final_scores = [] for boxes, scores, lvl, img_shape in zip(proposals, objectness, levels, image_shapes): boxes = box_ops.clip_boxes_to_image(boxes, img_shape) ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE keep = box_ops.remove_small_boxes(boxes, self.min_size) boxes, scores, lvl = boxes[keep], scores[keep], lvl[keep] # non-maximum suppression, independently done per level keep = box_ops.batched_nms(boxes, scores, lvl, self.nms_thresh) # keep only topk scoring predictions keep = keep[:self.post_nms_top_n] boxes, scores = boxes[keep], scores[keep] final_boxes.append(boxes) final_scores.append(scores) 'RegionProposalNetwork.filter_proposals' is being compiled since it was called from 'RegionProposalNetwork.forward' at /home/davidriazati/dev/vision/torchvision/models/detection/rpn.py:446:8 num_images = len(anchors) num_anchors_per_level = [o[0].numel() for o in objectness] objectness, pred_bbox_deltas = \ concat_box_prediction_layers(objectness, pred_bbox_deltas) # apply pred_bbox_deltas to anchors to obtain the decoded proposals # note that we detach the deltas because Faster R-CNN do not backprop through # the proposals proposals = self.box_coder.decode(pred_bbox_deltas.detach(), anchors) proposals = proposals.view(num_images, -1, 4) boxes, scores = self.filter_proposals(proposals, objectness, images.image_sizes, num_anchors_per_level) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE losses = {} if self.training: assert targets is not None labels, matched_gt_boxes = self.assign_targets_to_anchors(anchors, 
targets) regression_targets = self.box_coder.encode(matched_gt_boxes, anchors) loss_objectness, loss_rpn_box_reg = self.compute_loss( objectness, pred_bbox_deltas, labels, regression_targets) losses = { 'RegionProposalNetwork.forward' is being compiled since it was called from 'MaskRCNN.forward' at /home/davidriazati/dev/vision/torchvision/models/detection/generalized_rcnn.py:53:8 """ if self.training and targets is None: raise ValueError("In training mode, targets should be passed") original_image_sizes = [(img.shape[-2], img.shape[-3]) for img in images] images, targets = self.transform(images, targets) features = self.backbone(images.tensors) if isinstance(features, torch.Tensor): features = OrderedDict([(0, features)]) proposals, proposal_losses = self.rpn(images, features, targets) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) detections = self.transform.postprocess(detections, images.image_sizes, original_image_sizes) losses = {} losses.update(detector_losses) losses.update(proposal_losses) # TODO: multiple return types?? # if self.training: ``` New output ``` RuntimeError: clip_boxes_to_image(Tensor boxes, (int, int) size) -> (Tensor): Expected a value of type 'Tuple[int, int]' for argument 'size' but instead found type 'Tuple[int, int, int]'. : at /home/davidriazati/dev/vision/torchvision/models/detection/rpn.py:365:20 final_scores = [] for boxes, scores, lvl, img_shape in zip(proposals, objectness, levels, image_shapes): boxes = box_ops.clip_boxes_to_image(boxes, img_shape) ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE keep = box_ops.remove_small_boxes(boxes, self.min_size) boxes, scores, lvl = boxes[keep], scores[keep], lvl[keep] 'RegionProposalNetwork.filter_proposals' is being compiled since it was called from 'RegionProposalNetwork.forward' at /home/davidriazati/dev/vision/torchvision/models/detection/rpn.py:446:8 proposals = self.box_coder.decode(pred_bbox_deltas.detach(), anchors) proposals = proposals.view(num_images, -1, 4) boxes, scores = self.filter_proposals(proposals, objectness, images.image_sizes, num_anchors_per_level) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE losses = {} 'RegionProposalNetwork.forward' is being compiled since it was called from 'MaskRCNN.forward' at /home/davidriazati/dev/vision/torchvision/models/detection/generalized_rcnn.py:53:8 if isinstance(features, torch.Tensor): features = OrderedDict([(0, features)]) proposals, proposal_losses = self.rpn(images, features, targets) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets) detections = self.transform.postprocess ``` ](https://our.intern.facebook.com/intern/diff/17560963/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/26765 Pulled By: driazati Differential Revision: D17560963 fbshipit-source-id: e463548744b505ca17f0158079b80e08fda47d49 * Fix some return std::move warnings (#27384) Summary: clang-tidy was complaining about these Pull Request resolved: https://github.com/pytorch/pytorch/pull/27384 Pulled By: driazati Differential Revision: D17767412 fbshipit-source-id: 
03e2630790edf3f6bbf9064e754156613032b464 * add function to get nccl version for error messages (#27068) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27068 Adds a function that uses ncclGetVersion from the NCCL API to retrieve the NCCL version. Converts it into a readable string, and is called in NCCL-related error messages to log the NCCL version. Hopefully this will help with debugging NCCL errors. Test Plan: Modify C10D_NCCL_CHECK in NCCLUtils.hpp to always error by setting ncclResult_t error = ncclSystemError force an NCCL error with script test/simulate_nccl_errors.py: Start master node: python test/simulate_nccl_errors.py localhost 9124 0 2 Start other node: python test/simulate_nccl_errors.py localhost 9124 1 2 On the master node, should see the following error message w/NCCL version: ``` Traceback (most recent call last): File "simulate_nccl_errors.py", line 29, in <module> process_group.allreduce(torch.rand(10).cuda(rank)).wait() RuntimeError: NCCL error in: ../torch/lib/c10d/ProcessGroupNCCL.cpp:375, unhandled system error, NCCL version 2.4.8 ``` Differential Revision: D17639476 fbshipit-source-id: a2f558ad9e883b6be173cfe758ec56cf140bc1ee * C++ API parity: Hardtanh Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27038 Test Plan: Imported from OSS Differential Revision: D17682405 Pulled By: pbelevich fbshipit-source-id: f65e76696e0041c3518f56da94f2e3b800305234 * fix OSX CI build (#27373) Summary: fix OSX caffe2 CI build, attempt 1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/27373 Differential Revision: D17768461 Pulled By: soumith fbshipit-source-id: b0a076c07382327730b5d86b8a00f5388c368b5e * ProcessGroupNCCL should respect timeout passed in to init_process_group. (#27224) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27224 As part of adding error handling to NCCL, we are now able to specify a timeout for operations using ProcessGroupNCCL. Although, this timeout had a default of 10 seconds and didn't respect the timeout specified in init_process_group. In this change, I've ensured we pass the appropriate timeout to ProcessGroupNCCL. ghstack-source-id: 91283548 Test Plan: Added unit test to verify timeout passed in to init_process_group is respected. Differential Revision: D17717992 fbshipit-source-id: c73320187f1f3b2693ba1e177d80646e282d01a2 * Add clip_grad_norm_ to c++ api (#26140) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26140 Per https://github.com/pytorch/pytorch/issues/25883, we want to work towards C++/Python API parity. This diff adds clip_grad_norm_ to the c++ API to improve parity. 
ghstack-source-id: 91334333 ghstack-source-id: 91334333 Test Plan: Added a unit test Differential Revision: D17312367 fbshipit-source-id: 753ba3a4d084d01f3cc8919da3108e67c809ad65 * C++ API parity: LeakyReLU Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27059 Test Plan: Imported from OSS Differential Revision: D17682407 Pulled By: pbelevich fbshipit-source-id: 2a4f42e9438799ba8de7282ac7a6fd3ff97ee048 * Some hipify script cleanups (#27375) Summary: continue https://github.com/pytorch/pytorch/issues/26363 Pull Request resolved: https://github.com/pytorch/pytorch/pull/27375 Differential Revision: D17764992 Pulled By: bddppq fbshipit-source-id: ecc06521179677efcedb1d58ceda63df7d63627e * add some support for the occupancy API on ROCm (#27390) Summary: Unfortunately, the HIP function takes uint32_t* instead of int*, so we still need to ifdef for the time being. Pull Request resolved: https://github.com/pytorch/pytorch/pull/27390 Differential Revision: D17768832 Pulled By: bddppq fbshipit-source-id: c65176660cb0783a04f0a4a064f686818d759589 * Add gfx908 to the list of per-default compiled architectures. (#27388) Summary: ROCm 2.8 added preliminary support for gfx908. Pull Request resolved: https://github.com/pytorch/pytorch/pull/27388 Differential Revision: D17767772 Pulled By: bddppq fbshipit-source-id: 172daf5bb66d3db86a13e287059af4b9b90a7f57 * Change nightly builds version to 1.4.0-SNAPSHOT (#27381) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27381 Changing android nightly builds from master to version 1.4.0-SNAPSHOT, as we also have 1.3.0-SNAPSHOT from the branch v1.3.0 Test Plan: Imported from OSS Differential Revision: D17773620 Pulled By: IvanKobzarev fbshipit-source-id: c39a1dbf5e06f79c25367c3bc602cc8ce42cd939 * Pickup proxy parameters for publishing (#27389) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27389 Pickup gradle proxy parameters (handy for publishing from devserver) in maven publishing gradle plugin Test Plan: Imported from OSS Differential Revision: D17773548 Pulled By: IvanKobzarev fbshipit-source-id: 662c0b2835e6cf1e4009da79e27268d4a19c2ceb * MovingAverage Observer (#27396) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27396 Observer that estimates moving averages of min and max values per batch, more suited for quantization aware training instead of minmax observers that track extremal values across batches ghstack-source-id: 91369018 Test Plan: buck test caffe2/test:quantization -- 'test_per_tensor_observers \(test_quantization\.ObserverTest\)' --print-passing-details buck test caffe2/test:quantization -- 'test_per_channel_observers \(test_quantization\.ObserverTest\)' --print-passing-details Differential Revision: D17727213 fbshipit-source-id: 024a890bf3dd0bf269d8bfe61f19871d027326f0 * Add methods to write image tensor content to buffer (#27359) Summary: Pull Request resolved: 
https://github.com/pytorch/pytorch/pull/27359 Adding methods to TensorImageUtils: ``` bitmapToFloatBuffer(..., FloatBuffer outBuffer, int outBufferOffset) imageYUV420CenterCropToFloat32Tensor(..., FloatBuffer outBuffer, int outBufferOffset) ``` To be able to - reuse FloatBuffer for inference - to create batch-Tensor (contains several images/bitmaps) As we reuse FloatBuffer for example demo app - image classification, profiler shows less memory allocations (before that for every run we created new input tensor with newly allocated FloatBuffer) and ~-20ms on my PixelXL Known open question: At the moment every tensor element is written separatly calling `outBuffer.put()`, which is native call crossing lang boundaries As an alternative - to allocation `float[]` on java side and fill it and put it in `outBuffer` with one call, reducing native calls, but increasing memory allocation on java side. Tested locally just eyeballing durations - have not noticed big difference - decided to go with less memory allocations. Will be good to merge into 1.3.0, but if not - demo app can use snapshot dependencies with this change. PR with integration to demo app: https://github.com/pytorch/android-demo-app/pull/6 Test Plan: Imported from OSS Differential Revision: D17758621 Pulled By: IvanKobzarev fbshipit-source-id: b4f1a068789279002d7ecc0bc680111f781bf980 * add warning to dnnlowp fc if quantization kind is not min_max Summary: Print warning when using DNNLOWP dynamic int8 quant for FC and activation_quantization_kind != min_max. Warning will display in console but not in Bento. Would have to use CAFFE_ENFORCE to alert in Bento. Test Plan: buck run unit test forcing DNNLOWP FC with activation_quantization_kind = "l2" and saw warning printed in console. Reviewed By: csummersea Differential Revision: D17770921 fbshipit-source-id: b6532e4c9a86d74e3db4cb432735505d378a366e * Add interface/object serialization as module attribute (#26770) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26770 This PR added the interface/object serialization as module attribute, to allow initializing object as a interface type during python initialization. 
Because interface type can be backed by any class object that implements that interface, if we declare it in python/module.__init__, we will need to collect the run time types of the value and serialize them to ensure complete code information Test Plan: Imported from OSS Differential Revision: D17742707 fbshipit-source-id: 7f614ad4f982996d320a0e2dd3515bf47370e730 * Adding docstrings for nnq.functional Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27363 Test Plan: Imported from OSS Differential Revision: D17758907 Pulled By: zafartahirov fbshipit-source-id: f560f2726cf51ceebdbf22ebef2d067422340cf2 * Enable RCCL in ROCm build (#27383) Summary: continues https://github.com/pytorch/pytorch/pull/23884 Pull Request resolved: https://github.com/pytorch/pytorch/pull/27383 Differential Revision: D17767248 Pulled By: bddppq fbshipit-source-id: 3a506844ca6f01d7bbe8be5bde0976999e3a2b90 * Add randomFill to test_utils.h Summary: Add helper function randomFill to test_utils.h so we can use it in benchmark scrips as well tests. Test Plan: ``` buck run mode/opt //tvm/sparse:cblas_bench ``` Reviewed By: yinghai Differential Revision: D17759193 fbshipit-source-id: e4909b04e83ca9382ab4718855fb63743d028de1 * Use deepcopy inputs for ONNX ort test cases (#27186) Summary: Running models with inplace operators will change values of input tensors. Deepcopy input tensors each time to keep the original input tensors intact. Pull Request resolved: https://github.com/pytorch/pytorch/pull/27186 Differential Revision: D17776598 Pulled By: jerryzh168 fbshipit-source-id: d4808a11185a9ab0d782a62d7d708dfe7e94559c * Remove dependency on six from dist_autograd_test.py Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27369 Test Plan: Imported from OSS Differential Revision: D17763104 Pulled By: mrshenli fbshipit-source-id: dd146809686e7720f2b77012eebb6aed72851556 * Docstring fix (#27225) Summary: Correcting docstring for `add_image_with_boxes` method. Fixed spelling mistake. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27225 Differential Revision: D17776604 Pulled By: jerryzh168 fbshipit-source-id: 45f69643ec3b58c46b9fb67411c42a6d09b7290e

* Tweak docs on building docs Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27364 Differential Revision: D17777402 Pulled By: dzhulgakov fbshipit-source-id: 304c678e5c80d7f8c779d65c11f9bf1b0facdb52

* Upgrade to ROCm 2.9 (#27417) Summary: New docker images built with tag 325: https://ci.pytorch.org/jenkins/job/caffe2-docker-trigger/325 Related ossci-job-dsl commits: https://github.com/pytorch/ossci-job-dsl/commit/a00a76f927944aed961a3bbbc4f17aff0fc30d71 Pull Request resolved: https://github.com/pytorch/pytorch/pull/27417 Differential Revision: D17777517 Pulled By: bddppq fbshipit-source-id: a6b8cb86b37f537d402f6d2c7d28ad28a6a5a317

* enable rocTX API (#27416) Summary: ROCm 2.9 brings support for the rocTX API through rocTracer. Pull Request resolved: https://github.com/pytorch/pytorch/pull/27416 Differential Revision: D17777480 Pulled By: bddppq fbshipit-source-id: 6bce9b54c94e5b4c5787570d2b85736882bd23a7

* C++ API parity: LogSigmoid Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27060 Test Plan: Imported from OSS Differential Revision: D17682404 Pulled By: pbelevich fbshipit-source-id: d60d64cd4caf1f56a2e05c516f91321d46ec9624

* Remove Tensor.h, TensorMethods.h from src/core. (#27086) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27086 This is a major source of merge conflicts, and AFAICT isn't necessary anymore (it may have been necessary for some mobile build stuff in the past). This is a commandeer of #25031 Test Plan: Imported from OSS Reviewed By: ljk53 Differential Revision: D17687345 Pulled By: ezyang fbshipit-source-id: bf6131af835ed1f9e3c10699c81d4454a240445f

* Remove outdated note in cholesky_solve and triangular_solve doc strings (#26989) Summary: We do support inputs with dim > 2 in _out variants Pull Request resolved: https://github.com/pytorch/pytorch/pull/26989 Differential Revision: D17785632 Pulled By: soumith fbshipit-source-id: d42ba7ca9c225ad1a26ff3b410d0c5c08eaed001

* Disable tsan for test_multiprocessing. (#27410) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27410 Similar to https://github.com/pytorch/pytorch/pull/25005, TSAN is not safe to use in a multi-threaded program with fork and can cause deadlocks. As a result, disabling this test for TSAN.
ghstack-source-id: 91393545 Test Plan: buildbot Differential Revision: D17775141 fbshipit-source-id: 109b8095240ad43ee4a6380f70b9efca863c0a4a

* Unfold export (#24970) Summary: ONNX export for Unfold in symbolic opset9 + op and ORT tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/24970 Reviewed By: hl475 Differential Revision: D17495106 Pulled By: houseroad fbshipit-source-id: fcd179a1213c0f219628f25c09e66fcfe4c5df50

* Reduce special casing around 'training' (#27109) Summary: Most of this was old cruft left over from special handling of `training` before we had a `bool` type. This makes all modules have a `training` attribute that is true by default and removes all other special handling. Fixes #26884 (https://our.intern.facebook.com/intern/diff/17728129/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/27109 Pulled By: driazati Differential Revision: D17728129 fbshipit-source-id: 8ddc9fbb07a953dd05529538bfdd01ed88b5cb57

* Put metrics back to torch.utils.tensorboard, similar to what we have in TensorboardX Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27252 Test Plan: Check metrics in the Scuba table: https://fburl.com/scuba/k5x8yosj Reviewed By: sanekmelnikov Differential Revision: D17723414 fbshipit-source-id: 64d42e0b4582f635d38f38feb2b2a6c4826f2065

* Automatic update of fbcode/onnx to 2891e1459745933f4bba9a8cb3371cf3c9eb1d16 (#27474) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27474 Previous import was 034921bd574cc84906b7996c07873454b7dd4135 Included changes:
- **[2891e145](https://github.com/onnx/onnx/commit/2891e145)**: Fix Unique unit test (#2381) <Scott McKay>
- **[25cf73e5](https://github.com/onnx/onnx/commit/25cf73e5)**: update shapeInference h file link (#2369) <prcvih>
- **[e3074bc0](https://github.com/onnx/onnx/commit/e3074bc0)**: modify file path (#2378) <prcvih>
- **[9058d3a4](https://github.com/onnx/onnx/commit/9058d3a4)**: Incrementing version number to 1.6.0 (#2353) (#2385) <Kevin Chen>
- **[c963586d](https://github.com/onnx/onnx/commit/c963586d)**: Remove typing packages from test requirements (#2375) <Aiken Cairncross>

Test Plan: ci Reviewed By: bddppq Differential Revision: D17791527 fbshipit-source-id: 23ad5abe313cd4e4eedcbe7794b98450b3b7d3bc

* Fixed Select symbolic to export slice when index = negative one (#25273) Summary: Exporting torch.select when index = negative one (x[:,-1]) was broken. This PR has the fix in the symbolic function for select.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25273 Reviewed By: hl475 Differential Revision: D17159707 Pulled By: houseroad fbshipit-source-id: 2c3b275421082758f1b63c1c9b6e578f03ca9f76

* Avoid variable shadowing in ``::at::philox_engine::single_round()`` (#27486) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27486 Rename `key` argument of `single_round` method to `in_key` Test Plan: CI Reviewed By: stepancheg, soumith Differential Revision: D17782904 fbshipit-source-id: 6feae55c407f39d41db099b013dcbd3990768603

* Refactor python_android test to separate Android-specific components (#27453) Summary: All of the test cases move into a base class that is extended by the instrumentation test and a new "HostTests" class that can be run in normal Java. (Some changes to the build script and dependencies are required before the host test can actually run.) ghstack-source-id: fe1165b513241b92c5f4a81447f5e184b3bfc75e Pull Request resolved: https://github.com/pytorch/pytorch/pull/27453 Test Plan: Imported from OSS Reviewed By: IvanKobzarev Differential Revision: D17800410 fbshipit-source-id: 1184f0caebdfa219f4ccd1464c67826ac0220181

* Various cleanups to pytorch_android API (#27454) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27454 See detailed discussion at https://github.com/pytorch/pytorch/issues/27350 Test Plan: Imported from OSS Reviewed By: IvanKobzarev Differential Revision: D17800480 Pulled By: dreiss fbshipit-source-id: bf174e8b16231b89be771de0fa54c41e864a3eb0

* Clean up JavaDoc comments in pytorch_android Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27455 Test Plan: Imported from OSS Differential Revision: D17800658 Pulled By: dreiss fbshipit-source-id: dbd01d9fa5ac82c50daf54c2869dc18be233d8dd

* FunctionEventAvg implements __iadd__ interface (#27498) Summary: Resolving issue https://github.com/pytorch/pytorch/issues/26433 by making FunctionEventAvg implement the `__iadd__` interface again, like it used to.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27498 Differential Revision: D17801918 Pulled By: ezyang fbshipit-source-id: 0597059c903ac168ed64a05ac1decff3ffd14f06

* Move hipify to torch/utils to bundle it into the torch package (#27425) Summary: Similar to https://github.com/pytorch/pytorch/pull/27418 but try to put it under the "torch" namespace Pull Request resolved: https://github.com/pytorch/pytorch/pull/27425 Differential Revision: D17779490 Pulled By: bddppq fbshipit-source-id: 688338d143509b37dfc110df17af3331db48a42b

* Ensure NCCL error handling code is disabled for NCCL versions < 2.4 (#27124) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27124 ncclCommAbort() and ncclGetAsyncError() were two APIs added in NCCL 2.4 to detect errors in NCCL communicators. These were used as part of ProcessGroupNCCL and we also enforced that only NCCL versions 2.4+ were supported. Although there is still legitimate use for older NCCL versions, and hence we should still support those. For that purpose, in this change I've ensured we disable NCCL error checking for versions < 2.4. ghstack-source-id: 91452959 Test Plan: 1) Test with 2.4.8 2) Test with 2.2.13 3) unit tests. Differential Revision: D17178988 fbshipit-source-id: 5dc44b5f7b4b00466c67fd452315f1d4f5c47698

* #include <stdexcept> into flat_hash_map.h (#27478) Summary: Fixing https://github.com/pytorch/pytorch/issues/27266 In general we should not rely on transitively included headers; we should explicitly include all headers whose members are used in the source file.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27478 Differential Revision: D17799522 Pulled By: pbelevich fbshipit-source-id: 5818394a212c947cfac3a6cf042af9ebb8b9d9a0

* Fix broken name mangling Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27511 Test Plan: Imported from OSS Differential Revision: D17801185 Pulled By: jamesr66a fbshipit-source-id: 3eaa9542a445c9401f3f96e11138ec09b0d8350a

* Updating submodules Summary: GitHub commits: https://github.com/facebook/fbthrift/commit/e80ecd1d63c956ed34b257fbd1aaef73ef8eb781 https://github.com/facebook/proxygen/commit/6c7a36b1b3f2825fd30ba00c708ec5ceaa5db760 https://github.com/facebookincubator/mvfst/commit/875046204325f9bd8cc5343b98a8fa4b99187a3c https://github.com/facebook/proxygen/commit/442d7def679c297427f5d0b679685db92fe3d28c https://github.com/facebook/wangle/commit/c138dc3d2c0c4f4f68ab4931e44b87a6becb194c https://github.com/facebookincubator/fizz/commit/3833f10989711256704260a01e0c9f7d1c33e468 https://github.com/facebookincubator/katran/commit/6fc473d5304985aa31d351c6305904e80af4b614 https://github.com/pytorch/fbgemm/commit/82d259dade58e53775a534f88b7b48e760f09a64 Test Plan: n/a Reviewed By: 2d2d2d2d2d fbshipit-source-id: 7834a4a8620d0ab9b60060e0abadfba457fb2890

* Revert D17159707: [pytorch][PR] [ONNX] Fixed Select symbolic to export slice when index = negative one Test Plan: revert-hammer Differential Revision: D17159707 Original commit changeset: 2c3b27542108 fbshipit-source-id: accce910abdbe13270d0f592810a48b1dabe4b01

* Roll master to 1.4.0 (#27374) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27374 Signed-off-by: Edward Z. Yang <[email protected]> Test Plan: Imported from OSS Differential Revision: D17809770 Pulled By: ezyang fbshipit-source-id: 75bd97426494a7bbbf08f9bce7563d35871443d8

* Exponential decay of the weight of task loss (#27508) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27508 Implemented a simple exponential decay of the weight of lr loss function, with a lower bound. Test Plan: buck test //caffe2/caffe2/fb/dper/layer_models/tests:mtml_test -- test_task_weight_decay https://our.intern.facebook.com/intern/testinfra/testrun/3377699729136308 canary: f140103452 Reviewed By: chenshouyuan Differential Revision: D17524101 fbshipit-source-id: 9a653e21a4ecb74dfc4ac949c9e3388f36ef3a20

* docstring only formatting changes: quantize.py, fake_quantize.py, observer.…
Summary: Pull Request resolved: pytorch#26423

Enable chainable schedulers as requested in pytorch#13022 by implementing the changes listed below from [comment](pytorch#21800 (comment)).

* Changing the behavior of schedulers to the chainable formula when available
* Using the closed form whenever `epoch` is different from None until the next release, with a deprecation warning
* Making `get_computed_values` the supported way of obtaining the last learning rate computed by the scheduler (see [comment](pytorch#21800 (comment)) for the new syntax)
* Returning a deprecation warning when invoking the undocumented `get_lr` function (see [comment](pytorch#21800 (comment))) referring to `get_computed_values`, and deprecating it in the next release
* `CosineAnnealingWarmRestarts` still takes an epoch parameter, as it is the only scheduler whose mechanism relies on fractional epochs
* `MultiplicativeLR` consumes a function providing the multiplicative factor at each epoch. It mimics `LambdaLR` in its syntax.

# pytorch#20527

### Before

The user calls the scheduler with a constant epoch, either across loops or in the same loop.

```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3, 3, 3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)

# Scheduler with a sometimes-constant epoch number
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    lr_scheduler.step(epoch)
    print(optimizer.param_groups[0]['lr'])
```

### After

If the user wants to step the scheduler only when the epoch number changes, the epoch has to be tracked manually.

```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3, 3, 3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)

last_epoch = -1
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    # Check manually whether the epoch number has changed
    if epoch - last_epoch > 0:
        lr_scheduler.step()
    last_epoch = epoch
    print(epoch, lr_scheduler.get_computed_values())
```

# pytorch#22107

### Before

```
import torch
from torchvision.models import resnet18

net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)

for i in range(10):
    # Scheduler computes and returns the new learning rate, leading to unexpected behavior
    print(i, scheduler.get_lr())
    scheduler.step()
```

### After

```
import torch
from torchvision.models import resnet18

net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)

for i in range(10):
    # Returns the last learning rate computed by the scheduler
    print(i, lr_scheduler.get_computed_values())
    lr_scheduler.step()
```

# ghstack

This contains the changes from pytorch#24352. Opening again since they were reverted.

This reverts commit 1c477b7.

Test Plan: Imported from OSS

Differential Revision: D17460427

Pulled By: vincentqb

fbshipit-source-id: 8c10f4e7246d6756ac91df734e8bed65bdef63c9
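For illustration, here is a minimal sketch of how the proposed `MultiplicativeLR` might be used, assuming it mirrors `LambdaLR`'s constructor as described above; the `lr_lambda` keyword and the 0.95 factor are assumptions for the example, not part of this PR.

```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3, 3, 3)
optimizer = optim.Adam(conv.parameters(), lr=0.1)

# Hypothetical usage: the provided function returns the multiplicative
# factor applied to the learning rate at each epoch.
lr_scheduler = optim.lr_scheduler.MultiplicativeLR(optimizer, lr_lambda=lambda epoch: 0.95)

for epoch in range(5):
    # train(...)
    lr_scheduler.step()
    # Read the current rate from the optimizer's param groups
    print(epoch, optimizer.param_groups[0]['lr'])
```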
We want to let the user combine schedulers, as described in #13022 and here. The mechanism for combining schedulers implemented in #14010 results in unexpected behavior for the user, see #20527. We are planning on reinstating #14010 with the following modifications: when the user passes an epoch value different from the upcoming one, a warning is shown and the closed form of the scheduler is used when available. A sketch of the intended chaining behavior follows.
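A minimal sketch of the chaining this change aims to enable, assuming the recursive formula where each scheduler's `step()` rescales the learning rate currently stored in the optimizer; the learning rate is read back through `optimizer.param_groups` rather than the still-evolving scheduler API.

```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3, 3, 3)
optimizer = optim.Adam(conv.parameters(), lr=1.0)

# Two schedulers chained on the same optimizer: with the recursive formula,
# each step() multiplies the current learning rate, so the effects compose.
step_lr = optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.1)
exp_lr = optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)

for epoch in range(6):
    # train(...)
    step_lr.step()
    exp_lr.step()
    print(epoch, optimizer.param_groups[0]['lr'])
```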
This PR also defines the behavior of `get_lr` to return the last computed value of a scheduler, to address #22107.

Stack from ghstack:
This reverts commit 3889855.
Differential Revision: D15833938