The gradient shape is not adjusted to follow broadcasting, so autograd fails. The backward pass of both of these fails, for example:
torch.randn(1) * torch.randn(5,4)
torch.randn(4,1) * torch.randn(1,4)
A repro showing the grad/variable shape mismatch:
import torch
from torch.autograd import Variable

# no broadcasting: expand_as makes the shapes match explicitly
a = Variable(torch.randn(1), requires_grad=True)
b = Variable(torch.randn(5, 4), requires_grad=True)
(a.expand_as(b) * b).sum().backward()
print('a:', a.size(), a.grad.size())
print('b:', b.size(), b.grad.size())

# broadcasting failure case
a = Variable(torch.randn(1), requires_grad=True)
b = Variable(torch.randn(5, 4), requires_grad=True)
(a * b).sum().backward()
print('a:', a.size(), a.grad.size())  # <-- grad size differs from the tensor's size here
print('b:', b.size(), b.grad.size())
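
Mathematically, the gradient with respect to a broadcast input is the gradient of the expanded result summed over the broadcast dimensions. Below is a minimal sketch of that reduction; the helper name reduce_grad is hypothetical (later PyTorch releases expose this operation as Tensor.sum_to_size):

import torch

def reduce_grad(grad, shape):
    # Hypothetical helper: sum the incoming gradient over every dimension
    # that broadcasting added or expanded, so it matches the input's shape.
    # First sum away leading dimensions that broadcasting prepended.
    while grad.dim() > len(shape):
        grad = grad.sum(0)
    # Then sum (keeping the dim) over dimensions the input held at size 1.
    for i, s in enumerate(shape):
        if s == 1 and grad.size(i) != 1:
            grad = grad.sum(i, keepdim=True)
    return grad

# The (5, 4) gradient of (a * b).sum() reduces to a's shape, (1,):
g = torch.ones(5, 4)                 # d(sum)/d(a*b)
print(reduce_grad(g, (1,)).size())   # torch.Size([1])

With this reduction applied in each backward function, a.grad.size() would match a.size() in the failing cases above.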