
matmul uses too much memory in some batched cases #18862

@colesbury

🐛 Bug

PyTorch 1.1.0a0+c3e3c5c. The GPU times reported are on a P100.

In this case matmul uses about 12 GB of memory when it shouldn't use more than ~3 MB, i.e. it's using 4096x more memory than necessary.
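The figures above can be reproduced with a little arithmetic. The following is a rough sketch, assuming float32 tensors (the `torch.randn` default) and the shapes used in the snippets in this report; the helper `nbytes` is a hypothetical name introduced only for illustration.

```python
# Shapes from the snippets in this report; float32 uses 4 bytes per element.
BYTES_PER_FLOAT32 = 4
batch, n = 192, 4096


def nbytes(*shape):
    """Bytes needed for a dense float32 tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * BYTES_PER_FLOAT32


x_bytes = nbytes(n, n)                  # the 2-D input x
y_bytes = nbytes(batch, n, 1)           # the batched input y
z_bytes = nbytes(batch, n, 1)           # the result z
x_expanded_bytes = nbytes(batch, n, n)  # x materialized once per batch entry

MiB, GiB = 2**20, 2**30
print(f"x: {x_bytes / MiB:.0f} MiB")                    # 64 MiB
print(f"y and z: {y_bytes / MiB:.0f} MiB each")         # 3 MiB
print(f"expanded x: {x_expanded_bytes / GiB:.0f} GiB")  # 12 GiB
print(f"ratio: {x_expanded_bytes // z_bytes}x")         # 4096x
```

Under these assumptions, the 12 GB figure matches a fully materialized copy of x for each of the 192 batch entries, and the ~3 MB figure matches the size of the result.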

A

import torch

x = torch.randn(4096, 4096)    # 2-D matrix
y = torch.randn(192, 4096, 1)  # batch of 192 column vectors
z = torch.matmul(x, y)         # x is broadcast across the batch

Note that this is equivalent to the following memory efficient operation:

B

import torch

x = torch.randn(4096, 4096)
y = torch.randn(192, 4096, 1)
# expand() adds the batch dimension as a view, without copying x
z = torch.bmm(x.unsqueeze(0).expand(192, *x.shape), y)

It's also equivalent to the following, which is memory efficient and faster still. However, it may require a copy of y, and the output may be batched-column-major unless some extra work is done:

C

import torch

x = torch.randn(4096, 4096)
y = torch.randn(192, 4096, 1)
# Computes (y^T x^T)^T, which equals x y for each batch entry
z = torch.matmul(y.permute(0, 2, 1), x.t()).permute(0, 2, 1)

On GPU, A takes ~125 ms and uses 12 GB of memory, B takes ~22 ms, and C takes ~1 ms.
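As a quick sanity check that A, B, and C compute the same result, here is a minimal sketch that runs all three on the CPU. The sizes are scaled down from 192 and 4096 (an assumption made so the check runs quickly on any machine), and it does not attempt to reproduce the timings or memory figures above.

```python
import torch

torch.manual_seed(0)
batch, n = 8, 64  # scaled-down stand-ins for 192 and 4096
x = torch.randn(n, n)
y = torch.randn(batch, n, 1)

# A: plain matmul with broadcasting
z_a = torch.matmul(x, y)
# B: explicit expand (a view) followed by bmm
z_b = torch.bmm(x.unsqueeze(0).expand(batch, *x.shape), y)
# C: transpose trick, multiplying from the other side
z_c = torch.matmul(y.permute(0, 2, 1), x.t()).permute(0, 2, 1)

same_shape = z_a.shape == z_b.shape == z_c.shape == (batch, n, 1)
a_matches_b = torch.allclose(z_a, z_b, atol=1e-4)
a_matches_c = torch.allclose(z_a, z_c, atol=1e-4)
print(same_shape, a_matches_b, a_matches_c)
```

The tolerance is loosened slightly because the three variants may accumulate floating-point sums in a different order.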

Originally reported at https://discuss.pytorch.org/t/unexpected-huge-memory-cost-of-matmul/41642/4

See also #13222, which may be related.


I believe the problem is the unnecessary contiguous() calls here:

Tensor tensor1_expanded = tensor1.expand(tensor1_expand_size).contiguous().view(tensor1_bmm_view);
Tensor tensor2_expanded = tensor2.expand(tensor2_expand_size).contiguous().view(tensor2_bmm_view);

Instead of using contiguous() followed by view(), it may be possible to use reshape(), which copies only when a view is not possible. That might achieve the performance of B.
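The difference between the two approaches can be seen from Python. This is only a sketch of the tensor-level behaviour, using tiny sizes chosen for illustration; it does not show that the C++ matmul implementation can switch to reshape() in every case, since with several batch dimensions reshape() may still have to copy.

```python
import torch

batch, n = 3, 4  # tiny stand-ins for 192 and 4096
x = torch.randn(n, n)
y = torch.randn(batch, n, 1)

# expand() is a view: the batch dimension gets stride 0 and no data is copied.
expanded = x.expand(batch, n, n)

# contiguous() materializes batch copies of x before view() is applied.
copied = expanded.contiguous().view(batch, n, n)

# reshape() to the same shape can stay a view over x's original storage.
viewed = expanded.reshape(batch, n, n)

print(expanded.stride())                   # (0, 4, 1)
print(copied.data_ptr() == x.data_ptr())   # False: new storage was allocated
print(viewed.data_ptr() == x.data_ptr())   # True: still backed by x

# Both forms give the same batched product.
results_match = torch.allclose(torch.bmm(copied, y), torch.bmm(viewed, y))
print(results_match)
```

Here the reshaped tensor reuses the storage of x, which is what makes variant B memory efficient.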

    Labels

    high priority
    module: memory usage - PyTorch is using more memory than it should, or it is leaking memory
    module: performance - Issues related to performance, either of kernel code or framework glue
    small - We think this is a small issue to fix. Consider knocking off high priority small issues
    triaged - This issue has been looked at by a team member, and triaged and prioritized into an appropriate module