LinearOperator Abstraction / Structure-Exploiting LazyTensors for Linear Algebra #28341

@Balandat

Description

Feature

Abstraction for lazy evaluation of tensors that can make use of special matrix structure for linear algebra operations, similar to TensorFlow's or SciPy's LinearOperator.

Motivation

In many situations, sequences of tensor operations can be expressed and computed without ever materializing the involved tensors. This can be achieved using lazy evaluation, which lets a compiler optimize the operations (“operator fusion”). In the context of linear algebra, the involved (input or intermediate) tensors often have special structure (e.g. diagonal, block diagonal, triangular, Toeplitz, …). Exploiting this structure using (semi-)symbolic evaluation and reductions can decrease time and memory complexity, often by many orders of magnitude (see the example below).

While TensorFlow’s LinearOperator (and to a lesser degree, SciPy’s) provides a basic abstraction for implementing such operators, PyTorch has no such concept. This is a significant shortcoming for researchers whose work involves a lot of linear algebra.

General Wishlist

In order for this to be generally useful, we should have the following:

  • PyTorch autograd functions can accept the operator abstraction
  • The operator abstraction should support the same methods as Tensors, with the default implementations evaluating the operator and performing the standard operations on the materialized tensors.
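The second point can be sketched concretely. Below is a minimal, hypothetical illustration of the proposed abstraction (the class and method names are assumptions, not an actual PyTorch API): the base class falls back to materializing the operator as a dense tensor, while a structured subclass overrides the relevant methods with cheaper routines.

```python
import torch


class LinearOperator:
    """Hypothetical base class: defaults materialize a dense tensor."""

    def to_dense(self):
        raise NotImplementedError

    def matmul(self, other):
        # Default: materialize both operands and use the ordinary dense matmul.
        other_dense = other.to_dense() if isinstance(other, LinearOperator) else other
        return self.to_dense() @ other_dense

    def diagonal(self):
        return torch.diagonal(self.to_dense())


class DiagOperator(LinearOperator):
    """Diagonal matrix represented only by its diagonal vector."""

    def __init__(self, diag):
        self.diag = diag

    def to_dense(self):
        return torch.diag(self.diag)

    def matmul(self, other):
        if isinstance(other, DiagOperator):
            # diag(a) @ diag(b) = diag(a * b): O(n) instead of O(n^3).
            return DiagOperator(self.diag * other.diag)
        return super().matmul(other)  # fall back to the dense path

    def diagonal(self):
        return self.diag


d = torch.rand(1000)
D = DiagOperator(d)
prod = D.matmul(D)  # stays a DiagOperator; no 1000 x 1000 tensor is created
assert torch.allclose(prod.diagonal(), d * d)
```

Subclasses that cannot exploit structure for a given operation simply inherit the dense fallback, so correctness never depends on which overrides exist.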

Alternatives / 3rd party solutions

Third-party implementations include KeOps’s LazyTensor, which compiles the symbolic operations (essentially a JIT), and GPyTorch’s LazyTensor, which performs reductions in a pair-wise fashion using PyTorch’s Python frontend.

Illustration: Simple example using GPyTorch LazyTensor

import torch
from gpytorch.lazy import DiagLazyTensor

d = torch.rand(10000)
D = d.diag()  # 10000 x 10000 tensor
D_lazy = DiagLazyTensor(d)  # same matrix, but represented only by d

(D @ D).diag()  # 1.6 seconds
(D_lazy @ D_lazy).diag()  # 0.000027 seconds (D_lazy @ D_lazy is a DiagLazyTensor)

Many other operations can similarly be accelerated when exploiting known tensor structure, including matrix decompositions (Cholesky) and spectral operations.
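As a concrete instance of the decomposition case: for a diagonal PSD matrix, the Cholesky factor is simply the diagonal matrix of elementwise square roots, so a structured representation reduces an O(n^3) factorization to an O(n) operation. A small sketch (plain PyTorch, no GPyTorch required):

```python
import torch

d = torch.rand(10000) + 0.1  # positive diagonal -> diag(d) is PSD

# Dense path (commented out: it would materialize a 10000 x 10000 matrix):
# L = torch.linalg.cholesky(torch.diag(d))   # O(n^3) time, O(n^2) memory

# Structured path: the Cholesky factor of diag(d) is diag(sqrt(d)),
# so only the O(n) diagonal ever needs to exist.
chol_diag = d.sqrt()

# Spot-check the identity against the dense factorization on a small slice.
n = 100
L_dense = torch.linalg.cholesky(torch.diag(d[:n]))
assert torch.allclose(torch.diagonal(L_dense), chol_diag[:n])
```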

Limitations

Both of these solutions have significant shortcomings. KeOps requires setting up an external compiler toolchain (which can be painful), and does not straightforwardly support sparse / structured representations. GPyTorch, being implemented in pure python, incurs overhead that for smaller tensors / simpler operations often outweighs the benefits.

Pyro’s suggested Funsor abstraction also addresses the issue, but is much more general in scope (it generalizes the tensor interface to also cover arbitrary functions of multiple variables, where variables may be integers, real numbers or themselves tensors).

Existing plans for Lazy Tensors

The proposed implementation for lazy tensors in #25753 lays the groundwork for lazy evaluation. However, so far it only deals with the basic dispatch mechanism, not with any actual code optimization. It should be possible to optimize the involved operations by manipulating the internal representation (IR) PyTorch generates during the evaluation of the lazy operations.

In order to be able to exploit special structure, it seems that one would need to have certain structured tensor primitives (similar to GPyTorch’s BlockDiagLazyTensor or TensorFlow’s LinearOperatorKronecker).
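To illustrate why such primitives matter, consider the Kronecker case: a matrix-vector product with A ⊗ B never needs the materialized Kronecker product, thanks to the identity (A ⊗ B) vec(X) = vec(B X Aᵀ) (with column-stacking vec). A sketch in plain PyTorch (the helper name is made up for illustration):

```python
import torch


def kron_matvec(A, B, v):
    """Compute (A ⊗ B) @ v without materializing the Kronecker product."""
    n, m = A.shape[0], B.shape[0]
    # Reconstruct X such that vec(X) = v under column-stacking:
    # X[i, j] = v[j * m + i], i.e. reshape row-major to (n, m) and transpose.
    X = v.reshape(n, m).T                # m x n
    # (A ⊗ B) vec(X) = vec(B X A^T); re-stack columns of the m x n result.
    return (B @ X @ A.T).T.reshape(-1)


A = torch.rand(5, 5)
B = torch.rand(7, 7)
v = torch.rand(35)

dense = torch.kron(A, B) @ v        # materializes a 35 x 35 matrix
structured = kron_matvec(A, B, v)   # only touches the 5x5 and 7x7 factors
assert torch.allclose(dense, structured, atol=1e-5)
```

A Kronecker-structured primitive would encapsulate exactly this kind of identity, so downstream code could use the ordinary matmul interface while paying the structured cost.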

@gpleiss, @jacobrgardner, @vishwakftw, @bwasti

cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411 @Varal7 @jianyuh @nikitaved @pearu @mruberry @heitorschueroff @walterddr @IvanYashchuk @xwang233 @lezcano @vincentqb @vishwakftw @ssnl

Labels

feature (A request for a proper, new feature), module: lazy, module: linear algebra (Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply / matmul), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)