Feature
Abstraction for lazy evaluation of tensors that can make use of special matrix structure for linear algebra operations, similar to TensorFlow's / scipy's LinearOperator.
Motivation
Sequences of operations involving tensors can often be expressed and computed without ever instantiating the tensors themselves. This can be achieved using lazy evaluation, so that a compiler can optimize the operations (“operator fusion”). In the context of linear algebra, the involved (input or intermediate) tensors often have special structure (e.g. diagonal, block diagonal, triangular, Toeplitz, …). Exploiting this structure using (semi-)symbolic evaluation and reductions can significantly decrease time and memory complexity, often by many orders of magnitude (see example below).
While TensorFlow’s LinearOperator (and, to a lesser degree, scipy’s) provides a basic abstraction for implementing such operators, PyTorch does not have this concept. This is a significant shortcoming for researchers whose work involves a lot of linear algebra.
General Wishlist
In order for this to be generally useful, we should have the following:
- PyTorch autograd functions can accept the operator abstraction
- The operator abstraction supports the same methods as Tensors, with default implementations that materialize the operator and perform the standard dense operations on the resulting tensors.
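A minimal sketch of what such an abstraction might look like, with a structured subclass that overrides the default dense fallback (class and method names here are hypothetical, not an existing PyTorch API):

```python
import torch

class LinearOperator:
    """Sketch of a lazy operator base class (hypothetical API).

    Default implementations materialize the operator via to_dense() and
    fall back to standard dense Tensor operations; subclasses override
    methods where their structure admits a cheaper computation.
    """

    def to_dense(self) -> torch.Tensor:
        raise NotImplementedError

    def matmul(self, other: "LinearOperator") -> torch.Tensor:
        # Default: materialize both operands and multiply densely.
        return self.to_dense() @ other.to_dense()

    def diagonal(self) -> torch.Tensor:
        return self.to_dense().diagonal()


class DiagOperator(LinearOperator):
    """Diagonal matrix represented only by its diagonal entries."""

    def __init__(self, diag: torch.Tensor):
        self.diag = diag

    def to_dense(self) -> torch.Tensor:
        return torch.diag(self.diag)

    def matmul(self, other):
        if isinstance(other, DiagOperator):
            # Structured fast path: O(n) instead of O(n^3),
            # and the result stays lazy (a DiagOperator).
            return DiagOperator(self.diag * other.diag)
        return super().matmul(other)

    def diagonal(self) -> torch.Tensor:
        return self.diag
```

In a full design the dense fallback would also return an operator (wrapping the materialized tensor) rather than a raw Tensor; the sketch keeps that simple.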
Alternatives / 3rd party solutions
Third-party implementations include KeOps’s LazyTensor, which compiles the symbolic operations (basically a JIT), and GPyTorch’s LazyTensor, which performs reductions in a pair-wise fashion using PyTorch’s Python frontend.
Illustration: Simple example using GPyTorch LazyTensor
```python
import torch
from gpytorch.lazy import DiagLazyTensor

d = torch.rand(10000)
D = d.diag()                # dense 10000 x 10000 tensor
D_lazy = DiagLazyTensor(d)  # same matrix, but represented only by d

(D @ D).diag()            # 1.6 seconds
(D_lazy @ D_lazy).diag()  # 0.000027 seconds (D_lazy @ D_lazy is itself a DiagLazyTensor)
```
Many other operations can similarly be accelerated when exploiting known tensor structure, including matrix decompositions (Cholesky) and spectral operations.
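As an illustration of the decomposition case: for a diagonal positive-definite matrix, the Cholesky factor is simply the diagonal of element-wise square roots, so a structure-aware implementation avoids the O(n³) dense factorization entirely (this snippet uses plain dense tensors to verify the identity, not a lazy abstraction):

```python
import torch

d = torch.rand(1000) + 0.1  # strictly positive diagonal -> PSD matrix
D = torch.diag(d)

L_dense = torch.linalg.cholesky(D)  # O(n^3) dense factorization
L_struct = torch.diag(d.sqrt())     # O(n) using the diagonal structure

assert torch.allclose(L_dense, L_struct, atol=1e-6)
```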
Limitations
Both of these solutions have significant shortcomings. KeOps requires setting up an external compiler toolchain (which can be painful), and does not straightforwardly support sparse / structured representations. GPyTorch, being implemented in pure python, incurs overhead that for smaller tensors / simpler operations often outweighs the benefits.
Pyro’s suggested Funsor abstraction also addresses the issue, but is much more general in scope (it generalizes the tensor interface to also cover arbitrary functions of multiple variables, where variables may be integers, real numbers or themselves tensors).
Existing plans for Lazy Tensors
The proposed implementation for lazy tensors in #25753 lays the groundwork for lazy evaluation. However, so far it only deals with the basic dispatch mechanism, not with any actual code optimizations. It should be possible to optimize the involved operations by manipulating the PyTorch intermediate representation (IR) generated during evaluation of the lazy operations.
In order to be able to exploit special structure, it seems that one would need to have certain structured tensor primitives (similar to GPyTorch’s BlockDiagLazyTensor or TensorFlow’s LinearOperatorKronecker).
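For instance, a Kronecker primitive would let a matrix-vector product with A ⊗ B be computed via the identity (A ⊗ B) vec(X) = vec(A X Bᵀ) (row-major vec), without ever materializing the full Kronecker product. A standalone sketch of the underlying computation (the function name is illustrative, not an existing API):

```python
import torch

def kron_matvec(A: torch.Tensor, B: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Compute (A kron B) @ x without forming the Kronecker product.

    Uses (A kron B) vec(X) = vec(A X B^T) with row-major vec,
    costing O(n*m*q + n*p*q) instead of O(n*p * m*q).
    """
    m, q = A.shape[1], B.shape[1]
    X = x.reshape(m, q)                 # row-major unvec
    return (A @ X @ B.T).reshape(-1)    # row-major vec of A X B^T

A = torch.rand(3, 4)
B = torch.rand(5, 6)
x = torch.rand(4 * 6)

dense = torch.kron(A, B) @ x   # materializes the full 15 x 24 matrix
structured = kron_matvec(A, B, x)

assert torch.allclose(dense, structured, atol=1e-5)
```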
@gpleiss, @jacobrgardner, @vishwakftw, @bwasti
cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411 @Varal7 @jianyuh @nikitaved @pearu @mruberry @heitorschueroff @walterddr @IvanYashchuk @xwang233 @lezcano @vincentqb @vishwakftw @ssnl