FB internal citation: https://fb.quip.com/D9d1AQuwQ4qg
@bhosmer was the primary author of this note; I'm just publishing it here so it's in the public record.
Status quo:
- Tensor is a nullable type: `Tensor()` delivers an undefined Tensor - an uninitialized value that is not a valid instance of Tensor. Operations performed on an undefined Tensor raise errors (but not segfaults) at runtime. Two situations that introduce undefined Tensors into the system currently are uninitialized locals (`Tensor t;`) and post-move locations.
  - Aside: some calls are ok on an undefined tensor; for example, there's a pattern `Tensor result; at::add_out(result, ...);` where you expect `result` to implicitly become defined (note: we might have to change this because this pattern doesn't work through boxing/unboxing and blocks backend fallbacks). See the sketch after this list.
- Some parts of the system obey our old convention, where we use `Tensor` to represent both `Tensor` and `Tensor?` schema types in C++. In non-optional situations, kernels are allowed to assume definedness (i.e., no need to assert `t.defined()`, just let it fail). In the optional case, `t.defined()` is the discriminant.
- Other parts obey the newer convention we had decided on but are now revisiting to make sure we want to do it. Here we use `optional<Tensor>` to represent `Tensor?` (both conventions appear in the sketch after this list). The parts of the system that use this convention are:
  - The C++ frontend (both Functions.h and TensorMethods) uses `optional<Tensor>`.
  - The custom op API has always used this representation.
  - The dispatcher was recently migrated to use this representation, with hacky_wrapper providing a compatibility shim for kernels that continued to use `Tensor` for both.
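A minimal sketch of the status quo, for concreteness. The ATen calls (`at::add_out`, `defined()`, `has_value()`) are the real API; `old_style` and `new_style` are hypothetical kernels invented purely for illustration:

```cpp
#include <ATen/ATen.h>
#include <c10/util/Optional.h>

// Old convention: `Tensor` stands for both `Tensor` and `Tensor?` schema
// types; defined() is the discriminant when the argument is optional.
at::Tensor old_style(const at::Tensor& input, const at::Tensor& bias) {
  return bias.defined() ? input + bias : input;
}

// Newer convention: `Tensor?` is spelled `optional<Tensor>`; has_value()
// is the discriminant.
at::Tensor new_style(const at::Tensor& input,
                     const c10::optional<at::Tensor>& bias) {
  return bias.has_value() ? input + *bias : input;
}

int main() {
  // An uninitialized local is an undefined Tensor - not a valid instance.
  at::Tensor t;
  TORCH_CHECK(!t.defined());

  // The out-variant pattern from the aside above: `result` starts
  // undefined and is expected to implicitly become defined by the call.
  at::Tensor result;
  at::add_out(result, at::ones({2}), at::ones({2}));
  TORCH_CHECK(result.defined());

  old_style(at::ones({2}), at::Tensor());  // undefined Tensor plays "none"
  new_style(at::ones({2}), c10::nullopt);  // explicit none
}
```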
Choices:
1. Move the rest of the system to optional. Make it actually illegal to call any API with an undefined tensor. This was the provisional POR until recent discussions, mostly around perf, reopened the question.
   - Pro: It's safe, straightforward and idiomatic. If the whole system uses optional to represent optionality, there are fewer bugs, and `optional` is a pattern people are familiar with.
     - Aside: Scott points out that "idiomatic" oversells it: optional isn't universally idiomatic in C++ the same way it is in, say, OCaml. It's also idiomatic C++ to use unused values in a data structure's representation to denote a "none" value, exactly as we'd be doing here with undefined Tensor. Maybe a better way to describe the Pro here is "highly legible".
   - Neutral: ergonomics are roughly the same as now for op callers and kernel implementors:
     - `has_value` replaces `defined` as the discriminant
     - implicit conversions minimize explicit transfer in/out of optional
   - Con: Potentially a lot of effort to migrate the system to this state.
   - Con: there is a risk of perf issues that could need additional work, like
     - passing around `const optional<TensorRef>&` (or `std::reference_wrapper`) instead of `const optional<Tensor>&` to avoid refcount bumps on `Tensor <-> optional<Tensor>` casts (see the first sketch after this list), or
     - adding an `optional<Tensor>` specialization to represent the null state efficiently.
2. Unwind the move to optional, but add a nominal class like OptionalTensor that wraps an ordinary Tensor, to represent optional tensors. `defined()` would remain the discriminant in user code.
   - Pro: economy of representation (uses undefined Tensor for None); no need to be clever to get optimal perf.
   - Pro: less engineering effort to get there than (1), though not zero, since parts of the system have already been migrated to optional.
   - Neutral: schema inference and C++ API expressiveness are roughly equivalent: OptionalTensor means the same thing as `optional<Tensor>`.
   - Neutral: `defined()` means `has_value()` and doesn't need to be checked in non-optional contexts (just let it fail; see above), so there's no ambiguity around the meaning of `defined()` in user code. This means we avoid even an unambiguous double meaning for `defined()`: it signals that the type of the value being checked is `Tensor?`, not `Tensor`. (Of course this is further reinforced by use of `OptionalTensor`, but the scheme doesn't rely on 100% usage of this type.)
   - Con: non-idiomatic; without full migration to `OptionalTensor` we still have implicit optionality in code, which might be confusing.
   - Con: `OptionalTensor` needs to be designed (one possible shape is sketched after this list).
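To make the refcount concern in (1) concrete, here is a minimal sketch. `apply_bias` is a hypothetical kernel invented for illustration, and `TensorRef` does not exist yet, so the non-owning variant uses `std::reference_wrapper` as mentioned above:

```cpp
#include <functional>
#include <ATen/ATen.h>
#include <c10/util/Optional.h>

// Owning variant: the implicit Tensor -> optional<Tensor> conversion at the
// call site materializes a temporary optional<Tensor>, copying the Tensor
// handle and bumping its intrusive refcount, even though the callee only
// borrows the argument.
at::Tensor apply_bias(const at::Tensor& input,
                      const c10::optional<at::Tensor>& bias) {
  return bias.has_value() ? input + *bias : input;
}

// Non-owning variant: the optional carries a borrowed reference, so no
// refcount traffic happens at the call boundary.
at::Tensor apply_bias_ref(
    const at::Tensor& input,
    c10::optional<std::reference_wrapper<const at::Tensor>> bias) {
  return bias.has_value() ? input + bias->get() : input;
}

int main() {
  at::Tensor x = at::ones({2});
  at::Tensor b = at::ones({2});
  apply_bias(x, b);      // one refcount bump on b
  apply_bias_ref(x, b);  // no refcount bump
}
```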
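And since `OptionalTensor` still needs to be designed, here is one possible shape, purely as a sketch under option (2)'s constraints (the name comes from the discussion above; none of this exists today):

```cpp
#include <utility>
#include <ATen/ATen.h>

// Hypothetical nominal wrapper from option (2): an ordinary Tensor whose
// undefined state doubles as "none". No extra discriminant field, so the
// binary representation is exactly that of Tensor.
class OptionalTensor {
 public:
  OptionalTensor() = default;                            // undefined == none
  /* implicit */ OptionalTensor(at::Tensor t) : t_(std::move(t)) {}

  // defined() remains the discriminant, as in the old convention, but the
  // nominal type makes the Tensor? intent visible in signatures.
  bool defined() const { return t_.defined(); }

  const at::Tensor& operator*() const {
    TORCH_CHECK(defined(), "taking the value of an undefined OptionalTensor");
    return t_;
  }

 private:
  at::Tensor t_;  // economy of representation: one handle, no bool
};

static_assert(sizeof(OptionalTensor) == sizeof(at::Tensor),
              "same layout as Tensor");
```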
Other discussion:
- workplace thread https://fb.workplace.com/groups/1144215345733672/permalink/1876235475864985/
Notes on Perf
- I’ve [@smessmer] written a prototype PR that specializes `c10::optional` for Tensor and gets rid of the extra bool field, i.e. `optional<Tensor>` would then have the same binary representation as Tensor and the null states are folded together. See [prototype] Specialise optional&lt;Tensor&gt; #51077
- For benchmarking it, I chose the batch_norm op, which takes 4 optional Tensors, and ran Taylor’s instruction count benchmark, benchmarking the C++ API, not Python.
- The results are mixed:
  - I see a 2% improvement when calling it with 4 defined tensors.
  - I see a 2% regression when calling it with 4 c10::nullopt’s.
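For intuition about what the prototype changes, here is a small layout check. This only inspects sizes; the actual specialization lives in the PR above, and the expected numbers assume a build without it:

```cpp
#include <cstdio>
#include <ATen/ATen.h>
#include <c10/util/Optional.h>

int main() {
  // Without the specialization, optional<Tensor> stores a Tensor handle
  // plus a separate engaged flag, padded up (typically 16 bytes vs 8).
  // With the prototype, the two sizes below would match, because nullopt
  // is encoded as the undefined-Tensor state rather than as a bool.
  std::printf("sizeof(Tensor)           = %zu\n", sizeof(at::Tensor));
  std::printf("sizeof(optional<Tensor>) = %zu\n",
              sizeof(c10::optional<at::Tensor>));

  // Folding the null states together also means an optional holding an
  // undefined Tensor becomes indistinguishable from nullopt.
}
```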