FB internal citation: https://fb.quip.com/D9d1AQuwQ4qg
@bhosmer was the primary author of this note; I'm just publishing it here so it's in the public record.
Status quo:
- Tensor is a nullable type: `Tensor()` delivers an undefined Tensor - an uninitialized value that is not a valid instance of Tensor. Operations performed on an undefined Tensor raise errors (but not segfaults) at runtime. Two situations that introduce undefined Tensors into the system currently are uninitialized locals (`Tensor t;`) and post-move locations.
  - Aside: some calls are ok on an undefined tensor; for example, there's a pattern `Tensor result; at::add_out(result, ...);` where you expect `result` to implicitly become defined (note: we might have to change this because this pattern doesn't work through boxing/unboxing and blocks backend fallbacks). See the sketch after this list.
- Some parts of the system obey our old convention, where we use `Tensor` to represent both `Tensor` and `Tensor?` schema types in C++. In non-optional situations, kernels are allowed to assume definedness (i.e., no need to assert `t.defined()`, just let it fail). In the optional case, `t.defined()` is the discriminant.
- Other parts obey the newer convention we had decided on but are now revisiting to make sure we want to do it. Here we use `optional<Tensor>` to represent `Tensor?` (both conventions appear in the sketch after this list). The parts of the system that use this convention are:
  - The C++ frontend (both Functions.h and TensorMethods) uses `optional<Tensor>`.
  - The custom op API has always used this representation.
  - The dispatcher was recently migrated to use this representation, with hacky_wrapper providing a compatibility shim for kernels that continued to use `Tensor` for both.
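A minimal sketch of the status quo, for concreteness. The ATen calls (`at::add_out`, `defined()`, `has_value()`) are the real API; `old_style` and `new_style` are hypothetical kernels invented purely for illustration:

```cpp
#include <ATen/ATen.h>
#include <c10/util/Optional.h>

// Old convention: `Tensor` stands for both `Tensor` and `Tensor?` schema
// types; defined() is the discriminant when the argument is optional.
at::Tensor old_style(const at::Tensor& input, const at::Tensor& bias) {
  return bias.defined() ? input + bias : input;
}

// Newer convention: `Tensor?` is spelled `optional<Tensor>`; has_value()
// is the discriminant.
at::Tensor new_style(const at::Tensor& input,
                     const c10::optional<at::Tensor>& bias) {
  return bias.has_value() ? input + *bias : input;
}

int main() {
  // An uninitialized local is an undefined Tensor - not a valid instance.
  at::Tensor t;
  TORCH_CHECK(!t.defined());

  // The out-variant pattern from the aside above: `result` starts
  // undefined and is expected to implicitly become defined by the call.
  at::Tensor result;
  at::add_out(result, at::ones({2}), at::ones({2}));
  TORCH_CHECK(result.defined());

  old_style(at::ones({2}), at::Tensor());  // undefined Tensor plays "none"
  new_style(at::ones({2}), c10::nullopt);  // explicit none
}
```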
Choices:
1. Move the rest of the system to optional. Make it actually illegal to call any API with an undefined tensor. This was the provisional POR until recent discussions, mostly around perf, reopened the question.
   - Pro: It's safe, straightforward and idiomatic. If the whole system uses optional to represent optionality, there are fewer bugs, and `optional` is a pattern people are familiar with.
     - Aside: Scott points out that "idiomatic" oversells it: optional isn't universally idiomatic in C++ the same way it is in, say, OCaml. It's also idiomatic C++ to use unused values in a data structure's representation to denote a "none" value, exactly as we'd be doing here with undefined Tensor. Maybe a better way to describe the Pro here is "highly legible".
   - Neutral: ergonomics are roughly the same as now for op callers and kernel implementors:
     - `has_value` replaces `defined` as the discriminant
     - implicit conversions minimize explicit transfer in/out of optional
   - Con: Potentially a lot of effort to migrate the system to this state.
   - Con: there is a risk of perf issues that could need additional work, like
     - passing around `const optional<TensorRef>&` (or `std::reference_wrapper`) instead of `const optional<Tensor>&` to avoid refcount bumps on `Tensor <-> optional<Tensor>` casts (see the first sketch after this list), or
     - adding an `optional<Tensor>` specialization to represent the null state efficiently.
2. Unwind the move to optional, but add a nominal class like OptionalTensor that wraps an ordinary Tensor, to represent optional tensors. `defined()` would remain the discriminant in user code.
   - Pro: economy of representation (uses undefined Tensor for None); no need to be clever to get optimal perf.
   - Pro: less engineering effort to get there than (1), though not zero, since parts of the system have already been migrated to optional.
   - Neutral: schema inference and C++ API expressiveness are roughly equivalent: OptionalTensor means the same thing as `optional<Tensor>`.
   - Neutral: `defined()` means `has_value()` and doesn't need to be checked in non-optional contexts (just let it fail; see above), so there's no ambiguity around the meaning of `defined()` in user code. This means we avoid even an unambiguous double meaning for `defined()`: it signals that the type of the value being checked is `Tensor?`, not `Tensor`. (Of course this is further reinforced by use of `OptionalTensor`, but the scheme doesn't rely on 100% usage of this type.)
   - Con: non-idiomatic; without full migration to `OptionalTensor` we still have implicit optionality in code, which might be confusing.
   - Con: `OptionalTensor` needs to be designed (one possible shape is sketched after this list).
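To make the refcount concern in (1) concrete, here is a minimal sketch. `apply_bias` is a hypothetical kernel invented for illustration, and `TensorRef` does not exist yet, so the non-owning variant uses `std::reference_wrapper` as mentioned above:

```cpp
#include <functional>
#include <ATen/ATen.h>
#include <c10/util/Optional.h>

// Owning variant: the implicit Tensor -> optional<Tensor> conversion at the
// call site materializes a temporary optional<Tensor>, copying the Tensor
// handle and bumping its intrusive refcount, even though the callee only
// borrows the argument.
at::Tensor apply_bias(const at::Tensor& input,
                      const c10::optional<at::Tensor>& bias) {
  return bias.has_value() ? input + *bias : input;
}

// Non-owning variant: the optional carries a borrowed reference, so no
// refcount traffic happens at the call boundary.
at::Tensor apply_bias_ref(
    const at::Tensor& input,
    c10::optional<std::reference_wrapper<const at::Tensor>> bias) {
  return bias.has_value() ? input + bias->get() : input;
}

int main() {
  at::Tensor x = at::ones({2});
  at::Tensor b = at::ones({2});
  apply_bias(x, b);      // one refcount bump on b
  apply_bias_ref(x, b);  // no refcount bump
}
```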
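And since `OptionalTensor` still needs to be designed, here is one possible shape, purely as a sketch under option (2)'s constraints (the name comes from the discussion above; none of this exists today):

```cpp
#include <utility>
#include <ATen/ATen.h>

// Hypothetical nominal wrapper from option (2): an ordinary Tensor whose
// undefined state doubles as "none". No extra discriminant field, so the
// binary representation is exactly that of Tensor.
class OptionalTensor {
 public:
  OptionalTensor() = default;                            // undefined == none
  /* implicit */ OptionalTensor(at::Tensor t) : t_(std::move(t)) {}

  // defined() remains the discriminant, as in the old convention, but the
  // nominal type makes the Tensor? intent visible in signatures.
  bool defined() const { return t_.defined(); }

  const at::Tensor& operator*() const {
    TORCH_CHECK(defined(), "taking the value of an undefined OptionalTensor");
    return t_;
  }

 private:
  at::Tensor t_;  // economy of representation: one handle, no bool
};

static_assert(sizeof(OptionalTensor) == sizeof(at::Tensor),
              "same layout as Tensor");
```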
Other discussion:
- workplace thread https://fb.workplace.com/groups/1144215345733672/permalink/1876235475864985/
Notes on Perf
- I’ve [@smessmer] written a prototype PR that specializes `c10::optional` for Tensor and gets rid of the extra bool field, i.e. `optional<Tensor>` would then have the same binary representation as Tensor and the null states are folded together. See [prototype] Specialise optional&lt;Tensor&gt; #51077
- For benchmarking it, I chose the batch_norm op, which takes 4 optional Tensors, and ran Taylor’s instruction count benchmark, benchmarking the C++ API, not Python.
- The results are mixed:
  - I see a 2% improvement when calling it with 4 defined tensors.
  - I see a 2% regression when calling it with 4 c10::nullopt’s.
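For intuition about what the prototype changes, here is a small layout check. This only inspects sizes; the actual specialization lives in the PR above, and the expected numbers assume a build without it:

```cpp
#include <cstdio>
#include <ATen/ATen.h>
#include <c10/util/Optional.h>

int main() {
  // Without the specialization, optional<Tensor> stores a Tensor handle
  // plus a separate engaged flag, padded up (typically 16 bytes vs 8).
  // With the prototype, the two sizes below would match, because nullopt
  // is encoded as the undefined-Tensor state rather than as a bool.
  std::printf("sizeof(Tensor)           = %zu\n", sizeof(at::Tensor));
  std::printf("sizeof(optional<Tensor>) = %zu\n",
              sizeof(c10::optional<at::Tensor>));

  // Folding the null states together also means an optional holding an
  // undefined Tensor becomes indistinguishable from nullopt.
}
```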