Add boxed fallback #26368
Conversation
A boxed fallback function is a per-TensorTypeId generic function that is called if there is no operator-specific function (either for the TensorTypeId, or the UndefinedTensorId fallback for that operator).

The outline of the patch:

* Add a new `boxed_fallback_table_` in the top-level ATenDispatch struct (this is NOT per operator) for storing boxed fallback functions. Boxed fallback functions provisionally have the interface `void(const char* schema, torch::jit::Stack*)`; we expect to replace the `const char* schema` with something more detailed in the near future.
* Boxing and unboxing are not supported for all arguments, as IValue doesn't cover the full set of types that are in ATen. The `supports_boxed_fallback` type trait tests whether boxing is possible. The list of exclusions here was experimentally generated. One notable exclusion is that we cannot handle any mutable functions, as they return a `Tensor&`, which we cannot conveniently manufacture from a Tensor on an IValue stack.
* The actual boxed fallback is handled in two phases. First, we do a compile-time test to see if boxing/unboxing is supported. If it is, we do a runtime test to check whether a fallback function is installed. If both conditions are met, we allocate a stack, push our arguments onto it, and then call the boxed fallback function. The return value is expected to be a single argument on the top of the stack; we retrieve it, unpack it, and return it to the user. A sketch of this call path follows the description.

At present, there are no tests for this diff.

This diff also makes `multi_dispatch_tensor_type_set` safer by explicitly specifying that it takes all arguments as references. This prevents the function from accidentally moving in rvalue references (this never actually happened, because none of the overloads of apply move out their arguments; but they could have).

Signed-off-by: Edward Z. Yang <[email protected]>

Differential Revision: [D17448555](https://our.internmc.facebook.com/intern/diff/D17448555)
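As a concrete illustration of the two-phase call path described in the last bullet, here is a minimal stand-alone sketch. `Value`, `Stack`, `call_op`, and `boxed_fallback` below are simplified stand-ins invented for this example; they are not the actual IValue, torch::jit::Stack, or ATenDispatch code.

```cpp
// Minimal sketch of the boxed-fallback call path (assumed names, not PyTorch code).
#include <cstdint>
#include <iostream>
#include <stdexcept>
#include <type_traits>
#include <utility>
#include <variant>
#include <vector>

using Value = std::variant<int64_t, double>;             // stand-in for IValue
using Stack = std::vector<Value>;                        // stand-in for torch::jit::Stack
using BoxedFallback = void (*)(const char* schema, Stack*);

// Phase 1 (compile time): only types representable as Value can be boxed.
template <class T>
using boxable = std::disjunction<std::is_same<std::decay_t<T>, int64_t>,
                                 std::is_same<std::decay_t<T>, double>>;
template <class Result, class... Args>
using supports_boxed_fallback = std::conjunction<boxable<Result>, boxable<Args>...>;

BoxedFallback boxed_fallback = nullptr;                   // a per-TensorTypeId table in the real thing

template <class Result, class... Args>
Result call_op(const char* schema, Args&&... args) {
  if constexpr (supports_boxed_fallback<Result, Args...>::value) {
    if (boxed_fallback != nullptr) {                      // phase 2 (runtime): is a fallback installed?
      Stack stack;
      (stack.emplace_back(std::forward<Args>(args)), ...);  // push arguments onto the stack
      boxed_fallback(schema, &stack);                     // fallback leaves a single result on top
      Result result = std::get<Result>(stack.back());     // retrieve and unpack the return value
      stack.pop_back();
      return result;
    }
  }
  throw std::runtime_error("no kernel and no usable boxed fallback for this op");
}

int main() {
  boxed_fallback = [](const char* schema, Stack* stack) {
    std::cout << "boxed fallback for " << schema << " with " << stack->size() << " args\n";
    stack->clear();
    stack->push_back(Value{int64_t{42}});                 // single return value on top of the stack
  };
  std::cout << call_op<int64_t>("aten::add", int64_t{1}, int64_t{2}) << "\n";
}
```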
    return (*unboxed_fn)(std::forward<Args>(args)...);
  }

  auto* unboxed_fallback_fn = reinterpret_cast<FuncType*>(function_table_[static_cast<int64_t>(TensorTypeId::UndefinedTensorId)]);
We need the fallback(highest_tid) to be attempted before using UndefinedTensorId implementations, which by definition have a lower priority.
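A minimal stand-alone sketch of that precedence, with assumed names and table layout (an illustration only, not the real ATenDispatch lookup):

```cpp
// The boxed fallback for the dispatched type id is consulted before the
// operator's UndefinedTensorId catch-all, which has the lowest priority.
#include <array>
#include <cstddef>
#include <iostream>

constexpr std::size_t kUndefinedTensorId = 0;
constexpr std::size_t kNumTypeIds = 8;

using Kernel = const char* (*)();

std::array<Kernel, kNumTypeIds> function_table{};        // per-operator kernels, indexed by type id
std::array<Kernel, kNumTypeIds> boxed_fallback_table{};  // per-type-id boxed fallbacks (not per operator)

const char* dispatch(std::size_t tid) {
  if (function_table[tid]) return function_table[tid]();              // 1. kernel registered for tid
  if (boxed_fallback_table[tid]) return boxed_fallback_table[tid]();  // 2. boxed fallback for tid
  if (function_table[kUndefinedTensorId])                             // 3. catch-all, lowest priority
    return function_table[kUndefinedTensorId]();
  return "error: no kernel found";
}

int main() {
  function_table[kUndefinedTensorId] = [] { return "catch-all kernel"; };
  boxed_fallback_table[3] = [] { return "boxed fallback for tid 3"; };
  std::cout << dispatch(3) << "\n";  // the fallback wins over the catch-all
  std::cout << dispatch(5) << "\n";  // no fallback installed: catch-all runs
}
```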
bwasti left a comment
Looks good -- as a test I rebased #25753 onto this
  using supports_boxed_fallback =
    c10::guts::negation<c10::guts::disjunction<
      std::is_lvalue_reference<Result>,
      not_ok_to_box<Result>,
curious, why the double negatives?
No good reason; it's just how I initially wrote the code. I think one (bad) reason for the negative is that I was trying to avoid saying std::negation<std::is_same<..., ...>>
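For readers following along, a small stand-alone sketch of the two equivalent spellings (the c10::guts traits mirror the std:: ones; `not_ok_to_box` below is a toy stand-in, not the real trait):

```cpp
// A negation of a disjunction of bad conditions is equivalent (De Morgan)
// to a positive conjunction of good conditions.
#include <type_traits>

template <class T> struct not_ok_to_box : std::false_type {};
template <> struct not_ok_to_box<void*> : std::true_type {};  // pretend void* can't be boxed

// As written in the patch: "no bad condition holds".
template <class Result>
using supports_boxed_fallback_negative =
    std::negation<std::disjunction<std::is_lvalue_reference<Result>,
                                   not_ok_to_box<Result>>>;

// Equivalent positive spelling: "every good condition holds".
template <class Result>
using supports_boxed_fallback_positive =
    std::conjunction<std::negation<std::is_lvalue_reference<Result>>,
                     std::negation<not_ok_to_box<Result>>>;

static_assert(supports_boxed_fallback_negative<int>::value ==
              supports_boxed_fallback_positive<int>::value, "");
static_assert(supports_boxed_fallback_negative<int&>::value ==
              supports_boxed_fallback_positive<int&>::value, "");
static_assert(supports_boxed_fallback_negative<void*>::value ==
              supports_boxed_fallback_positive<void*>::value, "");

int main() {}
```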
  auto* boxed_fallback_fn = globalATenDispatch().getFallbackBoxedOp(tid);
  if (C10_UNLIKELY(boxed_fallback_fn)) {
    if (supports_boxed_fallback<Result, Args...>::value) {
      return callBoxedFallback<Result, Args...>(schema_.c_str(), boxed_fallback_fn, std::forward<Args>(args)...);
Can we pass in a FunctionSchema rather than the schema string here? I did some light profiling and found that dealing with this string on every op invocation is a huge overhead.
Not surprising :) Yes, FunctionSchema can probably be arranged.
Recapping an in-person conversation: our new plan is to get FunctionSchema out of the JIT data structures.
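As a rough illustration of the cost being discussed (the `OpSchema` struct below is a made-up stand-in, not the real c10::FunctionSchema API):

```cpp
// With a `const char* schema` interface the fallback has to re-derive schema
// structure from the string on every invocation; handing it an already-built
// schema object lets the caller pay that cost once.
#include <iostream>
#include <string>
#include <vector>

struct OpSchema {
  std::string name;
  std::vector<std::string> arguments;
};

// String-based interface: (re)parse on every call.
void fallback_from_string(const char* schema) {
  std::string s(schema);
  std::string name = s.substr(0, s.find('('));  // imagine a full schema parse here
  std::cout << "parsed schema string for " << name << " (again)\n";
}

// Schema-based interface: the caller hands over a long-lived object.
void fallback_from_schema(const OpSchema& schema) {
  std::cout << "reused cached schema for " << schema.name << "\n";
}

int main() {
  static const OpSchema add{"aten::add", {"self", "other", "alpha"}};
  for (int i = 0; i < 3; ++i) {
    fallback_from_string("aten::add(Tensor self, Tensor other, Scalar alpha) -> Tensor");
    fallback_from_schema(add);
  }
}
```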
  template <typename... Types>
  static inline void push(Stack& stack, Types&&... args) {
-   (void)std::initializer_list<int>{(stack.emplace_back(std::forward<Types>(args)), 0)...};
+   (void)std::initializer_list<int>{(push_one(stack, args), 0)...};
do you think we could add a fast path? something like
// enable_if all ivalue constructible
stack.reserve(stack.size() + sizeof...(Types));
auto v = { IValue(args)... };
stack.insert(stack.end(), v.begin(), v.end());
reason: the realloc insert is taking up a non-negligible amount of time in profiling (2%)
Yes, I'll do this in a follow up.
@bwasti I looked into applying this optimization and I don't think we can safely do it (in fact, the reserve in push_list_elements is also a bit suspect). The problem is that reserve can cause quadratic behavior when called in a loop (http://slashslash.info/2013/11/capacity-members-for-vector-reserve/). And in general, the point of putting arguments on a stack is so that we CAN reuse the stack for nested calls without requiring more memory allocations. So reserve is not the right thing here.
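A small generic std::vector example of that quadratic hazard; this relies on common standard-library behavior (reserve allocating exactly the requested capacity) and is not PyTorch code:

```cpp
// Calling reserve() for "exactly what we need" before each small append can
// pin the capacity, so every append reallocates and copies the whole buffer,
// whereas plain push_back grows geometrically with O(n) amortized copies.
#include <cstdio>
#include <vector>

int main() {
  std::vector<int> pinned, amortized;
  std::size_t pinned_reallocs = 0, amortized_reallocs = 0;

  for (int i = 0; i < 10000; ++i) {
    auto cap = pinned.capacity();
    pinned.reserve(pinned.size() + 1);  // "reserve exactly what the next push needs"
    pinned.push_back(i);
    pinned_reallocs += (pinned.capacity() != cap);

    cap = amortized.capacity();
    amortized.push_back(i);             // rely on the vector's geometric growth
    amortized_reallocs += (amortized.capacity() != cap);
  }

  // Typically prints ~10000 reallocations for `pinned` versus ~15 for
  // `amortized`: O(n^2) total element copies versus O(n) amortized.
  std::printf("reallocations: pinned=%zu amortized=%zu\n",
              pinned_reallocs, amortized_reallocs);
}
```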
Summary: Pull Request resolved: pytorch/pytorch#26368. A boxed fallback function is a per-TensorTypeId generic function that is called if there is no operator-specific function (either for the TensorTypeId, or the UndefinedTensorId fallback for that operator).
Test Plan: Imported from OSS
Differential Revision: D17448555
Pulled By: ezyang
fbshipit-source-id: 43a5be7bdfb5c0b466d01a55380cb79e6d938044