Conversation

@smessmer (Contributor) commented Aug 1, 2019

Stack from ghstack:

Changes in this PR:

  • For each operator with use_c10_dispatcher: True, additionally generate a c10 registration line in TypeDefault.cpp, CPUType.cpp, and other backend files.
  • This doesn't change globalATenDispatch yet; the c10 registration is purely additional and the operator calling path doesn't change. A diff further up the stack will change these things.
  • Enable the use_c10_dispatcher: True flag for about 70% of operators. This also changes the c10->jit operator export, because ATen ops are already exported to JIT directly and we don't want to export the registered c10 ops as well, since they would clash.
  • For this, we need a way to recognize whether a certain operator has already been moved from ATen to c10. This is done by generating an OpsAlreadyMovedToC10.cpp file with the list. A diff further up the stack will also need this file to make sure we don't break the backend extension API for these ops.

Reasons for some ops to be excluded (i.e. not have the use_c10_dispatcher flag set to true):

  • Tensor?(a!) (i.e. optional tensor with annotations) is not supported in the C++ function schema parser yet.
  • -> void in native_functions.yaml vs -> () expected by the function schema parser.
  • out functions have a different argument order in C++ than in the jit schema.
  • Tensor? (i.e. optional tensor) doesn't work nicely because an undefined tensor is sometimes treated as an undefined tensor and sometimes as None.
  • Fixed-size arrays like int[3] are not supported in c10 yet.

These will be fixed in separate diffs, and then the exclusion tag will be removed.
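The per-operator decision described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual codegen: the operator data and the emitted registration string are made up, but the shape of the decision (emit an extra c10 registration line only for flagged ops) matches the description.

```python
# Rough sketch of the codegen decision described in this PR: operators
# flagged use_c10_dispatcher: True get an additional c10 registration
# line emitted into the generated backend files. The data and the
# output strings here are hypothetical, not the real codegen templates.
ops = [
    {"name": "aten::add", "use_c10_dispatcher": True},
    # excluded: Tensor(a!) annotation not supported by the parser yet
    {"name": "aten::abs_", "use_c10_dispatcher": False},
]

def c10_registration_lines(ops):
    lines = []
    for op in ops:
        if op.get("use_c10_dispatcher"):
            lines.append(
                'static auto registry = c10::RegisterOperators()'
                '.op("{}", ...);'.format(op["name"]))
    return lines

for line in c10_registration_lines(ops):
    print(line)
```

Excluded operators simply produce no c10 registration line; their existing globalATenDispatch registration is unchanged.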

Differential Revision: [D16603131](https://our.internmc.facebook.com/intern/diff/D16603131/)

@gchanan (Contributor) commented Aug 6, 2019

What's the benefit of the registration? Are you going to try to move over the eager ops? Is it so the JIT can universally use the c10 dispatcher?

When are you planning to address the above limitations?

@gchanan (Contributor) commented Aug 6, 2019

What happens if someone adds a function with one of the above limitations? Do they get a compiler error? An error when they try to dispatch to the function (via JIT)?

There is no documentation in the README.

Why was the choice made to exclude rather than include via the flag?

@smessmer (Contributor, Author) commented Aug 6, 2019

Yes, I'm planning to move over all of native_functions.yaml to c10 so that we have one dispatch mechanism for everything instead of multiple ones.

Most of the above limitations will be addressed very soon. The argument order for out functions probably needs a bit more discussion with you on how to do it first.

Adding a function that violates the limitations is a compiler error.

The choice to exclude rather than include means that, over time, the code base can get better PR by PR, with fewer and fewer uses of the flag, until the flag is gone entirely. This follows the same pattern as ops annotated with matches_jit_signature: False.

I added a section to the README.

@pytorchbot pytorchbot added the module: docs Related to our documentation, both in docs/ and docblocks label Aug 6, 2019
@gchanan (Contributor) commented Aug 6, 2019

That sounds generally reasonable. Do I understand correctly: if a user adds a function that violates the limitations but doesn't add the exclusion, they will get a compiler error? If so, we shouldn't do that. We should either detect the error and print a message, or make the annotation opt-in.

@@ -105,13 +112,15 @@
named_guard: False

- func: abs_(Tensor(a!) self) -> Tensor(a!)
exclude_from_c10_dispatcher: True
Contributor:

I think you forgot to mention in the description that inplace and _out ops can't be registered at the moment because of Tensor& in their signature. Did you decide on an approach for this yet, btw?

Contributor Author:

No. out functions don't work anyhow because jit and C++ have a different argument order. Not sure yet how to fix this.

Collaborator:

According to @gchanan, the reason is that _out variants of functions have the out argument as kwarg_only in Python, but in C++ it's logical to always provide it while still having other args with default values. Thus the only option is to make it the first argument.

Relevant code seems to be:

is_out_fn = True

decl['jit_argument_order'] = [nargs - 1] + list(range(nargs - 1))

Note that all C++ APIs today (the public API, the kernel implementations) use the "reordered" variant with the out argument first. TorchScript parsing and the Stack in JIT use the "original" order.
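The reordering computed by jit_argument_order above can be illustrated with a small self-contained sketch (the argument names are hypothetical; the index formula is the one quoted from the codegen):

```python
# The jit schema puts the out argument last; the generated C++ API puts
# it first. jit_argument_order maps each position in the C++
# ("reordered") signature to the corresponding position in the jit
# ("original") signature.
jit_args = ["self", "other", "out"]  # original jit order, out last
nargs = len(jit_args)

# formula quoted from the codegen: move the last (out) argument first
jit_argument_order = [nargs - 1] + list(range(nargs - 1))  # [2, 0, 1]

cpp_args = [jit_args[i] for i in jit_argument_order]
print(cpp_args)  # ['out', 'self', 'other']
```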

If we try to introduce reordering logic in c10 it'd cause a mess when switching between boxed and unboxed APIs.

I see several options:

  1. switch kernel definitions to "original" order (giant codemod) and keep reordering only in generated C++ API exposed to user
  2. try mirroring reordering logic in c10 such that all boxing/unboxing functions do reordering
  3. give up the idea of "no codegen" (which I think is reasonable in the mid-term). Then for native_functions we can provide the ability to codegen boxing/unboxing wrappers and just register them with c10 as hooks (utilizing the current codegen). The default hook would be the regular templated wrapper generated automatically based on the C++ signature, for cases where there's no reordering.

Contributor:

I'd view this problem as analogous to the TensorOptions issue. There, we do some codegen to make the C++ API nice (good!), but then need to reimplement the crazy logic in multiple places (bad!). None of these endpoints need to use the fancy C++ signature; they aren't user endpoints, so a dumb signature is totally fine.

Now, you may or may not want a single way of telling downstream components that they should actually map to a different C++ function call so we could have private implementation calls (which would allow us to e.g. only expose a single "public" function for factories and out functions, as we do now), but that logic can be super simple.

@smessmer (Contributor, Author) commented Aug 7, 2019

@gchanan I changed it to be an opt-in flag instead of opt-out.

@dzhulgakov (Collaborator) commented:

I think you need to patch some more tests - RPC now uses overload signatures too, it seems


// TODO Once all ATen ops are moved to c10, this file should be removed

const std::unordered_set<c10::OperatorName>& aten_ops_already_moved_to_c10() {
Contributor:

IIUC, @gchanan is trying to get rid of the checked-in autogenerated files. So I'm pretty unhappy that we're adding another checked-in autogenerated file in this patch. Is it absolutely necessary that this file lives in core? In particular, I find it hard to believe that mobile actually needs to depend on this file (and it really shouldn't, because you're going to pay binary size to hold the unordered set and strings). If mobile doesn't depend on this file, there is no reason for the autogenerated file to be checked in.

@smessmer (Contributor, Author) commented Sep 10, 2019:

I've tried several other alternatives, like adding this code to VariableType.cpp or TypeDefault.cpp, but they didn't work. ATenDispatch.h/cpp depends on this and is part of the ATen/core build, and not having this file in ATen/core caused linker errors.

I don't really like this either but I think it's ok as an intermediate solution. There's a comment in the file that it ought to be deleted as soon as we can.

Contributor:

My comment at #23668 (comment) is applicable here

// setting cache creator to nullptr so calling the kernel doesn't need to call it, which would be expensive
// This, however, only works if there are no constructor parameters (i.e. no runtime function pointer)
// Backend extensions use runtime function pointers, so we need a cache if sizeof...(ConstructorParameters) != 0
(sizeof...(ConstructorParameters) == 0) ? KernelCacheCreatorFunction(nullptr) : detail::KernelFactory<KernelFunctor, guts::decay_t<ConstructorParameters>...>(std::forward<ConstructorParameters>(constructorParameters)...),
Contributor:

So, this patch also adds support for some operators which didn't work before?

Contributor Author:

This is not for new operators but for backend extensions. All ATen operators in native_functions.yaml get codegenned registrations that don't need this. But if you look at "Additional concern: Extension backends" in #24132, you'll see the plan is to keep globalATenDispatch().registerOp() working until we've moved all ops to c10, and only then break all ops at once for extension backends. To keep globalATenDispatch().registerOp() working, we need this.
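The redirection described above can be sketched as follows. Everything here is hypothetical (set contents, kernel names, and the dictionaries standing in for the two dispatchers); the point is only the shape of the decision that keeps the old registration API working.

```python
# Hypothetical sketch of the transition plan described above: a backend
# extension calling the old registration API gets redirected to c10 for
# ops that have already been moved, so the old API keeps working until
# everything is on c10 and can be switched over at once.
aten_ops_already_moved_to_c10 = {"aten::add"}

c10_registrations = {}   # stands in for the c10 dispatcher
aten_registrations = {}  # stands in for globalATenDispatch

def register_op(name, kernel):
    if name in aten_ops_already_moved_to_c10:
        c10_registrations[name] = kernel   # redirect to c10
    else:
        aten_registrations[name] = kernel  # stays on globalATenDispatch

register_op("aten::add", "add_kernel")    # already moved: goes to c10
register_op("aten::abs_", "abs__kernel")  # not moved yet: stays on ATen
```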

@ezyang (Contributor) commented Sep 10, 2019

It looks like the patch should work. My primary objection is procedural: adding another checked in autogenerated file.

@smessmer smessmer requested a review from ezyang September 10, 2019 16:26
CAFFE2_API const std::unordered_set<c10::OperatorName>& aten_ops_already_moved_to_c10();

// list of ATen ops that are still on the globalATenDispatch dispatcher.
CAFFE2_API const std::unordered_set<c10::OperatorName>& aten_ops_not_moved_to_c10_yet();
Contributor:

Two sets?! :O

@smessmer (Contributor, Author) commented Sep 12, 2019:

There's semantically three sets of operators:

  • aten_ops_already_moved_to_c10
  • aten_ops_not_moved_to_c10_yet
  • non_aten_ops (e.g. custom ops)

In the previous approach:

  • register_c10_ops.cpp needed to decide if an op is in aten_ops_already_moved_to_c10 or in union(aten_ops_not_moved_to_c10_yet, non_aten_ops) because it needs to exclude the former.
  • ATenDispatch needed to decide if an op is in aten_ops_already_moved_to_c10 or in union(aten_ops_not_moved_to_c10_yet, non_aten_ops) so it can register them with c10 instead.

Since both just needed the very same decision, it was enough to store aten_ops_already_moved_to_c10 and base the decision on that.

Now, since in the new approach it is the c10 dispatcher forwarding to ATenDispatch, it needs to decide if an op is in aten_ops_not_moved_to_c10_yet or in union(aten_ops_already_moved_to_c10, non_aten_ops), which is different from what register_c10_ops.cpp needs. I need to store two sets to be able to make both decisions.
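The two decisions above can be sketched as predicates over the three semantic sets (the set contents and the custom op name are hypothetical; only the membership logic reflects the explanation):

```python
# Three semantic sets of operators, per the explanation above.
aten_ops_already_moved_to_c10 = {"aten::add"}
aten_ops_not_moved_to_c10_yet = {"aten::abs_"}
# non-ATen ops (e.g. custom ops) are in neither set

def export_to_jit(op):
    # register_c10_ops.cpp: skip ATen ops already exported to JIT directly
    return op not in aten_ops_already_moved_to_c10

def forward_to_aten_dispatch(op):
    # c10 dispatcher: forward only ops still on globalATenDispatch
    return op in aten_ops_not_moved_to_c10_yet

# A custom op shows why one set is not enough: it must be exported to
# JIT, but must not be forwarded to ATenDispatch. Neither predicate can
# be derived from the other set alone.
print(export_to_jit("my::custom_op"),
      forward_to_aten_dispatch("my::custom_op"))  # True False
```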

Contributor:

^-- this should be in the source code

@pytorchbot pytorchbot added the module: cpp-extensions Related to torch.utils.cpp_extension label Sep 12, 2019
@ezyang (Contributor) left a comment:

Reiterating from the closed PR: I think the RegisterOperators call is bad (having to specify the schema string twice is a big no-no) and needs to be fixed.

@smessmer (Contributor, Author) commented:
Don't know what exactly happened, but somehow ghexport closed this. Reopening in #26131

Labels

module: build (Build system issues) · module: cpp-extensions (Related to torch.utils.cpp_extension) · module: docs (Related to our documentation, both in docs/ and docblocks) · module: internals (Related to internal abstractions in c10 and ATen) · oncall: jit (Add this issue/PR to JIT oncall triage queue)


7 participants