-
Notifications
You must be signed in to change notification settings - Fork 3.8k
[Unity][IR] Purity Tracking #14394
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Unity][IR] Purity Tracking #14394
Conversation
|
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.
Generated by tvm-bot |
2 similar comments
|
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.
Generated by tvm-bot |
|
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.
Generated by tvm-bot |
c36741f to
1b23bd6
Compare
f246b75 to
1953c74
Compare
|
This will probably keep failing tests because lots of them depend on syntactic matches and this has messed up a lot of it. I will start going through them, but some have proven very difficult to debug. |
| # slight hack: normally, we would prefer to use True, but the func attrs, when printed, | ||
| # will have it as 1, so it would fail roundtripping otherwise | ||
| R.func_attr({"ForcePure": 1}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yongwww, would you happen to know how I could avoid this (print the attribute as True instead of 1)? It's not super-pressing, but it would help for roundtripping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the related printing should be https://github.com/apache/tvm/blob/unity/src/script/printer/relax/function.cc#L54-L58. Feel free to leave a todo for me
|
More general question (maybe one to leave to a community meeting): Where should we communicate that we expect certain passes to be run before others? We already have |
| Optional<ExprDoc> PrintAssertOp(const relax::Call& n, const ObjectPath& n_p, const IRDocsifier& d) { | ||
| static const Op& assert_op = Op::Get("relax.assert_op"); | ||
| if (!n->op.same_as(assert_op)) { | ||
| return NullOpt; | ||
| } | ||
| ICHECK(n->args.size() >= 2); | ||
| // special handling: it is important to indicate that the format string (second argument) | ||
| // is the _format_ string, or else roundtripping will fail | ||
| // (the format string will be interpreted as an argument and there will be a new default format | ||
| // string given) | ||
| Array<ExprDoc> args; | ||
| args.push_back(d->AsDoc<ExprDoc>(n->args[0], n_p->Attr("args")->ArrayIndex(0))); | ||
| ExprDoc second_arg = d->AsDoc<ExprDoc>(n->args[1], n_p->Attr("args")->ArrayIndex(1)); | ||
| for (size_t i = 2; i < n->args.size(); i++) { | ||
| args.push_back(d->AsDoc<ExprDoc>(n->args[i], n_p->Attr("args")->ArrayIndex(i))); | ||
| } | ||
| return Relax(d, "assert_op")->Call(args, {"format"}, {second_arg}); | ||
| } | ||
|
|
||
| Optional<ExprDoc> PrintRelaxPrint(const relax::Call& n, const ObjectPath& n_p, | ||
| const IRDocsifier& d) { | ||
| static const Op& print_op = Op::Get("relax.print"); | ||
| if (!n->op.same_as(print_op)) { | ||
| return NullOpt; | ||
| } | ||
| ICHECK(n->args.size() >= 1); | ||
| // special handling: it is important to indicate that the format string (first argument) | ||
| // is the _format_ string, or else roundtripping will fail | ||
| // (the format string will be interpreted as an argument and there will be a new default format | ||
| // string given) | ||
| ExprDoc first_arg = d->AsDoc<ExprDoc>(n->args[0], n_p->Attr("args")->ArrayIndex(0)); | ||
| Array<ExprDoc> args; | ||
| for (size_t i = 1; i < n->args.size(); i++) { | ||
| args.push_back(d->AsDoc<ExprDoc>(n->args[i], n_p->Attr("args")->ArrayIndex(i))); | ||
| } | ||
| return Relax(d, "print")->Call(args, {"format"}, {first_arg}); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One would think this isn't related to these changes at all, but in the process of testing roundtripping for call_pure, I discovered roundtripping bugs for print and assert_op (both having to do with their format strings).
src/relax/transform/run_codegen.cc
Outdated
| // preserve the purity: if the func was originally pure, wrap call_pure | ||
| bool purity = GetStructInfoAs<FuncStructInfoNode>(gvar)->purity; | ||
| auto ret = create_call_dps_packed(new_func, func->ret_struct_info); | ||
| if (purity) { | ||
| return WrapCallPure(ret); | ||
| } | ||
| return ret; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is another case where being clear about staging might make some of this reasoning unnecessary. We do codegen before doing the VM build in test_codegen_dnnl.py, so I wasn't sure if I should have eliminated dataflow blocks. The fusion passes seem to rely on having dataflow blocks, so I can't trivially get rid of purity checking and call_pure before fusion (unless we also disable checking purity from the well-formed check--we could consider having a third, internal-only annotation, that disables all purity checks, even inside dataflow blocks).
| } | ||
|
|
||
| TVM_REGISTER_OP("relax.call_builtin_with_ctx") | ||
| .set_num_inputs(4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just noticed this should be 2, it is not related to your change...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good observation
| .add_argument("data", "Tensor", "The input tensor.") | ||
| .set_attr<FInferStructInfo>("FInferStructInfo", InferStructInfoCumsum); | ||
| .set_attr<FInferStructInfo>("FInferStructInfo", InferStructInfoCumsum) | ||
| .set_attr<Bool>("FPurity", Bool(true)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will we run into issue if FPurity is not specified explicitly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should specify it explicitly. I have an assert in the well-formedness checker that it's defined.
a425bc7 to
5c8b7af
Compare
54747d7 to
68968c9
Compare
|
Tests failing due to the rebase (have to update some more ops/passes), will address shortly. |
68968c9 to
1e34266
Compare
…ping/unwrapping behavior
d1651ce to
c920bf5
Compare
|
Updated to reflect the fix from #14864, thanks to @jinhongyii 😄 |
In this PR, I am beginning to implement the tracking of function purity as part of the
StructInfosystem. This will allow the compiler to enforce that no impure function (one that can possibly have visible side effects) can be called in aDataflowBlock. Tracking this requires noting which operators are pure or impure (I am presently doing this with an operator attribute calledFPurity, a simple boolean), though dealing with calls to other Relax functions means that this information must also be tracked via theStructInfosystem.Additionally, it is difficult to infer the purity of a function in the general case (when there are calls to other Relax functions), so this change does require users to annotate impure functions using a new field on functions,
is_pure(in TVMScript, this can be done usingR.is_pure()orR.is_impure()). Since most Relax functions are likely to be pure and purity is the default assumption, this will hopefully not be a large imposition on users. We can consider inferring purity in the easier cases, since those are likely to be common.Note that
PackedFuncs are conservatively treated as impure. However, in situations where they are needed inside a dataflow block, a call to aPackedFuncthat is, in reality, pure can be done via the new operatorcall_pure_packedor the existing operatorcall_dps_packed(it is assumed that anyPackedFuncused with it will be pure). (Similarly, a new operatorinvoke_pure_closureis introduced as a counterpart toinvoke_closurefor dealing with closure objects, though this really should only come up with theLambdaLiftingpass.)As an "escape hatch" to the purity system, one can use the attribute
relax.force_pure, which indicates to the compiler to treat the entire function as pure even if it contains an impure call. Additionally, even thoughPackedFuncs are normally treated as impure, a user can usecall_pure_packedorcall_dps_packedto callPackedFuncs in dataflow blocks when appropriate. These can be used to deal with the following situations:relax.force_purewould be useful here.PackedFuncis, in reality, pure.call_pure_packedorcall_dps_packedare useful in this situation.Changes include:
DataflowBlocks in the well-formed check.relax.force_pureis set).call_pureoperatorStill to be done:
(Also address the design concerns below)
The process of implementing purity tracking has revealed a few issues that may require some further design discussion.
Labeling purity or impurity
Using function attributes to label purity/impurity seems very messy. It might be worth making this part of the
Functionnode's AST to avoid having to wrangle attributes in many places in the codebase.The treatment ofcall_purecall_purepresents a dilemma because many passes look for calls to certain operators, but withcall_pure, these would be "wrapped" like so:Call(Op("relay.call_pure"), [inner_op, arg1, ..., argn], attrs, sinfo_args). In the posted PR, many passes needed to be revised to look for instances ofcall_pureso that they could be treated the same as calls to ordinary operators. It might cause less disruption to passes to turncall_pureinto a specialized AST node that literally wraps a call node. This way visitors or mutators could have an easy default case and the lower-level code generation passes could simply ignore thecall_purenode and deal with the wrapped call. As painful as introducing a new AST node would be, it would likely be more maintainable than having every pass have to have a special case for thecall_pureoperator.The more restricted
call_pure_packed/invoke_pure_closureare less likely to come up in passes and so are less likely to need special handling.Staging
Related to the above issue, having purity tracking in the well-formedness analysis means that even low-level passes like
VMLowerBuiltin, which replaces some Relax operators withPackedFuncs (treated as impure by default, as they are in principle dangerous black boxes), have to insert calls tocall_pureto avoid triggering purity errors. Perhaps it might be useful to ignore purity by the time we reach lower stages of compilation, much as theToNonDataflowpass is used in the VM build process to eliminateDataflowBlocks.Update: Based on the TVM Unity Community Meeting on Mar. 28, 2023, I've provisionally adopted a variant of this approach thanks to a suggestion by @tqchen. Namely, there is a pass called
RemovePurityCheckingthat simply adds theForcePureattribute to all pure functions and unwraps all invocations ofcall_pure_packed,call_pure_dps_packed, andinvoke_pure_closure. This allows lower-level passes to remain unchanged and not to have to consider purity at all. It is inserted afterToNonDataflowin thebuild()function invm_build.py.Update 2: Thanks to suggestions by @tqchen, I've made
is_pureandforce_pureinto fields on functions rather than attributes and added syntactic sugar in TVMScript to make these easy to set.Update 3: Per @tqchen's suggestion
force_purewill be kept as an attribute instead. This is because it acts more as a direction to the compiler rather than an inherent property of a function.