
Add an opaque epilogue in AOTAutograd for aliasing/mutations #85036

@bdhirsh

Description


Today, when using aot_function or aot_module to trace an arbitrary PyTorch program, functionalization runs by default, creating a mutation-free program graph that gets shipped off to the compiler (whatever compiler you passed into aot_function).
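
For reference, a minimal sketch of that entry point, assuming the functorch-era `functorch.compile.aot_function` API: the compiler callback receives the traced (and functionalized) FX graph plus example inputs, and returns a callable.

```python
import torch
from functorch.compile import aot_function

def fn(a, b):
    return (a + b).sin()

def print_compiler(fx_module, example_inputs):
    # The graph handed to the compiler should be purely functional.
    fx_module.graph.print_tabular()
    return fx_module  # a GraphModule is itself callable

compiled_fn = aot_function(fn, fw_compiler=print_compiler)
out = compiled_fn(torch.randn(4, requires_grad=True), torch.randn(4))
```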

However, when the program being compiled contains external aliasing/mutations (e.g. the outputs alias the inputs, or the inputs get mutated), we are still forced to reflect that in the traced program. When we send the traced program to a compiler, though, we want the program that the compiler sees to be 100% functional - so we'd like to "hide" those parts of the graph from the compiler, forcing them to run in an epilogue.

(1) input mutations

When functionalization notices that the program it was passed includes mutations to inputs, it needs to preserve those input mutations. It does that by adding extra input.copy_(new_input_value) calls to the end of the graph.
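
A hand-written sketch (not actual captured output) of what functionalization produces for a program that mutates its input:

```python
import torch

def f(x):
    x.add_(1)
    return x * 2

# Roughly, the functionalized graph:
def f_functionalized(x):
    x_updated = torch.add(x, 1)   # functional replacement for x.add_(1)
    out = torch.mul(x_updated, 2)
    x.copy_(x_updated)            # extra copy_ appended to preserve the mutation
    return out
```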

A more subtle issue to deal with is input metadata mutations, like input.transpose_().

Keeping the copy_() calls in the graph and sending it off to various compiler passes is dangerous, because passes that assume a functional graph might silently do the wrong thing when they see mutating operators (for example, reordering the copy_() call or removing it entirely).
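
A hypothetical split of the graph sketched above: the compiler only ever sees the functional portion, while the copy_ runs in an opaque epilogue that no compiler pass can reorder or eliminate.

```python
# The part the compiler sees: 100% functional.
def f_compiled_part(x):
    x_updated = torch.add(x, 1)
    out = torch.mul(x_updated, 2)
    return out, x_updated

# The opaque epilogue: re-applies the input mutation outside the compiled graph.
def f_epilogue(x, x_updated):
    x.copy_(x_updated)
```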

(2) outputs aliasing inputs

Given a function like:

```python
def f(x):
    y = x.view(-1)
    z = y + y
    return (y, z)
```

One of the outputs (y) aliases an input (x). For context: nvfuser currently doesn't make guarantees around preserving aliasing when it performs fusions, and it might effectively turn the x.view(-1) into an x.view_copy(-1).
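
To illustrate why that rewrite is observable (a small sketch; torch.view_copy is the non-aliasing variant of view that functionalization introduces):

```python
import torch

x = torch.randn(4)
y = x.view(-1)
# y aliases x: they share the same storage.
assert y.data_ptr() == x.data_ptr()

# A copying variant produces a fresh tensor, so user code that relies on
# the output aliasing the input would silently break:
y_copy = torch.view_copy(x, (-1,))
assert y_copy.data_ptr() != x.data_ptr()
```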

To ensure that output-input aliasing is preserved properly, we can detect when an output aliases an input, regenerate the alias (probably with an as_strided() call), and run that regeneration logic in an opaque epilogue.
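
A minimal sketch, with hypothetical helper names: at trace time we record the view metadata of each output that aliases an input, and in the epilogue we rebuild the alias off the original input with as_strided, ignoring the (possibly non-aliasing) tensor the compiler actually returned for that output.

```python
def regenerate_alias(input_base, size, stride, storage_offset):
    # Rebuild the aliasing output as a true view of the input, using the
    # view metadata recorded at trace time.
    return input_base.as_strided(size, stride, storage_offset)

# For f above, y = x.view(-1) on a contiguous x would be regenerated as:
#   y = regenerate_alias(x, (x.numel(),), (1,), x.storage_offset())
```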

Status

There's a prototype that implements some of this logic in this PR (specifically, it adds an epilogue where input mutations are handled, but output aliasing is not).

One subtlety that came up is that aot_autograd currently wraps the compiled graph in an autograd.Function. autograd.Function only allows mutations to its inputs under certain conditions, so one open question is where this epilogue should actually live and run.

Some options:

(1) Store (and run) the epilogue inside of the autograd.Function object. As mentioned above, this isn't allowed in certain cases.

(2) Update aot_function to return something different, like a tuple of (autograd.Function, epilogue). This would be BC-breaking, and would require the caller (e.g. dynamo) to explicitly run the epilogue.

(3) Update aot_function to return something different, e.g. a torch.nn.Module that first runs the autograd.Function, and then runs the epilogue (sketched below). This is also technically BC-breaking, but less so, because running the epilogue could be done transparently inside the nn.Module.
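
A sketch of option (3), with hypothetical names: a Module that runs the compiled autograd.Function and then the epilogue, so existing callers keep a single callable and the epilogue runs transparently.

```python
import torch

class CompiledWithEpilogue(torch.nn.Module):
    def __init__(self, compiled_fn, epilogue_fn):
        super().__init__()
        self.compiled_fn = compiled_fn  # e.g. SomeAutogradFunction.apply
        self.epilogue_fn = epilogue_fn  # re-applies input mutations / output aliases

    def forward(self, *args):
        outs = self.compiled_fn(*args)
        # Mutations and alias regeneration happen outside the autograd.Function.
        return self.epilogue_fn(args, outs)
```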

Open to other ideas!

cc @ezyang @soumith @zou3519 @Chillee @samdow
