
Conversation

@jansel
Contributor

@jansel jansel commented Oct 10, 2022

This is the subset of the changes in #86461 not auto-generated by copy_to_core.sh.

@jansel jansel requested a review from albanD as a code owner October 10, 2022 18:33
@pytorch-bot pytorch-bot bot added the release notes: fx release notes category label Oct 10, 2022
@pytorch-bot

pytorch-bot bot commented Oct 10, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/86621

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1bf2406:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


@jansel jansel changed the title Manual for moving dynamo to core Manual changes for moving dynamo to core Oct 10, 2022
Contributor

@malfet malfet left a comment

Can you please use stacks and say land just the dynamo cpp module as is? (Which sounds totally fine to me)

@jansel
Contributor Author

jansel commented Oct 10, 2022

Can you please use stacks and say land just the dynamo cpp module as is? (Which sounds totally fine to me)

Isn't this PR almost entirely that? I am thinking we just land this, then rebase #86461.

Collaborator

@albanD albanD left a comment

guards.cpp not done yet.

#include <pystate.h>

// see https://bugs.python.org/issue35886
#if PY_VERSION_HEX >= 0x03080000
Collaborator

Should this be located in torch/csrc/utils/python_compat.h with all other similar changes?

Contributor Author

We may want to keep this one in this file because those python headers don't import cleanly in C++.
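For context, a minimal sketch of the kind of guard being discussed (the exact header name is an assumption based on the bugs.python.org/issue35886 workaround, not necessarily the PR's code):

```c
// CPython >= 3.8 moved some interpreter-state internals behind
// Py_BUILD_CORE (see https://bugs.python.org/issue35886). These internal
// headers are plain C and do not import cleanly as C++, which is the
// reason given for keeping this in the .c file instead of
// torch/csrc/utils/python_compat.h.
#if PY_VERSION_HEX >= 0x03080000
#define Py_BUILD_CORE
#include <internal/pycore_pystate.h>
#undef Py_BUILD_CORE
#endif
```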

#define true 1

#ifdef _WIN32
#define unlikely(x) (x)
Collaborator

Use C10_UNLIKELY for this.

Contributor Author

This is C code, not C++.

Collaborator

Ho, c10/macros/Macros.h is not C compliant?
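For reference, the plain-C branch-prediction hint shown in the surrounding diff (a sketch of the pattern, since C10_UNLIKELY comes from a C++ header):

```c
// Branch-prediction hint usable from plain C: MSVC has no
// __builtin_expect, so the hint degrades to a no-op on Windows.
#ifdef _WIN32
#define unlikely(x) (x)
#else
#define unlikely(x) __builtin_expect((x), 0)
#endif
```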

#undef Py_BUILD_CORE
#endif

#define bool char
Collaborator

??

Contributor Author

C doesn't have a bool type.

Collaborator

From Jason: this has to be plain C and so we redefine these.
Adding a comment would be nice.

Contributor

Err? C99 supports bool, one just needs to include <stdbool.h>.
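A minimal sketch of the <stdbool.h> route malfet mentions (assuming C99, which CPython already requires; the helper name is purely illustrative), as an alternative to the `#define bool char` shim:

```c
#include <stdbool.h>  // C99: bool, true, false as macros over _Bool
#include <stdio.h>

// Hypothetical helper just to show bool in a plain-C signature.
static bool is_skipped(const void* cache_entry) {
  return cache_entry == NULL;
}

int main(void) {
  printf("%d\n", is_skipped(NULL));  // prints 1
  return 0;
}
```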

#define unlikely(x) __builtin_expect((x), 0)
#endif

#define NULL_CHECK(val) \
Collaborator

Should we have a TODO to migrate these to use TORCH_CHECK and TORCH_INTERNAL_ASSERT_DEBUG_ONLY?

Collaborator

Also we have throw python_error(); that should be used here.

Contributor Author

@jansel jansel Oct 10, 2022

C code, not C++ -- so we can't throw.

We can migrate this in another PR. For now this is just a copy (with a minor tweak to the glue code in the last function).
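For context, a plausible shape for a plain-C NULL_CHECK of this kind (a sketch under the constraint that C++ exceptions and python_error() are unavailable here; the exact body is an assumption):

```c
// Plain-C error check: there is no C++ `throw` available, so on a NULL
// result we report where the check tripped, surface any pending Python
// error, and abort. Assumes <stdio.h>, <stdlib.h>, and Python.h.
#define NULL_CHECK(val)                                           \
  if (unlikely((val) == NULL)) {                                  \
    fprintf(stderr, "NULL ERROR: %s:%d\n", __FILE__, __LINE__);   \
    PyErr_Print();                                                \
    abort();                                                      \
  } else {                                                        \
  }
```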

inline static PyObject* eval_frame_callback_get(void) {
void* result = PyThread_tss_get(&eval_frame_callback_key);
if (unlikely(result == NULL)) {
Py_RETURN_NONE;
Collaborator

This one returns a new reference while the one below returns a borrowed reference.
The callers, AFAICT, expect a borrowed one.
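A minimal sketch of the borrowed-reference variant the comment points toward (an assumption: Py_RETURN_NONE increfs Py_None and hands back a new reference, whereas this returns borrowed references in both branches):

```c
// Returns a *borrowed* reference in both branches: callers must not
// Py_DECREF the result or cache it past the next callback change.
inline static PyObject* eval_frame_callback_get(void) {
  void* result = PyThread_tss_get(&eval_frame_callback_key);
  if (unlikely(result == NULL)) {
    return Py_None;  // borrowed; no incref, unlike Py_RETURN_NONE
  }
  return (PyObject*)result;  // borrowed from the TSS slot
}
```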

// - Python callable(): enables TorchDynamo
PyObject* old_callback = eval_frame_callback_get();

// owned by caller
Collaborator

This incref/decref is not necessary since we're stealing the reference here.
And the ref from the TSS can be returned directly.
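For illustration, a sketch of what "stealing the reference" means for the setter side (the helper's exact shape is an assumption, not the PR's code):

```c
// The TSS slot takes ownership of `callback` (the reference is stolen),
// so the caller performs no extra Py_INCREF before the handoff and must
// not Py_DECREF afterwards; the previously stored reference is what can
// be handed straight back as the old callback.
inline static void eval_frame_callback_set(PyObject* callback) {
  PyThread_tss_set(&eval_frame_callback_key, (void*)callback);
}
```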

_methods};

PyObject* torch_c_dynamo_eval_frame_init(void) {
extra_index = _PyEval_RequestCodeExtraIndex(ignored);
Collaborator

Also, shouldn't we pass in free as the release function to properly clean up the cache on code objects?
That won't work with the custom SKIP_CODE, I guess, but a simple custom function should do it.
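A sketch of the release function being suggested (an assumption, not the PR's code): it would be registered with _PyEval_RequestCodeExtraIndex so cache entries die with their code objects, while leaving the SKIP_CODE sentinel alone.

```c
// Passed to _PyEval_RequestCodeExtraIndex() so CPython calls it when a
// code object is destroyed. SKIP_CODE is not a real allocation, so it
// is skipped; real entries would go through a proper cache destructor.
static void destroy_extra_state(void* obj) {
  if (obj != NULL && obj != SKIP_CODE) {
    free(obj);  // placeholder for a real cache-entry destructor
  }
}

// Registered at module init instead of the no-op callback:
//   extra_index = _PyEval_RequestCodeExtraIndex(destroy_extra_state);
```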

#endif

// Flag to just run a frame normally
#define SKIP_CODE ((void*)0x1)
Collaborator

nit: could we use a long-lived cache entry instead of a fake pointer?
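For what it's worth, one way to spell the long-lived sentinel alternative (a sketch, not the PR's code):

```c
// A static object whose address marks "run this frame normally",
// avoiding the fake (void*)0x1 pointer while staying cheap to compare.
static char skip_code_sentinel;
#define SKIP_CODE ((void*)&skip_code_sentinel)
```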

Py_RETURN_NONE;
}

static PyMethodDef _methods[] = {
Collaborator

All of these should have their content wrapped into HANDLE_TH_ERRORS / END_HANDLE_TH_ERRORS{_RET}
to ensure proper propagation of errors and warnings.

Contributor Author

This only exists in C++.

Collaborator

We could move these bindings out of the C file, though?

@albanD albanD added the ciflow/binaries Trigger all binary build and upload jobs on the PR label Oct 10, 2022

namespace {

struct LocalState {
Collaborator

Why not use at::ThreadLocalState?

Contributor Author

This isn't actually thread-local; it is a snapshot of thread-local data.

Collaborator

Which is exactly what this class is for!
It holds the thread-local data so that we can move it around.

Contributor Author

It's just a cache, since reading thread-local data is slow.

Collaborator

Sure, that still sounds like the same class. You just have fewer fields here.
We can keep this for perf, for sure.

bool dynamic_shapes)
: pytype(pt),
dispatch_key_(state.apply(v.key_set()).raw_repr()),
dtype_(v.dtype().toScalarType()),
Collaborator

v.scalar_type()

in the check functions as well

at::ScalarType dtype_;
bool requires_grad_;
bool dynamic_shapes_;
std::vector<int64_t> sizes_;
Collaborator

Size and stride without storage_offset are usually wrong.
Is there a good reason not to have it here?

Contributor Author

I'd be fine adding it. This check isn't for dynamo, it is for the backends.

With this, we assume backends produce offset-independent code. This is true for inductor, but could be false for other backends.

Collaborator

Ok. A comment would be great.


typedef struct {
PyObject_HEAD;
ChecksList* checks;
Collaborator

unique_ptr?
That would remove the need for a custom dealloc.

} TensorGuards;

static void TensorGuards_dealloc(TensorGuards* self) {
if (self->checks != NULL) {
Collaborator

nullptr

for (auto i : c10::irange(len)) {
PyObject* item = PyTuple_GET_ITEM(args, i);
if (!THPVariable_CheckExact(item) && !THPVariable_Check(item)) {
PyErr_SetString(PyExc_TypeError, "expected Tensor()");
Collaborator

Should we clean up the partially filled checks when an error happens?

Py_RETURN_TRUE;
}

static PyMethodDef TensorGuards_methods[] = {
Collaborator

nit: these bindings would be a lot simpler via pybind.

Contributor Author

pybind is crazy slow.

Collaborator

yes :/

PyObject* item = PyTuple_GET_ITEM(args, i);
if (Py_TYPE(item) != checks[i].pytype) {
std::stringstream fail_reason;
PyObject* type_str = PyObject_Str(PyObject_Type(item));
Collaborator

PyObject_Type returns a new reference. A decref, or a wrap in a THPObjectPtr, is needed.
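For context, a sketch of the plain C-API decref fix (THPObjectPtr is the RAII alternative on the C++ side; variable names here are illustrative):

```c
// PyObject_Type() returns a *new* reference; chaining it straight into
// PyObject_Str() drops that reference on the floor. Keeping the
// intermediate pointer lets us release both objects once the failure
// message has been built.
PyObject* item_type = PyObject_Type(item);     // new reference
PyObject* type_str = PyObject_Str(item_type);  // new reference
/* ... append the contents of type_str to fail_reason ... */
Py_XDECREF(type_str);
Py_XDECREF(item_type);
```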

Collaborator

@albanD albanD left a comment

Stamping; comments will be addressed in a follow-up by @voznesenskym.

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 11, 2022
@jansel
Contributor Author

jansel commented Oct 11, 2022

@pytorchbot merge -l

@pytorchmergebot
Collaborator

Merge started

The -l land checks flag is deprecated and no longer needed. Instead, we now automatically add the ciflow/trunk label to your PR once it's approved.

Your change will be merged once all checks on your PR pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot
Collaborator

Merge failed

Reason: 2 additional jobs have failed; the first few of them are: trunk, trunk / libtorch-linux-bionic-cuda11.6-py3.7-gcc7 / build

Details for Dev Infra team (raised by workflow job).

@jansel
Contributor Author

jansel commented Oct 11, 2022

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@jansel jansel deleted the dynamo_changes branch October 11, 2022 23:06
IvanYashchuk added a commit to csarofeen/pytorch that referenced this pull request Oct 13, 2022
commit f925b26
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Oct 13 21:45:09 2022 +0300

    Allow skipping view with skip_ops

commit ddb769e
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Oct 13 21:38:04 2022 +0300

    Add varargs support for view

commit a9cdefa
Author: Ivan Yashchuk <[email protected]>
Date:   Wed Oct 12 18:46:46 2022 +0300

    Use ops.view name

commit 986d76b
Author: Ivan Yashchuk <[email protected]>
Date:   Wed Oct 12 18:27:37 2022 +0300

    Fix duplicate

commit 1c9c9c6
Author: Ivan Yashchuk <[email protected]>
Date:   Wed Oct 12 16:49:52 2022 +0300

    Add print for ViewOpRecord

commit a67e6c2
Merge: b07eeb0 2344135
Author: Ivan Yashchuk <[email protected]>
Date:   Wed Oct 12 16:43:53 2022 +0300

    Merge remote-tracking branch 'upstream/viable/strict' into nvprims-view

commit 2344135
Author: Khushi <[email protected]>
Date:   Wed Oct 12 07:00:40 2022 +0000

    [primTorch] special: entr, expit (pytorch#86592)

    Add _refs for `entr` & `expit`.

    cc @mruberry @kshitij12345!
    Pull Request resolved: pytorch#86592
    Approved by: https://github.com/mruberry

commit a47f93b
Author: Sherlock Huang <[email protected]>
Date:   Wed Oct 12 02:26:02 2022 +0000

    Add type and shape annotation for gm.print_readable() (pytorch#86562)

    For
    ```
    def f(a, b):
        dim0 = a.shape[0] + b.shape[0]
        dim1 = a.shape[1] + b.shape[1]
        d = a.new_empty(dim0, dim1)
        return d

    fx_g = make_fx(f, tracing_mode="symbolic")(torch.randn(5, 3), torch.randn(4, 3))
    fx_g.print_readable()
    ```

    Tracing with 'real' and 'fake' mode yields
    ```
    class f(torch.nn.Module):
        def forward(self, a_1: Tensor<f32>[5, 3], b_1: Tensor<f32>[4, 3]):

            # No stacktrace found for following nodes
            new_empty: Tensor<f32>[9, 6] = torch.ops.aten.new_empty.default(a_1, [9, 6], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False);  a_1 = None
            return new_empty
    ```

    Tracing with 'symbolic' mode yields
    ```
        def forward(self, a_1: Tensor<f32>[t0.size(0), t0.size(1)], b_1: Tensor<f32>[t1.size(0), t0.size(1)]):

            # No stacktrace found for following nodes
            sym_size: Symint(t0.size(0)) = torch.ops.aten.sym_size(a_1, 0)
            sym_size_1: Symint(t1.size(0)) = torch.ops.aten.sym_size(b_1, 0)
            add: Symint(t0.size(0) + t1.size(0)) = sym_size + sym_size_1;  sym_size = sym_size_1 = None
            sym_size_2: Symint(t0.size(1)) = torch.ops.aten.sym_size(a_1, 1)
            sym_size_3: Symint(t0.size(1)) = torch.ops.aten.sym_size(b_1, 1);  b_1 = None
            add_1: Symint(2*t0.size(1)) = sym_size_2 + sym_size_3;  sym_size_2 = sym_size_3 = None
            new_empty: Tensor<f32>[t0.size(0) + t1.size(0), 2*t0.size(1)] = torch.ops.aten.new_empty.default(a_1, [add, add_1], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False);  a_1 = add = add_1 = None
            return new_empty
    ```

    Pull Request resolved: pytorch#86562
    Approved by: https://github.com/Chillee

commit e0d6898
Author: PyTorch MergeBot <[email protected]>
Date:   Wed Oct 12 04:12:43 2022 +0000

    Revert "Backport currently dont work with some models if: (pytorch#86510)"

    This reverts commit 4bfb734.

    Reverted pytorch#86510 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally

commit 25725fd
Author: Eddie Yan <[email protected]>
Date:   Wed Oct 12 03:44:21 2022 +0000

    (Re-open) Adds cudaMallocAsync as an alternative backend for the CUDA allocator (pytorch#82682)

    Rebased version of @mcarilli 's cudaMallocAsync pytorch#65365 for continued testing
    Pull Request resolved: pytorch#82682
    Approved by: https://github.com/ngimel

commit a216f47
Author: Nikita Shulga <[email protected]>
Date:   Wed Oct 12 01:45:21 2022 +0000

     Add  testing on A10G GPU to periodic workflow (pytorch#85524)

    This enables testing on lots of modern CUDA features on sm_86 capable GPU

    While migrating to that platform, discovered that `functorch` tests for `nn.functional.conv.transpose3d` produce garbage on sm_80+ as well as 2 `nvfuser` tests unexpectedly pass and one unexpectedly fails.

    TODO:
     - Investigate unexpected success for `test_vmapvjp_linalg_householder_product_cuda_float32` and add `functorch` shard

    Pull Request resolved: pytorch#85524
    Approved by: https://github.com/ngimel

commit c4f0b93
Author: Elias Ellison <[email protected]>
Date:   Tue Oct 11 01:24:48 2022 +0000

    Disable autocast in aot autograd (pytorch#86515)

    Fix for pytorch/torchdynamo#1368

    From comment:
    > When we invoke a Composite Implicit autograd operator that has an autocast rule, such as Einsum,
    autocast is disabled during its invocation. When we trace out the operators in an implicit op,
    re-applying on autocast rules on those operators might yield divergence from what was executed at runtime.
    This pass checks for divergence. If divergence is found, we will disable autocast.
    We would like to avoid disabling autocast if possible because accessing TLS is slow.

    Concretely, the problem was found when `sum` was invoked in `einsum`:

    As seen by the following divergence:
    ```
    >>> with torch.cuda.amp.autocast(enabled=True):
    ...     print(torch.ops.aten.sum.dim_IntList(torch.rand([2, 2, 2], device="cuda", dtype=torch.half), [1, 2]).dtype)
    ...
    torch.float32
    >>> print(torch.ops.aten.sum.dim_IntList(torch.rand([2, 2, 2], device="cuda", dtype=torch.half), [1, 2]).dtype)
    torch.float16
    ```

    Edit: we've decided to accept the overhead of universally disabling autocast instead
    Pull Request resolved: pytorch#86515
    Approved by: https://github.com/bdhirsh, https://github.com/Chillee

commit d598290
Author: Christian Puhrsch <[email protected]>
Date:   Wed Oct 12 01:27:57 2022 +0000

    Basic SDP benchmark harness (pytorch#86729)

    Basic benchmark for reference and discussion.
    Pull Request resolved: pytorch#86729
    Approved by: https://github.com/drisspg

commit 4bfb734
Author: Han Qi (qihqi) <[email protected]>
Date:   Wed Oct 12 00:39:25 2022 +0000

    Backport currently dont work with some models if: (pytorch#86510)

    Backport currently dont work with some models if:

    * model is originally exported with interface call enabled (backport would disable it)
    * model is flatbuffer (flatbuffer support is soft enabled via link time registry), so we manually trigger it

    Fixes #ISSUE_NUMBER

    Pull Request resolved: pytorch#86510
    Approved by: https://github.com/cccclai

commit ce48df9
Author: Bin Bao <[email protected]>
Date:   Tue Oct 11 20:31:12 2022 +0000

    Re-enable torchdynamo unit tests (pytorch#86658)

    Pull Request resolved: pytorch#86658
    Approved by: https://github.com/jansel

commit 692b525
Author: Nikita Shulga <[email protected]>
Date:   Wed Oct 12 00:32:53 2022 +0000

    [MPS] Extend unary ops to int64 (pytorch#86615)

    Most of them are already supported for `int64` except for:
     - rounding operations (`floor`, `ceil` and `round`), which are no-ops for integral types anyway
     - sign operation, which can be emulated by clamping the tensor to the [-1, 1] range

    Test new types by test MPS

    Fixes pytorch#86319

    Pull Request resolved: pytorch#86615
    Approved by: https://github.com/DenisVieriu97, https://github.com/huydhn

commit f912b58
Author: PyTorch MergeBot <[email protected]>
Date:   Tue Oct 11 23:53:12 2022 +0000

    Revert "Enable max.unary_out (pytorch#85926)"

    This reverts commit 16a0fa1.

    Reverted pytorch#85926 on behalf of https://github.com/osalpekar due to The internal diff for this commit shows a number of pytorch quantization test failures. Here is a sample output: AssertionError: Tensor-likes are not close! Mismatched elements: 319 / 320 (99.7%). Greatest absolute difference: 0.056652069091796875 at index (0, 0, 4, 5) (up to 1e-05 allowed). Link to the diff: [D40232598](https://www.internalfb.com/diff/D40232598). Link to the Sandcastle job that is failing: https://www.internalfb.com/intern/sandcastle/job/18014399302908587/

commit 2aa981a
Author: PyTorch MergeBot <[email protected]>
Date:   Tue Oct 11 23:39:50 2022 +0000

    Revert "Reland 2 of Merge more symbolic meta kernels and symint changes from branch (pytorch#86334) (pytorch#86488)"

    This reverts commit 978b46d.

    Reverted pytorch#86488 on behalf of https://github.com/osalpekar due to Broke executorch builds internally with the following message: RuntimeError: Missing out variant for functional op: aten::split.Tensor(Tensor(a -> *) self, SymInt split_size, int dim=0) -> Tensor(a)[] . Make sure you have loaded your custom_ops_generated_lib

commit 9eb4f9d
Author: Nikita Shulga <[email protected]>
Date:   Tue Oct 11 19:49:23 2022 +0000

    Tweak test tolerances to be compatible with A10G (pytorch#86538)

    Pull Request resolved: pytorch#86538
    Approved by: https://github.com/ngimel

commit 7fa601b
Author: Nikita Shulga <[email protected]>
Date:   Tue Oct 11 23:27:30 2022 +0000

    Skip chalf.mean in  test_reductions_large_half_tensors (pytorch#86747)

    As `mean_reduce` is not implemented for complex half

    Fixes pytorch#86743 and unblock A10G testing

    Pull Request resolved: pytorch#86747
    Approved by: https://github.com/ngimel

commit 811b8e0
Author: PyTorch MergeBot <[email protected]>
Date:   Tue Oct 11 23:12:40 2022 +0000

    Revert "min/max support for SymInt/Floats, finish as_strided/scatter/squeeze() backward symint support (pytorch#86643)"

    This reverts commit 86f914e.

    Reverted pytorch#86643 on behalf of https://github.com/osalpekar due to Need to revert this to cleanly revert pytorch#86488. This should be safe to re-land later

commit f1fdb6e
Author: Jason Ansel <[email protected]>
Date:   Tue Oct 11 23:01:21 2022 +0000

    Manual changes for moving dynamo to core (pytorch#86621)

    This is the subset of the changes in pytorch#86461 not auto-generated by `copy_to_core.sh`.
    Pull Request resolved: pytorch#86621
    Approved by: https://github.com/albanD

commit 09364f4
Author: Nikita Shulga <[email protected]>
Date:   Tue Oct 11 22:39:58 2022 +0000

    Compile C10 with `Wshadow` (pytorch#86666)

    This should prevent further regressions like pytorch#86646
    Update `fmt` to `7.1.0` to fix variable shadowing in that library

    Pull Request resolved: pytorch#86666
    Approved by: https://github.com/seemethere

commit 0337f0a
Author: Zain Rizvi <[email protected]>
Date:   Tue Oct 11 21:56:01 2022 +0000

    Add error checking to flaky test bot platform parser (pytorch#86632)

    If an invalid platform is specified when disabling a test with flaky test bot, the CI crashes, skipping all tests that come after it.

    This turns it into a console message instead.  Not erroring out here since it'll affect random PRs.  Actual error message should go into the bot that parses the original issue so that it can respond on that issue directly
    Pull Request resolved: pytorch#86632
    Approved by: https://github.com/huydhn

commit 42bd275
Author: Partho <[email protected]>
Date:   Tue Oct 11 21:41:48 2022 +0000

    [doc] LR scheduler example fix (pytorch#86629)

    Fixes issue pytorch#86208
    As suggested in the issue, updated the LR scheduler example to use a regular nn.Module like the other examples on the same page.
    Pull Request resolved: pytorch#86629
    Approved by: https://github.com/soulitzer

commit 32152ce
Author: jimku9 <[email protected]>
Date:   Tue Oct 11 21:21:53 2022 +0000

    Add original sources/references to Wishart.py in distributions (pytorch#86543)

    @fritzo As discussed, add original sources/references to Wishart.py in distributions and corrected typos in the error messages.

    Pull Request resolved: pytorch#86543
    Approved by: https://github.com/fritzo

commit 50af1ac
Author: Sherlock Huang <[email protected]>
Date:   Tue Oct 11 17:56:59 2022 +0000

    Mark aten ops as canonical (pytorch#86215)

    This is the first batch of canonical aten ops. 87 in total. More to come in the future PRs.

    native_dropout
    abs
    add.Tensor
    add.Scalar
    arange.start_step
    bitwise_not
    bmm
    cat
    clamp
    constant_pad_nd
    convolution
    convolution_backward
    div.Tensor
    div.Scalar
    embedding_dense_backward
    erf
    exp
    expand
    fill.Scalar
    grid_sampler_2d
    native_group_norm
    native_group_norm_backward
    native_layer_norm
    native_layer_norm_backward
    log
    _log_softmax
    max.dim
    amax
    mean.dim
    min.dim
    amin
    mm
    mul.Tensor
    mul.Scalar
    native_batch_norm
    permute
    scalar_tensor
    reciprocal
    neg
    repeat
    relu
    gelu
    rsqrt
    sigmoid
    slice.Tensor
    slice_scatter
    _softmax
    squeeze.dim
    sum.dim_IntList
    sqrt
    tanh
    unsqueeze
    var.dim
    where.self
    clone
    sub.Tensor
    sub.Scalar
    addmm
    _to_copy
    view
    scatter_add
    bitwise_and.Tensor
    bitwise_or.Tensor
    eq.Scalar
    ge.Scalar
    le.Scalar
    gt.Scalar
    lt.Scalar
    index_select
    nonzero
    gather
    maximum
    minimum
    pow.Tensor_Scalar
    hardtanh
    leaky_relu
    _adaptive_avg_pool2d
    _adaptive_avg_pool2d_backward
    avg_pool2d
    avg_pool2d_backward
    max_pool2d_with_indices
    max_pool2d_with_indices_backward
    upsample_bilinear2d.vec
    upsample_bilinear2d_backward.vec
    upsample_nearest2d.vec
    upsample_nearest2d_backward.vec
    col2im

    Pull Request resolved: pytorch#86215
    Approved by: https://github.com/suo, https://github.com/anjali411

commit 8db3025
Author: Jeff Daily <[email protected]>
Date:   Tue Oct 11 20:55:58 2022 +0000

    [ROCm] set nvfuser default to disabled, keep CI (pytorch#86369)

    Bug fix. nvfuser is functional for ROCm on gfx906, but some tests are failing for other gfx targets. Disable nvfuser until all features are verified. Users may still opt-in by setting the known env var PYTORCH_JIT_ENABLE_NVFUSER=1. This PR sets this env var for the github actions workflow for ROCm since all current CI hosts are gfx906.
    Pull Request resolved: pytorch#86369
    Approved by: https://github.com/huydhn

commit 5ffe24f
Author: Stephen Jia <[email protected]>
Date:   Tue Oct 11 20:16:56 2022 +0000

    [vulkan][ez] fix always printing out a warning when retrieving the global context (pytorch#86697)

    Summary: D40151818 (pytorch@82ed5ca) replaces the `TORCH_CHECK` with a `TORCH_WARN`, but since it does not check whether the context is valid, the message gets printed every time. This diff fixes that.

    Test Plan:
    Referring to [Pytorch Vulkan Testing Procedures](https://fb.quip.com/fZALAc9zhlcU)

    On Mac:
    1. `vulkan_api_test` on Mac
    2. model comparison binary on Mac

    On Android:
    1. `vulkan_api_test` on Android
    2. benchmark binary on Android

    Reviewed By: salilsdesai

    Differential Revision: D40266820

    Pull Request resolved: pytorch#86697
    Approved by: https://github.com/kirklandsign

commit f32aeea
Author: Han Qi (qihqi) <[email protected]>
Date:   Tue Oct 11 20:07:58 2022 +0000

    Set interface_call to true be default (pytorch#86668)

    Summary: ASR models need it

    Test Plan: existing unit tests

    Reviewed By: cccclai

    Differential Revision: D40251788

    Pull Request resolved: pytorch#86668
    Approved by: https://github.com/cccclai

commit 7f02f2a
Author: Huy Do <[email protected]>
Date:   Tue Oct 11 19:34:44 2022 +0000

    [Experimentation] Add TSAN build and test (pytorch#85313)

    Some parts of the PR are adopted from the previously abandoned pytorch#36694.  This PR is the first part to setup TSAN jobs in the CI.  The data race warnings from TSAN will need to be reviewed later in a separate PR.
    Pull Request resolved: pytorch#85313
    Approved by: https://github.com/osalpekar

commit 9256204
Author: 胡玮文 <[email protected]>
Date:   Tue Oct 11 19:03:43 2022 +0000

    Optimize __dlpack_device__ performance (pytorch#86665)

    This can be critical when processing a large number of tensors

    ```bash
    python -m timeit --setup 'import torch; t = torch.empty(1000, device="cuda")' 't.__dlpack_device__()'
    ```

    based on 1.12.1:
    before:
    100000 loops, best of 5: 2.32 usec per loop
    after:
    500000 loops, best of 5: 844 nsec per loop

    Pull Request resolved: pytorch#86665
    Approved by: https://github.com/SunDoge, https://github.com/soulitzer

commit c12f829
Author: Jerry Zhang <[email protected]>
Date:   Tue Oct 11 18:49:09 2022 +0000

    [nn] Add remove_duplicate flag to named_buffers (#674) (pytorch#85903)

    Summary:
    X-link: meta-pytorch/torchrec#674

    Pull Request resolved: pytorch#84984

    this is to allow named_buffers to return the same buffer objects with different names multiple times, needed by internal use cases
    ghstack-source-id: 168589597

    Test Plan:
    python test/test_nn.py -k test_buffers_and_named_buffers

    Imported from OSS

    Reviewed By: albanD

    Differential Revision: D39493161

    Pull Request resolved: pytorch#85903
    Approved by: https://github.com/albanD

commit 693250a
Author: David <[email protected]>
Date:   Tue Oct 11 18:05:53 2022 +0000

    Docs: fx.Node docs incorrectly state that the self argument is included in args for module calls (pytorch#86685)

    It seems like the [torch.fx.Node docs](https://pytorch.org/docs/stable/fx.html#torch.fx.Node) are incorrect regarding the inclusion of the self argument for module call nodes.
    While the docs state that self (the module) is included in `args`, it is in fact not, as demonstrated by this code:
    ```python
    import torch
    from torch import fx, nn

    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.submod = nn.Linear(10, 10)
        def forward(self, x):
            x = x.flatten()
            return self.submod(x)

    graph_module = fx.symbolic_trace(Net())
    print(graph_module.graph)  # doesn't show self for the submodule call
    submod_node = list(graph_module.graph.nodes)[2]
    print(submod_node.op)  # call_module
    print(submod_node.args)  # (flatten,) => would need to have len 2 if self was included

    flatten_node = list(graph_module.graph.nodes)[1]
    print(flatten_node.op)  # call_method
    print(flatten_node.args)  # (x,) => here self is included (and docs are correct)
    ```

    Since [torch.fx.Interpreter also uses `args` as if self was is not included](https://github.com/pytorch/pytorch/blob/2fe580859012d2d24a54e452195ccbc7f3191036/torch/fx/interpreter.py#L288), I assume the docs are incorrect.
    Pull Request resolved: pytorch#86685
    Approved by: https://github.com/soulitzer

commit 160118d
Author: Fang Wang <[email protected]>
Date:   Tue Oct 11 17:52:18 2022 +0000

    Add test case for matrix multiply-add with large inputs (pytorch#85550)

    Summary:
    - Added test case for addmm, baddbmm and linear with large inputs
    - Testing with torch types: float32, float16, bfloat16

    Test Plan:
    Run unit tests with:
    `buck2 run mode/opt //caffe2/test:linalg_re_cuda`

    ```
    ...
    test_addmm_baddbmm_large_input_1_10000_10000_10000_cpu_bfloat16 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_1_10000_10000_10000_cpu_float16 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_1_10000_10000_10000_cpu_float32 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_1_10000_1000_10000_cpu_bfloat16 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_1_10000_1000_10000_cpu_float16 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_1_10000_1000_10000_cpu_float32 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_2_1000_1000_1000_cpu_bfloat16 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_2_1000_1000_1000_cpu_float16 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_2_1000_1000_1000_cpu_float32 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_2_100_100_100_cpu_bfloat16 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_2_100_100_100_cpu_float16 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_2_100_100_100_cpu_float32 (test_linalg_re_cuda.TestLinalgReCudaCPU) ... skipped 'Only runs on cuda'
    test_addmm_baddbmm_large_input_1_10000_10000_10000_cuda_bfloat16 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok
    test_addmm_baddbmm_large_input_1_10000_10000_10000_cuda_float16 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok
    test_addmm_baddbmm_large_input_1_10000_10000_10000_cuda_float32 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok
    test_addmm_baddbmm_large_input_1_10000_1000_10000_cuda_bfloat16 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok
    test_addmm_baddbmm_large_input_1_10000_1000_10000_cuda_float16 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok
    test_addmm_baddbmm_large_input_1_10000_1000_10000_cuda_float32 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok
    test_addmm_baddbmm_large_input_2_1000_1000_1000_cuda_bfloat16 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok
    test_addmm_baddbmm_large_input_2_1000_1000_1000_cuda_float16 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok
    test_addmm_baddbmm_large_input_2_1000_1000_1000_cuda_float32 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok
    test_addmm_baddbmm_large_input_2_100_100_100_cuda_bfloat16 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok
    test_addmm_baddbmm_large_input_2_100_100_100_cuda_float16 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok
    test_addmm_baddbmm_large_input_2_100_100_100_cuda_float32 (test_linalg_re_cuda.TestLinalgReCudaCUDA) ... ok

    ----------------------------------------------------------------------
    Ran 24 tests in 63.224s

    OK (skipped=12)
    ```

    Differential Revision: D39718256

    Pull Request resolved: pytorch#85550
    Approved by: https://github.com/IvanYashchuk, https://github.com/malfet

commit 212fa87
Author: vfdev <[email protected]>
Date:   Tue Oct 11 17:52:16 2022 +0000

    Fix torch histogramdd docstring (pytorch#86593)

    Fixed torch histogramdd docsting with missing common_args

    Pull Request resolved: pytorch#86593
    Approved by: https://github.com/soulitzer

commit f26292d
Author: Jane Xu <[email protected]>
Date:   Tue Oct 11 17:42:51 2022 +0000

    [BE] Fix python docs typos up till torch.chunk (pytorch#86642)

    Was doing the Views lab linked https://github.com/pytorch/pytorch/wiki/Tensor-and-Operator-Basics and noticed a few typos, which led to this PR.

    Test plan:
    verified in preview
    Pull Request resolved: pytorch#86642
    Approved by: https://github.com/soulitzer

commit 86f914e
Author: albanD <[email protected]>
Date:   Tue Oct 11 10:35:18 2022 -0400

    min/max support for SymInt/Floats, finish as_strided/scatter/squeeze() backward symint support (pytorch#86643)

    Pull Request resolved: pytorch#86643
    Approved by: https://github.com/anjali411

commit b07eeb0
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Sep 29 17:01:50 2022 +0300

    Use string names for matching view-like functions

commit d8c005a
Merge: 59cb4be ad87365
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Sep 29 17:01:03 2022 +0300

    Merge remote-tracking branch 'upstream/viable/strict' into nvprims-view

commit 59cb4be
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Sep 22 18:37:59 2022 +0300

    lint

commit 92edd1a
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Sep 22 18:15:35 2022 +0300

    Add view_copy

commit 79c18da
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Sep 22 18:08:25 2022 +0300

    Add _unsafe_view to list

commit 254161d
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Sep 22 18:07:51 2022 +0300

    Add _unsafe_view to tests

commit 487a7a8
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Sep 22 18:00:30 2022 +0300

    Use func == torch.ops.aten.view.default

commit 24e61bf
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Sep 22 17:57:48 2022 +0300

    Update torch/_prims/nvfuser_prims.py

commit abad276
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Sep 22 17:53:42 2022 +0300

    Modify python frontend according latest changes

commit 712447f
Merge: a135db1 0c46e3e
Author: Ivan Yashchuk <[email protected]>
Date:   Thu Sep 22 17:22:44 2022 +0300

    Merge remote-tracking branch 'upstream/viable/strict' into nvprims-view

commit a135db1
Author: Ivan Yashchuk <[email protected]>
Date:   Wed Sep 7 17:06:30 2022 +0300

    Add interception of view for TorchRefsNvfuserCapabilityMode

commit f0c039e
Author: Ivan Yashchuk <[email protected]>
Date:   Wed Sep 7 17:06:07 2022 +0300

    Add test for view -> nvprims.view lowering

commit 246c999
Author: Ivan Yashchuk <[email protected]>
Date:   Wed Sep 7 16:40:13 2022 +0300

    Add tests

commit c48ba8e
Author: Ivan Yashchuk <[email protected]>
Date:   Wed Sep 7 16:39:59 2022 +0300

    Add nvprims.view

commit 3980f32
Author: Ivan Yashchuk <[email protected]>
Date:   Wed Sep 7 16:39:38 2022 +0300

    Add fd.ops.view

Labels

ciflow/binaries (Trigger all binary build and upload jobs on the PR), ciflow/trunk (Trigger trunk jobs on your pull request), Merged, module: dynamo, release notes: fx (release notes category), topic: not user facing (topic category)
