[CUDA] revert PR 130472 #162950

thenumberouscode · 2025-09-15T08:51:00Z

This change may also resolve #161789, though verification is still needed.

PR #130472 would introduced the problem of freeing the same address without clean metadata. according to the below discussion, reverted it.

pytorch-bot · 2025-09-15T08:51:04Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162950

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9a7bd01 with merge base 1c16c18 ():
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

thenumberouscode · 2025-09-15T08:52:51Z

@ngimel Please take a look when you have a moment.

thenumberouscode · 2025-09-15T09:11:43Z

@pytorchbot label "release notes:cuda"

pytorch-bot · 2025-09-15T09:11:52Z

Didn't find following labels among repository labels: release notes:cuda

thenumberouscode · 2025-09-15T09:14:46Z

@pytorchbot label "release notes: cuda"

ngimel · 2025-09-15T16:25:31Z

torch/csrc/cuda/CUDAPluggableAllocator.cpp

  {
    const std::lock_guard<std::mutex> lock(allocator_mutex_);
-    allocation_metadata_.emplace(r, _AllocationMetadata(size, device, stream));
+    auto result = allocation_metadata_.try_emplace(r, size, device, stream);


How can previous allocation with the same address not have been erased during free?

How can previous allocation with the same address not have been erased during free?
oh, because the same address was not erased before it was allocated again, it was deallocated by context, and the context did not clear the key in allocation_metadata_.

pytorch/torch/csrc/cuda/CUDAPluggableAllocator.cpp

Lines 22 to 25 in 9009c4d

void CUDAPluggableAllocatorDeleterContext::free() {

free_fn_(data_, size_, device_, stream_);

delete this;

}

SO that's the original sin, raw_delete behavior is different from context deleter. Context deleter has to also clean up metadata.

SO that's the original sin, raw_delete behavior is different from context deleter. Context deleter has to also clean up metadata.

I think we should not do clean metadata in context, for this is not the design goal of context, see PR #130472. it said the author want context to free the memory without depending on an allocator. and the metadata is a private member of an allocator. so we should not record metadata in allocate function again for context itself which already records the metadata. removing record metadata may be a better way.

Ok, then not recording the metadata sounds like a better way.

So essentially the context as implemented in #130472 is bc-breaking? Nice! If we stop recording metadata, that is a bc-breaking change. If we stop cleaning up metadata, metadata is useless. cc @syed-ahmed

…to prevent free failures when deallocating the same memory address with different sizes.

ngimel · 2025-09-17T18:16:46Z

Actually, after talking to @syed-ahmed we've decided that the proper fix is to just revert #130472, it's not doing anything useful. @thenumberouscode can you please try that?

syed-ahmed · 2025-09-17T19:13:58Z

I'm verifying locally if reverting #130472 fixes the problem.

Recording our conversation here for posterity, #130472 was written to allow usage of multiple CUDAPluggableAllocator in the same program and at that time I thought we would need that functionality to enable MemPools. However, when MemPools were implemented, we actually didn't end up needing that functionality.

change_current_allocator makes sure that the user can't change the allocator again, once it's called. It doesn't prevent one from creating multiple pluggable allocators which we need for mempools. We just can't set multiple allocators as current, and that's fine, because we never need that for mempools.

thenumberouscode · 2025-09-18T04:12:30Z

Actually, after talking to @syed-ahmed we've decided that the proper fix is to just revert #130472, it's not doing anything useful. @thenumberouscode can you please try that?

Yes, I already had revert PR 130472 in my local development environment. Local testing shows the issue had been resolved. I've updated this PR's code to revert PR 130472. cc @syed-ahmed

thenumberouscode · 2025-09-18T04:25:28Z

torch/csrc/cuda/CUDAPluggableAllocator.h

-  cudaStream_t stream_{};
-};
-
 #if defined(USE_ROCM)


@ngimel @syed-ahmed this line should not be reverted to #if defined(TORCH_HIP_VERSION) for it is imported by another bug fix PR a19b667

ngimel · 2025-09-18T13:53:13Z

Thank you @thenumberouscode

syed-ahmed

LGTM. Thank you @thenumberouscode!

build_variables.bzl

thenumberouscode · 2025-09-19T05:02:47Z

@syed-ahmed I had cleaned all the build cache use python setup.py clean command. And I rebuild the Pytorch project based on the lastest commit of this branch. I still could not reproduce bug. Here are my test code and output.

#include <torch/csrc/cuda/CUDAPluggableAllocator.h>
#include <torch/torch.h>

std::unordered_map<void*, ssize_t> allocation_sizes;

#define EXPECT_EQ(actual, expected) \
    do { \
        auto actual_val = (actual); \
        auto expected_val = (expected); \
        if (actual_val != expected_val) { \
            std::cerr << "FAIL: EXPECT_EQ(" << #actual << ", " << #expected << ")" << std::endl; \
            std::cerr << "  Expected: " << expected_val << std::endl; \
            std::cerr << "  Actual: " << actual_val << std::endl; \
            std::cerr << "  At: " << __FILE__ << ":" << __LINE__ << std::endl; \
            return false; \
        } \
    } while (0)

void* logging_malloc(size_t size, int device, cudaStream_t stream) {
    void* ptr;
    cudaMalloc(&ptr, size);
    std::cout << "alloc ptr=" << ptr << " size=" << size << " device=" << device
              << " stream=" << stream << std::endl;
    allocation_sizes[ptr] = size;   
    return ptr;
}

void logging_free(void* ptr, size_t size, int device, cudaStream_t stream) {
    std::cout << "free  ptr=" << ptr << " size=" << size << " device=" << device
              << " stream=" << stream << std::endl;
    // Print out any frees that don't match the allocation sizes
    if (allocation_sizes.find(ptr) != allocation_sizes.end()) {
        if (allocation_sizes[ptr] != size) {
            std::cout << "*** ERROR: free mismatch: " << ptr << " size=" << size
                      << " expected=" << allocation_sizes[ptr] << std::endl;
        }
    } else {
        std::cout << "WARNING: free of unknown ptr=" << ptr << std::endl;
    }
    cudaFree(ptr);
    allocation_sizes.erase(ptr);
}

// TEST(TestTorchUnique, UniqueComparisonTest) {
int main() {
    auto custom_allocator =
        torch::cuda::CUDAPluggableAllocator::createCustomAllocator(logging_malloc, logging_free);
    torch::cuda::CUDAPluggableAllocator::changeCurrentAllocator(custom_allocator);
    // Run the command 3 times; the first 2 will pass and the third invocation will have
    // different sizes in alloc and free
    for (int i = 0; i < 3; ++i) {
        std::cout << "\n Starting test " << i << std::endl;
        // Initialize simple sorted tensor with repeats
        torch::Tensor sorted_tensor =
            torch::tensor({0, 0, 0, 1, 1, 2, 3, 3, 3, 3, 5},
                          torch::TensorOptions().dtype(torch::kFloat32).device(at::kCUDA));

        std::cout << "Starting unique_consecutive" << std::endl;
        // This operation will call malloc/free with different sizes on the same pointer
        auto unique_dim_result = torch::unique_consecutive(sorted_tensor, false, true, 0);
        std::cout << "Finished unique_consecutive" << std::endl;
        // Everything below is only there to validate correct results
        auto unique_dim_values = std::get<0>(unique_dim_result);
        auto unique_dim_counts = std::get<2>(unique_dim_result);

        // Check tensor sizes
        EXPECT_EQ(unique_dim_values.size(0), 5);
        EXPECT_EQ(unique_dim_counts.size(0), 5);

        // Copy to CPU before accessing elements
        torch::Tensor cpu_values = unique_dim_values.cpu();
        torch::Tensor cpu_counts = unique_dim_counts.cpu();

        // Use accessors on the CPU tensors
        auto values_accessor = cpu_values.accessor<float, 1>();
        auto counts_accessor = cpu_counts.accessor<int64_t, 1>();

        // Check individual values using accessors
        EXPECT_EQ(values_accessor[0], 0.0f);
        EXPECT_EQ(values_accessor[1], 1.0f);
        EXPECT_EQ(values_accessor[2], 2.0f);
        EXPECT_EQ(values_accessor[3], 3.0f);
        EXPECT_EQ(values_accessor[4], 5.0f);

        // Check count values using accessors
        EXPECT_EQ(counts_accessor[0], 3);
        EXPECT_EQ(counts_accessor[1], 2);
        EXPECT_EQ(counts_accessor[2], 1);
        EXPECT_EQ(counts_accessor[3], 4);
        EXPECT_EQ(counts_accessor[4], 1);
        std::cout << "Finished test " << i << " \n" << std::endl;
    }
}

output:

Starting test 0
alloc ptr=0x762f79600000 size=44 device=0 stream=0
Starting unique_consecutive
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x762f79600200 size=88 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x762f79600400 size=96 device=0 stream=0
alloc ptr=0x762f79600600 size=1535 device=0 stream=0
free  ptr=0x762f79600600 size=1535 device=0 stream=0
alloc ptr=0x762f79600600 size=40 device=0 stream=0
alloc ptr=0x762f79600800 size=255 device=0 stream=0
free  ptr=0x762f79600800 size=255 device=0 stream=0
free  ptr=0x762f79600400 size=96 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x762f79600400 size=20 device=0 stream=0
free  ptr=0x762f79600200 size=88 device=0 stream=0
Finished unique_consecutive
Finished test 0 

free  ptr=0x762f79600400 size=20 device=0 stream=0
free  ptr=0x762f79600600 size=40 device=0 stream=0
free  ptr=0x762f79600000 size=44 device=0 stream=0

 Starting test 1
alloc ptr=0x762f72800000 size=44 device=0 stream=0
Starting unique_consecutive
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x762f72800200 size=88 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x762f72800400 size=96 device=0 stream=0
alloc ptr=0x762f72800600 size=1535 device=0 stream=0
free  ptr=0x762f72800600 size=1535 device=0 stream=0
alloc ptr=0x762f72800600 size=40 device=0 stream=0
alloc ptr=0x762f72800800 size=255 device=0 stream=0
free  ptr=0x762f72800800 size=255 device=0 stream=0
free  ptr=0x762f72800400 size=96 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x762f72800400 size=20 device=0 stream=0
free  ptr=0x762f72800200 size=88 device=0 stream=0
Finished unique_consecutive
Finished test 1 

free  ptr=0x762f72800400 size=20 device=0 stream=0
free  ptr=0x762f72800600 size=40 device=0 stream=0
free  ptr=0x762f72800000 size=44 device=0 stream=0

 Starting test 2
alloc ptr=0x762f72800000 size=44 device=0 stream=0
Starting unique_consecutive
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x762f72800200 size=88 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x762f72800400 size=96 device=0 stream=0
alloc ptr=0x762f72800600 size=1535 device=0 stream=0
free  ptr=0x762f72800600 size=1535 device=0 stream=0
alloc ptr=0x762f72800600 size=40 device=0 stream=0
alloc ptr=0x762f72800800 size=255 device=0 stream=0
free  ptr=0x762f72800800 size=255 device=0 stream=0
free  ptr=0x762f72800400 size=96 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x762f72800400 size=20 device=0 stream=0
free  ptr=0x762f72800200 size=88 device=0 stream=0
Finished unique_consecutive
Finished test 2 

free  ptr=0x762f72800400 size=20 device=0 stream=0
free  ptr=0x762f72800600 size=40 device=0 stream=0
free  ptr=0x762f72800000 size=44 device=0 stream=0

thenumberouscode · 2025-09-19T06:23:41Z

@syed-ahmed I've used the main branch to reproduce the bug. It occur again. So I guess the revert may work. please let me know if I am wrong.
the output build with main branch:

./consecutive 

 Starting test 0
alloc ptr=0x7d1f03600000 size=44 device=0 stream=0
Starting unique_consecutive
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1f03600200 size=88 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1f03600400 size=96 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1f03600600 size=1535 device=0 stream=0
free  ptr=0x7d1f03600600 size=1535 device=0 stream=0
alloc ptr=0x7d1f03600600 size=40 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
WARNING: free of unknown ptr=0
alloc ptr=0x7d1f03600800 size=255 device=0 stream=0
free  ptr=0x7d1f03600800 size=255 device=0 stream=0
free  ptr=0x7d1f03600400 size=96 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1f03600400 size=20 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
free  ptr=0x7d1f03600200 size=88 device=0 stream=0
Finished unique_consecutive
Finished test 0 

free  ptr=0x7d1f03600400 size=20 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
WARNING: free of unknown ptr=0
free  ptr=0x7d1f03600600 size=40 device=0 stream=0
free  ptr=0x7d1f03600000 size=44 device=0 stream=0

 Starting test 1
alloc ptr=0x7d1efa800000 size=44 device=0 stream=0
Starting unique_consecutive
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1efa800200 size=88 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1efa800400 size=96 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1efa800600 size=1535 device=0 stream=0
free  ptr=0x7d1efa800600 size=1535 device=0 stream=0
alloc ptr=0x7d1efa800600 size=40 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
WARNING: free of unknown ptr=0
alloc ptr=0x7d1efa800800 size=255 device=0 stream=0
free  ptr=0x7d1efa800800 size=255 device=0 stream=0
free  ptr=0x7d1efa800400 size=96 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1efa800400 size=20 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
free  ptr=0x7d1efa800200 size=88 device=0 stream=0
Finished unique_consecutive
Finished test 1 

free  ptr=0x7d1efa800400 size=20 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
WARNING: free of unknown ptr=0
free  ptr=0x7d1efa800600 size=40 device=0 stream=0
free  ptr=0x7d1efa800000 size=44 device=0 stream=0

 Starting test 2
alloc ptr=0x7d1efa800000 size=44 device=0 stream=0
Starting unique_consecutive
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1efa800200 size=88 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1efa800400 size=96 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1efa800600 size=1535 device=0 stream=0
free  ptr=0x7d1efa800600 size=40 device=0 stream=0
*** ERROR: free mismatch: 0x7d1efa800600 size=40 expected=1535
alloc ptr=0x7d1efa800600 size=40 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
WARNING: free of unknown ptr=0
alloc ptr=0x7d1efa800800 size=255 device=0 stream=0
free  ptr=0x7d1efa800800 size=255 device=0 stream=0
free  ptr=0x7d1efa800400 size=96 device=0 stream=0
alloc ptr=0 size=0 device=0 stream=0
alloc ptr=0x7d1efa800400 size=20 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
free  ptr=0x7d1efa800200 size=88 device=0 stream=0
Finished unique_consecutive
Finished test 2 

free  ptr=0x7d1efa800400 size=20 device=0 stream=0
free  ptr=0 size=0 device=0 stream=0
WARNING: free of unknown ptr=0
free  ptr=0x7d1efa800600 size=40 device=0 stream=0
free  ptr=0x7d1efa800000 size=44 device=0 stream=0

ngimel · 2025-09-19T06:36:54Z

Can we add code from #161789 as a test (or actually a specifically constructed example that would try to allocate to different-sized buffers with the same address? Custom allocators can be constructed with load_inline

ngimel · 2025-09-19T15:45:08Z

@syed-ahmed are you ok with landing this PR as is? Even if it doesn't fix all the failures (I doubt it, tbh, @thenumberouscode examples show that it actually fixes the previous failure) it's a strict improvememt.

syed-ahmed · 2025-09-19T18:59:04Z

Yes, landing this as is sounds good to me. Let's keep the original bug open. @thenumberouscode thank you for the PR and the follow-ups!

syed-ahmed · 2025-09-19T19:00:56Z

@pytorchbot merge

pytorchmergebot · 2025-09-19T19:02:49Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status
here

ngimel · 2025-09-19T22:03:22Z

@syed-ahmed @thenumberouscode can one of you take an AI to add the test as a follow-up, and to verify if original bug(s) are fixed, because currently @syed-ahmed sees failures even on top of this, whereas @thenumberouscode doesn't?

ngimel · 2025-09-19T22:05:06Z

@pytorchbot cherry-pick --onto release/2.9 -c regression

This change may also resolve #161789, though verification is still needed. PR #130472 would introduced the problem of freeing the same address without clean metadata. according to the below discussion, reverted it. Pull Request resolved: #162950 Approved by: https://github.com/ngimel, https://github.com/eqy, https://github.com/syed-ahmed (cherry picked from commit 4a160da)

pytorchbot · 2025-09-19T22:10:09Z

Cherry picking #162950

The cherry pick PR is at #163379 and it is recommended to link a regression cherry pick PR with an issue. The following tracker issues are updated:

[v.2.9.0] Release Tracker #162497 (comment)

Details for Dev Infra team

Raised by workflow job

syed-ahmed · 2025-09-19T23:25:26Z

currently @syed-ahmed sees failures even on top of this, whereas @thenumberouscode doesn't?

I believe @thenumberouscode is also seeing per their last comment. I'll take the AI and will post the test on Monday.

ngimel · 2025-09-20T01:45:59Z

If I'm reading his comments correctly, bug doesn't repro with revert for him, and repros on the main branch.

thenumberouscode · 2025-09-22T03:10:25Z

If I'm reading his comments correctly, bug doesn't repro with revert for him, and repros on the main branch.

@ngimel Yes, I cannot reproduce the bug with the reverted code, but I can reproduce it on the main branch without the revert. cc @syed-ahmed

thenumberouscode · 2025-09-22T03:17:07Z

@syed-ahmed @thenumberouscode can one of you take an AI to add the test as a follow-up, and to verify if original bug(s) are fixed, because currently @syed-ahmed sees failures even on top of this, whereas @thenumberouscode doesn't?

@

@syed-ahmed @thenumberouscode can one of you take an AI to add the test as a follow-up, and to verify if original bug(s) are fixed, because currently @syed-ahmed sees failures even on top of this, whereas @thenumberouscode doesn't?

@ngimel I'm willing to follow up and add the unit test as well. I can do this if @syed-ahmed
is too busy to write it.

This change may also resolve pytorch#161789, though verification is still needed. PR pytorch#130472 would introduced the problem of freeing the same address without clean metadata. according to the below discussion, reverted it. Pull Request resolved: pytorch#162950 Approved by: https://github.com/ngimel, https://github.com/eqy, https://github.com/syed-ahmed

ngimel · 2025-09-22T16:53:17Z

@thenumberouscode yes, please add the test, greatly appreciated!

syed-ahmed · 2025-09-22T22:53:16Z

Added a test, need to make sure it's running in the CI and cleanup logging if needed: #163575

This change may also resolve pytorch#161789, though verification is still needed. PR pytorch#130472 would introduced the problem of freeing the same address without clean metadata. according to the below discussion, reverted it. Pull Request resolved: pytorch#162950 Approved by: https://github.com/ngimel, https://github.com/eqy, https://github.com/syed-ahmed

[CUDA] revert PR 130472 (#162950) This change may also resolve #161789, though verification is still needed. PR #130472 would introduced the problem of freeing the same address without clean metadata. according to the below discussion, reverted it. Pull Request resolved: #162950 Approved by: https://github.com/ngimel, https://github.com/eqy, https://github.com/syed-ahmed (cherry picked from commit 4a160da) Co-authored-by: thenumberouscode <[email protected]>

thenumberouscode requested review from eqy and syed-ahmed as code owners September 15, 2025 08:51

pytorchbot added the open source label Sep 15, 2025

thenumberouscode mentioned this pull request Sep 15, 2025

"cudaErrorIllegalAddress" in unique_consecutive (with return_inverse and RMM) #161789

Closed

pytorch-bot bot added the release notes: cuda release notes category label Sep 15, 2025

ngimel reviewed Sep 15, 2025

View reviewed changes

[CUDA] Remove metadata recording in CUDAPluggableAllocator::allocate …

2c11b85

…to prevent free failures when deallocating the same memory address with different sizes.

thenumberouscode force-pushed the bugfix_cuda_pluggable_allocator_malloc branch from 3a1093f to 2c11b85 Compare September 17, 2025 08:40

mikaylagawarecki added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Sep 18, 2025

thenumberouscode changed the title ~~[CUDA] Remove metadata recording in CUDAPluggableAllocator::allocate to prevent free failures when deallocating the same memory address with different sizes..~~ [CUDA] revert 130472 Sep 18, 2025

pytorch-bot bot added the ci-no-td Do not run TD on this PR label Sep 18, 2025

thenumberouscode changed the title ~~[CUDA] revert 130472~~ [CUDA] revert PR 130472 Sep 18, 2025

thenumberouscode force-pushed the bugfix_cuda_pluggable_allocator_malloc branch from 6ef5a02 to ee549fd Compare September 18, 2025 04:23

thenumberouscode commented Sep 18, 2025

View reviewed changes

ngimel closed this Sep 18, 2025

ngimel reopened this Sep 18, 2025

ngimel approved these changes Sep 18, 2025

View reviewed changes

eqy approved these changes Sep 18, 2025

View reviewed changes

syed-ahmed approved these changes Sep 18, 2025

View reviewed changes

syed-ahmed reviewed Sep 18, 2025

View reviewed changes

build_variables.bzl Show resolved Hide resolved

ngimel added the ciflow/trunk Trigger trunk jobs on your pull request label Sep 19, 2025

syed-ahmed approved these changes Sep 19, 2025

View reviewed changes

pytorchmergebot added the merging label Sep 19, 2025

pytorchmergebot added the Merged label Sep 19, 2025

pytorchmergebot closed this in 4a160da Sep 19, 2025

pytorchmergebot removed the merging label Sep 19, 2025

pytorchbot mentioned this pull request Sep 19, 2025

[v.2.9.0] Release Tracker #162497

Closed

syed-ahmed mentioned this pull request Oct 2, 2025

Inconsistent size passed to custom CUDA alloc/free in torch::unique_consecutive #153109

Closed

	void CUDAPluggableAllocatorDeleterContext::free() {
	free_fn_(data_, size_, device_, stream_);
	delete this;
	}

[CUDA] revert PR 130472 #162950

[CUDA] revert PR 130472 #162950

Uh oh!

Conversation

thenumberouscode commented Sep 15, 2025 • edited by syed-ahmed Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pytorch-bot bot commented Sep 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162950

✅ No Failures

Uh oh!

thenumberouscode commented Sep 15, 2025

Uh oh!

thenumberouscode commented Sep 15, 2025

Uh oh!

pytorch-bot bot commented Sep 15, 2025

Uh oh!

thenumberouscode commented Sep 15, 2025

Uh oh!

ngimel Sep 15, 2025

Choose a reason for hiding this comment

Uh oh!

thenumberouscode Sep 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ngimel Sep 16, 2025

Choose a reason for hiding this comment

Uh oh!

thenumberouscode Sep 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ngimel Sep 17, 2025

Choose a reason for hiding this comment

Uh oh!

ngimel Sep 17, 2025

Choose a reason for hiding this comment

Uh oh!

ngimel commented Sep 17, 2025

Uh oh!

syed-ahmed commented Sep 17, 2025

Uh oh!

thenumberouscode commented Sep 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thenumberouscode Sep 18, 2025

Choose a reason for hiding this comment

Uh oh!

ngimel commented Sep 18, 2025

Uh oh!

syed-ahmed left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

thenumberouscode commented Sep 19, 2025

Uh oh!

thenumberouscode commented Sep 19, 2025

Uh oh!

ngimel commented Sep 19, 2025

Uh oh!

ngimel commented Sep 19, 2025

Uh oh!

syed-ahmed commented Sep 19, 2025

Uh oh!

syed-ahmed commented Sep 19, 2025

Uh oh!

pytorchmergebot commented Sep 19, 2025

Merge started

Uh oh!

ngimel commented Sep 19, 2025

Uh oh!

ngimel commented Sep 19, 2025

Uh oh!

pytorchbot commented Sep 19, 2025

Cherry picking #162950

Uh oh!

thenumberouscode commented Sep 15, 2025 •

edited by syed-ahmed

Loading

pytorch-bot bot commented Sep 15, 2025 •

edited

Loading

thenumberouscode Sep 16, 2025 •

edited

Loading

thenumberouscode Sep 17, 2025 •

edited

Loading

thenumberouscode commented Sep 18, 2025 •

edited

Loading

syed-ahmed left a comment •

edited

Loading

thenumberouscode commented Sep 22, 2025 •

edited

Loading

syed-ahmed commented Sep 22, 2025 •

edited

Loading