Description
Would be cool to peek into the state of the caching allocator for things like:

- Total cached memory
- Total currently used memory, referenced by Tensors
- Forced free of unused segments
- Tracing of memory allocations and deallocations (both logical and physical), along with some measure of fragmentation. Would be useful for custom analysis scripts and for understanding the reason for an OOM (fragmentation vs. an actual lack of memory)
- Stats about currently existing tensors (if possible; otherwise, with a full trace one could implement this post hoc): type, sizes, GPU device. If we had a way to dump the timestamp of each allocation, that would be cool too (it would allow tracking memory leaks fairly reliably)
- Dump information about all existing tensors / storages with refcounts, so that an easy visualization of fragmentation can be done (hopefully annotated with what required them for backward)
- Built-in minimal tool for visualizing used memory (cached memory / used memory) as an SVG/HTML string
- Arena allocators (free all memory in one go; optionally preallocate in one go with an upper memory limit; configurable block sizes; per-allocator stats on existing tensors referencing memory; optional support for CUDA Unified Memory)
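The arena-allocator item could look something like the following minimal Python sketch. Everything here (the `Arena` class, `alloc`/`stats`/`free_all` names, block rounding) is invented for illustration and is not an existing PyTorch API; it just shows the requested semantics: an upper memory limit, configurable block sizes, per-allocator stats, and freeing everything in one go.

```python
# Hypothetical sketch of the arena-allocator wishlist item; all names
# are invented for illustration, this is not PyTorch's allocator.

class Arena:
    def __init__(self, limit_bytes, block_size=1 << 20):
        self.limit = limit_bytes
        self.block_size = block_size
        self.used = 0          # bytes handed out to live "tensors"
        self.allocations = {}  # handle -> size: the per-allocator stats
        self._next_handle = 0

    def alloc(self, nbytes):
        # Round up to the configured block size, enforce the upper limit.
        rounded = -(-nbytes // self.block_size) * self.block_size
        if self.used + rounded > self.limit:
            raise MemoryError(f"arena limit {self.limit} exceeded")
        handle = self._next_handle
        self._next_handle += 1
        self.allocations[handle] = rounded
        self.used += rounded
        return handle

    def stats(self):
        return {"used": self.used, "live_allocations": len(self.allocations)}

    def free_all(self):
        # "memfree in one go": drop every allocation at once.
        self.used = 0
        self.allocations.clear()

arena = Arena(limit_bytes=8 << 20, block_size=1 << 20)
arena.alloc(100)        # rounds up to one 1 MiB block
arena.alloc(3 << 20)    # three blocks
print(arena.stats())    # {'used': 4194304, 'live_allocations': 2}
arena.free_all()
print(arena.stats())    # {'used': 0, 'live_allocations': 0}
```

A real implementation would of course hand out device pointers into preallocated segments rather than opaque handles, but the bookkeeping shape is the same.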
With the caching allocator it's sometimes hard to understand what's happening with memory: after some big allocations / deallocations, the memory reported by nvidia-smi stays high and doesn't reflect actual usage.
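The behavior above can be modeled with a toy caching allocator: freed blocks go back to the allocator's own cache rather than to the driver, so the driver-visible ("reserved") footprint only grows until the cache is explicitly emptied. The class and method names below are invented for this sketch; they are not PyTorch APIs.

```python
# Toy model of why nvidia-smi stays high under a caching allocator.
# All names are invented for illustration.

class CachingAllocator:
    def __init__(self):
        self.reserved = 0   # total memory held from the "driver"
        self.allocated = 0  # memory currently backing live tensors
        self._cache = []    # freed block sizes, kept for reuse

    def malloc(self, nbytes):
        # Reuse an exactly matching cached block if one exists,
        # otherwise "ask the driver" and grow the reserved footprint.
        if nbytes in self._cache:
            self._cache.remove(nbytes)
        else:
            self.reserved += nbytes
        self.allocated += nbytes
        return nbytes  # stand-in for a pointer

    def free(self, nbytes):
        # Logical free only: the block goes back to the cache,
        # not to the driver, so `reserved` does not shrink.
        self.allocated -= nbytes
        self._cache.append(nbytes)

    def empty_cache(self):
        # The "forced free of unused segments" from the wishlist:
        # return cached blocks to the driver.
        self.reserved -= sum(self._cache)
        self._cache.clear()

alloc = CachingAllocator()
p = alloc.malloc(512 << 20)             # one big allocation
alloc.free(p)                           # the tensor is gone...
print(alloc.allocated, alloc.reserved)  # 0 536870912  <- still "high"
alloc.empty_cache()
print(alloc.reserved)                   # 0
```

This is exactly the gap the wishlist targets: without separate "allocated" vs. "reserved" counters and a forced-free hook, the only visible number is the reserved one, which never drops on its own.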