Description
Would be cool to peek into the state of the caching allocator for things like:

- Total cached memory
- Total currently used memory, referenced by Tensors
- Forced free of unused segments
- Tracing of memory allocations and deallocations (both logical and physical), along with some measure of fragmentation. Would be useful for custom analysis scripts and for understanding the reason for an OOM (fragmentation vs. an actual lack of memory)
- Stats about currently existing tensors (if possible; otherwise, with a full trace one could implement this post hoc): type, sizes, GPU device. If we had a way to dump the timestamp of each allocation, that would be cool too (it would allow tracking memory leaks fairly reliably)
- Dump information about all existing tensors / storages with refcounts, so that an easy visualization of fragmentation can be done (hopefully annotated with what required them for backward)
- Built-in minimal tool for visualizing used memory (cached memory / used memory) as an SVG/HTML string
- Arena allocators (free all memory in one go; optionally preallocate in one go with an upper memory limit; configurable block sizes; per-allocator stats on existing tensors referencing memory; optional support for CUDA Unified Memory)
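The arena-allocator item could look something like the following minimal Python sketch. Everything here (the `Arena` class, `alloc`/`stats`/`free_all` names, block rounding) is invented for illustration and is not an existing PyTorch API; it just shows the requested semantics: an upper memory limit, configurable block sizes, per-allocator stats, and freeing everything in one go.

```python
# Hypothetical sketch of the arena-allocator wishlist item; all names
# are invented for illustration, this is not PyTorch's allocator.

class Arena:
    def __init__(self, limit_bytes, block_size=1 << 20):
        self.limit = limit_bytes
        self.block_size = block_size
        self.used = 0          # bytes handed out to live "tensors"
        self.allocations = {}  # handle -> size: the per-allocator stats
        self._next_handle = 0

    def alloc(self, nbytes):
        # Round up to the configured block size, enforce the upper limit.
        rounded = -(-nbytes // self.block_size) * self.block_size
        if self.used + rounded > self.limit:
            raise MemoryError(f"arena limit {self.limit} exceeded")
        handle = self._next_handle
        self._next_handle += 1
        self.allocations[handle] = rounded
        self.used += rounded
        return handle

    def stats(self):
        return {"used": self.used, "live_allocations": len(self.allocations)}

    def free_all(self):
        # "memfree in one go": drop every allocation at once.
        self.used = 0
        self.allocations.clear()

arena = Arena(limit_bytes=8 << 20, block_size=1 << 20)
arena.alloc(100)        # rounds up to one 1 MiB block
arena.alloc(3 << 20)    # three blocks
print(arena.stats())    # {'used': 4194304, 'live_allocations': 2}
arena.free_all()
print(arena.stats())    # {'used': 0, 'live_allocations': 0}
```

A real implementation would of course hand out device pointers into preallocated segments rather than opaque handles, but the bookkeeping shape is the same.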
With the caching allocator it's sometimes hard to understand what's happening with memory: after some big allocations / deallocations, the memory reported by nvidia-smi stays high and doesn't reflect actual usage.
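The behavior above can be modeled with a toy caching allocator: freed blocks go back to the allocator's own cache rather than to the driver, so the driver-visible ("reserved") footprint only grows until the cache is explicitly emptied. The class and method names below are invented for this sketch; they are not PyTorch APIs.

```python
# Toy model of why nvidia-smi stays high under a caching allocator.
# All names are invented for illustration.

class CachingAllocator:
    def __init__(self):
        self.reserved = 0   # total memory held from the "driver"
        self.allocated = 0  # memory currently backing live tensors
        self._cache = []    # freed block sizes, kept for reuse

    def malloc(self, nbytes):
        # Reuse an exactly matching cached block if one exists,
        # otherwise "ask the driver" and grow the reserved footprint.
        if nbytes in self._cache:
            self._cache.remove(nbytes)
        else:
            self.reserved += nbytes
        self.allocated += nbytes
        return nbytes  # stand-in for a pointer

    def free(self, nbytes):
        # Logical free only: the block goes back to the cache,
        # not to the driver, so `reserved` does not shrink.
        self.allocated -= nbytes
        self._cache.append(nbytes)

    def empty_cache(self):
        # The "forced free of unused segments" from the wishlist:
        # return cached blocks to the driver.
        self.reserved -= sum(self._cache)
        self._cache.clear()

alloc = CachingAllocator()
p = alloc.malloc(512 << 20)             # one big allocation
alloc.free(p)                           # the tensor is gone...
print(alloc.allocated, alloc.reserved)  # 0 536870912  <- still "high"
alloc.empty_cache()
print(alloc.reserved)                   # 0
```

This is exactly the gap the wishlist targets: without separate "allocated" vs. "reserved" counters and a forced-free hook, the only visible number is the reserved one, which never drops on its own.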