Fix cache memory use#934

Merged
mstimberg merged 4 commits intomasterfrom
fix_cache_memory_use
Apr 4, 2018
Conversation

@mstimberg
Member

This fixes the memory leak discussed in #933. It turned out that the biggest issue was the way the SpikeQueue was handled. Whenever code referenced the SpikeQueue (all on_pre/on_post code), the caching mechanism used an explicit reference to the SpikeQueue as part of the "cache key". This led to two problems:

  1. Recreating exactly the same network with the same code would not use the cache at all for all synaptic statements, because the SpikeQueue was no longer the same.
  2. The reference to the SpikeQueue prevented it from getting garbage collected (since the SpikeQueue stores synaptic delays for each synapse, it can take quite a bit of memory).

This was fixed by using our existing mechanism (the _state_tuple attribute) to replace the SpikeQueue reference by a simple tuple with all information relevant to code generation. I also made the caching mechanism use weak references where possible, to avoid the problem in a more general way. There were some other smaller issues (e.g. our _hashable function did not recurse down into tuples) that I fixed as well, together with some minor performance optimisations (which became necessary because the _hashable function is now called more often than before).

Performance/memory usage looks pretty good now.

Recreated objects (see #933 for code):
*(figure: full_mem_comparison)*

Store/Restore mechanism:
*(figure: full_mem_comparison_restore)*

Fixes #933

Previously, synaptic pre/post code had a reference to the `SpikeQueue`
which was unique for each run, and was therefore not making use of the
cache.
Due to the wrong name, the attribute was not actually fulfilling its
documented role. For example, running exactly the same synaptic code
twice but with a different number of synapses would not use the cache.
@mstimberg mstimberg requested a review from thesamovar March 29, 2018 14:22
Member

@thesamovar thesamovar left a comment


Code looks good, but there is one question I don't understand. Why is SpikeQueue part of a key? And if it should be, why is it enough to only store source start and end as the state of that class? This doesn't seem entirely obvious to me.

Also, is there any way we can add a test suite for this sort of error in the future?

@mstimberg
Member Author

Why is SpikeQueue part of a key? And if it should be, why is it enough to only store source start and end as the state of that class? This doesn't seem entirely obvious to me.

Fully agree that it isn't obvious... We cache functions like make_statements that basically get a code string and a dictionary of Variable objects. So these are transformed into the key of the caching dictionary. SpikeQueue is part of the variables for the synaptic propagation code, because this code calls peek() and advance() on the spike queue. Now, if we did a code refactoring this might change: the SpikeQueue is only used in the fixed code in the template, so make_statements doesn't actually have to know about it.

Why can we describe the state of the SpikeQueue as only source start/end? The term "state" is actually not very appropriate here: what we call state for the caching mechanism is "everything that could change the generated code" -- in this specific case, we could actually even have an empty state, as the code involving the spike queue is entirely independent of the actual queue.
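The garbage-collection half of the problem (and why weak references help where they can be used) can be sketched in a few lines. The names here are purely illustrative, not Brian2's code:

```python
import gc
import weakref

class BigObject:
    """Stands in for a SpikeQueue holding per-synapse delays."""
    pass

strong_cache = {}
weak_refs = []

def leaky_key(obj):
    # Storing the object itself inside a cache key keeps it
    # alive for as long as the cache exists.
    strong_cache[(id(obj), obj)] = 'result'

def weak_key(obj):
    # A weakref does not prevent garbage collection.
    weak_refs.append(weakref.ref(obj))

o1, o2 = BigObject(), BigObject()
leaky_key(o1)
weak_key(o2)
r = weak_refs[0]
del o1, o2
gc.collect()

# o1 is still reachable through strong_cache's key and cannot be
# collected; o2 has been freed, so the weak reference is now dead.
assert len(strong_cache) == 1
assert r() is None
```

This is exactly the shape of the leak: the SpikeQueue stored in the cache key played the role of `o1` above.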

Also, is there any way we can add a test suite for this sort of error in the future?

That would be great, but I think we shouldn't do more complex stuff like memory profiling in the general test suite -- we are already running into timeouts every now and then. It could be part of #33?

@thesamovar
Member

Wow, issue #33 was written almost 5 years ago! :)

OK that all makes sense. Finishing review now.

@mstimberg mstimberg merged commit 3786a6e into master Apr 4, 2018
@mstimberg mstimberg deleted the fix_cache_memory_use branch April 4, 2018 13:21
Successfully merging this pull request may close these issues.

Caching leads to important memory leak
