Better control of cached object lifecycles #364
Description
Problem
Cached objects have the lifetime of the server itself, which is often more of a memory leak than a desired feature. In particular, I have observed several interrelated problems.
- In most cases, cached objects shouldn't persist past the current session (example).
- We've been asked for months for an option so that cache entries can expire after a certain time (example).
- When a cached object represents an OS resource, like an open database connection (example), we need a way of finalizing that object, e.g. closing that connection.
- We've also been asked to add numerical limits to the cache (I can't find an example right now.)
Solution
MVP
The simplest solution would be to add the following keyword options to `st.cache`:
| Option | Default | Meaning |
|---|---|---|
| `global` | `False` | Whether the object is cached across sessions. |
| `ttl` | `None` | The number of seconds to keep the object cached. |
| `max_entries` | `None` | The maximum number of entries to keep for that function. |
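To make the `ttl` and `max_entries` semantics concrete, here is a minimal sketch of how the two options might interact. It is not Streamlit code: `ttl_cache_sketch`, its `clock` parameter, and the `entries` attribute are hypothetical names invented for this illustration, and least-recently-used eviction is an assumption, since the proposal does not say which entry should be dropped first.

```python
import time
from collections import OrderedDict
from functools import wraps


def ttl_cache_sketch(ttl=None, max_entries=None, clock=time.monotonic):
    """Hypothetical stand-in for the proposed options (not Streamlit code)."""

    def decorator(func):
        entries = OrderedDict()  # key -> (stored_at, value), least recently used first

        @wraps(func)
        def wrapper(*args):
            now = clock()
            if args in entries:
                stored_at, value = entries[args]
                if ttl is None or now - stored_at < ttl:
                    entries.move_to_end(args)  # mark as recently used
                    return value
                del entries[args]  # the entry outlived its ttl
            value = func(*args)
            entries[args] = (now, value)
            if max_entries is not None and len(entries) > max_entries:
                entries.popitem(last=False)  # evict the least recently used entry
            return value

        wrapper.entries = entries  # exposed only so the sketch can be inspected
        return wrapper

    return decorator


calls = []
fake_now = [0.0]  # a fake clock keeps the example deterministic


@ttl_cache_sketch(ttl=10, max_entries=2, clock=lambda: fake_now[0])
def square(x):
    calls.append(x)
    return x * x


square(2)
square(2)          # cache hit, square() is not called again
fake_now[0] = 11.0
square(2)          # older than ttl, so it is recomputed
square(3)
square(4)          # a third key exceeds max_entries and evicts the entry for 2
```

With both options left at `None`, the sketch behaves like the current cache: nothing ever expires or is evicted.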
Reference Implementation
A starting point for implementing these ideas is this gist. Note that, unlike the gist, the implementation proposed here should let cached items expire when they're no longer needed!
Why is max_entries None?
A basic goal of Streamlit is that the default options should "do the right thing." It's not clear what the "right thing" is in this case but I think there are two possible failure modes:
- If `max_entries` is a particular number, then the user might experience a sudden, discontinuous drop in performance as the cache starts evicting entries.
- But if `max_entries` is `None`, then the user might see a slow degradation in performance due to a continuous increase in the number of cached items.
I think that the first failure mode is more mysterious than the second and thus less desirable.
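The first failure mode can be reproduced with the standard library. The sketch below uses `functools.lru_cache` as a stand-in for a bounded cache (an assumption: the proposal does not commit to LRU eviction), and `hit_rate` is a helper made up for this example. It only measures hit rate, so it does not show the memory growth behind the second failure mode.

```python
from functools import lru_cache


def hit_rate(maxsize, keys, passes=3):
    """Cycle through `keys` several times and return the fraction of cache hits."""

    @lru_cache(maxsize=maxsize)
    def load(key):
        return key * 2  # stand-in for an expensive computation

    for _ in range(passes):
        for key in keys:
            load(key)
    info = load.cache_info()
    return info.hits / (info.hits + info.misses)


working_set = list(range(5))
fits = hit_rate(maxsize=5, keys=working_set)          # limit equals the working set
too_small = hit_rate(maxsize=4, keys=working_set)     # limit is one entry short
unbounded = hit_rate(maxsize=None, keys=working_set)  # no limit at all
```

When the limit is one entry short of the working set, cycling through the keys evicts each entry just before it is needed again, so the hit rate falls from two thirds to zero rather than degrading gradually.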
Next Step: Explicit Finalizing
The next step past the MVP would be to allow a function to be called on cache eviction:
| Option | Default | Meaning |
|---|---|---|
| `finalize_func` | `None` | Function called on the object on cache eviction. |
as follows:
import streamlit as st
from db import Connection  # <- making this example up

# close() is a method of Connection
@st.cache(finalize_func=Connection.close)
def get_db_connection():
    return Connection(...)

Next Step Part Deux: Fancy Finalizing
Sometimes the user might want to write their own finalizer after the `st.cache`-decorated function, which looks prettier. We could enable that as follows:
import streamlit as st
from db import Connection  # <- making this example up

@st.cache
def get_db_connection():
    return Connection(...)

# Registered as the finalizer for get_db_connection
@get_db_connection.finalizer
def finalize_db_connection(conn):
    conn.close()

which would be equivalent to the previous code snippet.
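For reference, here is one way both finalizing styles could be wired up. This is a rough sketch under stated assumptions, not the proposed implementation: `finalizing_cache_sketch`, its `clear` method, and the toy `Connection` class are all hypothetical, and eviction is simply oldest-first. Unlike the bare `@st.cache` above, the sketch must always be called with parentheses.

```python
from functools import wraps


def finalizing_cache_sketch(finalize_func=None, max_entries=None):
    """Hypothetical cache with an eviction hook (not Streamlit code)."""

    def decorator(func):
        entries = {}  # dicts keep insertion order, so the first key is the oldest
        hook = {"finalize": finalize_func}

        def evict(key):
            value = entries.pop(key)
            if hook["finalize"] is not None:
                hook["finalize"](value)  # give the object a chance to clean up

        @wraps(func)
        def wrapper(*args):
            if args in entries:
                return entries[args]
            value = func(*args)
            entries[args] = value
            if max_entries is not None and len(entries) > max_entries:
                evict(next(iter(entries)))  # drop the oldest entry
            return value

        def finalizer(user_func):
            hook["finalize"] = user_func  # register the finalizer after the fact
            return user_func

        def clear():
            for key in list(entries):
                evict(key)

        wrapper.finalizer = finalizer
        wrapper.clear = clear
        return wrapper

    return decorator


class Connection:
    """Made-up resource with a close() method."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


@finalizing_cache_sketch(max_entries=1)
def get_db_connection(name):
    return Connection(name)


@get_db_connection.finalizer
def finalize_db_connection(conn):
    conn.close()


first = get_db_connection("a")
second = get_db_connection("b")  # evicts the entry for "a" and closes it
```

Passing `finalize_func=Connection.close` to the decorator would set the same hook up front, which is why the two styles end up equivalent in this sketch.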