Skip to content

Fix st.cache#1208

Merged
tconkling merged 11 commits intostreamlit:developfrom
tconkling:tim/FixPerFunctionCache
Mar 10, 2020
Merged

Fix st.cache#1208
tconkling merged 11 commits intostreamlit:developfrom
tconkling:tim/FixPerFunctionCache

Conversation

@tconkling
Copy link
Copy Markdown
Contributor

This fixes breakage that I introduced in #1152

Rather than creating the cache as a local variable in the st.cache wrapper, we instead manage all caches in a global _MemCaches class. Each cache is keyed off its wrapped function's fully qualified name, and its contents.

There's a new ScriptRunner test that makes sure that caches are reused across multiple runs of the same script.

@tconkling tconkling requested a review from a team as a code owner March 10, 2020 02:40
max_entries,
ttl,
)
mem_cache = TTLCache(maxsize=max_entries, ttl=ttl, timer=TTLCACHE_TIMER)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the other Timer types?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing any of the cache params trashes all the previous cache entries? Is this communicated to the user somewhere?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timer types are just functions that return values. (So: time.time(), time.monotonic(), etc.) The only reason it's explicitly specified here is so that the passed-in timer can be patched in unit tests. I'll add a comment to that effect!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing any of the cache params trashes all the previous cache entries? Is this communicated to the user somewhere?

Not explicitly, no. Though changing anything else about the function (its name, its contents) also trashes the cache, so this seems like a reasonable thing to do without spelling it out. I'm willing to change it if there's disagreement, though!

# we must retrieve the cache object *and* perform the cached-value lookup
# inside the decorated function.

func_hasher = CodeHasher("md5", None, hash_funcs)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why passing None instead of a hashlib.new("md5")?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you pass None, it just creates the hasher itself. It's a confusing API; I think Thiago may be changing it. (The reason not to pass None is if you want access to the hasher.)

# functions in the same module will not share a hash.
func_hasher.update(func.__module__)
func_hasher.update(func.__qualname__)
func_hasher.update(func)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So now we hash the function twice? Once for the function cache key and once for the hash of the function call?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I don't think there's a great way to solve this without some API surgery. The second hash happens in a separate hasher instance (by necessity).

* develop:
  file_uploader: support for multiple files (streamlit#1183)
  Another attempt at test_multiple_scriptrunners timeouts (streamlit#1211)
@tconkling tconkling merged commit 931e061 into streamlit:develop Mar 10, 2020
@tconkling tconkling deleted the tim/FixPerFunctionCache branch March 10, 2020 23:26
tconkling added a commit to tconkling/streamlit that referenced this pull request Mar 11, 2020
* develop:
  Better error messages for st.cache (streamlit#1146)
  Fix st.cache (streamlit#1208)
  file_uploader: support for multiple files (streamlit#1183)
  Another attempt at test_multiple_scriptrunners timeouts (streamlit#1211)
  Speed up `make jstest` on CircleCI
  De-flake ScriptRunner_test.py (streamlit#1195)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants