Merge ForwardMsg caching into develop by tconkling · Pull Request #178 · streamlit/streamlit

tconkling · 2019-09-23T23:46:04Z

Implements ForwardMsg caching on the client and server.

Server

ForwardMsg.hash is calculated on every outgoing ForwardMsg
The server caches all messages whose size is >= 10k in its MessageCache instance. (These cached message are never evicted - this will come in a future PR.) Each entry in the cache contains the cached message, and also a (weak-referenced) set of all ReportSessions for which that message has already been delivered.
Cached ForwardMsgs have the metadata.cacheable flag set. When the client receives a ForwardMsg with cacheable == true, it also caches it.
If the server believes that a client has cached a given ForwardMsg (because the cache has an entry for that message+session), it instead sends a "reference" message that points to the original message via ForwardMsg.ref_hash.
The server also exposes a new HTTP endpoint, /message, from which clients can request messages that they have received a ref_hash message for, but are missing in that client's local cache. (This should not happen frequently! If clients are getting a bunch of local cache misses, then the server is not maintaining its cache properly.)

Client

WebsocketConnection has a new ForwardMsgCache member
All incoming ForwardMsgs are passed through this cache (via ForwardMsgCache.processMessage):
- If the ForwardMsg is not a reference message, it is cached locally and nothing more is done
- If the ForwardMsg is a reference message, the client checks to see if it has the original message in its cache; if so, it gets returned
- If the reference message is missing from the client's cache, it is requested from the server's /message endpoint, and then cached.

Cache expiration

The global.maxCachedMessageAge config value specifies the maximum age of a message in the cache. The age of a message is defined by the number of times a referencing report has finished running since the message was accessed by that report. So when a report finishes running for the first time, the age of all its cached messages is 1.
Our default maxCachedMessageAge is 2, which means that messages will remain cached even after not having been referenced for a report run.
To determine the age of messages in its cache, the server tracks a report_run_count alongside each ReportSession in its websocket->reportsession dictionary. The server increments this value each time it processes a report_finished ForwardMsg message for a report.
When the server increments report_finished, it also asks the cache to remove any expired messages. I haven't done anything fancy here - the cache just iterates through all its entries, does an age check, and deletes them if they're expired.
Similarly, when the client receives a report_finished ForwardMsg, it performs the same cache expiration step. This means that the server and client's caches should stay in sync. (I haven't thought hard enough about whether it's possible for them to get briefly out of sync, but if they do, we have the new /message endpoint as a fallback.)
If a report fails to run due to a compilation error, neither the server nor the client will increment the report_run_count for that session. (The run-count incrementing is conditional on the status value of the report_finished message, which is set to FINISHED_WITH_COMPILE_ERROR when that happens.)

There are client and server tests for all this new logic. But the easiest way to see the expiration in action is to run something like examples/core/message_deduping.py and watch the debug output on the server and the client. (You'll see logs for cache hits and misses - you should never see a miss! - on the client; and logs for sending cached message refs on the server.)

This example just creates a big dataframe and sends it twice
The first time the report is run, the message will only be delivered to the client once.
If you rerun the report, the message won't be delivered at all because it'll be cached.
If you remove the st.dataframe() calls and rerun the report, the dataframe will not be removed from the cache because it won't be old enough. Re-adding those calls and rerunning the report should confirm this.
If you remove the dataframe calls and re-run the report twice with them missing, they will then be expired from the cache. Re-adding them should result in the message being resent to the client.

This is the first part of ForwardMsg caching: ## Server - `ForwardMsg.hash` is calculated on every outgoing ForwardMsg - The server caches all messages whose size is >= 10k in its `MessageCache` instance. (These cached message are never evicted - this will come in a future PR.) Each entry in the cache contains the cached message, and also a (weak-referenced) set of all ReportSessions for which that message has already been delivered. - Cached ForwardMsgs have the `metadata.cacheable` flag set. When the client receives a ForwardMsg with `cacheable == true`, it also caches it. - If the server believes that a client has cached a given ForwardMsg (because the cache has an entry for that message+session), it instead sends a "reference" message that points to the original message via `ForwardMsg.ref_hash`. - The server also exposes a new HTTP endpoint, `/message`, from which clients can request messages that they have received a `ref_hash` message for, but are missing in that client's local cache. (This should not happen frequently! If clients are getting a bunch of local cache misses, then the server is not maintaining its cache properly.) ## Client - `WebsocketConnection` has a new `ForwardMsgCache` member - All incoming `ForwardMsgs` are passed through this cache (via `ForwardMsgCache.processMessage`): - If the ForwardMsg is *not* a reference message, it is cached locally and nothing more is done - If the ForwardMsg *is* a reference message, the client checks to see if it has the original message in its cache; if so, it gets returned - If the reference message is missing from the client's cache, it is requested from the server's `/message` endpoint, and then cached. ## TODO - The client and server both need to perform cache eviction; currently, their caches never have anything removed from them. - I'd like to have some better client tests that use a mock server to test its handling of the `/message` endpoint

* Adding unit tests for magic (streamlit#13) * Adding unit tests for magic * Standardize cypress viewport across all tests and fixes (streamlit#11) - standardize cypress viewport across all tests - remove snapshot corresponding to missing `mplotlib` test - fixed issue with snapshots being cutoff or wrong element being captured sporadically - increased snapshot diff threshold by .5% - increase snapshot bounding box for `disconnect.py` and `empty_dataframes.py` - allow multiple valid width values for `vega_lite_chart.py` - set `devicePixelRatio` to 1 when running `cypress open' - remove spinner e2e test and core example - fix various cypress tests - preload `viz.js` when running `yarn start` * Deprecating data parameter from st.deck_gl_chart() (streamlit#22) * Deprecating data parameter for st.deck_gl_chart() * Basic deck_gl example setting viewport * DOCSTRING * LINT * Moving deck_gl_chart test from streamlit_test to deck_gl_test * Checking kwargs instead of a data parameter * Fixing comments from thiago * st.image() resizing images when are bigger than the maximum (streamlit#30) * Resizing image when its bigger than the maximum width * Resizing image when its bigger than the maximum width * MAXIMUM_CONTENT_WIDTH documentation * MAXIMUM_CONTENT_WIDTH documentation * Hash numpy arrays. Use fixed size sample. (streamlit#16) * Merge version 0.45.0 (streamlit#34) * Restore correct vertical spacing for report elements. * Break Block.tsx into more functions. * Tiny refactors in Block.tsx * Add "key" property to element in DocString, to avoid warnings. * Fix streamlit#1230: OSError not caught when writing cache to disk * Don't show hashing warning when user imports a module inside a cached function. * Fix streamlit#6: report should loads element by element (was broken since Sidebar PR) * Try to fix CircleCI issue: Method 'renderElementWithErrorBoundary' expected no return value * Add typing to function. * Remove unecessary dockerfiles * Fix a bunch of examples * Fix color of syntax error code in modal dialog. * Remove old style copyright from conftest.py * Remove st.warning when hashing a class. * Add astor to Python requirements * Update widget API in some e2e tests * Clean up st.foo_widget() signatures in st.help() * Lint docs/getting_started.md * Clean up widget examples in docstrings. * Add comment to slider examples in docs. * Tweak Streamlit's description in setup.py * Sort pipfile modules * Re-add scripts and Makefile rules to publish packages to conda. * Fix update_version.py to support new conda location and remove old docker folder. * Make `streamlit config show` pipeable to a file. * Update docs/ site * Remove /docs/api invalidation from Makefile * Improve docstring for st.slider * Remove st.Cache (capital C) from __init__ while we figure out what to call it. * Up version to 0.45.0 * Turn off tests for st.Cache (capital C) * Try turning off tests again * Test frontend in CircleCI (streamlit#35) * CircleCI: run frontend tests after python tests * `make jstest`: don't run non-existent 'coverage' script * Add `make jscoverage` in case we want to use it * Make sure we generate scssvars * Conditionally run our frontend tests * 403 Error message (streamlit#37) * 403 Error message * ESLint * Adding width and height parameter to st.dataframe (streamlit#33) This PR proposes to add a width and height parameter to st.dataframe. Protobuf is changed to include an ElementDimensionSpec field to carry this information. This is included as part of the forward message metadata. The message is parsed and height/width information is extracted and passed down to the Dataframe React component in Block.tsx. Width is set in Block.tsx since existing logic applies to a generic element, while height is set in the Dataframe element in some ad hoc logic. * Fix WebsocketConnection totalTries count (streamlit#38) * Fixing overlap between onload and onreadystatechange * Removing unused events

@tvst

This implements `ForwardMsgCache` expiration on both the client and the server. - The `global.maxCachedMessageAge` config value specifies the maximum age of a message in the cache. The age of a message is defined by the number of times a referencing report has finished running since the message was accessed by that report. So when a report finishes running for the first time, the age of all its cached messages is 1. - Our default maxCachedMessageAge is 2, which means that messages will remain cached even after not having been referenced for a report run. - To determine the age of messages in its cache, the server tracks a `report_run_count` alongside each `ReportSession` in its websocket->reportsession dictionary. The server increments this value each time it processes a `report_finished` ForwardMsg message for a report. - When the server increments `report_finished`, it also asks the cache to remove any expired messages. I haven't done anything fancy here - the cache just iterates through all its entries, does an age check, and deletes them if they're expired. @tvst, I know you were potentially concerned about performance issues with the "iterate the whole cache" strategy. We could add a simple timer to this function to collect local metrics to see if it's actually an issue? - Similarly, when the client receives a `report_finished` ForwardMsg, it performs the same cache expiration step. This means that the server and client's caches should stay in sync. (I haven't thought hard enough about whether it's possible for them to get briefly out of sync, but if they do, we have the new `/message` endpoint as a fallback.) - If a report fails to run due to a compilation error, neither the server nor the client will increment the `report_run_count` for that session. (The run-count incrementing is conditional on the status value of the `report_finished` message, which is set to `FINISHED_WITH_COMPILE_ERROR` when that happens.) There are client and server tests for all this new logic. But the easiest way to see the expiration in action is to run something like `examples/core/message_deduping.py` and watch the debug output on the server and the client. (You'll see logs for cache hits and misses - you should never see a miss! - on the client; and logs for sending cached message refs on the server.) - This example just creates a big dataframe and sends it twice - The first time the report is run, the message will only be delivered to the client once. - If you rerun the report, the message won't be delivered at all because it'll be cached. - If you remove the `st.dataframe()` calls and rerun the report, the dataframe will *not* be removed from the cache because it won't be old enough. Re-adding those calls and rerunning the report should confirm this. - If you remove the dataframe calls and re-run the report _twice_ with them missing, they will then be expired from the cache. Re-adding them should result in the message being resent to the client.

* develop: (54 commits) Removing uber demo from streamlit repo (streamlit#159) Release 0.46.0 (streamlit#170) Magic fixes (streamlit#138) [docs] Add analytics; redirect /secret/docs; fix compilation problem (streamlit#149) Fix bug for startup under windows (streamlit#144) Responsive layout (streamlit#104) Add basic PR template Better method signatures (streamlit#88) Publish docs to both /docs and /secret/docs, until we deprecate /secret/docs in January. (streamlit#141) Rename/report2app (streamlit#126) Test running streamlit commands "bare" (streamlit#133) Updates to LP and sidebar. (streamlit#134) tests for z-index of date input popover in sidebar (streamlit#131) cypress test for escaping html in markdown (streamlit#125) On a Mac, check if xcode installed before requiring watchdog (streamlit#91) [Docs] Fix st.slider API in tutorial (streamlit#98) Sidebar exceptions (streamlit#107) Fixing unbound local variable (streamlit#110) Support hashing dataframes with unhashable objects. Gracefully f… (streamlit#118) Fix hashing if the object has a name but the name is not a string. (streamlit#117) ...

tvst

Cool, I'll just blindly approve since that branch was already reviewed.

That said, there are tons of ESLint errors. If they're not related to this PR, feel free to merge as is, so you can avoid merge conflicts.

Also: I'm not sure whether these lint errors will pop up for everyone as soon as this PR is merged. Do errors only show up for files modified in each person's PR? Or all files in the repo?. If the latter, then please address the errors in a subsequent PR.

tconkling · 2019-09-24T15:48:44Z

There were only three or four errors - the rest are a bunch of warnings about missing return types on non-Typescript code.

Looks like @kantuni disabled failure-on-warning, which seems like the right thing to do when faced with this torrent of silly issues. That said, there was a point when our eslint configuration did not complain about Typescript-specific issues on js files. I'm not sure what changed?

# By Jonathan Rhone (3) and others # Via GitHub * develop: Text changes in ☰ menu and several CLI prompts (streamlit#180) add format option for slider widget and return texts for display values (streamlit#154) Merge ForwardMsg caching into develop (streamlit#178) e2e test for magic (streamlit#172) Bart example update (streamlit#179) Pedantic CircleCI cleanup (streamlit#174) Fix overwriting elements in the sidebar (streamlit#181) update vega lite chart snapshot (streamlit#183) Stop react-markdown from converting "[foo]" to a link. (streamlit#151) Issue 1204: Graphs using Vega-Lite (streamlit#56) Removing uber demo from streamlit repo (streamlit#159) # Conflicts: # lib/streamlit/DeltaGenerator.py

tconkling added 11 commits August 28, 2019 13:35

Run black on all python code

487fb42

Run prettier on all js

22ac70b

Run black again

4e64964

Restore some accidentally modified files

7b51cfc

Run prettier again

b4fff94

Undo some reformatting changes not part of this PR

6db0c78

Remove some more unrelated formatting

4cee014

tconkling requested a review from tvst September 23, 2019 23:46

tconkling marked this pull request as ready for review September 23, 2019 23:47

tvst approved these changes Sep 24, 2019

View reviewed changes

Fix a couple jslint errors

0727852

tconkling added 2 commits September 24, 2019 11:34

Fixes

caa0666

message_deduping.spec.ts

e0e300b

tconkling merged commit fbe5c38 into streamlit:develop Sep 24, 2019

tconkling deleted the tim/ApplyBlackAndPrettier branch September 24, 2019 19:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge ForwardMsg caching into develop #178

Merge ForwardMsg caching into develop #178
tconkling merged 14 commits intostreamlit:developfrom
tconkling:tim/ApplyBlackAndPrettier

tconkling commented Sep 23, 2019 •

edited

Loading

Uh oh!

tvst left a comment •

edited

Loading

Uh oh!

tconkling commented Sep 24, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

tconkling commented Sep 23, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Server

Client

Cache expiration

Uh oh!

tvst left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tconkling commented Sep 24, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

tconkling commented Sep 23, 2019 •

edited

Loading

tvst left a comment •

edited

Loading