Skip to content

Merge ForwardMsg caching into develop #178

Merged
tconkling merged 14 commits intostreamlit:developfrom
tconkling:tim/ApplyBlackAndPrettier
Sep 24, 2019
Merged

Merge ForwardMsg caching into develop #178
tconkling merged 14 commits intostreamlit:developfrom
tconkling:tim/ApplyBlackAndPrettier

Conversation

@tconkling
Copy link
Copy Markdown
Contributor

@tconkling tconkling commented Sep 23, 2019

Implements ForwardMsg caching on the client and server.

Server

  • ForwardMsg.hash is calculated on every outgoing ForwardMsg
  • The server caches all messages whose size is >= 10k in its MessageCache instance. (These cached message are never evicted - this will come in a future PR.) Each entry in the cache contains the cached message, and also a (weak-referenced) set of all ReportSessions for which that message has already been delivered.
  • Cached ForwardMsgs have the metadata.cacheable flag set. When the client receives a ForwardMsg with cacheable == true, it also caches it.
  • If the server believes that a client has cached a given ForwardMsg (because the cache has an entry for that message+session), it instead sends a "reference" message that points to the original message via ForwardMsg.ref_hash.
  • The server also exposes a new HTTP endpoint, /message, from which clients can request messages that they have received a ref_hash message for, but are missing in that client's local cache. (This should not happen frequently! If clients are getting a bunch of local cache misses, then the server is not maintaining its cache properly.)

Client

  • WebsocketConnection has a new ForwardMsgCache member
  • All incoming ForwardMsgs are passed through this cache (via ForwardMsgCache.processMessage):
    • If the ForwardMsg is not a reference message, it is cached locally and nothing more is done
    • If the ForwardMsg is a reference message, the client checks to see if it has the original message in its cache; if so, it gets returned
    • If the reference message is missing from the client's cache, it is requested from the server's /message endpoint, and then cached.

Cache expiration

  • The global.maxCachedMessageAge config value specifies the maximum age of a message in the cache. The age of a message is defined by the number of times a referencing report has finished running since the message was accessed by that report. So when a report finishes running for the first time, the age of all its cached messages is 1.
  • Our default maxCachedMessageAge is 2, which means that messages will remain cached even after not having been referenced for a report run.
  • To determine the age of messages in its cache, the server tracks a report_run_count alongside each ReportSession in its websocket->reportsession dictionary. The server increments this value each time it processes a report_finished ForwardMsg message for a report.
  • When the server increments report_finished, it also asks the cache to remove any expired messages. I haven't done anything fancy here - the cache just iterates through all its entries, does an age check, and deletes them if they're expired.
  • Similarly, when the client receives a report_finished ForwardMsg, it performs the same cache expiration step. This means that the server and client's caches should stay in sync. (I haven't thought hard enough about whether it's possible for them to get briefly out of sync, but if they do, we have the new /message endpoint as a fallback.)
  • If a report fails to run due to a compilation error, neither the server nor the client will increment the report_run_count for that session. (The run-count incrementing is conditional on the status value of the report_finished message, which is set to FINISHED_WITH_COMPILE_ERROR when that happens.)

There are client and server tests for all this new logic. But the easiest way to see the expiration in action is to run something like examples/core/message_deduping.py and watch the debug output on the server and the client. (You'll see logs for cache hits and misses - you should never see a miss! - on the client; and logs for sending cached message refs on the server.)

  • This example just creates a big dataframe and sends it twice
  • The first time the report is run, the message will only be delivered to the client once.
  • If you rerun the report, the message won't be delivered at all because it'll be cached.
  • If you remove the st.dataframe() calls and rerun the report, the dataframe will not be removed from the cache because it won't be old enough. Re-adding those calls and rerunning the report should confirm this.
  • If you remove the dataframe calls and re-run the report twice with them missing, they will then be expired from the cache. Re-adding them should result in the message being resent to the client.

This is the first part of ForwardMsg caching:

## Server

- `ForwardMsg.hash` is calculated on every outgoing ForwardMsg
- The server caches all messages whose size is >= 10k in its `MessageCache` instance. (These cached message are never evicted - this will come in a future PR.) Each entry in the cache contains the cached message, and also a (weak-referenced) set of all ReportSessions for which that message has already been delivered.
- Cached ForwardMsgs have the `metadata.cacheable` flag set. When the client receives a ForwardMsg with `cacheable == true`, it also caches it.
- If the server believes that a client has cached a given ForwardMsg (because the cache has an entry for that message+session), it instead sends a "reference" message that points to the original message via `ForwardMsg.ref_hash`.
- The server also exposes a new HTTP endpoint, `/message`, from which clients can request messages that they have received a `ref_hash` message for, but are missing in that client's local cache. (This should not happen frequently! If clients are getting a bunch of local cache misses, then the server is not maintaining its cache properly.)

## Client

- `WebsocketConnection` has a new `ForwardMsgCache` member
- All incoming `ForwardMsgs` are passed through this cache (via `ForwardMsgCache.processMessage`):
  - If the ForwardMsg is *not* a reference message, it is cached locally and nothing more is done
  - If the ForwardMsg *is* a reference message, the client checks to see if it has the original message in its cache; if so, it gets returned
  - If the reference message is missing from the client's cache, it is requested from the server's `/message` endpoint, and then cached.

## TODO

- The client and server both need to perform cache eviction; currently, their caches never have anything removed from them.
- I'd like to have some better client tests that use a mock server to test its handling of the `/message` endpoint
* Adding unit tests for magic (streamlit#13)

* Adding unit tests for magic

* Standardize cypress viewport across all tests and fixes (streamlit#11)

- standardize cypress viewport across all tests
- remove snapshot corresponding to missing `mplotlib` test
- fixed issue with snapshots being cutoff or wrong element being captured sporadically
- increased snapshot diff threshold by .5%
- increase snapshot bounding box for `disconnect.py` and `empty_dataframes.py`
- allow multiple valid width values for `vega_lite_chart.py`
- set `devicePixelRatio` to 1 when running `cypress open'
- remove spinner e2e test and core example
- fix various cypress tests
- preload `viz.js` when running `yarn start`

* Deprecating data parameter from st.deck_gl_chart() (streamlit#22)

* Deprecating data parameter for st.deck_gl_chart()

* Basic deck_gl example setting viewport

* DOCSTRING

* LINT

* Moving deck_gl_chart test from streamlit_test to deck_gl_test

* Checking kwargs instead of a data parameter

* Fixing comments from thiago

* st.image() resizing images when are bigger than the maximum (streamlit#30)

* Resizing image when its bigger than the maximum width

* Resizing image when its bigger than the maximum width

* MAXIMUM_CONTENT_WIDTH documentation

* MAXIMUM_CONTENT_WIDTH documentation

* Hash numpy arrays. Use fixed size sample. (streamlit#16)

* Merge version 0.45.0 (streamlit#34)

* Restore correct vertical spacing for report elements.

* Break Block.tsx into more functions.

* Tiny refactors in Block.tsx

* Add "key" property to element in DocString, to avoid warnings.

* Fix streamlit#1230: OSError not caught when writing cache to disk

* Don't show hashing warning when user imports a module inside a cached function.

* Fix streamlit#6: report should loads element by element (was broken since Sidebar PR)

* Try to fix CircleCI issue: Method 'renderElementWithErrorBoundary' expected no return value

* Add typing to function.

* Remove unecessary dockerfiles

* Fix a bunch of examples

* Fix color of syntax error code in modal dialog.

* Remove old style copyright from conftest.py

* Remove st.warning when hashing a class.

* Add astor to Python requirements

* Update widget API in some e2e tests

* Clean up st.foo_widget() signatures in st.help()

* Lint docs/getting_started.md

* Clean up widget examples in docstrings.

* Add comment to slider examples in docs.

* Tweak Streamlit's description in setup.py

* Sort pipfile modules

* Re-add scripts and Makefile rules to publish packages to conda.

* Fix update_version.py to support new conda location and remove old docker folder.

* Make `streamlit config show` pipeable to a file.

* Update docs/ site

* Remove /docs/api invalidation from Makefile

* Improve docstring for st.slider

* Remove st.Cache (capital C) from __init__ while we figure out what to call it.

* Up version to 0.45.0

* Turn off tests for st.Cache (capital C)

* Try turning off tests again

* Test frontend in CircleCI (streamlit#35)

* CircleCI: run frontend tests after python tests

* `make jstest`: don't run non-existent 'coverage' script

* Add `make jscoverage` in case we want to use it

* Make sure we generate scssvars

* Conditionally run our frontend tests

* 403 Error message (streamlit#37)

* 403 Error message

* ESLint

* Adding width and height parameter to st.dataframe (streamlit#33)

This PR proposes to add a width and height parameter to st.dataframe. Protobuf is changed to include an ElementDimensionSpec field to carry this information. This is included as part of the forward message metadata. The message is parsed and height/width information is extracted and passed down to the Dataframe React component in Block.tsx. Width is set in Block.tsx since existing logic applies to a generic element, while height is set in the Dataframe element in some ad hoc logic.

* Fix WebsocketConnection totalTries count (streamlit#38)

* Fixing overlap between onload and onreadystatechange

* Removing unused events
This implements `ForwardMsgCache` expiration on both the client and the server.

- The `global.maxCachedMessageAge` config value specifies the maximum age of a message in the cache. The age of a message is defined by the number of times a referencing report has finished running since the message was accessed by that report. So when a report finishes running for the first time, the age of all its cached messages is 1.
- Our default maxCachedMessageAge is 2, which means that messages will remain cached even after not having been referenced for a report run.
- To determine the age of messages in its cache, the server tracks a `report_run_count` alongside each `ReportSession` in its websocket->reportsession dictionary. The server increments this value each time it processes a `report_finished` ForwardMsg message for a report.
- When the server increments `report_finished`, it also asks the cache to remove any expired messages. I haven't done anything fancy here - the cache just iterates through all its entries, does an age check, and deletes them if they're expired. @tvst, I know you were potentially concerned about performance issues with the "iterate the whole cache" strategy. We could add a simple timer to this function to collect local metrics to see if it's actually an issue?
- Similarly, when the client receives a `report_finished` ForwardMsg, it performs the same cache expiration step. This means that the server and client's caches should stay in sync. (I haven't thought hard enough about whether it's possible for them to get briefly out of sync, but if they do, we have the new `/message` endpoint as a fallback.)
- If a report fails to run due to a compilation error, neither the server nor the client will increment the `report_run_count` for that session. (The run-count incrementing is conditional on the status value of the `report_finished` message, which is set to `FINISHED_WITH_COMPILE_ERROR` when that happens.)


There are client and server tests for all this new logic. But the easiest way to see the expiration in action is to run something like `examples/core/message_deduping.py` and watch the debug output on the server and the client. (You'll see logs for cache hits and misses - you should never see a miss! - on the client; and logs for sending cached message refs on the server.)
- This example just creates a big dataframe and sends it twice
- The first time the report is run, the message will only be delivered to the client once.
- If you rerun the report, the message won't be delivered at all because it'll be cached.
- If you remove the `st.dataframe()` calls and rerun the report, the dataframe will *not* be removed from the cache because it won't be old enough. Re-adding those calls and rerunning the report should confirm this.
- If you remove the dataframe calls and re-run the report _twice_ with them missing, they will then be expired from the cache. Re-adding them should result in the message being resent to the client.
* develop: (54 commits)
  Removing uber demo from streamlit repo (streamlit#159)
  Release 0.46.0 (streamlit#170)
  Magic fixes (streamlit#138)
  [docs] Add analytics; redirect /secret/docs; fix compilation problem (streamlit#149)
  Fix bug for startup under windows (streamlit#144)
  Responsive layout (streamlit#104)
  Add basic PR template
  Better method signatures (streamlit#88)
  Publish docs to both /docs and /secret/docs, until we deprecate /secret/docs in January. (streamlit#141)
  Rename/report2app (streamlit#126)
  Test running streamlit commands "bare" (streamlit#133)
  Updates to LP and sidebar. (streamlit#134)
  tests for z-index of date input popover in sidebar (streamlit#131)
  cypress test for escaping html in markdown (streamlit#125)
  On a Mac, check if xcode installed before requiring watchdog (streamlit#91)
  [Docs] Fix st.slider API in tutorial (streamlit#98)
  Sidebar exceptions (streamlit#107)
  Fixing unbound local variable (streamlit#110)
  Support hashing dataframes with unhashable objects. Gracefully f… (streamlit#118)
  Fix hashing if the object has a name but the name is not a string. (streamlit#117)
  ...
@tconkling tconkling requested a review from tvst September 23, 2019 23:46
@tconkling tconkling marked this pull request as ready for review September 23, 2019 23:47
Copy link
Copy Markdown
Contributor

@tvst tvst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I'll just blindly approve since that branch was already reviewed.

That said, there are tons of ESLint errors. If they're not related to this PR, feel free to merge as is, so you can avoid merge conflicts.

Also: I'm not sure whether these lint errors will pop up for everyone as soon as this PR is merged. Do errors only show up for files modified in each person's PR? Or all files in the repo?. If the latter, then please address the errors in a subsequent PR.

@tconkling
Copy link
Copy Markdown
Contributor Author

There were only three or four errors - the rest are a bunch of warnings about missing return types on non-Typescript code.

Looks like @kantuni disabled failure-on-warning, which seems like the right thing to do when faced with this torrent of silly issues. That said, there was a point when our eslint configuration did not complain about Typescript-specific issues on js files. I'm not sure what changed?

@tconkling tconkling merged commit fbe5c38 into streamlit:develop Sep 24, 2019
@tconkling tconkling deleted the tim/ApplyBlackAndPrettier branch September 24, 2019 19:34
tconkling added a commit to tconkling/streamlit that referenced this pull request Sep 25, 2019
# By Jonathan Rhone (3) and others
# Via GitHub
* develop:
  Text changes in ☰ menu and several CLI prompts (streamlit#180)
  add format option for slider widget and return texts for display values (streamlit#154)
  Merge ForwardMsg caching into develop  (streamlit#178)
  e2e test for magic (streamlit#172)
  Bart example update (streamlit#179)
  Pedantic CircleCI cleanup (streamlit#174)
  Fix overwriting elements in the sidebar (streamlit#181)
  update vega lite chart snapshot (streamlit#183)
  Stop react-markdown from converting "[foo]" to a link. (streamlit#151)
  Issue 1204: Graphs using Vega-Lite (streamlit#56)
  Removing uber demo from streamlit repo (streamlit#159)

# Conflicts:
#	lib/streamlit/DeltaGenerator.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants