Fix fragment runtime errors for just removed fragments #9965

raethlein · 2024-12-04T15:59:57Z

Describe your changes

Split into the following PRs: #10132, #10147, #10130

In issue #9921, the st.rerun call inside the dialog triggers a rerun. If you click the button again before the dialog closes, another RequestRerun is sent to the server. By the time this second request is processed, the full-app rerun has already happened and cleared the fragment storage, so the fragment id can no longer be found and the error is thrown.

This PR adds a previous_fragment_ids list to the FragmentStorage and only raises an error if the fragment id has never been seen. Otherwise, only a log message is emitted.
Discussion 1: We could always just emit a log and not introduce the additional list.
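
To make the idea concrete, here is a minimal sketch of a storage that remembers ids it has cleared. It is a toy model, not Streamlit's actual FragmentStorage: the names `ToyFragmentStorage`, `lookup`, and `FragmentNotFoundError` are assumptions made up for illustration.

```python
import logging
from typing import Callable, Dict, Optional, Set

logger = logging.getLogger("toy_fragment_storage")


class FragmentNotFoundError(KeyError):
    """Hypothetical error for a fragment id that was never registered."""


class ToyFragmentStorage:
    """Toy stand-in for a fragment storage that remembers cleared ids."""

    def __init__(self) -> None:
        self._fragments: Dict[str, Callable[[], object]] = {}
        self._previous_fragment_ids: Set[str] = set()

    def set(self, fragment_id: str, fragment: Callable[[], object]) -> None:
        self._fragments[fragment_id] = fragment

    def clear(self) -> None:
        # A full-app rerun drops all fragments but remembers their ids.
        self._previous_fragment_ids.update(self._fragments)
        self._fragments.clear()

    def lookup(self, fragment_id: str) -> Optional[Callable[[], object]]:
        if fragment_id in self._fragments:
            return self._fragments[fragment_id]
        if fragment_id in self._previous_fragment_ids:
            # Known id that was just removed: treat it as a late rerun
            # request and only log.
            logger.warning("Fragment %s was removed; skipping.", fragment_id)
            return None
        # Never seen before: this points to a real bug, so raise.
        raise FragmentNotFoundError(fragment_id)
```

In this sketch, a second RequestRerun that arrives after `clear()` gets `None` back instead of an exception, while an id that was never registered still raises.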

After making this change, another issue surfaced: the error itself was gone, but the Dialog stayed open sometimes. There are two different root causes for this (both race conditions interplaying with the ScriptRunner threads etc.):

In forward_msg_queue, the FinishedScript type was always set to STOPPED_EARLY_FOR_RERUN. Now, fragment runs no longer change the status of full app reruns. This is correct since a fragment indeed does not stop full app reruns. The problem only occurred when the fragment run was so fast that the browser queue had not been cleared yet.
In app_session, a fragment run might create a new ScriptRunner when the current ScriptRunner is in state STOPPED (in this case, success here is false and the new ScriptRunner is created). This leads to all events from the previous script runner being ignored (see here). When the full app rerun ScriptRunner is done (STOPPED) but its events have not been processed before the new ScriptRunner is created, its finished message is not sent to the frontend and no cleanup happens.
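
The second root cause can be illustrated with a toy model of a session that drops events from any runner other than the current one. This is only a sketch under that assumption: `ToySession`, `ToyRunner`, and `simulate` are hypothetical names and do not mirror the real AppSession or ScriptRunner code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyRunner:
    name: str


@dataclass
class ToySession:
    current_runner: Optional[ToyRunner] = None
    forwarded: List[str] = field(default_factory=list)

    def handle_event(self, sender: ToyRunner, event: str) -> None:
        # Events from anything but the current runner are silently dropped.
        if sender is not self.current_runner:
            return
        self.forwarded.append(f"{sender.name}:{event}")


def simulate(swap_before_finish_is_handled: bool) -> List[str]:
    session = ToySession()
    full_app = ToyRunner("full_app")
    session.current_runner = full_app
    if swap_before_finish_is_handled:
        # The fragment run replaces the runner before the full-app
        # runner's finished event has been processed.
        session.current_runner = ToyRunner("fragment")
    session.handle_event(full_app, "script_finished")
    return session.forwarded
```

With `simulate(False)` the finished event is forwarded; with `simulate(True)` it is lost, which corresponds to the dialog never receiving its cleanup signal.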

GitHub Issue Link (if applicable)

Testing Plan

Explanation of why no additional tests are needed
Unit Tests (JS and/or Python): Added unit tests for the new behavior
E2E Tests: Reproducing the issues requires multiple tries as the issue is caused by some race conditions. This would lead to flaky E2E tests so the unit tests should be sufficient.
Any manual testing needed?

Contribution License Agreement

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

kmcgrady

Overall, the approach seems solid. My only question is whether we should consider clearing past fragment ids after some time (or after a rerun or two without them). It just seems like this storage can get arbitrarily large. We may also be fine with this because the storage is in the AppSession, which is tied to a browser tab staying alive that long (which is very unlikely). Just wondering if we can make a guarantee where we know for sure that we should never see an old fragment id.

lib/streamlit/runtime/app_session.py

kmcgrady · 2024-12-05T18:14:29Z

lib/streamlit/runtime/forward_msg_queue.py

Might benefit from a comment to clarify how this relates to the fragments case (like the one from the PR description)

Great point, I have added a comment to explain this in more detail

raethlein · 2024-12-06T23:07:53Z

Yeah, that's a great point. I think the best option might be to use a cache counter and drop the previous ids after clear has been called X times on the fragment_storage, similar to what you wrote about clearing after 2 reruns. Tying it to the clear function might be clearer and easier to grasp than tying it to reruns, which would have to be tracked somewhere else.
Or we never raise the exception (even if we have never seen the fragment id at all) and simply emit a log, which would allow us to remove the previous list completely.
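
A rough sketch of the counter idea, assuming a hypothetical `ToyPreviousIds` helper with a `max_clears` parameter (neither exists in Streamlit): each removed id carries an age that increases on every `clear()` call, and the id is forgotten once the age reaches the limit.

```python
from typing import Dict, Set


class ToyPreviousIds:
    """Toy tracker that forgets removed ids after a number of clears."""

    def __init__(self, max_clears: int = 2) -> None:
        self._max_clears = max_clears
        self._current: Set[str] = set()
        # Maps a removed id to the number of clears seen since removal.
        self._age: Dict[str, int] = {}

    def add(self, fragment_id: str) -> None:
        self._current.add(fragment_id)
        self._age.pop(fragment_id, None)

    def clear(self) -> None:
        # Age the ids that were already removed and evict the old ones.
        self._age = {
            fid: age + 1
            for fid, age in self._age.items()
            if age + 1 < self._max_clears
        }
        # Remember the ids removed by this clear with age 0.
        for fid in self._current:
            self._age[fid] = 0
        self._current.clear()

    def was_recently_removed(self, fragment_id: str) -> bool:
        return fragment_id in self._age
```

This keeps the remembered set bounded by the fragments removed in the last few clears, at the cost of one extra dict.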

lukasmasuch · 2025-01-06T17:35:39Z

PR looks good, but it seems that the PR contains two slightly independent fixes. Would it be possible to split it into two PRs? That might make it easier to handle follow-up issues that might arise from the fixes.

Discussion 1: We could always just emit a log and do not introduce the additional list.

I'm leaning slightly more towards keeping it simpler: always logging it to the console and never raising an exception that's shown to the user. I don't think it adds a lot of benefit to show this error to the user in any situation, and the developer can see it in the logs. Adding the additional complexity just to be able to show the error sometimes in the UI doesn't feel worth it.
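
The simpler variant could look roughly like the sketch below. The helper `run_fragment_or_log` and the plain dict used as storage are assumptions for illustration, not the code that was merged.

```python
import logging
from typing import Callable, Dict

logger = logging.getLogger("toy_fragments")


def run_fragment_or_log(
    fragments: Dict[str, Callable[[], None]], fragment_id: str
) -> bool:
    """Run the fragment if it is still registered; otherwise log and skip."""
    fragment = fragments.get(fragment_id)
    if fragment is None:
        # Never raise: a missing id is most likely a late rerun request
        # for a fragment that a full-app rerun already removed.
        logger.warning("Could not find fragment with id %s; skipping.", fragment_id)
        return False
    fragment()
    return True
```

No second list is needed here, since known and unknown ids are handled the same way.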

raethlein · 2025-01-08T12:27:53Z

Good points, I have started to split the PRs, here is the first one: PR-10130 to change from throwing an error to just showing a log

…10130)

## Describe your changes

After the discussion in context of this PR: #9965, we want to show a log instead of raising an error in all cases to keep it simple and not introduce another list / more tech debt.

## GitHub Issue Link (if applicable)

## Testing Plan

- Unit Tests (JS and/or Python)
  - Remove a couple of unit tests that checked for the exception to be thrown

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

raethlein · 2025-01-10T13:18:50Z

Close in favor of the split PRs: #10132, #10147, #10130

raethlein added the security-assessment-completed, change:bugfix, and impact:users labels Dec 4, 2024

raethlein changed the title ~~Fix showing exception and potential race condition~~ [Draft] Fix fragment runtime errors for just removed fragments Dec 4, 2024

raethlein requested a review from kmcgrady December 4, 2024 18:01

raethlein force-pushed the fix/issue-9921 branch from 0f93041 to 19912dc Compare December 4, 2024 23:41

kmcgrady reviewed Dec 5, 2024

raethlein force-pushed the fix/issue-9921 branch from 19912dc to 52908ce Compare December 6, 2024 23:03

raethlein marked this pull request as ready for review December 6, 2024 23:19

raethlein force-pushed the fix/issue-9921 branch from 52908ce to d8fa998 Compare December 7, 2024 00:02

raethlein changed the title ~~[Draft] Fix fragment runtime errors for just removed fragments~~ Fix fragment runtime errors for just removed fragments Dec 13, 2024

raethlein force-pushed the fix/issue-9921 branch 2 times, most recently from 9d62354 to 6ac2fbf Compare December 14, 2024 10:06

raethlein added 4 commits January 8, 2025 12:58

Fix showing exception and potential race condition

9331046

Fix tests

ebe4da6

Remove print statements

21bddf9

Address feedback

9be0735

raethlein force-pushed the fix/issue-9921 branch from 6ac2fbf to 9be0735 Compare January 8, 2025 12:01

raethlein mentioned this pull request Jan 8, 2025

Only show log but don't raise an error when fragment id is not found #10130

Merged

This was referenced Jan 8, 2025

Don't update FinishedMessage status of full app runs by fragment runs #10132

Merged

Don't request_rerun for fragments that do not exist anymore #10147

Merged

raethlein closed this Jan 10, 2025

akramsystems added a commit to akramsystems/streamlit that referenced this pull request Apr 27, 2025

streamlitgh-9965 rm changes to event_based_path_watcher

744040e
