Skip to content

Conversation

@raethlein
Copy link
Collaborator

@raethlein raethlein commented Jan 8, 2025

Describe your changes

Related to #9921 and extracted from PR #9965

In #10130, we stopped raising a RuntimeError when a fragment with a specified id cannot be found. This solves the original issue in #9921, but surfaced a different issue where the dialog did not close anymore when the button in the dialog was clicked in fast succession. The reason for this is that in forward_msg_queue, the FinishedScript type prior to this PR here is always set to STOPPED_EARLY_FOR_RERUN. This is now modified so that the status is not changed for full app reruns triggered by a fragment runs. This resolves the described issue with the not-closing dialog because a second, fast subsequent click on the button does not override the initial finished-message in the queue belonging to the st.rerun() in the example.
This is correct since a fragment indeed does not stop full app reruns. This only happens when the fragment run is so fast that the browser queue is not cleared yet.

GitHub Issue Link (if applicable)

Testing Plan

  • Unit Tests (JS and/or Python)
    • Updated Python unit tests.
  • E2E Tests
    • Added an e2e test that can catch an underlying issue where the dialog does not close correctly. However, since the cause of the particular issue addressed in the PR is a race condition, the test can succeed even if the issue exists. However, it should still fail from time to time which can help us to get aware of an underlying problem.
  • Any manual testing needed
    • The test introduced in the e2e test is more likely to fail when executing manually, so that should be done if an issue arises.

Contribution License Agreement

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

@raethlein raethlein force-pushed the fix/issue-9921-2 branch 2 times, most recently from 1a154b3 to 4e9a59e Compare January 8, 2025 15:56
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am myself unsure about whether we want to have the test. The test might succeed even without the fix but only fail sometimes, so its not a 100% guarantee catch. But it might still be a good indicator to become aware of the issue eventually if the test starts to fail / fails from time to time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better than nothing 👍

@raethlein raethlein added security-assessment-completed Security assessment has been completed for PR impact:internal PR changes only affect internal code change:bugfix PR contains bug fix implementation labels Jan 8, 2025
@raethlein raethlein requested a review from lukasmasuch January 8, 2025 19:23
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Could also use click_button from app_utils (if its fine to wait for the end of the rerun)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I have replaced all those occurrences with it 👍

Copy link
Collaborator

@lukasmasuch lukasmasuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better than nothing 👍

@raethlein raethlein merged commit 31d06f7 into develop Jan 9, 2025
33 checks passed
raethlein added a commit that referenced this pull request Jan 10, 2025
## Describe your changes

We stopped raising
a RuntimeError when a fragment with a specified id cannot be found. This
solves the original issue in
#9921, but surfaced a
different issue where the dialog did not always close when the button in
the dialog was clicked in fast succession.

The reason for this is that in app_session, a fragment run might create
a new ScriptRunner when the current ScriptRunner is in state STOPPED
is false and the new ScriptRunner is created). This will lead to all
events from the previous script runner being ignored.
When the full app rerun ScriptRunner is done (STOPPED) but its events
are not processed before the new ScriptRunner is created, its finished
message is not sent to the frontend and no cleanup is happening.

## GitHub Issue Link (if applicable)

## Testing Plan

- Unit Tests (JS and/or Python)
- Add unit test to ensure that the scriptrunner is not requested if the
fragment does not even exist anymore
- E2E Tests
- The e2e test that was added in
#10132 _could_ catch this one
here, although it happens more rarely than the race condition tackled in
that PR

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.
edegp pushed a commit to edegp/streamlit that referenced this pull request Jan 19, 2025
…streamlit#10132)

## Describe your changes

We stopped raising a `RuntimeError` when a fragment with a specified id cannot be found. This solves the original issue in streamlit#9921, but surfaced a different issue where the dialog did not close anymore when the button in the dialog was clicked in fast succession. The reason for this is that in `forward_msg_queue`, the `FinishedScript` type prior to this PR here is always set to `STOPPED_EARLY_FOR_RERUN`. This is now modified so that the status is not changed for full app reruns triggered by a fragment runs. This resolves the described issue with the not-closing dialog because a second, fast subsequent click on the button does not override the initial finished-message in the queue belonging to the `st.rerun()` in the example.
This is correct since a fragment indeed does not stop full app reruns. This only happens when the fragment run is so fast that the browser queue is not cleared yet.

## GitHub Issue Link (if applicable)

## Testing Plan

- Unit Tests (JS and/or Python)
  - Updated Python unit tests.
- E2E Tests
  - Added an e2e test that can catch an underlying issue where the dialog does not close correctly. However, since the cause of the particular issue addressed in the PR is a race condition, the test can succeed even if the issue exists. However, it should still fail from time to time which can help us to get aware of an underlying problem.
- Any manual testing needed
  - The test introduced in the e2e test is more likely to fail when executing manually, so that should be done if an issue arises.

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.
edegp pushed a commit to edegp/streamlit that referenced this pull request Jan 19, 2025
…t#10147)

## Describe your changes

We stopped raising
a RuntimeError when a fragment with a specified id cannot be found. This
solves the original issue in
streamlit#9921, but surfaced a
different issue where the dialog did not always close when the button in
the dialog was clicked in fast succession.

The reason for this is that in app_session, a fragment run might create
a new ScriptRunner when the current ScriptRunner is in state STOPPED
is false and the new ScriptRunner is created). This will lead to all
events from the previous script runner being ignored.
When the full app rerun ScriptRunner is done (STOPPED) but its events
are not processed before the new ScriptRunner is created, its finished
message is not sent to the frontend and no cleanup is happening.

## GitHub Issue Link (if applicable)

## Testing Plan

- Unit Tests (JS and/or Python)
- Add unit test to ensure that the scriptrunner is not requested if the
fragment does not even exist anymore
- E2E Tests
- The e2e test that was added in
streamlit#10132 _could_ catch this one
here, although it happens more rarely than the race condition tackled in
that PR

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.
@lukasmasuch lukasmasuch deleted the fix/issue-9921-2 branch March 13, 2025 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

change:bugfix PR contains bug fix implementation impact:internal PR changes only affect internal code security-assessment-completed Security assessment has been completed for PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants