Skip to content

Fix concurrency bug in FutureDependencyExecutor#145

Merged
gpauloski merged 3 commits intomainfrom
test-me-py312-test-hang
Sep 5, 2024
Merged

Fix concurrency bug in FutureDependencyExecutor#145
gpauloski merged 3 commits intomainfrom
test-me-py312-test-hang

Conversation

@gpauloski
Copy link
Copy Markdown
Contributor

Description

The Python 3.12 tests were deadlocking in CI. This turned out to be the proble,

See the commit history for full details. I also made tests in CI run in verbose mode, and found a test dir with a missing __init__.py. Both of these were attempts to find the true source of the test deadlocks---the FutureDependencyError.

Fixes N/A

Type of Change

  • Bug (non-breaking change which fixes an issue)
  • Enhancement (non-breaking change which adds or improves functionality)
  • Internal (refactoring, performance, and testing)
  • Breaking (fix or feature that would cause existing functionality to not work as expected)
  • Documentation (no changes to the code)
  • Development (CI workflows, packages, templates, etc.)
  • Package (package dependencies and versions)

Testing

N/A

Pull Request Checklist

Please confirm the PR meets the following requirements.

  • Relevant tags are added (breaking, bug, documentation, enhancement, package, etc.).
  • Code changes pass pre-commit (e.g., ruff, mypy, etc.).
  • Tests have been added to show the fix is effective or that the new feature works.
  • New and existing unit tests pass locally with the changes.
  • Docs have been updated and reviewed if relevant.

The FutureDependencyExecutor constructor has two subtle bugs it:
 - Future callbacks were added before adding the future to the pending
   futures list which caused assertion errors in the callback if the
   callback was executed between those two lines.
 - _submit() could get called twice if there were pending futures that
   were already finished (so the callback was invoked immediately when
   the callback was added).

These were causing deadlocks on Python 3.12 in the
taps/apps/synthetic_test.py::run_sequential, but only when run in GitHub
actions. I was not able to reproduce the deadlock locally but the logic
error was apparent.

I also noticed that we check for future arguments using Future rather
than FutureProtocol. FutureProtocol was added later and is safer for
executors that implement their own future type. Due to FutureProtocol
now being used outside of taps.engine and a recursive import error, I
moved taps.engine.future to taps.future.
@gpauloski gpauloski added the bug Something isn't working label Sep 5, 2024
@gpauloski gpauloski merged commit 8c6686a into main Sep 5, 2024
@gpauloski gpauloski deleted the test-me-py312-test-hang branch September 5, 2024 22:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant