
fix: fixed a race condition for terraform task statuses #3686

Merged
fiftin merged 1 commit into semaphoreui:develop from JulianKap:fix/tasks-status-race
Mar 9, 2026

@JulianKap
Contributor

PR Title

Fix race condition in runner task status sync causing incorrect confirmation state for Terraform tasks

Description

This PR fixes a race condition in the runner that could overwrite task confirmation statuses during synchronization with the server.

Observed problem

With separate runners, a Terraform task is expected to transition to the "waiting for confirmation" state once terraform plan detects infrastructure changes.

In some runs, however, the task remained in the "running" state even though the logs clearly showed that Terraform had finished the plan phase and was waiting for confirmation.

As a result:

  • the task never transitioned to the confirmation state in the UI
  • the infrastructure update could not be approved
  • the task had to be restarted multiple times until the confirmation state eventually appeared

This behavior was intermittent and difficult to reproduce reliably.

Root cause

The runner uses two concurrent goroutines:

  1. Status sync goroutine — periodically fetches task statuses from the server and updates local task states if they differ.
  2. Reporting goroutine — collects local task states and logs and sends them back to the server.

A race condition could occur in the following scenario:

  1. A task running on the runner transitions to the "waiting for confirmation" state.
  2. At the same time, the status sync goroutine fetches task states from the server where the task is still "running".
  3. Because the statuses differ, the runner overwrites the local task status with the server status (running).
  4. Immediately after that, the reporting goroutine sends the current task states back to the server.

As a result, instead of sending the expected "waiting for confirmation" status, the runner sends "running", effectively losing the confirmation state.
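
The interleaving above can be replayed deterministically. The following is only a minimal sketch, not the runner's actual code: the `TaskStatus` type, the status string values, and the `naiveSync` and `replayRace` helpers are hypothetical names introduced for illustration, and the "server wins whenever the statuses differ" rule is taken from the description of step 3.

```go
package main

import "fmt"

// TaskStatus is a hypothetical stand-in for the runner's status type.
type TaskStatus string

// The string values below are assumptions made for this sketch.
const (
	StatusRunning             TaskStatus = "running"
	StatusWaitingConfirmation TaskStatus = "waiting_confirmation"
)

// naiveSync models the pre-fix behaviour described in step 3: whenever the
// local and server statuses differ, the server value replaces the local one.
func naiveSync(local, server TaskStatus) TaskStatus {
	if local != server {
		return server
	}
	return local
}

// replayRace runs the four steps in a fixed order and returns the status
// that the reporting goroutine would send to the server.
func replayRace() TaskStatus {
	server := StatusRunning            // the server has not seen the plan result yet
	local := StatusWaitingConfirmation // step 1: terraform plan found changes
	local = naiveSync(local, server)   // steps 2-3: the sync overwrites local state
	return local                       // step 4: this value is reported back
}

func main() {
	fmt.Println(replayRace()) // prints "running"
}
```

In the real runner the two goroutines interleave nondeterministically, which matches the intermittent behaviour; the sketch fixes the order only to make the lost update visible.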

Additional issue

A similar race condition also affected the "confirmed" status.
In this case the issue was mostly visual and did not affect actual task execution, but the root cause was the same concurrent state overwrite.

Solution

The fix ensures that task status updates from the server do not overwrite local runner states that have already transitioned to confirmation-related statuses.

This prevents the runner from reverting task states when synchronization happens concurrently with local state transitions.
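
A guard of this kind might look like the sketch below. It is an illustration under stated assumptions, not the merged diff: `TaskStatus`, the status string values, `isConfirmationStatus`, and `guardedSync` are hypothetical names, and the sketch only ignores a stale "running" value from the server; the actual change may draw the line differently.

```go
package main

import "fmt"

// TaskStatus is a hypothetical stand-in for the runner's status type.
type TaskStatus string

// The string values below are assumptions made for this sketch.
const (
	StatusRunning             TaskStatus = "running"
	StatusWaitingConfirmation TaskStatus = "waiting_confirmation"
	StatusConfirmed           TaskStatus = "confirmed"
	StatusStopping            TaskStatus = "stopping"
)

// isConfirmationStatus reports whether s is one of the confirmation-related
// statuses that the sync step must not revert.
func isConfirmationStatus(s TaskStatus) bool {
	return s == StatusWaitingConfirmation || s == StatusConfirmed
}

// guardedSync returns the status the runner should hold after a sync.
// A stale "running" from the server no longer overwrites a local
// confirmation-related status; every other server value is still applied.
func guardedSync(local, server TaskStatus) TaskStatus {
	if isConfirmationStatus(local) && server == StatusRunning {
		return local // keep the newer local state
	}
	return server
}

func main() {
	fmt.Println(guardedSync(StatusWaitingConfirmation, StatusRunning)) // prints "waiting_confirmation"
	fmt.Println(guardedSync(StatusRunning, StatusStopping))            // prints "stopping"
}
```

Keeping the guard narrow means server-driven transitions, such as a user confirming or stopping the task, still reach the runner.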

Result

  • Eliminates the race condition between status synchronization and reporting
  • Prevents loss of "waiting for confirmation" state
  • Fixes incorrect "confirmed" status display
  • Ensures Terraform confirmation workflow works reliably

@fiftin
Collaborator

fiftin commented Mar 8, 2026

Hi @JulianKap, thank you! Have you tested it?

@JulianKap
Contributor Author

Hi @fiftin! Yes, I tested this fix on individual runners, and the error with the Terraform template no longer reproduced (I also tested Ansible templates). I also added temporary logs to track the behavior when receiving server statuses, and they confirmed that the issue was fixed.

@fiftin fiftin merged commit 32fb433 into semaphoreui:develop Mar 9, 2026
13 checks passed
@fiftin
Collaborator

fiftin commented Mar 9, 2026

Thank you!
