Always send shutdown_worker RPC, fix WorkerStatus state when shutting down worker by yuandrew · Pull Request #1082 · temporalio/sdk-core

yuandrew · 2025-12-16T22:32:13Z

What was changed

Always send the shutdown_worker RPC, and decouple disabling eager workflow start from worker heartbeat unregistration during worker shutdown.

Why?

The shutdown_worker RPC doesn't indicate that the worker has fully shut down, only that shutdown has started. The server and other observers can tell that a worker has fully shut down by checking whether any heartbeat has arrived within the "heartbeat interval" after receiving the ShuttingDown status.

WorkerStatus during shutdown is not accurate today. The ShutdownWorker RPC does not indicate that the worker has shut down, only that it has begun shutting down. Before this PR, we marked a worker as shut down even though it was still in the process of shutting down.
With the status now set accurately to ShuttingDown, there is no mechanism for a worker to tell the server that it has fully shut down. It is up to the server to mark it as fully shut down, using its own TTL definition. Today this defaults to 5 minutes, https://github.com/temporalio/temporal/blob/main/common/dynamicconfig/constants.go#L1401. This will be improved in the future to be a shorter interval, likely related to the heartbeat interval that the worker needs to start sending. This requires an API update to add this field to the heartbeat.
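The inference described above (a ShuttingDown heartbeat followed by silence for at least the TTL) can be sketched as follows. This is a minimal illustration only: `ReportedStatus`, `Inferred`, and `infer_state` are hypothetical names, not types from sdk-core or the server, and the real server logic may differ.

```rust
use std::time::{Duration, Instant};

/// Status carried by the most recent heartbeat (hypothetical, simplified).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReportedStatus {
    Running,
    ShuttingDown,
}

/// What an observer might conclude about the worker (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Inferred {
    Running,
    ShuttingDown,
    Shutdown,
}

/// Infer a worker's state from its last heartbeat.
///
/// A worker is only considered fully shut down once it has reported
/// `ShuttingDown` and then stayed silent for at least `ttl`.
pub fn infer_state(
    last_status: ReportedStatus,
    last_heartbeat: Instant,
    now: Instant,
    ttl: Duration,
) -> Inferred {
    match last_status {
        ReportedStatus::Running => Inferred::Running,
        ReportedStatus::ShuttingDown => {
            // Saturating subtraction avoids a panic if `now` is earlier
            // than `last_heartbeat`.
            if now.saturating_duration_since(last_heartbeat) >= ttl {
                Inferred::Shutdown
            } else {
                Inferred::ShuttingDown
            }
        }
    }
}
```

With a 5 minute TTL, a worker whose last heartbeat said ShuttingDown ten seconds ago is still treated as shutting down; after the TTL elapses with no further heartbeat it is treated as shut down.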

Checklist

Closes
How was this tested:

Any docs updates needed?

Note

Implements two-phase worker unregistration and ensures shutdown RPC is always sent.

Introduces unregister_slot_provider and finalize_unregister in ClientWorkerSet to decouple disabling eager workflow start from heartbeat unregistration; enforces order and updates all call sites (shutdown, replace_client, tests)
Worker shutdown now always invokes shutdown_worker with final heartbeat and sets status to ShuttingDown; tests updated to expect ShuttingDown and mocks always allow shutdown_worker
Removes client-side mutation of heartbeat status during shutdown RPC; heartbeat cleanup moved to end via finalize_unregister
Minor test/log tweaks (string formatting)

Written by Cursor Bugbot for commit 5221608.
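The two-phase unregistration summarized in the note can be sketched roughly as below. This is not the actual `ClientWorkerSet` code: `WorkerRegistry`, its fields, the `u64` keys, and `UnregisterError` are hypothetical stand-ins chosen to show the ordering rule (slot provider first, heartbeat cleanup last).

```rust
use std::collections::{HashMap, HashSet};

/// Errors for the hypothetical registry below.
#[derive(Debug, PartialEq, Eq)]
pub enum UnregisterError {
    UnknownWorker,
    SlotProviderStillRegistered,
}

/// Simplified stand-in for a client-side worker set.
#[derive(Default)]
pub struct WorkerRegistry {
    /// Workers eligible for eager workflow start.
    slot_providers: HashSet<u64>,
    /// Workers still included in heartbeats, mapped to their task queue.
    heartbeat_workers: HashMap<u64, String>,
}

impl WorkerRegistry {
    pub fn register(&mut self, key: u64, task_queue: &str) {
        self.slot_providers.insert(key);
        self.heartbeat_workers.insert(key, task_queue.to_string());
    }

    /// Phase 1: stop eager workflow start; heartbeating continues.
    pub fn unregister_slot_provider(&mut self, key: u64) -> Result<(), UnregisterError> {
        if self.slot_providers.remove(&key) {
            Ok(())
        } else {
            Err(UnregisterError::UnknownWorker)
        }
    }

    /// Phase 2: drop heartbeat state. The order check happens before any
    /// mutation, so a rejected call leaves the registry unchanged.
    pub fn finalize_unregister(&mut self, key: u64) -> Result<(), UnregisterError> {
        if self.slot_providers.contains(&key) {
            return Err(UnregisterError::SlotProviderStillRegistered);
        }
        self.heartbeat_workers
            .remove(&key)
            .map(|_| ())
            .ok_or(UnregisterError::UnknownWorker)
    }

    pub fn can_eager_start(&self, key: u64) -> bool {
        self.slot_providers.contains(&key)
    }

    pub fn is_heartbeating(&self, key: u64) -> bool {
        self.heartbeat_workers.contains_key(&key)
    }
}
```

Splitting the operation this way lets the worker keep heartbeating with ShuttingDown status while it drains, and only removes the heartbeat entry at the very end of shutdown.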


cursor · 2025-12-17T19:52:51Z

                .workers()
-                .unregister_worker(self.worker_instance_key);
+                .unregister_slot_provider(self.worker_instance_key);
        }


Bug: Shutdown status not set on initiation

initiate_shutdown no longer updates self.status to WorkerStatus::ShuttingDown. Callers that use initiate_shutdown to begin shutdown (before awaiting shutdown/finalize_shutdown) will keep sending heartbeats with Running, delaying/obscuring shutdown signaling and breaking server-side “seen ShuttingDown then no heartbeat” detection.

We want the ShuttingDown state to be set when we send the shutdown_worker RPC.

cursor · 2025-12-17T21:28:37Z

+        // This is a best effort call and we can still shutdown the worker if it fails
+        match self.client.shutdown_worker(sticky_name, heartbeat).await {
+            Err(err)
+                if !matches!(


Bug: Empty sticky queue sent on shutdown

shutdown() now always calls shutdown_worker and uses unwrap_or_default() for sticky_name, which becomes an empty string when no sticky queue is used (e.g., max_cached_workflows == 0 or workflow polling disabled). If the server treats an empty sticky_task_queue as invalid when implemented, this can cause noisy warnings and failed shutdown signaling.

This is intentional: we want to always send shutdown_worker, not only when a sticky queue is in use.
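The behavior discussed in this thread (always call shutdown_worker, send an empty sticky queue name when none exists, and treat failures as non-fatal) can be sketched as below. This is a synchronous simplification for illustration; `ShutdownClient`, `RpcError`, `Worker`, and `RecordingClient` are hypothetical, and which errors the real code silently ignores is an assumption here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkerStatus {
    Running,
    ShuttingDown,
}

/// Hypothetical error type; the real client returns gRPC statuses.
#[derive(Debug, PartialEq, Eq)]
pub enum RpcError {
    Unimplemented,
    Other(String),
}

/// Hypothetical, synchronous stand-in for the client's shutdown RPC.
pub trait ShutdownClient {
    fn shutdown_worker(
        &mut self,
        sticky_task_queue: String,
        status: WorkerStatus,
    ) -> Result<(), RpcError>;
}

pub struct Worker<C> {
    pub client: C,
    pub sticky_queue: Option<String>,
    pub status: WorkerStatus,
}

impl<C: ShutdownClient> Worker<C> {
    /// Begin shutdown. Returns a warning message if the RPC failed in a way
    /// worth logging; shutdown proceeds regardless.
    pub fn shutdown(&mut self) -> Option<String> {
        // Status flips at the moment the RPC is sent.
        self.status = WorkerStatus::ShuttingDown;
        // No sticky queue means an empty string is sent.
        let sticky_name = self.sticky_queue.clone().unwrap_or_default();
        match self.client.shutdown_worker(sticky_name, self.status) {
            Ok(()) => None,
            // Assumption: servers without this RPC are tolerated quietly.
            Err(RpcError::Unimplemented) => None,
            Err(err) => Some(format!("shutdown_worker failed: {err:?}")),
        }
    }
}

/// Test double that records every call it receives.
#[derive(Default)]
pub struct RecordingClient {
    pub calls: Vec<(String, WorkerStatus)>,
    pub fail_with: Option<RpcError>,
}

impl ShutdownClient for RecordingClient {
    fn shutdown_worker(
        &mut self,
        sticky_task_queue: String,
        status: WorkerStatus,
    ) -> Result<(), RpcError> {
        self.calls.push((sticky_task_queue, status));
        match self.fail_with.take() {
            Some(err) => Err(err),
            None => Ok(()),
        }
    }
}
```

A worker with `sticky_queue: None` still produces exactly one recorded call, with an empty queue name and ShuttingDown status.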

Sushisource

Looking good to me (aside from it looks like a few tests need updating), just one minor thing

yuandrew · 2026-01-09T20:50:56Z

Will update the PR description with some new information

Some things to note:

WorkerStatus during shutdown is not accurate today. The ShutdownWorker RPC does not indicate that the worker has shut down, only that it has begun shutting down. Before this PR, we marked a worker as shut down even though it was still in the process of shutting down.
With the status now set accurately to ShuttingDown, there is no mechanism for a worker to tell the server that it has fully shut down. It is up to the server to mark it as fully shut down, using its own TTL definition. Today this defaults to 5 minutes, https://github.com/temporalio/temporal/blob/main/common/dynamicconfig/constants.go#L1401. This will be improved in the future to be a shorter interval, likely related to the heartbeat interval that the worker needs to start sending. This requires an API update to add this field to the heartbeat.

…#686)

**What changed?**
- Added a new `worker_instance_key` field to `ShutdownWorkerRequest`, as well as all other poll calls.
- Added comment to `ShutdownWorkerRequest.sticky_task_queue` saying it may be blank, now that we've expanded the scope of when `ShutdownWorkerRequest` is called.
- Added `task_queue` and `task_queue_kind` to `ShutdownWorkerRequest`

**Why?**
ShutdownWorker was changed to always be sent by SDK (temporalio/sdk-core#1082), so sticky queue name is now optional. This plus the new heartbeat info we send on shutdown means Server will now have a more accurate map of which workers are shutting down. Adding task queue and task_queue_kind should also allow us to fix a lost task issue, where there is a race when the SDK cancels an outstanding poll rpc and the server decides to send a task to that poller. Technically some of this info exists in the worker heartbeat part of the message, but it needs to be lifted to its own field due to the scenario where worker heartbeating is disabled.

**Breaking changes**
N/A I think, just adding new fields

**Server PR**


yuandrew added 2 commits December 16, 2025 17:30

Always send shutdown_worker RPC, fix WorkerStatus state when shutting down

756fa71

Split unregister worker into unregister_slot_provider and finalize_unregister

9d47601

yuandrew marked this pull request as ready for review December 17, 2025 19:46

yuandrew requested a review from a team as a code owner December 17, 2025 19:46

cursor bot reviewed Dec 17, 2025


check for unregistration order without mutating state, prevent leak

61591f8

cursor bot reviewed Dec 17, 2025


Sushisource approved these changes Dec 18, 2025


Comment thread crates/client/src/worker/mod.rs Outdated

yuandrew and others added 3 commits December 19, 2025 09:18

Remove duplicate check, fix test

19af7ab

Merge branch 'master' into worker-heartbeat-shutdown-status

1849b16

Merge branch 'master' into worker-heartbeat-shutdown-status

5221608

yuandrew merged commit d8a2bf1 into temporalio:master Jan 9, 2026
20 checks passed

yuandrew deleted the worker-heartbeat-shutdown-status branch January 9, 2026 21:44

yuandrew mentioned this pull request Jan 23, 2026

Enhance ShutdownWorkerRequest and poll calls with worker_instance_key temporalio/api#686

Merged
