Race condition between signal handler and workflow completing

I am seeing a race condition between activities started in a signal handler and the completion of the workflow. If I send a signal before the workflow ends, the signal is received and the activity runs, but retries of a failing activity stop as soon as the workflow completes.

Here is some example code:

import {
  defineSignal,
  setHandler,
  sleep,
  proxyActivities,
} from "@temporalio/workflow";
import type * as activities from "@api/_temporal/shared-activities";

const { failingActivityWithDelay } = proxyActivities<typeof activities>({
  startToCloseTimeout: "1 minute",
  retry: {
    maximumAttempts: 10,
  },
});

export const testFailSignal = defineSignal("testFailSignal");

export async function testWorkflow() {
  setHandler(testFailSignal, async () => {
    await failingActivityWithDelay(); // Fails after 5 seconds
  });

  await sleep(20 * 1000);
}

In the above code, the activity retries 4 times (4 * 5 seconds), but no further retries happen once the sleep completes. In addition, the workflow is marked as completed (not failed).

Once the workflow function returns, the workflow is completed and all of its activities are considered canceled, so they are not retried.

Use allHandlersFinished to ensure that the workflow function doesn’t return prematurely:

await workflow.condition(workflow.allHandlersFinished)
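
Applied to the example workflow from the question, the fix might look like the sketch below. It uses named imports of condition and allHandlersFinished from "@temporalio/workflow" instead of the workflow.* namespace form, and it assumes an SDK version recent enough to export allHandlersFinished. The inline Activities type is a hypothetical stand-in for the poster's own activities module, added only so the snippet is self-contained.

```typescript
import {
  allHandlersFinished,
  condition,
  defineSignal,
  proxyActivities,
  setHandler,
  sleep,
} from "@temporalio/workflow";

// Assumption: stand-in for the activity types imported in the original post.
type Activities = {
  failingActivityWithDelay(): Promise<void>;
};

const { failingActivityWithDelay } = proxyActivities<Activities>({
  startToCloseTimeout: "1 minute",
  retry: {
    maximumAttempts: 10,
  },
});

export const testFailSignal = defineSignal("testFailSignal");

export async function testWorkflow(): Promise<void> {
  setHandler(testFailSignal, async () => {
    // The handler stays "in flight" until this activity succeeds
    // or exhausts its retry policy.
    await failingActivityWithDelay();
  });

  await sleep(20 * 1000);

  // Do not return while any signal handler is still running; otherwise the
  // workflow completes and the handler's activity stops being retried.
  await condition(allHandlersFinished);
}
```

With this final wait in place, the workflow function only returns after the signal handler has finished, so the activity's retry policy gets to run its course.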

Thank you. I will add that to my base workflow function (or force our handlers to be synchronous and not call activities).

Feature suggestion: have all workflows wait by default, or raise an error when activities are called in handlers (or, at a minimum, mention this in the docs). I think it’s a bit of a hidden footgun currently. I am guessing lots of workflows have this bug if they are using signals.

Taylor,
Good point, and you’re right, lots of workflows have this bug.
We’ve made some recent progress on the docs front. See the “Handling Signals, Queries, & Updates” page in the Temporal Platform Documentation for an overview of things you need to know.