ApiInstanceState: Make a separate state for "rebooting" #118

Merged
smklein merged 5 commits into main from reboot-separate on Jun 8, 2021

@smklein smklein commented Jun 8, 2021

This is the second in a series of "lessons learned" regarding the ApiInstanceState's treatment of rebooting.

Some History

Originally, the structure appeared as follows:

pub enum ApiInstanceState {
    Creating,
    Starting,
    Running,
    Stopping,
    Stopped,
    Repairing,
    Failed,
    Destroyed,
}

pub struct ApiInstanceRuntimeState {
    pub run_state: ApiInstanceState,
    pub reboot_in_progress: bool,
    ...
}

In #88, it was identified that the reboot_in_progress value, although present for every state, was only relevant to the "stopping" and "stopped" states. The enum was updated so that those two variants carry the flag themselves, removing the need for the separate reboot_in_progress bool:

pub enum ApiInstanceState {
    ...
    Stopping { rebooting: bool },
    Stopped { rebooting: bool },
    ...
}
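
As a rough illustration of what consuming this shape looks like, here is a minimal sketch. The enum below is a trimmed, hypothetical copy containing only the variants discussed here, and `reboot_in_progress` is a helper name invented for this example, not something taken from the codebase.

```rust
/// Trimmed, hypothetical copy of the #88 shape; the real enum has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApiInstanceState {
    Running,
    Stopping { rebooting: bool },
    Stopped { rebooting: bool },
}

/// Hypothetical helper: recovers what the old `reboot_in_progress` bool
/// expressed, now that the flag lives on only two of the variants.
pub fn reboot_in_progress(state: &ApiInstanceState) -> bool {
    matches!(
        state,
        ApiInstanceState::Stopping { rebooting: true }
            | ApiInstanceState::Stopped { rebooting: true }
    )
}
```

Callers have to inspect both the variant and its payload to answer a single question, which is the part that translates poorly outside Rust.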

The Problem Currently Being Faced

As identified by #117, this pattern, while ergonomic in Rust, causes complications for other languages, namely TypeScript, where the generated bindings are fairly non-intuitive (and, as discussed, would still risk confusing developers even if the "right" bindings were generated). TL;DR: heterogeneous Rust enum variants are probably too painful for dynamic languages to handle gracefully.

Additionally, while integrating Propolis and Omicron, an impedance mismatch was identified:

  • Sled Agent considered the "reboot" command to be a multi-step operation from "running" -> "stopping" -> "stopped", followed by an explicit command to return to "running", via "stopped" -> "starting" -> "running".
  • Propolis, on the other hand, has first-class support for "rebooting", and only needs one command to successfully reboot.

This difference leaves some room to separate the "stopping", "stopped", and "rebooting" states:

  • Stopping can identify that the instance is proceeding to the Stopped state, with the expectation to remain stopped until either another command is issued or some failure occurs.
  • Stopped can identify that the instance is not executing, with the expectation to remain stopped until either another command is issued or some failure occurs.
  • Rebooting can identify that the instance is in the process of stopping, but with the expectation that it will immediately re-enter the Starting state (effectively skipping "Stopped") unless some failure occurs.
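
One way to read the three descriptions above is as a table of "where does the instance end up if nobody issues another command and nothing fails". The sketch below encodes that reading. `State`, `expected_next`, and `settle` are hypothetical names for this example only, and the enum is a simplified stand-in for ApiInstanceState.

```rust
/// Simplified, hypothetical stand-in for ApiInstanceState.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Starting,
    Running,
    Stopping,
    Stopped,
    Rebooting,
}

/// The state an instance is expected to reach next, assuming no further
/// command and no failure. `None` means it is expected to stay where it is.
pub fn expected_next(state: State) -> Option<State> {
    match state {
        State::Starting => Some(State::Running),
        State::Running => None,
        // Stopping proceeds to Stopped and then stays there.
        State::Stopping => Some(State::Stopped),
        State::Stopped => None,
        // Rebooting skips Stopped and re-enters Starting.
        State::Rebooting => Some(State::Starting),
    }
}

/// Follows `expected_next` until the instance is expected to stay put,
/// returning every state visited along the way (including the first).
pub fn settle(start: State) -> Vec<State> {
    let mut path = vec![start];
    let mut current = start;
    while let Some(next) = expected_next(current) {
        path.push(next);
        current = next;
    }
    path
}
```

Under these assumptions, `settle(State::Rebooting)` passes through Starting and ends at Running without ever visiting Stopped, while `settle(State::Stopping)` ends at Stopped.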

Proposal

This PR proposes the following modification to ApiInstanceState, based on the distinction between the three states:

pub enum ApiInstanceState {
    ...
    Stopping,
    Stopped,
    Rebooting,
    ...
}

This change aligns Omicron more closely with the actual interface provided by Propolis, and also simplifies the handling of API states by end-users (and dynamic languages).
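
To make the simplification concrete, the following sketch maps the #88 shape onto the flattened one and gives each flattened variant a single string label. Everything here is illustrative: `OldState`, `NewState`, `flatten`, and `label` are hypothetical names, the enums are trimmed to the relevant variants, the lowercase labels are not the actual wire format, and folding `Stopped { rebooting: true }` into `Rebooting` is an assumption rather than something this PR states.

```rust
/// Trimmed, hypothetical copy of the #88 shape.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OldState {
    Running,
    Stopping { rebooting: bool },
    Stopped { rebooting: bool },
}

/// Trimmed, hypothetical copy of the flattened shape proposed here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NewState {
    Running,
    Stopping,
    Stopped,
    Rebooting,
}

/// Maps the data-carrying variants onto the flat ones.
pub fn flatten(old: OldState) -> NewState {
    match old {
        OldState::Running => NewState::Running,
        OldState::Stopping { rebooting: false } => NewState::Stopping,
        OldState::Stopped { rebooting: false } => NewState::Stopped,
        // Assumption: either "rebooting: true" case is treated as Rebooting,
        // since a rebooting instance is not expected to rest in Stopped.
        OldState::Stopping { rebooting: true } | OldState::Stopped { rebooting: true } => {
            NewState::Rebooting
        }
    }
}

/// With no payloads, every state corresponds to exactly one plain string.
/// These labels are illustrative, not the real serialized form.
pub fn label(state: NewState) -> &'static str {
    match state {
        NewState::Running => "running",
        NewState::Stopping => "stopping",
        NewState::Stopped => "stopped",
        NewState::Rebooting => "rebooting",
    }
}
```

A client in a dynamic language can then compare the state against one string, rather than checking a tag and a nested boolean.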

Notes

@smklein smklein requested review from bnaecker and david-crespo June 8, 2021 05:14

@david-crespo david-crespo left a comment

looks good, solves my problem in #117

@smklein smklein merged commit c73a4ef into main Jun 8, 2021
@smklein smklein deleted the reboot-separate branch June 8, 2021 19:30

@bnaecker bnaecker left a comment

This looks great, thanks for handling that. It's interesting: the data-carrying enum variant is more idiomatic Rust, but this "flattened" version is simpler for both the generated client and the database itself. Maybe it's just the mismatch between the Propolis and sled-agent state machines manifesting itself. But looks great!

Development

Successfully merging this pull request may close these issues.

Problem with structs in ApiInstanceState enum