Skip to content

fix(runtime): drop service before sys exit#2109

Merged
jonaro00 merged 1 commit intomainfrom
rt-drop-service
Sep 9, 2025
Merged

fix(runtime): drop service before sys exit#2109
jonaro00 merged 1 commit intomainfrom
rt-drop-service

Conversation

@jonaro00
Copy link
Member

@jonaro00 jonaro00 commented Sep 9, 2025

Allows services and other internals to Drop before the runtime calls exit()

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR refactors the runtime exit handling to enable proper resource cleanup before process termination. The changes modify the rt::start function to return exit codes instead of calling std::process::exit() directly, allowing the caller in start.rs to handle cleanup operations before terminating the process.

The main change involves updating the function signature of rt::start from pub async fn start(...) to pub async fn start(...) -> i32, replacing all internal exit() calls with return statements that preserve the same exit codes (ranging from 0 for success to 202 for various failure scenarios). The start.rs module is updated to capture this exit code and call std::process::exit() explicitly.

This change addresses a fundamental issue in the Shuttle runtime where services and their associated resources (network connections, file handles, background tasks) weren't being properly cleaned up before process termination. In Rust, proper resource cleanup often relies on Drop trait implementations, which wouldn't run when exit() is called directly. By returning control to the caller first, the runtime now allows for orderly shutdown procedures.

The exit codes maintain their semantic meaning for debugging and monitoring purposes - for example, 101 indicates secret loading failures, 111 for loader failures, 121 for provisioning failures, etc. This preserves the runtime's ability to communicate specific failure reasons to process managers and deployment environments.

Important Files Changed

Changed Files
Filename Score Overview
runtime/src/rt.rs 3/5 Refactored start function to return exit codes instead of calling exit() directly, with inconsistent exit() calls remaining in health check task
runtime/src/start.rs 4/5 Updated to capture exit code from rt::start and explicitly call std::process::exit, enabling proper cleanup

Confidence score: 3/5

  • This PR addresses an important resource management issue but has implementation inconsistencies that need attention
  • Score reflects the correct approach to the problem but incomplete refactoring in the health check server spawned task
  • Pay close attention to runtime/src/rt.rs lines 101, 107, and 127 where exit() calls still bypass the cleanup mechanism

Sequence Diagram

sequenceDiagram
    participant User
    participant Runtime as "Runtime (start.rs)"
    participant RT as "RT Module (rt.rs)"
    participant API as "Shuttle API Client"
    participant Service as "User Service"
    participant System as "System"

    User->>Runtime: "Start Shuttle app"
    Runtime->>Runtime: "Check --version flag"
    alt version check
        Runtime-->>User: "Print version and exit"
    end
    
    Runtime->>Runtime: "Parse args and env vars"
    alt validation fails
        Runtime-->>User: "Print error and hint"
        Runtime->>System: "Return (no exit)"
    end
    
    Runtime->>Runtime: "Setup tracing/logging"
    Runtime->>RT: "start(loader, runner)"
    
    RT->>RT: "Parse RuntimeEnvVars"
    RT->>API: "Create ShuttleApiClient"
    
    opt health check enabled
        RT->>RT: "Spawn health check server"
    end
    
    RT->>API: "get_secrets(project_id)"
    API-->>RT: "Return secrets"
    RT->>RT: "Create ResourceFactory"
    RT->>RT: "loader.load(factory)"
    
    loop for each Shuttle resource
        RT->>API: "provision_resource(project_id, resource)"
        API-->>RT: "Return resource state"
        alt state is Provisioning/Authorizing
            RT->>RT: "Sleep 2 seconds"
        else state is Ready
            RT->>RT: "Update resource bytes"
        else bad state
            RT-->>Runtime: "Return error code 132"
        end
    end
    
    opt in Shuttle environment
        RT->>API: "Send sidecar shutdown request"
    end
    
    RT->>RT: "runner.run(resources)"
    RT-->>RT: "Get service instance"
    
    alt runner fails
        RT-->>Runtime: "Return error code 151"
    end
    
    RT->>Service: "service.bind(service_addr)"
    
    par Service execution
        Service->>Service: "Handle requests"
    and Signal handling
        RT->>RT: "Listen for SIGTERM/SIGINT/etc"
    end
    
    alt Signal received
        RT-->>Runtime: "Return exit code 10"
    else Service terminates normally
        RT-->>Runtime: "Return exit code 0"
    else Service error
        RT-->>Runtime: "Return exit code 1"
    end
    
    Runtime->>Service: "Drop service (implicit)"
    Runtime->>System: "std::process::exit(exit_code)"
Loading

2 files reviewed, no comments

Edit Code Review Bot Settings | Greptile

@jonaro00 jonaro00 merged commit c83c0d4 into main Sep 9, 2025
23 checks passed
@jonaro00 jonaro00 deleted the rt-drop-service branch September 9, 2025 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant