Skip to content

feat: Trust-system Arc 3 (first wave) — timed half-open circuit breaker + consolidation#172

Merged
hugocorreia90 merged 1 commit intomainfrom
feat/arc-3-resume-safety
Apr 20, 2026
Merged

feat: Trust-system Arc 3 (first wave) — timed half-open circuit breaker + consolidation#172
hugocorreia90 merged 1 commit intomainfrom
feat/arc-3-resume-safety

Conversation

@hugocorreia90
Copy link
Copy Markdown
Contributor

Summary

First wave of Arc 3 from the trust-system direction. Focused scope after advisor flagged rocky branch promote as a separate design exercise.

  • Consolidated + enhanced CircuitBreaker. Three-state machine (Closed/Open/HalfOpen). New RetryConfig.circuit_breaker_recovery_timeout_secs promotes the breaker from "manual-reset only" to auto-recovery: after timeout elapses while Open, the next check transitions to HalfOpen and lets one trial through. Success closes the breaker (emits circuit_breaker_recovered); failure re-opens (emits circuit_breaker_tripped again). CAS on both trip paths guarantees exactly one event per transition even under concurrent failing trials.
  • Adapter consolidation. rocky-databricks and rocky-snowflake drop their inline consecutive_failures: AtomicU32 and share Arc<CircuitBreaker> via RetryConfig::build_circuit_breaker(). Removes the orphan-module drift advisor surfaced — there's now one implementation. rocky-fivetran has no inline breaker (source adapter, retry shape different); rocky-bigquery likewise has no inline breaker.
  • Events surfaced; hook deferred. Adapters emit circuit_breaker_tripped / _recovered on transitions with adapter-name target + error class. TransitionOutcome return value decouples the breaker from rocky-observe — callers choose whether to publish. HookEvent::CircuitBreakerTripped is deferred to wave 2 since adapter-emitted events don't currently have a hook bridge (mirrors statement_retry's existing precedent).

Deferred to later Arc 3 waves

  • rocky branch promote <name> (atomic shadow→prod swap). Advisor flagged real deploy-semantics questions — cross-table atomicity, Unity Catalog rename privileges, dry-run output shape, behavior when branch/prod disagree on table set. Separate plan doc + dedicated PR.
  • Event→hook bridge so [hook.on_circuit_breaker_tripped] fires.
  • Transactional checkpoint atomicity audit (watermark + table_progress in one txn).

Test plan

  • cargo test -p rocky-core — 976 tests (9 new circuit_breaker::tests::* covering half-open transitions, concurrent-trial CAS, timeout-elapsed check).
  • cargo test -p rocky-databricks — 93 tests pass, wiremock suite clean.
  • cargo test -p rocky-snowflake — 98 tests pass.
  • cargo clippy --workspace --all-targets -- -D warnings clean.
  • cargo fmt --check clean.
  • just codegen — adapter_config + rocky_project schemas picked up the new circuit_breaker_recovery_timeout_secs field; Pydantic + TypeScript regenerated.
  • just regen-fixtures — no RunOutput changes, fixture corpus unchanged.
  • uv run pytest in integrations/dagster/ — 307 tests pass.

🤖 Generated with Claude Code

…er + adapter consolidation

- `rocky_core::circuit_breaker::CircuitBreaker` gains a three-state
  machine (Closed / Open / HalfOpen). New
  `with_recovery_timeout` constructor enables timed auto-recovery:
  after `circuit_breaker_recovery_timeout_secs` elapses while Open,
  the next `check` transitions to HalfOpen and lets one trial request
  through. Success closes the breaker, failure re-opens. Wired to
  `RetryConfig::build_circuit_breaker()` so every adapter picks it up.
- `rocky-databricks` and `rocky-snowflake` drop their inline
  `consecutive_failures: AtomicU32` and share the consolidated
  breaker via `Arc<CircuitBreaker>`. Replaces the "orphan breaker
  module" drift — there's now one implementation in the tree.
- Adapters emit `circuit_breaker_tripped` / `circuit_breaker_recovered`
  `PipelineEvent`s on state transitions. `TransitionOutcome` lets the
  caller forward the signal without the breaker depending on
  `rocky-observe`. `HookEvent::CircuitBreakerTripped` is intentionally
  deferred to wave 2 (needs event→hook bridge; parallels
  `statement_retry`'s existing adapter-event-without-hook precedent).

`rocky branch promote` (atomic shadow→prod swap) moved to its own plan
doc after advisor-surfaced deploy-semantics questions (cross-table
atomicity, UC rename privileges, dry-run output shape).
@hugocorreia90 hugocorreia90 merged commit a4b40fd into main Apr 20, 2026
14 of 15 checks passed
@hugocorreia90 hugocorreia90 deleted the feat/arc-3-resume-safety branch April 20, 2026 10:31
@hugocorreia90 hugocorreia90 mentioned this pull request Apr 20, 2026
3 tasks
hugocorreia90 added a commit that referenced this pull request Apr 20, 2026
Closes the first wave of every trust-system arc (Arcs 1-7) plus the two
wave-2 follow-ups landed the same day. Nine feature PRs since v1.10.0.

- Arc 1 (#170): rocky lineage --downstream, rocky branch, rocky run --branch, rocky replay
- Arc 2 (#171): per-run cost attribution, [budget] block, budget_breach hook
- Arc 3 (#172): three-state CircuitBreaker, adapter consolidation
- Arc 4 (#173): rocky trace Gantt + feature-gated OTLP metrics export
- Arc 5 (#174): schema-grounded rocky ai prompt + project-aware validator
- Arc 6 wave 1 (#184): --target-dialect P001 portability lint (12 constructs)
- Arc 7 wave 1 (#185): blast-radius P002 SELECT * lint (semantic-graph aware)
- Arc 6 wave 2 (#186): [portability] config block + per-model rocky-allow pragma
- Arc 7 wave 2 wave-1 (#187): --with-seed source-schema inference

Plus #169 fix: install scripts pick latest engine version by semver.

Version bump: 20 Cargo.toml files (all workspace members except
rocky-bigquery, which tracks its own version).

Wave 2/3 work for every arc remains in the deferred backlog — see
the changelog Deferred section for the full carry-forward.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant