Skip to content

feat(engine): unify CLI + LSP state_path resolution#238

Merged
hugocorreia90 merged 2 commits intomainfrom
feat/engine-state-path-unification
Apr 23, 2026
Merged

feat(engine): unify CLI + LSP state_path resolution#238
hugocorreia90 merged 2 commits intomainfrom
feat/engine-state-path-unification

Conversation

@hugocorreia90
Copy link
Copy Markdown
Contributor

Summary

Resolves the known follow-up from Arc 7 wave 2 wave-2 (PRs #223/#228/#230/#231/#232) called out in engine/CHANGELOG.md under 1.14.0.

The CLI default state_path (.rocky-state.redb in CWD) and the LSP / server default (<models>/.rocky-state.redb) diverged, so the schema-cache write tap (#230) persisted entries to CWD while the LSP read-path (#228) looked next to the models directory. Inlay-hint cache-hits never fired end-to-end for any project where the two locations weren't already co-located.

Fix

Add rocky_core::state::resolve_state_path(explicit, models_dir) as the single resolution helper for both halves of the binary, and wire it through every callsite (8 production sites total — CLI main.rs + watch.rs, and all four server-side callsites in rocky-server/{state,lsp,api,dashboard}.rs).

Policy (checked in order):

  1. Explicit --state-path — use verbatim, no warning.
  2. <models>/.rocky-state.redb exists — use it (canonical default).
  3. CWD .rocky-state.redb exists — use it with a one-time deprecation warning on stderr (legacy fallback).
  4. Both exist — CWD wins (preserves legacy watermarks / branches / partitions / run history) with a louder warning asking the user to reconcile. Merge is lossy — don't attempt it.
  5. Neither exists — fresh project; default to <models>/.rocky-state.redb.

What existing users see on upgrade

User layout Behavior on upgrade
CWD .rocky-state.redb only (legacy) Works. One rocky::state_path warning on stderr per invocation until the file is moved.
<models>/.rocky-state.redb only Works silently — this is now the canonical layout.
Both exist CWD wins (preserves existing state). Louder warning asking to reconcile.
Neither (fresh project) New default: <models>/.rocky-state.redb.
--state-path X explicit Works verbatim. No warning.
Dagster integration Unchanged — RockyResource always passes --state-path explicitly, so case 1 always applies. Dagster-only users who want the new layout update state_path in their resource config.

Why option (a) alone, not (a)+(b)

Option (a) unifies automatically with no user action required, no loss of existing state, and a clear nudge toward the new layout. A rocky state migrate verb (option b) would also work but would pull in the codegen + dagster + vscode cascade (new *Output struct, new resource method, new vscode command) for essentially zero user-visible benefit over the silent-auto-fallback — the CWD file keeps working verbatim. I'd rather ship the fix tight and add a migration verb as a follow-up if anyone actually asks for it.

Caveats

  • The top-level main.rs resolves against ./models (the convention shared with the LSP's <root>/models assumption). Per-command --models <custom> overrides on rocky run, rocky compile, etc. intentionally don't feed back into state-path resolution — the state file lives with the project, not with a one-shot --models flag.
  • The LSP logs the deprecation warning at debug!, not warn! — the LSP doesn't drive the file location, so it shouldn't nag via log output. Legacy-layout users only see the nudge when they invoke the CLI.

Test plan

  • cargo build --release — passes.
  • cargo test — full suite, no failures. Five new resolver tests in rocky-core/src/state.rs: resolve_state_path_{explicit_wins,prefers_models_dir_when_only_it_exists,falls_back_to_cwd_with_warning,both_prefer_cwd_but_warn_louder,fresh_project_picks_models_dir}.
  • cargo clippy --all-targets --all-features -- -D warnings — clean.
  • cargo fmt --all -- --check — clean.
  • Manual smoke test against the release binary in /tmp/rocky-smoke:
    • CWD-only layout emits the using legacy state file warning on stderr.
    • Models-dir-only layout is silent.
    • Both-exist layout emits the louder two Rocky state files detected warning.

Follow-ups (not in this PR)

  • rocky state migrate verb — a guided copy from CWD → <models>/.rocky-state.redb if operators ask for it.
  • Consider escalating the LSP's debug log to an LSP window/showMessage notification so editor users see the deprecation nudge without digging through the LSP trace.

The CLI default state_path (`.rocky-state.redb` in CWD) and the LSP /
server default (`<models>/.rocky-state.redb`) diverged after Arc 7 wave
2 wave-2. The schema-cache write tap (PR #230) persisted entries to
CWD, while the LSP read-path (PR #228) looked next to the models
directory — so inlay-hint cache-hits never fired end-to-end for any
project where the two locations weren't already co-located.

Add `rocky_core::state::resolve_state_path(explicit, models_dir)` as
the single resolution point for both halves of the binary:

1. Explicit `--state-path` wins verbatim.
2. `<models>/.rocky-state.redb` exists — use it (canonical default).
3. CWD `.rocky-state.redb` exists — use it with a one-time
   deprecation warning on stderr (legacy fallback; protects existing
   watermarks, branches, partitions, run history).
4. Both exist — CWD wins (preserves legacy state) with a louder
   warning asking the user to reconcile. Merge is lossy.
5. Neither exists (fresh project) — default to
   `<models>/.rocky-state.redb`, no warning.

Wire the helper through main.rs (single top-level resolution),
commands/watch.rs, and all four rocky-server callsites (state.rs,
lsp.rs, api.rs, dashboard.rs). Passing `--state-path` explicitly
remains a hard override, so the Dagster integration — which always
passes an explicit path — is unchanged.

Five resolver unit tests cover every branch (explicit / models-dir /
CWD-fallback / both-exist / fresh). Smoke-tested end-to-end against
the release binary: the warning lands on stderr; the models-dir case
is silent; the both-exist case emits the louder warning.
Case 5 of `resolve_state_path` returned `<models>/.rocky-state.redb`
unconditionally on fresh projects. For replication-only and
quality-only pipelines (and several POCs, e.g.
`02-performance/06-schema-drift-recover`) there is no `models/`
directory at all, so the next `rocky run` failed trying to open
the state lock file at a path whose parent doesn't exist:

    Error: failed to open state store at models/.rocky-state.redb
      i/o error opening state lock file at models/.rocky-state.redb.lock:
      No such file or directory (os error 2)

That crash surfaced as a `just regen-fixtures` normalizer failure
in the codegen-drift CI workflow for PR #238 — the `drift/run_clean`
capture emitted empty stdout and the JSON normalizer then errored on
`Expecting value: line 1 column 1 (char 0)`.

Refine the resolver to check `models_dir.is_dir()`:

- Case 5 (fresh project): default to `<models>/.rocky-state.redb`
  only when `models_dir` exists; otherwise fall back to CWD.
- Case 3 (legacy CWD state): emit the migration-nudge warning only
  when `models_dir` is a real directory. Without one there's
  nowhere to move the file to, so the warning would be noise.

The LSP only attaches to projects with `.rocky` files (i.e.
projects that have a models dir by definition), so the no-models
fallback path has no CLI-vs-LSP divergence to unify.

Two new unit tests pin the behaviour — fresh-project-without-models
and CWD-state-without-models. Verified locally: the drift POC run
now emits valid JSON, and `just regen-fixtures` completes with
zero drift against the committed `fixtures_generated/`.
Comment thread engine/crates/rocky-core/src/state.rs Dismissed
Comment thread engine/crates/rocky-core/src/state.rs Dismissed
Comment thread engine/crates/rocky-core/src/state.rs Dismissed
@hugocorreia90 hugocorreia90 merged commit db926fe into main Apr 23, 2026
13 checks passed
@hugocorreia90 hugocorreia90 deleted the feat/engine-state-path-unification branch April 23, 2026 16:29
hugocorreia90 added a commit that referenced this pull request Apr 23, 2026
* chore: release engine-v1.16.0 + dagster-v1.12.0 + vscode-v1.8.0

Bundles the governance waveplan — five merged PRs (#240 audit trail,
#241 classification + masking, #242 rocky compliance, #243 role-graph,
#244 retention) on top of three FR-004 / state-path follow-ups
(#237 error-path idempotency, #238 state-path unification,
#239 success-path idempotency finalize).

Version bumps: engine 1.15.0 → 1.16.0, dagster-rocky 1.11.0 → 1.12.0,
vscode extension 1.7.0 → 1.8.0.

CHANGELOGs updated for all three artifacts.

* chore(dagster): regen test fixtures for 1.16.0

Fixture drift flagged by CI (`codegen-drift.yml`). Fixtures are captured
from the live engine binary — the version-string bump to 1.16.0 ripples
through every `version` field, and the Wave A audit-trail work (#240)
adds the 8 `RunRecord` fields to `rocky history` output, which the
playground POC now emits.

Regenerated via `just regen-fixtures` against
`examples/playground/pocs/00-foundations/00-playground-default`.

* chore(scripts): sentinel top-level version field in fixture normaliser

Every CLI output's top-level `version` is `env!("CARGO_PKG_VERSION")`
at emit time, so every engine version bump rippled through all 38
captured fixtures — every release PR fought `codegen-drift.yml` until
`just regen-fixtures` was re-run.

Extend the existing `AUDIT_FIELD_SENTINELS` set (Wave A already
sentineled the audit-trail `rocky_version` field + hostname / git
commit / etc.) with the top-level `version` key → `"0.0.0-SENTINEL"`.
After this, version bumps only touch Cargo.toml / pyproject.toml /
package.json / CHANGELOGs — never fixtures.

Regen captured all 38 fixtures; top-level `version` now uniformly
renders as `"0.0.0-SENTINEL"`.
hugocorreia90 added a commit that referenced this pull request Apr 24, 2026
…246)

Root cause of the recurring codegen-drift CI failures: PR #238 moved the
default state path from `<CWD>/.rocky-state.redb` to
`<models>/.rocky-state.redb`, but `regen_fixtures.sh` still only cleaned
the old CWD path. Local regens accumulated runs in
`<POC>/models/.rocky-state.redb` between invocations while CI started
from a fresh checkout with no prior runs — producing divergent fixture
content depending on how many `rocky run` invocations had happened
locally before the capture.

Concretely this surfaced on `optimize.json`:
- Local (stale state): `"fast execution (0.0s)..."` (5+ runs in state)
- CI (clean): `"insufficient history: 2 runs (need 5)"`

Fix: extend every `rm -f .rocky-state.redb` to also clean
`models/.rocky-state.redb`. Defense-in-depth keeps the old path too so
any environment that still writes there cleans up identically.

Regen from clean state captures the deterministic 2-run output for
`optimize.json` and the corresponding smaller `history.json`. These
match what CI sees end-to-end; `history.json` in particular will stay
stable across future regens.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants