fix(dagster): propagate ValidationError, dev-suffix versions, sanitize pr_number, parallelize lineage#289
Merged
hugocorreia90 merged 1 commit intomainfrom Apr 29, 2026
Merged
Conversation
…e pr_number, parallelize lineage
P1 maintenance-sweep cluster, four independent fixes in one PR:
* `_compile_payload` / `_optimize_payload` now propagate Pydantic
`ValidationError` as a structured `dg.Failure` instead of swallowing
it. Engine schema breaks no longer make checks/freshness vanish
silently — operators see a clear "schema mismatch — built against a
different rocky binary" message.
* `_verify_engine_version` strips pre-release / build suffixes
(`-dev`, `-pre`, `-rc.N`, `+sha.<hash>`) before semver compare so
dev builds like `1.17.4-dev` no longer skip the gate via a silent
`int("4-dev")` parse failure.
* `branch_deploy_shadow_suffix` now validates `pr_number` is a
positive integer string before interpolating it into a SQL identifier
suffix; non-numeric input falls through to the (already-sanitized)
deployment-name branch instead of being inlined raw.
* `_collect_lineage` runs per-model `rocky lineage` subprocesses through
a bounded `ThreadPoolExecutor` (max 8 workers) so a code-server cold
start with N models pays roughly one wall-clock latency, not N times
it. The CLI takes a single `--target`, so batching into one call
isn't an option; thread-pool fan-out is the smaller refactor.
5 tasks
hugocorreia90
added a commit
that referenced
this pull request
Apr 29, 2026
Engine 1.18.0 ships the rocky preview workflow end-to-end (#279, #280, #281, #282), the [budget].max_bytes_scanned threshold (#288), the audit-sweep closeout (#283, #285–#287, #290–#293), and the rocky-server auth + CORS gate (#291). Dagster 1.15.0 picks up the regenerated Pydantic models for the rocky preview surface and ships the P1 cluster (#289) + FR-014 follow-on (#284). VS Code 1.10.0 regenerates TypeScript bindings for rocky preview and RunCostSummary.total_bytes_scanned. See per-artifact CHANGELOG entries for the full breakdown.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Four independent P1 fixes from the dagster-rocky maintenance sweep, landed as a single PR (no shared surface, but all small).
_compile_payload/_optimize_payloadre-raise PydanticValidationErroras a structureddg.Failure("Engine output schema mismatch — built against a different rocky binary version"). Previously swallowed silently, which made checks/freshness vanish without a signal whenever the engine and dagster-rocky drifted in lockstep cadence. Note: these helpers live incomponent.py, notresource.pyas the audit's pointer suggested._verify_engine_versionstrips-dev,-pre,-rc.N,+sha.<hash>before the semver tuple compare. Without this,int(\"4-dev\")fails the parse and the gate silently skips — the exact failure mode the audit flagged for1.17.4-dev.pr_numberinjection hardening.branch_deploy_shadow_suffixnow validatespr_numberis a positive integer string before interpolating into a SQL identifier suffix. Untrusted input (e.g.\"42; DROP TABLE\") falls through to the (already-sanitized) deployment-name branch instead of being inlined raw._collect_lineagefans per-modelrocky lineagesubprocesses across a boundedThreadPoolExecutor(max 8 workers). The CLI takes a single--targetso batching into one call isn't viable; thread-pool fan-out is the smaller refactor and gives ~Nx cold-start speedup with a memory cap.Out of scope (follow-up)
_dag_payloadincomponent.pyhas the sameexcept Exception → return Noneshape as the slot helpers and would swallow aValidationErrorthe same way. Audit only asked for compile/optimize, so leaving as a known follow-up.Test plan
cd integrations/dagster && uv run pytest— 487 passedcd integrations/dagster && uv run ruff check— All checks passedcd integrations/dagster && uv run ruff format --check— 47 files already formattedtest_write_state_compile_slot_propagates_validation_error(P1.1)test_write_state_compile_slot_propagates_dg_failure_caused_by_validation_error(P1.1, transitive case)test_write_state_optimize_slot_propagates_validation_error(P1.1)ValidationErrorfrom the swallowed-exceptions matrix (P1.1)test_verify_version_strips_pre_release_suffixcovering-dev,-pre,-rc.N,-alpha.2,+sha.<hash>(P1.2)test_verify_version_dev_suffix_below_minimum_fails— dev builds below MIN still gate (P1.2)pr_number=\"42; DROP TABLE users\",pr_number=\"42'--\",pr_number=\"-1\",pr_number=\"0\",pr_number=\" 42 \", non-numeric without deployment fallback, large valid (P1.3)test_surface_column_lineage_runs_in_parallel— observes overlapping in-flight lineage calls via instrumented stub (P1.4)