The two things that need validation across the entire DAG are:
1) Parameters - validation of the DAG parameters (supporting both key=value and JSON input)
2) Step outputs - there is currently no way to define an "input" for each step directly, so validation has to happen at the "output" stage. A rough sketch of both follows.
1) Parameter validation
Each parameter is interpreted as a field in a JSON schema, and the values are validated accordingly. Fields without defaults are treated as required. Supporting JSON input for parameters will be the next step.
params:
  - name: ENVIRONMENT
    type: string
    enum: [dev, staging, prod]
    default: dev
    description: Deployment environment
    required: true
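To make the idea concrete, here is a minimal sketch of the parameter check in plain Python. The spec keys (name, type, enum, default, required) mirror the YAML above; the `PARAM_SPECS` structure and `validate_params` function are hypothetical illustrations, not an existing API, and a real implementation would likely delegate to a full JSON Schema library.

```python
# Each declared parameter becomes a JSON-schema-like field spec.
PARAM_SPECS = [
    {"name": "ENVIRONMENT", "type": "string",
     "enum": ["dev", "staging", "prod"], "default": "dev",
     "description": "Deployment environment", "required": True},
]

def validate_params(specs, given):
    """Return the resolved parameter map, raising ValueError on violations."""
    resolved = {}
    for spec in specs:
        name = spec["name"]
        if name in given:
            value = given[name]
        elif "default" in spec:
            # fall back to the declared default when the parameter is omitted
            value = spec["default"]
        elif spec.get("required", False):
            raise ValueError(f"missing required parameter: {name}")
        else:
            continue
        # type and enum checks derived from the spec
        if spec.get("type") == "string" and not isinstance(value, str):
            raise ValueError(f"{name}: expected string, got {type(value).__name__}")
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError(f"{name}: {value!r} not in {spec['enum']}")
        resolved[name] = value
    return resolved

# key=value input such as ENVIRONMENT=staging parses to this dict:
print(validate_params(PARAM_SPECS, {"ENVIRONMENT": "staging"}))
# omitted parameter falls back to the default:
print(validate_params(PARAM_SPECS, {}))
```

An out-of-enum value like ENVIRONMENT=qa would raise a ValueError, which is the point where the DAG run should be rejected before any step starts.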
2) Output validation
Allow users to define an optional schema in the output field. If validation fails, the step fails. This would be useful for validating LLM output, for example.
steps:
  - command: python task.py
    output:
      name: OUTPUT
      schema:
        type: object
        properties:
          summary: { type: string }
          confidence: { type: number, minimum: 0.0, maximum: 1.0 }
        required: [summary, confidence]
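A rough sketch of the output check, covering only the JSON Schema subset used above (object type, per-property types, numeric minimum/maximum, required). The `check_output` helper and `SCHEMA` constant are illustrative names; a real implementation would probably use a full JSON Schema validator instead.

```python
import json

SCHEMA = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0.0, "maximum": 1.0},
    },
    "required": ["summary", "confidence"],
}

# JSON Schema type name -> Python type(s)
TYPES = {"object": dict, "string": str, "number": (int, float)}

def check_output(schema, data):
    """Return a list of violations; an empty list means the step may proceed."""
    if not isinstance(data, TYPES[schema["type"]]):
        return [f"expected {schema['type']}"]
    errors = []
    for key in schema.get("required", []):
        if key not in data:
            errors.append(f"missing required property: {key}")
    for key, sub in schema.get("properties", {}).items():
        if key not in data:
            continue
        value = data[key]
        # bool is excluded: JSON Schema does not treat true/false as numbers
        if not isinstance(value, TYPES[sub["type"]]) or isinstance(value, bool):
            errors.append(f"{key}: expected {sub['type']}")
            continue
        if "minimum" in sub and value < sub["minimum"]:
            errors.append(f"{key}: {value} < minimum {sub['minimum']}")
        if "maximum" in sub and value > sub["maximum"]:
            errors.append(f"{key}: {value} > maximum {sub['maximum']}")
    return errors

# e.g. stdout captured from `python task.py`
raw = '{"summary": "deploy ok", "confidence": 0.92}'
print(check_output(SCHEMA, json.loads(raw)))  # [] -> step succeeds
print(check_output(SCHEMA, {"summary": "x", "confidence": 1.5}))  # violation -> step fails
```

With this hook in place, an LLM step that emits malformed or low-structure output fails fast instead of silently passing bad data downstream.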