Deploy environments to the platform. Run evals at scale.
You’ve built an environment with tools and scenarios. Deploy it to the platform and you can run evals at scale—hundreds of parallel runs across models, all traced, all generating training data.
The simplest path. One command builds and deploys your environment directly to HUD:
hud deploy
This:
Packages your build context (respects .dockerignore)
Uploads to HUD’s build service
Builds remotely via AWS CodeBuild
Streams logs in real-time
Links this directory to the deployed environment
Once complete, your environment appears on the platform:
See your environment’s tools, scenarios, and builds at hud.ai/environments. For full details on managing environments through the platform UI, see Platform Environments.
Started with hud deploy but want GitHub integration later? Just connect the same repo on the platform. HUD links them by environment ID.Going the other way? Use hud sync env to connect a local directory to an existing platform environment:
hud sync env my-env-name
This links your local directory and verifies your scenarios match the deployed environment.
Every HUD image supports scenario operations via hud scenario. Setup and grading are shell commands; agents interact with tools via the MCP server at :8080/mcp.
The default Dockerfile CMD uses --stdio for the HUD platform. For external use, override the command to start an HTTP server:
# Build and push the image to a registryhud build .docker tag my-env:latest <your-registry>/my-env:latestdocker push <your-registry>/my-env:latest# Start the environment with HTTP server (overrides default stdio CMD)docker run -d --name my-env -p 8080:8080 my-image:latest \ hud dev env:env --port 8080# List available scenariosdocker exec my-env hud scenario list# Setup a scenario (prints the prompt)docker exec my-env hud scenario setup count \ --args '{"text": "strawberry", "letter": "r"}'# Your agent runs against MCP tools at localhost:8080/mcp# Grade (prints reward as JSON)docker exec my-env hud scenario grade count --answer "3"# Test graders without an agent (setup + grade in one shot)docker exec my-env hud scenario run count \ --args '{"text": "mississippi", "letter": "s"}' --answer "4"