GAIA

Explore generative AI for autonomy: video generation models as world simulators

Generative AI world model

Generative world models are transforming how AI systems learn, simulate, and reason about the real world. These models allow AI to build rich, predictive representations, similar to human mental models, that support better decision-making and safer interactions in complex, dynamic environments.

GAIA-2, our latest generative world model for autonomy, significantly expands the capabilities of our original GAIA-1 model. GAIA-2 pushes the boundaries of synthetic data generation with enhanced controllability, expanded geographic diversity, and broader vehicle representation. Unlike general-purpose generative models, GAIA-2 is purpose-built to navigate the complexities of driving—handling multiple camera viewpoints, diverse road conditions, and critical corner cases. By offering fine-grained control over key driving factors, GAIA-2 empowers engineers and researchers to create richer, more realistic training scenarios, accelerating the path to safer and more robust autonomy.

Multimodal model

GAIA-2 takes video, text, and action inputs to produce realistic driving videos while providing precise control over ego-vehicle behavior and scene features. Because it is multimodal, GAIA-2 can generate videos from any of these prompt modalities individually or in combination.
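Conceptually, multimodal prompting amounts to assembling whichever conditioning signals are available into a single request. The sketch below is an illustration of that idea only; the function name and field names are assumptions, not Wayve's actual API.

```python
# Illustrative sketch (not Wayve's API): combine optional prompt
# modalities into one conditioning dict for a video generator.
from typing import Optional

def build_conditioning(video_context: Optional[list] = None,
                       text_prompt: Optional[str] = None,
                       ego_actions: Optional[list] = None) -> dict:
    """Collect whichever modalities are provided; at least one is required."""
    conditioning = {}
    if video_context is not None:
        conditioning["video"] = video_context   # past camera frames
    if text_prompt is not None:
        conditioning["text"] = text_prompt      # e.g. "rainy night, urban street"
    if ego_actions is not None:
        conditioning["action"] = ego_actions    # e.g. (speed, curvature) sequence
    if not conditioning:
        raise ValueError("Provide at least one prompt modality.")
    return conditioning

# Any single modality, or any combination, forms a valid prompt:
cond = build_conditioning(text_prompt="rainy night, urban street",
                          ego_actions=[(5.0, 0.0), (5.5, 0.1)])
```

The same pattern covers unconditional generation, text-only prompting, or fully specified prompts, which is what makes a multimodal model flexible for synthetic data pipelines.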

GAIA’s true marvel is its ability to manifest the underlying rules of our world.

GAIA’s capabilities

GAIA’s deep understanding of driving and language allows it to accurately interpret prompts and generate detailed driving videos. These videos encompass diverse traffic scenarios, specific types of motion, accurate depictions of time of day and weather conditions, and realistic interactions between vehicles and other road users.

This provides Wayve with a versatile and powerful synthetic tool to advance the training and validation of safer and more intelligent autonomous systems.

Fine-Grained Scenario Customization

GAIA-2 offers precise control over key driving factors like ego-vehicle behavior, weather, lighting, and road configurations, enabling realistic and customizable simulations.
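A scenario specification with this kind of control can be pictured as a small config object. Every field name and value below is a hypothetical illustration of the controllable factors listed above, not GAIA-2's real interface.

```python
# Hypothetical scenario config illustrating the controllable driving
# factors; field names and values are assumptions, not GAIA-2 parameters.
from dataclasses import dataclass

@dataclass
class ScenarioConfig:
    ego_behavior: str = "lane_keep"      # e.g. "lane_keep", "overtake", "emergency_brake"
    weather: str = "clear"               # e.g. "clear", "rain", "fog", "snow"
    lighting: str = "day"                # e.g. "day", "dusk", "night"
    road_layout: str = "urban_two_lane"  # e.g. junction, roundabout, highway
    region: str = "UK"                   # UK, US, or Germany per the text

# A rainy-night overtake on a German highway:
scenario = ScenarioConfig(ego_behavior="overtake", weather="rain",
                          lighting="night", road_layout="highway",
                          region="Germany")
```

Sweeping such parameters independently is what turns a single recorded drive into a family of customizable training scenarios.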


Geographic and Environmental Diversity

It generates synthetic driving scenarios across varied regions (UK, US, Germany) and conditions, covering urban, suburban, and highway environments with diverse weather and time-of-day settings.

Multi-Camera, High-Fidelity Synthesis

Utilizing a latent diffusion framework, GAIA-2 delivers stable, long-horizon video sequences with spatiotemporal consistency across multiple camera viewpoints.
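At a high level, a latent diffusion model denoises a single latent tensor spanning all camera views and frames, so each denoising step updates every viewpoint jointly. The toy loop below shows that generic sampling pattern with a stand-in denoiser; the shapes, schedule, and update rule are simplified assumptions, not GAIA-2's architecture.

```python
# Toy diffusion-style sampling over a joint multi-camera latent tensor.
# The denoiser, shapes, and schedule are stand-ins, not GAIA-2's design.
import numpy as np

def fake_denoiser(z, t):
    """Stand-in for the learned network: predicts the noise present in z at step t."""
    return 0.1 * z  # a real model would also condition on video/text/action prompts

def sample_latents(steps=50, cameras=5, frames=8, latent_dim=16, seed=0):
    rng = np.random.default_rng(seed)
    # One latent spans all views and frames, so every denoising step refines
    # all cameras together -- the source of cross-view consistency.
    z = rng.standard_normal((cameras, frames, latent_dim))
    for t in range(steps, 0, -1):
        eps = fake_denoiser(z, t)
        z = z - eps / steps  # simplified update; real samplers rescale by a noise schedule
    return z  # in latent diffusion, a separate decoder maps latents to pixels

latents = sample_latents()
```

Working in latent space rather than pixel space is what makes long-horizon, multi-camera sequences tractable to generate.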

Simulation of Safety-Critical Edge Cases

It can systematically generate rare and high-risk scenarios—such as sudden cut-ins and emergency maneuvers—to improve testing and validation of autonomous driving systems.

Videos below: Safety-critical scenarios that showcase our ability to manipulate the ego-vehicle’s state, influence other agents, and generate states that lie entirely outside the training distribution to create hazardous conditions for training and validation.


Publications

26 Mar 2025
GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving
29 Sep 2023
GAIA-1: A Generative World Model for Autonomous Driving