VBVR-DataFactory/G-45_key_door_matching_data-generator

G-45: Key Door Matching Data Generator

Generates synthetic datasets for training and evaluating vision models on symbolic matching and navigation tasks. Each sample contains a maze where an agent must find a specific colored key and navigate to the matching colored door.

Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.


📌 Basic Information

| Property | Value |
|----------|-------|
| Task ID | G-45 |
| Task | Key Door Matching |
| Category | Spatiality |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | ~6-7 seconds |
| Output | PNG images + MP4 video |

🚀 Usage

Installation

# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-45_key_door_matching_data-generator.git
cd G-45_key_door_matching_data-generator

# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

Generate Data

# Generate 50 samples
python examples/generate.py --num-samples 50

# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset

# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42

# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

Command-Line Options

| Argument | Description |
|----------|-------------|
| --num-samples | Number of tasks to generate (required) |
| --output | Output directory (default: data/questions) |
| --seed | Random seed for reproducibility |
| --no-videos | Skip video generation (images only) |
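
When generation is driven from another script, the options above can be assembled programmatically. The following is a minimal sketch, assuming the command is run from the repository root; `build_generate_command` is a hypothetical helper that is not part of this repository, and it uses only the flags documented above.

```python
import sys


def build_generate_command(num_samples, output=None, seed=None, no_videos=False):
    """Assemble the argument list for examples/generate.py.

    Hypothetical helper: only the documented flags are used.
    """
    cmd = [sys.executable, "examples/generate.py", "--num-samples", str(num_samples)]
    if output is not None:
        cmd += ["--output", str(output)]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    if no_videos:
        cmd.append("--no-videos")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Building a list rather than a shell string avoids quoting problems when the output path contains spaces.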

📖 Task Example

Prompt

The scene shows a maze with a green circular agent, colored diamond-shaped keys, and colored hollow rectangular doors. Find the Red key and then navigate to the matching Red door, showing the complete movement process step by step.

Visual

Initial Frame: Agent in maze with keys and doors
Animation: Agent finds key, then navigates to matching door
Final Frame: Agent reaches correct colored door

📖 Task Description

Objective

Navigate a maze to find a specific colored key, then proceed to the matching colored door.

Task Setup

  • Agent: Green circular navigator
  • Keys: Colored diamond shapes (various colors)
  • Doors: Colored hollow rectangles (matching key colors)
  • Maze structure: Walls creating pathways
  • Target color: Specified in prompt (e.g., "Yellow")
  • Two-phase task: (1) Find key, (2) Navigate to matching door
  • Goal: Reach the door that matches the specified key color

Key Features

  • Color-based symbolic matching
  • Two-step navigation task (key first, then door)
  • Maze pathfinding with obstacles
  • Tests color recognition and matching logic
  • Requires planning multi-stage navigation
  • Step-by-step movement visualization
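
The two-phase structure described above can be illustrated with a small grid search. This is a sketch under stated assumptions, not the generator's actual algorithm: the maze is assumed to be a grid of strings where `#` marks a wall, movement is 4-connected, other keys and doors do not block movement, and the names `bfs_path` and `key_then_door_path` are hypothetical.

```python
from collections import deque


def bfs_path(grid, start, goal):
    """Return a shortest 4-connected path from start to goal, or None.

    Cells are (row, col) tuples; '#' cells are walls (an assumed encoding).
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None


def key_then_door_path(grid, agent, keys, doors, target_color):
    """Phase 1: agent to the target key. Phase 2: key to the matching door."""
    key_pos, door_pos = keys[target_color], doors[target_color]
    to_key = bfs_path(grid, agent, key_pos)
    to_door = bfs_path(grid, key_pos, door_pos)
    if to_key is None or to_door is None:
        return None
    # Drop the duplicated key cell when joining the two legs.
    return to_key + to_door[1:]


# Toy maze: positions are illustrative, not taken from a generated sample.
grid = [
    "..#..",
    ".##..",
    ".....",
]
keys = {"Red": (0, 1), "Blue": (2, 4)}
doors = {"Red": (0, 4), "Blue": (0, 3)}
path = key_then_door_path(grid, (0, 0), keys, doors, "Red")
print(path)
```

Solving each leg separately keeps the key as a mandatory waypoint, which is what distinguishes this task from plain shortest-path navigation to the door.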

📦 Data Format

data/questions/key_door_matching_task/key_door_matching_00000000/
├── first_frame.png      # Maze with agent, keys, and doors
├── final_frame.png      # Agent at matching door
├── prompt.txt           # Key-door matching instruction with color
├── ground_truth.mp4     # Animation of key collection and door navigation
└── question_metadata.json # Task metadata

File specifications:

  • Images: 1024×1024 PNG format
  • Video: MP4 format, 16 fps
  • Duration: ~6-7 seconds
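
For downstream use, the layout above can be read with a short loader. This is a minimal sketch: `load_samples` is a hypothetical helper, not part of this repository, and it makes no assumption about the keys inside `question_metadata.json`. The `require_video` flag covers datasets generated with `--no-videos`.

```python
import json
from pathlib import Path

# File names taken from the directory layout documented above.
REQUIRED_FILES = ("first_frame.png", "final_frame.png", "prompt.txt", "question_metadata.json")
VIDEO_FILE = "ground_truth.mp4"


def load_samples(task_dir, require_video=True):
    """Collect one record per sample directory under task_dir.

    Raises FileNotFoundError if a sample lacks an expected file.
    """
    required = REQUIRED_FILES + ((VIDEO_FILE,) if require_video else ())
    samples = []
    for sample_dir in sorted(p for p in Path(task_dir).iterdir() if p.is_dir()):
        missing = [name for name in required if not (sample_dir / name).is_file()]
        if missing:
            raise FileNotFoundError(f"{sample_dir.name} is missing {missing}")
        video = sample_dir / VIDEO_FILE
        samples.append({
            "id": sample_dir.name,
            "prompt": (sample_dir / "prompt.txt").read_text(encoding="utf-8").strip(),
            "metadata": json.loads((sample_dir / "question_metadata.json").read_text(encoding="utf-8")),
            "first_frame": sample_dir / "first_frame.png",
            "final_frame": sample_dir / "final_frame.png",
            "video": video if video.is_file() else None,
        })
    return samples
```

With the default output directory, a call would look like `load_samples("data/questions/key_door_matching_task")`.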

🏷️ Tags

logic-symbols color-matching maze-navigation multi-step-task symbolic-reasoning pathfinding

