VBVR-DataFactory/G-9_identify_objects_in_region_data-generator

G-9: Identify Objects In Region Data Generator

Generates synthetic datasets for training and evaluating vision models on spatial perception and selective attention tasks. Each sample contains multiple shapes scattered across the image with a highlighted region, requiring the model to identify and outline specific shapes only within that region.

Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.


📌 Basic Information

Property     Value
Task ID      G-9
Task         Identify Objects In Region
Category     Perception
Resolution   1024×1024 px
FPS          16 fps
Duration     ~2 seconds
Output       PNG images + MP4 video

🚀 Usage

Installation

# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-9_identify_objects_in_region_data-generator.git
cd G-9_identify_objects_in_region_data-generator

# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

Generate Data

# Generate 50 samples
python examples/generate.py --num-samples 50

# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset

# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42

# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

Command-Line Options

Argument        Description
--num-samples   Number of tasks to generate (required)
--output        Output directory (default: data/questions)
--seed          Random seed for reproducibility
--no-videos     Skip video generation (images only)
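
The options above map naturally onto Python's argparse. The following is a minimal sketch of how such a parser could be declared, not the actual contents of examples/generate.py; the function name build_parser is hypothetical, and only the flag names, the required status of --num-samples, and the data/questions default come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Generate G-9 samples (sketch)")
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation (images only)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--num-samples", "50", "--seed", "42"])
print(args.num_samples, args.output, args.seed, args.no_videos)
# prints: 50 data/questions 42 False
```

Note that argparse converts the dashes in --num-samples and --no-videos to underscores, so the values are read as args.num_samples and args.no_videos.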

📖 Task Example

Prompt

Outline all triangles in the circular region with a green border. Only outline objects within that region.

Visual

Initial Frame: Shapes scattered, region highlighted
Animation: Target shapes get outlined sequentially
Final Frame: All target shapes within region outlined

📖 Task Description

Objective

Identify and outline all instances of a specific shape type that lie within a designated region, demonstrating spatial awareness and selective attention.

Task Setup

  • Regions: 2 regions per image (one circular, one rectangular), arranged left and right
  • Shapes per region: 2-5 shapes per region (shape size is adjusted dynamically based on the count)
  • Shape types: Circle, square, triangle, trapezoid (4 types)
  • Colors: 100 distinct colors (fixed palette for reproducibility)
  • Target: Specific shape type to identify in a specific region (varies per sample)
  • Outline: Green border marks correctly identified shapes
  • Background: Pure white
  • Goal: Outline all instances of target shape within the specified region only
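
The goal above amounts to a two-part filter: keep a shape only if it matches the target type and lies inside the target region. The sketch below illustrates that rule under stated assumptions. The class names, the region coordinates, and the choice to test membership by shape center are all hypothetical and are not taken from the generator's source.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    kind: str   # "circle", "square", "triangle", or "trapezoid"
    x: float    # center x in pixels
    y: float    # center y in pixels


@dataclass(frozen=True)
class CircleRegion:
    cx: float
    cy: float
    radius: float

    def contains(self, shape: Shape) -> bool:
        # Assumption: a shape belongs to a region if its center lies inside it.
        return math.hypot(shape.x - self.cx, shape.y - self.cy) <= self.radius


@dataclass(frozen=True)
class RectRegion:
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, shape: Shape) -> bool:
        return (self.left <= shape.x <= self.right
                and self.top <= shape.y <= self.bottom)


def select_targets(shapes, region, target_kind):
    """Return the shapes of target_kind whose centers fall inside region."""
    return [s for s in shapes if s.kind == target_kind and region.contains(s)]


# Hypothetical layout on a 1024x1024 canvas: circle left, rectangle right.
circle = CircleRegion(cx=256, cy=512, radius=200)
rect = RectRegion(left=568, top=312, right=968, bottom=712)
shapes = [
    Shape("triangle", 200, 500),  # in the circle, matches the target type
    Shape("square", 300, 520),    # in the circle, distractor type
    Shape("triangle", 800, 500),  # right type, but in the rectangle
]
targets = select_targets(shapes, circle, "triangle")
print(targets)
```

With the prompt from the Task Example section (triangles in the circular region), only the first shape is selected; the square is a distractor by type and the second triangle is a distractor by region.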

Key Features

  • Two distinct regions (circle and rectangle) with clear boundaries
  • Shapes distributed within each region (not overlapping between regions)
  • Only shapes within the target region matching the target type are outlined
  • Multiple shape types provide distractor objects
  • Dynamic shape sizing: shapes automatically resize based on count (2-5 per region)
  • Grid-based distribution ensures even spacing within each region
  • 100-color palette provides extensive visual diversity
  • Minimum spacing between shapes prevents overlap
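
The grid-based distribution and dynamic sizing described above can be sketched as follows. This is one plausible scheme, not the generator's actual algorithm: the function name grid_layout, the near-square grid choice, and the fill factor of 0.6 are assumptions made for illustration. For a circular region, the bounding box passed in could be the square inscribed in the circle.

```python
import math


def grid_layout(count, left, top, width, height, fill=0.6):
    """Place `count` shapes on a grid inside a bounding box.

    Returns (centers, size): a list of (x, y) cell centers and the side
    length of the square that each shape is drawn in.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # Near-square grid: 2 shapes -> 2x1 cells, 5 shapes -> 3x2 cells.
    cols = math.ceil(math.sqrt(count))
    rows = math.ceil(count / cols)
    cell_w = width / cols
    cell_h = height / rows
    # Shapes shrink as cells shrink; fill < 1 leaves a gap between neighbors.
    size = fill * min(cell_w, cell_h)
    centers = []
    for i in range(count):
        row, col = divmod(i, cols)
        centers.append((left + (col + 0.5) * cell_w,
                        top + (row + 0.5) * cell_h))
    return centers, size


for n in range(2, 6):
    centers, size = grid_layout(n, left=100, top=300, width=400, height=400)
    print(n, round(size, 1), centers)
```

Because every shape is smaller than its cell, neighboring shapes are always separated by a gap of at least (1 - fill) times the shorter cell side, which is one simple way to guarantee the minimum spacing.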

📦 Data Format

data/questions/identify_objects_in_region_task/identify_objects_in_region_00000000/
├── first_frame.png      # Shapes scattered, region marked
├── final_frame.png      # Target shapes within region outlined
├── prompt.txt           # Identification instruction
├── ground_truth.mp4     # Animation of outlining process
└── question_metadata.json # Task metadata

File specifications:

  • Images: 1024×1024 PNG format
  • Video: MP4 format, 16 fps
  • Duration: ~2 seconds
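
A quick way to sanity-check a generated dataset against the layout above is to confirm that every sample directory contains the listed files. The helper below is a minimal sketch using only the standard library. The function names are hypothetical, and it assumes that runs with --no-videos simply omit ground_truth.mp4.

```python
from pathlib import Path

# File names taken from the directory listing above.
REQUIRED_FILES = ("first_frame.png", "final_frame.png",
                  "prompt.txt", "question_metadata.json")
VIDEO_FILE = "ground_truth.mp4"


def missing_files(sample_dir, expect_video=True):
    """Return the expected file names that are absent from one sample directory."""
    sample_dir = Path(sample_dir)
    expected = list(REQUIRED_FILES)
    if expect_video:
        expected.append(VIDEO_FILE)
    return [name for name in expected if not (sample_dir / name).is_file()]


def check_dataset(task_dir, expect_video=True):
    """Map each incomplete sample directory name to its missing files."""
    report = {}
    for sample_dir in sorted(Path(task_dir).iterdir()):
        if sample_dir.is_dir():
            missing = missing_files(sample_dir, expect_video)
            if missing:
                report[sample_dir.name] = missing
    return report
```

For example, check_dataset("data/questions/identify_objects_in_region_task", expect_video=False) would return an empty dict if every sample from an images-only run is complete. As a rough cross-check on the videos, 16 fps over about 2 seconds corresponds to roughly 32 frames.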

🏷️ Tags

perception spatial-awareness selective-attention region-detection shape-recognition


About

This is the data generator for the Identify Objects In Region task.
