VBVR-DataFactory/G-21_multiple_occlusions_vertical_data-generator

G-21: Multiple Occlusions Vertical Data Generator

Generates synthetic datasets for training and evaluating vision models on visual reasoning under occlusion. In each sample, a moving dark mask temporarily hides a row of objects, which tests a model's understanding of object permanence.

Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.


📌 Basic Information

| Property | Value |
|---|---|
| Task ID | G-21 |
| Task | Multiple Occlusions Vertical |
| Category | Transformation |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | ~3 seconds |
| Output | PNG images + MP4 video |

🚀 Usage

Installation

# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-21_multiple_occlusions_vertical_data-generator.git
cd G-21_multiple_occlusions_vertical_data-generator

# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

Generate Data

# Generate 50 samples
python examples/generate.py --num-samples 50

# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset

# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42

# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

Command-Line Options

| Argument | Description |
|---|---|
| --num-samples | Number of tasks to generate (required) |
| --output | Output directory (default: data/questions) |
| --seed | Random seed for reproducibility |
| --no-videos | Skip video generation (images only) |
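
For readers who want to wrap or script the generator, the options above can be mirrored with `argparse`. This is a minimal sketch, not the actual contents of `examples/generate.py`: the function name `build_parser`, the help strings, and the `None` default for `--seed` are assumptions; only the flag names, the required status of `--num-samples`, and the `data/questions` default come from the options listed above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented command-line options."""
    parser = argparse.ArgumentParser(
        description="Generate multiple-occlusions-vertical samples (sketch)."
    )
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    # Assumption: an omitted seed means non-deterministic generation.
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation (images only)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--num-samples", "50", "--seed", "42"])
    print(args.num_samples, args.output, args.seed, args.no_videos)
    # prints: 50 data/questions 42 False
```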

📖 Task Example

Prompt

The scene shows 4 objects arranged in a horizontal line in the center of the frame, with a dark rectangular mask initially positioned above them. Move the mask vertically downward in a continuous motion until it leaves the frame. As it moves, the mask passes in front of the objects, temporarily blocking them from view.

Visual

| Initial Frame | Animation | Final Frame |
|---|---|---|
| Objects visible, mask above them | Mask moves down, occluding objects | Mask below, objects fully visible again |

📖 Task Description

Objective

Demonstrate object permanence by moving a dark mask vertically across a row of objects, temporarily occluding them from view.

Task Setup

  • Objects: 2-6 colored shapes (circles, squares, triangles, diamonds, hexagons, stars)
  • Arrangement: Horizontal line in center of frame
  • Mask: Dark rectangular overlay (~30% of image height)
  • Initial position: Mask above objects
  • Final position: Mask below frame (objects fully visible)
  • Background: White with clear contrast
  • Colors: 40 distinct fixed colors
  • Goal: Move mask downward continuously, occluding and revealing objects
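
The setup above can be illustrated with a short sketch of the mask trajectory. It is not the generator's actual code. It assumes the mask moves linearly, starts with its top edge at the top of the image, and ends with its top edge at the bottom of the image so that it has left the frame. The 1024 px height, 16 fps, ~3 s duration, and ~30% mask height come from this README; the names `mask_tops` and `fully_occluded` and the example object extent (y = 462 to 562) are hypothetical.

```python
HEIGHT = 1024                        # image height in px (from the README)
FPS = 16                             # frames per second (from the README)
DURATION_S = 3.0                     # approximate duration (from the README)
MASK_HEIGHT = round(0.30 * HEIGHT)   # ~30% of image height


def mask_tops(start_top=0, n_frames=round(FPS * DURATION_S)):
    """Top-edge y of the mask for each frame, assuming linear motion."""
    end_top = HEIGHT  # top edge at the bottom border: mask is off-frame
    return [start_top + (end_top - start_top) * i / (n_frames - 1)
            for i in range(n_frames)]


def fully_occluded(mask_top, obj_top, obj_bottom):
    """True when the mask covers the object's whole vertical extent."""
    return mask_top <= obj_top and mask_top + MASK_HEIGHT >= obj_bottom


if __name__ == "__main__":
    tops = mask_tops()
    # Hypothetical object, 100 px tall, centered vertically in the frame.
    hidden = [i for i, t in enumerate(tops) if fully_occluded(t, 462, 562)]
    print(f"{len(tops)} frames; object fully hidden in frames {hidden}")
```

Because every object sits on the same horizontal line, one mask position hides all of them at once, which matches the "multiple objects temporarily hidden simultaneously" behavior.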

Key Features

  • Vertical occlusion motion demonstrating object permanence
  • Multiple objects temporarily hidden simultaneously
  • Continuous smooth mask movement
  • 6 object shapes: circle, square, triangle, diamond, hexagon, star
  • 40 distinct fixed colors for object variety
  • Dark mask with high opacity for clear occlusion effect
  • Objects remain stationary while mask moves

📦 Data Format

data/questions/multiple_occlusions_vertical_task/multiple_occlusions_vertical_00000000/
├── first_frame.png      # Mask above objects, objects visible
├── final_frame.png      # Mask below frame, objects fully visible
├── prompt.txt           # Mask movement and occlusion instruction
├── ground_truth.mp4     # Animation of mask moving down
└── question_metadata.json # Task metadata

File specifications:

  • Images: 1024×1024 PNG format
  • Video: MP4 format, 16 fps
  • Duration: ~3 seconds
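
A quick way to sanity-check a generated sample is to confirm that the files listed above exist. The sketch below uses only the file names shown in the directory tree; the helper name `missing_files` is hypothetical, and treating `ground_truth.mp4` as optional when `--no-videos` was used is an assumption about the generator's behavior.

```python
from pathlib import Path

# File names taken from the directory layout in this README.
EXPECTED_FILES = [
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "ground_truth.mp4",
    "question_metadata.json",
]


def missing_files(sample_dir, expect_video=True):
    """Return the expected file names that are absent from sample_dir."""
    sample_dir = Path(sample_dir)
    expected = [name for name in EXPECTED_FILES
                if expect_video or name != "ground_truth.mp4"]
    return [name for name in expected if not (sample_dir / name).is_file()]


if __name__ == "__main__":
    sample = Path("data/questions/multiple_occlusions_vertical_task/"
                  "multiple_occlusions_vertical_00000000")
    print(missing_files(sample) or "sample is complete")
```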

🏷️ Tags

visual-reasoning object-permanence occlusion motion attention visual-tracking

