G-6: Resize Object Data Generator

Generates synthetic datasets for training and evaluating vision models on object size transformation tasks. Each sample contains two objects of the same shape and color but different sizes, separated by a divider line. One object (the source) must be resized to match the size of the other (the reference).

Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.


📌 Basic Information

| Property | Value |
|----------|-------|
| Task ID | G-6 |
| Task | Resize Object |
| Category | Transformation |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | ~2 seconds |
| Output | PNG images + MP4 video |

🚀 Usage

Installation

# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-6_resize_object_data-generator.git
cd G-6_resize_object_data-generator

# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

Generate Data

# Generate 50 samples
python examples/generate.py --num-samples 50

# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset

# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42

# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

Command-Line Options

| Argument | Description |
|----------|-------------|
| --num-samples | Number of tasks to generate (required) |
| --output | Output directory (default: data/questions) |
| --seed | Random seed for reproducibility |
| --no-videos | Skip video generation (images only) |
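
The options above map naturally onto an argparse parser. The following is a minimal sketch of what such a parser could look like. It is not taken from examples/generate.py; the function name build_parser and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the repo's code)."""
    parser = argparse.ArgumentParser(description="Generate resize-object samples")
    # --num-samples is documented as required
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    # Documented default output directory
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    # No seed means non-reproducible generation (assumed behavior)
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation (images only)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere
args = build_parser().parse_args(["--num-samples", "50", "--seed", "42"])
```

Note that argparse exposes `--num-samples` as `args.num_samples` and `--no-videos` as `args.no_videos`.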

📖 Task Example

Prompt

Resize the top object so its size matches the bottom reference object. Only change size.

Visual

  • Initial Frame: Source object at different size
  • Animation: Object smoothly resizes to match reference
  • Final Frame: Both objects now same size
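
The animation between the initial and final frame can be thought of as a per-frame size schedule. The sketch below assumes linear interpolation and exactly 2 seconds at 16 fps (32 frames). The generator's actual easing curve and frame count are not documented here, and size_schedule is a hypothetical helper.

```python
FPS = 16         # documented frame rate
DURATION_S = 2   # documented duration is "~2 seconds"; exactly 2 is an assumption


def size_schedule(start: float, end: float,
                  num_frames: int = FPS * DURATION_S) -> list[float]:
    """Linearly interpolated sizes, one per frame, from start to end inclusive."""
    if num_frames < 2:
        raise ValueError("need at least two frames")
    step = (end - start) / (num_frames - 1)
    return [start + i * step for i in range(num_frames)]


# Example: grow a source object from half-size 60 px to a 120 px reference
sizes = size_schedule(60, 120)
```

The first entry corresponds to first_frame.png and the last to final_frame.png. A shrinking source works the same way, with a negative step.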

📖 Task Description

Objective

Resize the source object to match the reference object's size while maintaining its position, shape, and color.

Task Setup

  • Layout: Split canvas with divider line (horizontal or vertical)
  • Objects: Two objects with the same shape and color but different sizes
  • Shapes: Circle, square, triangle, diamond, hexagon, pentagon, heptagon, octagon, star, cross (10 types)
  • Colors: 100 distinct colors (fixed palette for reproducibility)
  • Size range: 39-156 pixels (radius/half-size)
  • Minimum size difference: 26 pixels between objects
  • Background: Pure white
  • Goal: Resize source object to match reference size exactly
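
The size constraints above can be satisfied with simple rejection sampling. This is a sketch only: it assumes integer sizes, inclusive bounds, and that "minimum size difference" means a difference of at least 26 pixels. The helper sample_size_pair is hypothetical and may not match how the generator draws sizes.

```python
import random

MIN_SIZE = 39    # documented lower bound (radius/half-size, px)
MAX_SIZE = 156   # documented upper bound
MIN_DIFF = 26    # documented minimum difference between the two objects


def sample_size_pair(rng: random.Random) -> tuple[int, int]:
    """Return (source_size, reference_size) that differ by at least MIN_DIFF."""
    while True:
        source = rng.randint(MIN_SIZE, MAX_SIZE)      # randint is inclusive
        reference = rng.randint(MIN_SIZE, MAX_SIZE)
        if abs(source - reference) >= MIN_DIFF:
            return source, reference


# A dedicated Random instance keeps results reproducible for a given seed
rng = random.Random(42)
source_size, reference_size = sample_size_pair(rng)
```

Because the source may be drawn larger or smaller than the reference, a sampler like this produces both grow and shrink tasks.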

Key Features

  • Objects centered in their respective canvas halves
  • Thin divider line (2 pixels) separates the two regions
  • Source and reference sides randomized (left/right or top/bottom)
  • Only size changes - position, shape, and color remain constant
  • Smooth size transformation animation
  • Clear spatial separation prevents ambiguity
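
To make the layout concrete, the following sketch computes where the two object centers would fall on a 1024×1024 canvas. It assumes the divider runs through the canvas midline and that each object sits at the exact center of its half. The function object_centers and the orientation names are hypothetical.

```python
CANVAS = 1024  # documented resolution (px)


def object_centers(orientation: str) -> tuple[tuple[int, int], tuple[int, int]]:
    """Return the (x, y) centers of the two canvas halves.

    "vertical" means a vertical divider (left/right halves);
    "horizontal" means a horizontal divider (top/bottom halves).
    """
    half, quarter = CANVAS // 2, CANVAS // 4
    if orientation == "vertical":
        return (quarter, half), (CANVAS - quarter, half)
    if orientation == "horizontal":
        return (half, quarter), (half, CANVAS - quarter)
    raise ValueError(f"unknown orientation: {orientation!r}")


left, right = object_centers("vertical")
top, bottom = object_centers("horizontal")
```

Under these assumptions each center is 256 px from the divider, so even the largest documented half-size (156 px) stays within its own region.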

📦 Data Format

data/questions/resize_object_task/resize_object_00000000/
├── first_frame.png      # Source object at initial size
├── final_frame.png      # Source object resized to match reference
├── prompt.txt           # Resize instruction with position details
├── ground_truth.mp4     # Animation of resize transformation
└── question_metadata.json # Task metadata

File specifications:

  • Images: 1024×1024 PNG format
  • Video: MP4 format, 16 fps
  • Duration: ~2 seconds
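
A consumer of the dataset might check a sample directory against the layout above before using it. The sketch below uses only the standard library. The helper load_sample is hypothetical, the metadata is returned as parsed JSON because its schema is not documented here, and treating ground_truth.mp4 as optional for --no-videos runs is an assumption.

```python
import json
from pathlib import Path

# File names taken from the documented directory layout
EXPECTED_FILES = (
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "ground_truth.mp4",
    "question_metadata.json",
)


def load_sample(sample_dir, require_video=True):
    """Verify the documented files exist, then return (prompt, metadata)."""
    sample_dir = Path(sample_dir)
    # Assumption: datasets generated with --no-videos have no ground_truth.mp4
    required = [n for n in EXPECTED_FILES
                if require_video or n != "ground_truth.mp4"]
    missing = [n for n in required if not (sample_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(f"{sample_dir} is missing: {', '.join(missing)}")
    prompt = (sample_dir / "prompt.txt").read_text(encoding="utf-8").strip()
    metadata = json.loads(
        (sample_dir / "question_metadata.json").read_text(encoding="utf-8")
    )
    return prompt, metadata
```

For example, `load_sample("data/questions/resize_object_task/resize_object_00000000")` would return the prompt text and the metadata dictionary for the first sample, provided that directory has been generated.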

🏷️ Tags

transformation size-matching visual-reasoning object-properties spatial-relations

