Generates synthetic datasets for training and evaluating vision models on object size transformation tasks. Each sample contains two objects with the same shape and color but different sizes, separated by a divider line; one object must be resized to match the reference object's size.
Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.
| Property | Value |
|---|---|
| Task ID | G-6 |
| Task | Resize Object |
| Category | Transformation |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | ~2 seconds |
| Output | PNG images + MP4 video |
# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-6_resize_object_data-generator.git
cd G-6_resize_object_data-generator
# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

# Generate 50 samples
python examples/generate.py --num-samples 50
# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset
# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42
# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

| Argument | Description |
|---|---|
| --num-samples | Number of tasks to generate (required) |
| --output | Output directory (default: data/questions) |
| --seed | Random seed for reproducibility |
| --no-videos | Skip video generation (images only) |
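The flags in the table can be sketched with `argparse`. This is a hypothetical parser that mirrors the documented arguments and defaults, not the actual contents of `examples/generate.py`; the function name `build_parser` and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the repo's code)."""
    parser = argparse.ArgumentParser(description="Generate resize-object samples")
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation (images only)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--num-samples", "50", "--seed", "42"])
print(args.num_samples, args.output, args.seed, args.no_videos)
# prints: 50 data/questions 42 False
```

Omitting `--num-samples` makes `argparse` exit with a usage error, which matches the "required" note in the table.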
Resize the top object so its size matches the bottom reference object. Only change size.
| Initial Frame | Animation | Final Frame |
|---|---|---|
| Source object at a different size | Object smoothly resizes to match the reference | Both objects are now the same size |
Resize the source object to match the reference object's size while maintaining its position, shape, and color.
- Layout: Split canvas with divider line (horizontal or vertical)
- Objects: Two objects with the same shape and color but different sizes
- Shapes: Circle, square, triangle, diamond, hexagon, pentagon, heptagon, octagon, star, cross (10 types)
- Colors: 100 distinct colors (fixed palette for reproducibility)
- Size range: 39-156 pixels (radius/half-size)
- Minimum size difference: 26 pixels between objects
- Background: Pure white
- Goal: Resize source object to match reference size exactly
- Objects centered in their respective canvas halves
- Thin divider line (2 pixels) separates the two regions
- Source and reference sides randomized (left/right or top/bottom)
- Only size changes - position, shape, and color remain constant
- Smooth size transformation animation
- Clear spatial separation prevents ambiguity
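The size and layout constraints above can be sketched as follows. This is a minimal illustration under stated assumptions: the 1024 px canvas, the 39-156 px size range, and the 26 px minimum difference come from this README, while the rejection-sampling strategy, the integer sizes, and the names `sample_sizes` and `sample_layout` are hypothetical and may differ from what the generator actually does.

```python
import random

CANVAS = 1024                  # px, from the properties table
MIN_SIZE, MAX_SIZE = 39, 156   # radius / half-size range in px
MIN_DIFF = 26                  # minimum size difference in px


def sample_sizes(rng):
    """Rejection-sample (source, reference) sizes that differ by >= MIN_DIFF."""
    while True:
        source = rng.randint(MIN_SIZE, MAX_SIZE)
        reference = rng.randint(MIN_SIZE, MAX_SIZE)
        if abs(source - reference) >= MIN_DIFF:
            return source, reference


def sample_layout(rng):
    """Pick the divider orientation and which half holds the source object."""
    vertical = rng.choice([True, False])  # vertical divider: left/right halves
    q, h, t = CANVAS // 4, CANVAS // 2, 3 * CANVAS // 4
    # Each object sits at the center of its own half of the canvas.
    centers = [(q, h), (t, h)] if vertical else [(h, q), (h, t)]
    rng.shuffle(centers)  # randomize which side is the source
    return {
        "divider": "vertical" if vertical else "horizontal",
        "source_center": centers[0],
        "reference_center": centers[1],
    }


rng = random.Random(42)  # a seeded generator keeps the sample reproducible
source, reference = sample_sizes(rng)
layout = sample_layout(rng)
print(source, reference, layout)
```

With a maximum half-size of 156 px and centers 256 px from the nearest canvas edge and from the divider, an object never crosses into the other half.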
data/questions/resize_object_task/resize_object_00000000/
├── first_frame.png # Source object at initial size
├── final_frame.png # Source object resized to match reference
├── prompt.txt # Resize instruction with position details
├── ground_truth.mp4 # Animation of resize transformation
└── question_metadata.json # Task metadata
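A quick way to sanity-check generated output against this layout is sketched below. The file names come from the tree above; the helper `missing_files` is hypothetical, and the `require_video` switch assumes that `ground_truth.mp4` is simply absent when `--no-videos` is used, which this README does not state explicitly.

```python
from pathlib import Path

EXPECTED_FILES = [
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "ground_truth.mp4",
    "question_metadata.json",
]


def missing_files(sample_dir, require_video=True):
    """Return the expected file names that are absent from sample_dir."""
    sample_dir = Path(sample_dir)
    expected = [name for name in EXPECTED_FILES
                if require_video or name != "ground_truth.mp4"]
    return [name for name in expected if not (sample_dir / name).is_file()]


# Walk every sample directory under the default output location.
root = Path("data/questions/resize_object_task")
for sample in sorted(root.glob("resize_object_*")):
    missing = missing_files(sample)
    if missing:
        print(f"{sample.name}: missing {missing}")
```

Pass `require_video=False` when checking a dataset generated with `--no-videos`.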
File specifications:
- Images: 1024×1024 PNG format
- Video: MP4 format, 16 fps
- Duration: ~2 seconds
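At 16 fps and roughly 2 seconds, a video holds about 32 frames. The sketch below shows one way a per-frame size schedule for the resize animation could be computed. Linear interpolation is an assumption here; the actual generator may use a different easing curve or add hold frames, and `size_schedule` is a hypothetical name.

```python
FPS = 16
DURATION_S = 2.0  # approximate, per the file specifications


def size_schedule(start, end, fps=FPS, duration_s=DURATION_S):
    """Linearly interpolate the size from start to end, one value per frame."""
    n_frames = max(2, round(fps * duration_s))
    return [start + (end - start) * i / (n_frames - 1) for i in range(n_frames)]


# Example: grow a source object from 60 px to a 120 px reference size.
sizes = size_schedule(60, 120)
print(len(sizes), sizes[0], sizes[-1])
# prints: 32 60.0 120.0
```

The first value corresponds to the size in `first_frame.png` and the last to the size in `final_frame.png`.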
transformation size-matching visual-reasoning object-properties spatial-relations


