Generates synthetic datasets for training and evaluating vision models on size comparison and ranking tasks. Each sample contains multiple circles of different sizes where the second largest must be identified and outlined.
Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.
| Property | Value |
|---|---|
| Task ID | G-161 |
| Task | Mark Second Largest Shape |
| Category | Perception |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | ~2-3 seconds |
| Output | PNG images + MP4 video |
# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-161_mark_second_largest_shape_data-generator.git
cd G-161_mark_second_largest_shape_data-generator
# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

# Generate 50 samples
python examples/generate.py --num-samples 50
# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset
# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42
# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

| Argument | Description |
|---|---|
| `--num-samples` | Number of tasks to generate (required) |
| `--output` | Output directory (default: `data/questions`) |
| `--seed` | Random seed for reproducibility |
| `--no-videos` | Skip video generation (images only) |
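
The arguments above map naturally onto Python's `argparse`. The following is a minimal sketch of how such a command line could be parsed, not the repository's actual `examples/generate.py`; the function name `build_parser`, the integer types, and the help strings are assumptions, while the flag names and the `data/questions` default come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(
        description="Generate mark-second-largest-shape samples"
    )
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    # Assumption: the seed is an integer and optional.
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation (images only)")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--num-samples", "50", "--seed", "42"])
print(args.num_samples, args.output, args.seed, args.no_videos)
# prints "50 data/questions 42 False"
```
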
There are 4 circles on the screen, with different sizes.
Compare their sizes and find the second largest circle.
The second largest is the shape that would be in position 2 if you sort all shapes by size from largest to smallest.
Outline the second largest shape in red.
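
The ranking rule in the prompt (sort from largest to smallest, take position 2) can be sketched in a few lines of Python. This is an illustration of the rule only, assuming each circle is summarized by its diameter; the function name `second_largest_index` and the sample diameters are hypothetical and are not taken from the generator.

```python
from typing import Sequence


def second_largest_index(diameters: Sequence[float]) -> int:
    """Return the index of the shape in position 2 when sorted largest first."""
    if len(diameters) < 2:
        raise ValueError("need at least two shapes to rank")
    # Sort shape indices by diameter, largest first.
    order = sorted(range(len(diameters)), key=lambda i: diameters[i], reverse=True)
    return order[1]


# Hypothetical diameters in pixels for four circles.
diameters = [120.0, 200.0, 80.0, 160.0]
target = second_largest_index(diameters)
print(target, diameters[target])  # prints "3 160.0"
```

Sorting indices rather than values keeps the link back to the original shape, which is what a marking step would need in order to know which circle to outline.
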
| Initial Frame | Animation | Final Frame |
|---|---|---|
| 5 circles with different sizes | Second largest circle outlined in red | Red outline around second largest circle |
Compare sizes of multiple circles and outline the second largest circle with a red border.
- Shapes: 5 circles with different sizes
- Size variation: Circles have distinct diameters
- Ranking criterion: Sort by size (largest to smallest)
- Target: Second position in sorted order
- Task: Compare sizes, identify second largest
- Marking: Outline with red border
- Background: White with clear visibility
- Goal: Outline the circle with the second largest diameter
- Size comparison across multiple objects
- Ranking and ordinal position identification
- Tests understanding of "second largest" concept
- Requires sorting by size metric
- Non-maximum selection (not just largest)
- Precise ranking among similar objects
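
For the ranking to be unambiguous, the circles need distinct diameters. One possible way to sample them with a guaranteed minimum gap is sketched below; the function name, the diameter range, and the gap value are all assumptions chosen for illustration, and the generator may use a different strategy.

```python
import random
from typing import List, Optional


def sample_distinct_diameters(
    n: int,
    lo: int = 60,       # assumed smallest diameter in pixels
    hi: int = 300,      # assumed largest diameter in pixels
    min_gap: int = 20,  # assumed minimum difference between any two diameters
    seed: Optional[int] = None,
) -> List[int]:
    """Sample n diameters in [lo, hi] that differ pairwise by at least min_gap."""
    rng = random.Random(seed)
    # Pixels left over after reserving the mandatory gaps between neighbours.
    slack = (hi - lo) - min_gap * (n - 1)
    if slack < 0:
        raise ValueError("range too small for the requested gap")
    # Sorted random offsets, then shift the i-th one by i * min_gap.
    offsets = sorted(rng.randint(0, slack) for _ in range(n))
    diameters = [lo + off + i * min_gap for i, off in enumerate(offsets)]
    # Shuffle so that list position carries no information about size.
    rng.shuffle(diameters)
    return diameters


diameters = sample_distinct_diameters(5, seed=42)
print(sorted(diameters, reverse=True))
```

Passing the same `seed` reproduces the same diameters, which matches the reproducibility goal of the `--seed` option.
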
data/questions/mark_second_largest_shape_task/mark_second_largest_shape_00000000/
├── first_frame.png # 5 circles with different sizes
├── final_frame.png # Second largest circle outlined in red
├── prompt.txt # Second largest identification instruction
├── ground_truth.mp4 # Animation of outlining process
└── question_metadata.json # Task metadata
File specifications:
- Images: 1024×1024 PNG format
- Video: MP4 format, 16 fps
- Duration: ~2-3 seconds
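
A quick consistency check on a generated sample can be written with the standard library alone. The sketch below lists which of the documented files are missing from a sample directory and converts the stated duration into a frame count at 16 fps. The helper names are hypothetical, and the assumption that `--no-videos` simply omits `ground_truth.mp4` is inferred from the option's description rather than confirmed by the source.

```python
from pathlib import Path
from typing import List, Union

# File names taken from the output layout documented above.
EXPECTED_FILES = (
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "ground_truth.mp4",
    "question_metadata.json",
)


def missing_files(sample_dir: Union[str, Path], require_video: bool = True) -> List[str]:
    """Return the expected file names that are absent from sample_dir."""
    sample_dir = Path(sample_dir)
    expected = [
        name for name in EXPECTED_FILES
        if require_video or not name.endswith(".mp4")
    ]
    return [name for name in expected if not (sample_dir / name).is_file()]


def frame_count(duration_s: float, fps: int = 16) -> int:
    """Number of frames for a clip of the given duration."""
    return round(duration_s * fps)


# A 2-3 second clip at 16 fps spans roughly 32 to 48 frames.
print(frame_count(2), frame_count(3))  # prints "32 48"
```
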
visual-reasoning size-comparison ranking ordinal-selection second-largest quantitative-reasoning


