Generates synthetic datasets for training and evaluating vision models on visual reasoning with occlusion. Each sample contains objects that are temporarily obscured by a moving dark mask, testing object permanence understanding.
Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.
| Property | Value |
|---|---|
| Task ID | G-21 |
| Task | Multiple Occlusions Vertical |
| Category | Transformation |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | ~3 seconds |
| Output | PNG images + MP4 video |
# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-21_multiple_occlusions_vertical_data-generator.git
cd G-21_multiple_occlusions_vertical_data-generator
# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

# Generate 50 samples
python examples/generate.py --num-samples 50
# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset
# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42
# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

| Argument | Description |
|---|---|
| --num-samples | Number of tasks to generate (required) |
| --output | Output directory (default: data/questions) |
| --seed | Random seed for reproducibility |
| --no-videos | Skip video generation (images only) |
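
The documented flags map naturally onto a small `argparse` parser. The following is a minimal sketch of such a parser, not the actual contents of `examples/generate.py`; the function name `build_parser` and the argument types are assumptions, while the flag names, the required `--num-samples`, and the `data/questions` default come from the table above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented command-line flags."""
    parser = argparse.ArgumentParser(description="Generate occlusion task samples")
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation (images only)")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--num-samples", "50", "--seed", "42", "--no-videos"])
print(args.num_samples, args.output, args.seed, args.no_videos)
```

Passing the same `--seed` should reproduce the same dataset, so recording the seed alongside generated data is a reasonable habit.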
The scene shows 4 objects arranged in a horizontal line in the center of the frame, with a dark rectangular mask initially positioned above them. Move the mask vertically downward in a continuous motion until it leaves the frame. As it moves, the mask passes in front of the objects, temporarily blocking them from view.
| Initial Frame | Animation | Final Frame |
|---|---|---|
| Objects visible, mask above them | Mask moves down, occluding objects | Mask below, objects fully visible again |
Demonstrate and understand object permanence by moving a dark mask vertically across objects, temporarily occluding them from view.
- Objects: 2-6 colored shapes (circles, squares, triangles, diamonds, hexagons, stars)
- Arrangement: Horizontal line in center of frame
- Mask: Dark rectangular overlay (~30% of image height)
- Initial position: Mask above objects
- Final position: Mask below frame (objects fully visible)
- Background: White with clear contrast
- Colors: 40 distinct fixed colors
- Goal: Move mask downward continuously, occluding and revealing objects
- Vertical occlusion motion demonstrating object permanence
- Multiple objects temporarily hidden simultaneously
- Continuous smooth mask movement
- 6 object shapes: circle, square, triangle, diamond, hexagon, star
- 40 distinct fixed colors for object variety
- Dark mask with high opacity for clear occlusion effect
- Objects remain stationary while mask moves
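
The mask motion described above can be sketched as a per-frame interpolation of the mask's top edge, combined with a vertical overlap test against a stationary object. This is an illustration under stated assumptions rather than the generator's actual code: linear motion, a mask that starts flush with the top of the frame, exactly 48 frames (16 fps × 3 s), a mask wide enough to cover every object horizontally, and a hypothetical 80 px object centred at y = 512. The names `mask_top` and `is_occluded` are invented for this example.

```python
WIDTH = HEIGHT = 1024               # resolution from the specification table
FPS = 16
DURATION_S = 3                      # "~3 seconds"; the exact frame count is an assumption
MASK_HEIGHT = round(0.30 * HEIGHT)  # "~30% of image height"


def mask_top(frame: int, total_frames: int, start_y: float, end_y: float) -> float:
    """Linearly interpolate the mask's top edge between start_y and end_y."""
    t = frame / (total_frames - 1)
    return start_y + t * (end_y - start_y)


def is_occluded(obj_top: float, obj_bottom: float, top: float) -> bool:
    """True if the mask's vertical span overlaps the object's vertical span."""
    return top < obj_bottom and top + MASK_HEIGHT > obj_top


total_frames = FPS * DURATION_S
start_y = 0.0                       # assumed: mask starts flush with the top edge
end_y = float(HEIGHT)               # mask top at the bottom edge, so the mask has left the frame
obj_top, obj_bottom = 472.0, 552.0  # hypothetical 80 px object centred at y = 512

# Frames in which the stationary object is at least partly hidden by the mask.
hidden = [
    f for f in range(total_frames)
    if is_occluded(obj_top, obj_bottom, mask_top(f, total_frames, start_y, end_y))
]
print(total_frames, hidden[0], hidden[-1])
```

Because the objects never move, the hidden frames form one contiguous run: the object is visible, then occluded, then visible again, which is the object permanence pattern the task is built around.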
data/questions/multiple_occlusions_vertical_task/multiple_occlusions_vertical_00000000/
├── first_frame.png # Mask above objects, objects visible
├── final_frame.png # Mask below frame, objects fully visible
├── prompt.txt # Mask movement and occlusion instruction
├── ground_truth.mp4 # Animation of mask moving down
└── question_metadata.json # Task metadata
File specifications:
- Images: 1024×1024 PNG format
- Video: MP4 format, 16 fps
- Duration: ~3 seconds
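
A consumer of the dataset will typically want to confirm that a sample directory is complete before using it. The sketch below checks for the files listed in the tree above and reads the prompt and metadata. The helper names `check_sample` and `load_sample` are hypothetical, and treating `ground_truth.mp4` as optional assumes that `--no-videos` simply omits that file. The schema of `question_metadata.json` is not documented here, so it is returned as parsed JSON without interpretation.

```python
import json
from pathlib import Path

REQUIRED_FILES = ("first_frame.png", "final_frame.png", "prompt.txt", "question_metadata.json")
VIDEO_FILES = ("ground_truth.mp4",)  # assumed to be absent when --no-videos is used


def check_sample(sample_dir: Path, require_video: bool = True) -> list[str]:
    """Return the names of expected files that are missing from sample_dir."""
    names = REQUIRED_FILES + (VIDEO_FILES if require_video else ())
    return [name for name in names if not (sample_dir / name).is_file()]


def load_sample(sample_dir: Path) -> dict:
    """Read the prompt and metadata; the frames are returned as paths, not decoded."""
    metadata_text = (sample_dir / "question_metadata.json").read_text(encoding="utf-8")
    return {
        "prompt": (sample_dir / "prompt.txt").read_text(encoding="utf-8").strip(),
        "metadata": json.loads(metadata_text),
        "first_frame": sample_dir / "first_frame.png",
        "final_frame": sample_dir / "final_frame.png",
    }


sample = Path("data/questions/multiple_occlusions_vertical_task/multiple_occlusions_vertical_00000000")
if sample.is_dir():
    # Only runs when a generated sample exists at the documented path.
    print("missing files:", check_sample(sample))
```

Returning the list of missing names, rather than a boolean, makes it easier to report exactly what went wrong when scanning a large output directory.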
visual-reasoning object-permanence occlusion motion attention visual-tracking


