Generates synthetic 3D mental rotation tasks. Each task asks for a smooth video of a camera rotating horizontally around a fixed 3D voxel structure while keeping a tilted viewing angle.
Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.
| Property | Value |
|---|---|
| Task ID | O-55 |
| Task | Rotation |
| Category | Spatiality |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | ~5-10 seconds |
| Output | PNG images + MP4 video |
# Clone the repository
git clone https://github.com/Jiaqi-Gong/Gong_VBVR_Data.git
cd Gong_VBVR_Data/O-55_rotation_data-generator
# Install dependencies
pip install -r requirements.txt
# Generate 100 samples
python examples/generate.py --num-samples 100
# Generate with specific seed
python examples/generate.py --num-samples 100 --seed 42
# Generate without videos
python examples/generate.py --num-samples 100 --no-videos
# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_output

| Argument | Type | Description | Default |
|---|---|---|---|
| `--num-samples` | int | Number of samples to generate | Required |
| `--seed` | int | Random seed for reproducibility | Random |
| `--output` | str | Output directory | data/questions |
| `--no-videos` | flag | Skip video generation | False |
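
The flags in the table above map naturally onto an `argparse` parser. The following is a minimal sketch of such a parser, written only from the table; it is an assumption about how the options could be declared, not the code in `examples/generate.py`, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the repository's code)."""
    parser = argparse.ArgumentParser(description="Generate rotation task samples")
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of samples to generate")
    # None stands in for the documented "Random" default.
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--output", type=str, default="data/questions",
                        help="Output directory")
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--num-samples", "100", "--seed", "42", "--no-videos"])
print(args.num_samples, args.seed, args.output, args.no_videos)
```

Because `--no-videos` uses `store_true`, it defaults to `False` and needs no value, which matches the "flag" type in the table.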
A 6-block sculpture sits fixed on a table. First frame: Your camera is tilted at 23° elevation, viewing from 230° azimuth. Final frame: Your camera remains at 23° elevation, but rotates horizontally to 50° azimuth. This is a 180-degree rotation. Create a smooth video showing the camera's horizontal rotation around the sculpture, and try to maintain the tilted viewing angle throughout.
| Initial Frame | Animation | Final Frame |
|---|---|---|
| 3D structure from initial viewpoint | Camera rotating around structure | 3D structure from rotated viewpoint |
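
The camera motion in the example prompt (23° elevation, 230° to 50° azimuth) can be illustrated with a short sketch. It assumes a z-up coordinate system, a camera orbiting the origin at a fixed radius, linear interpolation of azimuth, and a positive sweep direction; the function names, the radius, and the 5-second duration are illustrative choices, not values taken from the generator.

```python
import math


def camera_position(azimuth_deg, elevation_deg, radius=10.0):
    """Convert azimuth/elevation (degrees) to a z-up Cartesian camera position."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = radius * math.cos(el) * math.cos(az)
    y = radius * math.cos(el) * math.sin(az)
    z = radius * math.sin(el)  # depends only on elevation, so it stays constant
    return (x, y, z)


def azimuth_path(start_deg, sweep_deg, num_frames):
    """Linearly interpolate azimuth over num_frames, wrapping into [0, 360)."""
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    return [(start_deg + sweep_deg * i / (num_frames - 1)) % 360
            for i in range(num_frames)]


FPS = 16          # from the specification table
DURATION_S = 5    # assumed; the documented range is roughly 5-10 seconds
num_frames = FPS * DURATION_S

azimuths = azimuth_path(230.0, 180.0, num_frames)
positions = [camera_position(az, 23.0) for az in azimuths]
print(len(positions), azimuths[0], azimuths[-1])
```

Holding elevation fixed keeps the camera at a constant height, so only the x and y coordinates change from frame to frame.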
Understand 3D mental rotation by creating a smooth video showing camera rotation around a fixed 3D voxel structure, maintaining a consistent elevation angle while rotating horizontally.
- Voxel Structure: 6-7 voxel blocks forming a 3D sculpture
- Structure Generation: Voxel snake algorithm with branching
- Elevation Range: 20-40 degrees (tilted view)
- Rotation Angle: 180 degrees of horizontal rotation; the generator varies only the camera viewpoint and keeps the 3D structure fixed
- Viewpoint: Camera rotates around fixed structure
- Voxel Colors: 20 distinct colors for visual variety
- Structure Complexity: Configurable voxel count and branching patterns
- 3D mental rotation: Tests ability to understand 3D spatial transformations
- Camera perspective: Requires understanding camera rotation and viewpoint changes
- Fixed structure: 3D sculpture remains fixed while camera moves
- Smooth animation: Shows continuous camera rotation around structure
- Elevation maintenance: Camera maintains tilted viewing angle throughout rotation
- Visual clarity: Clear 3D voxel representation with distinct colors
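
The list above mentions a voxel snake algorithm with branching but does not describe it. The sketch below shows one plausible reading, assuming the structure grows one unit cube at a time from a moving head and occasionally branches from an earlier voxel. The function name, the `branch_prob` parameter, and the rule that voxels stay at or above the table (`z >= 0`) are assumptions for illustration, not the generator's actual implementation.

```python
import random

# The six axis-aligned neighbor offsets of a unit voxel.
DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def generate_voxel_snake(num_voxels, branch_prob=0.3, seed=None):
    """Grow a connected voxel structure; hypothetical sketch, not the repository's code."""
    rng = random.Random(seed)
    voxels = [(0, 0, 0)]
    occupied = {(0, 0, 0)}
    head = (0, 0, 0)
    while len(voxels) < num_voxels:
        # Branch: sometimes grow from a random existing voxel instead of the head.
        base = rng.choice(voxels) if rng.random() < branch_prob else head
        candidates = [(base[0] + dx, base[1] + dy, base[2] + dz)
                      for dx, dy, dz in DIRECTIONS]
        # Keep only empty cells at or above the table plane (assumed z >= 0).
        free = [c for c in candidates if c not in occupied and c[2] >= 0]
        if not free:
            head = rng.choice(voxels)  # dead end: restart from another voxel
            continue
        new_voxel = rng.choice(free)
        voxels.append(new_voxel)
        occupied.add(new_voxel)
        head = new_voxel
    return voxels


structure = generate_voxel_snake(6, seed=0)
print(structure)
```

Passing a seed makes the structure reproducible, in the same spirit as the `--seed` option.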
data/questions/rotation_task/rotation_00000000/
├── first_frame.png # Initial state (structure from initial viewpoint)
├── final_frame.png # Goal state (structure from rotated viewpoint)
├── prompt.txt # Task instructions
├── ground_truth.mp4 # Solution video (16 fps)
└── question_metadata.json # Task metadata
File specifications: Images are 1024×1024 PNG. Videos are MP4 at 16 fps, approximately 5-10 seconds long.
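
A sample directory with the layout above can be read using only the standard library. This is a minimal sketch: `load_sample` is a hypothetical helper, the schema of `question_metadata.json` is not documented here so it is returned as parsed JSON without interpretation, and the `require_video` switch assumes that `ground_truth.mp4` is absent when samples are generated with `--no-videos`.

```python
import json
from pathlib import Path

EXPECTED_FILES = (
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "ground_truth.mp4",
    "question_metadata.json",
)


def load_sample(sample_dir, require_video=True):
    """Check that a sample folder is complete and return its text, metadata, and paths."""
    sample_dir = Path(sample_dir)
    required = [name for name in EXPECTED_FILES
                if require_video or name != "ground_truth.mp4"]
    missing = [name for name in required if not (sample_dir / name).exists()]
    if missing:
        raise FileNotFoundError(f"{sample_dir} is missing: {', '.join(missing)}")
    video = sample_dir / "ground_truth.mp4"
    return {
        "prompt": (sample_dir / "prompt.txt").read_text(encoding="utf-8").strip(),
        "metadata": json.loads(
            (sample_dir / "question_metadata.json").read_text(encoding="utf-8")
        ),
        "first_frame": sample_dir / "first_frame.png",
        "final_frame": sample_dir / "final_frame.png",
        "video": video if video.exists() else None,
    }
```

For example, `load_sample("data/questions/rotation_task/rotation_00000000")` would return the prompt text, the parsed metadata, and the paths to the frames and video for the first sample.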
3d-rotation mental-rotation spatial-reasoning camera-perspective voxel-structures transformation 3d-visualization


