
🚀 SpatialPi

3D-Enhanced Robot Learning with Point Cloud Representations


Transform robot demonstrations into rich 3D point cloud datasets and fine-tune state-of-the-art vision-language-action models

Installation • Dataset Pipeline • Model Training • Evaluation


🌟 Overview

SpatialPi is a toolkit that bridges the gap between raw robot demonstrations and 3D-enhanced robot learning: it converts RLDS-format demonstrations into LeRobot datasets enriched with point clouds, then fine-tunes π-family vision-language-action models on the result.

📋 Table of Contents

  • 🌟 Overview
  • 💻 Installation
  • 📊 Dataset & Point Cloud Pipeline
  • 🎓 Fine-Tuning π-Models with OpenPI
  • 🧪 Evaluation Workflow
  • 📈 Evaluation Results
  • 🙏 Acknowledgments

💻 Installation

Step-by-Step Setup

1️⃣ Clone Repository

git clone --recurse-submodules git@github.com:offjangir/SpatialPi.git
cd SpatialPi

# If already cloned:
git submodule update --init --recursive

2️⃣ Perception Stack Environment

Create the pcgen conda environment for point cloud generation:

conda create -n pcgen python=3.11 -y
conda activate pcgen

# Install stable-virtual-camera (run this from the stable-virtual-camera submodule directory)
pip install -e .

# Install VGGT
cd ../vggt
pip install -e .

# Install dependencies
pip install tensorflow tensorflow_datasets
pip install git+https://github.com/huggingface/lerobot@0cf864870cf29f4738d3ade893e6fd13fbd7cdb5

cd ../../  # Return to SpatialPi root

3️⃣ OpenPI Training Environment

Set up the training environment using uv:

cd openpi

# Install dependencies
GIT_LFS_SKIP_SMUDGE=1 uv sync
GIT_LFS_SKIP_SMUDGE=1 uv pip install -e .

💡 Tip: For Docker-based installation, see openpi/docs/docker.md


📊 Dataset & Point Cloud Pipeline

The SpatialPi pipeline converts RLDS-format robot demonstrations into LeRobot datasets enriched with 3D point clouds.
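
For orientation, RLDS datasets are standard TFDS datasets whose episodes hold per-step dictionaries (observations, actions, language instructions). Below is a minimal sketch of reading one episode with tensorflow_datasets; the conversion script src/conver_data.py handles this for you, and the field names and directory layout are assumptions based on the LIBERO RLDS schema, not guaranteed:

import tensorflow_datasets as tfds

# Point TFDS at one of the downloaded sub-datasets; the exact path/version layout may differ.
builder = tfds.builder_from_directory(
    "./data/modified_libero_rlds/libero_spatial_no_noops/1.0.0"
)
ds = builder.as_dataset(split="train")

for episode in ds.take(1):
    for step in episode["steps"]:
        image = step["observation"]["image"]        # RGB frame (assumed key)
        action = step["action"]                     # end-effector action (assumed key)
        instruction = step["language_instruction"]  # task string (assumed key)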

📥 Step 0: Download Data

Download the LIBERO RLDS dataset:

huggingface-cli download openvla/modified_libero_rlds \
  --repo-type dataset \
  --local-dir ./data/modified_libero_rlds
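
If you prefer to script the download, the same dataset can be fetched from Python with huggingface_hub (equivalent to the CLI command above):

from huggingface_hub import snapshot_download

# Download the dataset snapshot to the same local directory used by the pipeline.
snapshot_download(
    repo_id="openvla/modified_libero_rlds",
    repo_type="dataset",
    local_dir="./data/modified_libero_rlds",
)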

🔧 Stage 1: Metadata Creation

Create dataset structure and allocate point cloud filenames:

conda activate pcgen
export PYTHONPATH=$(pwd)

python src/conver_data.py \
  --data_dir ./data/modified_libero_rlds \
  --stage 1 \
  --output_dir ./lerobot_pc

What happens:

  • ✅ Enumerates all RLDS episodes
  • ✅ Creates LeRobot dataset structure
  • ✅ Pre-allocates point cloud filenames (see the illustrative frame layout after this list)
  • ✅ Generates metadata for Stage 2
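
As a rough illustration, each converted frame combines the usual LeRobot image/state/action features with a pre-allocated point cloud filename. The feature names and shapes below are assumptions, not the exact schema produced by src/conver_data.py:

import numpy as np

# Illustrative layout of one frame in the converted dataset (names and shapes assumed).
frame = {
    "observation.images.image": np.zeros((256, 256, 3), dtype=np.uint8),  # RGB camera frame
    "observation.state": np.zeros(8, dtype=np.float32),                   # proprioceptive state
    "action": np.zeros(7, dtype=np.float32),                              # commanded action
    "pointcloud_path": "pointclouds/episode_000000/frame_000000.ply",     # filename allocated in Stage 1
}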

🎨 Stage 2: Point Cloud Generation

Generate point clouds using multi-GPU processing:

python src/conver_data.py \
  --data_dir ./data/modified_libero_rlds \
  --stage 2 \
  --output_dir ./lerobot_pc \
  --num_gpus 4 \
  --workers_per_gpu 1 \
  --resume True

Key Options:

Option            | Description                             | Default
--num_gpus        | Number of GPUs to use                   | 2
--workers_per_gpu | Workers per GPU                         | 1
--resume          | Resume from already-processed episodes  | True
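
Under the hood, multi-GPU generation amounts to sharding episodes across worker processes, each pinned to a single GPU. A minimal sketch of that pattern, where generate_episode_pointcloud is a hypothetical stand-in for the per-episode point cloud call (the real scheduling lives in src/conver_data.py):

import os
from multiprocessing import Process

def worker(gpu_id, episode_ids):
    # Pin this worker to one GPU before any CUDA context is created.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    for ep in episode_ids:
        generate_episode_pointcloud(ep)  # hypothetical per-episode VGGT call

def run(episode_ids, num_gpus=4, workers_per_gpu=1):
    # Split the episode list round-robin across num_gpus * workers_per_gpu processes.
    n_workers = num_gpus * workers_per_gpu
    shards = [episode_ids[i::n_workers] for i in range(n_workers)]
    procs = [Process(target=worker, args=(i % num_gpus, shard))
             for i, shard in enumerate(shards)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()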

🔨 Stage 3: Schema Fix

Fix HuggingFace metadata for LeRobot compatibility:

python src/fix_data.py ./lerobot_pc

This ensures all nested features use _type="Sequence" instead of "List" for seamless LeRobot integration.
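
Conceptually, the fix walks the stored feature metadata and rewrites any nested _type of "List" to "Sequence". A rough sketch of that idea, with the metadata file location assumed (the real logic lives in src/fix_data.py):

import json
from pathlib import Path

def fix_feature_types(node):
    # Recursively rewrite any nested {"_type": "List"} into {"_type": "Sequence"}.
    if isinstance(node, dict):
        if node.get("_type") == "List":
            node["_type"] = "Sequence"
        for value in node.values():
            fix_feature_types(value)
    elif isinstance(node, list):
        for value in node:
            fix_feature_types(value)

info_path = Path("./lerobot_pc/meta/info.json")  # assumed location of the dataset schema
info = json.loads(info_path.read_text())
fix_feature_types(info)
info_path.write_text(json.dumps(info, indent=2))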


🎓 Fine-Tuning π-Models with OpenPI

Train state-of-the-art vision-language-action models on your point cloud datasets.

📊 Step 1: Compute Normalization Statistics

cd openpi
uv run scripts/compute_norm_stats.py --config-name pi0_libero

This generates normalization statistics required for training.
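
The statistics are per-dimension summaries (e.g. mean and standard deviation) of the state and action features, used to standardize model inputs during training and to un-normalize predicted actions at inference. A minimal numpy sketch of the idea (openpi computes and stores these for you; this is not its implementation):

import numpy as np

actions = np.random.randn(10_000, 7).astype(np.float32)  # stand-in for every action in the dataset
mean, std = actions.mean(axis=0), actions.std(axis=0)

def normalize(a):
    return (a - mean) / (std + 1e-6)    # applied before the model sees the data

def unnormalize(a):
    return a * (std + 1e-6) + mean      # applied to the model's predicted actions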

🚀 Step 2: Fine-Tune π₀ (JAX)

XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 \
uv run scripts/train.py pi0_libero \
  --exp-name spatialpi_pi0_libero \
  --overwrite

⚠️ Note: If you store the converted dataset in a non-default location, update the data path in openpi/src/openpi/training/config.py accordingly.


🧪 Evaluation Workflow

Run evaluation with a two-terminal setup: one for the simulation environment, one for the policy server.

🖥️ Terminal 1: Simulation Environment

cd openpi

# Create evaluation environment
uv venv --python 3.9 examples/libero/.venv
source examples/libero/.venv/bin/activate

# Install dependencies
uv pip sync \
  examples/libero/requirements.txt \
  third_party/libero/requirements.txt \
  --extra-index-url https://download.pytorch.org/whl/cu113 \
  --index-strategy=unsafe-best-match

uv pip install -e packages/openpi-client
uv pip install -e third_party/libero

# Install Python 3.9-compatible VGGT
git clone git@github.com:sorceressyidi/vggt.git
cd vggt && uv pip install -e . && cd ..

# Set environment
export PYTHONPATH=$PYTHONPATH:$(pwd)/third_party/libero

# Run simulation
python examples/libero/main_vggt.py

# If EGL errors occur:
MUJOCO_GL=glx python examples/libero/main_vggt.py

🔌 Terminal 2: Policy Server

cd openpi
uv run scripts/serve_policy.py --env LIBERO

The server streams actions via WebSocket to the simulation environment.
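
For reference, the simulation side talks to the server through the openpi-client package; a minimal sketch of that interaction (examples/libero/main_vggt.py already handles this, and the observation keys below are placeholders rather than the exact schema it sends):

import numpy as np
from openpi_client import websocket_client_policy

# Connect to the policy server started in Terminal 2 (host/port are assumptions).
client = websocket_client_policy.WebsocketClientPolicy(host="localhost", port=8000)

observation = {
    "observation/image": np.zeros((224, 224, 3), dtype=np.uint8),  # placeholder camera frame
    "prompt": "put the bowl on the plate",                          # placeholder task instruction
}
result = client.infer(observation)   # one request/response round trip over the WebSocket
action_chunk = result["actions"]     # chunk of actions to execute in the simulator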

📈 Evaluation Results

We fine-tuned the π₀ base model on the LIBERO dataset, with and without point clouds, and evaluated on the LIBERO benchmark. Success rates are shown below; the @ 5K / @ 30K suffix is the number of fine-tuning steps.

Model        | LIBERO Spatial | LIBERO Object | LIBERO Goal | LIBERO 10 | Average
π₀_pc @ 5K   | 81.8%          | 93.4%         | 73.8%       | 55.2%     | 76.5%
π₀ @ 5K      | 76.4%          | 92.2%         | 70.8%       | 54.2%     | 73.4%
π₀_pc @ 30K  | 98.2%          | 97.4%         | 94.0%       | 89.6%     | 94.8%
π₀_libero    | 96.8%          | 98.0%         | 93.4%       | 81.0%     | 92.3%
π₀_base      | 0.0%           | 0.0%          | 0.0%        | 0.0%      | 0.0%

πŸ™ Acknowledgments

We gratefully acknowledge the following projects and teams:

  • Physical Intelligence for the amazing OpenPI framework and π-models
  • Meta AI for VGGT depth estimation
  • LIBERO Team for the comprehensive manipulation benchmark
  • HuggingFace for LeRobot and dataset infrastructure

Built with ❤️ for the robotics community

⭐ Star us on GitHub • 📖 Documentation • 🐛 Report Issues
