This is the source code release for the paper Leveraging Demonstrations with Latent Space Priors. It contains pre-training code (VAE, latent space prior, low-level policy) and code to train high-level policies with different prior integrations. Pre-trained models are provided as well (see below).
Install PyTorch according to the official instructions, for example in a new conda environment. This code-base was tested with PyTorch 1.10 and 1.12. Python 3.9 is required.
Install Python dependencies via
pip install -r requirements.txtWe provide a pre-trained low-level policy which was trained with MuJoCo 2.0. For maximum compatibility, we recommend to perform a manual install of the 2.0.0 which is available at roboti.us, along with an unrestricted license key.
For pre-training, you will have to build the C++ extension, which in turn
requires cmake and pybind11. These can be installed via conda or your
system-wide package manager of choice. You'll also require the CUDA
SDK. The extension can be built as follows:
cmake -S cpp -B cpp/build -DCMAKE_BUILD_TYPE=Release \
-DMUJOCO_INCLUDE_DIR=${HOME}/.mujoco/mujoco200_linux/include \
-DMUJOCO_LIBRARY=${HOME}/.mujoco/mujoco200_linux/bin/libmujoco200.so
make -C cpp/buildNaturally, the MuJoCo location in the above command might have to be adjusted.
For optimal performance, we also recommend installing NVidia's PyTorch extensions.
NOTE Depending on your MuJoCo installation, rendering environments may
require additional configuration. For all the commands below, passing
video=null eval.video=null on the command-line disables video generation.
Download and extract the pre-trained models:
wget https://dl.fbaipublicfiles.com/latent-space-priors/pretrained-models.tar.gz
tar xvzf pretrained-models.tar.gzExperiments from the paper can be launched as follows:
# High-level policy training without a prior
python train.py -cn zprior_explore policy_lo=pretrained/policy-lo.pt prior=null
# z-prior Explore
python train.py -cn zprior_explore policy_lo=pretrained/policy-lo.pt prior=pretrained/prior.pt
# z-prior Regularize
python train.py -cn zprior_regularize policy_lo=pretrained/policy-lo.pt prior=pretrained/prior.pt
# z-prior Options
python train.py -cn zprior_options policy_lo=pretrained/policy-lo.pt prior=pretrained/prior.pt
# Planning
python train.py -cn zprior_plan policy_lo=pretrained/policy-lo.pt prior=pretrained/prior.ptMocap data preparation
You'll need to download the SMPL-H model from smpl.is.tue.mpg.de and place it in hucc/envs/assets like so:
hucc/envs/assets/smplh/
├── LICENSE.txt
├── female
│ └── model.npz
├── info.txt
├── male
│ └── model.npz
└── neutral
└── model.npz
The AMASS data can be obtained at
amass.is.tue.mpg.de. First, extract the
CMU.tar.bz2 archive to a location of choice. Afterwards, prepare a training
data file as follows:
find <path-to-cmu-clips> -name '*_poses.npz' | PYTHONPATH=$PWD python scripts/gen_amass_jpos.py -f - --subset cmu_locomotion_small --output-path lcs2-train.h5
find <path-to-cmu-clips> -name '*_poses.npz' | PYTHONPATH=$PWD python scripts/gen_amass_jpos.py -f - --subset cmu_locomotion_small_valid --output-path lcs2-valid.h5
python scripts/merge_h5_corpora.py --output-path lcs2.h5 --train lcs2-train --valid lcs2-valid --test ''VAE training
Let's specify a dedicated job directory (hydra.run.dir) so that we can easily
refer to the resulting checkpoint and configuration.
python train_zprior.py -cn lcs2_vae dataset.path=$PWD/lcs2.h5 hydra.run.dir=vaePrior training
As a first step, the training data file has to be augmented with latent states and distributions from the VAE:
PYTHONPATH=$PWD python scripts/add_latents_to_hdf5.py --checkpoint vae/checkpoint.pt lcs2.h5 lcs2-with-latents.h5The prior can then be trained as follows:
python train_zprior.py -cn lcs2_prior dataset.path=$PWD/lcs2-with-latents.h5 hydra.run.dir=priorLow-level policy training
The low-level policy training code requires a dedicated data file which can be produced as follows:
find <path-to-cmu-clips> -name '*_poses.npz' | PYTHONPATH=$PWD python scripts/gen_amass_mjbox.py -f - --subset cmu_locomotion_small --checkpoint vae/checkpoint.pt --output-path lcs2-lo.h5Training the policy itself is computationally demanding. In our experiments, we use 2 machines, each consisting of 80 CPUs and 8 GPUs. We provide several configurations:
# Single-GPU training
python train.py -cn lcs2_policy_lo ref_path=lcs2-lo.h5 hydra.run.dir=policy_lo
# Multi-GPU training, single machine
python train_dist.py -cn lcs2_policy_lo ref_path=lcs2-lo.h5 agent.distributed.size=<num-gpus> hydra.run.dir=policy_lotrain_dist.py supports multi-machine training with slurm. Manual invocation is
also possible: the example below expects two 8-GPU machines. A shared
directory is required to perform the initial rendez-vous. The following can then
be run on each machine (where <machine-id> is either 0 or 1):
# Multi-GPU training on 2 machines, on each machine:
SLURM_NODEID=<machine-id> SLURM_NNODES=2 SLURM_JOBID=<random-string> \
python train_dist.py -cn lcs2_policy_lo ref_path=lcs2-lo.h5 \
agent.distributed.size=16 \
agent.distributed.rdvu_path=<shared_directory> \
hydra.run.dir=policy_loEnvironments can be selected with the env.name option. The table below shows
environment names as used in the paper:
| Environment | env.name |
|---|---|
| GoToTargets | BiskGoToTargetsC-v1 |
| Gaps | BiskGapsC-v1 |
| Butterflies | BiskButterfliesC-v1 |
| Stairs | BiskStairsContC-v1 |
The majority of latent-space-priors is licensed under CC-BY-NC, however portions of the project are available under separate license terms: the transformer modeling code in hucc/models/prior.py is originally licensed under the Apache 2.0 license; the gym vector environment code in hucc/envs/thmp_vector_env.py is license under the MIT license.