BFM-Zero: A Promptable Behavioral Foundation Model for Humanoid Control Using Unsupervised Reinforcement Learning

BFM-Zero Deployment Stack

Deployment stack for BFM-Zero on the Unitree G1 (Jetson Orin)

This repository provides a complete deployment solution for running BFM-Zero policies on the Unitree G1 robot. It includes everything you need to test in simulation and deploy to the real robot.

📦 What's Included

  • ONNX Policy Runner (rl_policy/bfm_zero.py, exposing BFMZeroPolicy)
  • Robot & Policy Configuration files under config/
  • Experiment Configs for tracking, reward inference, and goal reaching (config/exp/)

🚀 Quick Start

Environment Setup

  1. Create and activate the runtime environment (Python 3.10 recommended):

    conda create -n bfm0real python=3.10 -y
    conda activate bfm0real
  2. Install Python dependencies:

    cd BFM-Zero
    pip install -r requirements.txt

Downloading the BFM-Zero ONNX Model

Run:

python download_hf_model.py --token <YOUR_HF_TOKEN>

You can either pass your Hugging Face token as a flag (as above), or set it via the HF_TOKEN environment variable:

export HF_TOKEN=your_token_here
python download_hf_model.py
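
If you prefer to script the download step yourself, the snippet below is a minimal sketch using huggingface_hub; the repo id is a placeholder, so check download_hf_model.py for the actual repository it pulls from:

import os
from huggingface_hub import snapshot_download

# Minimal sketch; "LeCAR-Lab/BFM-Zero" is a placeholder repo id.
# See download_hf_model.py for the repository it actually downloads.
snapshot_download(
    repo_id="LeCAR-Lab/BFM-Zero",
    local_dir="model",
    token=os.environ.get("HF_TOKEN"),
)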

After downloading the model, your directory structure should look like this:

model/
├── exported/
│   └── *.onnx              # ONNX policy model
├── tracking_inference/
│   └── *.pkl               # Latent variables for tracking tasks
├── reward_inference/
│   └── *.pkl               # Latent variables for reward inference tasks
└── goal_inference/
    └── *.pkl               # Latent variables for goal reaching tasks

Directory Structure:

  • exported/model.onnx - The main ONNX policy model file
  • tracking_inference/*.pkl - Pre-computed latent variables for motion tracking (shape: [seq_length, 256])
  • reward_inference/*.pkl - Pre-computed latent variables for reward inference tasks
  • goal_inference/*.pkl - Pre-computed latent variables for goal reaching tasks
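
To sanity-check a downloaded file, you can load it directly; this assumes each .pkl deserializes to an array-like object with the [seq_length, 256] shape described above (the filename is the tracking example used later in this README):

import pickle
import numpy as np

# Load one tracking latent file and confirm its shape.
with open("model/tracking_inference/zs_walking.pkl", "rb") as f:
    zs = pickle.load(f)

zs = np.asarray(zs)
print(zs.shape)  # expected: (seq_length, 256)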

🧪 Sim-to-Sim Test Workflow

⚠️ Highly Recommended Before Real Robot Deployment
This step is not required on the Jetson if you only plan to run on the real robot, but we strongly recommend validating every motion in simulation first.

Step 1: Launch the Simulation (MuJoCo)

Standard (Linux/Windows):

python -m sim_env.base_sim \
  --robot_config=./config/robot/g1.yaml \
  --scene_config=./config/scene/g1_29dof.yaml

macOS (with MuJoCo's mjpython):

mjpython -m sim_env.base_sim \
  --robot_config=./config/robot/g1.yaml \
  --scene_config=./config/scene/g1_29dof.yaml

You should now see the MuJoCo simulation window:

🎮 Interactive Simulation Commands
  • 7 - Lift the robot
  • 8 - Lower the robot
  • 9 - Release the robot

Step 2: Start the Policy Process

In a separate terminal, activate the environment and navigate to the project:

conda activate bfm0real
cd BFM-Zero

Run the policy:

python rl_policy/bfm_zero.py \
  --robot_config config/robot/g1.yaml \
  --policy_config ${POLICY_CONFIG} \
  --model_path ${MODEL_ONNX_PATH} \
  --task ${TASK}

We provide three types of tasks: tracking, reward inference, and goal reaching.

⌨️ Interactive Terminal Commands

| Key | Action | Task Type |
|-----|--------|-----------|
| ] | Start policy action (stops at the stop frame) | All |
| [ | Start tracking | Tracking |
| p | Stop at the stop frame | Tracking |
| n | Switch to the next reward/goal | Reward/Goal |
| i | Reset robot to initial position | All |
| o | Emergency stop (release all control) | All |

📋 Task-Specific Instructions

🎯 Tracking

./rl_policy/tracking.sh

Configuration: config/exp/tracking/<your-motion>.yaml

| Parameter | Description | Example |
|-----------|-------------|---------|
| ctx_path | .pkl file containing seq_length latent variables, shape [seq_length, 256] | ../tracking_inference/zs_walking.pkl |
| start | Start frame index in the context sequence | 0 |
| end | End frame index (set to None for continuous tracking) | 2000 or None |
| gamma | Discount factor for context averaging; controls the weight of recent vs. older frames | 0.8 |
| window_size | Number of frames to average over; larger = smoother but more delayed | 3 |

💡 Tip: If you generated the latent z with a discounted window, set window_size to 1.
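
To make the gamma / window_size interaction concrete, here is a hypothetical sketch of discounted context averaging; the exact weighting in rl_policy/bfm_zero.py may differ, but the idea is that each frame's latent is blended with its recent history, with older frames down-weighted by gamma:

import numpy as np

def discounted_context(zs, t, gamma=0.8, window_size=3):
    """Discounted average of the latents in the window ending at frame t.

    zs: [seq_length, 256] context sequence. The newest frame gets weight 1;
    each step further back is scaled by another factor of gamma.
    """
    window = zs[max(0, t - window_size + 1): t + 1]        # [w, 256]
    weights = gamma ** np.arange(len(window) - 1, -1, -1)  # oldest ... newest
    return (weights[:, None] * window).sum(axis=0) / weights.sum()

With window_size=1 this reduces to the raw per-frame latent, which is why the tip above applies when the latents were already generated with a discounted window.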


🎁 Reward Inference

./rl_policy/reward.sh

Configuration: config/exp/reward/<your-rewards>.yaml

| Parameter | Description |
|-----------|-------------|
| ctx_path | Path to the reward inference .pkl file containing reward-specific latent variables |
| selected_rewards_filter_z | List of dictionaries specifying which rewards and z indices to use |

Example Configuration:

selected_rewards_filter_z:
  - reward: "raisearms-m-l"
    z_ids: [3]
  - reward: "sitonground"
    z_ids: [3, 4]

Each entry selects specific latent variables z for a given reward; different subsampled buffers are used to infer different zs.
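
As a hypothetical illustration of how selected_rewards_filter_z could resolve into concrete latents (the filename and dict layout are assumptions; the real loading logic lives in rl_policy/bfm_zero.py):

import pickle
import numpy as np

# Placeholder filename; assumed to map reward name -> array of candidate zs.
with open("model/reward_inference/rewards.pkl", "rb") as f:
    ctx = pickle.load(f)

selected_rewards_filter_z = [
    {"reward": "raisearms-m-l", "z_ids": [3]},
    {"reward": "sitonground", "z_ids": [3, 4]},
]

zs = []
for entry in selected_rewards_filter_z:
    candidates = np.asarray(ctx[entry["reward"]])  # assumed shape: [num_zs, 256]
    zs.extend(candidates[i] for i in entry["z_ids"])
print(f"selected {len(zs)} latents")  # 3 with the config above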


🎯 Goal Reaching

./rl_policy/goal.sh

Configuration: config/exp/goal/<your-goal-states>.yaml

| Parameter | Description |
|-----------|-------------|
| ctx_path | Path to the goal inference .pkl file containing goal-specific latent variables |
| selected_goals | List of goal names to select from the context dictionary |

Example Configuration:

selected_goals: [
  'walk3_subject3_9044',    # on your knees
  'fight1_subject3_5559',   # hands on hip
  'dance1_subject3_4024',   # raise arm high
]
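
Analogously, a hedged sketch of how selected_goals might index into the goal-inference context, assuming the .pkl is a dict mapping goal names to latent vectors (the filename is a placeholder):

import pickle

# Placeholder filename; assumed to map goal name -> latent vector.
with open("model/goal_inference/goals.pkl", "rb") as f:
    ctx = pickle.load(f)

selected_goals = [
    "walk3_subject3_9044",   # on your knees
    "fight1_subject3_5559",  # hands on hip
    "dance1_subject3_4024",  # raise arm high
]
goal_zs = [ctx[name] for name in selected_goals]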

🤖 On-Robot Deployment (Jetson Orin, Unitree G1)

‼️Alert & Disclaimer

Deploying these models on physical hardware can be hazardous, and these models can ONLY BE DEPLOYED ON G1 Type 10, 11, 12, 15, or 16. Unless you have deep sim‑to‑real expertise and robust safety protocols, we strongly advise against running the model on real robots. These models are supplied for research use only, and we disclaim all responsibility for any harm, loss, or malfunction arising from their deployment.

Required Setup

Build the SDK with CMake

git clone https://github.com/EGalahad/unitree_sdk2.git
sudo apt-get update
sudo apt-get install build-essential cmake python3-dev python3-pip
pip3 install pybind11 pybind11-stubgen numpy
cd ./unitree_sdk2
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -Dpybind11_DIR=<your-pybind11-path> # run `python3 -m pybind11 --cmakedir` to find the path
make -j$(nproc)
cd ~ # clone cyclonedds into your home directory so the CYCLONEDDS_HOME path below matches
git clone https://github.com/eclipse-cyclonedds/cyclonedds -b releases/0.10.x
cd cyclonedds && mkdir build install && cd build
cmake .. -DCMAKE_INSTALL_PREFIX=../install
cmake --build . --target install
echo "export CYCLONEDDS_HOME=/home/<username>/cyclonedds/install">>~/.bashrc. # if your cyclonedds is not in root directory, change this path accordingly
source ~/.bashrc

Running on the Real Robot

  • In ./rl_policy/bfm_zero.py, replace "/path/to/your/unitree_sdk2/build/lib" with the actual path to your Unitree SDK2 installation (e.g., sys.path.append("/home/unitree/User/unitree_sdk2/build/lib")).

  • When deploying to the real robot, use the real-robot configuration file g1_real.yaml instead of the simulation configuration g1.yaml.

    python rl_policy/bfm_zero.py \
      --robot_config config/robot/g1_real.yaml \
      --policy_config ${POLICY_CONFIG} \
      --model_path ${MODEL_ONNX_PATH} \
      --task ${TASK}

Known Issues

  • If you encounter an error stating that eth0 is not a valid network interface, update the interface name in config/robot/g1_real.yaml to match your robot's actual network interface (e.g., eth1).

  • If you encounter severe jittering or unstable performance on the real robot, make sure inference runs on the full GPU and that there are no network issues, so that inference latency stays around 4–5 ms. Also verify that your robot is type 10, 11, 12, 15, or 16 with the N7520-22.5 hip pitch motor.
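
To sanity-check inference latency before blaming the network, you can time the ONNX model directly with onnxruntime; the input construction below is a rough sketch that fills any dynamic dimensions with 1 (adjust if the policy expects multiple inputs):

import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "model/exported/model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
inp = sess.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in inp.shape]  # 1 for symbolic dims
x = np.zeros(shape, dtype=np.float32)

for _ in range(20):  # warm-up
    sess.run(None, {inp.name: x})

n = 200
t0 = time.perf_counter()
for _ in range(n):
    sess.run(None, {inp.name: x})
print(f"mean latency: {(time.perf_counter() - t0) / n * 1e3:.2f} ms")

If this already reports well above 5 ms on the Jetson, the bottleneck is on-device inference (e.g., the CUDA provider not being used) rather than the network.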

👥 Citation

If you find this project useful in your research, please consider citing:

@misc{li2025bfmzeropromptablebehavioralfoundation,
      title={BFM-Zero: A Promptable Behavioral Foundation Model for Humanoid Control Using Unsupervised Reinforcement Learning}, 
      author={Yitang Li and Zhengyi Luo and Tonghe Zhang and Cunxi Dai and Anssi Kanervisto and Andrea Tirinzoni and Haoyang Weng and Kris Kitani and Mateusz Guzek and Ahmed Touati and Alessandro Lazaric and Matteo Pirotta and Guanya Shi},
      year={2025},
      eprint={2511.04131},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2511.04131}, 
}

This sim-to-real repo is built upon:

@misc{weng2025hdmilearninginteractivehumanoid,
      title={HDMI: Learning Interactive Humanoid Whole-Body Control from Human Videos}, 
      author={Haoyang Weng and Yitang Li and Nikhil Sobanbabu and Zihan Wang and Zhengyi Luo and Tairan He and Deva Ramanan and Guanya Shi},
      year={2025},
      eprint={2509.16757},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2509.16757}, 
}

@article{he2025asap,
  title={ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills},
  author={He, Tairan and Gao, Jiawei and Xiao, Wenli and Zhang, Yuanhang and Wang, Zi and Wang, Jiashun and Luo, Zhengyi and He, Guanqi and Sobanbabu, Nikhil and Pan, Chaoyi and Yi, Zeji and Qu, Guannan and Kitani, Kris and Hodgins, Jessica and Fan, Linxi "Jim" and Zhu, Yuke and Liu, Changliu and Shi, Guanya},
  journal={arXiv preprint arXiv:2502.01143},
  year={2025}
}

License

BFM-Zero is licensed under the CC BY-NC 4.0 license. See LICENSE for details.
