
HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery

This is the official PyTorch implementation of the paper: [arXiv] [project page]

(Teaser figure)

Requirements

We tested our code with Python 3.8 on Ubuntu 20.04 LTS.

Please refer to environment/pip_freeze.txt for the specific versions we used.
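
If you prefer not to use a container, the same environment should be reproducible with pip inside a Python 3.8 virtual environment:

pip install -r environment/pip_freeze.txt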

You can also use Singularity to replicate our environment:

singularity build environment/HeatFormer.sif environment/HeatFormer.def
singularity run --nv environment/HeatFormer.sif

Data preparation

SMPL: Download the SMPL layers from here (male & female) and here (neutral). You can download additional data for SMPL from here. Place the data in a directory as follows.

${HeatFormer root}
|-- data
    |-- base_data
        |-- SMPL_MALE.pkl
        |-- SMPL_FEMALE.pkl
        |-- SMPL_NEUTRAL.pkl
        |-- J_regressor_body25.npy
        |-- J_regressor_extra.npy
        |-- J_regressor_h36m.npy
        |-- J_regressor_h36m_correct.npy
        |-- smpl_mean_params.npz
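
Once the files are in place, a quick way to verify them is to load the neutral model and regress Human3.6M joints from the mesh. This is a minimal sketch (not part of the repository code), assuming the smplx package and its dependencies are installed:

# Sanity check for the SMPL assets; smplx resolves SMPL_NEUTRAL.pkl
# inside the given directory.
import numpy as np
import torch
import smplx

model = smplx.SMPL(model_path='data/base_data', gender='neutral')
out = model(betas=torch.zeros(1, 10),
            body_pose=torch.zeros(1, 69),
            global_orient=torch.zeros(1, 3))
vertices = out.vertices[0]  # (6890, 3) T-pose mesh

# Regress Human3.6M joints; the regressor is expected to be (17, 6890).
J = torch.from_numpy(np.load('data/base_data/J_regressor_h36m.npy')).float()
joints_h36m = J @ vertices  # (17, 3)
print(vertices.shape, joints_h36m.shape)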

Datasets

  1. Human3.6M
  2. MPI-INF-3DHP
  3. BEHAVE

Except for images, we provide preprocessed data for all datasets. You can download it from Google Drive and place it following each dataset's instructions below (a quick check for loading these files is sketched after the list):

  1. Human3.6M: You can register and download the dataset from link. The Human3.6M dataset can be preprocessed with H36M-Toolbox. After preprocessing the data, or after downloading our preprocessed data, place it in the following data directory.
${HeatFormer root}
|-- data
    |-- preprocessed_data
        |-- h36m_train_25fps_new_db.pt
        |-- h36m_test_25fps_new_db.pt
        |-- h36m_train_25fps_ex_db.pt
        |-- h36m_test_25fps_ex_db.pt
        |-- extra_data
            |-- Human36M_subject*_camera.json
            |-- Human36M_subject*_joint_3d.json
            |-- Human36M_subject*_SMPL_NeuralAnnot.json
    |-- dataset
        |-- images
            |-- s_01_act_02_subact_01_ca_01
            ..
  2. MPI-INF-3DHP: Visit the website of the dataset, download the zip file, and run the scripts. After running the scripts, place the data as follows.
${HeatFormer root}
|-- data
    |-- preprocessed_data
        |-- mpii3d_train_scale12_new_db.pt
        |-- mpii3d_val_scale12_new_db.pt
        |-- mis_fit_mpii3d_train_sampling_5.pt
        |-- mis_fit_mpii3d_train_sampling_10.pt
        |-- j3d_mpi_db.pt
    |-- dataset
        |-- images_mpii3d
            |-- S*
                |-- Seq*
                    |-- video_*
|-- mpi_inf_3dhp
    |-- S*
        |-- Seq*
            |-- camera.calibration
  3. BEHAVE: We use the BEHAVE dataset for evaluation. Visit BEHAVE, download the data, and place it in the following directory structure.
${HeatFormer root}
|-- data
    |-- preprocessed_data
        |-- BEHAVE_train_db.pt
        |-- BEHAVE_valid_db.pt
|-- BEHAVE
    |-- sequences
    |-- calibs
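
As referenced above, each preprocessed *_db.pt file can be checked right after download. This is a minimal sketch assuming the files are ordinary torch.save archives (the stored key names are not documented here, so it only prints whatever the file contains):

# Quick check that a preprocessed database loads.
import torch

db = torch.load('data/preprocessed_data/h36m_train_25fps_new_db.pt')
if isinstance(db, dict):
    for key, value in db.items():
        print(key, getattr(value, 'shape', type(value)))
else:
    print(type(db))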

Training

Following Data preparation, download the datasets and the pretrained models (including ViT) from Google Drive, then start training with:

Iteration: 3
python train.py --cfg asset/train_iter3.yaml --gpu 0

Iteration: 4
python train.py --cfg asset/train_iter4.yaml --gpu 0
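
The hyperparameters live in the asset/*.yaml file passed via --cfg. To inspect a configuration before launching, a minimal sketch (assuming a standard YAML file):

# Print the training configuration.
import yaml

with open('asset/train_iter3.yaml') as f:
    cfg = yaml.safe_load(f)
print(yaml.dump(cfg, default_flow_style=False))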

Evaluation

Download the datasets (Human3.6M, MPI-INF-3DHP, BEHAVE) and the pretrained models. Then run the following commands for each dataset:

Human3.6M

Iteration: 3
python eval.py --cfg asset/eval_iter3.yaml --pretrain lib/models/pretrain/model_best_iter3.pth.tar --align_type pgt --dataset H36M --gpu 0

Iteration: 4
python eval.py --cfg asset/eval_iter4.yaml --pretrain lib/models/pretrain/model_best_iter4.pth.tar --align_type pgt --dataset H36M --gpu 0

MPI-INF-3DHP

Iteration: 3
python eval.py --cfg asset/eval_iter3.yaml --pretrain lib/models/pretrain/model_best_iter3.pth.tar --align_type pgt --dataset MPII3D --gpu 0

Iteration: 4
python eval.py --cfg asset/eval_iter4.yaml --pretrain lib/models/pretrain/model_best_iter4.pth.tar --align_type pgt --dataset MPII3D --gpu 0

BEHAVE

Iteration: 3
python eval_BEHAVE.py --cfg asset/eval_iter3.yaml --pretrain lib/models/pretrain/model_best_iter3.pth.tar --align_type pgt --score 0.3 --gpu 0

Iteration: 4
python eval_BEHAVE.py --cfg asset/eval_iter4.yaml --pretrain lib/models/pretrain/model_best_iter4.pth.tar --align_type pgt --score 0.3 --gpu 0
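
The --pretrain checkpoints can be inspected before running evaluation. A minimal sketch, assuming the .pth.tar files are ordinary torch.save archives:

# Inspect a pretrained checkpoint on CPU.
import torch

ckpt = torch.load('lib/models/pretrain/model_best_iter3.pth.tar', map_location='cpu')
print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))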

Acknowledgements

We thank the authors of the following excellent works for releasing their code, without which our work would not have been possible.

Citing

Please cite the following paper if you use any part of our code or data.

@InProceedings{Ymatsubara_2025_CVPR,
    author    = {Matsubara, Yuto and Nishino, Ko},
    title     = {HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2025},
}
