This repo, together with skeleton2d3d and pose-hg-train (branch `image-play`), holds the code for reproducing the results in the following paper:
Forecasting Human Dynamics from Static Images
Yu-Wei Chao, Jimei Yang, Brian Price, Scott Cohen, Jia Deng
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
Check out the project site for more details.
- The main content of this repo is for implementing training step 3 (Sec. 3.3), i.e. training the full 3D-PFNet (hourglass + RNNs + 3D skeleton converter).
- For the implementation of training step 1, please refer to the submodule pose-hg-train (branch `image-play`).
- For the implementation of training step 2, please refer to the submodule skeleton2d3d.
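Conceptually, the three training steps above correspond to the three pieces of the full model: the hourglass encodes the input image, the RNNs unroll the pose forecast over time, and the 3D skeleton converter lifts each forecasted 2D pose into 3D. The following Python/NumPy sketch only illustrates this data flow with stub functions; the repo itself is written in Torch7/Lua and MATLAB, and all names, tensor shapes, and the number of forecasted steps below are illustrative assumptions, not the actual implementation.

```python
# Conceptual sketch of the 3D-PFNet forward pass (NOT the repo's Torch7 code).
# Shapes, helper names, and the step count are illustrative assumptions.
import numpy as np

N_STEPS = 16   # number of forecasted frames (assumed)
N_JOINTS = 13  # Penn Action annotates 13 joints

def hourglass(image):
    """Stand-in for the hourglass encoder: image -> feature map."""
    return np.zeros((256, 64, 64))           # assumed feature shape

def rnn_step(features, state):
    """Stand-in for the recurrent unit: propagate features one time step."""
    new_state = features if state is None else 0.5 * (state + features)
    heatmaps = np.zeros((N_JOINTS, 64, 64))   # per-joint 2D heatmaps
    return heatmaps, new_state

def skeleton_2d_to_3d(pose_2d):
    """Stand-in for the 3D skeleton converter: 2D joints -> 3D joints."""
    depth = np.zeros((N_JOINTS, 1))
    return np.hstack([pose_2d, depth])

def forecast(image):
    features = hourglass(image)               # encode the static image once
    state = None
    poses_3d = []
    for _ in range(N_STEPS):                  # unroll the RNN over future time steps
        heatmaps, state = rnn_step(features, state)
        # the argmax of each heatmap gives that joint's 2D location
        pose_2d = np.array([np.unravel_index(h.argmax(), h.shape) for h in heatmaps])
        poses_3d.append(skeleton_2d_to_3d(pose_2d))  # lift 2D pose to 3D
    return np.stack(poses_3d)                 # (N_STEPS, N_JOINTS, 3)

print(forecast(np.zeros((3, 256, 256))).shape)
```

The key point is that a single static image is encoded once, while the recurrent unit produces one set of per-joint heatmaps per future time step, each of which is then lifted to 3D.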
Please cite Image-Play if it helps your research:
```
@INPROCEEDINGS{chao:cvpr2017,
  author = {Yu-Wei Chao and Jimei Yang and Brian Price and Scott Cohen and Jia Deng},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  title = {Forecasting Human Dynamics from Static Images},
  year = {2017},
}
```
This repo contains three submodules (`pose-hg-train`, `skeleton2d3d`, and `Deep3DPose`), so make sure you clone with `--recursive`:

```
git clone --recursive https://github.com/ywchao/image-play.git
```

Contents:

- Download Pre-Computed Models and Prediction
- Dependencies
- Setting Up Penn Action
- Training to Forecast 2D Pose
- Training to Forecast 3D Pose
- Comparison with NN Baselines
- Evaluation
- Human Character Rendering
### Download Pre-Computed Models and Prediction

If you just want to run prediction or evaluation, you can simply download the pre-computed models and prediction (2.4G) and skip the training sections.

```
./scripts/fetch_implay_models_prediction.sh
./scripts/setup_symlinks_models.sh
```

This will populate the `exp` folder with `precomputed_implay_models_prediction` and set up a set of symlinks.
You can now set up Penn Action and run the evaluation demo with the downloaded prediction. This will ensure exact reproduction of the paper's results.
### Dependencies

To proceed with the remaining content, make sure the following are installed:

- Torch7
  - We used commit bd5e664 (2016-10-17) with CUDA 8.0.27 RC and cuDNN v5.1 (cudnn-8.0-linux-x64-v5.1).
  - All our models were trained on a GeForce GTX TITAN X GPU.
- matio-ffi
- torch-hdf5
- MATLAB
- Blender
  - This is only required for human character rendering.
  - We used release blender-2.78a-linux-glibc211-x86_64 (2016-10-26).
### Setting Up Penn Action

The Penn Action dataset is used for training and evaluation.
- Download the Penn Action dataset to `external`. `external` should contain `Penn_Action.tar.gz`. Extract the files:

  ```
  tar zxvf external/Penn_Action.tar.gz -C external
  ```

  This will populate the `external` folder with a folder `Penn_Action` containing `frames`, `labels`, `tools`, and `README`.

- Preprocess Penn Action by cropping the images:

  ```
  matlab -r "prepare_penn_crop; quit"
  ```

  This will populate the `data/penn-crop` folder with `frames` and `labels`.

- Generate the validation set and preprocess the annotations:

  ```
  matlab -r "generate_valid_penn; quit"
  python tools/preprocess.py
  ```

  This will populate the `data/penn-crop` folder with `valid_ind.txt`, `train.h5`, `val.h5`, and `test.h5` (a quick way to inspect these files is sketched after this list).

- Optional: Visualize statistics:

  ```
  matlab -r "vis_data_stats; quit"
  ```

  The output will be saved in `output/vis_dataset`.

- Optional: Visualize annotations:

  ```
  matlab -r "vis_data_anno; quit"
  ```

  The output will be saved in `output/vis_dataset`.

- Optional: Visualize frame skipping. As mentioned in the paper (Sec. 4.1), we generated training and evaluation sequences by skipping frames. The following MATLAB script visualizes a subset of the generated sequences after frame skipping:

  ```
  matlab -r "vis_action_phase; quit"
  ```

  The output will be saved in `output/vis_action_phase`.
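If you want to double-check the preprocessing output, the generated HDF5 files can be inspected from Python. The snippet below is only a convenience sketch; it assumes the `h5py` package is available (it is not a dependency of this repo) and makes no assumption about the dataset keys, it simply lists whatever each file contains.

```python
# List the datasets stored in the preprocessed annotation files (illustrative;
# requires the h5py package, which is not a dependency of this repo).
import h5py

for split in ("train", "val", "test"):
    path = f"data/penn-crop/{split}.h5"
    with h5py.File(path, "r") as f:
        print(path)
        # print each dataset/group name and its shape, if any
        f.visititems(lambda name, obj: print(f"  {name}: {getattr(obj, 'shape', '')}"))
```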
### Training to Forecast 2D Pose

We begin by training a minimal model (hourglass + RNNs) that does only 2D pose forecasting.
- Before starting, make sure to remove the symlinks from the download section, if any:

  ```
  find exp -type l -delete
  ```

- Obtain a trained hourglass model. This is done with the submodule `pose-hg-train`.

  Option 1 (recommended): Download pre-computed hourglass models (50M):

  ```
  cd pose-hg-train
  ./scripts/fetch_hg_models.sh
  ./scripts/setup_symlinks_models.sh
  cd ..
  ```

  This will populate the `pose-hg-train/exp` folder with `precomputed_hg_models` and set up a set of symlinks.

  Option 2: Train your own models.

- Start training:

  ```
  ./scripts/penn-crop/hg-256-res-clstm.sh $GPU_ID
  ```

  The output will be saved in `exp/penn-crop/hg-256-res-clstm`.

- Optional: Visualize training loss and accuracy:

  ```
  matlab -r "exp_name = 'hg-256-res-clstm'; plot_loss_err_acc; quit"
  ```

  The output will be saved to `output/plot_hg-256-res-clstm.pdf`.

- Optional: Visualize prediction on a subset of the test set:

  ```
  matlab -r "vis_preds_2d; quit"
  ```

  The output will be saved in `output/vis_hg-256-res-clstm`.
### Training to Forecast 3D Pose

Now we train the full 3D-PFNet (hourglass + RNNs + 3D skeleton converter), which also converts each 2D pose into 3D.
- Obtain a trained hourglass model if you have not already (see the section above).

- Obtain a trained 3D skeleton converter. This is done with the submodule `skeleton2d3d`.

  Option 1 (recommended): Download pre-computed s2d3d models (108M):

  ```
  cd skeleton2d3d
  ./scripts/fetch_s2d3d_models_prediction.sh
  ./scripts/setup_symlinks_models.sh
  cd ..
  ```

  This will populate the `skeleton2d3d/exp` folder with `precomputed_s2d3d_models_prediction` and set up a set of symlinks.

- Start training:

  ```
  ./scripts/penn-crop/hg-256-res-clstm-res-64.sh $GPU_ID
  ```

  The output will be saved in `exp/penn-crop/hg-256-res-clstm-res-64`.

- Optional: Visualize training loss and accuracy:

  ```
  matlab -r "exp_name = 'hg-256-res-clstm-res-64'; plot_loss_err_acc; quit"
  ```

  The output will be saved to `output/plot_hg-256-res-clstm-res-64.pdf`.

- Optional: Visualize prediction on a subset of the test set. Here we leverage Human3.6M's 3D pose visualization routine.

  First, download the Human3.6M dataset code:

  ```
  cd skeleton2d3d
  ./h36m_utils/fetch_h36m_code.sh
  cd ..
  ```

  This will populate the `skeleton2d3d/h36m_utils` folder with `Release-v1.1`.

  Then run the visualization script:

  ```
  matlab -r "vis_preds_3d; quit"
  ```

  If you run this for the first time, the script will ask you to set two paths. Set the data path to `skeleton2d3d/external/Human3.6M` and the config file directory to `skeleton2d3d/h36m_utils/Release-v1.1`. This will create a new file `H36M.conf` under `image-play`. The output will be saved in `output/vis_hg-256-res-clstm-res-64`.
### Comparison with NN Baselines

This demo reproduces the nearest neighbor (NN) baselines reported in the paper (Sec. 4.1). A conceptual sketch of the retrieval idea is given after the steps below.
- Obtain a trained hourglass model if you have not already (see the section above).

- Run pose estimation on the input images:

  ```
  ./scripts/penn-crop/hg-256.sh $GPU_ID
  ```

  The output will be saved in `exp/penn-crop/hg-256`.

- Run the NN baselines:

  ```
  matlab -r "nn_run; quit"
  ```

  The output will be saved in `exp/penn-crop/nn-all-th09` and `exp/penn-crop/nn-oracle-th09`.

- Optional: Visualize prediction on a subset of the test set:

  ```
  matlab -r "nn_vis; quit"
  ```

  The output will be saved in `output/vis_nn-all-th09` and `output/vis_nn-oracle-th09`.
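For intuition, a nearest-neighbor baseline of this kind retrieves a training sequence whose starting pose is most similar to the pose estimated for the input frame, and reuses the later poses of the retrieved sequence as the forecast. The Python/NumPy sketch below illustrates only that retrieval idea; it is not the repo's MATLAB implementation (`nn_run`), and the pose normalization and distance function are simplifying assumptions (it also ignores details such as joint visibility).

```python
# Illustrative nearest-neighbor forecasting baseline (NOT the repo's MATLAB code).
# Poses are assumed to be arrays of shape (n_joints, 2); the normalization and
# distance choices below are simplifying assumptions.
import numpy as np

def normalize(pose):
    """Center the pose and scale it to unit size for translation/scale invariance."""
    centered = pose - pose.mean(axis=0)
    scale = np.linalg.norm(centered) + 1e-8
    return centered / scale

def nn_forecast(query_pose, train_sequences):
    """Return the future poses of the training sequence whose first pose is closest to the query."""
    q = normalize(query_pose)
    dists = [np.linalg.norm(q - normalize(seq[0])) for seq in train_sequences]
    best = int(np.argmin(dists))
    return train_sequences[best][1:]          # reuse the retrieved sequence's later poses

# Toy usage with random data.
rng = np.random.default_rng(0)
train_sequences = [rng.random((16, 13, 2)) for _ in range(100)]
prediction = nn_forecast(rng.random((13, 2)), train_sequences)
print(prediction.shape)  # (15, 13, 2)
```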
### Evaluation

This demo runs the MATLAB evaluation script and reproduces our results in the paper (Tab. 1 and Fig. 7). If you are using pre-computed prediction and also want to evaluate the NN baselines, make sure to first run step 3 in the last section.

Compute the Percentage of Correct Keypoints (PCK):

```
matlab -r "eval_run; quit"
```

This will print out the PCK values with threshold 0.05 ([email protected]) and also show the PCK curves.
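For reference, PCK counts a predicted keypoint as correct if it lies within a given fraction of a reference size from its ground-truth location. The Python snippet below is a generic illustration of the metric, not the repo's `eval_run` implementation; in particular, using the person bounding-box size as the reference length is an assumption here.

```python
# Generic PCK computation (illustrative; not the repo's eval_run.m).
# pred, gt: (n_samples, n_joints, 2) arrays; visible: (n_samples, n_joints) boolean mask.
# The reference size used for normalization is an assumption (person bbox size here).
import numpy as np

def pck(pred, gt, ref_size, visible, threshold=0.05):
    """Fraction of visible keypoints within threshold * ref_size of the ground truth."""
    dists = np.linalg.norm(pred - gt, axis=-1)          # (n_samples, n_joints)
    correct = dists <= threshold * ref_size[:, None]    # compare against per-sample reference size
    return correct[visible].mean()

# Toy usage with random data.
rng = np.random.default_rng(0)
gt = rng.random((10, 13, 2)) * 100
pred = gt + rng.normal(scale=2.0, size=gt.shape)
ref = np.full(10, 100.0)
vis = np.ones((10, 13), dtype=bool)
print(f"[email protected]: {pck(pred, gt, ref, vis):.3f}")
```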
### Human Character Rendering

Finally, we show how we rendered human characters from the forecasted 3D skeletal poses using the method developed by Chen et al. [4]. This relies on the submodule `Deep3DPose`.
- Obtain forecasted 3D poses by either downloading the pre-computed prediction or generating your own.

- Set the Blender path. Edit the following line in `tools/render_scape.m`:

  ```
  blender_path = '$BLENDER_PATH/blender-2.78a-linux-glibc211-x86_64/blender';
  ```

- Run rendering. We provide demos for rendering both without and with textures.

  Render body shape without textures:

  ```
  matlab -r "texture = 0; render_scape; quit"
  ```

  Render body shape with textures:

  ```
  matlab -r "texture = 1; render_scape; quit"
  ```

  The output will be saved in `output/render_hg-256-res-clstm-res-64`.
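For context, setting `blender_path` is what allows the MATLAB script to invoke Blender; a typical headless invocation is illustrated by the generic Python snippet below. The `--background` and `--python` options are standard Blender command-line flags, but the script name is hypothetical and this is not the repo's actual rendering pipeline, which is driven from `render_scape.m` via Deep3DPose.

```python
# Generic illustration of driving a headless Blender render from a script.
# This is NOT the repo's pipeline (which calls Blender from render_scape.m);
# the Blender binary path and the Blender-side script name are placeholders.
import subprocess

blender = "/path/to/blender-2.78a-linux-glibc211-x86_64/blender"  # same binary as blender_path above
render_script = "render_one_frame.py"  # hypothetical script executed inside Blender

# --background runs Blender without a UI; --python executes the given script in Blender's Python.
subprocess.run([blender, "--background", "--python", render_script], check=True)
```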