
TMR++

A Cross-Dataset Study for Text-based 3D Human Motion Retrieval

Léore Bensabath · Mathis Petrovich · Gül Varol


Description

Official PyTorch implementation of the paper TMR++: A Cross-Dataset Study for Text-based 3D Human Motion Retrieval.

This repo is based on the implementation of TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis.

Please visit our webpage for more details.

Bibtex

If you find this code useful in your research, please cite:

@inproceedings{lbensabath2024,
    title={TMR++: A Cross-Dataset Study for Text-based 3D Human Motion Retrieval},
    author={Bensabath, Léore and Petrovich, Mathis and Varol, G{\"u}l},
    booktitle={CVPRW HuMoGen},
    year={2024}
}

and

@inproceedings{petrovich23tmr,
    title     = {{TMR}: Text-to-Motion Retrieval Using Contrastive {3D} Human Motion Synthesis},
    author    = {Petrovich, Mathis and Black, Michael J. and Varol, G{\"u}l},
    booktitle = {International Conference on Computer Vision ({ICCV})},
    year      = 2023
}

If the code is useful to you, you can also give the repo a star ⭐.

Installation 👷

Create environment

Create a Python virtual environment:

python -m venv ~/.venv/TMR
source ~/.venv/TMR/bin/activate

Install PyTorch

python -m pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118

Then install the remaining packages:

python -m pip install -r requirements.txt

This installs the following packages: pytorch_lightning, einops, hydra-core, hydra-colorlog, orjson, tqdm, and scipy. The code was tested with Python 3.10.12 and PyTorch 2.0.1.

Set up the datasets

Please first set up the datasets as explained in the corresponding section of the TMR README: https://github.com/Mathux/TMR/tree/master.

In this repo, we provide augmented versions of the humanml3d, kitml and babel datasets. For a given dataset ($DATASET), up to three new annotation files have been created (a loading sketch follows below):

  • dataset/annotations/$DATASET/annotations_paraphrases.json: Includes all the paraphrases generated by an LLM
  • dataset/annotations/$DATASET/annotations_actions.json: For humanml3d and kitml only, includes the action type labels generated by an LLM
  • dataset/annotations/$DATASET/annotations_all.json: Includes a concatenation, by key id, of all the annotations (original and LLM-generated)

Copy the data into your repo from here.
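
Once the data is in place, you can sanity-check one of these files directly. A minimal sketch, assuming each file maps key ids to annotation entries (the exact structure of an entry is not documented here, so inspect one yourself):

import json

# Any of humanml3d, kitml or babel works the same way.
dataset = "humanml3d"
path = f"dataset/annotations/{dataset}/annotations_paraphrases.json"

with open(path) as f:
    annotations = json.load(f)

print(f"{len(annotations)} entries")
first_key = next(iter(annotations))  # peek at one entry to see its structure
print(first_key, annotations[first_key])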

Compute the text embeddings for the data with text augmentation

Run this command to compute the sentence embeddings and token embeddings for the annotations with text augmentation:

python -m prepare.text_embeddings --config-name=text_embeddings_with_augmentation data=$DATASET
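
Example, for the humanml3d dataset:

python -m prepare.text_embeddings --config-name=text_embeddings_with_augmentation data=humanml3d
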
Combine datasets

To create a combination of any of the datasets, run:

python -m prepare.combine_datasets datasets=$DATASETS test_sets=$TEST_DATASETS split_suffix=$SPLIT_SUFFIX [OPTIONS]

Where:

  • datasets: The list of datasets to combine
  • test_sets: The list of datasets the combination is intended to be tested on. When generating the split files, this filters out of the training set any samples from the training datasets that overlap with samples from one of the given test datasets. Note that you can create different splits for different intended test sets by leveraging the split_suffix parameter. The annotation file for a given combination stays the same regardless of the test_sets value.
  • split_suffix: The split file suffix for this given combination of test sets. Training and validation split files will be saved under datasets/annotations/splits/train{split_suffix}.txt and datasets/annotations/splits/val{split_suffix}.txt

The new dataset will be created inside the folder datasets/annotations/{dataset1}_{dataset2}(_{dataset3}).

Example:

python -m prepare.combine_datasets datasets=["humanml3d","kitml"] test_sets=["babel"] split_suffix="_wo_hkb"

Then run the python -m prepare.text_embeddings command, with or without text augmentation, on your new dataset combination.

Example:

python -m prepare.text_embeddings --config-name=text_embeddings_with_augmentation data=humanml3d_kitml

Training 🚀

Training with a combination of datasets

To train with a combination of datasets without any text augmentation, run the same command as in TMR with the relevant dataset name:

Example:

python train.py data=humanml3d_kitml

Training with text augmentation

python train.py --config-name=train_with_augmentation data=$DATASET
Details

Relevant parameters you can modify, in addition to the ones in TMR, are the text augmentation picking probabilities detailed in the paper.

Example:

python train.py --config-name=train_with_augmentation data=humanml3d data.paraphrase_prob=0.2 data.summary_prob=0.2 data.averaging_prob=0.3 run_dir=outputs/tmr_humanml3d_w_textAugmentation_0.2_0.2_0.3
Extracting weights

After training, run the following command to extract the weights from the checkpoint:

python extract.py run_dir=$RUN_DIR

It will take the last checkpoint by default. This should create the folder RUN_DIR/last_weights and populate it with the files motion_decoder.pt, motion_encoder.pt and text_encoder.pt. This process makes loading models faster, no longer depends on the file structure, and lets each module be loaded independently. This has already been done for the pretrained models.
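
As an illustration of that independent loading, here is a minimal sketch. It assumes the extracted .pt files are ordinary PyTorch checkpoints readable with torch.load (see extract.py for the exact format); the run directory name is hypothetical:

import torch

run_dir = "outputs/tmr_humanml3d"  # hypothetical run directory

# Each module was extracted to its own file, so it can be loaded on its own,
# without reconstructing the full training checkpoint.
text_encoder_weights = torch.load(f"{run_dir}/last_weights/text_encoder.pt", map_location="cpu")
print(type(text_encoder_weights))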

Pretrained models 📀

You can find the different models used in the paper here: pre-trained models

Evaluation 📊

Motion to text / Text to motion retrieval

python retrieval.py run_dir=$RUN_DIR data=$DATA
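
Example, with a hypothetical run directory, evaluating on kitml:

python retrieval.py run_dir=outputs/tmr_humanml3d data=kitml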

Action recognition

For action recognition on datasets babel_actions_60 and babel_actions_120, run:

python retrieval_action_multi_labels.py run_dir=$RUN_DIR data=$DATA

It will compute the metrics, display them, and save them in the folder RUN_DIR/contrastive_metrics_$DATA/. You can change the name of the output file with the save_file_name argument.
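
Example, again with a hypothetical run directory:

python retrieval_action_multi_labels.py run_dir=outputs/tmr_babel data=babel_actions_60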

Usage 💻

Encode a motion

Note that the .npy file should correspond to HumanML3D Guo features.

python encode_motion.py run_dir=RUN_DIR npy=/path/to/motion.npy
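
If you want to check that a motion file looks like Guo features before encoding it, HumanML3D Guo features have 263 dimensions per frame. A small sketch (not part of the repo; the path is a placeholder):

import numpy as np

motion = np.load("/path/to/motion.npy")
print(motion.shape)  # expected (num_frames, 263) for HumanML3D Guo features
assert motion.ndim == 2 and motion.shape[-1] == 263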

Encode a text

python encode_text.py run_dir=RUN_DIR text="A person is walking forward."

Compute similarity between text and motion

python text_motion_sim.py run_dir=RUN_DIR text=TEXT npy=/path/to/motion.npy

For example with text="a man sets to do a backflips then fails back flip and falls to the ground" and npy=HumanML3D/HumanML3D/new_joint_vecs/001034.npy you should get around 0.96.
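
Conceptually, this similarity is a cosine similarity between the text embedding and the motion embedding in the shared latent space. A standalone numpy sketch of that computation (not the repo's API; the 256-d size and the random embeddings are placeholders):

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder embeddings; in practice they come from the text and motion
# encoders (see encode_text.py and encode_motion.py above).
text_emb = np.random.randn(256)
motion_emb = np.random.randn(256)
print(cosine_similarity(text_emb, motion_emb))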

Launch the demo

Encode the whole motion dataset

python encode_dataset.py run_dir=RUN_DIR

Text-to-motion retrieval demo

Run this command:

python app.py

and then open your web browser at the address: http://localhost:7860.

License 📚

This code is distributed under an MIT LICENSE.

Note that our code depends on other libraries, including PyTorch, PyTorch3D, Hugging Face, and Hydra, and uses datasets, each of which has its own license that must also be followed.
