
Lost in Translation, Found in Context:
Sign Language Translation with Contextual Cues
(CVPR 2025)


🔥 Official Code Repository and Usage Instructions

📄 Paper & Project Links

  • 📘 Paper (arXiv): Official research paper for Lost in Translation, Found in Context.

  • 🌐 Project Page: Detailed project webpage with demos, code, and additional resources.

🚀 Environment

# clone project (clone into LiTFiC so the cd below matches)
git clone https://github.com/art-jang/Lost-in-Translation-Found-in-Context.git LiTFiC
cd LiTFiC

conda create -n myenv python=3.10
conda activate myenv

pip install torch==2.3.1 torchvision==0.18.1 --index-url https://download.pytorch.org/whl/cu121

# [Optional] install the CUDA 12.1 toolkit to match the PyTorch build above
conda install nvidia/label/cuda-12.1.0::cuda-toolkit

# install requirements
pip install -r requirements.txt
conda install bioconda::java-jdk
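
After installation, a quick sanity check (not part of the official instructions) confirms that the installed PyTorch build can see your GPUs:

# print the PyTorch version and whether CUDA is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"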

📥 Data Download for Training and Testing

To train and test the model, please follow the detailed instructions for downloading and preparing the datasets.

You can find the full explanation and configuration details here:
➡️ Dataset Paths and Download Instructions

This guide covers all necessary datasets, including annotations, video features, subtitles, and more.
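
If your data lives outside the repository's default location, Hydra overrides can redirect it. A minimal sketch, assuming this repository keeps the lightning-hydra-template's standard paths.data_dir key (check configs/paths/ to confirm before relying on it):

# hypothetical override; the paths.data_dir key is assumed from the template
python src/train.py trainer.devices=[0] task_name=vid experiment=vid paths.data_dir=/path/to/datasets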

Training with Different Modalities

Use the following commands to train the model with different modality combinations.
Specify which GPUs to use with the trainer.devices option.

# 1) Train using only the video (vid) modality
python src/train.py trainer.devices=[0,1,2,3] task_name=vid experiment=vid

# 2) Train using video + pseudo-gloss captions (pg) modality
python src/train.py trainer.devices=[0,1,2,3] task_name=vid+pg experiment=vid+pg

# 3) Train using video + pseudo-gloss + previous sentence (prev) modality
python src/train.py trainer.devices=[0,1,2,3] task_name=vid+pg+prev experiment=vid+pg+prev

# 4) Train using video + pseudo-gloss + previous sentence + background (bg) modality
python src/train.py trainer.devices=[0,1,2,3] task_name=vid+pg+prev+bg experiment=vid+pg+prev+bg
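
Since training is driven by Hydra, any other config value can be overridden on the command line in the same style. A sketch, assuming the lightning-hydra-template's usual key names (trainer.max_epochs is a standard Lightning Trainer argument; data.batch_size follows the template's convention and may differ in this repo):

# key names assumed from the lightning-hydra-template defaults
python src/train.py trainer.devices=[0,1,2,3] task_name=vid experiment=vid \
  trainer.max_epochs=10 data.batch_size=8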

Viewing Training Logs

If you want to track and visualize training logs using Weights & Biases (W&B),
add the option logger=wandb to your training command. For example:

python src/train.py trainer.devices=[0,1,2,3] task_name=vid experiment=vid logger=wandb
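
W&B requires a one-time authentication on each machine before the logger can upload runs:

# log in once; prompts for the API key from wandb.ai/authorize
wandb login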

Evaluation

To evaluate a trained model, run the following command:

python src/eval.py trainer.devices=[0,1,2,3] \
  task_name=vid+pg+prev+bg-eval \
  experiment=vid+pg+prev+bg \
  ckpt_path={CKPT_PATH}
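
With the lightning-hydra-template defaults, checkpoints are typically written under logs/{task_name}/runs/{timestamp}/checkpoints/. The call below is a sketch with a hypothetical path, assuming this repo keeps the template's default logs/ layout:

# hypothetical checkpoint path, assuming the template's default layout
python src/eval.py trainer.devices=[0,1,2,3] \
  task_name=vid+pg+prev+bg-eval \
  experiment=vid+pg+prev+bg \
  ckpt_path=logs/vid+pg+prev+bg/runs/2025-01-01_00-00-00/checkpoints/last.ckpt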

For LLM-based evaluation, run the following command (an OpenAI API key is required):

python src/llm_eval.py --data_file={CAPTION_FILE}
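
The script talks to the OpenAI API; assuming it uses the official openai Python client, which reads the standard OPENAI_API_KEY environment variable, export the key first:

# hypothetical placeholder value; the openai client picks this variable up automatically
export OPENAI_API_KEY=your_key_here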

Acknowledgements

This repository is built upon the excellent lightning-hydra-template,
which provided a solid foundation for organizing training and configuration workflows.

We also adapted parts of our data loading pipeline from CSLR2,
which was instrumental in handling the BOBSL dataset efficiently.

We would like to thank the authors of these projects for their contributions to the community.

🙌 Please Cite Us

If you find our work useful for your research, please consider citing:

@INPROCEEDINGS{jang2025lost,
  title={Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues},
  author={Jang, Youngjoon and Raajesh, Haran and Momeni, Liliane and Varol, G{\"u}l and Zisserman, Andrew},
  booktitle={CVPR},
  year={2025}
}
