This is the codebase for the following paper, published at WMT 2023:
Tosho Hirasawa, Emanuele Bugliarello, Desmond Elliott and Mamoru Komachi. Visual Prediction Improves Zero-Shot Cross-Modal Machine Translation. In Proceedings of the Eighth Conference on Machine Translation (WMT23). 2023.
See examples/multimodal/README.md to set up the task, then train and evaluate models.