Byungwoo Jeon1 ·
Yoonwoo Jeong2 ·
Hyunseok Lee1 ·
Minsu Cho2,3 ·
Jinwoo Shin1,3
1 KAIST 2 POSTECH 3 RLWRLD
[project page] [arXiv]
This repository is the official implementation of Vision-aligned Latent Reasoning for Multi-modal Large Language Model (VaLR).
- Release inference code
- Release training code and data
- Release evaluation code
@article{jeon2026vision,
  title={Vision-aligned Latent Reasoning for Multi-modal Large Language Model},
  author={Jeon, Byungwoo and Jeong, Yoonwoo and Lee, Hyunseok and Cho, Minsu and Shin, Jinwoo},
  journal={arXiv preprint arXiv:2602.04476},
  year={2026}
}