Vision-aligned Latent Reasoning
for Multi-modal Large Language Model

Byungwoo Jeon¹ · Yoonwoo Jeong² · Hyunseok Lee¹ · Minsu Cho²,³ · Jinwoo Shin¹,³
¹KAIST ²POSTECH ³RLWRLD

[project page] [arXiv]

This repository is the official implementation of Vision-aligned Latent Reasoning for Multi-modal Large Language Model (VaLR).

TODO

- Release inference code
- Release training code and data
- Release evaluation code

BibTeX

@article{jeon2026vision,
  title={Vision-aligned Latent Reasoning for Multi-modal Large Language Model},
  author={Jeon, Byungwoo and Jeong, Yoonwoo and Lee, Hyunseok and Cho, Minsu and Shin, Jinwoo},
  journal={arXiv preprint arXiv:2602.04476},
  year={2026}
}
