LinFeng Li 1,2 Jian Zhao 2 Yuan Xie 1 Xin Tan 1 Xuelong Li 2
1 East China Normal University; 2 The Institute of Artificial Intelligence (TeleAI), China Telecom
The top panel contrasts our End-to-End Goal Navigation paradigm with traditional approaches. Vision-Language Navigation (VLN) relies on dense, step-by-step instructions, and Modular Navigation relies on complex multi-stage pipelines. In contrast, CompassNav directly maps a high-level goal (e.g., "find the plant") to an action through integrated spatial logical reasoning.
The bottom panel details our core contribution, a way to stimulate the model's reasoning ability: a paradigm shift from "Path Imitation" to "Decision Understanding." Traditional methods train agents to replicate a single expert trajectory and penalize any deviation. Our agent instead learns to evaluate the relative quality of all feasible paths at each decision point. This approach cultivates a true "internal compass," enabling the agent to make more intelligent and flexible decisions in unseen environments.
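The difference between the two training signals can be illustrated with a small sketch. This is not the CompassNav reward itself: the action names, the per-action remaining distances, and the min-max scoring rule below are all assumptions chosen only to show the idea of grading every feasible action instead of rewarding a single expert action.

```python
from typing import Dict


def imitation_reward(action: str, expert_action: str) -> float:
    """Path imitation: only the expert's action earns any reward."""
    return 1.0 if action == expert_action else 0.0


def graded_rewards(remaining_distance: Dict[str, float]) -> Dict[str, float]:
    """Decision understanding (illustrative assumption, not the paper's formula).

    Each feasible action is scored by how short its remaining path to the
    goal is relative to the other options: best action -> 1.0, worst -> 0.0.
    """
    best = min(remaining_distance.values())
    worst = max(remaining_distance.values())
    if worst == best:
        # All options are equally good, so none is penalized.
        return {a: 1.0 for a in remaining_distance}
    return {a: (worst - d) / (worst - best) for a, d in remaining_distance.items()}


# Hypothetical decision point: remaining path length (in meters) after each action.
distances = {"forward": 4.0, "turn_left": 5.0, "turn_right": 8.0}

expert = min(distances, key=distances.get)  # the single "expert" choice
imitation = {a: imitation_reward(a, expert) for a in distances}
graded = graded_rewards(distances)

print(imitation)
print(graded)
```

Under imitation, `turn_left` is treated exactly like `turn_right` even though it is nearly as good as the expert's choice. The graded signal keeps that distinction, which is the kind of relative comparison the paragraph above describes.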
- Release Compass-Data-22k
- Release CompassNav-7B
- Release CompassNav training code
- Release CompassNav Object Goal Nav/Instance Image-Goal Nav test code