  • CVLAB @ KAIST AI
  • Seoul, Korea
deep-overflow/README.md

👋 Hi, I'm Seongchan Kim (김성찬)

🤖 **Vision-Language-Action (VLA) & World Models** 🧑‍💻 Integrated M.S./Ph.D. @CVLAB in KAIST AI

I research how models can predict and model physical interactions between agents (robots/humans) and the world.

I am currently building Vision-Language-Action (VLA) models, and I am interested in World Models as a way to understand the fundamental laws of interaction.


🧪 Research Highlights

  • 🌍 World Models & Physical Interaction: Modeling and predicting how the world changes through agent-environment interactions.

  • 🦾 Vision-Language-Action (VLA): Developing embodied AI that understands multi-modal instructions and translates them into physical actions.

  • 🎬 Interaction-Aware Generation: Leveraging generative models to simulate realistic physical dynamics and multi-instance interactions.

  • 🧠 Video Understanding: Utilizing MLLMs for deep temporal reasoning and understanding complex object relationships in video.


📝 Publications

  • Self-Evolving Neural Radiance Fields
    Wild3D Workshop @ ICCV 2025
    🔗 Project Page

  • MUG-VOS: Multi-Granularity Video Object Segmentation
    AAAI 2025
    🔗 Project Page

  • Referring Video Object Segmentation via Language Aligned Track Selection
    arXiv 2025
    🔗 Project Page

  • InterRVOS: Interaction-aware Referring Video Object Segmentation
    CVPR 2026
    🔗 Project Page

  • MATRIX: Mask Track Alignment for Interaction-Aware Video Generation
    ICLR 2026


🌎 Links

✨ “Understanding the World through Video and Multimodalities.”


🔄 Last updated: September 28, 2025 | 💻 Made with ❤️ by Deep Overflow

Pinned

  1. SE-NeRF (Public)

    Forked from cvlab-kaist/SE-NeRF

    [Wild3D @ ICCVW'25] Official implementation of "SE-NeRF: Self-Evolving Neural Radiance Fields"

    1

  2. MUG-VOS (Public)

    Forked from cvlab-kaist/MUG-VOS

    Official Implementation of "Multi-Granularity Video Object Segmentation" (AAAI 2025)

    Python 1

  3. SOLA (Public)

    Forked from cvlab-kaist/SOLA

    Official implementation of "Referring Video Object Segmentation via Language Aligned Track Selection".

    Python 1

  4. InterRVOS (Public)

    Forked from cvlab-kaist/InterRVOS

    Official implementation of "InterRVOS: Interaction-aware Referring Video Object Segmentation".

    Python 1