OmkarThawakar/README.md

Hi, I'm Omkar Thawakar 👋

PhD Researcher · Multimodal AI · Video Understanding · LLMs & Agents
🌐 Portfolio • 📄 Google Scholar • 💼 LinkedIn • ✉️ Email


🔬 About Me

I'm a PhD student at MBZUAI (Abu Dhabi, UAE), advised by Prof. Fahad Khan, working on:

  • 🎥 Video Understanding with Large Multimodal Models (LMMs)
  • 🧠 Multimodal Reasoning: step-by-step visual reasoning, self-evolving AI
  • 🌍 Multilingual & Culturally-Aware LLMs
  • 🤖 LLM/LMM Agents & efficient on-device deployment

Previously a Researcher at MBZUAI and Research Assistant at IIT Ropar.


🏆 Highlights (Recent)

  • 📢 3 papers at CVPR 2026 (2× Findings + 1× Main)
  • ⭐ ICLR 2025 Spotlight (Top 2%): MobiLLaMA
  • 🌟 CVPR 2025 Highlight: All Languages Matter
  • 📦 300K+ HuggingFace downloads across released models
  • 🥇 Khalifa Fund Entrepreneurship Winner: 250K AED grant
  • 🚀 Founder @ Lawa.AI & Nutrigenics.Care

📚 Selected Publications

| Paper | Venue | Links |
|---|---|---|
| EvoLMM: Self-Evolving Large Multimodal Models | CVPR 2026 | Paper |
| CoVR-R: Reason-Aware Composed Video Retrieval | CVPR 2026 | Paper |
| LlamaV-o1: Step-by-Step Visual Reasoning | ACL 2025 | Paper |
| MobiLLaMA: Lightweight Transparent GPT | ICLR 2025 ⭐ Spotlight | Paper · HF 🤗 |
| Composed Video Retrieval via Enriched Context | CVPR 2024 | Paper |
| Video Instance Segmentation (Open-World) | IJCV 2024 | Paper |

Full list → Google Scholar · Portfolio


🛠️ Tech Stack

Python · PyTorch · HuggingFace · AWS · MongoDB · React


🚀 Startups

Lawa.AI: Agentic AI platform for enterprises. Multilingual, privacy-first, on-device. $130K+ yearly revenue · $70K seed grant · Khalifa Fund Winner

Nutrigenics.Care: AI-powered personalized nutrition & health platform. $100K grant funding · Microsoft Founders Hub ($150K support)


📊 GitHub Stats

Pinned

  1. mbzuai-oryx/MobiLlama (Public)

    [ICLR-2025-SLLM Spotlight 🔥] MobiLlama: Small Language Model tailored for edge devices

    Python · 669 stars · 52 forks

  2. mbzuai-oryx/XrayGPT (Public)

    [BIONLP@ACL 2024] XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

    Python · 528 stars · 64 forks

  3. FuzzyMinMax (Public)

    Fuzzy Min Max Neural Network Library

    Python · 28 stars · 4 forks

  4. mbzuai-oryx/ClimateGPT (Public)

    [EMNLP'23] ClimateGPT: a specialized LLM for conversations related to Climate Change and Sustainability topics in both English and Arabic languages.

    Python · 79 stars · 11 forks

  5. Self-Learning-Robot (Public)

    Reinforcement Training of Robot

    Python · 11 stars · 1 fork

  6. composed-video-retrieval (Public)

    Composed Video Retrieval

    Python · 62 stars · 1 fork