Hi, I'm Hani Alomari 👋

I build retrieval systems that understand images, text, video, and sound, not just literal matches.

I'm a PhD researcher at Virginia Tech, working on vision-language models (VLMs), RAG, and ranking/reranking. My focus is multi-prompt (multi-vector) embeddings: many small, controllable "views" of meaning that make search richer, more interpretable, and less prone to collapse.
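The multi-vector idea can be sketched in a few lines. This is a toy illustration only (numpy, random projection heads standing in for learned prompt-conditioned encoders): each input gets several unit-normalized "view" embeddings, and retrieval scores a query against a document by the best-matching view pair.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_VIEWS = 8, 4

# Illustrative per-prompt projection heads: each "view" passes the same base
# embedding through a different matrix (random here; learned in a real system).
view_heads = [rng.standard_normal((DIM, DIM)) for _ in range(N_VIEWS)]

def multi_view_embed(base_vec):
    """Return N_VIEWS unit-normalized embeddings for one input."""
    views = np.stack([head @ base_vec for head in view_heads])
    return views / np.linalg.norm(views, axis=1, keepdims=True)

def max_sim(query_views, doc_views):
    """Late-interaction score: best cosine similarity over all view pairs."""
    return float((query_views @ doc_views.T).max())

query = multi_view_embed(rng.standard_normal(DIM))
docs = [multi_view_embed(rng.standard_normal(DIM)) for _ in range(3)]
scores = [max_sim(query, d) for d in docs]
best = int(np.argmax(scores))
```

Because each view captures a different facet of the input, taking the max over view pairs lets one document match a query on, say, a figurative reading even when its literal reading is far away.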

What I work on

  • Reasoning in vision-language models (VLMs).
  • Cross-modal retrieval across images, text, video, and audio.
  • Structured information extraction from multimodal data.
  • Knowledge representation for multimodal reasoning.
  • Exploring room acoustics (RIRs) as spatial signals for learning geometry-aware representations.

Why it matters

Real-world queries are polysemous: idioms, metaphor, culture, and context often matter more than surface similarity. I design retrieval pipelines that surface the right connections, not only the nearest neighbor.


Projects (quick view)

  • Multi-Prompt Embedding for Retrieval
    • One input -> multiple focused embeddings to boost recall and reduce length/bias collapse.
  • RAG + Reranker for Multimodal Search
    • Lightweight bi-encoder retrieval + VLM reader + cross-encoder reranker for better final ranking.
  • Diversity-Aware VLM Retrieval
    • Retrieves multiple perspectives (literal/figurative/emotional/abstract/background) instead of forcing a single vector.
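A minimal sketch of the retrieve-then-rerank pattern from the RAG project above. All names here are illustrative, and cosine similarity stands in for a real cross-encoder: stage 1 does cheap dot-product search over precomputed embeddings, stage 2 rescores only the short candidate list with a more expensive model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Precomputed unit-normalized corpus embeddings (toy stand-in for a real index).
corpus_emb = rng.standard_normal((100, 16))
corpus_emb /= np.linalg.norm(corpus_emb, axis=1, keepdims=True)

def bi_encoder_retrieve(query_emb, k=10):
    # Stage 1: fast approximate recall via dot-product search.
    scores = corpus_emb @ query_emb
    return np.argsort(scores)[::-1][:k]

def cross_encoder_score(query_emb, doc_id):
    # Stage 2 stand-in: a real cross-encoder would jointly encode the
    # (query, document) pair; here we reuse cosine similarity for brevity.
    return float(corpus_emb[doc_id] @ query_emb)

def retrieve_then_rerank(query_emb, k=10, final=3):
    candidates = bi_encoder_retrieve(query_emb, k)
    reranked = sorted(candidates,
                      key=lambda d: cross_encoder_score(query_emb, d),
                      reverse=True)
    return reranked[:final]

q = rng.standard_normal(16)
q /= np.linalg.norm(q)
top = retrieve_then_rerank(q)
```

The design choice is cost-driven: the bi-encoder scales to the whole corpus because document embeddings are precomputed, while the reranker, too slow to run everywhere, only sees the top-k candidates.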

Tech I use (most often)

  • Languages
  • ML / Data
  • Systems / Tools


Open to collaborations

If you are working on diversity-aware retrieval, interpretable VLMs, or multimodal reasoning benchmarks, let's talk.

How to reach me
