Graduate Student, Computer Science, University of California San Diego
Advisor: Julian McAuley
Research Focus: Multimodal AI, Conversational Recommender Systems, Audio × ML
📧 [email protected]
🔗 LinkedIn | GitHub | Google Scholar | Website
I am a graduate student in Computer Science at UC San Diego, working at the intersection of multimodal AI, LLMs, and music intelligence.
My work focuses on building conversational recommender systems, audio-language alignment models, and scalable multimodal frameworks that connect sound, language, and reasoning.
At the McAuley Lab, I design large-scale models that learn from audio, text, and visual data for retrieval, recommendation, and generation.
My current research involves efficient fine-tuning (LoRA/QLoRA), reinforcement-based data selection, and temporal modeling for multimodal understanding and conversational intelligence.
University of California, San Diego
- M.S. in Computer Science (Machine Learning), 2025–2026, GPA: 4.0
Graduate TA: CSE 153/253 (ML for Music), CSE 158/258 (Recommender Systems and Web Mining)
- B.S. in Computer Science, 2021–2025, GPA: 3.7, Provost Honors
- Software Engineer Intern, Apple (Core OS)
Built dashboards to identify SoC regressions and collaborated with the Siri team to build an LLM-as-a-judge for audio-related tasks.
- Graduate Research Assistant, McAuley Lab (UCSD)
Conducting research on multimodal learning, conversational recommendation, and model adaptation.
- Data Engineer, FDI Lab (UCSD)
Developed scalable ETL and retrieval-augmented systems for large-scale data analysis.
- MusiCRS: Benchmarking Audio-Centric Conversational Recommendation
arXiv preprint, 2025
Benchmark for conversational recommendation grounded in audio context.
arXiv
- WildScore: Benchmarking MLLMs in-the-Wild Symbolic Music Reasoning
EMNLP 2025
Evaluates multimodal LLMs for symbolic music reasoning in real-world settings.
arXiv
- FUTGA-MIR: Enhancing Fine-grained and Temporally-aware Music Understanding with MIR
ICASSP 2025
Improves fine-grained music retrieval through temporal and generative augmentation.
IEEE Xplore
- CoLLAP: Contrastive Long-form Language-Audio Pretraining with Musical Temporal Structure Augmentation
ICASSP 2025
Proposes contrastive strategies for long-form audio–text pretraining using temporal cues.
IEEE Xplore
- Multimodal Learning (Audio, Vision, Language)
- Conversational Recommender Systems
- Audio × Machine Learning
- LLM Adaptation and Fine-tuning
- Generative Evaluation and Temporal Modeling
Languages: Python, C/C++, Java, TypeScript, SQL
ML/AI: PyTorch, TensorFlow, Hugging Face, LangChain, LlamaIndex, LoRA/QLoRA
Systems and Data: Spark, Docker, AWS, CUDA, REST APIs
Web and Infra: FastAPI, React, Streamlit, Node.js, Kubernetes


