Hi, I'm

Yixin Huang

Research Assistant at UCSD Hao AI Lab
M.S. Computer Science Student • San Diego, CA

research.py
class Researcher:
    def __init__(self):
        self.focus = [
            "LLM Systems",
            "Agent Evaluation",
            "GPU Infrastructure"
        ]
        self.tools = [
            "vLLM", "SGLang",
            "NeMo RL", "Ray"
        ]
    
    def build(self):
        return "🚀 Innovation"

About Me

"A journey of a thousand miles begins with a single step." — Confucius

I work on LLM systems, evaluation, and GPU-accelerated ML infrastructure. I am currently an M.S. Computer Science student at UC San Diego (GPA: 4.0), and I previously earned a B.A. in Computer Science & Applied Mathematics from UC Berkeley (GPA: 3.86).

Research Interests

  • LLM evaluation & benchmarks (agents, games, scientific reasoning)
  • Large-scale training & inference systems (FSDP, vLLM, Ray, Slurm)
  • GPU efficiency, memory systems, and model parallelism
  • Reinforcement learning for agents (GRPO, NeMo-Gym)

Current Focus

  • 🔄 Scaling agent evaluation with interactive environments
  • ⚡ Training & serving efficiency on multi-GPUs
  • 🎯 Reward modeling and RL for LLM agents

Tech Stack

Python · PyTorch · CUDA · vLLM · SGLang · NeMo RL · Areal · Ray · Docker · Slurm · FSDP · DeepSpeed · Linux · Git

Featured Projects

🔬

VideoScience

CVPR

Benchmark for scientific correctness in text-to-video models. Evaluates physics & chemistry concepts using VLM-as-Judge scoring.

Python Benchmark Video
View Project →
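To give a flavor of how VLM-as-Judge scoring can work, here is a minimal sketch: a rubric-style prompt is built for the judge model, and a scalar score is parsed from its free-text reply. The function names and the 1-to-5 scale are illustrative assumptions, not VideoScience's actual API; the real judge call is stubbed out.

```python
# Hedged sketch of VLM-as-Judge scoring. The actual VLM call is omitted;
# these helpers only show the prompt-and-parse pattern around it.

def build_judge_prompt(concept: str, frames_desc: str) -> str:
    """Compose a rubric-style prompt asking a judge model to rate
    the scientific correctness of a generated video."""
    return (
        "You are grading a generated video for scientific correctness.\n"
        f"Concept under test: {concept}\n"
        f"Observed frames: {frames_desc}\n"
        "Reply with a single integer score from 1 (wrong) to 5 (correct)."
    )

def parse_score(judge_reply: str) -> int:
    """Extract the first integer in [1, 5] from the judge's reply;
    fall back to the lowest score if none is found."""
    for token in judge_reply.split():
        cleaned = token.strip(".,:")
        if cleaned.isdigit() and 1 <= int(cleaned) <= 5:
            return int(cleaned)
    return 1
```

In practice the parsed scores would be averaged per concept across many generations to produce a benchmark number.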
🤖

NVIDIA NeMo Gym

⭐ 603

Build RL environments for LLM training. Integrates Sokoban & Tetris environments for scalable RL training, reward profiling, and GRPO-based optimization.

Python RL NVIDIA
View Project →
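GRPO's core idea is to normalize each sampled completion's reward against its own sampling group rather than a learned value baseline. A minimal sketch of that group-relative advantage, assuming one prompt with several sampled completions and scalar rewards (this is the standard formula, not NeMo Gym's internal code):

```python
# Group-relative advantages, the normalization at the heart of GRPO:
# advantage_i = (reward_i - group mean) / (group std + eps)
from statistics import mean, pstdev

def grpo_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each completion's reward against its sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Completions that beat their group's average get positive advantages and are reinforced; below-average ones are pushed down, with no critic network required.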
🌐

lmenv

LLM environment framework for interactive evaluation. Standardized interfaces for game-based agent testing.

Python Framework
View Project →
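The kind of standardized interface lmenv targets can be sketched with a toy text game: the environment exposes `reset` and `step`, and each step returns an observation, a reward, and a done flag. The class and method names below are illustrative assumptions for exposition, not lmenv's actual API.

```python
# Hypothetical game-environment interface for agent evaluation.
# A toy "guess the number" game stands in for a real benchmark game.
from dataclasses import dataclass

@dataclass
class StepResult:
    observation: str
    reward: float
    done: bool

class GuessNumberEnv:
    """Toy text game: the agent must guess a hidden integer."""

    def __init__(self, target: int = 7):
        self.target = target

    def reset(self) -> str:
        """Start an episode and return the initial observation."""
        return "Guess a number between 1 and 10."

    def step(self, action: str) -> StepResult:
        """Apply the agent's text action and return the outcome."""
        try:
            guess = int(action)
        except ValueError:
            return StepResult("Please reply with an integer.", 0.0, False)
        if guess == self.target:
            return StepResult("Correct!", 1.0, True)
        hint = "higher" if guess < self.target else "lower"
        return StepResult(f"Try {hint}.", 0.0, False)
```

Because every game speaks the same `reset`/`step` protocol, the same agent harness can be pointed at Sokoban, Tetris, or any other environment without per-game glue code.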

Global Visitors

A privacy-friendly snapshot of where people have visited this site.

Visitor Map

Live widget powered by MapMyVisitors

Blog

Notes on LLM systems, evaluation, and research workflows.

Jan 29, 2026

Context & Learn in Public / 语境与公开学习

A bilingual reflection on why context matters and why learning in public compounds over time.

Bilingual Reflection
Read →
Coming soon

Deploying efficient inference stacks

Notes on GPU scheduling, memory tuning, and vLLM/SGLang integration.

Systems GPU

Announcements

Short updates on new posts, releases, and talks.

Jan 29, 2026

New blog: Context & Learn in Public

Published a bilingual reflection on why context matters in research and why sharing work-in-progress helps over time.

Read the post →
Jan 29, 2026

VideoScience paper on arXiv

Our evaluation work is now on arXiv, covering scientific reasoning benchmarks for video models.

Read the paper →
Jan 29, 2026

VideoScience-Bench leaderboard live

Check out the public leaderboard tracking model performance on the VideoScience-Bench tasks.

View leaderboard →
Coming soon

Research updates

Upcoming paper releases and project milestones will be posted here.

Get in Touch

Feel free to reach out for collaborations, discussions, or just to say hi!

📧 Recruiters: you can reach me at [email protected]

💬 Open an issue or discussion on any of my repositories!