My research advances safer, more efficient, and more robust AI systems at scale across training, inference, and deployment. I tackle core challenges in safety alignment, inference efficiency, and scalable system design for language, vision, and multimodal models:
- Advancing Safety Alignment Throughout Model Training: LLM safety basin, the first framework explaining how minimal unsafe data can collapse alignment during finetuning (NeurIPS'24); robust CNN architectures that achieved SOTA on RobustBench (BMVC'23, Best Poster Award); dynamic safety shaping framework for mitigating LLM finetuning risks (NeurIPS'25).
- Optimizing Inference for Scalability and Throughput: video VLM scaling study for optimal inference (ACL'25); token reduction method that doubles LLM inference throughput (In Submission).
- Bridging Research and Deployment for Real-World Impact: UniTable, a modular table parsing system with 470+ stars (workshops at NeurIPS'23 (oral), AAAI'24 (oral), and NeurIPS'24); distributed systems tutorials on Medium (33K+ readers).
- Large Reasoning Models Learn Better Alignment from Flawed Thinking, In Submission
- Shape it Up! Restoring LLM Safety during Finetuning, NeurIPS'25 - [paper]
- Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety, EMNLP'25 - [paper]
- CompCap: Improving Multimodal Large Language Models with Composite Captions, ICCV'25 - [paper]
- Inference Compute-Optimal Video Vision Language Models, ACL'25 - [paper] [code]
- Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models, NeurIPS'24 - [paper] [code]
- LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked, ICLR'24 Tiny Paper - [paper] [code]
- UniTable: Towards a Unified Framework for Table Recognition via Self-Supervised Pretraining, NeurIPS'24 Workshop - [paper] [code]
- Self-Supervised Pre-Training for Table Structure Recognition Transformer, AAAI'24 Workshop (Oral) - [paper] [code]
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions, NeurIPS'23 Workshop (Oral) - [paper] [code]
- Robust Principles: Architectural Design Principles for Adversarially Robust CNNs, BMVC'23 (Best Poster Award) - [paper] [code]
- SkeleVision: Towards Adversarial Resiliency of Person Tracking with Multi-Task Learning, ECCV'22 Workshop - [paper] [code]




