I am a final-year PhD candidate at
Mila,
where I work with
Guillaume Lajoie
and
Yoshua Bengio.
My research spans meta-learning, amortized inference, diffusion models and variational methods,
and large language model optimization, from pretraining to post-training.
During my PhD, I also spent time at research labs including Google, Meta, Apple, NVIDIA, Morgan Stanley, and Uber ATG, where I tackled problems ranging from long-context meta-learning and LLM pretraining
to probabilistic forecasting and distillation.
I am primarily interested in improving the capabilities of current large-scale systems,
from leveraging inductive biases in their architectures to using latent variable modeling
(and, analogously, RL) as a framework for strengthening both reasoning and long-context abilities.