PhD in Probability Theory & Statistics (Moscow State University) | MA in Economics (New Economic School) | MSc in Computer Science & ML (Yandex School of Data Analysis)
Language Modeling from Scratch — Implementations for Stanford CS336, including BPE tokenization, multi-head self-attention, transformer blocks, and a training pipeline on TinyStories/OpenWebText. Built from first principles in PyTorch.
Softmax Sharpness & Algorithmic Reasoning — Reimplementation and exploration of "Softmax is not Enough (for Sharp Size Generalisation)" (Veličković et al., 2024), investigating fundamental limitations of softmax in approximating sharp functions and the adaptive temperature mechanism. Related experimental work on the CLRS algorithmic reasoning benchmark.
Sentiment Analysis — Exercise in text classification and NLP pipelines.
Ephemeral Value Adjustment (EVA) — Implementation of "Fast Deep Reinforcement Learning Using Online Adjustments from the Past" (Hansen et al., 2018). Features trajectory-centric planning with approximate nearest-neighbor search in a learned embedding space, applied to Atari environments. Completed as part of the DeepPavlov Advanced Topics in Deep Reinforcement Learning course.
Graph Convolutional Network for Molecular Properties — Custom graph convolutional recurrent neural network for predicting scalar coupling constants (J-coupling) in the Kaggle "Predicting Molecular Properties" competition. Includes bond type inference from atomic coordinates, hybridization detection, and BFS-based path pooling for multi-hop coupling types.
Claude Sensei — Interactive AI tutoring system built at the Anthropic Hackathon 2023. Uses a state-machine architecture to adapt Claude's pedagogical strategy based on student progress, supporting Socratic dialogue, incremental hints, and solution validation.