
Your RAG System Retrieves the Right Data — But Still Produces Wrong Answers. Here’s Why (and How to Fix It).
Large Language Models
Your RAG system is retrieving the right documents with perfect scores — yet it still…
17 min read

Git worktrees, parallel agentic coding sessions, and the setup tax you should be aware of
20 min read
Latest

What I wish I did at the beginning of my journey
8 min read

How I turned my eight-year weekly visualization habit into a reusable AI workflow
7 min read

What if an unsupervised model could become a strong classifier with only a handful of…
10 min read

From rank-stabilized scaling to quantization stability: A statistical and architectural deep dive into the optimizations…
11 min read

Architectures, pitfalls, and patterns that work
14 min read

Inside MareNostrum V: SLURM schedulers, fat-tree topologies, and scaling pipelines across 8,000 nodes in a…
11 min read

The upstream decision no model or LLM can fix once you get it wrong
22 min read

Building a personal AI assistant is rarely a single, monolithic effort. In this piece, I…
9 min read

memweave: Zero-Infra AI Agent Memory with Markdown and SQLite — No Vector Database Required
Agentic AI
The problem with agent memory today
17 min read
Editor’s Picks

Machine learning models can be confident even when they shouldn’t be. This article introduces Deep…
12 min read

How to turn OpenStreetMap data into an interactive map of wild swimming spots using Overpass…
19 min read

What to use, when to use it, and what to ignore?
7 min read

What has changed in the past five years in the role and importance of generalists…
5 min read

By compiling a simple program directly into transformer weights.
19 min read

Most ReAct-style agents are silently wasting their retry budget on errors that can never succeed.…
19 min read

AI coding assistants need a persistent memory layer to overcome the statelessness of LLMs and…
10 min read

A long-form article featuring over 100 visualizations, covering a range of topics from how to…
107 min read

The mathematical foundations of Vision-Language-Action (VLA) models for humanoid robots and more
18 min read
The Variable Newsletter

Sorting through the good, bad, and ambiguous aspects of vibe coding
4 min read
Deep Dives

Prefill Is Compute-Bound. Decode Is Memory-Bound. Why Your GPU Shouldn’t Do Both.
Large Language Models
Inside disaggregated LLM inference — the architecture shift behind 2-4x cost reduction that most ML…
16 min read

It’s not about audio and video anymore
21 min read

Most RAG tutorials focus on retrieval or prompting. The real problem starts when context grows.…
14 min read

The best data models make it hard to ask bad questions and easy to answer…
29 min read

In an age of constrained compute, learn how to optimize GPU efficiency through understanding architecture,…
18 min read

Generate high-quality, minimal SVG plots by fitting Bézier curves with an ODF algorithm.
11 min read
