Recommended reading: History of Large Language Models

I found the following nice history of large language models (LLMs, in other words ChatGPT, Claude and all their AI kin) on Gregory Gundersen’s blog:

https://gregorygundersen.com/blog/2025/10/01/large-language-models/

I find it interesting to follow developments in AI and machine learning not “just” because it is THE most dramatic and exciting revolution in science and technology of the decade (so far!) but also because it is intriguing to me, as a mathematician, to see what kind of math is needed (i) to understand what’s going on under the hood, and (ii) to have taken part in its development (the short story is: not very much, but still some nontrivial amount of mathematics and mathematical maturity is needed). When I write “as a mathematician” I mean both as a research mathematician and as a teacher of math – it is incredibly important that we who teach mathematics have an idea of what math is used at today’s frontiers of science and technology.
