Build a Reasoning Model (From Scratch) – In Progress

Reasoning from Scratch Cover

ISBN-13: 978-1633434677

Description

In Build a Reasoning Model (from Scratch), you will learn and understand how a reasoning large language model (LLM) works.

Reasoning is one of the most exciting and important recent advances in improving LLMs, but it’s also one of the easiest to misunderstand if you only encounter the term in passing or read about it in theory. This is why this book takes a hands-on approach. We will start with a pre-trained base LLM and then add reasoning capabilities ourselves, step by step in code, so you can see exactly how it works.

This book can be read on its own, and it also works well as a follow-up to Build a Large Language Model (from Scratch). The latter focuses on LLM architecture and pre-training from the ground up. This book starts with a pre-trained LLM and focuses on implementing reasoning techniques that are not covered there, including inference-time scaling, reinforcement learning training, and distillation.

The two books complement each other well, but they can also be read independently. Readers who want to focus on reasoning can start here and later read Build a Large Language Model (from Scratch) to understand how the underlying LLM architecture and pre-training work. Alternatively, readers who prefer a more bottom-up path can start with Build a Large Language Model (from Scratch) first and then continue with this book.

Reviews

One of the best resources I’ve seen on reasoning models. It cuts through the hype with hands-on practice using evals to build intuition for each tweak so you understand why these techniques work.

– Ivan Leo, Member of Technical Staff, Google DeepMind

Big. Dope. Great read, fun writing. It’s hard to walk away without actually learning something!

– Chris Alexiuk, Sr. Product Research Engineer, NVIDIA

The most important topic in modern AI taught in the best way possible: by building it from the ground up. My go-to resource for mastering reasoning models.

– Logan Thorneloe, Software Engineer, Google and author of AI for Software Engineers

In the age of AI, fundamentals matter more than ever. Without them, you’re vibe-coding on sand and unable to tell signal from hype on Twitter.

In this book, Sebastian Raschka distills the profound ideas behind LLM reasoning in the clearest, most accessible way, with hands-on examples that make the concepts stick. If you’re eager to learn how to build a reasoning model but have little ML experience, this book is where to start.

– Byron Hsu, Member, LMSYS

An exceptional deep dive into the next frontier of AI. This book doesn’t just explain reasoning models, it equips you to build, test, and truly understand them from first principles. A must-read for anyone serious about advancing beyond prompt engineering into real model intelligence.

– Aman Chadha, Senior Staff Tech Lead / Senior Manager, Google DeepMind

A really well-structured, hands-on introduction to reasoning models! The visuals bring transparency to what often feels like a black box by clearly illustrating each stage of the reasoning process. It stands out as one of the few resources that walks through these methods step by step, helping build a clear understanding of a complex subject.

– Vinija Jain, ML Lead, Google

Not just a book, a learning ecosystem: deep knowledge, practical coding and an evolving repo!

– Ivan Fioravanti, Serial Entrepreneur & Local AI Advocate

Sebastian makes the complex feel intuitive. By building reasoning models from scratch, you gain a level of understanding that papers alone cannot provide. Truly essential for the modern AI engineer.

– Omar Sanseviero, Developer Experience Lead, Google DeepMind

Sebastian has a rare gift for making complex ML ideas intuitive. Of all the writing on reasoning models, this is a book you can read cover to cover and leave with deep clarity on how to build them.

– Omar Khattab, Assistant Professor, MIT EECS





Build a Large Language Model (From Scratch)

LLMs from Scratch Cover

ISBN-13: 978-1633437166

Description

In Build a Large Language Model (from Scratch), you’ll discover how LLMs work from the inside out. In this book, I’ll guide you step by step through creating your own LLM, explaining each stage with clear text, diagrams, and examples.

The method described in this book for training and developing your own small-but-functional model for educational purposes mirrors the approach used in creating large-scale foundational models such as those behind ChatGPT. The book uses Python and PyTorch for all its coding examples.

This book is a strong starting point for readers who prefer a bottom-up approach and want to understand how LLMs are built and pretrained before moving on to reasoning-specific techniques.

Reviews

Yes, we can absolutely build applications while knowing very little about what an LLM actually is (just by calling APIs).

But honestly, if you want to become a top-tier ML / AI Engineer, you need to understand what’s going on under the hood.

And what better book to start with than one that explains how to build an actual LLM from scratch?

– Miguel Otero Pedrido, Senior Machine Learning Engineer at Zapier

I got a serious close-up look at what goes on inside an LLM. Every step of the way, the book surprised me with great detail, reiteration, recaps, and very manageable chunks for internalizing the ideas.

– Ganapathy Subramaniam, Gen AI developer

I have read many technical books in my career spanning 20+ years, but this is the best technical book I have ever studied, by a large margin. So if you are looking for an in-depth explanation of the internal workings and from-scratch development of large language models, then this is the book you should be reading.

– Soumitri Kadambi, Director of Artificial Intelligence at ZeOmega

‘Build a Large Language Model from Scratch’ by Sebastian Raschka @rasbt has been an invaluable resource for me, connecting many dots and sparking numerous ‘aha’ moments.

This book comes highly recommended for gaining a hands-on understanding of large language models.

– Faisal Alsrheed, AI researcher

While learning a new concept, I have always felt more confident about my understanding if I’m able to code it myself from scratch. Most tutorials tend to cover the high-level concept and leave out the minor details, and the absence of these details is acutely felt when you try to put the concepts into code. That’s why I really appreciate Sebastian Raschka, PhD’s latest book, Build a Large Language Model (from Scratch).

At a time when most LLM implementations tend to use high-level packages (transformers, timm), it’s really refreshing to see the progressive development of an LLM by coding the core building blocks using basic PyTorch elements. It also makes you appreciate how some of the core building blocks of SOTA LLMs can be distilled down to relatively simple concepts.

– Roshan Santhosh, Data Scientist at Meta

The ultimate hands-on guide to building foundational models. This is the book you want to buy if you want to go deep.

– Antonio Gulli, Sr Director, Google

It is a great book. I learned many things that were not clear to me. I highly recommend this book.

– Tae-Wan Kim, Professor, Seoul National University


A high-level, no-code overview of how an LLM is developed, featuring numerous figures from the book; the book itself focuses on the underlying code that implements these processes.



Build a Large Language Model (From Scratch) Course Options

Playlist Course




Machine Learning Q and AI

Machine Learning Q and AI Cover

ISBN-10: 1718503768 ISBN-13: 978-1718503762 Paperback: 264 pages No Starch Press (March 2024)

Description

If you’re ready to venture beyond introductory concepts and dig deeper into machine learning, deep learning, and AI, the question-and-answer format of Machine Learning Q and AI will make things fast and easy for you, without a lot of mucking about.

Each brief, self-contained chapter journeys through a fundamental question in AI, unraveling it with clear explanations, diagrams, and exercises. Topics include:

  • Multi-GPU training paradigms
  • Finetuning transformers
  • Differences between encoder- and decoder-style LLMs
  • Concepts behind vision transformers
  • Confidence intervals for ML
  • And many more!

This book is a fully edited and revised version of Machine Learning Q and AI, which was available on Leanpub.

Reviews

“Sebastian has a gift for distilling complex, AI-related topics into practical takeaways that can be understood by anyone. His new book, Machine Learning Q and AI, is another great resource for AI practitioners of any level.”
–Cameron R. Wolfe, Writer of Deep (Learning) Focus

“Sebastian uniquely combines academic depth, engineering agility, and the ability to demystify complex ideas. He can go deep into any theoretical topics, experiment to validate new ideas, then explain them all to you in simple words. If you’re starting your journey into machine learning, Sebastian is your guide.”
–Chip Huyen, Author of Designing Machine Learning Systems

“One could hardly ask for a better guide than Sebastian, who is, without exaggeration, the best machine learning educator currently in the field. On each page, Sebastian not only imparts his extensive knowledge but also shares the passion and curiosity that mark true expertise.”
–Chris Albon, Director of Machine Learning, The Wikimedia Foundation




Machine Learning with PyTorch and Scikit-Learn

Machine Learning with PyTorch and Scikit-Learn

ISBN-10: 1801819319 ISBN-13: 978-1801819312 Paperback: 770 pages Packt Publishing Ltd. (February 25, 2022)

About this book

Initially, this project started as the fourth edition of Python Machine Learning. However, after putting so much passion and hard work into the changes and new topics, we thought it deserved a new title. So, what’s new? There are many updates and additions, including the switch from TensorFlow to PyTorch, new chapters on graph neural networks and transformers, a new section on gradient boosting, and many more that I will detail in a separate blog post.

For those interested in what this book covers in general, I’d describe it as a comprehensive resource on the fundamental concepts of machine learning and deep learning. The first half of the book introduces readers to machine learning using scikit-learn, the de facto approach for working with tabular datasets. The second half focuses on deep learning, including applications to natural language processing and computer vision. While basic knowledge of Python is required, this book will take readers on a journey from understanding machine learning from the ground up to training advanced deep learning models by the end of the book.

Reviews

“I’m confident that you will find this book invaluable both as a broad overview of the exciting field of machine learning and as a treasure of practical insights. I hope it inspires you to apply machine learning for the greater good in your problem area, whatever it might be.”

– Dmytro Dzhulgakov, PyTorch Core Maintainer

“This 700-page book covers most of today’s widely used machine learning algorithms, and will be especially useful to anybody who wants to understand modern machine learning through examples of working code. It covers a variety of approaches, from basic algorithms such as logistic regression to very recent topics in deep learning such as BERT and GPT language models and generative adversarial networks. The book provides examples of nearly every algorithm it discusses in the convenient form of downloadable Jupyter notebooks that provide both code and access to datasets. Importantly, the book also provides clear instructions on how to download and start using state-of-the-art software packages that take advantage of GPU processors, including PyTorch and Google Colab.”

– Tom M. Mitchell, Professor, Founder, and former Chair of the Machine Learning Department at Carnegie Mellon University (CMU)

More information
Translations

Python Machine Learning Japanese

Machine Learning with PyTorch and Scikit-Learn in Serbian

Python Machine Learning Spanish

Python Machine Learning Korean

  • Japanese ISBN-13: 978-4295015581
  • Serbian ISBN-13: 978-8673105772
  • Spanish ISBN-13: 978-8426735737
  • Korean ISBN-13: 979-1140707362






Older Books

You can find a list of all my books here.