What does finetuning do for the model?
● Lets you put more data into the model than fits into the prompt
● Gets the model to learn the data, rather than just having access to it
● Steers the model to more consistent outputs
● Reduces hallucinations
● Customizes the model to a specific use case
● Process is similar to the model's earlier training
Prompt Engineering vs. Finetuning

Prompt Engineering
● Pros:
○ No data needed to get started
○ Smaller upfront cost
○ No technical knowledge needed
○ Connect data through retrieval (RAG)
● Cons:
○ Much less data fits
○ Forgets data
○ Hallucinations
○ RAG misses, or gets incorrect data
● Best for: generic use cases, side projects, prototypes

Finetuning
● Pros:
○ Nearly unlimited data fits
○ Learn new information
○ Correct incorrect information
○ Less cost afterwards, if using a smaller model
○ Use RAG too
● Cons:
○ More high-quality data needed
○ Upfront compute cost
○ Needs some technical knowledge, esp. around data
● Best for: domain-specific, enterprise, production usage, …privacy!
Benefits of finetuning your own LLM
Where finetuning fits in
Pretraining
● Model at the start:
○ Zero knowledge about the world
○ Can’t form English words
● Next token prediction
● Giant corpus of text data
● Often scraped from the internet: “unlabeled”
● Self-supervised learning
● After training:
○ Learns language
○ Learns knowledge
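The self-supervised objective above can be sketched in a few lines: raw unlabeled text supplies both inputs and labels, because each position's target is simply the token that follows it. The toy word-level tokens below are illustrative; real pretraining uses subword token IDs.

```python
# Minimal sketch of the self-supervised next-token-prediction setup:
# every position's training target is the token that comes after it,
# so no human labels are needed.

def next_token_pairs(tokens):
    """Pair each token with the token the model must predict next."""
    inputs = tokens[:-1]   # what the model sees
    targets = tokens[1:]   # what it is trained to predict
    return list(zip(inputs, targets))

tokens = ["Once", "upon", "a", "time"]
pairs = next_token_pairs(tokens)
# pairs: [("Once", "upon"), ("upon", "a"), ("a", "time")]
```

This is why a giant scraped corpus works as training data: the labels come for free from the text itself.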
Limitations of pretrained base models
Finetuning after pretraining
● Finetuning usually refers to training the model further
○ Can also be self-supervised unlabeled data
○ Can be “labeled” data you curated
○ Much less data needed
○ Tool in your toolbox
● Finetuning for generative tasks is not well-defined:
○ Updates entire model, not just part of it
○ Same training objective: next token prediction
○ More advanced methods reduce how much of the model to update (more later!)
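The key point, that finetuning reuses the same next-token-prediction objective and only the data changes, can be sketched with a toy loss computation. The bigram probability table below is a hypothetical stand-in for a real model's predicted distributions.

```python
# Sketch: finetuning minimizes the same next-token cross-entropy loss
# as pretraining. The "model" here is an assumed toy bigram table
# P(next | current), standing in for a real LLM's output probabilities.
import math

model_probs = {
    ("the", "cat"): 0.5,   # assumed probabilities for illustration
    ("cat", "sat"): 0.25,
}

def next_token_loss(tokens):
    """Average negative log-likelihood of each next token."""
    nll = [-math.log(model_probs[(cur, nxt)])
           for cur, nxt in zip(tokens, tokens[1:])]
    return sum(nll) / len(nll)

loss = next_token_loss(["the", "cat", "sat"])
```

Whether the batch comes from a scraped web corpus (pretraining) or a curated dataset (finetuning), the gradient step is driven by this same loss.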
What is finetuning doing for you?
● Behavior change
○ Learning to respond more consistently
○ Learning to focus, e.g. moderation
○ Teasing out capability, e.g. better at conversation
● Gain knowledge
○ Increasing knowledge of new specific concepts
○ Correcting old incorrect information
● Both
First time finetuning
What is instruction finetuning?
● AKA "instruction-tuned" or "instruction-following" LLMs
● Teaches model to behave more like a chatbot
● Better user interface for model interaction
○ Turned GPT-3 into ChatGPT
○ Increased AI adoption, from thousands of researchers to millions of people
Instruction-following datasets
Some existing data is ready as-is, online:
● FAQs
● Customer support conversations
● Slack messages
LLM Data Generation
Non-Q&A data can also be converted to Q&A:
● Using a prompt template
● Using another LLM
○ ChatGPT (“Alpaca”)
○ Open-source models
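The prompt-template approach can be sketched as follows. The Alpaca-style section headers are a common layout choice; the exact wording and the `Summarize the text.` default instruction are assumptions for illustration.

```python
# Sketch: converting raw non-Q&A text into an instruction-following
# example with a prompt template (Alpaca-style layout assumed).

TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

def to_instruction_example(raw_text, instruction="Summarize the text."):
    """Wrap a raw document in an instruction-following prompt."""
    return TEMPLATE.format(instruction=instruction, input=raw_text)

prompt = to_instruction_example("Finetuning adapts a base model.")
```

The model's expected answer is then appended after `### Response:` to form one training example; an LLM can generate that answer when no human-written one exists.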
Instruction Finetuning Generalization
● Can access the model's pre-existing knowledge
● Generalizes instruction-following to other data not in the finetuning dataset
Overview of Finetuning
Different Types of Finetuning