PRE-TRAINING
Pre-training is the first phase, in which a model learns general patterns and knowledge
from large datasets that often include text, images, or other forms of information.
This stage doesn't focus on task-specific details; instead, the model learns generic
features.
It allows the model to build a broad understanding of language, concepts, or visual
features, depending on its application.
Pre-training in AI refers to the process where a model is trained on a large amount
of general data before being fine-tuned for a specific task.
It's like teaching the AI the basics of a language, patterns, and knowledge about
the world so it has a strong foundation to build on later.
Models are trained on large amounts of unlabeled data.
The models learn generalizable features that can be used later.
The models are then fine-tuned for specific tasks.
Let's take a simple analogy:
Think of it like learning to read before reading a specific subject textbook.
First of all, you learn the alphabet, vocabulary, and grammar.
During pre-training, the model learns:
Word meanings
Sentence structures
Common sense knowledge
Relationships between things
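To make the idea concrete, here is a minimal sketch of the self-supervised next-token
objective behind GPT-style pre-training. The small public gpt2 checkpoint is used purely
for illustration; real pre-training runs this same loss over billions of tokens.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch of the next-token-prediction objective: the raw text itself
# supplies the training targets, so no human labels are needed.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

batch = tokenizer("The cat sat on the mat.", return_tensors="pt")
# Passing the input ids as labels makes the model compute the (shifted)
# cross-entropy loss of predicting each next token from the ones before it.
outputs = model(**batch, labels=batch["input_ids"])
print(outputs.loss)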
FINE-TUNING
Fine-tuning in AI refers to the process of taking a pretrained model and making
small adjustments to it so it performs better on a specific task or dataset.
Fine-tuning is a process in machine learning where a pre-trained model is further
trained on a smaller, task-specific dataset to improve its performance for a
particular application.
Instead of training a model from scratch, fine-tuning leverages the knowledge the
model has already gained from a large dataset and adapts it to a new but related
task.
🧠 Why Fine-Tune?
Saves time and resources compared to training a model from scratch.
Leverages existing knowledge (like language or image understanding).
Produces better results on niche or domain-specific tasks.
[ Millions of Random Images (cats, cars, trees) ]
            ↓
[ Pre-trained Vision Model (understands general image features) ]
            ↓
[ Fine-tuning with Damaged Car Images (learns the specific task) ]
            ↓
[ Final Model: Damaged Car Detector ]
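As a rough sketch of this pipeline (assuming PyTorch and torchvision, and a hypothetical
two-class "damaged vs. not damaged" dataset), fine-tuning might look like this:

import torch
import torch.nn as nn
from torchvision import models

# Load a backbone pre-trained on a large, generic image dataset (ImageNet)
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze the pre-trained feature extractor
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head for the new task
# (hypothetical 2-class task: damaged vs. not damaged)
model.fc = nn.Linear(model.fc.in_features, 2)

# Only the new head's parameters are optimized during fine-tuning
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Training loop over a hypothetical DataLoader of damaged-car images:
# for images, labels in damaged_car_loader:
#     optimizer.zero_grad()
#     loss = criterion(model(images), labels)
#     loss.backward()
#     optimizer.step()

In practice you might also unfreeze the last few backbone layers with a small learning
rate once the new head has converged.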
Parameter-Efficient Fine-Tuning (LoRA, QLoRA)
Parameter-Efficient Fine-Tuning (PEFT) is a technique used in deep learning to
fine-tune large pre-trained models (like GPT, BERT, or Vision Transformers) with
fewer trainable parameters.
This helps save memory, reduce computational cost, and improve efficiency,
especially when adapting models to multiple tasks.
Fine-tuning the full model is expensive and requires significant GPU resources.
PEFT methods update only a small subset of parameters while keeping most of the
model frozen.
Helps when deploying models in resource-constrained environments.
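A quick back-of-the-envelope calculation shows why this matters, using LoRA (covered on
the next slides) as the example and assuming an illustrative 4096-dimensional hidden
size and rank 8:

# One 4096 x 4096 attention projection matrix:
full_finetune = 4096 * 4096          # ~16.8M weights updated if fully fine-tuned
r = 8                                # low-rank adapter size (illustrative)
lora_adapter = 4096 * r + r * 4096   # A (r x 4096) plus B (4096 x r): ~65K weights
print(lora_adapter / full_finetune)  # ~0.004, i.e. about 0.4% of the parameters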
With the rise of large language models (LLMs) like GPT, LLaMA, or BERT-based
models, training, or even fully fine-tuning, these models can be hugely expensive in
terms of compute, memory, and time.
That's where Parameter-Efficient Fine-Tuning (PEFT) techniques like LoRA and QLoRA
come in! We will look at them in the following slides.
LoRA stands for Low-Rank Adaptation of Large Language Models.
LoRA (Low-Rank Adaptation) is a technique used in machine learning, particularly in
fine-tuning large language models (LLMs) and other deep learning architectures.
It is designed to reduce the computational and memory costs associated with fine-
tuning massive pre-trained models.
The core idea of LoRA:
Instead of updating the entire set of weights in a large pretrained model during
fine-tuning, LoRA freezes the original weights and adds small trainable matrices
that approximate the changes — using low-rank matrix decomposition.
This approach drastically reduces the number of trainable parameters, making
training:
Faster
Cheaper
Less memory-intensive
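Conceptually, for a frozen weight matrix W, LoRA learns an update of the form
W + (alpha / r) * B * A, where A and B are small low-rank matrices. A minimal,
illustrative PyTorch sketch (not the official implementation; the initialization and
defaults here are assumptions):

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a pretrained nn.Linear: freeze its weights and add a trainable
    low-rank update so the effective weight becomes W + (alpha / r) * B @ A."""
    def __init__(self, base_layer: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base_layer
        for p in self.base.parameters():   # freeze the pretrained W (and bias)
            p.requires_grad = False
        in_f, out_f = base_layer.in_features, base_layer.out_features
        self.lora_A = nn.Parameter(torch.randn(r, in_f) * 0.01)  # (r, in)
        self.lora_B = nn.Parameter(torch.zeros(out_f, r))        # (out, r), starts at 0
        self.scaling = alpha / r

    def forward(self, x):
        # Frozen path plus low-rank update; because B starts at zero, training
        # begins exactly at the pretrained model's behaviour.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Example: adapt one 4096 x 4096 projection with only ~65K trainable weights
# layer = LoRALinear(nn.Linear(4096, 4096), r=8)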
1. Start with a Pretrained GPT-style Model
Use a general-purpose language model like LLaMA 2, GPT-J, or GPT-NeoX.
These models have strong general language understanding but lack specific knowledge
of your domain (e.g., customer support, legal advice, medical answers).
2. Prepare Domain-Specific Training Data
Collect conversations, documents, or Q&A logs from your domain.
Examples:
Customer support: Chat logs, ticket resolutions
Legal: Contracts, case summaries
Healthcare: Medical notes, patient inquiries
Clean and tokenize the data to feed into the model in a prompt-response format.
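For example, a single support exchange might be formatted and tokenized like this (the
prompt template and the LLaMA 2 tokenizer are just assumptions; any causal-LM tokenizer
works the same way):

from transformers import AutoTokenizer

# Hypothetical record taken from a customer-support chat log
example = {
    "prompt": "Where is my order?",
    "response": "Let me check your order status. Can you share your tracking number?",
}

# Tokenizer of the base model being fine-tuned (LLaMA 2 requires accepting its license)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# One possible prompt-response template; the exact format is a design choice
text = f"### Question:\n{example['prompt']}\n### Answer:\n{example['response']}"
tokens = tokenizer(text, truncation=True, max_length=512)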
3. Apply LoRA to Key Layers of the Model
LoRA inserts small, trainable matrices (A and B) into existing model layers (like
the query/key/value projections in attention).
These matrices are low-rank adapters that learn the domain-specific patterns.
The original weights of the GPT model are frozen—LoRA only trains these adapters.
Why this matters: You're only adding new knowledge, not overwriting what the model
already knows.
4. Train the Model on Domain Data
Use your domain data to fine-tune just the LoRA adapters.
This teaches the model:
How users typically ask questions in your domain
The correct terminology and response style
Contextual understanding specific to your use case
Example: Instead of replying “I’m not sure,” the fine-tuned model learns to say:
📦 “Let me check your order status. Can you share your tracking number?”
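Putting steps 1-4 together with the Hugging Face peft library might look roughly like
this; the base model id, target module names, and hyperparameters below are assumptions
that vary by architecture:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Hypothetical base model; any causal LM from the Hub works similarly
# (LLaMA 2 requires accepting its license first)
base_id = "meta-llama/Llama-2-7b-hf"
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# LoRA adapters on the attention projections; module names vary by architecture
lora_config = LoraConfig(
    r=8,                                  # rank of the A/B matrices
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # query/value projections (LLaMA naming)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically a fraction of a percent of all weights

# The wrapped model can now be trained on the tokenized domain data with
# transformers.Trainer or a plain PyTorch loop; only the adapters get gradients.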
QLORA
QLoRA = Quantized model + LoRA-based fine-tuning
QLoRA (Quantized Low-Rank Adaptation) is a recent and powerful technique that
combines quantization with low-rank adaptation to enable efficient fine-tuning of
large language models (LLMs) — even on consumer-grade GPUs.
QLoRA extends LoRA by quantizing the frozen base model's weights (typically to 4-bit).
Quantization reduces the precision of the stored weights, lowering memory and
computation requirements while retaining performance; the small LoRA adapters
themselves are still trained in higher precision.
Models fine-tuned with QLoRA remain very effective at understanding and generating
natural language. This makes QLoRA a valuable technique for applications that require a
deep understanding of context, such as language translation, content creation, and even
complex problem-solving tasks.
Load model in 4-bit:
Use bitsandbytes or Hugging Face’s transformers + accelerate
Memory usage drops to ~5-6GB for LLaMA-7B
Inject LoRA adapters:
Freeze original weights
Insert trainable low-rank matrices (A and B) in attention layers
Only train a few million parameters (not billions!)
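In code, this two-step recipe might look roughly like the sketch below, assuming the
transformers, bitsandbytes, and peft libraries and a hypothetical LLaMA-2-7B base model:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Step 1: load the frozen base model in 4-bit (NF4) precision via bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",      # assumed base model (license acceptance required)
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Step 2: inject higher-precision LoRA adapters on top of the frozen 4-bit weights
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # names vary by architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()   # only a few million trainable parameters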
FLASH ATTENTION
📚 Traditional Attention:
Imagine you need to work out how every sentence in a long novel relates to every other
sentence.
You try to compare every sentence to every other sentence at once.
You write all possible relationships out on a giant whiteboard.
Problem: the whiteboard gets too big to handle, so the process is slow and
memory-intensive!
⚡ FlashAttention:
You read the novel in small sections (tiles).
You summarize each section as you go, without writing everything down.
You use a high-speed notepad (GPU SRAM) that’s very fast but small.
Result: You understand the whole book without memory overload, and much faster.
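The sketch below mirrors that analogy in plain PyTorch: it walks through the keys and
values in small tiles and keeps only running statistics (an online softmax), so the full
attention matrix is never written down. It is purely illustrative; the real
FlashAttention fuses this logic into a single GPU kernel that keeps each tile in fast
on-chip SRAM.

import torch

def tiled_attention(q, k, v, block_size=64):
    # q, k, v: (seq_len, head_dim) for a single attention head
    seq_len, d = q.shape
    scale = d ** -0.5
    out = torch.zeros_like(q)
    row_max = torch.full((seq_len, 1), float("-inf"))
    row_sum = torch.zeros(seq_len, 1)

    for start in range(0, seq_len, block_size):
        k_blk = k[start:start + block_size]      # one tile of keys
        v_blk = v[start:start + block_size]      # one tile of values
        scores = (q @ k_blk.T) * scale           # scores against this tile only

        # Online softmax: fold the new tile into the running max / sum / output
        new_max = torch.maximum(row_max, scores.max(dim=-1, keepdim=True).values)
        correction = torch.exp(row_max - new_max)
        p = torch.exp(scores - new_max)

        out = out * correction + p @ v_blk
        row_sum = row_sum * correction + p.sum(dim=-1, keepdim=True)
        row_max = new_max

    return out / row_sum

The result matches standard softmax(Q K^T / sqrt(d)) V attention up to floating-point
rounding; only the order of computation and the memory traffic change.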
Benefits of Flash Attention:
Speed & Efficiency:
Significantly faster model training and inference.
Scalability:
Makes it feasible to train and deploy very large models that would otherwise be
impractical due to memory constraints.
Real-world Use Cases:
Reduces the cost and resource demand for training large models in applications like
autonomous driving, natural language understanding, and large-scale data
generation.
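In practice you rarely write the kernel yourself. For example, PyTorch 2.x exposes a
fused attention call that can dispatch to a FlashAttention kernel on supported GPUs; the
shapes and dtype below are just an illustration:

import torch
import torch.nn.functional as F

# Random stand-ins shaped (batch, heads, seq_len, head_dim), half precision on a CUDA GPU
q = torch.randn(1, 8, 2048, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 2048, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 2048, 64, device="cuda", dtype=torch.float16)

# May run a FlashAttention-style fused kernel under the hood (PyTorch 2.0+)
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)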