AI WORKPLACE FOUNDATIONS
Understanding LLMs
DAY 2, MODULE 2
Agenda
01 What is an LLM?
02 Key Timelines
03 LLM – Under the Hood
04 LLM Capabilities
05 Getting More Out of LLMs
06 Building LLM-Powered Assistants
What is a Large Language Model?
• A large language model is a trained
deep-learning model that contextually
understands human language and can
generate text in a human-like fashion.
• LLMs are trained on vast amounts of text
data to develop a deep understanding of
language structures and meanings.
What is a Large Language Model?
• The “large” in large language models
refers to three things: the huge volume of
training data used, the massive scale of the
model's architecture, and the costly
computational resources required for training.
• They are typically based on transformer
architectures, which rely on self-attention
mechanisms that allow the model to
capture long-range dependencies
between words in a sentence (a toy sketch
of self-attention follows below).
[Diagram: a large language model is a neural network designed for NLP tasks, with lots of parameters.]
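To make the self-attention idea concrete, here is a minimal NumPy sketch of scaled dot-product attention over a handful of tokens. The dimensions, random weights, and single-head form are illustrative assumptions, not the configuration of any particular model.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model) token embeddings; Wq/Wk/Wv: learned projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                  # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # similarity of every token pair
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over positions
    return weights @ V                                 # each token mixes in context from all others

# Toy example: 4 tokens, embedding size 8 (purely illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)            # -> (4, 8)
```

Because every token attends to every other token in one step, distant words can influence each other directly rather than through a long chain of recurrent states.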
Key Timelines
• In the Beginning (’50s – ’80s): Symbolic AI & rule-based systems (ELIZA); statistical methods and the start of connectionism.
• Statistical Revolution (’90s – 2000s): Rise of probabilistic models (n-grams); neural networks (feedforward, RNN).
• Deep Learning Age (2010s): Word embeddings; sequence modelling (RNN, LSTM); the attention mechanism.
• Age of Transformers (2017 & beyond): Rise of the attention mechanism; pretraining with masked modelling; parameter scaling.
LLMs – Under the Hood
LLMs follow a two-step training process:
• Pre-training: The models learn from massive amounts of
unlabeled text data. Using self-supervised learning, the
model learns to predict masked or corrupted words in the
input, allowing it to capture rich contextual representations.
• Fine-tuning: The models are further trained on specific tasks
using labeled data to specialize their language
understanding for various applications. This is known as
transfer learning, which allows a model to generalize its
capabilities to various downstream NLP tasks.
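As a concrete illustration of the masked-word objective described under pre-training above, the sketch below asks a small pre-trained model to fill in a corrupted token. It assumes the Hugging Face transformers library is installed; the model choice (bert-base-uncased) and the example sentence are illustrative only.

```python
from transformers import pipeline

# Load a small pre-trained masked language model (illustrative choice).
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# The model predicts the masked word purely from the surrounding context.
for candidate in fill_mask("The contract must be signed by the [MASK]."):
    print(f"{candidate['token_str']:>12}  score={candidate['score']:.3f}")
```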
LLM Capabilities
• Conversation and dialogue, mimicking different writing
styles, adapting to various genres, and producing
contextually appropriate responses.
• Language translation, capturing nuances and idiomatic
expressions.
• Document summarization and knowledge extraction from a
wide range of sources.
• Intelligent text suggestion and completion based on partial
input.
• Sentiment analysis, distinguishing positive, negative or
neutral tones.
• Creative content generation, including fictional stories,
poetry, or script dialogues.
More LLM Capabilities
1 Coding Copilot
Assist developers in completing code snippets, suggesting functions, and debugging. Generate technical documentation from code or explain code in simple language.
Code completion tools: Codex (OpenAI), GitHub Copilot, AlphaCode (DeepMind), TabNine, IntelliCode (MS).
2 Data Analysis & Interpretation
Automatic generation of reports from raw data to provide insights and summaries.
Generative analytics enables running analysis using prompts. Conversational analytics uses NLQ (natural-language querying) to query databases and fetch data for non-technical users (a sketch follows below).
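As a sketch of the conversational-analytics idea, the snippet below prompts an LLM to translate a natural-language question into SQL. The OpenAI Python client is used for illustration; the model name, table schema, and prompt wording are assumptions, not a specific product's interface.

```python
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable is set

schema = "sales(region TEXT, product TEXT, amount REAL, sold_on DATE)"  # hypothetical table
question = "What were total sales per region last quarter?"

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": ("Translate the user's question into a single SQL query "
                     f"against this table: {schema}. Return only the SQL.")},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)  # SQL the non-technical user never has to write
```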
Leading LLMs
1 Open Models
Llama 3, Llama 3.1, Mixtral 8x22b, Mixtral 8x7b, Mistral Large, Qwen2, Command-R
2 Closed Models
Claude 3.5, Gemini 1.5, Gemini Ultra, GPT-4, GPT-4 Turbo, GPT-4o
Getting More out of LLMs
The ambiguity of natural language affects how LLMs perform in
different tasks. This can be addressed in two ways –
prompt engineering and fine-tuning:
Prompt engineering
• This involves designing and refining input queries, known as
prompts, to achieve desired responses from LLMs.
• The phrasing, structure, and context of a prompt directly
influence the quality and relevance of the model's output.
• Understanding how to tune prompts effectively will help you
obtain more accurate and nuanced responses that are useful
and relevant.
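For illustration, here is the same request phrased first as a vague prompt and then as an engineered prompt that adds a role, audience, structure, and constraints. The wording is a sketch of the technique, not a prescribed template.

```python
# Vague prompt: leaves audience, length, and format entirely to the model.
vague_prompt = "Summarise this report."

# Engineered prompt: role, audience, structure, and constraints are explicit.
engineered_prompt = """You are a financial analyst.
Summarise the report below for a non-technical executive audience.
- Use at most 3 bullet points.
- List any risks separately under a 'Risks' heading.
- Keep the total under 120 words.

Report:
{report_text}
"""
```

The engineered version typically produces output that is easier to reuse, because the format and audience are fixed up front rather than guessed by the model.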
Getting More out of LLMs
The ambiguity of natural language affects how LLMs perform in
different tasks. This can be addressed in two ways –
prompt engineering and fine-tuning:
Fine-tuning
• Fine-tuning involves taking a pre-trained LLM, such as GPT-3,
and further training it on a domain-specific task.
• Fine-tuning a model on a more focused dataset enables the
model to adapt to the specific requirements of the target
task, resulting in improved performance and tailored
responses.
• When you fine-tune an LLM, you teach it how to respond, so
you don’t necessarily have to do any prompt engineering
subsequently (a training sketch follows below).
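The sketch below shows the general shape of a fine-tuning run using the Hugging Face transformers and datasets libraries. The small encoder model, the public IMDB dataset, and the hyperparameters are stand-ins for a domain-specific model and labeled data; treat it as an outline under those assumptions rather than a recommended recipe.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"   # small pre-trained model as a stand-in
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Labeled, task-specific data (public sentiment data as a stand-in for your domain).
dataset = load_dataset("imdb", split="train[:2000]").train_test_split(test_size=0.1)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()   # transfer learning: pre-trained weights adapt to the new task
```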
Building LLM-Powered Assistants
Key must-haves of LLM assistants:
• Contextual understanding: Should be able to comprehend and
interpret user input, going beyond syntax to understand
nuanced contextual cues.
• Information retrieval: Should be accurate at retrieving and
presenting information, responding to general knowledge
queries, providing up-to-date weather forecasts, fetching
relevant news articles, and offering personalized
recommendations.
• Task management: Must have an exceptional task
management system tailored to user preferences and
priorities, including seamless organisation of to-do lists,
appointment scheduling & reminder setting.
Building LLM-Powered Assistants
Key must-haves of LLM assistants:
• Personalisation: Should be adaptive, incorporating user
preferences to provide tailored responses and
recommendations.
• Conversational interface: A simplified and intuitive user
interface that provides the user with a chat experience.
• Voice interaction (optional): Enables users to effortlessly
communicate with it via speech. This integration of speech-
to-text and text-to-speech technologies fosters an intuitive
user experience to enhance convenience and accessibility.
• Security and Privacy: Should be able to safeguard user
data, prioritising trust and protection of sensitive information
at all times.
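Putting the conversational interface, contextual understanding, and personalisation points together, here is a minimal sketch of an assistant's chat loop. The OpenAI client, model name, and preference string are illustrative assumptions; a production assistant would add retrieval, task management, and proper safeguards for user data.

```python
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable is set

# Personalisation: user preferences are injected into the system prompt.
user_preferences = "The user prefers concise answers and uses the metric system."
messages = [{"role": "system",
             "content": "You are a helpful workplace assistant. " + user_preferences}]

while True:
    user_input = input("You: ")
    if user_input.lower() in {"quit", "exit"}:
        break
    messages.append({"role": "user", "content": user_input})

    # Contextual understanding: the full message history is sent on every turn.
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    answer = reply.choices[0].message.content

    messages.append({"role": "assistant", "content": answer})  # keep the dialogue context
    print("Assistant:", answer)
```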