Enhancing Agentforce with Models API
A detailed guide to building your own custom chatbot with the
Models API
Author: Manisha Maharana
Date: Aug 3, 2025
Introduction
Supported Models for Models API
Model Criteria
Model API Names
Key Capabilities Of Models API
Chat Generations API
Features
Architecture
Description
createChatGenerations(...)
createChatGenerations_Request input
ModelsAPI_ChatGenerationsRequest
ModelsAPI_ChatMessageRequest
ModelsAPI_ChatGenerationsResponse
ModelsAPI_GenerationDetails
ModelsAPI_ChatMessage
DEMO
Pre-requisites
Apex Class
Lightning Web Component
Example
Resources
Introduction
The Models API provides Apex classes and REST endpoints that connect your
application to LLMs from Salesforce partners, including Anthropic, Google, and
OpenAI.
Supported Models for Models API
Custom actions that call the Models API can use any Salesforce-managed model or
a Bring Your Own (BYO) model. Any Salesforce-enabled model that can be
configured in Einstein Studio is supported. To customize model hyperparameters,
create a custom-configured model in Einstein Studio and then use that configured
model with the Models API.
● Salesforce-Managed Models
● Bring Your Own LLM
Model Criteria
To choose the right model for your application, consider these criteria.
● Capabilities: What can the model do?
● Cost: How much does the model cost to use?
● Quality: How well does the model respond?
● Speed: How long does it take the model to complete a task?
Model API Names
Example out-of-the-box (OOTB) model name: sfdc_ai__DefaultOpenAIGPT4OmniMini
The API name is a string made up of substrings:
● Namespace: sfdc_ai
● Separator: __
● Configuration name: Default
● Provider name: OpenAI
● Model name: GPT4OmniMini
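To illustrate how these substrings combine, the following JavaScript sketch joins them into an API name and splits the namespace back off. The helper names buildModelApiName and splitModelApiName are hypothetical and are not part of the Models API; the sketch only assumes the naming pattern listed above.

```javascript
// Hypothetical helper: joins the substrings listed above into a model API name.
function buildModelApiName({ namespace, configuration, provider, model }) {
  const SEPARATOR = "__";
  return `${namespace}${SEPARATOR}${configuration}${provider}${model}`;
}

// Hypothetical helper: splits an API name at the first "__" separator.
// The remainder is returned whole because the configuration, provider, and
// model substrings have no delimiter between them.
function splitModelApiName(apiName) {
  const index = apiName.indexOf("__");
  if (index === -1) {
    return { namespace: null, name: apiName };
  }
  return {
    namespace: apiName.slice(0, index),
    name: apiName.slice(index + 2),
  };
}

const apiName = buildModelApiName({
  namespace: "sfdc_ai",
  configuration: "Default",
  provider: "OpenAI",
  model: "GPT4OmniMini",
});
console.log(apiName);
console.log(splitModelApiName(apiName));
```

Only the namespace can be recovered mechanically, so keep the full API name as the value you pass to the API rather than reassembling it from parts.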
Key Capabilities Of Models API
● Generate Chat
● Generate Embeddings
● Generate Text
● Submit Feedback
Chat Generations API
Features
● Suited to multi-turn interactions and conversations that carry long-running
context
● The chat capability allows you to prompt the model with a list of messages
rather than just one prompt
● Localizations for 18+ regions
● Zero retention agreement and PII redaction (Einstein Trust Layer)
Architecture
Description
● The chat begins with a user’s prompt, which is sent to the Chat Generations
API as a chat message request.
● The chat message request contains a message list which contains the user’s
prompts.
● For each subsequent prompt, append the assistant’s previous response and then
the user’s new prompt to the list, each as a ChatMessageRequest.
● You must specify a role for each message. Supported values:
○ user : prompt from user
○ system : prompt from developer
○ assistant : response from AI
● The role helps the model understand the conversation and know how to
respond.
● After you pass in this information, the API responds with generated text.
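The flow above can be sketched in plain JavaScript as a message list that grows by one assistant reply and one user prompt per turn. The function names createMessage, startConversation, and appendTurn are hypothetical helpers for illustration; only the role values and the role/content fields come from this document.

```javascript
// Roles accepted for a chat message, as listed above.
const SUPPORTED_ROLES = ["user", "system", "assistant"];

// Hypothetical helper: creates one message and rejects unknown roles.
function createMessage(role, content) {
  if (!SUPPORTED_ROLES.includes(role)) {
    throw new Error(`Unsupported role: ${role}`);
  }
  return { role, content };
}

// Starts a conversation with the first user prompt and, optionally,
// a system prompt supplied by the developer.
function startConversation(firstUserPrompt, systemPrompt) {
  const messages = [];
  if (systemPrompt) {
    messages.push(createMessage("system", systemPrompt));
  }
  messages.push(createMessage("user", firstUserPrompt));
  return messages;
}

// Appends the assistant's previous reply and the user's next prompt,
// keeping the list in chronological order. Returns a new array.
function appendTurn(messages, assistantReply, nextUserPrompt) {
  return [
    ...messages,
    createMessage("assistant", assistantReply),
    createMessage("user", nextUserPrompt),
  ];
}

let conversation = startConversation("What is the appointment status?");
conversation = appendTurn(
  conversation,
  "The appointment is scheduled.",
  "Which territory is it in?"
);
console.log(conversation.map((m) => m.role).join(" > "));
```

The whole list is sent with every request, which is how the model sees earlier turns as context.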
createChatGenerations(...)
Generates a response based on the provided chat messages.
● Parameters: createChatGenerations_Request input
● Returns: createChatGenerations_Response
● Throws: createChatGenerations_ResponseException
● Signature: createChatGenerations_Response createChatGenerations(createChatGenerations_Request input);
● Example: aiplatform.ModelsAPI.createChatGenerations_Response response = modelsAPI.createChatGenerations(request);
createChatGenerations_Request input
Contains the request information for a chat generations request.
Property Type Description
modelName String Configured model name.
body ModelsAPI_ChatGenerationsRequest Chat generations request class containing the messages, localization, and necessary tags.
ModelsAPI_ChatGenerationsRequest
Property Type Description
messages List<ModelsAPI_ChatMessageRequest> List of messages to send to the model. These messages should be in chronological order.
localization ModelsAPI_Localization Localization information, which can include the default locale, input locale(s), and expected output locale(s).
tags ModelsAPI_Tags Entries used by the Models API for non-generative purposes and entries used by the client for free-form data.
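As a rough picture of how the request properties nest, the sketch below builds a plain JavaScript object with the same property names (modelName, body, messages). It is an illustration of the shape only, not the Apex classes themselves; buildChatGenerationsRequest is a hypothetical helper, and the optional localization and tags properties are left out.

```javascript
// Hypothetical helper: mirrors the nesting of the request properties
// (modelName at the top level, messages inside body) as a plain object.
function buildChatGenerationsRequest(modelName, messages) {
  if (!modelName) {
    throw new Error("modelName is required");
  }
  return {
    modelName,
    body: {
      // Copy only the two message properties described in this document.
      messages: messages.map(({ role, content }) => ({ role, content })),
    },
  };
}

const chatRequest = buildChatGenerationsRequest(
  "sfdc_ai__DefaultOpenAIGPT4OmniMini",
  [
    { role: "system", content: "Answer in one short paragraph." },
    { role: "user", content: "Summarize this service appointment." },
  ]
);
console.log(JSON.stringify(chatRequest, null, 2));
```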
ModelsAPI_ChatMessageRequest
A message in a conversation. This message is used in a chat generation request.
Property Type Description
role string The persona that sent the message. Possible values: user, system, assistant.
content string The content of the message.
ModelsAPI_ChatGenerationsResponse
A chat generation response.
Property Type Description
id string Unique identifier for the response.
generationDetails ModelsAPI_GenerationDetails Details for a chat generation response.
ModelsAPI_GenerationDetails
Details for a chat generation response.
Property Type Description
generations List<ModelsAPI_ChatMessage> A list of messages used in a chat generation response.
parameters ModelsAPI_GenerationDetails_parameters Any provider-specific attributes included as part of this object.
Example of parameters returned in a response
10:36:59:335 USER_DEBUG [69]|DEBUG|response
ModelsAPI_GenerationDetails_parameters:[properties={created=1754284017,
model=gpt-4o-mini-2024-07-18, object=chat.completion, provider=openai,
system_fingerprint=fp_34a54ae93c, usage={completion_tokens=96,
completion_tokens_details={accepted_prediction_tokens=0, audio_tokens=0,
reasoning_tokens=0, rejected_prediction_tokens=0}, prompt_tokens=480,
prompt_tokens_details={audio_tokens=0, cached_tokens=0}, total_tokens=576}},
properties_set=true]
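The usage entries in the log above can be read to track token consumption per call. The sketch below is a minimal JavaScript illustration that assumes the properties map has already been converted to a plain object; summarizeUsage is a hypothetical helper, and because these attributes are provider-specific, other providers may return different keys.

```javascript
// Values copied from the debug log above, reduced to the fields used here.
const properties = {
  model: "gpt-4o-mini-2024-07-18",
  provider: "openai",
  usage: { completion_tokens: 96, prompt_tokens: 480, total_tokens: 576 },
};

// Hypothetical helper: pulls token counts out of the provider-specific
// properties and falls back to zero when a key is missing.
function summarizeUsage(props) {
  const usage = (props && props.usage) || {};
  const promptTokens = usage.prompt_tokens ?? 0;
  const completionTokens = usage.completion_tokens ?? 0;
  return {
    provider: props?.provider ?? "unknown",
    model: props?.model ?? "unknown",
    promptTokens,
    completionTokens,
    totalTokens: usage.total_tokens ?? promptTokens + completionTokens,
  };
}

console.log(summarizeUsage(properties));
```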
ModelsAPI_ChatMessage
Property Type Description
id string Generation ID. Required to register feedback.
role string Persona that sent the message.
content string The content of the message.
timestamp long Timestamp when the message was sent.
parameters ModelsAPI_ChatMessage_parameters Provider-specific attributes included as part of this object.
contentQuality ModelsAPI_ContentTrustRepresentation Content quality details.
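A caller usually needs only the first generated message plus its id (for feedback). The sketch below shows that null-safe lookup in plain JavaScript, under the assumption that the response has been converted to a plain object with the property names from the tables above. pickFirstGeneration and the sample values are hypothetical; the fallback text matches the one used in the Apex class in the demo.

```javascript
const FALLBACK_TEXT = "No content generated";

// Hypothetical helper: returns the id and content of the first generation,
// or a fallback when the response carries no generations.
function pickFirstGeneration(response) {
  const generations = response?.generationDetails?.generations;
  if (!Array.isArray(generations) || generations.length === 0) {
    return { id: null, content: FALLBACK_TEXT };
  }
  const first = generations[0];
  return { id: first.id ?? null, content: first.content ?? FALLBACK_TEXT };
}

// Placeholder response for illustration; the ids are made up.
const sampleResponse = {
  id: "response-1",
  generationDetails: {
    generations: [
      { id: "generation-1", role: "assistant", content: "The work order is in progress." },
    ],
  },
};

console.log(pickFirstGeneration(sampleResponse));
console.log(pickFirstGeneration({}));
```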
DEMO
Pre-requisites
● From Setup, in the Quick Find box, enter Einstein Setup and click Einstein
Setup.
● Turn on Einstein.
● Enable the Einstein Trust Layer.
Apex Class
RequiredData.cls
A helper class designed to retrieve Salesforce data for model summarization.
public class RequiredData {
    // Returns the service appointment, its parent work order, and the work
    // order's work steps as one string for the model to summarize.
    public static String fetchData(Id recordId) {
        List<ServiceAppointment> saList = [
            SELECT Id, Status, ServiceTerritory.Name, Subject,
                SchedStartTime, Duration, ParentRecordId
            FROM ServiceAppointment
            WHERE Id = :recordId
        ];
        List<WorkOrder> woList = new List<WorkOrder>();
        List<WorkStep> wsList = new List<WorkStep>();
        String jsonString = '';
        if (!saList.isEmpty()) {
            woList = [
                SELECT Id, Status, Subject, ServiceTerritory.Name,
                    Duration, WorkType.Name
                FROM WorkOrder
                WHERE Id = :saList[0].ParentRecordId
            ];
            wsList = [
                SELECT Id, Name, Status, WorkOrderId
                FROM WorkStep
                WHERE WorkOrderId = :saList[0].ParentRecordId
            ];
        }
        if (!woList.isEmpty()) {
            jsonString = saList.toString() + ' ' +
                woList.toString() + ' ' + wsList.toString();
        }
        return jsonString;
    }
}
ModelsAPIChatGenerations
This class is responsible for implementing the core logic and using Models API.
public with sharing class ModelsAPIChatGenerations {
    @AuraEnabled
    public static String createChatGenerations(Id recordId, String input) {
        // Deserialize the message list received from the LWC
        List<ChatMessage> messages = (List<ChatMessage>) JSON.deserialize(
            input,
            List<ChatMessage>.class
        );
        // Fetch the record data from the helper class
        String requiredRecords = RequiredData.fetchData(recordId);
        // Build the system prompt that tells the model how to answer the
        // customer and supplies the Salesforce data to summarise. You can
        // also store this prompt in custom metadata.
        String systemPrompt =
            'You are a Salesforce Expert who has good knowledge on ' +
            'field service managed package. User wants to know about ' +
            'specific record details. Give them a short and crisp information ' +
            'about the record they ask for and do not show them all the ' +
            'record details. Always provide information in a paragraph ' +
            'summarising the answer. Please use the following ' +
            'records to provide the results:' + requiredRecords;
        // Wrap the system prompt in a message list. Constructing the message
        // directly avoids hand-built JSON and manual escaping.
        List<ChatMessage> requiredRecordsMessage = new List<ChatMessage>{
            new ChatMessage('system', systemPrompt)
        };
        // Instantiate the API class
        aiplatform.ModelsAPI modelsAPI = new aiplatform.ModelsAPI();
        // Prepare the request and body objects
        aiplatform.ModelsAPI.createChatGenerations_Request request =
            new aiplatform.ModelsAPI.createChatGenerations_Request();
        aiplatform.ModelsAPI_ChatGenerationsRequest body =
            new aiplatform.ModelsAPI_ChatGenerationsRequest();
        // Specify model
        request.modelName = 'sfdc_ai__DefaultOpenAIGPT4OmniMini';
        // Create a list to hold chat messages
        List<aiplatform.ModelsAPI_ChatMessageRequest> messagesList =
            new List<aiplatform.ModelsAPI_ChatMessageRequest>();
        // Loop through the input messages and create message requests
        for (ChatMessage msg : messages) {
            aiplatform.ModelsAPI_ChatMessageRequest messageRequest =
                new aiplatform.ModelsAPI_ChatMessageRequest();
            // Handle null message
            messageRequest.content = msg.message != null ? msg.message : '';
            // Handle null role
            messageRequest.role = msg.role != null ? msg.role : 'user';
            messagesList.add(messageRequest);
        }
        // Loop through the system prompt and create message requests
        for (ChatMessage msg : requiredRecordsMessage) {
            aiplatform.ModelsAPI_ChatMessageRequest messageRequest =
                new aiplatform.ModelsAPI_ChatMessageRequest();
            // Handle null message
            messageRequest.content = msg.message != null ? msg.message : '';
            // Handle null role
            messageRequest.role = msg.role != null ? msg.role : 'user';
            messagesList.add(messageRequest);
        }
        // Set the messages in the request body
        body.messages = messagesList;
        // Set the request body
        request.body = body;
        String response = '';
        try {
            // Call the API and get the response
            aiplatform.ModelsAPI.createChatGenerations_Response apiResponse =
                modelsAPI.createChatGenerations(request);
            // Check that we have a non-null response
            if (
                apiResponse?.Code200?.generationDetails?.generations != null &&
                !apiResponse.Code200.generationDetails.generations.isEmpty()
            ) {
                // Set the variable from the response
                response = apiResponse.Code200.generationDetails.generations[0]
                    .content;
            } else {
                // Handle the case where response is null
                response = 'No content generated';
            }
        } catch (aiplatform.ModelsAPI.createChatGenerations_ResponseException e) {
            // Handle error
            System.debug('Response code: ' + e.responseCode);
            System.debug('The following exception occurred: ' + e);
            // Add error to the output
            response = 'Unable to get a valid response. Error code: ' + e.responseCode;
        }
        return response;
    }

    // Custom class to format the message
    public class ChatMessage {
        @AuraEnabled
        public String role;
        @AuraEnabled
        public String message;

        public ChatMessage() {
        }

        public ChatMessage(String role, String message) {
            this.role = role;
            this.message = message;
        }
    }
}
Lightning Web Component
This component provides the chat user interface. It is launched as a quick action
and receives the record ID of the record it was launched from. It sends that ID,
along with the message list, to the Apex class, which uses both in the request to
obtain a model response.
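The hand-off from the component to Apex can be sketched as a small pure function that turns the UI message objects into the two arguments the Apex method expects. buildApexPayload is a hypothetical helper written for illustration, and the record ID below is a made-up placeholder; the field names (isUser, text, role, message, recordId, input) are the ones used in the component and Apex class in this demo.

```javascript
// Hypothetical helper: maps the component's UI messages to the arguments
// passed to the Apex method (a record ID and a JSON string of messages).
function buildApexPayload(recordId, uiMessages) {
  const messageArray = uiMessages.map((msg) => ({
    role: msg.isUser ? "user" : "assistant",
    message: msg.text,
  }));
  return { recordId, input: JSON.stringify(messageArray) };
}

const payload = buildApexPayload("08p000000000001AAA", [
  { id: 1, text: "What is the status?", isUser: true },
  { id: 2, text: "It is scheduled.", isUser: false },
  { id: 3, text: "Who is assigned?", isUser: true },
]);
console.log(payload.input);
```

Serializing the list to a string keeps the Apex signature simple, at the cost of a JSON.deserialize call on the server side.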
customChat.html
HTML
<template>
  <lightning-quick-action-panel header="Salesforce Buddy">
    <div
      class="slds-var-m-around_medium slds-grid slds-grid_vertical slds-box slds-theme_default"
    >
      <!-- Chat messages container -->
      <div
        class="slds-scrollable_y"
        style="height: 440px"
        lwc:ref="chatContainer"
      >
        <!-- Iterate over each message in the messages array -->
        <template for:each={messages} for:item="message">
          <div key={message.id} class="slds-var-m-around_small">
            <!-- If the message is from the user -->
            <template lwc:if={message.isUser}>
              <div class="custom-chat-message_outbound slds-var-p-around_small">
                <div class="slds-chat-message__body">
                  <div class="slds-chat-message__text">{message.text}</div>
                </div>
              </div>
            </template>
            <!-- If the message is from the assistant -->
            <template lwc:else>
              <div class="custom-chat-message_inbound slds-var-p-around_small">
                <div class="slds-chat-message__body">
                  <div class="slds-chat-message__text">{message.text}</div>
                </div>
              </div>
            </template>
          </div>
        </template>
        <!-- Loading indicator -->
        <template lwc:if={isLoading}>
          <div class="loading-container slds-var-m-around_small">
            <div class="loading-text">
              Processing your question… One moment.
            </div>
            <div class="loading-indicator"></div>
          </div>
        </template>
      </div>
      <!-- User input textarea -->
      <div class="slds-grid slds-grid_vertical-align-center">
        <lightning-textarea
          class="custom-textarea slds-size_full"
          label="Type a message"
          value={userMessage}
          onchange={handleInputChange}
          style="margin-bottom: 20px"
        ></lightning-textarea>
      </div>
      <!-- Send button -->
      <div class="slds-grid slds-grid_vertical-align-center">
        <div class="slds-col slds-size_1-of-4">
          <lightning-button
            label="Send"
            variant="brand"
            onclick={handleSendMessage}
            disabled={isLoading}
          ></lightning-button>
        </div>
      </div>
    </div>
  </lightning-quick-action-panel>
</template>
customChat.css
CSS
.custom-chat-message_inbound {
  border-radius: 10px;
  max-width: 100%;
  background-color: #f3f2f2;
  align-self: flex-start;
}

.custom-chat-message_outbound {
  border-radius: 10px;
  max-width: 100%;
  background-color: #0070d2;
  color: white;
  align-self: flex-end;
}

.custom-textarea {
  width: 100%;
}

.loading-indicator {
  width: 50px;
  height: 50px;
  overflow: hidden;
  vertical-align: middle;
  position: relative;
}

.loading-indicator::after {
  content: "";
  display: inline-block;
  width: 50px;
  height: 50px;
  overflow: hidden;
  position: absolute;
  top: 50%;
  left: 50%;
  transform: translate(-50%, -50%);
  animation: loading 1s infinite steps(7);
}

@keyframes loading {
  0% {
    content: ".";
  }
  12.5% {
    content: "..";
  }
  25% {
    content: "...";
  }
  37.5% {
    content: ".....";
  }
  50% {
    content: "......";
  }
  62.5% {
    content: "........";
  }
  75% {
    content: "............";
  }
  87.5% {
    content: "";
  }
}
customChat.js
JavaScript
import { LightningElement, track, api } from "lwc";
import createChatGenerations from "@salesforce/apex/ModelsAPIChatGenerations.createChatGenerations";

export default class CustomChat extends LightningElement {
  // ID of the record on which the quick action is launched
  @api recordId;
  @track messages = []; // Array to store chat messages
  userMessage = ""; // User input message
  isLoading = false; // Track loading state
  // Handle user input change
  handleInputChange(event) {
    this.userMessage = event.target.value;
  }

  // Scroll to the bottom of the chat container
  renderedCallback() {
    this.scrollToBottom();
  }

  // Handle send message button click
  handleSendMessage() {
    if (this.userMessage.trim()) {
      const userMessageObj = {
        id: this.messages.length + 1,
        text: this.userMessage,
        role: "user",
        isUser: true,
      };
      // Add user message to the messages array
      this.messages = [...this.messages, userMessageObj];
      this.isLoading = true; // Show loading indicator
      // Prepare message array for API call
      let messageArray = this.messages.map((msg) => ({
        role: msg.isUser ? "user" : "assistant",
        message: msg.text,
      }));
      // Call Apex method to fetch chat response
      createChatGenerations({
        recordId: this.recordId,
        input: JSON.stringify(messageArray),
      })
        .then((result) => {
          this.simulateTypingEffect(result);
        })
        .catch((error) => {
          console.error("Error fetching bot response", JSON.stringify(error));
        })
        .finally(() => {
          this.isLoading = false; // Hide loading indicator
        });
      this.userMessage = ""; // Clear user input
    }
  }
  // Simulate typing effect for the chat response
  simulateTypingEffect(fullText) {
    const words = fullText.split(" ");
    // Compute the id once so the list key stays stable while the text grows
    const messageId = this.messages.length + 1;
    let currentIndex = 0;
    let displayText = "";
    const intervalId = setInterval(() => {
      if (currentIndex < words.length) {
        displayText += words[currentIndex] + " ";
        const botResponseObj = {
          id: messageId,
          text: displayText.trim(),
          role: "assistant",
          isUser: false,
        };
        // Replace the last message if it is the bot's typing message
        if (currentIndex > 0) {
          this.messages.splice(this.messages.length - 1, 1, botResponseObj);
        } else {
          this.messages = [...this.messages, botResponseObj];
        }
        this.scrollToBottom();
        currentIndex++;
      } else {
        clearInterval(intervalId);
      }
    }, 30); // Adjust typing speed (ms per word)
  }
  // Scroll to the bottom of the chat container
  scrollToBottom() {
    const chatContainer = this.template.querySelector(".slds-scrollable_y");
    if (chatContainer) {
      chatContainer.scrollTop = chatContainer.scrollHeight;
    }
  }
}
customChat.js-meta.xml
<?xml version="1.0" encoding="UTF-8"?>
<LightningComponentBundle xmlns="http://soap.sforce.com/2006/04/metadata">
  <apiVersion>64.0</apiVersion>
  <isExposed>true</isExposed>
  <targets>
    <target>lightning__RecordAction</target>
  </targets>
  <targetConfigs>
    <targetConfig targets="lightning__RecordAction">
      <actionType>ScreenAction</actionType>
    </targetConfig>
  </targetConfigs>
</LightningComponentBundle>
Quick Action
● From Setup, click Object Manager and then Service Appointment.
● Go to Buttons, Links, and Actions.
● Create a Lightning Web Component quick action that uses the customChat
component.
● To make the quick action accessible, place it on the service appointment
page layout. This applies to layouts assigned to both admin users and Field
Service mobile workers.
Example
[Figure: the chat quick action on desktop]
[Figure: the chat quick action in the FSL mobile app]
Resources
● Models API Developer Guide
● Blog post on Models API