{"id":42151,"date":"2024-07-05T00:43:20","date_gmt":"2024-07-05T07:43:20","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/premier-developer\/?p=42151"},"modified":"2024-07-05T08:04:54","modified_gmt":"2024-07-05T15:04:54","slug":"copilot-studio-prompt-response-times","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/premier-developer\/copilot-studio-prompt-response-times\/","title":{"rendered":"Copilot Studio &#8211; Prompt Response Times"},"content":{"rendered":"<p><a href=\"https:\/\/www.linkedin.com\/in\/tnaveed\/\">Tahir Naveed<\/a> walks through how to measure prompt response times with Microsoft Copilot Studio and explores considerations that factor into the overall performance of these requests.<\/p>\n<hr \/>\n<p>Building <strong>custom copilots<\/strong> (GenAI chatbots) in <a href=\"https:\/\/www.microsoft.com\/en-us\/microsoft-copilot\/microsoft-copilot-studio\">Microsoft Copilot Studio<\/a> is amazing. Especially with the new functionalities which were released in <a href=\"https:\/\/www.youtube.com\/watch?v=HsGxq7Ws0wE\">Microsoft Build 2024<\/a>. What\u2019s more amazing is when we can point the custom copilots to different kinds of external and internal data sources like public websites (external), Dataverse tables (internal), SharePoint etc.<\/p>\n<p>Today we will be looking at how to check the response time of a prompt when you ask a copilot a question. 
For this we will use <a href=\"https:\/\/learn.microsoft.com\/en-us\/microsoft-edge\/devtools-guide-chromium\/overview\">Developer Tools<\/a> (the <a href=\"https:\/\/learn.microsoft.com\/en-us\/microsoft-edge\/devtools-guide-chromium\/network\/\">Network tool<\/a>, to be exact) in the Microsoft Edge browser to perform the following three experiments.<\/p>\n<p><strong>Prerequisite: <\/strong><a href=\"https:\/\/learn.microsoft.com\/en-us\/microsoft-copilot-studio\/fundamentals-get-started?tabs=web\">Create a custom copilot<\/a> with Copilot Studio and provide the following data sources in the Knowledge tab:<\/p>\n<ul>\n<li><strong>Public website:<\/strong> www.cnbc.com<\/li>\n<li><strong>Dataverse table:<\/strong> Employee Information<\/li>\n<\/ul>\n<p><img decoding=\"async\" width=\"1471\" height=\"400\" class=\"wp-image-42153\" src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica.png\" alt=\"A screenshot of a computer Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica.png 1471w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-300x82.png 300w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-1024x278.png 1024w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-768x209.png 768w\" sizes=\"(max-width: 1471px) 100vw, 1471px\" \/><\/p>\n<p>Test the copilot to make sure it responds to prompts with data from both the public website and the Dataverse table.<\/p>\n<p>In your copilot, go to <strong>Topics<\/strong> -&gt; <strong>System<\/strong> -&gt; 
<strong>Conversational boosting<\/strong> to see the developer view.<\/p>\n<p><img decoding=\"async\" width=\"1495\" height=\"678\" class=\"wp-image-42154\" src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-1.png\" alt=\"A screenshot of a computer Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-1.png 1495w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-1-300x136.png 300w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-1-1024x464.png 1024w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-1-768x348.png 768w\" sizes=\"(max-width: 1495px) 100vw, 1495px\" \/><\/p>\n<p>We will focus on this special system topic because it is triggered when a prompt\u2019s intent is not recognized. The copilot then searches across the data sources we provided, and the results from those data sources are used as context for the LLM to respond in a natural way. This is the <strong>RAG (Retrieval Augmented Generation)<\/strong> pattern.<\/p>\n<h2>Experiment 1 &#8211; Prompt Response Time without GenAI<\/h2>\n<ol>\n<li>Launch the <strong>Developer Tools<\/strong> (Ctrl+Shift+I) and then select the <strong>Network tool<\/strong>. 
Additionally, turn off recording, clear the network log, and filter by Fetch\/XHR requests to reduce the noise.<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"1873\" height=\"921\" class=\"wp-image-42155\" src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-2.png\" alt=\"A screenshot of a computer Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-2.png 1873w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-2-300x148.png 300w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-2-1024x504.png 1024w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-2-768x378.png 768w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-2-1536x755.png 1536w\" sizes=\"(max-width: 1873px) 100vw, 1873px\" \/><\/p>\n<ol start=\"2\">\n<li>Turn on <strong>Record network log<\/strong> <img decoding=\"async\" width=\"28\" height=\"25\" class=\"wp-image-42156\" src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/word-image-42151-4.png\" \/>.<\/li>\n<li>Prompt: <strong>Hi<\/strong><\/li>\n<li>Captured network traffic:<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"1875\" height=\"921\" class=\"wp-image-42157\" src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-3.png\" alt=\"A screenshot of a computer Description automatically generated\" 
srcset=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-3.png 1875w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-3-300x147.png 300w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-3-1024x503.png 1024w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-3-768x377.png 768w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-3-1536x754.png 1536w\" sizes=\"(max-width: 1875px) 100vw, 1875px\" \/><\/p>\n<p>You will see one main request (and some supporting requests), and their combined response time is around half a second (401 ms + 104 ms = 505 ms). This is because the copilot recognized the intent and triggered the <strong>Greetings<\/strong> topic, which responded without using the GenAI functionality. 
A prompt response within half a second is excellent.<\/p>\n<h2>Experiment 2 &#8211; Prompt Response Time with GenAI &amp; Public Websites<\/h2>\n<ol>\n<li>Edit the <strong>Data source<\/strong> in the <strong>Create generative answers<\/strong> action, select cnbc.com, and save the topic.<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"1093\" height=\"643\" class=\"wp-image-42158\" src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-4.png\" alt=\"A screenshot of a computer Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-4.png 1093w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-4-300x176.png 300w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-4-1024x602.png 1024w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-4-768x452.png 768w\" sizes=\"(max-width: 1093px) 100vw, 1093px\" \/><\/p>\n<ol start=\"2\">\n<li>Turn on <strong>Record network log<\/strong> <img decoding=\"async\" width=\"28\" height=\"25\" class=\"wp-image-42159\" src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/word-image-42151-7.png\" \/>.<\/li>\n<li>Prompt: <strong>What is the latest news on S&amp;P 500?<\/strong><\/li>\n<li>Captured network traffic:<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"1864\" height=\"919\" class=\"wp-image-42160\" src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-5.png\" alt=\"A screenshot of a computer 
Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-5.png 1864w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-5-300x148.png 300w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-5-1024x505.png 1024w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-5-768x379.png 768w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-5-1536x757.png 1536w\" sizes=\"(max-width: 1864px) 100vw, 1864px\" \/><\/p>\n<p>With a prompt that goes to an external data source, you will notice a slight delay in the response (3.03 s + 0.279 s + 0.705 s = <strong>4.014 s<\/strong>). This is because, when no intent is matched in the custom topics, the <strong>Conversational boosting<\/strong> topic is triggered. It translates the prompt into a search of cnbc.com (request 1), and the results are then combined with the prompt and sent to the LLM (request 2), which adds a slight delay to the final response. 
Having said that, this prompt used GenAI with all the data from the website and still managed to respond in about 4 seconds, which is fast for this scenario.<\/p>\n<h2>Experiment 3 &#8211; Prompt Response Time with GenAI &amp; Dataverse Table<\/h2>\n<ol>\n<li>Edit the <strong>Data source<\/strong> in the <strong>Create generative answers<\/strong> action, select the Employee Information Dataverse table, and save the topic.<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"1096\" height=\"640\" class=\"wp-image-42161\" src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-6.png\" alt=\"A screenshot of a computer Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-6.png 1096w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-6-300x175.png 300w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-6-1024x598.png 1024w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-6-768x448.png 768w\" sizes=\"(max-width: 1096px) 100vw, 1096px\" \/><\/p>\n<ol start=\"2\">\n<li>Turn on <strong>Record network log<\/strong> <img decoding=\"async\" width=\"28\" height=\"25\" class=\"wp-image-42162\" src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/word-image-42151-10.png\" \/>.<\/li>\n<li>Prompt: <strong>What is the job title, department and Hire Date of Eva Garcia?<\/strong><\/li>\n<li>Captured network traffic:<\/li>\n<\/ol>\n<p><img decoding=\"async\" width=\"1870\" height=\"916\" class=\"wp-image-42163\" 
src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-7.png\" alt=\"A screenshot of a computer Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-7.png 1870w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-7-300x147.png 300w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-7-1024x502.png 1024w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-7-768x376.png 768w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-7-1536x752.png 1536w\" sizes=\"(max-width: 1870px) 100vw, 1870px\" \/><\/p>\n<p>With a prompt that goes to the Dataverse table, you will again notice a delay in the response (0.3 s + 6.15 s + 0.1 s = <strong>6.55 s<\/strong>). This again follows the RAG pattern: when no intent is matched in the custom topics, the <strong>Conversational boosting<\/strong> topic is triggered. It translates the prompt into a search of the Employee Information Dataverse table (request 1), and the results are then combined with the prompt and sent to the LLM (request 2), which delays the final response. Having said that, this prompt used GenAI with the structured data from the Dataverse table and responded in around 6.55 seconds, which is not ideal but can be improved by tweaking the prompts (prompt engineering).<\/p>\n<p>For example, for prompt: <strong>When was Eva Garcia hired? 
<\/strong> I see a roughly 16% improvement (5.5 s, down from 6.55 s) in the response from the same table:<\/p>\n<p><img decoding=\"async\" width=\"373\" height=\"784\" class=\"wp-image-42164\" src=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-8.png\" alt=\"A screenshot of a computer Description automatically generated\" srcset=\"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-8.png 373w, https:\/\/devblogs.microsoft.com\/premier-developer\/wp-content\/uploads\/sites\/31\/2024\/07\/a-screenshot-of-a-computer-description-automatica-8-143x300.png 143w\" sizes=\"(max-width: 373px) 100vw, 373px\" \/><\/p>\n<h2>Conclusion<\/h2>\n<p>As of this writing, the custom copilots created via Copilot Studio use the <strong>GPT 3.5 Turbo<\/strong> LLM. As we saw, the prompt response time depends on the following:<\/p>\n<ul>\n<li>Usage of the GenAI feature.<\/li>\n<li>Type of data source.<\/li>\n<li>Complexity of the prompt.<\/li>\n<\/ul>\n<p>Our only option for now is to engineer our prompts and\/or create topics thoughtfully, as a best practice. Going forward, <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/openai\/concepts\/models\">new versions of LLMs<\/a> will become available and should result in a better experience and improved response times.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Tahir Naveed walks through how to measure prompt response times with Microsoft Copilot Studio and explores considerations that factor into the overall performance of these requests. Building custom copilots (GenAI chatbots) in Microsoft Copilot Studio is amazing. Especially with the new functionalities which were released in Microsoft Build 2024. 
What\u2019s more amazing is when we [&hellip;]<\/p>\n","protected":false},"author":582,"featured_media":42166,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[14,1],"tags":[10656],"class_list":["post-42151","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-permierdev","tag-copilot-studio"],"acf":[],"blog_post_summary":"<p>Tahir Naveed walks through how to measure prompt response times with Microsoft Copilot Studio and explores considerations that factor into the overall performance of these requests. Building custom copilots (GenAI chatbots) in Microsoft Copilot Studio is amazing. Especially with the new functionalities which were released in Microsoft Build 2024. What\u2019s more amazing is when we [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts\/42151","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/users\/582"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/comments?post=42151"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/posts\/42151\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/media\/42166"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/media?parent=42151"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json
\/wp\/v2\/categories?post=42151"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/premier-developer\/wp-json\/wp\/v2\/tags?post=42151"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}