{"id":22823948,"date":"2026-05-20T07:00:00","date_gmt":"2026-05-20T11:00:00","guid":{"rendered":"https:\/\/thenewstack.io\/?p=22823948"},"modified":"2026-05-19T19:03:16","modified_gmt":"2026-05-19T23:03:16","slug":"cursor-composer-benchmarks","status":"publish","type":"post","link":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/","title":{"rendered":"Cursor bets on cheaper coding with Composer 2.5 and Kimi K2.5"},"content":{"rendered":"\n<p>Cursor announced this week that Composer 2.5 is available in Cursor, only two months after the <a href=\"https:\/\/thenewstack.io\/cursors-composer-2-beats-opus\/\" class=\"local-link\">release of Composer 2<\/a>, which beat Opus 4.6 on coding benchmarks at a fraction of the price. It&rsquo;s another burst in the company&rsquo;s streak of model releases, marking the fourth Composer in the last seven months.&nbsp;<\/p>\n\n\n\n<p>Cursor says the latest iteration brings major upgrades to long-running coding tasks, complex instruction-following, and training efficiency, as well as behavioral improvements in &ldquo;communication style and effort calibration,&rdquo; but time will tell whether benchmark gains translate into real-world improvements.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-a-cheaper-contender-in-the-coding-model-line-up\">A cheaper contender in the coding model line-up<\/h2>\n\n\n\n<p>Like its predecessor, Composer 2.5 is built on <a href=\"https:\/\/github.com\/MoonshotAI\/Kimi-K2.5\" class=\"ext-link\" rel=\"external nofollow\" onclick=\"this.target=&#039;_blank&#039;;\">Moonshot Kimi K2.5<\/a>, an open-source native multimodal agentic model, but should now outperform Composer 2 on intelligence and behavior.&nbsp;<\/p>\n\n\n\n<p>In its <a href=\"https:\/\/cursor.com\/blog\/composer-2-5#targeted-rl-with-textual-feedback\" class=\"ext-link\" rel=\"external nofollow\" onclick=\"this.target=&#039;_blank&#039;;\">announcement<\/a>, Cursor attributes these improvements to scaled training, more complex 
reinforcement learning (RL), and new learning methods. When you look at the benchmarks, it&rsquo;s easy to see how Composer 2.5 has leveled up from Composer 2, moving from a 61.7% score on <a href=\"https:\/\/www.tbench.ai\/\" class=\"ext-link\" rel=\"external nofollow\" onclick=\"this.target=&#039;_blank&#039;;\">Terminal-Bench 2.0<\/a> to 69.3% and from 52.2% to 63.2% on its own CursorBench v3.1.&nbsp;<\/p>\n\n\n\n<p>And while Composer 2.5 still hasn&rsquo;t surpassed Opus 4.7&rsquo;s and GPT-5.5&rsquo;s scores (save inching past GPT-5.5 by 2% on <a href=\"https:\/\/www.swebench.com\/multilingual.html\" class=\"ext-link\" rel=\"external nofollow\" onclick=\"this.target=&#039;_blank&#039;;\">SWE-Bench Multilingual<\/a>), it&rsquo;s definitely giving Anthropic and OpenAI a run for their money.&nbsp;<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>But benchmarks are just that &mdash; benchmarks.<\/p>\n<\/blockquote>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"828\" height=\"466\" src=\"https:\/\/cdn.thenewstack.io\/media\/2026\/05\/2e01da0c-cursor-benchmarks.webp\" alt=\"\" class=\"wp-image-22823951\"\/><figcaption class=\"wp-element-caption\">Image: Cursor<\/figcaption><\/figure>\n\n\n\n<p>While the benchmarks offer an interesting, high-level comparison of the industry&rsquo;s main contenders, they don&rsquo;t provide any real assurance of how these models will perform in the real world.&nbsp;<\/p>\n\n\n\n<p>As one <a href=\"https:\/\/www.reddit.com\/r\/vibecoding\/comments\/1tgyqoj\/cursor_annonced_a_model_that_beats_opus_47_and\/\" class=\"ext-link\" rel=\"external nofollow\" onclick=\"this.target=&#039;_blank&#039;;\">Redditor<\/a> commented: &ldquo;Haven&#8217;t tested it yet but the benchmarks are wild. What&#8217;s interesting is that raw model performance doesn&#8217;t always translate to actual coding productivity. 
I&#8217;ve seen plenty of &lsquo;better&rsquo; models still generate code that needs heavy cleanup or doesn&#8217;t fit the project context properly.&rdquo;<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>&#8220;Anyone who&#8217;s used Claude or GPT-4 for actual projects knows that intelligence on benchmarks &ne; usefulness in practice.&rdquo;<\/p>\n<\/blockquote>\n\n\n\n<p>Instead, the commenter argues, the real test of Composer 2.5 will be whether it can handle multi-file changes while staying consistent with existing codebases: &ldquo;Anyone who&#8217;s used Claude or GPT-4 for actual projects knows that intelligence on benchmarks &ne; usefulness in practice.&rdquo;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-cursor-aims-to-improve-long-running-agent-work\">Cursor aims to improve long-running agent work<\/h2>\n\n\n\n<p>Cursor also says Composer 2.5 has leveled up on long-running coding tasks, for which it trained the model with targeted textual feedback to tackle tricky credit assignment during RL: &ldquo;The idea is to provide feedback directly at the point in the trajectory where the model could have behaved better.&rdquo;&nbsp;<\/p>\n\n\n\n<p>By constructing and inserting short hints into the local context, Cursor aims to target specific mistakes while still retaining the bigger-picture RL objective.&nbsp;<\/p>\n\n\n\n<p>Barely a day after release, it&rsquo;s still too early to tell if this training will make a real difference, but an early glimpse at user feedback suggests long-running tasks could still give developers trouble. 
<\/p>\n\n\n\n<p>As one <a href=\"https:\/\/www.reddit.com\/r\/cursor\/comments\/1tha71k\/composer_25_suddenly_believes_its_in_ask_mode\/?show=original\" class=\"ext-link\" rel=\"external nofollow\" onclick=\"this.target=&#039;_blank&#039;;\">Redditor<\/a> notes, &ldquo;Composer 2.5 starts to work in agent mode, then all of a sudden it thinks it&#8217;s in ask mode and stops to work. When I prompt it to continue it tries to understand where it was in the task and only finishes what it just was working on, yet forgets about everything else in the pipeline.&rdquo;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-more-synthetic-data-training-more-unexpected-reward-hacking\">More synthetic data training, more unexpected reward hacking<\/h2>\n\n\n\n<p>According to Cursor, Composer 2.5 was trained on 25 times as many synthetic tasks as Composer 2, using a range of approaches to generate them. But such a breadth of synthetic task creation had at least one sour side effect: unexpected reward hacking.&nbsp;<\/p>\n\n\n\n<p>As Cursor itself admits: &ldquo;As the model became more adept, Composer 2.5 was able to find increasingly sophisticated workarounds to solve the task at hand,&rdquo; such as reverse-engineering a Python type-checking cache.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-are-you-always-getting-what-you-pay-for-nbsp\">Are you always getting what you pay for?&nbsp;<\/h2>\n\n\n\n<p>Composer 2.5 costs $0.50 per million input tokens and $2.50 per million output tokens. 
Upgrading to the &ldquo;faster&rdquo; tier will set you back $3.00 per million input tokens and $15.00 per million output tokens &mdash; but you&rsquo;re left with the same intelligence.&nbsp;<\/p>\n\n\n\n<p>Whether or not the lower latency is worth the sixfold price increase, one thing is for certain: Composer 2.5 is considerably cheaper than both <a href=\"https:\/\/www.anthropic.com\/news\/claude-opus-4-7\" class=\"ext-link\" rel=\"external nofollow\" onclick=\"this.target=&#039;_blank&#039;;\">Opus 4.7<\/a> and <a href=\"https:\/\/openai.com\/api\/pricing\/\" class=\"ext-link\" rel=\"external nofollow\" onclick=\"this.target=&#039;_blank&#039;;\">GPT-5.5<\/a>, with Anthropic&rsquo;s model standing at $25 per million output tokens, OpenAI&rsquo;s at $30 per million output tokens, and both companies charging $5 per million input tokens.<\/p>\n\n\n\n<p>Whether lower prices are enough to push developers to make the switch is the question. &ldquo;We have to ask ourselves if Opus 4.7 is 10x better,&rdquo; comments one Redditor, to which another replies: &ldquo;For some tasks &mdash; yes. I&#8217;m not a huge fan of Composer for UI. But it&#8217;s great for small, targeted tasks. Also, he is excellent at explaining details.&rdquo;<\/p>\n\n\n\n<p>Either way, Cursor says an improvement is already in the works. Last month, Cursor announced a <a href=\"https:\/\/cursor.com\/blog\/spacex-model-training\" class=\"ext-link\" rel=\"external nofollow\" onclick=\"this.target=&#039;_blank&#039;;\">partnership<\/a> with SpaceX on model training. 
The company now teases that it is working with SpaceXAI to train &ldquo;a significantly larger model from scratch, using 10x more total compute&rdquo; that it expects &ldquo;to be a major leap in model capability.&rdquo;&nbsp;<\/p>\n\n\n\n<p>Developers will have to wonder, given this week&#8217;s news about Composer 2.5&#8217;s prices, how much it&rsquo;ll cost.&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Cursor announced this week that Composer 2.5 is available in Cursor, only two months after the release of Composer 2,<\/p>\n","protected":false},"author":2011,"featured_media":22823950,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[13989,14578,13673],"tags":[],"coauthors":[12489],"class_list":["post-22823948","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-agents","category-ai-models","category-developer-tools"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.6 (Yoast SEO v27.6) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Cursor bets on cheaper coding with Composer 2.5 and Kimi K2.5 - The New Stack<\/title>\n<meta name=\"description\" content=\"Cursor&#039;s Composer 2.5 undercuts Opus 4.7 and GPT-5.5 on price, posts gains on Terminal-Bench and SWE-Bench, but real-world coding tests loom.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Cursor bets on cheaper coding with Composer 2.5 and Kimi K2.5\" \/>\n<meta property=\"og:description\" content=\"Cursor&#039;s Composer 2.5 undercuts Opus 
4.7 and GPT-5.5 on price, posts gains on Terminal-Bench and SWE-Bench, but real-world coding tests loom.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/\" \/>\n<meta property=\"og:site_name\" content=\"The New Stack\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/thenewstack\" \/>\n<meta property=\"article:published_time\" content=\"2026-05-20T11:00:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/cdn.thenewstack.io\/media\/2026\/05\/25233ae1-adriandra-karuniawan-ndd8ngyai04-unsplash-scaled.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"2560\" \/>\n\t<meta property=\"og:image:height\" content=\"1809\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Meredith Shubel\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@thenewstack\" \/>\n<meta name=\"twitter:site\" content=\"@thenewstack\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Meredith Shubel\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"NewsArticle\",\"@id\":\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/\"},\"author\":{\"name\":\"Meredith Shubel\",\"@id\":\"https:\\\/\\\/thenewstack.io\\\/#\\\/schema\\\/person\\\/eead7903d121280a70547f785d638bfa\"},\"headline\":\"Cursor bets on cheaper coding with Composer 2.5 and Kimi K2.5\",\"datePublished\":\"2026-05-20T11:00:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/\"},\"wordCount\":864,\"publisher\":{\"@id\":\"https:\\\/\\\/thenewstack.io\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/cdn.thenewstack.io\\\/media\\\/2026\\\/05\\\/25233ae1-adriandra-karuniawan-ndd8ngyai04-unsplash-scaled.jpg\",\"articleSection\":[\"AI Agents\",\"AI Models\",\"Developer tools\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/\",\"url\":\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/\",\"name\":\"Cursor bets on cheaper coding with Composer 2.5 and Kimi K2.5 - The New Stack\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/thenewstack.io\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/cdn.thenewstack.io\\\/media\\\/2026\\\/05\\\/25233ae1-adriandra-karuniawan-ndd8ngyai04-unsplash-scaled.jpg\",\"datePublished\":\"2026-05-20T11:00:00+00:00\",\"description\":\"Cursor's Composer 2.5 undercuts Opus 4.7 and 
GPT-5.5 on price, posts gains on Terminal-Bench and SWE-Bench, but real-world coding tests loom.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/#primaryimage\",\"url\":\"https:\\\/\\\/cdn.thenewstack.io\\\/media\\\/2026\\\/05\\\/25233ae1-adriandra-karuniawan-ndd8ngyai04-unsplash-scaled.jpg\",\"contentUrl\":\"https:\\\/\\\/cdn.thenewstack.io\\\/media\\\/2026\\\/05\\\/25233ae1-adriandra-karuniawan-ndd8ngyai04-unsplash-scaled.jpg\",\"width\":2560,\"height\":1809,\"caption\":\"Scale of happy and sad emojis over thinking eyes.\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/thenewstack.io\\\/cursor-composer-benchmarks\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/thenewstack.io\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Cursor bets on cheaper coding with Composer 2.5 and Kimi K2.5\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/thenewstack.io\\\/#website\",\"url\":\"https:\\\/\\\/thenewstack.io\\\/\",\"name\":\"The New Stack\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\\\/\\\/thenewstack.io\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/thenewstack.io\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/thenewstack.io\\\/#organization\",\"name\":\"The New 
Stack\",\"url\":\"https:\\\/\\\/thenewstack.io\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/thenewstack.io\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/cdn.thenewstack.io\\\/media\\\/2021\\\/11\\\/a9fbec84-the-new-stack-logo-rgb-screen.png\",\"contentUrl\":\"https:\\\/\\\/cdn.thenewstack.io\\\/media\\\/2021\\\/11\\\/a9fbec84-the-new-stack-logo-rgb-screen.png\",\"width\":1032,\"height\":128,\"caption\":\"The New Stack\"},\"image\":{\"@id\":\"https:\\\/\\\/thenewstack.io\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/thenewstack\",\"https:\\\/\\\/x.com\\\/thenewstack\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/thenewstack.io\\\/#\\\/schema\\\/person\\\/eead7903d121280a70547f785d638bfa\",\"name\":\"Meredith Shubel\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/thenewstack.io\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/53f49f49-cropped-35fc143f-meredith-shubel-2-96x96.jpg930fb93852139f2c3618a4f975654fec\",\"url\":\"https:\\\/\\\/thenewstack.io\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/53f49f49-cropped-35fc143f-meredith-shubel-2-96x96.jpg\",\"contentUrl\":\"https:\\\/\\\/thenewstack.io\\\/wp-content\\\/uploads\\\/2025\\\/09\\\/53f49f49-cropped-35fc143f-meredith-shubel-2-96x96.jpg\",\"caption\":\"Meredith Shubel\"},\"description\":\"Meredith Shubel is a technical writer covering cloud infrastructure and enterprise software. She has contributed to The New Stack since 2022, profiling startups and exploring how organizations adopt emerging technologies. Beyond The New Stack, she ghostwrites white papers, executive bylines, and other thought leadership content on cybersecurity, AI and digital transformation.\",\"sameAs\":[\"https:\\\/\\\/www.linkedin.com\\\/in\\\/meredithshubel\\\/\"],\"url\":\"https:\\\/\\\/thenewstack.io\\\/author\\\/mshubel\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Cursor bets on cheaper coding with Composer 2.5 and Kimi K2.5 - The New Stack","description":"Cursor's Composer 2.5 undercuts Opus 4.7 and GPT-5.5 on price, posts gains on Terminal-Bench and SWE-Bench, but real-world coding tests loom.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/","og_locale":"en_US","og_type":"article","og_title":"Cursor bets on cheaper coding with Composer 2.5 and Kimi K2.5","og_description":"Cursor's Composer 2.5 undercuts Opus 4.7 and GPT-5.5 on price, posts gains on Terminal-Bench and SWE-Bench, but real-world coding tests loom.","og_url":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/","og_site_name":"The New Stack","article_publisher":"https:\/\/www.facebook.com\/thenewstack","article_published_time":"2026-05-20T11:00:00+00:00","og_image":[{"width":2560,"height":1809,"url":"https:\/\/cdn.thenewstack.io\/media\/2026\/05\/25233ae1-adriandra-karuniawan-ndd8ngyai04-unsplash-scaled.jpg","type":"image\/jpeg"}],"author":"Meredith Shubel","twitter_card":"summary_large_image","twitter_creator":"@thenewstack","twitter_site":"@thenewstack","twitter_misc":{"Written by":"Meredith Shubel","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"NewsArticle","@id":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/#article","isPartOf":{"@id":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/"},"author":{"name":"Meredith Shubel","@id":"https:\/\/thenewstack.io\/#\/schema\/person\/eead7903d121280a70547f785d638bfa"},"headline":"Cursor bets on cheaper coding with Composer 2.5 and Kimi K2.5","datePublished":"2026-05-20T11:00:00+00:00","mainEntityOfPage":{"@id":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/"},"wordCount":864,"publisher":{"@id":"https:\/\/thenewstack.io\/#organization"},"image":{"@id":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/#primaryimage"},"thumbnailUrl":"https:\/\/cdn.thenewstack.io\/media\/2026\/05\/25233ae1-adriandra-karuniawan-ndd8ngyai04-unsplash-scaled.jpg","articleSection":["AI Agents","AI Models","Developer tools"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/","url":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/","name":"Cursor bets on cheaper coding with Composer 2.5 and Kimi K2.5 - The New Stack","isPartOf":{"@id":"https:\/\/thenewstack.io\/#website"},"primaryImageOfPage":{"@id":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/#primaryimage"},"image":{"@id":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/#primaryimage"},"thumbnailUrl":"https:\/\/cdn.thenewstack.io\/media\/2026\/05\/25233ae1-adriandra-karuniawan-ndd8ngyai04-unsplash-scaled.jpg","datePublished":"2026-05-20T11:00:00+00:00","description":"Cursor's Composer 2.5 undercuts Opus 4.7 and GPT-5.5 on price, posts gains on Terminal-Bench and SWE-Bench, but real-world coding tests 
loom.","breadcrumb":{"@id":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/thenewstack.io\/cursor-composer-benchmarks\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/#primaryimage","url":"https:\/\/cdn.thenewstack.io\/media\/2026\/05\/25233ae1-adriandra-karuniawan-ndd8ngyai04-unsplash-scaled.jpg","contentUrl":"https:\/\/cdn.thenewstack.io\/media\/2026\/05\/25233ae1-adriandra-karuniawan-ndd8ngyai04-unsplash-scaled.jpg","width":2560,"height":1809,"caption":"Scale of happy and sad emojis over thinking eyes."},{"@type":"BreadcrumbList","@id":"https:\/\/thenewstack.io\/cursor-composer-benchmarks\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/thenewstack.io\/"},{"@type":"ListItem","position":2,"name":"Cursor bets on cheaper coding with Composer 2.5 and Kimi K2.5"}]},{"@type":"WebSite","@id":"https:\/\/thenewstack.io\/#website","url":"https:\/\/thenewstack.io\/","name":"The New Stack","description":"","publisher":{"@id":"https:\/\/thenewstack.io\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/thenewstack.io\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/thenewstack.io\/#organization","name":"The New Stack","url":"https:\/\/thenewstack.io\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/thenewstack.io\/#\/schema\/logo\/image\/","url":"https:\/\/cdn.thenewstack.io\/media\/2021\/11\/a9fbec84-the-new-stack-logo-rgb-screen.png","contentUrl":"https:\/\/cdn.thenewstack.io\/media\/2021\/11\/a9fbec84-the-new-stack-logo-rgb-screen.png","width":1032,"height":128,"caption":"The New 
Stack"},"image":{"@id":"https:\/\/thenewstack.io\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/thenewstack","https:\/\/x.com\/thenewstack"]},{"@type":"Person","@id":"https:\/\/thenewstack.io\/#\/schema\/person\/eead7903d121280a70547f785d638bfa","name":"Meredith Shubel","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/thenewstack.io\/wp-content\/uploads\/2025\/09\/53f49f49-cropped-35fc143f-meredith-shubel-2-96x96.jpg930fb93852139f2c3618a4f975654fec","url":"https:\/\/thenewstack.io\/wp-content\/uploads\/2025\/09\/53f49f49-cropped-35fc143f-meredith-shubel-2-96x96.jpg","contentUrl":"https:\/\/thenewstack.io\/wp-content\/uploads\/2025\/09\/53f49f49-cropped-35fc143f-meredith-shubel-2-96x96.jpg","caption":"Meredith Shubel"},"description":"Meredith Shubel is a technical writer covering cloud infrastructure and enterprise software. She has contributed to The New Stack since 2022, profiling startups and exploring how organizations adopt emerging technologies. 
Beyond The New Stack, she ghostwrites white papers, executive bylines, and other thought leadership content on cybersecurity, AI and digital transformation.","sameAs":["https:\/\/www.linkedin.com\/in\/meredithshubel\/"],"url":"https:\/\/thenewstack.io\/author\/mshubel\/"}]}},"_links":{"self":[{"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/posts\/22823948","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/users\/2011"}],"replies":[{"embeddable":true,"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/comments?post=22823948"}],"version-history":[{"count":16,"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/posts\/22823948\/revisions"}],"predecessor-version":[{"id":22824004,"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/posts\/22823948\/revisions\/22824004"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/media\/22823950"}],"wp:attachment":[{"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/media?parent=22823948"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/categories?post=22823948"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/tags?post=22823948"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/thenewstack.io\/wp-json\/wp\/v2\/coauthors?post=22823948"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}