GPT 5.2 on Azure rate_limit_exceeded & no_kv_space issues #5526

@sellerscrisp

Description

This is not really an opencode problem, but it is fixable. Azure is having some trouble with its release of GPT 5.2 on Foundry, and there is a clear flaw in the API design: error responses still come back with a 200 OK status code, which defeats most retry mechanisms that rely on the HTTP status.

{"type": "error","sequence_number":2,"error":{"type":"server_error","code":"rate_limit_exceeded","message":"| ... | Traceback (most recent call):\n | \n|  File \"/usr/local/lib/python3.12/site-packages/inference_server/routes.py\", line 726, in streaming_completion\n |  await response.write_to(reactor)]\n | oai_grpc.errors.ServerError:  | no_kv_space\n  | ","param":null }

as well as:

{"type":"error","sequence_number":2,"error":{"type":"server_error","code":"server_error","message":"An error occurred while processing your request. You can retry your request, or contact us through an Azure support request at: https://go.microsoft.com/fwlink/?linkid=2213926 if the error persists. Please include the request ID ... in your message.","param":null}}

Looking at retryable(), I think it makes sense to add retry logic for these cases. I'm testing the additions in my fork: https://github.com/sellerscrisp/opencode/tree/feat/azure-foundry-retry
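For illustration, here is a rough sketch of the kind of check I have in mind. It is not the code from the fork and not the actual signature of retryable(); every name below (`StreamErrorEvent`, `isRetryableStreamError`, `RETRYABLE_CODES`, `RETRYABLE_MESSAGE_MARKERS`) is hypothetical. It assumes the stream event has already been parsed from JSON, and that the two codes seen in the payloads above plus the `no_kv_space` marker are all transient and safe to retry.

```typescript
// Hypothetical sketch: none of these names come from opencode's source.

// Shape of the error events quoted in this issue.
interface StreamErrorEvent {
  type: "error";
  sequence_number?: number;
  error: {
    type: string;
    code: string | null;
    message: string;
    param: string | null;
  };
}

// Error codes seen in the two payloads above (assumed to be transient).
const RETRYABLE_CODES = new Set<string>(["rate_limit_exceeded", "server_error"]);

// Message substrings that suggest transient backend pressure (assumption).
const RETRYABLE_MESSAGE_MARKERS = ["no_kv_space"];

// Narrow an arbitrary parsed JSON value to an error event.
function isStreamErrorEvent(value: unknown): value is StreamErrorEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (v.type !== "error") return false;
  const err = v.error;
  if (typeof err !== "object" || err === null) return false;
  return typeof (err as Record<string, unknown>).message === "string";
}

// True when a parsed stream event is an error that looks worth retrying,
// regardless of the HTTP status (which is 200 OK in the cases above).
function isRetryableStreamError(event: unknown): boolean {
  if (!isStreamErrorEvent(event)) return false;
  const { code, message } = event.error;
  if (typeof code === "string" && RETRYABLE_CODES.has(code)) return true;
  return RETRYABLE_MESSAGE_MARKERS.some((marker) => message.includes(marker));
}
```

The point is that the decision has to be made from the event body rather than the status code, so a check like this would need to run on each parsed stream event before the response is treated as successful.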

OpenCode version

1.0.152

Steps to reproduce

  1. Deploy GPT 5.2 on Azure Foundry
  2. Set up and use GPT 5.2 in OpenCode

Screenshot and/or share link

Image

Operating System

No response

Terminal

No response

Labels

bug: Something isn't working
