
Context overflow from vLLM is not handled -> no auto compaction #17762

@ghost91-

Description

There are already dedicated patterns for detecting context overflow for a number of providers. However, there are no such patterns for vLLM yet. This means that when using a vLLM instance as a provider, auto compaction is not triggered when a request fails due to context overflow.
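For illustration, a minimal sketch of what a vLLM overflow pattern could look like, in the spirit of the existing per-provider patterns. The function name `isContextOverflow` is hypothetical, and the error wording is an assumption: vLLM's OpenAI-compatible server is believed to return an OpenAI-style "maximum context length" message, but this should be verified against a real error response.

```typescript
// Sketch of a context-overflow detector for vLLM (hypothetical helper,
// not opencode's actual API). The error wording is assumed from vLLM's
// OpenAI-compatible server and should be checked against a real response.
const VLLM_CONTEXT_OVERFLOW =
  /maximum context length is \d+ tokens.*\b\d+ tokens\b/s;

export function isContextOverflow(message: string): boolean {
  return VLLM_CONTEXT_OVERFLOW.test(message);
}

// Hypothetical error body, mirroring the OpenAI-style message format:
export const sample =
  "This model's maximum context length is 4096 tokens. " +
  "However, you requested 5000 tokens (4000 in the messages, " +
  "1000 in the completion). Please reduce the length of the messages or completion.";
```

A fix would presumably register a pattern like this alongside the existing provider-specific ones so the compaction path fires on vLLM errors too.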

Plugins

No response

OpenCode version

1.2.26

Steps to reproduce

  1. Serve some model with vLLM
  2. Configure that vLLM instance as a provider (OpenAI compatible)
  3. Send a message to vLLM that is too large for the model's context window
  4. Observe that there is an error, but no auto compaction
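For step 2, a hedged sketch of what the provider entry might look like in `opencode.json`. The provider id `vllm`, the `baseURL`, and the model id are placeholders, and the `@ai-sdk/openai-compatible` npm key is an assumption about how opencode wires up OpenAI-compatible providers; check the opencode configuration docs before using it.

```json
{
  "provider": {
    "vllm": {
      "npm": "@ai-sdk/openai-compatible",
      "options": { "baseURL": "http://localhost:8000/v1" },
      "models": { "my-model": {} }
    }
  }
}
```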

Screenshot and/or share link

No response

Operating System

Arch Linux (irrelevant)

Terminal

Alacritty (irrelevant)

Metadata

Labels

bug (Something isn't working), core (Anything pertaining to core functionality of the application (opencode server stuff))
