fix(cache): improve Anthropic prompt cache hit rate with system split and tool stability by bhagirathsinh-vaghela · Pull Request #14743 · anomalyco/opencode

bhagirathsinh-vaghela · 2026-02-23T03:17:43Z

Issue for this PR

Closes #5416, #5224
Related: #14065, #5422, #14203

Type of change

Bug fix

What does this PR do?

Fixes cross-repo and cross-session Anthropic prompt cache misses. Same-session caching already works (AI SDK places markers correctly). This PR fixes the cases where the prefix changes between repos, sessions, or process restarts — causing full cache writes on every first prompt.

Anthropic hashes tools → system → messages in prefix order. Any change to an earlier block invalidates everything after it. OpenCode has several sources of unnecessary prefix changes.

Terminology (1-indexed): S1/S2 = system block 1/2. M1/M2 = cache marker on S1/S2.

Always-active fixes:

System prompt is a single block — dynamic content (env, project AGENTS.md) invalidates the stable provider prompt. Split into 2 blocks: stable (provider prompt + global AGENTS.md) first, dynamic (env + project) second.
Bash tool schema includes Instance.directory — changes per-repo, invalidating tool hash. Removed; model gets cwd from the environment block.
Skill tool ordering is nondeterministic — Object.values() on glob results. Sorted by name.

Opt-in fixes (behind env var flags):

Date and instructions change between turns — OPENCODE_EXPERIMENTAL_CACHE_STABILIZATION=1 freezes date and caches instruction file reads for the process lifetime.
Extended cache TTL — OPENCODE_EXPERIMENTAL_CACHE_1H_TTL=1 sets 1h TTL on M1 (2x write cost vs 1.25x for default 5-min). Useful for sessions with idle gaps.

Commits:

#	What	Behind flag?
1	Cache token audit logging	`OPENCODE_CACHE_AUDIT`
2	Stabilize system prefix (freeze date + instructions)	`OPENCODE_EXPERIMENTAL_CACHE_STABILIZATION`
3	Split system prompt into 2 blocks	Always active
4	Remove cwd from bash tool schema	Always active
5	Sort skill tool ordering	Always active
6	Optional 1h TTL on M1	`OPENCODE_EXPERIMENTAL_CACHE_1H_TTL`

What this doesn't fix:

Per-project skills or MCP tools that differ across repos — the skill tool description changes per project, breaking M1 even on the same machine. This is expected; per-project tools are inherently dynamic.
Cross-machine cache sharing (different skill tool descriptions per machine)
Plan/build mode switches (TaskTool description changes per mode) — deferred
Compaction cache alignment (/compact doesn't utilize prompt caching #10342 — planned follow-up)

Impact beyond Anthropic: The prefix stability fixes also benefit providers with automatic prefix caching (OpenAI, DeepSeek, Gemini, xAI, Groq) — no markers needed, just a stable prefix.

How did you verify your code works?

OPENCODE_CACHE_AUDIT=1 logs [CACHE] hit/miss per LLM call. Tested with Claude Sonnet 4.6 on Anthropic direct API, bun dev, Feb 23 2026.

Cross-repo (different folder, within 5-min TTL — the key improvement):

BEFORE (no fixes):

Prompt 1: hit=0.0%   read=0      write=17,786  new=3   (full miss, no reuse)
Prompt 2: hit=99.9%  read=17,786 write=10      new=3
Prompt 3: hit=99.9%  read=17,796 write=14      new=3

AFTER (system split + tool stability):

Prompt 1: hit=97.6%  read=17,345 write=428     new=3   (block 1 reused, only env block misses)
Prompt 2: hit=99.9%  read=17,773 write=10      new=3
Prompt 3: hit=99.9%  read=17,783 write=14      new=3

The first prompt in a new repo goes from 0% → 97.6% cache hit. S1 (tools + provider prompt + global AGENTS.md) is reused across repos. These numbers are based on my setup — S1 is ~17,345 tokens, mostly tool definitions (~12k tokens), with provider prompt (~2k) and global AGENTS.md (~2.8k) making up the rest. Your numbers will differ based on your tool set (MCP servers, skills) and global AGENTS.md size, but the cross-repo miss is eliminated regardless.

Only block 2 (env with different cwd = 428 tokens) is a cache write on the first prompt in a new repo.

To reproduce:

OPENCODE_CACHE_AUDIT=1 bun dev /tmp/folder-a
# send a prompt, exit
OPENCODE_CACHE_AUDIT=1 bun dev /tmp/folder-b
# send a prompt within 5 minutes
grep '\[CACHE\]' ~/.local/share/opencode/log/dev.log

Screenshots / recordings

N/A — no UI changes.

Updates (post-rebase, Mar 21 2026)

Rebased onto current dev
Dropped skill ordering commit (already merged via fix: stabilize agent and skill ordering in prompt descriptions #18261)
fix: stabilize agent and skill ordering in prompt descriptions #18261 also added SystemPrompt.skills() which puts skill descriptions in the system prompt. Global skills (from ~/.config/opencode/skills/) are now placed in S1 (stable) and project skills in S2 (dynamic), so global skills don't cause cache writes on cross-repo switch. On my setup, cross-repo cache hit improved from 87% → 97.7%.
Added commit 6: splitSystemPrompt provider config option. Providers that reject multiple system messages (e.g. llama.cpp with Qwen templates) can set provider.<id>.options.splitSystemPrompt: false to get single-block behavior.

Updated commit table:

#	What	Behind flag?
1	Cache token audit logging	`OPENCODE_CACHE_AUDIT`
2	Stabilize system prefix (freeze date + instructions)	`OPENCODE_EXPERIMENTAL_CACHE_STABILIZATION`
3	Split system prompt and skills into stable/dynamic blocks	Always active
4	Remove cwd from bash tool schema	Always active
5	Optional 1h TTL on M1	`OPENCODE_EXPERIMENTAL_CACHE_1H_TTL`
6	`splitSystemPrompt` provider option to opt out of split	Config: `provider.<id>.options.splitSystemPrompt: false`

Checklist

I have tested my changes locally
I have not included unrelated changes in this PR

github-actions · 2026-02-23T03:18:19Z

The following comment was made by an LLM, it may be inaccurate:

Potential related PRs found:

fix: split system prompt into 2 messages for proper Anthropic prompt caching #14203 - "fix: split system prompt into 2 messages for proper Anthropic prompt caching"
- fix: split system prompt into 2 messages for proper Anthropic prompt caching #14203
- Directly related: This appears to address the same system prompt splitting approach that is a core fix in PR fix(cache): improve Anthropic prompt cache hit rate with system split and tool stability #14743.
feat(provider): add provider-specific cache configuration system (significant token usage reduction) #5422 - "feat(provider): add provider-specific cache configuration system (significant token usage reduction)"
- feat(provider): add provider-specific cache configuration system (significant token usage reduction) #5422
- Related: This PR deals with provider-specific cache configuration and token usage optimization, which aligns with the cache improvements in fix(cache): improve Anthropic prompt cache hit rate with system split and tool stability #14743.
feat: add experimental compaction prompt and preserve prefix support #11492 - "feat: add experimental compaction prompt and preserve prefix support"
- feat: add experimental compaction prompt and preserve prefix support #11492
- Related: Addresses prefix preservation during compaction, which relates to the cache stability goals.

Note: PR #14203 appears to be the most directly related, as it's specifically about the system prompt splitting strategy that is a key component of PR #14743's improvements.

github-actions · 2026-02-23T03:32:33Z

Thanks for updating your PR! It now meets our contributing guidelines. 👍

bhagirathsinh-vaghela · 2026-02-23T04:04:21Z

Reviewer's guide — supplementary context not covered in the PR description. Uses same terminology (S1/S2, M1/M2) defined there.

AI SDK cache marker mechanics

Ref: Anthropic prompt caching docs | Anthropic engineers' caching best practices (Feb 19 2026): Thariq Shihipar, R. Lance Martin

Max 4 cache_control markers per request. The AI SDK already places markers on the first 2 system blocks and the last 2 conversation turns. That part works — the problem is OpenCode mutating blocks before these markers, cascading hash changes downstream.

Key subtlety: before this PR, OpenCode had a single system block. M1 covered it, but M2 was unused — it fell through to conversation. The system split (commit 3) is what activates both markers, letting S1 (stable) cache independently from S2 (dynamic).

Since M1 covers the tool block too (tools hash before system in Anthropic's ordering), any tool instability (commits 4–5) completely invalidates M1 — the entire cached prefix up to that marker is lost.

Related open PRs

Several open PRs address parts of this (#5422, #14203, #10380, #11492). This PR addresses the root causes directly.

Update (post-rebase, Mar 21 2026)

Skill ordering commit dropped (covered by fix: stabilize agent and skill ordering in prompt descriptions #18261)
Global skills now placed in S1 (stable block) alongside provider prompt and global AGENTS.md. Project skills remain in S2. This was needed because fix: stabilize agent and skill ordering in prompt descriptions #18261 added SystemPrompt.skills() to the system prompt during the rebase.
Added splitSystemPrompt provider option (default: on). Providers that reject multiple system messages can set splitSystemPrompt: false in their provider config to get single-block behavior.

bhagirathsinh-vaghela · 2026-02-23T05:31:07Z

CI failure seems pre-existing — same NotFoundError affecting all PRs since the Windows path fixes landed in dev. Unrelated to this PR. All other checks pass.

ShanePresley · 2026-02-23T22:25:09Z

I pulled this into my fork and it's working beautifully. Unfortunately I only found this after getting a huge bill from Anthropic. Thanks OpenCode!

TomLucidor · 2026-02-24T07:55:01Z

@bhagirathsinh-vaghela could you check this with SLMs like Qwen3 or Nemotron or Kimi-Linear or GPT-OSS? Or providers using the OpenAI-compatible APIs (e.g. OpenRouter)?
Also why are some of the E2E tests failing in OpenCode PR?

Bonus ask: would Speculative Decoding work with this fork? I am looking at this from the lens of vLLM-MLX and MLX-OpenAI-Server (for non-MLX there is vLLM).

bhagirathsinh-vaghela · 2026-03-02T01:57:05Z

@bhagirathsinh-vaghela could you check this with SLMs like Qwen3 or Nemotron or Kimi-Linear or GPT-OSS? Or providers using the OpenAI-compatible APIs (e.g. OpenRouter)? Also why are some of the E2E tests failing in OpenCode PR?

Bonus ask: would Speculative Decoding work with this fork? I am looking at this from the lens of vLLM-MLX and MLX-OpenAI-Server (for non-MLX there is vLLM).

@TomLucidor

The fixes are provider/model-agnostic — they stabilize the request prefix so it is byte-for-byte identical across calls. Any provider with server-side prefix caching benefits automatically. See my reviewer's guide comment above for the full breakdown of each fix.

The specific model behind the provider does not matter — the changes are purely at the request layer. You can verify with any provider using OPENCODE_CACHE_AUDIT=1 to see hit/miss per call.

E2E failures — pre-existing upstream issue, since fixed. CI is green now.

Speculative decoding — orthogonal. This PR only changes what is sent in the request, not how the server processes it.

kamelkace · 2026-03-04T19:29:37Z

packages/opencode/src/session/system.ts

        `  Is directory a git repo: ${project.vcs === "git" ? "yes" : "no"}`,
        `  Platform: ${process.platform}`,
-        `  Today's date: ${new Date().toDateString()}`,
+        `  Today's date: ${date.toDateString()}`,


Would it make sense to change the wording here, to hint to the LLM that this isn't a live updating value? Otherwise it might make some weird choices elsewhere for long lived conversations. E.g.

Suggested change

` Today's date: ${date.toDateString()}`,

` Session started at: ${date.toDateString()}`,

Good point — this is better to show when the date is frozen. I'm keeping Today's date in this PR for now since it's what all OpenCode users expect(at least by experience even if they are not aware), but I'm not against the change if maintainers agree.

Separately, I've been experimenting locally with a progressive disclosure approach — making the env block fully static, instructing the model to fetch cwd, date, platform, etc. via tool calls when needed. Eliminates the block 2 cache write entirely at the cost of an occasional extra round-trip.

Interesting finding in this approach: completely removing the env block tended to result in models not bothering to fetch the info at all and assume things which is non deterministic. A static block with explicit "figure out when needed" instructions worked much better, at least with Anthropic models.

Separately, I've been experimenting locally with a progressive disclosure approach — making the env block fully static, instructing the model to fetch cwd, date, platform, etc. via tool calls when needed. [...] A static block with explicit "figure out when needed" instructions worked much better, at least with Anthropic models.

Hmm! I'll have to give that a shot when I patch from this PR later; I'm running locally against one of the Qwen3.5 models, so it'll be interesting data to see how they respond.

fkroener · 2026-03-08T08:01:07Z

Looking forward to seeing less prompt re-processing with opencode. Unfortunately it seems currently this patchset breaks llama.cpp support:

[60919] srv operator(): got exception: {"error":{"code":400,"message":"Unable to generate parser for this template. Automatic parser generation failed: \n------------\nWhile executing CallExpression at line 85, column 32 in source:\n...first %}↵ {{- raise_exception('System message must be at the beginnin...\n ^\nError: Jinja Exception: System message must be at the beginning.","type":"invalid_request_error"}}

Tested with and without the new autoparser. Maybe I'm using it wrong?

fkroener · 2026-03-09T09:05:53Z

So, after partially reverting fix(cache): split system prompt into 2 blocks for independent caching, or rather naively ensuring llama.cpp gets just one system prompt (revert.patch) opencode now flies with this patchset using a llama.cpp endpoint (openai api though).

No more "erased invalidated context checkpoint" for all checkpoints and reprocessing of the entire context seemingly whenever I send a new query.

Checkpoint reuse happens usually at around 99 %, sometimes drops to 93 % - lowest was in the 70 % with > 60k tokens.

Much appreciated!

Wonder whether the split system message is something @pwilkin would be willing to support or whether it should be guarded to only be sent to Antrophic endpoints.

pwilkin · 2026-03-09T13:41:27Z

Any chance the system message could be moved to the top of the messages list? We could possibly do this for the Anthropic API, but technically the system prompt should be the first message.

fkroener · 2026-03-09T13:56:58Z

Thanks @pwilkin. Given this is actually coming from the model template (Qwen 3.5) and not the parser:

    {%- if message.role == "system" %}
        {%- if not loop.first %}
            {{- raise_exception('System message must be at the beginning.') }}
        {%- endif %}
    {%- elif message.role == "user" %}

this should probably best be handled on OpenCode's end.

…m PR anomalyco#14743 - Add OPENCODE_EXPERIMENTAL_CACHE_STABILIZATION and OPENCODE_EXPERIMENTAL_CACHE_1H_TTL flags - Split system prompt into 2 blocks (stable/dynamic) for better cache reuse - Freeze date and instructions behind OPENCODE_EXPERIMENTAL_CACHE_STABILIZATION flag - Remove Instance.directory from bash tool schema for cross-repo cache hits - Sort skill tools alphabetically for deterministic ordering - Add extended TTL support for first system cache marker - Add cache audit display in TUI sidebar behind OPENCODE_CACHE_AUDIT env var - Fix llama-server compatibility: join system blocks for non-Anthropic providers - Update tests for all changed functionality Co-authored-by: chand1012 <[email protected]>

sandeep-chaps · 2026-03-13T15:55:52Z

When will this PR make it into a release? We are seeing lower cache hit rates (Anthropic) across users using the same repo with a standard workflow based on opencode. -> higher token costs

Stellarthoughts · 2026-03-14T08:36:28Z

Even more important now to get it into release with the general rollout of 1M Context windows for Max subscribers. The price remained as if it was 200K window, so it's up to caching to cut costs.

https://claude.com/blog/1m-context-ga

hhieuu · 2026-03-18T07:57:01Z

Would love to see this get in as well. Caching is much less efficient in OpenCode with Claude models. We are pushing internal users to OpenCode for better general model supports, but the caching issue is a blocker.

thdxr · 2026-03-21T15:42:24Z

we are looking at it

rekram1-node · 2026-03-21T19:13:28Z

I think most of these changes prolly make sense, but it seems like the primary 2 things that are gained:

option for 1H ttl
tool prompt cache won't bust between projects as frequently.

In my experience #2 prolly won't have much impact for most ppl but we may as well do that

We actually resolved some of the ordering things in separate pr, ill look at the rest of this and then we will ship a cleaned up version

bhagirathsinh-vaghela · 2026-03-22T03:36:15Z

@fkroener The system split can now be disabled per-provider via config. For your llama.cpp setup, add to your opencode.json:

{
  "provider": {
    "<your-llama-provider-id>": {
      "options": {
        "splitSystemPrompt": false
      }
    }
  }
}

This builds a single system message (pre-PR behavior) while still benefiting from the other prefix stability fixes (tool schema, skill ordering).

bhagirathsinh-vaghela · 2026-03-22T03:36:22Z

@rekram1-node Thanks for the feedback. Updates since your comment:

Skill ordering commit dropped — already covered by fix: stabilize agent and skill ordering in prompt descriptions #18261
System split commit now also separates global vs project skills (global skills in S1 for cross-repo reuse)
Added splitSystemPrompt provider config option for providers that reject multiple system messages (llama.cpp/Qwen)

…NCODE_CACHE_AUDIT

…E_STABILIZATION

… for independent caching

…OPENCODE_EXPERIMENTAL_CACHE_1H_TTL flag

…m block split

henry701 · 2026-03-23T04:51:57Z

+1, was creating a draft PR for this and bot directed me to this. It's not just for Anthropic, this busts cache for every single LLM implementation that uses prefixing to preserve model output quality (all of them?)

phazei · 2026-03-26T04:39:49Z

I do hope this gets merged in, it's useful overall, and while support was removed for Anthropic Plus/Max plans, it would still be good for API as well as all the people who are just going to use a plugin to provide more Claude support.

rekram1-node · 2026-03-26T16:05:53Z

Yeah today Ill merge this in, I am gonna split it up into separate commits/prs and resolve the conflicts but ill add the author of the PR as a coauthor on the commits so he gets credit for helping out.

…o#14973 (agent loop fix) PR anomalyco#14743 — fix(cache): improve Anthropic prompt cache hit rate - Split system prompt into stable (global) + dynamic (project) blocks - Remove cwd from bash tool schema (was busting cache per-repo) - Freeze date under OPENCODE_EXPERIMENTAL_CACHE_STABILIZATION flag - Add optional 1h TTL on first system block (OPENCODE_EXPERIMENTAL_CACHE_1H_TTL) - Add OPENCODE_CACHE_AUDIT logging for per-call cache accounting - Track global vs project skill scope for stable cache prefix - Add splitSystemPrompt provider option to opt out PR anomalyco#14973 — fix(core): prevent agent loop stopping after tool calls - Check lastAssistantMsg.parts for tool type before exiting loop - Fixes OpenAI-compatible providers (Gemini, LiteLLM) returning finish_reason 'stop' instead of 'tool_calls' when tools were called ci: add FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 to upstream-sync workflow build: relax bun version check to minor-level for local builds

AndresCdo

Excellent work on this PR. The benchmarks are compelling — going from 0% → 97.6% cross-repo cache hit is a massive improvement. I've reviewed the diff in detail and cross-referenced with related work (#5422, #5224, #14065). Here's my feedback:

Strengths

System prompt split is the right architectural fix. Separating stable (provider prompt + global instructions) from dynamic (env + project) content addresses the root cause of cache invalidation.
Bash tool schema cwd removal is a clean fix. The model already gets cwd from the environment block, so this is redundant and harmful to cache stability.
Skill scope classification (global vs project) is a valuable addition that extends beyond caching — it improves prompt organization generally.
splitSystemPrompt provider option is a good escape hatch for providers that reject multiple system messages.
Cache audit logging (OPENCODE_CACHE_AUDIT) is excellent for observability and debugging.

Concerns

1. Module-level singletons lack invalidation

session/system.ts and session/instruction.ts use process-lifetime singletons (cachedDate, cached) with no invalidation path. This is fine for the CACHE_STABILIZATION flag's intent, but consider:

Tests that change the date or instructions between runs will get stale data
If a user edits their global AGENTS.md mid-session, the cached version won't reflect it
Multi-instance scenarios (multiple project directories) share the same cache

Suggestion: Add a simple invalidation mechanism, even if it's just clearCache() exported for tests.

2. systemSplit calculation is fragile

In session/prompt.ts, systemSplit = instructions.global.length + (skills.global ? 1 : 0). This assumes the system array ordering is fixed. If any future change reorders this array, the split point will be wrong and cache markers will land on the wrong content.

Suggestion: Consider making the split explicit rather than positional. Pass { stable: string[], dynamic: string[] } instead of { system: string[], systemSplit: number }.

3. MCP tools are still a cache breaker

You correctly note this in 'What this doesn't fix'. A mitigation: sort MCP tools by name (deterministic ordering) and place them after the stable system prefix. This won't eliminate the cache miss but will ensure consistency across turns.

4. 1h TTL as env var vs config

OPENCODE_EXPERIMENTAL_CACHE_1H_TTL is behind an env var. Given the strong benchmarks, this could be a config option to align with #16848 and #244.

Relationship to #5422

PR #5422 takes a different approach (ProviderConfig with 874 lines of provider-specific defaults). This PR is more surgical (243 lines, focused fixes). I'd recommend merging this first as the foundation, then layering #5422's provider-specific defaults on top if needed.

Verdict

This is a well-scoped, well-tested PR that addresses a real performance problem. My concerns are mostly about long-term maintainability rather than correctness — none should block merging. Would love to see the systemSplit fragility addressed before merge.

phazei · 2026-04-01T22:12:36Z

@rekram1-node did you manage to get this split up and merged?

github-actions bot added the needs:compliance This means the issue will auto-close after 2 hours. label Feb 23, 2026

github-actions bot removed the needs:compliance This means the issue will auto-close after 2 hours. label Feb 23, 2026

bhagirathsinh-vaghela force-pushed the prompt-caching branch from b67a66a to 906a317 Compare February 23, 2026 04:06

bhagirathsinh-vaghela marked this pull request as ready for review February 23, 2026 04:18

bhagirathsinh-vaghela force-pushed the prompt-caching branch from 906a317 to c499424 Compare February 23, 2026 04:23

bhagirathsinh-vaghela marked this pull request as draft February 23, 2026 04:25

bhagirathsinh-vaghela marked this pull request as ready for review February 23, 2026 05:31

This was referenced Feb 23, 2026

[FEATURE]: Anthropic (and others) caching improvement #5416

Open

System environment prompt causes cache invalidation #5224

Open

Cache hit rate is low / zero #14065

Open

bhagirathsinh-vaghela force-pushed the prompt-caching branch from c499424 to 176c069 Compare February 23, 2026 06:12

github-actions bot mentioned this pull request Feb 24, 2026

📊 AI CLI 工具社区动态日报 2026-02-24 duanyytop/agents-radar#7

Open

bhagirathsinh-vaghela force-pushed the prompt-caching branch from 176c069 to f08aa45 Compare February 25, 2026 04:19

This was referenced Mar 2, 2026

📊 AI CLI 工具社区动态日报 2026-03-02 rollysys/agents-radar#22

Open

📊 AI CLI 工具社区动态日报 2026-03-02 duanyytop/agents-radar#40

Open

bhagirathsinh-vaghela force-pushed the prompt-caching branch from f08aa45 to 7984393 Compare March 4, 2026 04:31

kamelkace reviewed Mar 4, 2026

View reviewed changes

alexsmirnov added a commit to alexsmirnov/opencode that referenced this pull request Mar 5, 2026

feat: prompt cache improvement from anomalyco#14743

8bb20e4

Copilot AI mentioned this pull request Mar 12, 2026

feat(cache): stabilize system prompt prefix for KV cache reuse chand1012/opencode#1

Draft

6 tasks

bhagirathsinh-vaghela force-pushed the prompt-caching branch from 7984393 to d7849ca Compare March 22, 2026 03:34

bhagirathsinh-vaghela added 2 commits March 21, 2026 20:39

feat(cache): add cache token audit logging and TUI sidebar behind OPE…

c78ce0e

…NCODE_CACHE_AUDIT

fix(cache): stabilize system prefix behind OPENCODE_EXPERIMENTAL_CACH…

669fda6

…E_STABILIZATION

bhagirathsinh-vaghela force-pushed the prompt-caching branch from d7849ca to b57aa98 Compare March 22, 2026 03:39

bhagirathsinh-vaghela added 4 commits March 21, 2026 21:01

fix(cache): split system prompt and skills into stable/dynamic blocks…

fba2aad

… for independent caching

fix(cache): remove cwd from bash tool schema for cross-repo cache hits

212709f

feat(cache): add optional 1h TTL on first system cache marker behind …

3bcde34

…OPENCODE_EXPERIMENTAL_CACHE_1H_TTL flag

fix(cache): add splitSystemPrompt provider option to opt out of syste…

2e02781

…m block split

bhagirathsinh-vaghela force-pushed the prompt-caching branch from b57aa98 to 2e02781 Compare March 22, 2026 04:03

github-actions bot mentioned this pull request Mar 23, 2026

fix(session): stabilize single-message prompt prefix for llama cache reuse #18710

Closed

6 tasks

rekram1-node mentioned this pull request Mar 28, 2026

tweak: adjust bash tool description to increase cache hit rates between projects #19487

Merged

github-actions bot mentioned this pull request Mar 30, 2026

fix(opencode): keep dynamic user.system separate for prompt caching #20109

Open

6 tasks

AndresCdo reviewed Mar 31, 2026

View reviewed changes

	` Today's date: ${date.toDateString()}`,
	` Session started at: ${date.toDateString()}`,

Conversation

bhagirathsinh-vaghela commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issue for this PR

Type of change

What does this PR do?

How did you verify your code works?

Screenshots / recordings

Updates (post-rebase, Mar 21 2026)

Checklist

Uh oh!

github-actions bot commented Feb 23, 2026

Uh oh!

github-actions bot commented Feb 23, 2026

Uh oh!

bhagirathsinh-vaghela commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AI SDK cache marker mechanics

Related open PRs

Update (post-rebase, Mar 21 2026)

Uh oh!

bhagirathsinh-vaghela commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ShanePresley commented Feb 23, 2026

Uh oh!

TomLucidor commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bhagirathsinh-vaghela commented Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kamelkace Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bhagirathsinh-vaghela Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kamelkace Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

fkroener commented Mar 8, 2026

Uh oh!

fkroener commented Mar 9, 2026

Uh oh!

pwilkin commented Mar 9, 2026

Uh oh!

fkroener commented Mar 9, 2026

Uh oh!

sandeep-chaps commented Mar 13, 2026

Uh oh!

Stellarthoughts commented Mar 14, 2026

Uh oh!

hhieuu commented Mar 18, 2026

Uh oh!

thdxr commented Mar 21, 2026

Uh oh!

rekram1-node commented Mar 21, 2026

Uh oh!

bhagirathsinh-vaghela commented Mar 22, 2026

Uh oh!

bhagirathsinh-vaghela commented Mar 22, 2026

Uh oh!

henry701 commented Mar 23, 2026

Uh oh!

phazei commented Mar 26, 2026

Uh oh!

rekram1-node commented Mar 26, 2026

Uh oh!

AndresCdo left a comment

Choose a reason for hiding this comment

Strengths

Concerns

1. Module-level singletons lack invalidation

2. systemSplit calculation is fragile

3. MCP tools are still a cache breaker

4. 1h TTL as env var vs config

Relationship to #5422

bhagirathsinh-vaghela commented Feb 23, 2026 •

edited

Loading

bhagirathsinh-vaghela commented Feb 23, 2026 •

edited

Loading

bhagirathsinh-vaghela commented Feb 23, 2026 •

edited

Loading

TomLucidor commented Feb 24, 2026 •

edited

Loading

bhagirathsinh-vaghela commented Mar 2, 2026 •

edited

Loading

kamelkace Mar 4, 2026 •

edited

Loading

bhagirathsinh-vaghela Mar 5, 2026 •

edited

Loading