
Fix stale runtime model reuse on session reset #41173

Merged
jalehman merged 3 commits into openclaw:main from
PonyX-lab:fix/sessions-reset-stale-runtime-model
Mar 10, 2026

Conversation

@PonyX-lab
Contributor

Describe the problem and fix in 2–5 bullets:

  • Problem: sessions.reset reused stale runtime model / modelProvider fields from the previous session entry instead of recomputing from current defaults and explicit
    overrides.
  • Why it matters: after changing the configured default model, resetting an existing session could keep the session pinned to an outdated or unsupported model.
  • What changed: gateway reset now strips stale runtime model state before calling resolveSessionModelRef(...), and runtime resetSession() also clears stale runtime
    model metadata defensively.
  • What did NOT change (scope boundary): model resolution precedence for normal non-reset flows is unchanged; this only affects reset paths.
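
As a rough illustration of the "What changed" bullet, the strip step could look something like the sketch below. The `SessionEntry` shape here is a simplified assumption (the real type has more fields), and the body is a guess at the technique, not a copy of the code in `src/gateway/server-methods/sessions.ts`.

```typescript
// Hypothetical, simplified session entry shape; the real type has more fields.
interface SessionEntry {
  sessionId: string;
  // Runtime metadata recorded by the last run (can go stale after a config change).
  model?: string;
  modelProvider?: string;
  systemPromptReport?: unknown;
  // Explicit per-session pins, which must survive a reset.
  modelOverride?: string;
  providerOverride?: string;
}

// Return a copy of the entry without the three runtime-only fields.
// The input object is not mutated.
function stripRuntimeModelState(entry: SessionEntry): SessionEntry {
  const { model, modelProvider, systemPromptReport, ...rest } = entry;
  return rest;
}

const stale: SessionEntry = {
  sessionId: "s1",
  model: "qwen3.5-plus-2026-02-15",
  modelProvider: "qwencode",
  modelOverride: "gpt-test-a",
  providerOverride: "openai",
};

const stripped = stripRuntimeModelState(stale);
console.log(stripped);
```

Returning a copy rather than deleting keys in place keeps the previous entry available to the caller while the reset is in progress.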

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #

User-visible / Behavior Changes

  • sessions.reset now recomputes the next session model from current defaults and session overrides instead of preserving stale runtime model metadata from the previous
    session.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: Linux
  • Runtime/container: local repo checkout with pnpm
  • Model/provider: reproduced in tests with qwencode/qwen3.5-plus-2026-02-15 as the stale runtime state and openai/gpt-test-a as the configured default
  • Integration/channel (if any): none
  • Relevant config (redacted): the agent default model was changed after an existing session had already persisted runtime model fields

Steps

  1. Create or keep a session entry whose persisted runtime fields contain an old modelProvider / model.
  2. Change the configured default model.
  3. Call sessions.reset for that session.

Expected

  • Reset session resolves model from current defaults plus any explicit session overrides.

Actual

  • Reset session kept using the previous session's stale runtime model metadata.
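
The difference between the expected and actual behavior can be illustrated with a toy resolver. Everything in this sketch is an assumption made for illustration: `resolveModelRef` is a stand-in for `resolveSessionModelRef(...)`, and its precedence (runtime fields, then explicit overrides, then the configured default) is inferred from the review discussion rather than taken from the real code.

```typescript
// Hypothetical shapes; the real session entry and model reference types differ.
interface ModelRef {
  provider: string;
  model: string;
}

interface SessionEntry {
  model?: string;
  modelProvider?: string;
  modelOverride?: string;
  providerOverride?: string;
}

// Assumed precedence: runtime fields, then explicit overrides, then the default.
function resolveModelRef(entry: SessionEntry, configuredDefault: ModelRef): ModelRef {
  if (entry.modelProvider && entry.model) {
    return { provider: entry.modelProvider, model: entry.model };
  }
  if (entry.modelOverride) {
    return {
      provider: entry.providerOverride ?? configuredDefault.provider,
      model: entry.modelOverride,
    };
  }
  return configuredDefault;
}

// Step 1: a persisted entry still carries the old runtime model.
const persisted: SessionEntry = {
  modelProvider: "qwencode",
  model: "qwen3.5-plus-2026-02-15",
};

// Step 2: the configured default has since changed.
const configuredDefault: ModelRef = { provider: "openai", model: "gpt-test-a" };

// Step 3, before the fix: the stale runtime fields win.
const beforeFix = resolveModelRef(persisted, configuredDefault);

// Step 3, after the fix: runtime fields are dropped before resolving.
const { model: _staleModel, modelProvider: _staleProvider, ...withoutRuntime } = persisted;
const afterFix = resolveModelRef(withoutRuntime, configuredDefault);

console.log({ beforeFix, afterFix });
```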

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: ran targeted gateway and e2e tests covering gateway sessions.reset and runtime resetSession() retry behavior.
  • Edge cases checked: explicit reset path recomputes from defaults; compaction-failure retry path clears stale runtime model fields from persisted session state.
  • What you did not verify: the full repo-wide suite (pnpm build && pnpm check && pnpm test).

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: revert commit f07e5a0.
  • Files/config to restore: src/gateway/server-methods/sessions.ts, src/auto-reply/reply/agent-runner.ts.
  • Known bad symptoms reviewers should watch for: reset sessions unexpectedly ignoring explicit model overrides (not expected; covered by existing model resolution
    precedence and targeted tests).

Risks and Mitigations

  • Risk: reset paths might accidentally drop intended sticky model configuration.
  • Mitigation: only runtime model / modelProvider / systemPromptReport are cleared; explicit modelOverride / providerOverride remain intact, and targeted tests
    cover reset recomputation behavior.

AI-assisted: yes.
Testing: targeted tests only.

@openclaw-barnacle openclaw-barnacle bot added the gateway (Gateway runtime) and size: S labels Mar 9, 2026
@greptile-apps
Contributor

greptile-apps bot commented Mar 9, 2026

Greptile Summary

This PR fixes a bug where sessions.reset reused stale runtime model/modelProvider/systemPromptReport fields from the previous session entry instead of recomputing them from current defaults and explicit overrides. The fix is applied in two places: the gateway sessions.reset handler (which now strips runtime model state before calling resolveSessionModelRef), and the runtime resetSession() function in agent-runner.ts (which defensively clears stale runtime model fields on compaction-failure retries). Targeted tests are added for both paths.

  • src/gateway/server-methods/sessions.ts: New stripRuntimeModelState() helper strips model, modelProvider, and systemPromptReport from an entry before passing it to resolveSessionModelRef(...), ensuring the resolved model for the new session comes from current defaults/overrides rather than stale runtime metadata.
  • src/auto-reply/reply/agent-runner.ts: resetSession() now explicitly sets model, modelProvider, and systemPromptReport to undefined when constructing the next entry, preventing stale runtime values from being spread into the reset entry.
  • Tests: A gateway integration test confirms that after resetting a session with a stale qwencode model, the new entry reflects the currently configured openai/gpt-test-a default. An e2e test confirms that on compaction-failure retry, stale runtime model fields are cleared from both the in-memory store and the persisted JSON.
  • The fix correctly preserves modelOverride and providerOverride through the strip step, so explicit per-session model pinning continues to work as expected on reset.
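
The `resetSession()` change described in the summary can be sketched as follows. This is a minimal illustration under assumed names: `buildResetEntry` and the `SessionEntry` shape are hypothetical and not taken from `src/auto-reply/reply/agent-runner.ts`. It relies on the fact that `JSON.stringify` omits properties whose value is `undefined`, which is one way cleared fields would also disappear from persisted JSON.

```typescript
// Hypothetical entry shape; only the fields relevant to a reset are shown.
interface SessionEntry {
  sessionId: string;
  model?: string;
  modelProvider?: string;
  systemPromptReport?: unknown;
  modelOverride?: string;
  providerOverride?: string;
}

// Build the next entry for a reset: carry everything over, then overwrite the
// runtime fields with undefined so the spread cannot reintroduce stale values.
function buildResetEntry(prev: SessionEntry, nextSessionId: string): SessionEntry {
  return {
    ...prev,
    sessionId: nextSessionId,
    model: undefined,
    modelProvider: undefined,
    systemPromptReport: undefined,
  };
}

const prev: SessionEntry = {
  sessionId: "s1",
  model: "qwen3.5-plus-2026-02-15",
  modelProvider: "qwencode",
  modelOverride: "gpt-test-a",
  providerOverride: "openai",
};

const next = buildResetEntry(prev, "s2");

// Properties set to undefined are skipped during serialization.
const persistedJson = JSON.stringify(next);
console.log(persistedJson);
```

The order matters in the object literal: the explicit `undefined` assignments come after the spread, so they overwrite whatever the previous entry carried.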

Confidence Score: 5/5

  • This PR is safe to merge — the change is narrowly scoped to reset paths, preserves all explicit override fields, and is verified by targeted tests.
  • The fix is minimal and well-understood, directly addressing the reported stale-state bug. stripRuntimeModelState only touches the three runtime-only fields (model, modelProvider, systemPromptReport) and intentionally leaves modelOverride/providerOverride intact, which is exactly the right invariant. The resolveSessionModelRef code confirms it correctly falls through to those override fields when runtime fields are absent. Both code paths are covered by tests, and no normal (non-reset) flows are affected.
  • No files require special attention.

Last reviewed commit: f07e5a0


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f07e5a00c4


@jalehman jalehman self-assigned this Mar 9, 2026
@jalehman jalehman force-pushed the fix/sessions-reset-stale-runtime-model branch 5 times, most recently from 8f35b25 to 083726d, March 10, 2026 21:00
@jalehman jalehman force-pushed the fix/sessions-reset-stale-runtime-model branch from 083726d to d8a04a4, March 10, 2026 21:02
@jalehman jalehman merged commit 5337439 into openclaw:main Mar 10, 2026
3 checks passed
@jalehman
Contributor

Merged via squash.

Thanks @PonyX-lab!

mrosmarin added a commit to mrosmarin/openclaw that referenced this pull request Mar 10, 2026
* main: (42 commits)
  test: share runtime group policy fallback cases
  refactor: share windows command shim resolution
  refactor: share approval gateway client setup
  refactor: share telegram payload send flow
  refactor: share passive account lifecycle helpers
  refactor: share channel config schema fragments
  refactor: share channel config security scaffolding
  refactor: share onboarding secret prompt flows
  refactor: share scoped account config patching
  feat(discord): add autoArchiveDuration config option (openclaw#35065)
  fix(gateway): harden token fallback/reconnect behavior and docs (openclaw#42507)
  fix(acp): strip provider auth env for child ACP processes (openclaw#42250)
  fix(browser): surface 429 rate limit errors with actionable hints (openclaw#40491)
  fix(acp): scope cancellation and event routing by runId (openclaw#41331)
  docs: require codex review in contributing guide (openclaw#42503)
  Fix stale runtime model reuse on session reset (openclaw#41173)
  docs: document r: spam auto-close label
  fix(ci): auto-close and lock r: spam items
  fix(acp): implicit streamToParent for mode=run without thread (openclaw#42404)
  test: extract sendpayload outbound contract suite
  ...
frankekn pushed a commit to MoerAI/openclaw that referenced this pull request Mar 11, 2026
Merged via squash.

Prepared head SHA: d8a04a4
Co-authored-by: PonyX-lab <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
frankekn pushed a commit to Effet/openclaw that referenced this pull request Mar 11, 2026
Merged via squash.

Prepared head SHA: d8a04a4
Co-authored-by: PonyX-lab <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
frankekn pushed a commit to ImLukeF/openclaw that referenced this pull request Mar 11, 2026
Merged via squash.

Prepared head SHA: d8a04a4
Co-authored-by: PonyX-lab <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
Treedy2020 pushed a commit to Treedy2020/openclaw that referenced this pull request Mar 11, 2026
Merged via squash.

Prepared head SHA: d8a04a4
Co-authored-by: PonyX-lab <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
dhoman pushed a commit to dhoman/chrono-claw that referenced this pull request Mar 11, 2026
Merged via squash.

Prepared head SHA: d8a04a4
Co-authored-by: PonyX-lab <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
ahelpercn pushed a commit to ahelpercn/openclaw that referenced this pull request Mar 12, 2026
Merged via squash.

Prepared head SHA: d8a04a4
Co-authored-by: PonyX-lab <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
Ruijie-Ysp pushed a commit to Ruijie-Ysp/clawdbot that referenced this pull request Mar 12, 2026
Merged via squash.

Prepared head SHA: d8a04a4
Co-authored-by: PonyX-lab <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
guang384 added a commit to guang384/openclaw that referenced this pull request Mar 12, 2026
Fixes openclaw#43930

When switching between models (e.g., Gemini → Grok → Gemini), the context
may contain function_response entries that were valid for the previous
model but cause errors for Gemini API with 'Name cannot be empty'.

This fix skips tool calls without valid name during message conversion,
preventing the INVALID_ARGUMENT error from Gemini API.

Related to PR openclaw#41173 which added stripRuntimeModelState() for session reset,
but model switching is a different scenario that needs separate handling.
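
The skip described in this commit message might look roughly like the sketch below. The part types and the `dropUnnamedToolCalls` helper are hypothetical, since the referenced commit's conversion code is not shown here. The sketch only illustrates filtering out tool calls that lack a usable name before a request is built.

```typescript
// Hypothetical message part shapes; the real conversion types are not shown in this PR.
interface ToolCallPart {
  kind: "toolCall";
  name?: string;
  args: unknown;
}

interface TextPart {
  kind: "text";
  text: string;
}

type Part = ToolCallPart | TextPart;

// A tool call is usable only if its name is a non-empty, non-blank string.
function hasValidName(part: ToolCallPart): boolean {
  return typeof part.name === "string" && part.name.trim().length > 0;
}

// Keep every non-tool-call part, and only those tool calls that carry a valid name.
function dropUnnamedToolCalls(parts: Part[]): Part[] {
  return parts.filter((p) => p.kind !== "toolCall" || hasValidName(p));
}

const history: Part[] = [
  { kind: "text", text: "hello" },
  { kind: "toolCall", name: "search", args: { q: "x" } },
  { kind: "toolCall", name: "", args: {} },
  { kind: "toolCall", args: {} },
];

const cleaned = dropUnnamedToolCalls(history);
console.log(cleaned.length);
```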
leozhengliu-pixel pushed a commit to leozhengliu-pixel/openclaw that referenced this pull request Mar 13, 2026
Merged via squash.

Prepared head SHA: d8a04a4
Co-authored-by: PonyX-lab <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
Interstellar-code pushed a commit to Interstellar-code/operator1 that referenced this pull request Mar 16, 2026
Merged via squash.

Prepared head SHA: d8a04a4
Co-authored-by: PonyX-lab <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

(cherry picked from commit 5337439)
senw-developers pushed a commit to senw-developers/va-openclaw that referenced this pull request Mar 17, 2026
Merged via squash.

Prepared head SHA: d8a04a4
Co-authored-by: PonyX-lab <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 25, 2026
Merged via squash.

Prepared head SHA: d8a04a4
Co-authored-by: PonyX-lab <[email protected]>
Co-authored-by: jalehman <[email protected]>
Reviewed-by: @jalehman

(cherry picked from commit 5337439)

Labels

gateway Gateway runtime size: S


2 participants