Attempt to recover from parsing failure on truncated input by returning last partial parse by pwilkin · Pull Request #20204 · ggml-org/llama.cpp

pwilkin · 2026-03-07T18:22:35Z

When model output ends abruptly, parsing can fail because the output is incomplete; for example, the reasoning block might not have been closed yet. Instead of throwing an error, try to recover by returning the last partial parse.

Fixes #20193
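
To illustrate the idea, here is a minimal, self-contained sketch of recovering from truncated input. It is not the code from this PR and does not use llama.cpp's parser API: `parsed_output`, `parse_output`, the `allow_partial` flag, and the `<think>` tag format are all hypothetical names and assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical result type for this sketch; not llama.cpp's message struct.
struct parsed_output {
    std::string reasoning;
    std::string content;
    bool        complete = true; // false when the input ended mid-structure
};

// Toy parser for output shaped like "<think>reasoning</think>content".
// With allow_partial == false, an unterminated reasoning block throws.
// With allow_partial == true, whatever was parsed so far is returned instead.
// Note: a half-written closing tag (e.g. "</thi") is kept as reasoning text here.
parsed_output parse_output(const std::string & input, bool allow_partial) {
    static const std::string open_tag  = "<think>";
    static const std::string close_tag = "</think>";

    parsed_output out;

    // No reasoning block: everything is content.
    if (input.compare(0, open_tag.size(), open_tag) != 0) {
        out.content = input;
        return out;
    }

    const std::size_t start = open_tag.size();
    const std::size_t end   = input.find(close_tag, start);

    if (end == std::string::npos) {
        // Truncated input: the reasoning block was never closed.
        if (!allow_partial) {
            throw std::runtime_error("unterminated reasoning block");
        }
        out.reasoning = input.substr(start);
        out.complete  = false;
        return out;
    }

    assert(end >= start);
    out.reasoning = input.substr(start, end - start);
    out.content   = input.substr(end + close_tag.size());
    return out;
}
```

A caller handling a generation that stopped early would pass `allow_partial = true` and check `complete` to decide whether to treat the result as final.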

aldehir · 2026-03-07T18:47:04Z

I think we can just produce an incomplete AST, as we do during streaming. I'll add it to my PR.

Attempt to recover from parsing failure on truncated input by returning last partial parse

b12db5a

pwilkin requested review from ggerganov and ngxson as code owners March 7, 2026 18:22

github-actions bot added the examples and server labels Mar 7, 2026

pwilkin mentioned this pull request Mar 7, 2026

common : gracefully handle incomplete output #20191

Merged

pwilkin closed this Mar 8, 2026

pwilkin wants to merge 1 commit into ggml-org:master from pwilkin:recover-parse
