Conversation
Pull request overview
This PR centralizes provider max-token configuration by routing request token limits through ModelConfig::max_output_tokens() (instead of per-provider defaults / ad hoc overrides), and removes an Apps-specific max token override so canonical model limits drive app generation output size.
Changes:
- Replace provider-specific `max_tokens`/`max_new_tokens` defaults with `model_config.max_output_tokens()` across multiple request formatters.
- Always include output token limits in OpenAI (chat + responses) and Google request payloads.
- Remove Apps extension logic that forcibly set `max_tokens = 16384`, relying on the provider's canonical model config instead.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| crates/goose/src/providers/sagemaker_tgi.rs | Uses max_output_tokens() for TGI max_new_tokens instead of a hardcoded default. |
| crates/goose/src/providers/formats/snowflake.rs | Uses max_output_tokens() for Snowflake max_tokens field. |
| crates/goose/src/providers/formats/openai_responses.rs | Always inserts max_output_tokens derived from max_output_tokens(). |
| crates/goose/src/providers/formats/openai.rs | Always inserts either max_tokens or max_completion_tokens from max_output_tokens(). |
| crates/goose/src/providers/formats/google.rs | Always sets generation_config with max_output_tokens from max_output_tokens(). |
| crates/goose/src/providers/formats/databricks.rs | Uses max_output_tokens() for max token fields; refactors thinking budget parsing. |
| crates/goose/src/providers/formats/anthropic.rs | Refactors thinking budget parsing for Claude thinking mode. |
| crates/goose/src/model.rs | Changes the fallback returned by max_output_tokens() when unset. |
| crates/goose/src/agents/platform_extensions/apps.rs | Removes forced max_tokens = 16384 override for Apps tool calls. |
Comments suppressed due to low confidence (1)
crates/goose/src/providers/formats/anthropic.rs:444
`max_tokens + budget_tokens` can overflow `i32` if `CLAUDE_THINKING_BUDGET` is set large, potentially wrapping to a negative value in release builds; use a checked/saturating add (or compute in `i64` and clamp) before inserting into the payload.

```rust
payload
    .as_object_mut()
    .unwrap()
    .insert("max_tokens".to_string(), json!(max_tokens + budget_tokens));
```
```rust
// With thinking enabled, max_tokens must include both output and thinking budget
let max_tokens = model_config.max_output_tokens() + budget_tokens;
payload
    .as_object_mut()
    .unwrap()
    .insert("max_tokens".to_string(), json!(max_tokens));
```
`model_config.max_output_tokens() + budget_tokens` is an `i32` addition that can overflow if `CLAUDE_THINKING_BUDGET` is set large, which can wrap to a negative `max_tokens` in release builds; use checked/saturating arithmetic (or compute in `i64` and clamp) before serializing.
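A minimal sketch of the suggested fix, with `max_output` and `budget_tokens` standing in for the real values (names are illustrative, not the crate's actual API):

```rust
// Hedged sketch: combine output and thinking budgets without i32 overflow.
// `max_output` stands in for model_config.max_output_tokens(); `budget_tokens`
// for the parsed CLAUDE_THINKING_BUDGET value.
fn combined_max_tokens(max_output: i32, budget_tokens: i32) -> i32 {
    // saturating_add pins at i32::MAX instead of wrapping negative in release builds
    max_output.saturating_add(budget_tokens)
}

fn main() {
    assert_eq!(combined_max_tokens(16_384, 1_024), 17_408);
    // A pathological budget saturates rather than wrapping to a negative value:
    assert_eq!(combined_max_tokens(16_384, i32::MAX), i32::MAX);
    println!("ok");
}
```

Computing in `i64` and clamping to a provider-specific ceiling would work equally well; the key point is that the sum never goes negative.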
| "parameters": { | ||
| "max_new_tokens": self.model.max_tokens.unwrap_or(150), | ||
| "max_new_tokens": self.model.max_output_tokens(), | ||
| "temperature": self.model.temperature.unwrap_or(0.7), |
Switching TGI's `max_new_tokens` from a small provider-specific default (150) to `model.max_output_tokens()` means an unset/unknown model will now request 16,384 tokens by default, which is likely to cause slow/expensive generations or SageMaker endpoint errors; consider keeping a conservative TGI fallback (e.g., use `max_tokens` only when explicitly set, otherwise default to ~150 or a configurable env var).
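A sketch of the fallback this comment proposes, assuming `explicit_max_tokens` models `self.model.max_tokens` (the function name and constant are hypothetical):

```rust
// Hedged sketch of the proposed TGI fallback: forward an explicit limit when
// the model sets one, otherwise keep a conservative TGI-sized default.
const TGI_DEFAULT_MAX_NEW_TOKENS: i32 = 150; // conservative, per the review comment

fn tgi_max_new_tokens(explicit_max_tokens: Option<i32>) -> i32 {
    explicit_max_tokens.unwrap_or(TGI_DEFAULT_MAX_NEW_TOKENS)
}

fn main() {
    assert_eq!(tgi_max_new_tokens(None), 150);
    assert_eq!(tgi_max_new_tokens(Some(4_096)), 4_096);
    println!("ok");
}
```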
```rust
let budget_tokens: i32 = std::env::var("CLAUDE_THINKING_BUDGET")
    .ok()
    .and_then(|s| s.parse().ok())
    .unwrap_or(DEFAULT_THINKING_BUDGET);
```
The comment says Anthropic requires `budget_tokens >= 1024`, but `CLAUDE_THINKING_BUDGET` is parsed without enforcing that minimum (or preventing negatives), which can lead to Databricks request rejection; clamp/validate the parsed value to at least 1024 before using it.
Suggested change:

```diff
-    .unwrap_or(DEFAULT_THINKING_BUDGET);
+    .unwrap_or(DEFAULT_THINKING_BUDGET)
+    .max(1024);
```
Co-authored-by: Copilot <[email protected]>
```rust
    }

    // Priority 2: Global default
    4_096
```
Was separately thinking about making this bigger. Too small seems strictly worse as it can lead to truncation.
I had it higher for a bit, but presumably we almost always have a real value here so then erring on a lower side is probably better
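The fallback order being debated can be sketched as follows (a hypothetical shape of `max_output_tokens()`; the real method lives in `crates/goose/src/model.rs`):

```rust
// Hedged sketch: illustrates only the priority order discussed above, not the
// crate's actual ModelConfig definition.
struct ModelConfig {
    max_tokens: Option<i32>, // Priority 1: explicitly configured model limit
}

impl ModelConfig {
    fn max_output_tokens(&self) -> i32 {
        // Priority 2: global default when no real value is present
        self.max_tokens.unwrap_or(4_096)
    }
}

fn main() {
    assert_eq!(ModelConfig { max_tokens: None }.max_output_tokens(), 4_096);
    assert_eq!(ModelConfig { max_tokens: Some(8_192) }.max_output_tokens(), 8_192);
    println!("ok");
}
```

Since most models carry a real configured limit, the global default mainly matters for unknown models, which is the reason given above for erring on the low side.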
Summary
Makes the canonical model config drive the output token limits we set for each provider.