
fix: truncation agent token calculations#915

Merged
kalvinnchau merged 7 commits into main from kalvin/truncate-agent-updates
Jan 30, 2025
Conversation

@kalvinnchau
Contributor

@kalvinnchau commented Jan 29, 2025

truncation agent token calculations

  • propagate and display any errors that come back from truncate_messages instead of ignoring them
  • add a note that users should restart in a fresh session if they hit the context limit
  • self.token_counter.count_tokens(&msg.as_concat_text()) was only counting content of type Message::Text, ignoring any ToolResponses, which tend to contain a large number of tokens
  • use self.token_counter.count_chat_tokens("", std::slice::from_ref(msg), &[]) to get a full count of the message tokens
  • update context_limit calculation to subtract the total amount of system_prompt and tools tokens already in use
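To illustrate the counting gap the first two bullets describe, here is a minimal, self-contained sketch. Message, Content, and TokenCounter are hypothetical stand-ins using a naive whitespace tokenizer, not goose's real types or tokenizer:

```rust
// Hypothetical stand-ins for illustration; goose's real Message and
// TokenCounter types differ.
enum Content {
    Text(String),
    ToolResponse(String),
}

struct Message {
    content: Vec<Content>,
}

impl Message {
    // Concatenates only Text content, mirroring the bug: ToolResponses
    // contribute zero tokens to the old count.
    fn as_concat_text(&self) -> String {
        self.content
            .iter()
            .filter_map(|c| match c {
                Content::Text(t) => Some(t.as_str()),
                Content::ToolResponse(_) => None,
            })
            .collect::<Vec<_>>()
            .join(" ")
    }
}

struct TokenCounter;

impl TokenCounter {
    // Naive whitespace "tokenizer", just for the sketch.
    fn count_tokens(&self, text: &str) -> usize {
        text.split_whitespace().count()
    }

    // Counts every content item, including tool responses.
    fn count_chat_tokens(&self, msg: &Message) -> usize {
        msg.content
            .iter()
            .map(|c| match c {
                Content::Text(t) => self.count_tokens(t),
                Content::ToolResponse(r) => self.count_tokens(r),
            })
            .sum()
    }
}

fn main() {
    let counter = TokenCounter;
    let msg = Message {
        content: vec![
            Content::Text("run the tests".into()),
            Content::ToolResponse("all 42 tests passed in 3.1s".into()),
        ],
    };
    // Old path misses the tool response entirely.
    println!("old: {}", counter.count_tokens(&msg.as_concat_text())); // 3
    // New path sees all 9 tokens.
    println!("new: {}", counter.count_chat_tokens(&msg)); // 9
}
```

With a large tool response (a file dump, a command's output), the old count could undercount a message by orders of magnitude, which is why truncation kept overshooting the context limit.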

…ssages

previously count_tokens(&msg.as_concat_text()) would not count
ToolResponses; update that to use count_chat_tokens on each individual
message

make count_tokens_for_tools a public method so we can account for tool
token counts

update the context_limit to account for the system_prompt and tool
request token counts
at this point we already know which message is being removed; the
total_tokens count isn't used after this point, so we just remove the
paired message
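The context_limit adjustment the commit messages describe boils down to simple subtraction; this sketch uses made-up token counts, and remaining_context_limit is a hypothetical helper name, not goose's API:

```rust
// Illustrative arithmetic only; real counts come from the tokenizer.
fn remaining_context_limit(
    model_context_limit: usize,
    system_prompt_tokens: usize,
    tools_tokens: usize,
) -> usize {
    // Tokens already committed before any conversation messages:
    // the system prompt plus the serialized tool definitions.
    // saturating_sub avoids underflow if the fixed overhead alone
    // exceeds the model's limit.
    model_context_limit.saturating_sub(system_prompt_tokens + tools_tokens)
}

fn main() {
    // e.g. a 128k-token model with a 1,200-token system prompt and
    // 2,300 tokens of tool definitions (all numbers invented)
    let limit = remaining_context_limit(128_000, 1_200, 2_300);
    println!("{limit}"); // 124500
}
```

Truncation then targets this reduced budget rather than the raw model limit, so the system prompt and tool definitions no longer eat into the space it thinks the messages can occupy.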
@kalvinnchau kalvinnchau changed the title truncation agent updates fix:truncation agent updates Jan 29, 2025
@kalvinnchau kalvinnchau changed the title fix:truncation agent updates fix:truncation agent token calculations Jan 29, 2025
let context_limit = remaining_tokens;

// Calculate current token count of each message, use count_chat_tokens to ensure we
// capture the full content of the message

let's mention tool responses in the comment


@salman1993 left a comment


lgtm!

@kalvinnchau kalvinnchau marked this pull request as ready for review January 29, 2025 23:58
// Calculate current token count
// Account for the system prompt and our tools input, and subtract that from the
// remaining context limit
let system_prompt_token_count = self.token_counter.count_tokens(system_prompt);

do we pass the system_prompt back and forth in the reply loop?

if we do, i wonder if subtracting:

count_tokens(system_prompt) * num_user_exchanges

would be accurate

Contributor Author


i believe we just pass it in once, not within the messages in the loop


@wendytang left a comment


nice!

@salman1993 salman1993 changed the title fix:truncation agent token calculations fix: truncation agent token calculations Jan 30, 2025
@kalvinnchau kalvinnchau merged commit ff71de4 into main Jan 30, 2025
@kalvinnchau kalvinnchau deleted the kalvin/truncate-agent-updates branch January 30, 2025 15:50
salman1993 added a commit that referenced this pull request Jan 30, 2025
* origin/main:
  fix: clarify linux cli install only (#927)
  feat: update ui for ollama host (#912)
  feat: add CONFIGURE=false option in install script (#920)
  fix: truncation agent token calculations (#915)
  fix: request payload for o1 models (#921)
  Update SupportedEnvironments.js so others don't get confused on why they can not open the macos app on x86 (#888)
  fix: improve configure process with error message (#919)
  docs: Goose on Windows via WSL (#901)
  fix: more graceful handling of missing usage in provider response (#907)
  feat: rm uv.lock cause it points to square artifactory (#917)
  feat: Update issue templates for bug report for goose (#913)
  fix: post endpoint url on sse endpoint event (#900)
michaelneale added a commit that referenced this pull request Jan 30, 2025
* main:
  chore: remove gpt-3.5-turbo UI suggestion, as it is deprecated (#959)
  chore: remove o1-mini suggestion from UI add model view (#957)
  fix: missing field in request (#956)
  docs: update provider docs, fix rate limit link (#943)
  fix: clarify linux cli install only (#927)
  feat: update ui for ollama host (#912)
  feat: add CONFIGURE=false option in install script (#920)
  fix: truncation agent token calculations (#915)
  fix: request payload for o1 models (#921)
michaelneale added a commit that referenced this pull request Jan 31, 2025
* main: (28 commits)
  ci: per semver build metadata should be after + (#971)
  fix: temp fix to make CI workflow pass (#970)
  chore: bump patch version to 1.0.3 (#967)
  fix: load shell automatically from env for GUI (#948)
  fix: update versions in release and canary workflows (#911)
  docs: fix typo, name (#963)
  docs: typo fix (#961)
  chore: remove gpt-3.5-turbo UI suggestion, as it is deprecated (#959)
  chore: remove o1-mini suggestion from UI add model view (#957)
  fix: missing field in request (#956)
  docs: update provider docs, fix rate limit link (#943)
  fix: clarify linux cli install only (#927)
  feat: update ui for ollama host (#912)
  feat: add CONFIGURE=false option in install script (#920)
  fix: truncation agent token calculations (#915)
  fix: request payload for o1 models (#921)
  Update SupportedEnvironments.js so others don't get confused on why they can not open the macos app on x86 (#888)
  fix: improve configure process with error message (#919)
  docs: Goose on Windows via WSL (#901)
  fix: more graceful handling of missing usage in provider response (#907)
  ...
cbruyndoncx pushed a commit to cbruyndoncx/goose that referenced this pull request Jul 20, 2025

3 participants