feat(agent): memory condensation for longer context #1457

Merged
kalvinnchau merged 14 commits into block:main from arielherself:dev-condense
Mar 6, 2025

@arielherself
Contributor

When context grows beyond the limit of the model, the current implementation will cut off older messages. This PR introduces a more graceful solution that lets the model itself summarize the earlier chat history and use it as part of the context. This enables the model to retain crucial information for a prolonged duration.

The previous implementation is actually buggy when the length of the
context is greater than `2 * model_limit`. We should apply incremental
summarization.
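
One way to picture incremental summarization is as a fold: older messages are merged into a running summary in bounded batches, so no single summarization request exceeds the model limit even when the history is several times larger. The sketch below is not the code from this PR. `count_tokens`, `condense_incrementally`, and the `summarize` callback are hypothetical names, whitespace-separated words stand in for real tokens, and it assumes the summarizer always returns something much shorter than its input.

```rust
/// Hypothetical token counter: whitespace-separated words stand in for real tokens.
fn count_tokens(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Fold older messages into a running summary so that each call to
/// `summarize` receives at most `limit` tokens (assuming no single message
/// is itself larger than the limit). Returns the summary of the older
/// messages and the recent messages that are kept verbatim.
fn condense_incrementally<F>(
    messages: &[String],
    limit: usize,
    summarize: F,
) -> (String, Vec<String>)
where
    F: Fn(&str) -> String,
{
    // Keep the newest messages verbatim while they fit in half the budget
    // (the 50% split is an arbitrary choice for this sketch).
    let keep_budget = limit / 2;
    let mut kept_tokens = 0;
    let mut split = messages.len();
    while split > 0 {
        let cost = count_tokens(&messages[split - 1]);
        if kept_tokens + cost > keep_budget {
            break;
        }
        kept_tokens += cost;
        split -= 1;
    }

    // Fold the older messages, condensing the buffer whenever the next
    // message would push it past the limit.
    let mut buffer = String::new();
    for message in &messages[..split] {
        if !buffer.is_empty() && count_tokens(&buffer) + count_tokens(message) > limit {
            buffer = summarize(&buffer);
        }
        if !buffer.is_empty() {
            buffer.push('\n');
        }
        buffer.push_str(message);
    }

    let summary = if buffer.is_empty() {
        String::new()
    } else {
        summarize(&buffer)
    };
    (summary, messages[split..].to_vec())
}
```

With a real model behind `summarize`, the returned summary would be prepended to the recent messages to form the next prompt.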
@arielherself
Contributor Author

I think it would be more efficient to save the compressed chat in the log file, rather than compress the long chat every time. Since it's not directly related to compression strategies, I will open another PR to implement this.

@michaelneale
Collaborator

I like that idea. We had something like this in the old Python version (but it would do it each time, which loses prompt caching). Any reason why truncate.rs is changed? Ideally, would this be a completely separate alternative?

@arielherself
Contributor Author

arielherself commented Mar 3, 2025

> I like that idea. We had something like this in the old Python version (but it would do it each time, which loses prompt caching). Any reason why truncate.rs is changed? Ideally, would this be a completely separate alternative?

Oh, I modified truncate.rs to fit the uniform Compressor trait, so we could switch between the two strategies (truncation and memory condensation) without effort. But it occurred to me afterward that we should fall back to truncation anyway, so maybe I don't need to do this. I will fix this in a moment.
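
For readers following along, a uniform strategy trait with a truncation fallback might look roughly like the sketch below. Only the trait name `Compressor` comes from the comment above. The `Message` type, the method signature, the `Truncate` and `Condense` structs, and `compress_with_fallback` are hypothetical and are not taken from the goose codebase. Word counts stand in for tokens.

```rust
/// Hypothetical message type; the real agent uses a richer structure.
#[derive(Clone, Debug, PartialEq)]
struct Message {
    text: String,
}

/// Word count as a stand-in for a real tokenizer.
fn tokens(messages: &[Message]) -> usize {
    messages
        .iter()
        .map(|m| m.text.split_whitespace().count())
        .sum()
}

/// Hypothetical uniform interface for context-reduction strategies.
trait Compressor {
    /// Reduce `messages` to at most `limit` tokens, or report why it cannot.
    fn compress(&self, messages: &[Message], limit: usize) -> Result<Vec<Message>, String>;
}

/// Strategy 1: drop the oldest messages until the rest fits.
struct Truncate;

impl Compressor for Truncate {
    fn compress(&self, messages: &[Message], limit: usize) -> Result<Vec<Message>, String> {
        let mut start = 0;
        while start < messages.len() && tokens(&messages[start..]) > limit {
            start += 1;
        }
        Ok(messages[start..].to_vec())
    }
}

/// Strategy 2: replace the history with a single summary message.
/// `summarize` stands in for a model call and may fail (returns `None`).
struct Condense {
    summarize: fn(&[Message]) -> Option<Message>,
}

impl Compressor for Condense {
    fn compress(&self, messages: &[Message], limit: usize) -> Result<Vec<Message>, String> {
        if tokens(messages) <= limit {
            return Ok(messages.to_vec());
        }
        let summary =
            (self.summarize)(messages).ok_or_else(|| "summarization failed".to_string())?;
        let condensed = vec![summary];
        if tokens(&condensed) > limit {
            return Err("summary still exceeds the limit".to_string());
        }
        Ok(condensed)
    }
}

/// Try the primary strategy and fall back to the secondary one on error.
fn compress_with_fallback(
    primary: &dyn Compressor,
    fallback: &dyn Compressor,
    messages: &[Message],
    limit: usize,
) -> Result<Vec<Message>, String> {
    primary
        .compress(messages, limit)
        .or_else(|_| fallback.compress(messages, limit))
}
```

Because the fallback is itself just another `Compressor`, condensation can fail safely without the caller knowing which strategy produced the result.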

@arielherself
Contributor Author

I have undone the unnecessary changes :)

@kalvinnchau
Contributor

I like the idea here too, but I do think this should be a separate agent from truncate, since this introduces a large change to the default behavior.
The model we have been following is that users/testers opt in to a "summarize" agent, and we can test/iterate on it before it becomes the default behavior.

The agent would be discovered and set via:

❯ goose agents
Available agent versions:
* truncate (default)
  reference

❯ cat ~/.config/goose/config.yaml
GOOSE_AGENT: "summarize"

@arielherself
Contributor Author

Thanks for your guidance! I have moved the feature to a separate "summarize" agent, and also tweaked the CLI so it displays the current agent version:

 ./target/debug/goose agents
Available agent versions:
  reference
* summarize
  truncate (default)
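
The listing format above can be reproduced with a small formatting function. This is only an illustration of the display logic, not the CLI's actual code. `render_agent_list` and its parameters are hypothetical, and alphabetical ordering is an assumption inferred from the output shown.

```rust
/// Render the agent list in the style shown above: names sorted, `* ` in
/// front of the currently selected agent, and ` (default)` after the
/// built-in default. The sorting is an assumption based on the sample output.
fn render_agent_list(available: &[&str], selected: &str, default: &str) -> String {
    let mut names: Vec<&str> = available.to_vec();
    names.sort_unstable();

    let mut out = String::from("Available agent versions:\n");
    for name in names {
        let marker = if name == selected { "* " } else { "  " };
        let suffix = if name == default { " (default)" } else { "" };
        out.push_str(&format!("{}{}{}\n", marker, name, suffix));
    }
    out
}
```

Keeping the selection marker and the default label independent lets the listing show both when the user has switched away from the default, as in the output above.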

@michaelneale
Collaborator

Nice, yes, that seems clearer now. @alexhancock @baxen @salman1993 I think this is interesting and potentially low risk to try. Now that we have sessions across GUI and CLI, people are likely to come across larger and larger sessions, so having some form of summarization would be interesting to try.

@arielherself
Contributor Author

That's cool! Please let me know if there's anything else I could do.

Collaborator

@michaelneale left a comment

I like this - adds a new agent for people to try and I like the sound of this approach, seems worth a try?

@kalvinnchau kalvinnchau merged commit 5f750a5 into block:main Mar 6, 2025
1 check passed
michaelneale added a commit that referenced this pull request Mar 7, 2025
* main:
  bugfix: refactor workdirs to be async-safe, and simpler (#1558)
  feat: split required_extensions in bench to builtin/external (#1547)
  fix: continue to use resumed session after confirmation is cancelled (#1548)
  feat: add image tool to developer mcp (#1515)
  docs: using gooseignore (#1554)
  ci: use cargo update --workspace to ensure Cargo.lock is updated (#1539)
  fix: respond to interrupted tool calls with a ToolResponseMessageContent (#1557)
  fix: get tool def back to chat mode (#1538)
  ui: add default icon (#1553)
  fix: fix summarize agent, use session_id and add provider fn (#1552)
  feat(agent): memory condensation for longer context (#1457)
  docs: goose tips blog (#1550)
  docs: update to provider view (#1546)
  docs: resuming sessions (#1543)
  feat: goose bench framework for functional and regression testing
  feat: use refresh_tokens from databricks api (#1517)
  feat: use Ctrl/Cmd + ↑/↓ to navigate message history (#1501)
  feat: remove tools from chat mode (#1533)
  feat: use dropdown for goose selection (#1531)
  docs: goosehints in desktop (#1529)
@arielherself arielherself deleted the dev-condense branch March 7, 2025 06:19