feat(llm): response compact API #1161
Conversation
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces a new "compact" API for LLM responses, providing a specialized endpoint and data models to handle response compaction. This feature enhances the system's ability to manage and process specific types of LLM interactions by integrating dedicated request and response handling, ensuring a streamlined and efficient workflow for compact API calls within the existing OpenAI service infrastructure.
Code Review
This pull request introduces a new "compact" LLM response API endpoint, including new CompactRequest and CompactResponse data structures, corresponding request types, and API formats. It adds a new /v1/responses/compact route and implements inbound and outbound transformers for this new functionality within the OpenAI LLM integration. The review comments highlight several areas for improvement:
- an inconsistency in the buildCompactURL function regarding the RawURL configuration,
- missing population of the Model field in llm.Response during compact response transformations,
- significant code duplication between llm/transformer/openai/compact.go and llm/transformer/openai/responses/compact_outbound.go,
- the absence of a Model field in the CompactAPIResponse struct for inbound transformations, and
- an inefficient TransformError implementation in CompactInboundTransformer.
llm/transformer/openai/compact.go
Outdated
func (t *OutboundTransformer) buildCompactURL() string {
	return t.config.BaseURL + "/responses/compact"
}
The buildCompactURL function does not handle the RawURL configuration option. This is inconsistent with other URL building functions in the transformer, such as buildFullRequestURL in llm/transformer/openai/outbound.go. If t.config.RawURL is true, it should return t.config.BaseURL directly to support custom endpoints.
func (t *OutboundTransformer) buildCompactURL() string {
if t.config.RawURL {
return t.config.BaseURL
}
return t.config.BaseURL + "/responses/compact"
}
llm/transformer/openai/compact.go
Outdated
llmResp := &llm.Response{
	RequestType: llm.RequestTypeCompact,
	APIFormat:   llm.APIFormatOpenAIResponseCompact,
	Compact: &llm.CompactResponse{
		ID:        compactResp.ID,
		CreatedAt: compactResp.CreatedAt,
		Object:    compactResp.Object,
		Output:    compactResp.Output,
	},
}
The Model field of the llm.Response is not being set. It should be populated from the upstream response. After adding the Model field to responses.CompactAPIResponse, you can use compactResp.Model to populate this.
Suggested change:
llmResp := &llm.Response{
	RequestType: llm.RequestTypeCompact,
	APIFormat:   llm.APIFormatOpenAIResponseCompact,
	Model:       compactResp.Model,
	Compact: &llm.CompactResponse{
		ID:        compactResp.ID,
		CreatedAt: compactResp.CreatedAt,
		Object:    compactResp.Object,
		Output:    compactResp.Output,
	},
}
llm/transformer/openai/compact.go
Outdated
package openai

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"

	"github.com/looplj/axonhub/llm"
	"github.com/looplj/axonhub/llm/httpclient"
	"github.com/looplj/axonhub/llm/transformer/openai/responses"
)

// transformCompactRequest transforms a compact llm.Request to an HTTP request for the upstream provider.
func (t *OutboundTransformer) transformCompactRequest(
	ctx context.Context,
	llmReq *llm.Request,
) (*httpclient.Request, error) {
	if llmReq.Compact == nil {
		return nil, fmt.Errorf("compact request is nil in llm.Request")
	}

	payload := responses.CompactAPIRequest{
		Model:              llmReq.Model,
		Input:              llmReq.Compact.Input,
		Instructions:       llmReq.Compact.Instructions,
		PreviousResponseID: llmReq.Compact.PreviousResponseID,
		PromptCacheKey:     llmReq.Compact.PromptCacheKey,
	}

	body, err := json.Marshal(payload)
	if err != nil {
		return nil, fmt.Errorf("failed to marshal compact request: %w", err)
	}

	headers := make(http.Header)
	headers.Set("Content-Type", "application/json")
	headers.Set("Accept", "application/json")

	apiKey := t.config.APIKeyProvider.Get(ctx)

	return &httpclient.Request{
		Method:  http.MethodPost,
		URL:     t.buildCompactURL(),
		Headers: headers,
		Body:    body,
		Auth: &httpclient.AuthConfig{
			Type:   "bearer",
			APIKey: apiKey,
		},
		RequestType:           string(llm.RequestTypeCompact),
		APIFormat:             string(llm.APIFormatOpenAIResponseCompact),
		SkipInboundQueryMerge: true,
	}, nil
}

// buildCompactURL constructs the compact API URL.
func (t *OutboundTransformer) buildCompactURL() string {
	return t.config.BaseURL + "/responses/compact"
}

// transformCompactResponse transforms an HTTP compact response to unified llm.Response.
func (t *OutboundTransformer) transformCompactResponse(
	ctx context.Context,
	httpResp *httpclient.Response,
) (*llm.Response, error) {
	if httpResp.StatusCode >= 400 {
		return nil, t.TransformError(ctx, &httpclient.Error{
			StatusCode: httpResp.StatusCode,
			Body:       httpResp.Body,
		})
	}

	if len(httpResp.Body) == 0 {
		return nil, fmt.Errorf("response body is empty")
	}

	var compactResp responses.CompactAPIResponse
	if err := json.Unmarshal(httpResp.Body, &compactResp); err != nil {
		return nil, fmt.Errorf("failed to unmarshal compact response: %w", err)
	}

	llmResp := &llm.Response{
		RequestType: llm.RequestTypeCompact,
		APIFormat:   llm.APIFormatOpenAIResponseCompact,
		Compact: &llm.CompactResponse{
			ID:        compactResp.ID,
			CreatedAt: compactResp.CreatedAt,
			Object:    compactResp.Object,
			Output:    compactResp.Output,
		},
	}

	if compactResp.Usage != nil {
		llmResp.Usage = compactResp.Usage.ToUsage()
	}

	return llmResp, nil
}
This file is almost identical to llm/transformer/openai/responses/compact_outbound.go. This code duplication makes maintenance harder. Consider refactoring the common logic for handling compact requests and responses into a shared package or utility function that both openai.OutboundTransformer and responses.OutboundTransformer can use. This would improve maintainability and ensure consistency. For example, you could create a compact package inside llm/transformer/shared and move the transformation logic there.
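The shared-package refactor suggested above could look like the following sketch. The types and the `ParseCompactResponse` helper name are assumptions for illustration (the real project uses `responses.CompactAPIResponse` and `llm.CompactResponse`); the point is that the unmarshal-and-map logic duplicated in both files would live in one place that both transformers call.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal stand-ins for the real types; field names follow the PR,
// but the shared-package layout itself is only a suggestion.
type CompactAPIResponse struct {
	ID        string          `json:"id"`
	CreatedAt int64           `json:"created_at"`
	Object    string          `json:"object"`
	Output    json.RawMessage `json:"output"`
}

type CompactResponse struct {
	ID        string
	CreatedAt int64
	Object    string
	Output    json.RawMessage
}

// ParseCompactResponse holds the logic currently duplicated in
// openai/compact.go and responses/compact_outbound.go; both
// transformers could call it from a shared compact package.
func ParseCompactResponse(body []byte) (*CompactResponse, error) {
	if len(body) == 0 {
		return nil, fmt.Errorf("response body is empty")
	}

	var apiResp CompactAPIResponse
	if err := json.Unmarshal(body, &apiResp); err != nil {
		return nil, fmt.Errorf("failed to unmarshal compact response: %w", err)
	}

	return &CompactResponse{
		ID:        apiResp.ID,
		CreatedAt: apiResp.CreatedAt,
		Object:    apiResp.Object,
		Output:    apiResp.Output,
	}, nil
}

func main() {
	resp, err := ParseCompactResponse([]byte(`{"id":"resp_1","created_at":1700000000,"object":"response.compaction","output":[]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.ID, resp.CreatedAt)
}
```

With this in place, each transformer keeps only its own URL building and auth wiring, and the field mapping stays consistent by construction.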
type CompactAPIResponse struct {
	ID        string          `json:"id"`
	CreatedAt int64           `json:"created_at"`
	Object    string          `json:"object"`
	Output    json.RawMessage `json:"output"`
	Usage     *Usage          `json:"usage,omitempty"`
}
The CompactAPIResponse struct is missing the Model field. It's a good practice for API responses to include the model that was used to generate the response, especially since the request includes it. This information is valuable for clients and for debugging.
Suggested change:
type CompactAPIResponse struct {
	ID        string          `json:"id"`
	Model     string          `json:"model"`
	CreatedAt int64           `json:"created_at"`
	Object    string          `json:"object"`
	Output    json.RawMessage `json:"output"`
	Usage     *Usage          `json:"usage,omitempty"`
}
resp := CompactAPIResponse{
	ID:        llmResp.Compact.ID,
	CreatedAt: llmResp.Compact.CreatedAt,
	Object:    llmResp.Compact.Object,
	Output:    llmResp.Compact.Output,
}
Following the suggestion to add a Model field to CompactAPIResponse, you should populate this field from the llm.Response. The llm.Response contains the model information that was used.
Suggested change:
resp := CompactAPIResponse{
	ID:        llmResp.Compact.ID,
	Model:     llmResp.Model,
	CreatedAt: llmResp.Compact.CreatedAt,
	Object:    llmResp.Compact.Object,
	Output:    llmResp.Compact.Output,
}
func (t *CompactInboundTransformer) TransformError(ctx context.Context, rawErr error) *httpclient.Error {
	inbound := NewInboundTransformer()
	return inbound.TransformError(ctx, rawErr)
}
This implementation creates a new InboundTransformer instance on every call to TransformError. Since InboundTransformer is stateless, this is inefficient. Consider refactoring this to avoid repeated allocations, for example by using a package-level variable for the InboundTransformer instance and reusing it here. This would improve performance by reducing garbage collection pressure.
llmResp := &llm.Response{
	RequestType: llm.RequestTypeCompact,
	APIFormat:   llm.APIFormatOpenAIResponseCompact,
	Model:       "",
The Model field in the llm.Response is being set to an empty string. This field should be populated with the model name from the upstream response. After adding the Model field to CompactAPIResponse, you can use compactResp.Model to populate this.
Suggested change:
	Model: compactResp.Model,
close #930 |
case "compaction", "compaction_summary":
	if p.Compact != nil {
		contentItems = append(contentItems, compactionItemFromPart(p, p.Type))
	}
🟡 convertUserMessage nests compaction items inside message content instead of emitting them as top-level items
In convertUserMessage, compaction content parts are added to the message's contentItems slice, which gets wrapped inside Content: &Input{Items: contentItems} of a single message Item. This produces {"type": "message", "role": "user", "content": [{"type": "compaction", ...}]}, where the compaction item is incorrectly nested as a message content item.
In contrast, convertAssistantMessage at llm/transformer/openai/responses/outbound_convert.go:262-270 correctly uses the flushMessage() pattern to break compaction items out as separate top-level items. The Responses API expects compaction and compaction_summary to be top-level input items, not nested inside a message content array. While current code paths don't create user/developer messages with compaction parts (since compactionMessageFromItem always creates assistant messages), any future code that does would produce invalid API payloads that the upstream provider would likely reject.
Prompt for agents
Refactor convertUserMessage in llm/transformer/openai/responses/outbound_convert.go to return []Item instead of a single Item, and use the same flushMessage() pattern that convertAssistantMessage uses (lines 241-281) to emit compaction/compaction_summary parts as separate top-level items instead of nesting them inside the message content array. The call site in convertInputFromMessages (line 114) should then use `items = append(items, convertUserMessage(msg)...)` instead of `items = append(items, convertUserMessage(msg))`.
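The flushMessage() pattern described above can be sketched as follows. The Part and Item shapes here are simplified assumptions, not the project's real types; the point is how buffered message content is flushed as one message item whenever a compaction part must be emitted as its own top-level item:

```go
package main

import "fmt"

// Simplified stand-ins for the Responses API input types.
type Part struct {
	Type string
	Text string
}

type Item struct {
	Type    string
	Role    string
	Content []Part
}

// convertUserMessage returns a slice of top-level items. Ordinary
// content parts accumulate into a pending message; compaction parts
// flush that message and become separate top-level items.
func convertUserMessage(parts []Part) []Item {
	var items []Item
	var buf []Part

	flushMessage := func() {
		if len(buf) > 0 {
			items = append(items, Item{Type: "message", Role: "user", Content: buf})
			buf = nil
		}
	}

	for _, p := range parts {
		switch p.Type {
		case "compaction", "compaction_summary":
			flushMessage() // close the pending message before the compaction item
			items = append(items, Item{Type: p.Type})
		default:
			buf = append(buf, p)
		}
	}

	flushMessage()
	return items
}

func main() {
	items := convertUserMessage([]Part{
		{Type: "input_text", Text: "hi"},
		{Type: "compaction"},
	})
	for _, it := range items {
		fmt.Println(it.Type)
	}
}
```

The call site then appends with `items = append(items, convertUserMessage(msg)...)`, matching how convertAssistantMessage already works.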