This package provides a test OpenAI API server for testing AI Gateway functionality without requiring actual API access or credentials.
Pre-recorded OpenAI request/responses are stored as YAML files in the cassettes directory, using the go-vcr v4 format.
The test server works by:
- Automatically loading all pre-recorded API interactions from embedded "cassette" YAML files
- Matching incoming requests against recorded interactions based on the
X-Cassette-Nameheader - Replaying the recorded responses with delays faster than real platforms to keep tests fast.
This approach provides:
- Deterministic testing: Same inputs always produce same outputs
- No API credentials needed: Tests can run without OpenAI API keys
- Fast execution: No network calls to external services
- Cost savings: No API usage charges during testing
import (
"testing"
"github.com/envoyproxy/ai-gateway/internal/testopenai"
)
func TestMyFeature(t *testing.T) {
// Create server on random port - cassettes are automatically loaded
server, err := testopenai.NewServer()
require.NoError(t, err)
defer server.Close()
// Create a request for a specific cassette
req, err := testopenai.NewRequest(server.URL(), testopenai.CassetteChatBasic)
require.NoError(t, err)
// Make the request
resp, err := http.DefaultClient.Do(req)
// ... test your code
}The test server can record new interactions when:
- No matching cassette is found
OPENAI_API_KEYorAZURE_OPENAI_API_KEYis set in the environment- A cassette name is provided via
X-Cassette-Nameheader
To record a new cassette, follow these steps:
-
Add a constant for your test scenario to requests.go:
const ( // ... existing constants // CassetteChatFeatureX includes feature X, added to OpenAI version 1.2.3. CassetteChatFeatureX _cassetteNameEnd // Keep this at the end )
Note: The constants use
iotaenumeration, so your new constant must be added before_cassetteNameEndto be included in theAllCassettes()iteration.Also add its string mapping:
var stringValues = map[CassetteName]string{ // ... existing mappings CassetteChatFeatureX: "chat-feature-x", }
-
Add the request body for your test to requests.go:
var requestBodies = map[CassetteName]*openai.ChatCompletionRequest{ // ... existing entries CassetteChatFeatureX: { Model: openai.ModelGPT41Nano, Messages: []openai.ChatCompletionMessageParamUnion{ { Type: openai.ChatMessageRoleUser, Value: openai.ChatCompletionUserMessageParam{ Role: openai.ChatMessageRoleUser, Content: openai.StringOrUserRoleContentUnion{ Value: "Your test prompt", }, }, }, }, // Add your feature-specific fields here }, }
-
Run
TestNewRequestwith your API credentials set:For OpenAI:
cd tests/internal/testopenai OPENAI_API_KEY=sk-.. go test -run TestNewRequest -v
For Azure OpenAI:
cd tests/internal/testopenai AZURE_OPENAI_API_KEY=your-key \ AZURE_OPENAI_ENDPOINT=https://your-resource.cognitiveservices.azure.com \ AZURE_OPENAI_DEPLOYMENT=your-deployment-name \ OPENAI_API_VERSION=2024-02-15-preview \ go test -run TestNewRequest -v
-
Use it in tests like chat_completions_test.go
graph TD
A[Request arrives] --> B{X-Cassette-Name\nheader present?}
B -->|Yes| C[Search for specific cassette]
B -->|No| D[Search all cassettes]
C --> E{Cassette found?}
D --> F{Match found?}
E -->|Yes| G{Interaction matches?}
E -->|No| H{API key set?}
F -->|Yes| P[Return recorded response]
F -->|No| I[Return 400 error:\nInclude X-Cassette-Name header]
G -->|Yes| P
G -->|No| O[Return 409 error:\nInteraction out of date]
H -->|Yes| J[Record new interaction]
H -->|No| K[Return 500 error:\nNo cassette found]
J --> L[Make real API call]
L --> M[Save to cassette file\nwith .yaml extension]
M --> N[Return response to client]
style I fill:#f96
style K fill:#f96
style O fill:#fa6
OpenAI is not the only inference API supported, but it is special as it is the most common frontend and backend for AI Gateway. This is why we expose the requests, as we will often proxy these even if the backend is not OpenAI compatible.
The recording process would remain consistent for other cloud services, such as Anthropic or Bedrock, though there could be variations in how requests are scrubbed for secrets or handled for request signing. In a future refactoring, we could extract the core recording infrastructure into a separate package, reducing this one to just cassette constants and OpenAI-specific request recording and handling details. Most of the code could be reused for other backends.
For additional insights, refer to OpenTelemetry instrumentation, which often employs VCR for LLM frameworks as well.
Here are key parts of the OpenTelemetry Botocore Bedrock instrumentation that deals with request signing and recording:
Here are key parts of OpenInference Anthropic instrumentation, which handles their endpoint.