Description
OpenAIEmbeddingGenerator.GenerateAsync throws a null reference exception when the response does not have a usage object.
Error Message: System.NullReferenceException : Object reference not set to an instance of an object.
Stack Trace:
at Microsoft.Extensions.AI.OpenAIEmbeddingGenerator.<GenerateAsync>d__5.MoveNext()
at Microsoft.Extensions.AI.OpenTelemetryEmbeddingGenerator`2.<GenerateAsync>d__15.MoveNext()
at Microsoft.SemanticKernel.Connectors.SqliteVec.SqliteCollection`2.<UpsertAsync>d__22.MoveNext()
The Cloudflare Workers AI OpenAI-compatible endpoint does not return usage. The response looks like this:
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        -0.05989326909184456,
        ...
        -0.07520853728055954
      ],
      "index": 0
    }
  ],
  "model": "@cf/google/embeddinggemma-300m"
}
The following field is missing from the response:
"usage": {
  "prompt_tokens": 0,
  "total_tokens": 0
}
Reproduction Steps
Use the Cloudflare Workers AI OpenAI-compatible endpoint to generate embeddings.
I wrote a unit test in this repo where the output did not have usage and it also threw the exception.
Expected behavior
GenerateAsync should handle a missing usage property and either return 0 for InputTokenCount and TotalTokenCount or return a null Usage.
Actual behavior
Throws a NullReferenceException.
Regression?
I don't know.
Known Workarounds
None.
Configuration
Platform
- Language: C#, .NET 8
- Source: NuGet Microsoft.SemanticKernel.Connectors.OpenAI Version=1.67.1
- AI model: @cf/google/embeddinggemma-300m
- IDE: Visual Studio
- OS: Windows
Other information
These lines
    Usage = new()
    {
        InputTokenCount = embeddings.Usage.InputTokenCount,
        TotalTokenCount = embeddings.Usage.TotalTokenCount
    },
extensions/src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIEmbeddingGenerator.cs
Lines 78 to 82 in cba42b4
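One possible shape for a null-safe mapping is sketched below. This is not the library's code: `FakeTokenUsage`, `FakeEmbeddingResponse`, `MappedUsage`, and `UsageMapper` are hypothetical stand-ins for the real SDK types, used only so the snippet is self-contained. It takes the "null Usage" option from the expected behavior; returning zeros instead would be an equally small change.

```csharp
#nullable enable
using System;

// Hypothetical stand-ins for the SDK types (assumptions, not the real API).
public sealed record FakeTokenUsage(int InputTokenCount, int TotalTokenCount);

// Usage is nullable because some OpenAI-compatible providers omit "usage".
public sealed record FakeEmbeddingResponse(FakeTokenUsage? Usage);

public sealed class MappedUsage
{
    public long? InputTokenCount { get; set; }
    public long? TotalTokenCount { get; set; }
}

public static class UsageMapper
{
    // Copies the token counts when usage is present; returns null when the
    // provider omitted it, instead of dereferencing a null Usage.
    public static MappedUsage? ToMappedUsage(FakeEmbeddingResponse embeddings) =>
        embeddings.Usage is { } usage
            ? new MappedUsage
              {
                  InputTokenCount = usage.InputTokenCount,
                  TotalTokenCount = usage.TotalTokenCount
              }
            : null;
}
```

With this shape, `UsageMapper.ToMappedUsage(new FakeEmbeddingResponse(null))` yields null rather than throwing, and callers that aggregate token counts would need to tolerate a null usage.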