Description
Summary
Lemonade 9.1.4 introduces breaking changes to the /health endpoint that affect GAIA. This issue tracks the upgrade and required code changes.
Background
The Lemonade PR #857 removes the checkpoint_loaded and context_size fields from the top level of the /health endpoint response.
New /health Response Format (9.1.4+)
{
  "all_models_loaded": [
    {
      "backend_url": "http://127.0.0.1:8001/v1",
      "checkpoint": "unsloth/Qwen3-0.6B-GGUF:Q4_0",  # NEW location
      "device": "gpu",
      "last_use": 4343774,
      "model_name": "Qwen3-0.6B-GGUF",
      "recipe": "llamacpp",
      "recipe_options": {
        "ctx_size": 8192,  # NEW location
        "llamacpp_args": "--no-mmap",
        "llamacpp_backend": "rocm"
      },
      "type": "llm"
    }
  ],
  "log_streaming": {"sse": true, "websocket": false},
  "max_models": {"embedding": 1, "llm": 1, "reranking": 1},
  "model_loaded": "Qwen3-0.6B-GGUF",
  "status": "ok",
  "version": "9.1.4"
}
Migration Guide
| Old Field (removed) | New Location |
|---|---|
| response["checkpoint_loaded"] | response["all_models_loaded"][N]["checkpoint"] |
| response["context_size"] | response["all_models_loaded"][N]["recipe_options"]["ctx_size"] |
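The mapping in the table can be sketched as a small reader that accepts either response shape. This is an illustration rather than GAIA code: extract_model_info is a hypothetical name, index 0 as the default assumes the first entry is the model of interest, and the old-style payload below uses placeholder values.

```python
from typing import Any, Dict, Optional, Tuple


def extract_model_info(
    health: Dict[str, Any], index: int = 0
) -> Tuple[Optional[str], int]:
    """Return (checkpoint, ctx_size) from a /health payload.

    Hypothetical helper: prefers the 9.1.4+ per-model fields and falls
    back to the removed top-level fields when they are absent.
    """
    models = health.get("all_models_loaded") or []
    checkpoint = None
    ctx_size = None
    if 0 <= index < len(models):
        entry = models[index]
        checkpoint = entry.get("checkpoint")
        ctx_size = (entry.get("recipe_options") or {}).get("ctx_size")
    # Older Lemonade versions: fields lived at the top level.
    if checkpoint is None:
        checkpoint = health.get("checkpoint_loaded")
    if ctx_size is None:
        ctx_size = health.get("context_size", 0)
    return checkpoint, int(ctx_size or 0)


new_style = {
    "all_models_loaded": [
        {
            "checkpoint": "unsloth/Qwen3-0.6B-GGUF:Q4_0",
            "recipe_options": {"ctx_size": 8192},
        }
    ]
}
# Placeholder values, only to exercise the fallback path.
old_style = {"checkpoint_loaded": "some/checkpoint", "context_size": 4096}

print(extract_model_info(new_style))
print(extract_model_info(old_style))
```

Falling back field by field, rather than on the presence of all_models_loaded alone, keeps the helper usable if an older server already returns that list without recipe_options.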
Impact Analysis
Affected Code in GAIA
1. src/gaia/llm/lemonade_client.py - validate_context_size() ⚠️ HIGH IMPACT
def validate_context_size(self, required_tokens: int = 32768, quiet: bool = False) -> tuple:
    try:
        health = self.health_check()
        reported_ctx = health.get("context_size", 0)  # ⚠️ Will always be 0 on 9.1.4+
        if reported_ctx >= required_tokens:
            # ...
Impact: Context validation will always fail because context_size is missing from the top level of the response, so reported_ctx defaults to 0.
2. src/gaia/llm/lemonade_client.py - get_status() ⚠️ MEDIUM IMPACT
def get_status(self) -> LemonadeStatus:
    # ...
    status.context_size = health.get("context_size", 0)  # ⚠️ Will always be 0
Impact: LemonadeStatus.context_size will always report 0, affecting any status displays or dependent logic.
3. tests/test_lemonade_client.py - Test Mocks 🔧 LOW IMPACT
health_response = {
    "status": "ok",
    "checkpoint_loaded": "amd/Llama-3.2-3B-Instruct-...",  # Field being removed
    "model_loaded": TEST_MODEL,
}
Impact: Test mocks use the outdated response format. Tests still pass because checkpoint_loaded isn't parsed in production code, but the mocks should be updated to reflect the new format.
Not Affected
- checkpoint_loaded - Only appears in test mocks, not parsed in production code
- Basic health checks (status == "ok") - Still work as before
Required Changes
1. Update validate_context_size()
def validate_context_size(self, required_tokens: int = 32768, quiet: bool = False) -> tuple:
    try:
        health = self.health_check()
        # Lemonade 9.1.4+: context_size moved to all_models_loaded[N].recipe_options.ctx_size
        all_models = health.get("all_models_loaded", [])
        if all_models:
            # Get context size from the first loaded model (typically the LLM)
            reported_ctx = all_models[0].get("recipe_options", {}).get("ctx_size", 0)
        else:
            # Fallback for older Lemonade versions
            reported_ctx = health.get("context_size", 0)
        # ...
2. Update get_status()
def get_status(self) -> LemonadeStatus:
    # ...
    all_models = health.get("all_models_loaded", [])
    if all_models:
        status.context_size = all_models[0].get("recipe_options", {}).get("ctx_size", 0)
    else:
        status.context_size = health.get("context_size", 0)
3. Update Test Mocks
Update tests/test_lemonade_client.py to use the new response format with all_models_loaded.
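A minimal sketch of what an updated mock could look like, using only fields from the 9.1.4 response shown earlier. make_health_response is a hypothetical factory, not an existing test utility, and the TEST_MODEL value here is a stand-in for whatever the real test module defines.

```python
from typing import Any, Dict

# Stand-in value; the real test module defines its own TEST_MODEL.
TEST_MODEL = "Qwen3-0.6B-GGUF"


def make_health_response(
    model_name: str = TEST_MODEL,
    checkpoint: str = "unsloth/Qwen3-0.6B-GGUF:Q4_0",
    ctx_size: int = 8192,
) -> Dict[str, Any]:
    """Build a /health payload in the Lemonade 9.1.4+ shape (hypothetical helper)."""
    return {
        "status": "ok",
        "version": "9.1.4",
        "model_loaded": model_name,
        "all_models_loaded": [
            {
                "model_name": model_name,
                "checkpoint": checkpoint,
                "recipe": "llamacpp",
                "recipe_options": {"ctx_size": ctx_size},
                "type": "llm",
            }
        ],
    }


health_response = make_health_response(ctx_size=32768)
first_model = health_response["all_models_loaded"][0]

# The removed top-level fields must not appear in a 9.1.4-style mock.
assert "context_size" not in health_response
assert "checkpoint_loaded" not in health_response
assert first_model["recipe_options"]["ctx_size"] == 32768
```

Building the payload through one factory keeps the context size adjustable per test, which makes it straightforward to cover both the passing and failing branches of validate_context_size().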
4. Consider: Multi-Model Support
The new format supports multiple models. Consider whether GAIA should:
- Track context sizes per model type (LLM vs embedding vs reranking)
- Update LemonadeStatus to store multiple context sizes
- Add helper methods to query specific model types
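One way the per-type tracking could look, as a rough sketch. context_sizes_by_type is a hypothetical helper; the "embedding" type string is an assumption inferred from the max_models keys, and the second entry below is made up purely to show the lookup.

```python
from typing import Any, Dict


def context_sizes_by_type(health: Dict[str, Any]) -> Dict[str, int]:
    """Map each loaded model type to the ctx_size of its first entry."""
    sizes: Dict[str, int] = {}
    for entry in health.get("all_models_loaded") or []:
        ctx = (entry.get("recipe_options") or {}).get("ctx_size")
        if ctx is None:
            continue  # this entry does not report a context size
        # setdefault keeps the first model seen for each type.
        sizes.setdefault(entry.get("type", "unknown"), int(ctx))
    return sizes


health = {
    "all_models_loaded": [
        {
            "model_name": "Qwen3-0.6B-GGUF",
            "type": "llm",
            "recipe_options": {"ctx_size": 8192},
        },
        # Made-up second entry, only to demonstrate the per-type lookup.
        {
            "model_name": "example-embedder",
            "type": "embedding",
            "recipe_options": {"ctx_size": 2048},
        },
    ]
}

sizes = context_sizes_by_type(health)
print(sizes.get("llm", 0))
```

Selecting by type instead of by position would also remove the reliance on all_models_loaded[0] being the LLM.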
Acceptance Criteria
- Update validate_context_size() to read from new location
- Update get_status() to read from new location
- Update test mocks to use Lemonade 9.1.4 response format
- Add backward compatibility for older Lemonade versions (if needed)
- Update minimum Lemonade version requirement in documentation/requirements
- Test with Lemonade 9.1.4
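For the backward-compatibility and minimum-version items, one option is to branch on the version field that /health already returns. The sketch below assumes older servers also report version in the same dotted form, which this issue does not confirm; parse_version and uses_new_health_format are hypothetical names.

```python
import re
from typing import Any, Dict, Tuple

# First Lemonade release with the new /health layout, per this issue.
NEW_FORMAT_VERSION = (9, 1, 4)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '9.1.4' into (9, 1, 4), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def uses_new_health_format(health: Dict[str, Any]) -> bool:
    """True when the reported server version is 9.1.4 or newer."""
    reported = parse_version(str(health.get("version", "")))
    return reported >= NEW_FORMAT_VERSION


print(uses_new_health_format({"version": "9.1.4"}))
print(uses_new_health_format({"version": "9.0.8"}))
```

Checking for the fields themselves is more tolerant of unexpected version strings, so a version gate like this is probably better suited to emitting a clear "please upgrade Lemonade" message than to choosing the parsing path.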
References
- Lemonade PR: remove checkpoint_loaded and context_size from /health and add recipe and options in all_models_loaded (lemonade-sdk/lemonade#857)
- Lemonade 9.1.4 Release