Lemonade 9.1.4 cautionary upgrade #193

@itomek

Summary

Lemonade 9.1.4 introduces breaking changes to the /health endpoint that affect GAIA. This issue tracks the upgrade and required code changes.

Background

Lemonade PR #857 removes the checkpoint_loaded and context_size fields from the top level of the /health endpoint response. The same information is now reported per model, inside each entry of all_models_loaded.

New /health Response Format (9.1.4+)

{
    "all_models_loaded": [
        {
            "backend_url": "http://127.0.0.1:8001/v1",
            "checkpoint": "unsloth/Qwen3-0.6B-GGUF:Q4_0",  # NEW location
            "device": "gpu",
            "last_use": 4343774,
            "model_name": "Qwen3-0.6B-GGUF",
            "recipe": "llamacpp",
            "recipe_options": {
                "ctx_size": 8192,  # NEW location
                "llamacpp_args": "--no-mmap",
                "llamacpp_backend": "rocm"
            },
            "type": "llm"
        }
    ],
    "log_streaming": {"sse": true, "websocket": false},
    "max_models": {"embedding": 1, "llm": 1, "reranking": 1},
    "model_loaded": "Qwen3-0.6B-GGUF",
    "status": "ok",
    "version": "9.1.4"
}

Migration Guide

| Old Field (removed) | New Location |
| --- | --- |
| response["checkpoint_loaded"] | response["all_models_loaded"][N]["checkpoint"] |
| response["context_size"] | response["all_models_loaded"][N]["recipe_options"]["ctx_size"] |
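To make the mapping concrete, here is a minimal sketch of a reader that accepts either layout. The function name read_model_info and its index parameter are hypothetical and not part of GAIA or Lemonade; the field names come from the response shown above.

```python
from typing import Any, Dict, Optional, Tuple


def read_model_info(health: Dict[str, Any], index: int = 0) -> Tuple[Optional[str], int]:
    """Return (checkpoint, ctx_size) from a /health payload in either layout.

    Hypothetical helper for illustration only.
    """
    models = health.get("all_models_loaded") or []
    if 0 <= index < len(models):
        # 9.1.4+ layout: the fields live on the per-model entry.
        entry = models[index]
        options = entry.get("recipe_options") or {}
        return entry.get("checkpoint"), int(options.get("ctx_size", 0) or 0)
    # Older layout: both fields sit at the top level of the response.
    return health.get("checkpoint_loaded"), int(health.get("context_size", 0) or 0)


new_style = {
    "all_models_loaded": [
        {
            "checkpoint": "unsloth/Qwen3-0.6B-GGUF:Q4_0",
            "recipe_options": {"ctx_size": 8192},
        }
    ]
}
old_style = {
    "checkpoint_loaded": "unsloth/Qwen3-0.6B-GGUF:Q4_0",
    "context_size": 8192,
}

print(read_model_info(new_style))
print(read_model_info(old_style))
```

Both calls should yield the same pair, which is the property the migration needs to preserve.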

Impact Analysis

Affected Code in GAIA

1. src/gaia/llm/lemonade_client.py - validate_context_size() ⚠️ HIGH IMPACT

def validate_context_size(self, required_tokens: int = 32768, quiet: bool = False) -> tuple:
    try:
        health = self.health_check()
        reported_ctx = health.get("context_size", 0)  # ⚠️ Will always be 0 after 9.1.4
        if reported_ctx >= required_tokens:
            # ...

Impact: Context validation will fail for any positive required_tokens, because context_size is no longer present at the top level and reported_ctx therefore defaults to 0.
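The failure can be reproduced without a running server. The sketch below uses a trimmed copy of the 9.1.4 response from above; the 4096-token requirement is an arbitrary value picked for the demonstration.

```python
# Trimmed 9.1.4-style /health payload (fields taken from the example response).
health_914 = {
    "all_models_loaded": [
        {"recipe_options": {"ctx_size": 8192}, "type": "llm"},
    ],
    "model_loaded": "Qwen3-0.6B-GGUF",
    "status": "ok",
    "version": "9.1.4",
}

required_tokens = 4096  # arbitrary requirement for the demonstration

# Old lookup: the key is gone, so the default of 0 is returned.
reported_ctx = health_914.get("context_size", 0)

# New lookup: read the value from the per-model entry instead.
nested_ctx = health_914["all_models_loaded"][0]["recipe_options"]["ctx_size"]

old_check_passes = reported_ctx >= required_tokens
new_check_passes = nested_ctx >= required_tokens
print(old_check_passes, new_check_passes)  # prints "False True"
```

The model has an 8192-token context, yet the old lookup reports 0 and rejects it.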

2. src/gaia/llm/lemonade_client.py - get_status() ⚠️ MEDIUM IMPACT

def get_status(self) -> LemonadeStatus:
    # ...
    status.context_size = health.get("context_size", 0)  # ⚠️ Will always be 0

Impact: LemonadeStatus.context_size will always report 0, affecting any status displays or dependent logic.

3. tests/test_lemonade_client.py - Test Mocks 🔧 LOW IMPACT

health_response = {
    "status": "ok",
    "checkpoint_loaded": "amd/Llama-3.2-3B-Instruct-...",  # Field being removed
    "model_loaded": TEST_MODEL,
}

Impact: Test mocks use outdated response format. Tests still pass since checkpoint_loaded isn't actually parsed in production code, but mocks should be updated to reflect the new format.

Not Affected

  • checkpoint_loaded - Only appears in test mocks, not parsed in production code
  • Basic health checks (status == "ok") - Still work as before

Required Changes

1. Update validate_context_size()

def validate_context_size(self, required_tokens: int = 32768, quiet: bool = False) -> tuple:
    try:
        health = self.health_check()
        
        # Lemonade 9.1.4+: context_size moved to all_models_loaded[N].recipe_options.ctx_size
        all_models = health.get("all_models_loaded", [])
        if all_models:
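            # Get context size from the first loaded model. This assumes the first entry is the LLM;
            # if embedding or reranking models are also loaded, filter on type == "llm" instead.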
            reported_ctx = all_models[0].get("recipe_options", {}).get("ctx_size", 0)
        else:
            # Fallback for older Lemonade versions
            reported_ctx = health.get("context_size", 0)
        # ...

2. Update get_status()

def get_status(self) -> LemonadeStatus:
    # ...
    all_models = health.get("all_models_loaded", [])
    if all_models:
        status.context_size = all_models[0].get("recipe_options", {}).get("ctx_size", 0)
    else:
        status.context_size = health.get("context_size", 0)

3. Update Test Mocks

Update tests/test_lemonade_client.py to use the new response format with all_models_loaded.
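A possible shape for the updated mock is sketched below. The TEST_MODEL value and the MagicMock stand-in for the client are placeholders chosen for this example; the real test module defines its own constant and patches the real client.

```python
from unittest.mock import MagicMock

TEST_MODEL = "Qwen3-0.6B-GGUF"  # placeholder; the test module has its own value

# Mock /health payload in the 9.1.4 layout: no top-level checkpoint_loaded
# or context_size, per-model details under all_models_loaded.
health_response = {
    "status": "ok",
    "version": "9.1.4",
    "model_loaded": TEST_MODEL,
    "all_models_loaded": [
        {
            "model_name": TEST_MODEL,
            "checkpoint": "unsloth/Qwen3-0.6B-GGUF:Q4_0",
            "type": "llm",
            "recipe": "llamacpp",
            "recipe_options": {"ctx_size": 8192},
        }
    ],
}

# Stand-in for the client under test (assumption: the real tests patch
# health_check on the actual client class instead).
client = MagicMock()
client.health_check.return_value = health_response

health = client.health_check()
ctx = health["all_models_loaded"][0]["recipe_options"]["ctx_size"]
print(ctx)
```

Keeping one mock in the old layout as well would let the tests cover the fallback branch for older Lemonade versions.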

4. Consider: Multi-Model Support

The new format supports multiple models. Consider whether GAIA should:

  • Track context sizes per model type (LLM vs embedding vs reranking)
  • Update LemonadeStatus to store multiple context sizes
  • Add helper methods to query specific model types
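One way the helper-method idea could look is sketched here. The names context_sizes_by_type and llm_context_size are hypothetical, and the embedding entry in the sample payload is made-up data; whether Lemonade reports ctx_size for non-LLM models is an assumption that would need checking against a real server.

```python
from typing import Any, Dict, List


def context_sizes_by_type(health: Dict[str, Any]) -> Dict[str, List[int]]:
    """Group reported ctx_size values by each entry's "type" field."""
    sizes: Dict[str, List[int]] = {}
    for entry in health.get("all_models_loaded") or []:
        model_type = entry.get("type", "unknown")
        ctx = (entry.get("recipe_options") or {}).get("ctx_size")
        if ctx is not None:
            sizes.setdefault(model_type, []).append(int(ctx))
    return sizes


def llm_context_size(health: Dict[str, Any]) -> int:
    """Context size of the first loaded LLM, whatever its list position."""
    llm_sizes = context_sizes_by_type(health).get("llm", [])
    if llm_sizes:
        return llm_sizes[0]
    # Fallback for older Lemonade versions with the top-level field.
    return int(health.get("context_size", 0) or 0)


# Made-up payload where the LLM is NOT the first entry.
multi_model_health = {
    "all_models_loaded": [
        {"type": "embedding", "recipe_options": {"ctx_size": 2048}},
        {"type": "llm", "recipe_options": {"ctx_size": 8192}},
    ]
}

print(context_sizes_by_type(multi_model_health))
print(llm_context_size(multi_model_health))
```

Selecting by type avoids depending on list order, which all_models[0] does implicitly.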

Acceptance Criteria

  • Update validate_context_size() to read from new location
  • Update get_status() to read from new location
  • Update test mocks to use Lemonade 9.1.4 response format
  • Add backward compatibility for older Lemonade versions (if needed)
  • Update minimum Lemonade version requirement in documentation/requirements
  • Test with Lemonade 9.1.4
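For the backward-compatibility item, one option is to branch on the version field that the 9.1.4 response includes. The sketch below assumes older servers also report a dotted numeric version string, which this issue does not confirm; when the field is missing or unparseable it falls back to checking whether the old top-level key is absent. Both function names are hypothetical.

```python
import re
from typing import Any, Dict, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a dotted version string."""
    parts = []
    for piece in text.strip().split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def uses_nested_health_format(health: Dict[str, Any]) -> bool:
    """Guess whether a /health payload follows the 9.1.4+ layout."""
    version = health.get("version")
    if isinstance(version, str):
        parsed = parse_version(version)
        if parsed:
            return parsed >= (9, 1, 4)
    # No usable version: treat a missing top-level context_size as the new layout.
    return "context_size" not in health


print(uses_nested_health_format({"version": "9.1.4"}))
print(uses_nested_health_format({"version": "9.1.3", "context_size": 8192}))
```

Feature detection alone, as in the all_models_loaded check proposed above, may be enough; the version check mainly helps when producing a clear "unsupported Lemonade version" message.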
