Lemonade 9.1.4 cautionary upgrade #193

## Summary

Lemonade 9.1.4 introduces breaking changes to the `/health` endpoint that affect GAIA. This issue tracks the upgrade and required code changes.

## Background

[Lemonade PR #857](https://github.com/lemonade-sdk/lemonade/pull/857) removes the `checkpoint_loaded` and `context_size` fields from the top level of the `/health` endpoint response. Both values are now reported per model, inside each entry of `all_models_loaded`.

### New `/health` Response Format (9.1.4+)
```python
{
    "all_models_loaded": [
        {
            "backend_url": "http://127.0.0.1:8001/v1",
            "checkpoint": "unsloth/Qwen3-0.6B-GGUF:Q4_0",  # NEW location
            "device": "gpu",
            "last_use": 4343774,
            "model_name": "Qwen3-0.6B-GGUF",
            "recipe": "llamacpp",
            "recipe_options": {
                "ctx_size": 8192,  # NEW location
                "llamacpp_args": "--no-mmap",
                "llamacpp_backend": "rocm"
            },
            "type": "llm"
        }
    ],
    "log_streaming": {"sse": True, "websocket": False},  # JSON true/false once parsed
    "max_models": {"embedding": 1, "llm": 1, "reranking": 1},
    "model_loaded": "Qwen3-0.6B-GGUF",
    "status": "ok",
    "version": "9.1.4"
}
```

### Migration Guide
| Old Field (removed) | New Location |
|---------------------|--------------|
| `response["checkpoint_loaded"]` | `response["all_models_loaded"][N]["checkpoint"]` |
| `response["context_size"]` | `response["all_models_loaded"][N]["recipe_options"]["ctx_size"]` |
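
As a rough illustration of the mapping above, the sketch below walks `all_models_loaded` and pulls out both relocated fields for every loaded model. The helper name `loaded_model_fields` and the trimmed sample payload are assumptions made for illustration; neither is part of GAIA or Lemonade.

```python
from typing import Any, Optional


def loaded_model_fields(
    health: dict[str, Any],
) -> list[tuple[Optional[str], Optional[str], Optional[int]]]:
    """Return (model_name, checkpoint, ctx_size) for every loaded model.

    Hypothetical helper written against the 9.1.4+ /health format.
    Missing fields come back as None rather than raising.
    """
    rows = []
    for model in health.get("all_models_loaded") or []:
        options = model.get("recipe_options") or {}
        rows.append(
            (model.get("model_name"), model.get("checkpoint"), options.get("ctx_size"))
        )
    return rows


# Trimmed copy of the sample response shown earlier
sample_health = {
    "all_models_loaded": [
        {
            "checkpoint": "unsloth/Qwen3-0.6B-GGUF:Q4_0",
            "model_name": "Qwen3-0.6B-GGUF",
            "recipe_options": {"ctx_size": 8192},
            "type": "llm",
        }
    ],
    "model_loaded": "Qwen3-0.6B-GGUF",
    "status": "ok",
    "version": "9.1.4",
}

print(loaded_model_fields(sample_health))
# [('Qwen3-0.6B-GGUF', 'unsloth/Qwen3-0.6B-GGUF:Q4_0', 8192)]
```

A pre-9.1.4 response without `all_models_loaded` simply yields an empty list here, so callers that still need the old top-level fields would have to check for them separately.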

---

## Impact Analysis

### Affected Code in GAIA

#### 1. `src/gaia/llm/lemonade_client.py` - `validate_context_size()` ⚠️ **HIGH IMPACT**
```python
def validate_context_size(self, required_tokens: int = 32768, quiet: bool = False) -> tuple:
    try:
        health = self.health_check()
        reported_ctx = health.get("context_size", 0)  # ⚠️ Always 0 with Lemonade 9.1.4+
        if reported_ctx >= required_tokens:
            # ...
```

**Impact:** Context validation will always fail because `context_size` will be missing from the top level, causing `reported_ctx` to default to `0`.
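
The failure can be reproduced without a running server. The snippet below is a minimal sketch that applies the current top-level lookup to a trimmed 9.1.4-style payload; the payload and the `required_tokens` value are made up for the demonstration.

```python
# Trimmed 9.1.4-style payload: the loaded model really has an 8192-token context
new_health = {
    "status": "ok",
    "version": "9.1.4",
    "all_models_loaded": [{"type": "llm", "recipe_options": {"ctx_size": 8192}}],
}

required_tokens = 4096  # deliberately below the model's actual context size

# Same lookup as the current validate_context_size()
reported_ctx = new_health.get("context_size", 0)
is_valid = reported_ctx >= required_tokens

print(reported_ctx)  # 0
print(is_valid)      # False
```

Even though the model's context (8192) exceeds the requirement (4096), the check fails because the top-level key no longer exists.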

#### 2. `src/gaia/llm/lemonade_client.py` - `get_status()` ⚠️ **MEDIUM IMPACT**
```python
def get_status(self) -> LemonadeStatus:
    # ...
    status.context_size = health.get("context_size", 0)  # ⚠️ Will always be 0
```

**Impact:** `LemonadeStatus.context_size` will always report `0`, affecting any status displays or dependent logic.

#### 3. `tests/test_lemonade_client.py` - Test Mocks 🔧 **LOW IMPACT**
```python
health_response = {
    "status": "ok",
    "checkpoint_loaded": "amd/Llama-3.2-3B-Instruct-...",  # Field being removed
    "model_loaded": TEST_MODEL,
}
```

**Impact:** The test mocks use the outdated response format. The tests still pass because production code never parses `checkpoint_loaded`, but the mocks should be updated to reflect the new format.

### Not Affected
- `checkpoint_loaded` - Only appears in test mocks, not parsed in production code
- Basic health checks (`status == "ok"`) - Still work as before

---

## Required Changes

### 1. Update `validate_context_size()`
```python
def validate_context_size(self, required_tokens: int = 32768, quiet: bool = False) -> tuple:
    try:
        health = self.health_check()
        
        # Lemonade 9.1.4+: context_size moved to all_models_loaded[N].recipe_options.ctx_size
        all_models = health.get("all_models_loaded") or []
        reported_ctx = 0
        if all_models:
            # Get context size from the first loaded model (typically the LLM)
            reported_ctx = (all_models[0].get("recipe_options") or {}).get("ctx_size") or 0
        if not reported_ctx:
            # Fallback for older Lemonade versions that report it at the top level
            reported_ctx = health.get("context_size", 0)
        # ...
```

### 2. Update `get_status()`
```python
def get_status(self) -> LemonadeStatus:
    # ...
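    all_models = health.get("all_models_loaded") or []
    status.context_size = 0
    if all_models:
        status.context_size = (all_models[0].get("recipe_options") or {}).get("ctx_size") or 0
    if not status.context_size:
        # Fallback for older Lemonade versions that report it at the top level
        status.context_size = health.get("context_size", 0)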
```

### 3. Update Test Mocks

Update `tests/test_lemonade_client.py` to use the new response format with `all_models_loaded`.
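
One possible shape for the updated mocks is sketched below, with a legacy mock kept alongside the new one so the fallback path stays covered. The names `NEW_HEALTH_RESPONSE`, `LEGACY_HEALTH_RESPONSE`, and `read_ctx_size` are hypothetical, the `TEST_MODEL` value is a placeholder, and `read_ctx_size` is only a stand-in that tries the new location first and then the legacy top-level field. The real tests would exercise `LemonadeClient` instead.

```python
TEST_MODEL = "Qwen3-0.6B-GGUF"  # placeholder; the real suite defines its own TEST_MODEL

# Mock in the Lemonade 9.1.4+ format (values borrowed from the sample response)
NEW_HEALTH_RESPONSE = {
    "status": "ok",
    "version": "9.1.4",
    "model_loaded": TEST_MODEL,
    "all_models_loaded": [
        {
            "model_name": TEST_MODEL,
            "checkpoint": "unsloth/Qwen3-0.6B-GGUF:Q4_0",
            "type": "llm",
            "recipe": "llamacpp",
            "recipe_options": {"ctx_size": 8192},
        }
    ],
}

# Mock in the pre-9.1.4 format, kept to exercise the fallback path
LEGACY_HEALTH_RESPONSE = {
    "status": "ok",
    "model_loaded": TEST_MODEL,
    "context_size": 4096,  # arbitrary test value
}


def read_ctx_size(health: dict) -> int:
    """Stand-in lookup: new per-model location first, then the legacy field."""
    all_models = health.get("all_models_loaded") or []
    ctx = 0
    if all_models:
        ctx = (all_models[0].get("recipe_options") or {}).get("ctx_size") or 0
    return ctx or health.get("context_size", 0)


def test_ctx_size_new_format():
    assert read_ctx_size(NEW_HEALTH_RESPONSE) == 8192


def test_ctx_size_legacy_format():
    assert read_ctx_size(LEGACY_HEALTH_RESPONSE) == 4096


def test_ctx_size_missing_everywhere():
    assert read_ctx_size({"status": "ok"}) == 0
```

Keeping both mocks makes it cheap to drop the legacy case later, once the minimum supported Lemonade version is raised.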

### 4. Consider: Multi-Model Support

The new format supports multiple models. Consider whether GAIA should:
- Track context sizes per model type (LLM vs embedding vs reranking)
- Update `LemonadeStatus` to store multiple context sizes
- Add helper methods to query specific model types
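
To make the options above more concrete, here is a minimal sketch of a per-type lookup. Both helper names are hypothetical, and the embedding entry in the sample is invented: the source does not confirm whether Lemonade reports `ctx_size` for non-LLM models or in what order models are listed.

```python
from typing import Any


def context_sizes_by_type(health: dict[str, Any]) -> dict[str, int]:
    """Map each loaded model's "type" (e.g. "llm") to its ctx_size.

    Hypothetical helper. Entries without a type or ctx_size are skipped;
    if several models share a type, the last one listed wins.
    """
    sizes: dict[str, int] = {}
    for model in health.get("all_models_loaded") or []:
        ctx = (model.get("recipe_options") or {}).get("ctx_size")
        model_type = model.get("type")
        if model_type and ctx:
            sizes[model_type] = ctx
    return sizes


def llm_context_size(health: dict[str, Any]) -> int:
    """Context size of the loaded LLM, else the legacy top-level value, else 0."""
    return context_sizes_by_type(health).get("llm") or health.get("context_size", 0)


# Invented two-model payload; only the fields the helpers read are included
sample = {
    "all_models_loaded": [
        {"model_name": "example-embedder", "type": "embedding",
         "recipe_options": {"ctx_size": 2048}},
        {"model_name": "Qwen3-0.6B-GGUF", "type": "llm",
         "recipe_options": {"ctx_size": 8192}},
    ]
}

print(context_sizes_by_type(sample))  # {'embedding': 2048, 'llm': 8192}
print(llm_context_size(sample))       # 8192
```

In this invented payload `all_models_loaded[0]` is the embedding model, so an index-based lookup would report 2048 for the LLM. Selecting by `type` avoids depending on list order.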

---

## Acceptance Criteria

- [ ] Update `validate_context_size()` to read from the new location
- [ ] Update `get_status()` to read from the new location
- [ ] Update test mocks to use Lemonade 9.1.4 response format
- [ ] Add backward compatibility for older Lemonade versions (if needed)
- [ ] Update minimum Lemonade version requirement in documentation/requirements
- [ ] Test with Lemonade 9.1.4

---

## References

- Lemonade PR: https://github.com/lemonade-sdk/lemonade/pull/857
- Lemonade 9.1.4 Release
