
fix(llm): kimi-k2.6 rejects temperature=0 — raises 400 on every call #610

@furkankoykiran

Description

Problem

extract_corpus_parallel(files, backend="kimi") always raises openai.BadRequestError (400) before extracting a single node; the request is rejected before it ever reaches the model.

openai.BadRequestError: Error code: 400 - {
  'error': {
    'message': 'invalid temperature: only 1 is allowed for this model',
    'type': 'invalid_request_error'
  }
}

Root cause

_call_openai_compat hardcodes temperature=0:

# graphify/llm.py, _call_openai_compat
resp = client.chat.completions.create(
    model=model,
    ...
    temperature=0,   # <-- always sent, regardless of backend
)

kimi-k2.6 (and kimi-k2.5) enforce a model-level fixed temperature. The official Kimi API docs state:

k2.6/k2.5 will use a fixed value of 1.0 in thinking mode and 0.6 in instant mode. Any other value will result in an error.

This is a model constraint, not an account tier restriction. Every user hitting the Kimi backend gets this error regardless of their plan.

Repro

import os
from pathlib import Path
from graphify.llm import extract_corpus_parallel

os.environ["MOONSHOT_API_KEY"] = "<your-key>"
extract_corpus_parallel([Path("README.md")], backend="kimi")
# → openai.BadRequestError: 400 invalid temperature
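
For contrast, the same request goes through when the temperature kwarg is omitted entirely. A minimal sketch outside graphify, assuming the documented Moonshot endpoint (use the .ai host for the international endpoint) and the model name as used by the kimi backend above:

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.moonshot.cn/v1",  # assumed Moonshot endpoint
    api_key=os.environ["MOONSHOT_API_KEY"],
)
resp = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[{"role": "user", "content": "ping"}],
    # no temperature kwarg → the model falls back to its fixed default
)
print(resp.choices[0].message.content)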

Fix

Make temperature backend-configurable in BACKENDS and skip the parameter when None:

BACKENDS = {
    "claude": {
        ...
        "temperature": 0,   # deterministic extraction
    },
    "kimi": {
        ...
        "temperature": None,  # model enforces its own fixed temperature; sending any value raises 400
    },
}

def _call_openai_compat(base_url, api_key, model, user_message, temperature=0):
    create_kwargs = {
        "model": model,
        "messages": [...],
        "max_completion_tokens": 8192,
    }
    if temperature is not None:  # omit the kwarg entirely when unset
        create_kwargs["temperature"] = temperature
    resp = client.chat.completions.create(**create_kwargs)
    return resp

def extract_files_direct(...):
    ...
    return _call_openai_compat(
        cfg["base_url"], key, mdl, user_msg,
        temperature=cfg.get("temperature", 0)
    )

This keeps temperature=0 for Claude (deterministic), omits the parameter for Kimi (lets the model use its built-in default), and is trivially extensible for future backends.

The same pattern applies to any future OpenAI-compat backend that restricts or ignores the temperature field.
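
For illustration, adding such a backend would be a one-entry config change ("acme" and its endpoint are invented here; only the temperature: None convention matters):

# Hypothetical future backend that rejects or ignores temperature.
BACKENDS["acme"] = {
    "base_url": "https://api.acme.example/v1",
    "model": "acme-chat-1",
    "temperature": None,  # omit the kwarg entirely for this backend
}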

Notes

  • Same root cause affects kimi-k2.5 — both models have the fixed-temperature constraint.
  • The _call_claude path is unaffected (uses the anthropic SDK directly, no temperature kwarg sent).
  • No behavior change for existing Claude users.

Happy to submit a PR with tests if useful.
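
For reference, a sketch of what those tests could look like. It assumes _call_openai_compat builds its client via an OpenAI class imported in graphify/llm.py and returns the raw response without post-processing; the patch target may need adjusting to the real import path:

from unittest.mock import patch

from graphify.llm import _call_openai_compat

@patch("graphify.llm.OpenAI")  # hypothetical patch target
def test_temperature_omitted_when_none(mock_openai):
    client = mock_openai.return_value
    _call_openai_compat("https://api.example/v1", "key", "kimi-k2.6",
                        "hello", temperature=None)
    _, kwargs = client.chat.completions.create.call_args
    assert "temperature" not in kwargs

@patch("graphify.llm.OpenAI")  # hypothetical patch target
def test_temperature_forwarded_when_set(mock_openai):
    client = mock_openai.return_value
    _call_openai_compat("https://api.example/v1", "key", "claude-model",
                        "hello", temperature=0)
    _, kwargs = client.chat.completions.create.call_args
    assert kwargs["temperature"] == 0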
