Releases: defilantech/LLMKube

v0.5.3

01 Apr 17:10
86f9bbe

0.5.3 (2026-04-01)

Features

  • add KV cache type configuration and extraArgs escape hatch (#256) (7a4b855)
  • add Ollama as runtime backend for Metal agent (#258) (6148b89)
  • add oMLX as alternative runtime backend for Metal agent (#257) (eaf9045)

Bug Fixes

llmkube-0.5.3

01 Apr 17:10
86f9bbe

A Helm chart for LLMKube - Kubernetes operator for GPU-accelerated LLM inference

v0.5.2

28 Mar 02:36
eed8274

0.5.2 (2026-03-27)

Features

  • add pod security context defaults and CRD overrides (#239) (904432b)

Documentation

llmkube-0.5.2

28 Mar 02:36
eed8274

A Helm chart for LLMKube - Kubernetes operator for GPU-accelerated LLM inference

v0.5.1

16 Mar 06:48
4a22006

0.5.1 (2026-03-16)

Features

  • add memory pressure watchdog with runtime monitoring (#216) (5fa6d54)
  • add pvc:// model source and SHA256 integrity verification (#229) (1b94f5d)
  • auto-detect llama-server from Homebrew paths on macOS (#215) (a1e4302)

Bug Fixes

  • controller metrics port declarations and ServiceMonitor consistency (#214) (296ec99)
  • correct CHANGELOG entry from 0.4.21 to 0.5.0 (#212) (f7f703a)
  • quote job-level if expression to fix YAML parsing in helm-chart workflow (8714b9f)

llmkube-0.5.1

16 Mar 06:48
4a22006

A Helm chart for LLMKube - Kubernetes operator for GPU-accelerated LLM inference

llmkube-0.5.0

05 Mar 09:35

Helm chart for LLMKube v0.5.0. Fixes appVersion to match the published controller image.

v0.5.0

04 Mar 09:40
b2e53b8

0.5.0 (2026-03-04)

Features

  • add pre-flight memory validation for Metal agent (#204) (ba252ef)
  • add health checks, metrics, and continuous monitoring to Metal agent (#205) (a113fd1)
  • add per-model memoryBudget and memoryFraction CRD fields (#206) (e632369)

Bug Fixes

  • agent: unregister service endpoints on metal process delete (#168) (147b9bc)
  • enable controller metrics endpoint in Helm chart (#195) (70940af)
  • prevent model re-download of cached models after helm upgrade (#203) (a8f9a88)
  • use Recreate strategy for GPU workloads to prevent rolling update deadlock (#196) (2e45181)

Documentation

  • rewrite README for clarity, positioning, and growth (#190) (a7fc152)

v0.4.20

01 Mar 00:21
205d91d

0.4.20 (2026-02-28)

Features

  • add license compliance scanning for GGUF models (#188) (c26400a)
  • add Prometheus metrics, OpenTelemetry tracing, and inference observability (#189) (c653ff1)
  • add PVC inspection to cache list for orphaned entry detection (#183) (2723d92)
  • agent: add structured zap logging to metal agent (#164) (e9d143c)
  • deps: upgrade to Kubernetes 1.35 and controller-runtime v0.23.1 (#175) (3c323f4)

Bug Fixes

  • correct Metal quickstart docs for selectorless services (#173) (89471ec)
  • prevent command injection in init container shell commands (#172) (3aa9cc3)
  • remove mutable latest tags and pin container images (#174) (3c4569a)

Documentation

  • add Apple Silicon Metal option to bug report template (#169) (e7689d8)

llmkube-0.4.20

01 Mar 00:21
205d91d

A Helm chart for LLMKube - Kubernetes operator for GPU-accelerated LLM inference