Additional Grafana dashboards: inference metrics and cost tracking #5

@Defilan

Description

Create additional Grafana dashboards beyond the current GPU metrics dashboard.

Goals

  1. Inference Metrics Dashboard

    • Requests per second
    • Tokens per second (prompt + generation)
    • Latency percentiles (P50, P95, P99)
    • Request queue depth
    • Model usage breakdown
  2. Cost Tracking Dashboard

    • GPU utilization vs cost
    • Cost per 1K tokens
    • Idle time tracking
    • Spot instance savings
    • Monthly cost projections
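As a rough illustration of the arithmetic behind the cost tracking panels, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `GpuUsage` record, its field names, the example prices, and the 730-hour month are hypothetical and not taken from the project. The monthly projection simply assumes the GPU stays provisioned around the clock at the same hourly rate.

```python
from dataclasses import dataclass

# Assumption: average hours in a month (8760 hours per year / 12).
HOURS_PER_MONTH = 730


@dataclass
class GpuUsage:
    """Hypothetical usage record for one GPU over an observation window."""
    hourly_rate_usd: float      # price actually paid per hour (e.g. spot)
    on_demand_rate_usd: float   # list price per hour, for comparison
    hours_observed: float       # length of the observation window
    busy_hours: float           # hours the GPU was serving requests
    tokens_generated: int       # prompt + generation tokens in the window


def cost_per_1k_tokens(u: GpuUsage) -> float:
    """Total spend in the window divided by thousands of tokens served."""
    if u.tokens_generated <= 0:
        raise ValueError("no tokens generated in the window")
    total_cost = u.hourly_rate_usd * u.hours_observed
    return total_cost / (u.tokens_generated / 1000)


def idle_cost(u: GpuUsage) -> float:
    """Spend attributable to hours in which the GPU served nothing."""
    return u.hourly_rate_usd * (u.hours_observed - u.busy_hours)


def spot_savings(u: GpuUsage) -> float:
    """Difference between on-demand and paid price over the window."""
    return (u.on_demand_rate_usd - u.hourly_rate_usd) * u.hours_observed


def monthly_projection(u: GpuUsage) -> float:
    """Projected monthly spend if the GPU stays provisioned 24/7."""
    return u.hourly_rate_usd * HOURS_PER_MONTH


if __name__ == "__main__":
    # Made-up example numbers: one day on a spot GPU.
    day = GpuUsage(1.0, 2.5, 24, 18, 4_800_000)
    print(f"cost per 1K tokens: ${cost_per_1k_tokens(day):.4f}")
    print(f"idle cost:          ${idle_cost(day):.2f}")
    print(f"spot savings:       ${spot_savings(day):.2f}")
    print(f"monthly projection: ${monthly_projection(day):.2f}")
```

In an actual dashboard these values would more likely be computed in PromQL from exported metrics; the Python version is only meant to pin down what each panel measures.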

Success Criteria

  • Inference metrics dashboard JSON
  • Cost tracking dashboard JSON
  • Import instructions in docs
  • Screenshots in documentation
  • Alert rules for cost thresholds
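To give contributors a starting point for the inference metrics dashboard JSON, the following Python sketch generates a skeletal dashboard with one panel per goal. The metric names (`llm_requests_total`, `llm_prompt_tokens_total`, `llm_generation_tokens_total`, `llm_request_duration_seconds_bucket`, `llm_queue_depth`) and the `model` label are placeholders, not metrics the project is known to export; substitute whatever the inference server actually exposes. The output file name is likewise hypothetical. A production dashboard would also need datasource settings and other fields that are omitted here.

```python
import json


def panel(panel_id, title, expr, x, y):
    """Build one time series panel with a single PromQL target."""
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        "gridPos": {"h": 8, "w": 12, "x": x, "y": y},
        "targets": [{"refId": "A", "expr": expr}],
    }


def latency_quantile_expr(q, metric="llm_request_duration_seconds_bucket"):
    """PromQL for a latency quantile from a histogram (metric name assumed)."""
    return f"histogram_quantile({q}, sum by (le) (rate({metric}[5m])))"


def build_dashboard():
    """Assemble a minimal dashboard dict covering the inference goals."""
    # All metric names below are hypothetical placeholders.
    queries = [
        ("Requests per second", "sum(rate(llm_requests_total[5m]))"),
        ("Tokens per second (prompt + generation)",
         "sum(rate(llm_prompt_tokens_total[5m]))"
         " + sum(rate(llm_generation_tokens_total[5m]))"),
        ("Latency P50", latency_quantile_expr(0.5)),
        ("Latency P95", latency_quantile_expr(0.95)),
        ("Latency P99", latency_quantile_expr(0.99)),
        ("Request queue depth", "sum(llm_queue_depth)"),
        ("Model usage breakdown",
         "sum by (model) (rate(llm_requests_total[5m]))"),
    ]
    # Lay panels out two per row on Grafana's 24-column grid.
    panels = [
        panel(i + 1, title, expr, x=(i % 2) * 12, y=(i // 2) * 8)
        for i, (title, expr) in enumerate(queries)
    ]
    return {"title": "Inference Metrics (sketch)", "panels": panels}


if __name__ == "__main__":
    # Hypothetical output path; adjust to the repository layout.
    with open("inference-dashboard-sketch.json", "w") as fh:
        json.dump(build_dashboard(), fh, indent=2)
```

Generating the JSON from a short script keeps the panel list easy to review in a pull request; hand-editing the exported JSON from the Grafana UI is an equally reasonable route.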

Sprint

Sprint 2-3 priority

Current State

A GPU hardware metrics dashboard already exists at config/grafana/llmkube-gpu-dashboard.json.

Good First Issue

Great for contributors familiar with Grafana and PromQL!
