Parse LLM API response logs (Anthropic, OpenAI, Google) and generate a readable usage + cost report. Works on raw JSONL you already have sitting in log files — no framework adoption, no SDK wrapping, no proxy.
```bash
pip install llm-usage-report
```

Point it at one or more JSONL files (or stdin); it reads the `usage` / `usageMetadata` fields from each line, normalizes them, and prints a table like:
```text
model              requests   input  output  cache_read  cache_new  cost (USD)
-----------------  --------  ------  ------  ----------  ---------  ----------
claude-opus-4-5           8  120000   35000       20000      50000       $4.57
claude-sonnet-4-5        42   88000   32000           0          0       $0.75
gpt-4o                   15   45000   12000           0          0       $0.23
gemini-2.5-pro            6    8000    3000           0          0       $0.05
-----------------  --------  ------  ------  ----------  ---------  ----------
TOTAL                    71  261000   82000       20000      50000       $5.60
```
| Provider | Shape it understands |
|---|---|
| Anthropic | `{"type":"message","model":"claude-...","usage":{"input_tokens":..., "output_tokens":..., "cache_read_input_tokens":..., "cache_creation_input_tokens":...}}` |
| OpenAI | `{"object":"chat.completion","model":"gpt-...","usage":{"prompt_tokens":..., "completion_tokens":...}}` and the newer Responses API object |
| Google Gemini | `{"modelVersion":"gemini-...","usageMetadata":{"promptTokenCount":..., "candidatesTokenCount":..., "cachedContentTokenCount":...}}` |
Each line in your JSONL is expected to be a full API response. The parser is lenient: unparseable lines are skipped silently so you can point it at noisy log files.
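Conceptually, normalization is shape-sniffing: detect which provider a parsed line came from, then map that provider's field names onto common buckets. A minimal sketch of the idea (not the package's actual internals), using the field names from the table above:

```python
import json

def normalize_line(line: str):
    """Best-effort: return (model, input_tokens, output_tokens), or None to skip."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError:
        return None                                 # unparseable log noise: skip silently
    if not isinstance(obj, dict):
        return None
    usage = obj.get("usage") or {}
    if "input_tokens" in usage:                     # Anthropic message shape
        return obj.get("model"), usage["input_tokens"], usage["output_tokens"]
    if "prompt_tokens" in usage:                    # OpenAI chat.completion shape
        return obj.get("model"), usage["prompt_tokens"], usage["completion_tokens"]
    if "usageMetadata" in obj:                      # Gemini shape
        meta = obj["usageMetadata"]
        return (obj.get("modelVersion"),
                meta.get("promptTokenCount", 0),
                meta.get("candidatesTokenCount", 0))
    return None                                     # unknown shape: skip
```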
```bash
# Point at a file
llm-usage-report path/to/api-responses.jsonl

# Or a directory (scans recursively for *.jsonl)
llm-usage-report ./logs/

# Or pipe from stdin
tail -f logs/api.jsonl | llm-usage-report -

# Group by day, provider, project, or user
llm-usage-report logs/ --group-by day
llm-usage-report logs/ --group-by provider

# Machine-readable output
llm-usage-report logs/ --format json > report.json
llm-usage-report logs/ --format csv > report.csv
```

Include optional `project` and `user` fields at the top level of each log line:
{"type":"message","model":"claude-sonnet-4-5","usage":{"input_tokens":100,"output_tokens":50},"project":"search","user":"alice"}Then group by them:
```bash
llm-usage-report logs/ --group-by project
llm-usage-report logs/ --group-by user
```

Prices are embedded as a dated snapshot (see `src/llm_usage_report/pricing.py`). They will drift as provider prices change. Two ways to override:
```bash
# Via CLI flag
llm-usage-report logs/ --pricing ./my-pricing.json

# Or via environment variable
export LLM_USAGE_REPORT_PRICING=./my-pricing.json
llm-usage-report logs/
```

The override file has the same shape as the built-in table:
```json
{
  "claude-sonnet-4-5": {"input": 3.00, "output": 15.00, "cache_read": 0.30, "cache_creation": 3.75},
  "gpt-5": {"input": 15.00, "output": 60.00}
}
```

Rates are USD per 1 million tokens. Unknown models contribute zero cost (token totals still aggregate correctly).
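The arithmetic per model is each token bucket divided by one million, times its rate, summed. A quick check with made-up token counts at the claude-sonnet-4-5 rates above:

```python
# Hypothetical token counts, priced at the claude-sonnet-4-5 rates above.
# Rates are USD per 1M tokens; cost = sum over buckets of tokens / 1e6 * rate.
rates  = {"input": 3.00, "output": 15.00, "cache_read": 0.30, "cache_creation": 3.75}
tokens = {"input": 500_000, "output": 100_000, "cache_read": 2_000_000, "cache_creation": 50_000}

cost = sum(tokens[bucket] / 1_000_000 * rate for bucket, rate in rates.items())
print(f"${cost:.2f}")  # $3.79  (1.50 + 1.50 + 0.60 + 0.19)
```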
```python
from llm_usage_report import parse_stream, aggregate, GroupKey
from llm_usage_report.formatters import format_table

with open("logs/api.jsonl") as f:
    records = list(parse_stream(f))

summaries = aggregate(records, group_by=GroupKey.MODEL)
print(format_table(summaries, group_label="model"))
```
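The CLI's other `--group-by` dimensions are presumably mirrored on `GroupKey`; assuming members like `GroupKey.PROVIDER` exist (an assumption, check the enum's actual members), grouping by provider looks the same:

```python
# Assumes GroupKey mirrors the CLI's --group-by choices (unverified).
by_provider = aggregate(records, group_by=GroupKey.PROVIDER)
print(format_table(by_provider, group_label="provider"))
```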
This package does one thing: it reads logs you already have and tells you what they cost. It does not:

- proxy your API calls (use LiteLLM if you want that)
- require adopting an SDK (use Braintrust / Helicone / LangSmith if you want full observability)
- pretend to know the future (pricing is a snapshot; override it when you need to)
If you outgrow this, good — that's what full LLM observability platforms are for. This is for the team that's logging to a file today and needs an answer before end of sprint.
MIT.