ETL pipeline for analyzing Claude Code session performance on GraalVM Truffle runtime optimization tasks. Produces SVG charts for embedding in a master thesis.
```shell
uv sync
uv run cc-create-report <config.yaml>
```

Example with the included test data:

```shell
uv run cc-create-report examples/config.yaml
```

A config file lists the experiments and the analyses to run on them:

```yaml
experiments:
  - id: plain-claude
    path: ./data/2026-02-15-a-s4-test
  - id: plugin-claude
    path: ./data/2026-02-16-b-s4-test

analyses:
  - strategy: single_experiment
    experiment: plain-claude
  - strategy: single_experiment
    experiment: plugin-claude
  - strategy: compare_experiments
    experiments: [plain-claude, plugin-claude]
```

Paths are resolved relative to the config file location.
Charts produced by the `single_experiment` strategy:

| ID | Name | Chart | Description |
|---|---|---|---|
| SE-1 | Geo-mean progression | Line | Overall benchmark performance across iterations |
| SE-2 | Character usage | Stacked bar | Output character breakdown (thinking, read, edit, bash, skill) |
| SE-3 | Sub-agent tokens | Stacked bar | Token distribution across main/Explore/Bash agents |
| SE-4 | Per-benchmark | Line (one per benchmark) | Individual benchmark progression |
| SE-5 | Iteration efficiency | Scatter | Performance improvement vs. cost per iteration |
| SE-6 | Session duration | Grouped bar | Session duration across iterations |
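
The geo-mean in SE-1 is presumably the geometric mean of the benchmark runtimes at each iteration. A minimal sketch of that aggregation; the function name and data shape are illustrative, not the pipeline's actual code:

```python
import math

def geo_mean(values: list[float]) -> float:
    # Geometric mean: nth root of the product, computed via logs for
    # numerical stability with many small runtimes.
    return math.exp(sum(math.log(v) for v in values) / len(values))

print(geo_mean([2.0, 8.0]))  # → 4.0

# Runtimes (seconds) per benchmark at one iteration; values are made up.
iteration_runtimes = {"deltablue": 1.8, "richards": 2.4, "json": 0.9}
overall = geo_mean(list(iteration_runtimes.values()))
```

On Python 3.8+ the stdlib `statistics.geometric_mean` computes the same quantity.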
Charts produced by the `compare_experiments` strategy:

| ID | Name | Chart | Description |
|---|---|---|---|
| CE-1 | Runtime comparison | Grouped bar | Final-iteration benchmark runtimes |
| CE-2 | Geo-mean comparison | Line | Geo-mean progression overlaid |
| CE-3 | Token & cost | Grouped bar | Total tokens, cost, messages |
| CE-4 | Tool usage | Grouped bar | Calls per tool type |
| CE-5 | Character usage | Grouped bar | Characters per category |
| CE-6 | USD efficiency | Bar | Improvement per dollar |
| CE-7 | Token efficiency | Bar | Improvement per token |
| CE-8 | Consistency | Box plot | Final geo-mean distribution across runs |
| CE-9 | Duration | Grouped bar | Session duration per iteration |
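
One plausible definition of CE-6's improvement-per-dollar metric is relative geo-mean improvement divided by total session cost; the actual formula the pipeline uses is not specified here, so treat this as an assumption:

```python
def usd_efficiency(first_geo_mean: float,
                   final_geo_mean: float,
                   total_cost_usd: float) -> float:
    # Relative runtime improvement (lower geo-mean is better) per dollar.
    # Hypothetical metric definition; CE-7 would divide by tokens instead.
    improvement = (first_geo_mean - final_geo_mean) / first_geo_mean
    return improvement / total_cost_usd

print(usd_efficiency(2.0, 1.5, 5.0))  # → 0.05 (25% improvement over $5)
```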
Each experiment is a directory containing:
- `<id>--summary.csv` — one session metrics file
- `<id>--run-<R>--iteration-<I>--<benchmark>.csv` — benchmark result files
See REQUIREMENTS.md for full column schemas.
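The benchmark filename pattern above can be parsed with a single regular expression. A sketch; the named groups are illustrative, and the example filename is made up:

```python
import re

# Mirrors the naming scheme <id>--run-<R>--iteration-<I>--<benchmark>.csv.
BENCH_RE = re.compile(
    r"^(?P<id>.+)--run-(?P<run>\d+)--iteration-(?P<iteration>\d+)"
    r"--(?P<benchmark>.+)\.csv$"
)

m = BENCH_RE.match("plain-claude--run-1--iteration-4--richards.csv")
assert m is not None
print(m.group("id"), m.group("run"), m.group("iteration"), m.group("benchmark"))
# → plain-claude 1 4 richards
```

The greedy `.+` for the experiment id is safe because the `--run-` separator appears only once, so ids containing single hyphens (like `plain-claude`) parse correctly.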
Create a file in `src/etl/strategies/single/` or `src/etl/strategies/comparison/`:

```python
from etl.strategy import SingleExperimentStrategy, register_single

@register_single
class MyStrategy(SingleExperimentStrategy):
    name = "my_strategy"

    def run(self, data, output_dir):
        # data.benchmarks, data.summary_aggregated, data.summary_separated
        ...
```

The file is auto-discovered on import; no registration boilerplate is needed.
Run the test suite:

```shell
uv run pytest
```