# cc-experiment-reports

ETL pipeline for analyzing Claude Code session performance on GraalVM Truffle runtime optimization tasks. Produces SVG charts for embedding in a master's thesis.

## Setup

```sh
uv sync
```

## Usage

```sh
uv run cc-create-report <config.yaml>
```

Example with the included test data:

```sh
uv run cc-create-report examples/config.yaml
```

## Configuration

```yaml
experiments:
  - id: plain-claude
    path: ./data/2026-02-15-a-s4-test
  - id: plugin-claude
    path: ./data/2026-02-16-b-s4-test

analyses:
  - strategy: single_experiment
    experiment: plain-claude

  - strategy: single_experiment
    experiment: plugin-claude

  - strategy: compare_experiments
    experiments: [plain-claude, plugin-claude]
```

Paths are resolved relative to the config file location.
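A minimal sketch of this resolution rule, assuming relative paths are anchored at the config file's directory and absolute paths are left untouched (the helper name is illustrative, not the package's actual API):

```python
from pathlib import Path

def resolve_experiment_path(config_file: str, experiment_path: str) -> Path:
    # Hypothetical helper: anchor a relative `path` entry at the directory
    # containing the config file; an absolute path passes through unchanged.
    base = Path(config_file).resolve().parent
    return (base / experiment_path).resolve()

resolve_experiment_path("/work/examples/config.yaml", "./data/2026-02-15-a-s4-test")
# -> /work/examples/data/2026-02-15-a-s4-test
```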

## Strategies

### Single-experiment (analyze one experiment)

| ID | Name | Chart | Description |
|----|------|-------|-------------|
| SE-1 | Geo-mean progression | Line | Overall benchmark performance across iterations |
| SE-2 | Character usage | Stacked bar | Output character breakdown (thinking, read, edit, bash, skill) |
| SE-3 | Sub-agent tokens | Stacked bar | Token distribution across main/Explore/Bash agents |
| SE-4 | Per-benchmark | Line (one per benchmark) | Individual benchmark progression |
| SE-5 | Iteration efficiency | Scatter | Performance improvement vs. cost per iteration |
| SE-6 | Session duration | Grouped bar | Session duration across iterations |

### Comparison (compare N experiments)

| ID | Name | Chart | Description |
|----|------|-------|-------------|
| CE-1 | Runtime comparison | Grouped bar | Final-iteration benchmark runtimes |
| CE-2 | Geo-mean comparison | Line | Geo-mean progression overlaid |
| CE-3 | Token & cost | Grouped bar | Total tokens, cost, messages |
| CE-4 | Tool usage | Grouped bar | Calls per tool type |
| CE-5 | Character usage | Grouped bar | Characters per category |
| CE-6 | USD efficiency | Bar | Improvement per dollar |
| CE-7 | Token efficiency | Bar | Improvement per token |
| CE-8 | Consistency | Box plot | Final geo-mean distribution across runs |
| CE-9 | Duration | Grouped bar | Session duration per iteration |

## Input data format

Each experiment is a directory containing:

- `<id>--summary.csv` — one session metrics file
- `<id>--run-<R>--iteration-<I>--<benchmark>.csv` — benchmark result files

See `REQUIREMENTS.md` for the full column schemas.
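As an illustration, the benchmark filename convention above can be parsed with a small regex (this helper is a sketch for reference, not part of the package):

```python
import re
from typing import Optional

# Matches <id>--run-<R>--iteration-<I>--<benchmark>.csv
BENCHMARK_FILE = re.compile(
    r"^(?P<id>.+)--run-(?P<run>\d+)--iteration-(?P<iteration>\d+)--(?P<benchmark>.+)\.csv$"
)

def parse_benchmark_filename(name: str) -> Optional[dict]:
    """Return the filename's components, or None if it is not a benchmark file."""
    m = BENCHMARK_FILE.match(name)
    if m is None:
        return None
    parts = m.groupdict()
    parts["run"] = int(parts["run"])
    parts["iteration"] = int(parts["iteration"])
    return parts

parse_benchmark_filename("plain-claude--run-1--iteration-3--deltablue.csv")
# -> {'id': 'plain-claude', 'run': 1, 'iteration': 3, 'benchmark': 'deltablue'}
```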

## Adding a strategy

Create a file in `src/etl/strategies/single/` or `src/etl/strategies/comparison/`:

```python
from etl.strategy import SingleExperimentStrategy, register_single

@register_single
class MyStrategy(SingleExperimentStrategy):
    name = "my_strategy"

    def run(self, data, output_dir):
        # data.benchmarks, data.summary_aggregated, data.summary_separated
        ...
```

Strategy modules are auto-discovered on import, so no further registration boilerplate is needed beyond the decorator.
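The registry pattern behind `@register_single` can be sketched in a few lines (names and details here are illustrative, not the package's actual implementation):

```python
# Minimal sketch of a class-registry decorator, assuming the registry maps
# each strategy's `name` attribute to its class.
SINGLE_STRATEGIES: dict[str, type] = {}

def register_single(cls: type) -> type:
    """Record the strategy class under its declared name and return it unchanged."""
    SINGLE_STRATEGIES[cls.name] = cls
    return cls

@register_single
class GeoMeanProgression:
    name = "geo_mean_progression"

    def run(self, data, output_dir):
        ...

sorted(SINGLE_STRATEGIES)  # -> ['geo_mean_progression']
```

Because the decorator returns the class unchanged, importing a strategy module is enough to populate the registry, which is what makes directory-based auto-discovery possible.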

## Tests

```sh
uv run pytest
```
