Adding scripts to track the performance of the agent by eddierichter-amd · Pull Request #158 · amd/gaia

eddierichter-amd · 2026-01-09T18:21:07Z

Add util/token-and-performance-script/perf_analysis.py, a plotting tool that ingests one or more llama.cpp server logs and generates prompt/input/output token count charts along with TTFT and TPS plots.
Compute aggregate prefill vs decode time (prefill from TTFT, decode from output_tokens/TPS) and render a per-log pie chart figure with legends mapping pies to logs; annotate slices with percentage and seconds.
Document requirements, usage (python perf_analysis.py [--show] <log_file>...), and saved outputs in util/token-and-performance-script/README.md.
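A minimal sketch of the prefill/decode split described above, assuming TTFT is already expressed in seconds and each request record carries its output token count and TPS. `RequestStats` and `aggregate_prefill_decode` are hypothetical names used for illustration; they are not taken from perf_analysis.py.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class RequestStats:
    """Hypothetical per-request record parsed from a server log."""
    ttft_s: float       # time to first token, assumed to be in seconds
    output_tokens: int  # number of generated tokens
    tps: float          # decode throughput in tokens per second


def aggregate_prefill_decode(requests: Iterable[RequestStats]) -> Tuple[float, float]:
    """Return (total prefill seconds, total decode seconds) for one log."""
    prefill = 0.0
    decode = 0.0
    for req in requests:
        # Prefill time is approximated by the time to first token.
        prefill += req.ttft_s
        # Decode time is output_tokens / TPS; skip entries with no TPS
        # to avoid dividing by zero.
        if req.tps > 0:
            decode += req.output_tokens / req.tps
    return prefill, decode
```

The two totals are the values a per-log pie chart would be built from; requests with a zero TPS contribute to prefill only.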


eddierichter-amd · 2026-01-13T03:45:23Z

@claude review this PR.

github-actions · 2026-01-13T03:45:44Z

Claude finished @eddierichter-amd's task

Reviewing PR

Read the changed files
Review code quality and structure
Check for issues and best practices
Provide feedback

Code Review: Performance Analysis Script

Overall, this is a well-structured utility script for analyzing llama.cpp server logs. The code is clean, uses good Python practices (dataclasses, enums, type hints), and includes comprehensive documentation.

🟢 Strengths

Clean architecture: Good use of @dataclass, Enum, and type hints throughout
Error handling: Proper file validation and OSError handling in main()
Modular design: Extraction, plotting, and CLI parsing are well-separated
Documentation: README.md is clear with usage examples

🟡 Suggestions

Type hint issue (perf_analysis.py:166):
```
def autopct_with_seconds(values: Sequence[float]) -> callable:
```
Should be Callable[[float], str] (capitalized, from typing). callable is a builtin function, not a type.
Potential division by zero (perf_analysis.py:305-309):
The if tps > 0 guard is good, but consider logging/warning when TPS values are zero since it indicates missing data.
Output directory flexibility: Currently saves to current working directory. Consider adding an --output-dir argument for flexibility.
Missing file header year range (perf_analysis.py:3):
```
# Copyright(C) 2025-2026 Advanced Micro Devices, Inc.
```
Per CLAUDE.md, should be 2024-2025. The 2025-2026 range seems incorrect.
Unused import (perf_analysis.py:22):
Tuple from typing is imported but Python 3.9+ allows using tuple directly in annotations (already used in some places).
Consider adding --quiet flag: The script prints status messages for each plot. A quiet mode would be useful for scripted usage.
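The `--output-dir` and `--quiet` suggestions above could be wired in roughly as follows. This is a sketch under assumptions: only `--show` and the positional log files come from the PR description; the two new flags, `build_parser`, and `report` are hypothetical and do not reflect the script's actual CLI.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="perf_analysis.py")
    # Existing interface, per the PR description.
    parser.add_argument("log_files", nargs="+", type=Path)
    parser.add_argument("--show", action="store_true")
    # Hypothetical additions suggested in the review.
    parser.add_argument("--output-dir", type=Path, default=Path.cwd(),
                        help="directory where plots are saved")
    parser.add_argument("--quiet", action="store_true",
                        help="suppress per-plot status messages")
    return parser


def report(message: str, quiet: bool) -> None:
    """Print a status message unless quiet mode is on."""
    if not quiet:
        print(message)
```

Defaulting `--output-dir` to the current working directory would keep the existing behavior for callers that do not pass the flag.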

🟢 Minor Nits

README.md line 44 has a trailing newline (cosmetic)
Consider adding example output screenshots to the README for users unfamiliar with the metrics

Summary

This is solid, production-ready code. The suggestions above are minor improvements. The script will be valuable for profiling GAIA agent performance with Lemonade.

Recommendation: Approve with minor fixes for the copyright year and type hint.

eddierichter-amd · 2026-01-13T03:58:19Z

@claude I fixed the typing issues. I will address the header issue in another PR. Anything else for this one?

github-actions · 2026-01-13T03:58:36Z

Claude finished @eddierichter-amd's task

Re-reviewing PR Changes

Read the updated perf_analysis.py file
Verify the typing fixes
Check for any remaining issues
Provide final feedback

Review: Updated Code

I see the type hint fix on line 166 - you've updated from callable to Callable[[float], str]. However, there's a missing import that will cause a runtime error:

🔴 Issue: Missing Import (`perf_analysis.py:22,166`)

```
# Line 22 - Current:
from typing import Dict, Iterable, List, Sequence, Tuple

# Line 166 uses:
def autopct_with_seconds(values: Sequence[float]) -> Callable[[float], str]:
```

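Callable is used but not imported. Because Python evaluates the return annotation when the def statement runs, this raises a NameError as soon as the module is imported, not only when the function is called (unless annotation evaluation is deferred, e.g. with from __future__ import annotations).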

Fix: Add Callable to the import:

```
from typing import Callable, Dict, Iterable, List, Sequence, Tuple
```

Also, as noted before, Tuple is imported but unused since you're using tuple directly in type hints (Python 3.9+ style). You could remove it, but it's not critical.
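For context, a function with this signature is typically a small factory whose return value is passed to matplotlib's `pie(autopct=...)`, which calls it with each slice's percentage. The body below is an assumption for illustration (only the signature appears in the review), showing why `Callable[[float], str]` is the fitting return type.

```python
from typing import Callable, Sequence


def autopct_with_seconds(values: Sequence[float]) -> Callable[[float], str]:
    """Build a label formatter that shows both percentage and seconds."""
    total = sum(values)

    def _format(pct: float) -> str:
        # Convert the slice percentage back into absolute seconds.
        seconds = total * pct / 100.0
        return f"{pct:.1f}%\n({seconds:.1f}s)"

    return _format
```

The inner function takes one `float` and returns a `str`, which is exactly what the annotation states.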

Summary

One quick fix needed - add Callable to the imports on line 22. Everything else looks good!

eddierichter-amd · 2026-01-13T04:02:25Z

@claude good catch! Fixed that. Anything else?

github-actions · 2026-01-13T04:02:48Z

Claude finished @eddierichter-amd's task

Final Review Check

Read the updated perf_analysis.py file
Verify the Callable import fix
Check for any remaining issues
Provide final feedback

Review Complete

The Callable import is now correctly added on line 22. The code looks good to go!

One tiny remaining nit:

🟢 Minor: Tuple was removed from the import (good!) but you're using Tuple on line 117 in the type hint Sequence[Tuple[str, List[float]]]. This will cause a NameError at runtime. Either:
- Add Tuple back to the imports, or
- Use lowercase tuple (Python 3.9+ style): Sequence[tuple[str, List[float]]]

Same applies to lines 150, 272, and 275.
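The second option can be sketched as below, assuming Python 3.9 or newer, where the builtin `tuple` and `dict` are subscriptable in annotations. `series_lengths` is a hypothetical helper, not a function from perf_analysis.py; `typing.get_type_hints` is used only to show that every name in the annotations resolves.

```python
from typing import List, Sequence, get_type_hints


def series_lengths(series: Sequence[tuple[str, List[float]]]) -> dict[str, int]:
    """Map each series label to its number of samples."""
    return {label: len(values) for label, values in series}


# get_type_hints evaluates the annotations, so it would raise NameError
# if one of the names used in them were not importable or defined.
hints = get_type_hints(series_lengths)
```

With lowercase `tuple`, `Tuple` no longer needs to appear in the `typing` import at all.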

Otherwise, this PR is ready to merge!

eddierichter-amd · 2026-01-13T04:06:57Z

@kovtcharov-amd this passed Claude review. Let me know what you think.


eddierichter-amd self-assigned this Jan 9, 2026

eddierichter-amd requested review from itomek-amd and kovtcharov-amd January 9, 2026 18:36

eddierichter-amd added 4 commits January 10, 2026 03:14

Adding a script that can take in logs from lemonade and output the token sizes, ttft, and tps for an entire run of an agent.

08490e4

Renaming script and adding some additional functionality

fd26553

Adding headers

9ec72df

lint

83fc05d

eddierichter-amd force-pushed the eddie/token-and-performance-tracking-eddie-fork branch from 76ec95d to 83fc05d January 9, 2026 19:14

Adding how to obtain a log file from Lemonade to the documentation

80f466c

eddierichter-amd added 2 commits January 12, 2026 20:56

Fixing typing of autopct_with_seconds

1871961

Merge branch 'main' into eddie/token-and-performance-tracking-eddie-fork

78e2031

eddierichter-amd added 2 commits January 12, 2026 21:01

Adding Callable to the import

73d7505

Removing Tuple from import list

4d1ffbd

kovtcharov-amd approved these changes Jan 13, 2026


eddierichter-amd added 3 commits January 13, 2026 09:23

Adding a .mdx file to show the performance visualizer on the website

afce8a4

Renaming title of README

c925a38

Updating perf vis to accessible from the GAIA CLI

f42b7c0

github-actions bot added the documentation and cli labels Jan 13, 2026

eddierichter-amd added 3 commits January 13, 2026 09:59

Removing the script and documentation from the util/ directory as we have everything in the GAIA cli now

2f12c61

lint

686ce23

Adding header to perf_analysis.py

2bcbbc4

eddierichter-amd merged commit 45ea206 into amd:main Jan 13, 2026
48 of 52 checks passed

eddierichter-amd deleted the eddie/token-and-performance-tracking-eddie-fork branch January 13, 2026 18:31

eddierichter-amd merged 15 commits into amd:main from eddierichter-amd:eddie/token-and-performance-tracking-eddie-fork
