Skip to content

Adding scripts to track the performance of the agent#158

Merged
eddierichter-amd merged 15 commits intoamd:mainfrom
eddierichter-amd:eddie/token-and-performance-tracking-eddie-fork
Jan 13, 2026
Merged

Adding scripts to track the performance of the agent#158
eddierichter-amd merged 15 commits intoamd:mainfrom
eddierichter-amd:eddie/token-and-performance-tracking-eddie-fork

Conversation

@eddierichter-amd
Copy link
Collaborator

  • Add util/token-and-performance-script/perf_analysis.py, a plotting tool that ingests one or more llama.cpp server logs and generates prompt/input/output token count charts along with TTFT and TPS plots.
  • Compute aggregate prefill vs decode time (prefill from TTFT, decode from output_tokens/TPS) and render a per-log pie chart figure with legends mapping pies to logs; annotate slices with percentage and seconds.
  • Document requirements, usage (python perf_analysis.py [--show] <log_file>...), and saved outputs in util/token-and-performance-script/README.md.

@eddierichter-amd eddierichter-amd force-pushed the eddie/token-and-performance-tracking-eddie-fork branch from 76ec95d to 83fc05d Compare January 9, 2026 19:14
@eddierichter-amd
Copy link
Collaborator Author

@claude review this PR.

@github-actions
Copy link

github-actions bot commented Jan 13, 2026

Claude finished @eddierichter-amd's task —— View job


Reviewing PR

  • Read the changed files
  • Review code quality and structure
  • Check for issues and best practices
  • Provide feedback

Code Review: Performance Analysis Script

Overall, this is a well-structured utility script for analyzing llama.cpp server logs. The code is clean, uses good Python practices (dataclasses, enums, type hints), and includes comprehensive documentation.

🟢 Strengths

  • Clean architecture: Good use of @dataclass, Enum, and type hints throughout
  • Error handling: Proper file validation and OSError handling in main()
  • Modular design: Extraction, plotting, and CLI parsing are well-separated
  • Documentation: README.md is clear with usage examples

🟡 Suggestions

  1. Type hint issue (perf_analysis.py:166):

    def autopct_with_seconds(values: Sequence[float]) -> callable:

    Should be Callable[[float], str] (capitalized, from typing). callable is a builtin function, not a type.

  2. Potential division by zero (perf_analysis.py:305-309):
    The if tps > 0 guard is good, but consider logging/warning when TPS values are zero since it indicates missing data.

  3. Output directory flexibility: Currently saves to current working directory. Consider adding an --output-dir argument for flexibility.

  4. Missing file header year range (perf_analysis.py:3):

    # Copyright(C) 2025-2026 Advanced Micro Devices, Inc.

    Per CLAUDE.md, should be 2024-2025. The 2025-2026 range seems incorrect.

  5. Unused import (perf_analysis.py:22):
    Tuple from typing is imported but Python 3.9+ allows using tuple directly in annotations (already used in some places).

  6. Consider adding --quiet flag: The script prints status messages for each plot. A quiet mode would be useful for scripted usage.

🟢 Minor Nits

  • README.md line 44 has a trailing newline (cosmetic)
  • Consider adding example output screenshots to the README for users unfamiliar with the metrics

Summary

This is solid, production-ready code. The suggestions above are minor improvements. The script will be valuable for profiling GAIA agent performance with Lemonade.

Recommendation: Approve with minor fixes for the copyright year and type hint.


@eddierichter-amd
Copy link
Collaborator Author

@claude I fixed the typing issues. I will address the header issue in another PR. Anything else for this one?

@github-actions
Copy link

github-actions bot commented Jan 13, 2026

Claude finished @eddierichter-amd's task —— View job


Re-reviewing PR Changes

  • Read the updated perf_analysis.py file
  • Verify the typing fixes
  • Check for any remaining issues
  • Provide final feedback

Review: Updated Code

I see the type hint fix on line 166 - you've updated from callable to Callable[[float], str]. However, there's a missing import that will cause a runtime error:

🔴 Issue: Missing Import (perf_analysis.py:22,166)

# Line 22 - Current:
from typing import Dict, Iterable, List, Sequence, Tuple

# Line 166 uses:
def autopct_with_seconds(values: Sequence[float]) -> Callable[[float], str]:

Callable is used but not imported. This will raise a NameError when the function is called.

Fix: Add Callable to the import:

from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Also, as noted before, Tuple is imported but unused since you're using tuple directly in type hints (Python 3.9+ style). You could remove it, but it's not critical.

Summary

One quick fix needed - add Callable to the imports on line 22. Everything else looks good!


@eddierichter-amd
Copy link
Collaborator Author

@claude good catch! Fixed that. Anything else?

@github-actions
Copy link

github-actions bot commented Jan 13, 2026

Claude finished @eddierichter-amd's task —— View job


Final Review Check

  • Read the updated perf_analysis.py file
  • Verify the Callable import fix
  • Check for any remaining issues
  • Provide final feedback

Review Complete

The Callable import is now correctly added on line 22. The code looks good to go!

One tiny remaining nit:

  • 🟢 Minor: Tuple was removed from the import (good!) but you're using Tuple on line 117 in the type hint Sequence[Tuple[str, List[float]]]. This will cause a NameError at runtime. Either:
    • Add Tuple back to the imports, or
    • Use lowercase tuple (Python 3.9+ style): Sequence[tuple[str, List[float]]]

Same applies to lines 150, 272, and 275.

Otherwise, this PR is ready to merge!


@eddierichter-amd
Copy link
Collaborator Author

@kovtcharov-amd passed Claude review. let me know what you think.

@github-actions github-actions bot added documentation Documentation changes cli CLI changes labels Jan 13, 2026
@eddierichter-amd eddierichter-amd merged commit 45ea206 into amd:main Jan 13, 2026
48 of 52 checks passed
@eddierichter-amd eddierichter-amd deleted the eddie/token-and-performance-tracking-eddie-fork branch January 13, 2026 18:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cli CLI changes documentation Documentation changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants