Update evaluation docs #2
Merged
AnuradhaKaruppiah
merged 2 commits into
NVIDIA:develop
from
AnuradhaKaruppiah:eval-doc-fixes
Mar 16, 2025
Contributor
AnuradhaKaruppiah
commented
Mar 15, 2025
- Fix the config file used in the concepts README: config.yml => eval_config.yml
- Fix the format for "--skip-workflow" to specify the dataset
- Miscellaneous cleanup
5627c17 to
8978f46
Compare
1. Fix the config file used in the concepts README: config.yml => eval_config.yml
2. Fix the format for "--skip-workflow" to specify the dataset
3. Miscellaneous cleanup

Signed-off-by: Anuradha Karuppiah <[email protected]>
8978f46 to
98b1377
Compare
sean-javiya-nvidia
approved these changes
Mar 16, 2025
Signed-off-by: Anuradha Karuppiah <[email protected]>
Contributor
Author
CI is not set up yet, but I ran Vale locally.
AnuradhaKaruppiah
referenced
this pull request
in AnuradhaKaruppiah/oss-agentiq
Aug 4, 2025
Update evaluation docs
copy-pr-bot bot
pushed a commit
that referenced
this pull request
Aug 13, 2025
…cs-p2 Vale spelling fixes
scheckerNV
pushed a commit
to scheckerNV/aiq-factory-reset
that referenced
this pull request
Aug 22, 2025
Update evaluation docs
rapids-bot bot
pushed a commit
that referenced
this pull request
Oct 10, 2025
1. Add intermediate_manager instance tracking, similar to observability
2. Add configurable tracemalloc to track the top memory users
3. Add a debug endpoint for dumping stats

Sample Usage:
```
nat mcp serve --config_file examples/getting_started/simple_calculator/configs/config.yml --enable_memory_profiling True --memory_profile_interval 10 --memory_profile_log_level=INFO
```

Start the client and run eval against that endpoint to simulate multiple users:
```
nat serve --config_file examples/MCP/simple_calculator_mcp/configs/config-mcp-client.yml
nat eval --config_file examples/evaluation_and_profiling/simple_calculator_eval/configs/config-tunable-rag-eval.yml --endpoint http://localhost:8000 --reps=2
```

Sample Output (an intentional resource leak was introduced to produce this reference output; this is not expected with a regular workflow):
```
================================================================================
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:166 - MEMORY PROFILE AFTER 20 REQUESTS:
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:167 - Current Memory: 2.95 MB
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:168 - Peak Memory: 7.35 MB
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:169 -
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:170 - NAT COMPONENT INSTANCES:
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:171 - IntermediateStepManagers: 1 active (0 outstanding steps)
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:174 - BaseExporters: 0 active (0 isolated)
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:175 - Subject (event streams): 1 instances
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:182 - TOP 10 MEMORY GROWTH SINCE BASELINE:
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:189 - #1: /home/devcontainers/.local/share/uv/python/cpython-3.12.8-linux-x86_64-gnu/lib/python3.12/linecache.py:139: size=753 KiB (+753 KiB), count=7950 (+7950), average=97 B
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:189 - #2: <frozen importlib._bootstrap_external>:757: size=704 KiB (+704 KiB), count=5558 (+5558), average=130 B
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:189 - #3: <frozen abc>:123: size=188 KiB (+188 KiB), count=2460 (+2460), average=78 B
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:189 - #4: /home/devcontainers/dev/forks/nat/examples/getting_started/simple_calculator/src/nat_simple_calculator/register.py:118: size=98.1 KiB (+98.1 KiB), count=10 (+10), average=10041 B
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:189 - #5: <frozen abc>:106: size=67.9 KiB (+67.9 KiB), count=238 (+238), average=292 B
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:189 - #6: /home/devcontainers/dev/forks/nat/examples/getting_started/simple_calculator/src/nat_simple_calculator/register.py:128: size=48.9 KiB (+48.9 KiB), count=10 (+10), average=5007 B
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:189 - #7: /home/devcontainers/dev/forks/nat/examples/getting_started/simple_calculator/src/nat_simple_calculator/register.py:112: size=37.7 KiB (+37.7 KiB), count=11 (+11), average=3509 B
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:189 - #8: /home/devcontainers/dev/forks/nat/.venv/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py:40: size=30.3 KiB (+30.3 KiB), count=346 (+346), average=90 B
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:189 - #9: /home/devcontainers/dev/forks/nat/.venv/lib/python3.12/site-packages/pydantic/main.py:253: size=26.0 KiB (+26.0 KiB), count=167 (+167), average=159 B
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:189 - #10: /home/devcontainers/dev/forks/nat/.venv/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py:37: size=24.4 KiB (+24.4 KiB), count=500 (+500), average=50 B
2025-10-09 19:53:34 - INFO - nat.front_ends.mcp.memory_profiler:191 - ===============================================================================
```

You can also get aggregate stats via the `debug/memory/stats` endpoint on the MCP server:
```
curl -s http://localhost:9901/debug/memory/stats |jq
{
  "enabled": true,
  "request_count": 16,
  "current_memory_mb": 3.41,
  "peak_memory_mb": 7.75,
  "active_intermediate_managers": 1,
  "outstanding_steps": 0,
  "active_exporters": 0,
  "isolated_exporters": 0,
  "subject_instances": 0
}
```

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

## Summary by CodeRabbit
* **New Features**
  * Optional memory profiling for the MCP front end with an enable flag, configurable interval/top‑N, and a new debug endpoint exposing current memory stats.
  * Per-call profiling hooks integrated into function registration and invocation flows.
* **Improvements**
  * Runtime visibility now includes active manager and outstanding-step counts plus exporter/subject counts.
  * Safer baseline management and defensive handling when tracing is unavailable; configurable per-request logging.
* **Tests**
  * Comprehensive tests for profiler behavior, metrics, and error handling.

Authors:
- Anuradha Karuppiah (https://github.com/AnuradhaKaruppiah)

Approvers:
- Will Killian (https://github.com/willkill07)
- Yuchen Zhang (https://github.com/yczhang-nv)

URL: #961
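The "top N memory growth since baseline" report in the commit message above can be approximated with Python's standard-library `tracemalloc` module. The following is a minimal sketch of that general technique, not the toolkit's actual profiler: the helper name `top_growth`, the simulated leak, and the print format are assumptions made for illustration.

```python
import tracemalloc


def top_growth(baseline, limit=10):
    """Return the `limit` largest allocation changes since `baseline`, grouped by source line.

    Hypothetical helper for illustration; not part of the NAT code base.
    """
    current = tracemalloc.take_snapshot()
    # compare_to() returns StatisticDiff entries, largest absolute size change first.
    return current.compare_to(baseline, "lineno")[:limit]


tracemalloc.start()
baseline = tracemalloc.take_snapshot()

# Simulated leak: retain roughly 1 MB that is never released.
leak = [bytes(1024) for _ in range(1000)]

stats = top_growth(baseline, limit=10)
for rank, stat in enumerate(stats, start=1):
    # Each entry prints as "file:line: size=... (+...), count=... (+...), average=..."
    print(f"#{rank}: {stat}")

current_bytes, peak_bytes = tracemalloc.get_traced_memory()
print(f"Current Memory: {current_bytes / 1024 / 1024:.2f} MB")
print(f"Peak Memory: {peak_bytes / 1024 / 1024:.2f} MB")
tracemalloc.stop()
```

Comparing against a fixed baseline snapshot, rather than printing absolute totals, keeps one-time import and startup allocations from dominating the report once the baseline is taken after warm-up.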
elliott-davis
pushed a commit
to elliott-davis/NeMo-Agent-Toolkit
that referenced
this pull request
Oct 30, 2025
dagardner-nv
added a commit
to dagardner-nv/NeMo-Agent-Toolkit
that referenced
this pull request
Nov 3, 2025
…alues exactly Signed-off-by: David Gardner <[email protected]>
copy-pr-bot bot
pushed a commit
that referenced
this pull request
Nov 17, 2025
…api_version-dg Parametarize the api_version argument