@chang-wenbin
Collaborator

@chang-wenbin chang-wenbin commented Oct 10, 2025

Calculate paddle_peak_increase from paddle_allocated_mem_after_run:
paddle_peak_increase = paddle_allocated_mem_after_run - paddle_allocated_mem_before_run
Computed this way, paddle_peak_increase more closely reflects the memory actually taken by tensors during the profile run.

The reserved value is the amount of memory currently managed by the Allocator. The allocated value is the amount of memory currently allocated to Tensors. What we care about here is the change in memory allocated to active Tensors before and after the profile run, so in theory paddle_allocated_mem_after_run - paddle_allocated_mem_before_run meets the requirement. Because reserved also counts memory the Allocator manages but has not handed to any Tensor, using it can make the calculated difference artificially high, which leaves less memory for the kv-cache.
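To illustrate the reasoning, here is a minimal sketch with made-up numbers. The helper `kv_cache_budget`, the `usable_bytes` figure, and all memory readings are hypothetical and are not taken from the patched code; in a real run the readings would presumably come from calls such as `paddle.device.cuda.memory_allocated()` and `paddle.device.cuda.memory_reserved()`.

```python
GiB = 1024 ** 3


def kv_cache_budget(usable_bytes: int, peak_increase: int) -> int:
    """Hypothetical helper: memory left for the kv-cache after
    subtracting the increase observed during the profile run."""
    return usable_bytes - peak_increase


# Assumed readings before and after the profile run (illustrative only).
paddle_allocated_mem_before_run = 20 * GiB
paddle_allocated_mem_after_run = 24 * GiB
paddle_reserved_mem_before_run = 20 * GiB
paddle_reserved_mem_after_run = 30 * GiB  # includes memory not held by any Tensor

usable_bytes = 72 * GiB  # assumed budget available to the process

# Increase measured from memory allocated to Tensors (the approach in this PR).
increase_from_allocated = (
    paddle_allocated_mem_after_run - paddle_allocated_mem_before_run
)
# Increase measured from memory managed by the Allocator.
increase_from_reserved = (
    paddle_reserved_mem_after_run - paddle_reserved_mem_before_run
)

budget_from_allocated = kv_cache_budget(usable_bytes, increase_from_allocated)
budget_from_reserved = kv_cache_budget(usable_bytes, increase_from_reserved)

print(budget_from_allocated // GiB, budget_from_reserved // GiB)
```

With these assumed readings, the reserved-based difference is 6 GiB larger than the allocated-based one, and the kv-cache budget shrinks by the same amount.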

@paddle-bot

paddle-bot bot commented Oct 10, 2025

Thanks for your contribution!

Collaborator

@gongshaotian gongshaotian left a comment

LGTM

@gongshaotian gongshaotian merged commit 533896f into PaddlePaddle:develop Oct 10, 2025
25 of 29 checks passed

3 participants