[metrics] Add serveral observability metrics #3868

qwes5s5 · 2025-09-03T13:19:59Z

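fastdeploy:cache_config_info is a Gauge that records the current node's cache settings (CacheConfig). It is recorded when the engine initializes and carries the following labels:

block_size: Size of each block in the KV cache, in tokens.
bytes_per_block: Number of bytes occupied by each KV cache block.
bytes_per_layer_per_block: Number of bytes each block occupies in a single model layer.
cache_dtype: Data type of the key-value (KV) pairs stored in the cache.
cache_queue_port: Port used for cache queue communication.
cache_transfer_protocol: Protocol used to transfer cache data, e.g. ipc (inter-process communication).
dec_token_num: Number of tokens in the decode phase.
each_token_cache_space: Amount of KV cache space occupied by each token.
enable_chunked_prefill: Whether chunked prefill is enabled.
enable_hierarchical_cache: Whether hierarchical caching is enabled.
enable_prefix_caching: Whether prefix caching is enabled.
enable_ssd_cache: Whether SSD (solid-state drive) caching is enabled.
enc_dec_block_num: Number of blocks for the encoder-decoder model.
gpu_memory_utilization: Fraction of total GPU memory that may be used.
kv_cache_ratio: Fraction of memory occupied by the KV cache.
max_block_num_per_seq: Maximum number of blocks that may be allocated to each sequence (request).
model_cfg: The model's configuration object.
num_cpu_blocks: Number of cache blocks allocated to the CPU.
num_gpu_blocks_override: Number of cache blocks allocated to the GPU, overriding the default value.
pd_comm_port: Port used for communication.
prealloc_dec_block_slot_num_threshold: Threshold for pre-allocating decode block slots.
prefill_kvcache_block_num: Number of KV cache blocks used in the prefill phase.
rdma_comm_ports: Ports used for RDMA (remote direct memory access) communication.
swap_space: Size of the swap space, used when memory runs short.
total_block_num: Total number of KV cache blocks.
Example: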

# HELP fastdeploy:cache_config_info Information of the engine's CacheConfig
# TYPE fastdeploy:cache_config_info gauge
fastdeploy:cache_config_info{block_size="64",bytes_per_block="589824",bytes_per_layer_per_block="32768",
cache_dtype="bfloat16",cache_queue_port="8003",cache_transfer_protocol="ipc",dec_token_num="128",
each_token_cache_space="9216",enable_chunked_prefill="False",enable_hierarchical_cache="False",
enable_prefix_caching="False",enable_ssd_cache="False",enc_dec_block_num="2",gpu_memory_utilization="0.9",
kv_cache_ratio="0.75",max_block_num_per_seq="128",model_cfg="<fastdeploy.config.ModelConfig object at 0x7f74a6b413f0>",
num_cpu_blocks="0",num_gpu_blocks_override="1000",pd_comm_port="None",prealloc_dec_block_slot_num_threshold="5",
prefill_kvcache_block_num="750",rdma_comm_ports="None",swap_space="None",total_block_num="1000"} 1.0
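
Because every setting is exposed as a label on a single sample, a consumer has to read the label set, not the value (which is always 1.0). The following is a minimal sketch of doing that with the standard library only; `parse_info_sample` and `LABEL_RE` are hypothetical helpers, not part of FastDeploy, and the sample is abbreviated to a few of the labels shown above. The two relationships checked at the end hold for the example values in this PR description; they are observations about that sample, not a documented guarantee.

```python
import re

# Matches one label pair such as block_size="64" inside the braces.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_info_sample(sample: str) -> dict:
    """Return the label set of one Prometheus text-format sample as a dict."""
    start, end = sample.index("{"), sample.rindex("}")
    return dict(LABEL_RE.findall(sample[start + 1:end]))


# Abbreviated copy of the example sample from the PR description.
sample = (
    'fastdeploy:cache_config_info{block_size="64",bytes_per_block="589824",'
    'each_token_cache_space="9216",kv_cache_ratio="0.75",'
    'prefill_kvcache_block_num="750",total_block_num="1000"} 1.0'
)
labels = parse_info_sample(sample)

# Observed in the sample: block_size * each_token_cache_space == bytes_per_block.
bytes_per_block = int(labels["block_size"]) * int(labels["each_token_cache_space"])

# Observed in the sample: total_block_num * kv_cache_ratio == prefill_kvcache_block_num.
prefill_blocks = int(int(labels["total_block_num"]) * float(labels["kv_cache_ratio"]))
```

All label values arrive as strings (including `"None"` and `"False"`), so any numeric use needs an explicit conversion as shown.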

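fastdeploy:available_batch_size is a Gauge that records the current node's available batch size, i.e. how many new requests it can still accept. It is recorded whenever resources are allocated or recycled.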

# HELP fastdeploy:available_batch_size Number of how many new requests the system can still accept.
# TYPE fastdeploy:available_batch_size gauge
fastdeploy:available_batch_size 128.0
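
To illustrate the "recorded on allocation and recycling" behaviour, here is a small sketch of a slot tracker that refreshes the value at both points. `BatchSlotTracker` and all of its members are hypothetical stand-ins, not FastDeploy's resource manager, and a plain attribute takes the place of the real Gauge.

```python
class BatchSlotTracker:
    """Hypothetical tracker; a plain attribute stands in for the Gauge."""

    def __init__(self, max_batch_size: int) -> None:
        self.max_batch_size = max_batch_size
        self.running = 0
        self.available_batch_size = max_batch_size

    def _record(self) -> None:
        # In a real exporter this is where the gauge would be set.
        self.available_batch_size = self.max_batch_size - self.running

    def allocate(self, n: int = 1) -> bool:
        """Reserve n slots; refuse the request if it would exceed capacity."""
        if self.running + n > self.max_batch_size:
            return False
        self.running += n
        self._record()
        return True

    def release(self, n: int = 1) -> None:
        """Return n slots, never dropping below zero running requests."""
        self.running = max(0, self.running - n)
        self._record()


tracker = BatchSlotTracker(max_batch_size=128)
tracker.allocate(3)
after_allocate = tracker.available_batch_size
tracker.release(1)
after_release = tracker.available_batch_size
```

Updating the value inside both `allocate` and `release` keeps the gauge from going stale between scrapes, which is the point of recording at those two moments.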

fastdeploy:hit_req_rate is a Gauge that records the request-level prefix cache hit rate. It is recorded when CacheMetrics updates its hit metrics.

# HELP fastdeploy:hit_req_rate Request-level prefix cache hit rate
# TYPE fastdeploy:hit_req_rate gauge
fastdeploy:hit_req_rate 0.5

fastdeploy:hit_token_rate is a Gauge that records the token-level prefix cache hit rate. It is recorded when CacheMetrics updates its hit metrics.

# HELP fastdeploy:hit_token_rate Token-level prefix cache hit rate
# TYPE fastdeploy:hit_token_rate gauge
fastdeploy:hit_token_rate 0.5

fastdeploy:cpu_hit_token_rate is a Gauge that records the token-level CPU prefix cache hit rate. It is recorded when CacheMetrics updates its hit metrics.

# HELP fastdeploy:cpu_hit_token_rate Token-level CPU prefix cache hit rate
# TYPE fastdeploy:cpu_hit_token_rate gauge
fastdeploy:cpu_hit_token_rate 0.0

fastdeploy:gpu_hit_token_rate is a Gauge that records the token-level GPU prefix cache hit rate. It is recorded when CacheMetrics updates its hit metrics.

# HELP fastdeploy:gpu_hit_token_rate Token-level GPU prefix cache hit rate
# TYPE fastdeploy:gpu_hit_token_rate gauge
fastdeploy:gpu_hit_token_rate 0.5
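
The four hit-rate gauges can be read as ratios over running totals. The sketch below shows one plausible way to derive them; `PrefixCacheStats`, its fields, and the choice of denominators (all requests, all prompt tokens) are assumptions made for illustration, not the actual CacheMetrics implementation. The inputs are chosen so that the output matches the example values above.

```python
from dataclasses import dataclass


def _ratio(numerator: int, denominator: int) -> float:
    """Divide safely, reporting 0.0 before any traffic has been seen."""
    return numerator / denominator if denominator else 0.0


@dataclass
class PrefixCacheStats:
    """Hypothetical running totals; field names are illustrative only."""

    total_reqs: int = 0
    hit_reqs: int = 0
    total_tokens: int = 0
    gpu_hit_tokens: int = 0
    cpu_hit_tokens: int = 0

    def update(self, req_tokens: int, gpu_hit: int = 0, cpu_hit: int = 0) -> None:
        """Account for one request and the tokens it found in each cache tier."""
        self.total_reqs += 1
        self.total_tokens += req_tokens
        self.gpu_hit_tokens += gpu_hit
        self.cpu_hit_tokens += cpu_hit
        if gpu_hit + cpu_hit > 0:
            self.hit_reqs += 1

    def rates(self) -> dict:
        """Return the four rates under the assumed definitions."""
        hit_tokens = self.gpu_hit_tokens + self.cpu_hit_tokens
        return {
            "hit_req_rate": _ratio(self.hit_reqs, self.total_reqs),
            "hit_token_rate": _ratio(hit_tokens, self.total_tokens),
            "cpu_hit_token_rate": _ratio(self.cpu_hit_tokens, self.total_tokens),
            "gpu_hit_token_rate": _ratio(self.gpu_hit_tokens, self.total_tokens),
        }


stats = PrefixCacheStats()
stats.update(req_tokens=100, gpu_hit=100)  # fully served from the GPU cache
stats.update(req_tokens=100)               # no prefix hit at all
rates = stats.rates()
```

Under these assumed definitions the CPU and GPU token rates sum to the overall token rate, which is consistent with the example values (0.0 + 0.5 = 0.5).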

CLAassistant · 2025-09-03T13:20:05Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ qwes5s5
✅ Jiang-Jia-Jun
❌ K11OntheBoat

K11OntheBoat seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

paddle-bot · 2025-09-03T13:20:07Z

Thanks for your contribution!

Jiang-Jia-Jun · 2025-09-03T13:25:29Z

docs/zh/online_serving/metrics.md

 | `fastdeploy:request_success_total`        | Counter   | 成功处理的请求个数           | 个   |
-
+| `fastdeploy:cache_config_info`            | Gauge     | 推理引擎的缓存配置信息        | 个   |
+| `fastdeploy:available_batch_size`         | Gauge     | 系统还可以接受的请求数量      | 个   |


Change this to: number of requests that can still be inserted during the Decode phase.

Jiang-Jia-Jun · 2025-09-03T13:26:08Z

fastdeploy/metrics/metrics.py

+        "available_batch_size": {
+            "type": Gauge,
+            "name": "fastdeploy:available_batch_size",
+            "description": "Number of how many new requests the system can still accept.",


Please update the corresponding English description as well.

Jiang-Jia-Jun · 2025-09-03T13:28:07Z

docs/online_serving/metrics.md

 | `fastdeploy:request_success_total`           | Counter   | Number of successfully processed requests           | Count   |
-
+| `fastdeploy:cache_config_info`               | Gauge     | Information of the engine's CacheConfig             | Count   |
+| `fastdeploy:available_batch_size`            | Gauge     | Number of how many new requests the system can still accept| Count   |


Same as for the Chinese doc: revise the description, e.g. "Number of requests that can continue to be inserted during the decode phase".

* Add several observability metrics
* [wenxin-tools-584] [Observability] Support viewing this node's concurrency, remaining block_size, number of queued requests, and related information
* adjust some metrics and md files
* trigger ci
* adjust ci file
* trigger ci
* trigger ci
---------
Co-authored-by: K11OntheBoat <[email protected]>
Co-authored-by: Jiang-Jia-Jun <[email protected]>

* [metrics] Add serveral observability metrics (#3868)
* Add several observability metrics
* [wenxin-tools-584] [Observability] Support viewing this node's concurrency, remaining block_size, number of queued requests, and related information
* adjust some metrics and md files
* trigger ci
* adjust ci file
* trigger ci
* trigger ci
---------
Co-authored-by: K11OntheBoat <[email protected]>
Co-authored-by: Jiang-Jia-Jun <[email protected]>
* version adjust
---------
Co-authored-by: K11OntheBoat <[email protected]>
Co-authored-by: Jiang-Jia-Jun <[email protected]>

K11OntheBoat added 7 commits September 1, 2025 13:14

Add several observability metrics

c9ab324

[wenxin-tools-584] [Observability] Support viewing this node's concurrency, remaining block_size, number of queued requests, and related information

1f6f00d

Merge branch 'develop' into feature/add-monitoring-metrics

554062e

add metrics unit test

f90f8d6

Merge branch 'develop' into feature/add-monitoring-metrics

ea5b3b1

adjust some metrics and md files

e2a3c7c

Merge branch 'develop' into feature/add-monitoring-metrics

e036593

paddle-bot bot added the contributor External developers label Sep 3, 2025

Jiang-Jia-Jun requested changes Sep 3, 2025

adjust some metrics and md files

a4b3c78

Jiang-Jia-Jun changed the title ~~Add serveral observability metrics~~ [metrics] Add serveral observability metrics Sep 3, 2025

Merge branch 'feature/add-monitoring-metrics' of https://github.com/q…

44eb587

…wes5s5/FastDeploy into feature/add-monitoring-metrics

qwes5s5 requested a review from Jiang-Jia-Jun September 3, 2025 13:56

Jiang-Jia-Jun previously approved these changes Sep 3, 2025

K11OntheBoat added 5 commits September 4, 2025 23:10

Merge remote-tracking branch 'upstream/develop' into feature/add-moni…

4c56d8e

…toring-metrics

trigger ci

d61162e

Merge remote-tracking branch 'upstream/develop' into feature/add-moni…

5ff22e9

…toring-metrics

adjust ci file

8ae8332

Merge remote-tracking branch 'upstream/develop' into feature/add-moni…

e3d37b2

…toring-metrics

qwes5s5 dismissed Jiang-Jia-Jun’s stale review via e3d37b2 September 5, 2025 12:53

qwes5s5 requested a review from Jiang-Jia-Jun September 5, 2025 12:53

Jiang-Jia-Jun and others added 7 commits September 7, 2025 14:01

Merge branch 'develop' into feature/add-monitoring-metrics

ef5462a

Merge remote-tracking branch 'upstream/develop' into feature/add-moni…

0625e2b

…toring-metrics

Merge branch 'feature/add-monitoring-metrics' of https://github.com/q…

6590b50

…wes5s5/FastDeploy into feature/add-monitoring-metrics

trigger ci

6b72fd0

trigger ci

bbaae69

Merge remote-tracking branch 'upstream/develop' into feature/add-moni…

a91039a

…toring-metrics

adjust set_cache_config_info

49b4773

K11OntheBoat and others added 5 commits September 7, 2025 21:36

Merge branch 'feature/add-monitoring-metrics' of https://github.com/q…

9501704

…wes5s5/FastDeploy into feature/add-monitoring-metrics

add unittest

8665a1d

Merge branch 'feature/add-monitoring-metrics' of https://github.com/q…

34247ac

…wes5s5/FastDeploy into feature/add-monitoring-metrics

Merge branch 'develop' into feature/add-monitoring-metrics

e0584ef

Merge branch 'develop' into feature/add-monitoring-metrics

5116b4f

Jiang-Jia-Jun merged commit 17169a1 into PaddlePaddle:develop Sep 8, 2025
