[cluster telemetry] internal service types #7599

Merged
coszio merged 4 commits into dev from grpc-types-for-peer-telemetry
Dec 4, 2025

Contributor

@coszio coszio commented Nov 24, 2025

As a first step towards aggregating telemetry from all peers, we need to be able to communicate telemetry with other peers. This can only be done through the internal service.

This PR adds internal service types that the current telemetry can be mapped into. For now, we include only info about the peer, resharding, and shard transfers.

@coszio coszio requested review from generall and timvisee November 24, 2025 21:41
@coszio coszio added this to the Cluster Telemetry milestone Nov 24, 2025
coderabbitai[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

@coszio coszio force-pushed the grpc-types-for-peer-telemetry branch from 9b662d4 to bcf2900 Compare November 25, 2025 12:21
coderabbitai[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

@qdrant qdrant deleted a comment from coderabbitai Bot Dec 2, 2025
Comment on lines +13 to +21
message GetPeerTelemetryResponse {
  reserved 1; // id
  reserved 2; // app
  map<string, CollectionTelemetry> collections = 3; // Mapping from collection name to its telemetry
  ClusterTelemetry cluster = 4; // Telemetry about the cluster and peers
  reserved 5; // requests
  reserved 6; // memory
  reserved 7; // hardware
}
Member

So this basically follows the same structure as our existing telemetry, but it is very limited.

Is there a strong argument for doing it this way, versus implementing a whole new kind of telemetry with its own fields?

Contributor Author

Yes, it is intentionally made to replicate the existing telemetry closely.

The existing telemetry already contains the information we need, so we can reuse the current collectors for the internal service.
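
To illustrate the idea of reusing existing collectors, here is a minimal sketch in Rust. It is not the code from this PR: every type and function name below (`LocalTelemetry`, `PeerTelemetry`, `to_peer_telemetry`, `ConversionError`, and the individual fields) is a hypothetical stand-in. It only shows the general shape: take the telemetry that is already collected, copy over the subset the internal message carries, and return an error rather than panicking when an expected part is missing.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the telemetry that is already collected locally.
#[derive(Debug, Clone, PartialEq)]
pub struct LocalCollectionTelemetry {
    pub shard_transfers: u64,
    pub resharding_active: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct LocalClusterTelemetry {
    pub peer_id: u64,
    pub peer_count: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct LocalTelemetry {
    pub id: String, // not forwarded: the matching field number is reserved
    pub collections: HashMap<String, LocalCollectionTelemetry>,
    pub cluster: Option<LocalClusterTelemetry>,
}

// Hypothetical stand-ins for the stripped-down internal service types.
#[derive(Debug, Clone, PartialEq)]
pub struct PeerCollectionTelemetry {
    pub shard_transfers: u64,
    pub resharding_active: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct PeerClusterTelemetry {
    pub peer_id: u64,
    pub peer_count: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct PeerTelemetry {
    pub collections: HashMap<String, PeerCollectionTelemetry>,
    pub cluster: PeerClusterTelemetry,
}

#[derive(Debug, Clone, PartialEq)]
pub enum ConversionError {
    MissingCluster,
}

/// Maps already-collected telemetry into the reduced internal shape.
/// Returns an error instead of panicking when cluster info is absent.
pub fn to_peer_telemetry(local: LocalTelemetry) -> Result<PeerTelemetry, ConversionError> {
    let cluster = local.cluster.ok_or(ConversionError::MissingCluster)?;

    let collections = local
        .collections
        .into_iter()
        .map(|(name, c)| {
            let mapped = PeerCollectionTelemetry {
                shard_transfers: c.shard_transfers,
                resharding_active: c.resharding_active,
            };
            (name, mapped)
        })
        .collect();

    Ok(PeerTelemetry {
        collections,
        cluster: PeerClusterTelemetry {
            peer_id: cluster.peer_id,
            peer_count: cluster.peer_count,
        },
    })
}
```

Because the internal shape mirrors the existing one, the mapping is a plain field-by-field copy. Fields that are left out for now can presumably be filled in later under the field numbers currently marked `reserved`.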

@coszio coszio merged commit d0fcc10 into dev Dec 4, 2025
15 checks passed
@coszio coszio deleted the grpc-types-for-peer-telemetry branch December 4, 2025 15:08
timvisee pushed a commit that referenced this pull request Dec 18, 2025
* stripped down grpc types for peer telemetry

* return err instead of panic

* use correct definitions

* improve types
2 participants