Skip to content

[Fix] Make _stcore/metrics OpenMetrics-compliant#14538

Merged
sfc-gh-mbarnes merged 2 commits intodevelopfrom
fix-metrics
Mar 27, 2026
Merged

[Fix] Make _stcore/metrics OpenMetrics-compliant#14538
sfc-gh-mbarnes merged 2 commits intodevelopfrom
fix-metrics

Conversation

@mayagbarnes
Copy link
Copy Markdown
Collaborator

@mayagbarnes mayagbarnes commented Mar 27, 2026

Describe your changes

  • Canonicalize counter metric families in /_stcore/metrics so metadata uses OpenMetrics-compliant family names while emitted counter samples still use the _total suffix.
  • Tighten text serialization to omit empty # UNIT lines and emit # HELP <family_name> <help>.
  • Update runtime, server, and e2e metrics tests for the session_events family rename and the canonical family/sample split.

Follow-up to #14476

API-visible change

  • families=session_events_total no longer works.
  • Consumers should now request families=session_events.
  • The emitted sample line remains session_events_total{...}.

Testing Plan

  • Unit Tests (JS and/or Python)
  • E2E Tests

@mayagbarnes mayagbarnes added change:refactor PR contains code refactoring without behavior change impact:users PR changes affect end users labels Mar 27, 2026
@snyk-io
Copy link
Copy Markdown
Contributor

snyk-io bot commented Mar 27, 2026

Snyk checks have passed. No issues have been found so far.

Status Scan Engine Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 27, 2026

✅ PR preview is ready!

Name Link
📦 Wheel file https://core-previews.s3-us-west-2.amazonaws.com/pr-14538/streamlit-1.55.0-py3-none-any.whl
📦 @streamlit/component-v2-lib Download from artifacts
🕹️ Preview app pr-14538.streamlit.app (☁️ Deploy here if not accessible)

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates Streamlit’s /_stcore/metrics OpenMetrics text output to be more spec-compliant by using canonical counter family names (without _total) in metadata while continuing to emit counter samples with the _total suffix, and by tightening comment serialization.

Changes:

  • Canonicalize the session_events counter family name (metadata uses session_events; sample remains session_events_total{...}), and remove the counter sample_name override mechanism.
  • Update text serialization to omit # UNIT lines for unitless metrics and to emit # HELP <family_name> <help>.
  • Update unit, server, and e2e tests to reflect the family rename and the metadata/sample split behavior.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.

Show a summary per file
File Description
lib/streamlit/runtime/stats.py Renames SESSION_EVENTS_FAMILY and makes counter samples consistently render as <family>_total.
lib/streamlit/runtime/websocket_session_manager.py Aligns emitted session metrics with canonical counter family naming (no explicit sample name override).
lib/streamlit/web/server/stats_request_handler.py Tightens OpenMetrics text comment rendering (# HELP includes family name; omits empty # UNIT).
lib/tests/streamlit/runtime/stats_test.py Updates CounterStat expectations to match _total sample rendering.
lib/tests/streamlit/runtime/websocket_session_manager_test.py Updates runtime stats assertions for session_events family and _total sample naming.
lib/tests/streamlit/web/server/stats_handler_test.py Updates Tornado metrics endpoint tests for session_events family and # HELP/# UNIT formatting.
lib/tests/streamlit/web/server/starlette/starlette_app_test.py Updates Starlette metrics endpoint tests for the same OpenMetrics output changes.
e2e_playwright/web_server_test.py Updates e2e coverage for metrics filtering with canonical families=session_events.
e2e_playwright/websocket_reconnects_test.py Updates e2e metrics scraping to request families=session_events while parsing session_events_total{...} samples.

@mayagbarnes mayagbarnes added the ai-review If applied to PR or issue will run AI review workflow label Mar 27, 2026
@github-actions github-actions bot removed the ai-review If applied to PR or issue will run AI review workflow label Mar 27, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

This PR makes the /_stcore/metrics endpoint OpenMetrics-compliant by canonicalizing counter metric family names. The key changes:

  1. Family name fix: SESSION_EVENTS_FAMILY changes from "session_events_total" to "session_events" — the _total suffix is reserved for counter sample names per the OpenMetrics spec, not the metric family name.
  2. sample_name field removed: CounterStat.sample_name is eliminated; to_metric_str() now derives the sample name by appending _total to the family name automatically.
  3. # HELP line fix: Now emits # HELP <family_name> <help> instead of # HELP <help>, matching the OpenMetrics specification.
  4. # UNIT line fix: Empty unit lines are no longer emitted, matching the OpenMetrics specification.
  5. Test updates: All unit and E2E tests are updated for the new family name and serialization format.

This is a follow-up to #14476 and introduces an intentional API-visible breaking change: consumers must now use families=session_events instead of families=session_events_total.

All three expected reviewers (claude-4.6-opus-high-thinking, gemini-3.1-pro, gpt-5.3-codex-high) completed their reviews.

Code Quality

Unanimous agreement across all reviewers. The code is clean, well-structured, and follows existing patterns. The removal of the sample_name field in favor of deriving the sample name from the family name (f"{self.family_name}_total") is a good simplification that eliminates a source of inconsistency where callers had to manually keep the sample name and family name in sync. The text serialization logic in stats_request_handler.py is improved — conditionally emitting # UNIT and including the family name in # HELP both align with the OpenMetrics exposition format.

Test Coverage

Unanimous agreement across all reviewers. Test coverage is thorough and well-updated across all affected layers:

  • Unit tests (stats_test.py): Counter stat serialization tests updated, sample_name test replaced with _total-suffix convention test.
  • Unit tests (websocket_session_manager_test.py): Family name references updated; new to_metric_str() assertion verifies sample name behavior.
  • Handler tests (stats_handler_test.py, starlette_app_test.py): Text output assertions verify corrected # TYPE, # HELP, and # UNIT format across both Tornado and Starlette paths.
  • E2E tests (web_server_test.py, websocket_reconnects_test.py): Query parameters updated; assertions still verify session_events_total appears in output (the sample name) while other families are excluded.

Backwards Compatibility

Minor disagreement between reviewers. All three reviewers identified the same breaking change: families=session_events_total no longer returns session event metrics — consumers must use families=session_events. The emitted sample lines remain session_events_total{...}, so scraping tools matching on sample names are unaffected.

  • claude-4.6-opus-high-thinking and gemini-3.1-pro: Accepted this as an intentional, documented breaking change with limited blast radius given the endpoint's recent introduction and _stcore prefix.
  • gpt-5.3-codex-high: Flagged this as a blocking backwards-compatibility issue, recommending a deprecation alias accepting both names.

Resolution: The breaking change is intentional, clearly documented in the PR, and scoped to an internal endpoint (_stcore) introduced very recently (#14476). The change moves toward spec compliance, which benefits long-term compatibility with standard Prometheus/OpenMetrics tooling. A deprecation alias would add complexity for minimal benefit given the limited consumer base. This is non-blocking, but migration guidance in release notes is a reasonable suggestion.

Security & Risk

Unanimous agreement. No security concerns identified. Changes are limited to metric serialization formatting and don't touch authentication, authorization, CORS, WebSocket handling, file serving, or any security-sensitive codepath.

External test recommendation

  • Recommend external_test: No
  • Triggered categories: None
  • Evidence:
    • Changes are limited to metric serialization format in lib/streamlit/runtime/stats.py and lib/streamlit/web/server/stats_request_handler.py
    • No routing, auth, WebSocket, embedding, asset serving, CORS, or security header changes
    • The /_stcore/metrics endpoint URL and access pattern remain identical
  • Confidence: High

Reviewer disagreement on this point: gpt-5.3-codex-high recommended external tests targeting routing/URL behavior (medium confidence), while claude-4.6-opus-high-thinking and gemini-3.1-pro found no external test need. The endpoint URL itself is unchanged — only the query parameter semantics for family filtering changed. This does not affect standard Prometheus scraping (which fetches all metrics without family filters), so external testing is not warranted.

Accessibility

No frontend changes are included in this PR. No accessibility considerations apply. All reviewers agreed on this point.

Recommendations

  1. Consider adding brief migration guidance in release notes for any consumers using families=session_events_total query parameter (all reviewers noted the breaking change).

Verdict

APPROVED: Clean, well-tested OpenMetrics compliance fix that correctly canonicalizes counter family names, fixes HELP/UNIT metadata formatting, and thoroughly updates all tests. The intentional breaking change is acceptable given the endpoint's recent introduction and internal scope. Two of three reviewers approved; the single changes-requested concern (backwards compatibility deprecation window) is non-blocking given context.


Consolidated AI review by claude-4.6-opus-high-thinking from 3 individual reviews (claude-4.6-opus-high-thinking, gemini-3.1-pro, gpt-5.3-codex-high).

This review also includes 1 inline comment(s) on specific code lines.


CACHE_MEMORY_FAMILY: Final = "cache_memory_bytes"
SESSION_EVENTS_FAMILY: Final = "session_events_total"
SESSION_EVENTS_FAMILY: Final = "session_events"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: Renaming the family key from session_events_total to session_events is an intentional breaking change for families= query consumers. If any internal tooling or dashboards rely on families=session_events_total, they will silently receive empty results. Consider whether a deprecation alias (accepting both names for one release cycle) is warranted, though the recent introduction of this endpoint (#14476) and its _stcore prefix significantly limit blast radius.

@sfc-gh-mbarnes sfc-gh-mbarnes marked this pull request as ready for review March 27, 2026 19:45
Copy link
Copy Markdown
Collaborator

@lukasmasuch lukasmasuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@sfc-gh-mbarnes sfc-gh-mbarnes merged commit 26a7b56 into develop Mar 27, 2026
85 of 86 checks passed
@sfc-gh-mbarnes sfc-gh-mbarnes deleted the fix-metrics branch March 27, 2026 20:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

change:refactor PR contains code refactoring without behavior change impact:users PR changes affect end users

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants