Auto-escape some markdown syntax in markdown labels by lukasmasuch · Pull Request #13887 · streamlit/streamlit

lukasmasuch · 2026-02-10T16:34:19Z

Describe your changes

This PR fixes issue #7359 where widget labels containing markdown syntax characters (-, +, *, #, >, 1.) would render as empty labels because the markdown parser was interpreting them as list markers, headings, or blockquotes which are then stripped for labels.

The fix escapes these markdown syntax patterns when isLabel is true, converting them to literal text by adding backslash escapes before markdown is processed. The escaping only applies to patterns followed by whitespace or end of line (e.g., "- item" is escaped but "not-a-list" is not).
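
As a rough illustration of the escaping described above, here is a minimal TypeScript sketch. It uses the two patterns quoted in the automated review of this PR; the function name `escapeLabelMarkdown` and the replacement strings are assumptions made for illustration, and the actual code in StreamlitMarkdown.tsx may differ.

```typescript
// Hypothetical helper: the name and the replacement strings are assumptions.
// The two regex patterns are the ones quoted in the automated review.
function escapeLabelMarkdown(source: string): string {
  return (
    source
      // Unordered list markers, headings, blockquotes at the start of a line:
      // "- item" becomes "\- item", "# Title" becomes "\# Title".
      .replace(/^(\s*)((?:[+\-*]|#+)(?=\s|$)|>)/gm, "$1\\$2")
      // Ordered list markers: only the punctuation is escaped,
      // so "1. Something" becomes "1\. Something".
      .replace(/^(\s*)(\d+)([.)])(?=\s|$)/gm, "$1$2\\$3")
  )
}

const escapedDash: string = escapeLabelMarkdown("- item") // "\- item"
const escapedPlus: string = escapeLabelMarkdown("+") // "\+"
const untouched: string = escapeLabelMarkdown("not-a-list") // unchanged
```

The lookahead `(?=\s|$)` is what keeps text such as "not-a-list" from being escaped: a marker character only counts when whitespace or the end of the line follows it.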

GitHub Issue Link (if applicable)

Fixes #7359

Testing Plan

Unit Tests: 175 passing tests covering escaped patterns, non-escaped patterns, edge cases with pre-escaped text, and emphasis markdown
E2E Tests: Added test cases for "+" and "1. Something" labels to verify they display correctly in buttons
No additional manual testing needed beyond existing test coverage

Co-authored-by: sea-turt1e [email protected]

See original PR [fix] Implement markdown escaping for labels in StreamlitMarkdown com… #12695

Contribution License Agreement

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

Escape markdown syntax patterns in labels that would otherwise be stripped, leaving empty content. The fix adds backslash escapes before markdown list markers (-, +, *), headings (#), blockquotes (>), and ordered list markers (such as "1." and "1)") when they appear at the start of a line followed by whitespace or end of line. It also includes comprehensive unit tests covering escaped patterns, non-escaped patterns, and edge cases like pre-escaped text and emphasis markdown.

snyk-io · 2026-02-10T16:34:33Z

✅ Snyk checks have passed. No issues have been found so far.

Status	Scanner	Critical	High	Medium	Low	Total (0)
✅	Open Source Security	0	0	0	0	0 issues
✅	Licenses	0	0	0	0	0 issues


github-actions · 2026-02-10T16:34:39Z

✅ PR preview is ready!

Name	Link
📦 Wheel file	https://core-previews.s3-us-west-2.amazonaws.com/pr-13887/streamlit-1.54.0-py3-none-any.whl
📦 `@streamlit/component-v2-lib`	Download from artifacts
🕹️ Preview app	pr-13887.streamlit.app (☁️ Deploy here if not accessible)

Copilot

Pull request overview

This PR fixes issue #7359 where widget labels containing markdown syntax characters (-, +, *, #, >, 1.) would render as empty because the markdown parser interpreted them as list markers, headings, or blockquotes that are then stripped from labels.

Changes:

Added escaping logic in StreamlitMarkdown.tsx to escape markdown syntax patterns when isLabel is true
Added comprehensive unit tests covering escaped patterns, non-escaped patterns, edge cases, and emphasis markdown
Added E2E tests for button labels with "+" and "1. Something" to verify the fix works end-to-end

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
frontend/lib/src/components/shared/StreamlitMarkdown/StreamlitMarkdown.tsx	Implements markdown syntax escaping for labels using two regex patterns: one for unordered lists/headings/blockquotes, one for ordered lists
frontend/lib/src/components/shared/StreamlitMarkdown/StreamlitMarkdown.test.tsx	Adds 175 test cases covering escaped patterns (-, +, *, #, >, 1.), non-escaped patterns (mid-word hyphens, hashtags, decimals), pre-escaped text, and emphasis markdown; updates existing test expectations to reflect the new escaping behavior
e2e_playwright/st_button_test.py	Adds E2E test to verify markdown syntax characters are displayed literally in button labels
e2e_playwright/st_button.py	Adds test buttons with "+" and "1. Something" labels to test app

github-actions · 2026-02-10T17:36:47Z

Summary

This PR fixes issue #7359 where widget labels containing markdown syntax characters (-, +, *, #, >, 1., etc.) would render as empty or incorrect labels. The root cause was that the markdown parser interpreted these patterns as list markers, headings, or blockquotes, which were then stripped by the existing disallowedElements mechanism — leaving empty labels.

The fix adds regex-based escaping of markdown syntax patterns in processedSource when isLabel is true, converting them to literal text (via backslash escapes) before the markdown parser processes them. The escaping is carefully scoped to only match patterns at the start of a line followed by whitespace or end-of-line, avoiding false positives on text like not-a-list, #hashtag, or 1.5.

Changed files:

frontend/lib/src/components/shared/StreamlitMarkdown/StreamlitMarkdown.tsx — Core fix (regex escaping)
frontend/lib/src/components/shared/StreamlitMarkdown/StreamlitMarkdown.test.tsx — Unit tests
e2e_playwright/st_button.py — E2E test app script
e2e_playwright/st_button_test.py — E2E test assertions

Code Quality

The implementation is clean and well-structured:

Regex correctness: Both regex patterns are well-crafted:
- /^(\s*)((?:[+\-*]|#+)(?=\s|$)|>)/gm handles unordered lists, headings, and blockquotes correctly.
- /^(\s*)(\d+)([.)])(?=\s|$)/gm handles ordered lists, escaping only the punctuation (not the digits).
- The gm flags ensure multi-line handling works correctly.
- The ^ anchor prevents matching mid-line occurrences (e.g., not-a-list).
No double-escaping: Pre-escaped input like 1\. text or \- text does not match the regexes because \ is not in the matched character classes. This is correct.
Complements the existing mechanism: The LABEL_DISALLOWED_ELEMENTS list (lines 871-892) still serves its purpose for elements not handled by escaping (e.g., tables). The two mechanisms work well together.
Proper memoization: The processing is inside useMemo with the correct dependency array [source, isLabel] (line 1062).
Good inline comments: The regex patterns are well-documented with examples of what they escape and what they don't.

Minor observations:

invalidCases test description is slightly stale (StreamlitMarkdown.test.tsx line 630): The test name "does NOT render invalid markdown when isLabel is true" was accurate when the behavior was "strip disallowed elements." Now the behavior is "escape so markdown is never parsed in the first place." The test still passes correctly (escaped text renders as a <p>, not the disallowed tag), but the description could be updated to reflect the new escaping behavior for clarity.
getBy* + toBeInTheDocument pattern (e.g., lines 633-634, 701-702): Per the frontend AGENTS.md, getBy* already throws if the element is not found, making toBeInTheDocument redundant — toBeVisible is preferred. However, this pattern is extensively used in the existing test file, so it's consistent with the surrounding code.

Test Coverage

Unit tests (113 lines added): Comprehensive and well-organized:

markdownEscapingCases (17 cases): Covers all escaped patterns — single characters (-, +, *, >, #), patterns with text, indented patterns, ordered lists with . and ), multi-digit ordered lists (99.), and multi-hash headings. Also verifies elements render as <p> (plain paragraph), not as special elements.
nonEscapingCases (5 cases): Important anti-regression tests ensuring no over-escaping: mid-word hyphens (not-a-list), hashtags without a following space (#hashtag), decimal numbers (1.5), and pre-escaped text.
Emphasis test: Verifies *italic label* still renders as <em>, confirming the regex doesn't break emphasis syntax.
Updated invalidCases: Expectations correctly updated to reflect the new literal-text behavior.
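
The shape of these two case groups can be sketched as a table-driven check. The inputs below are illustrative only, since the actual entries in StreamlitMarkdown.test.tsx are not shown in this thread, and `escapeLabelMarkdown` is a hypothetical stand-in for the escaping step with assumed replacement strings. The real tests assert on rendered elements such as `<p>` and `<em>`, whereas this sketch only compares strings.

```typescript
type Case = { input: string; expected: string }

// Hypothetical stand-in for the escaping step (replacement strings are assumed).
function escapeLabelMarkdown(source: string): string {
  return source
    .replace(/^(\s*)((?:[+\-*]|#+)(?=\s|$)|>)/gm, "$1\\$2")
    .replace(/^(\s*)(\d+)([.)])(?=\s|$)/gm, "$1$2\\$3")
}

// Illustrative inputs that should gain a backslash escape.
const escapingCases: Case[] = [
  { input: "-", expected: "\\-" },
  { input: "+ add", expected: "\\+ add" },
  { input: "  * indented", expected: "  \\* indented" },
  { input: "> quote", expected: "\\> quote" },
  { input: "## Heading", expected: "\\## Heading" },
  { input: "1. Something", expected: "1\\. Something" },
  { input: "2) Something", expected: "2\\) Something" },
  { input: "99.", expected: "99\\." },
]

// Illustrative inputs that should pass through unchanged.
const nonEscapingCases: Case[] = [
  "not-a-list",
  "#hashtag",
  "1.5",
  "1\\. text",
  "*italic label*",
].map((text) => ({ input: text, expected: text }))

// Returns the inputs whose escaped form differs from the expectation.
function failures(cases: Case[]): string[] {
  return cases
    .filter((c) => escapeLabelMarkdown(c.input) !== c.expected)
    .map((c) => c.input)
}
```

An empty array from `failures(escapingCases)` and from `failures(nonEscapingCases)` means every case behaved as expected.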

E2E tests (12 lines added): Appropriately lightweight:

Tests + and 1. Something labels on buttons, verifying they display literally.
Uses get_element_by_key per best practices.
TOTAL_BUTTONS count correctly updated from 30 to 32.
Minor: Per e2e AGENTS.md, adding a negative assertion would strengthen the test (e.g., assert no empty button text or that the button label is not empty). However, the to_contain_text assertions implicitly verify the text is present and non-empty.

Backwards Compatibility

This is a backwards-compatible bug fix. The behavioral change only affects labels that previously had their content incorrectly stripped:

Labels with plain text are unaffected.
Labels using inline markdown (bold, italic, code, links) are unaffected — the regex only matches line-start patterns with space/EOL lookahead.
Labels that were previously empty due to the bug now correctly display the intended text.

The only theoretical concern would be users intentionally relying on markdown stripping in labels (e.g., using # as a prefix that gets removed). This would be an extreme edge case exploiting buggy behavior, and the fix is clearly the correct behavior.

Security & Risk

Low risk: The change is a pre-processing text transformation (regex escaping) that runs before the existing markdown parser. It adds backslash characters, which is safe.
No XSS concern: The escaping only adds \ characters to prevent markdown parsing. It does not remove any existing security mechanisms (HTML sanitization, allowHTML controls, etc.).
No user-provided regex: The regex patterns are static constants, not derived from user input.

Accessibility

This change improves accessibility by ensuring widget labels display their intended text content. Previously, labels like "-" or "+" would render as empty, which is problematic for both visual and screen reader users. Now they render as visible, readable text.

No new interactive elements or ARIA attributes are introduced, so no additional accessibility concerns apply.

Recommendations

(Optional) Update the invalidCases test description at StreamlitMarkdown.test.tsx line 630: Consider renaming from "does NOT render invalid markdown when isLabel is true - $tag" to something like "escapes markdown syntax in labels so it renders as plain text - $tag" to better reflect the new escaping behavior.
(Optional) Add a negative assertion to the E2E test: Per e2e best practices, test_markdown_syntax_in_labels could assert that the button labels are not empty elements, e.g., checking that the button doesn't contain an empty <p> tag or verifying to_have_count(1) for the text locator.
(Nit) Consider extracting regexes to module-level constants: Per the "Static Data Structures" guideline in frontend/AGENTS.md, the two regex patterns inside useMemo are static and could be extracted to module-level const variables. While useMemo already prevents re-creation on re-renders, module-level extraction would be slightly cleaner and make them reusable/testable independently. However, since they're only used in one place and useMemo handles the performance concern, this is a minor style preference.
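
The module-level extraction suggested in the nit could look roughly like the sketch below. The constant names and the function name are hypothetical, and the replacement strings are assumed. One detail to keep in mind with shared regexes that carry the `g` flag: `String.prototype.replace` resets `lastIndex` on each call, while `RegExp.prototype.test` is stateful.

```typescript
// Hypothetical module-level constants; the patterns are the ones quoted in the review.
const LABEL_MARKER_REGEX = /^(\s*)((?:[+\-*]|#+)(?=\s|$)|>)/gm
const LABEL_ORDERED_LIST_REGEX = /^(\s*)(\d+)([.)])(?=\s|$)/gm

// Hypothetical helper; the replacement strings are assumptions.
function escapeLabelMarkdown(source: string): string {
  return source
    .replace(LABEL_MARKER_REGEX, "$1\\$2")
    .replace(LABEL_ORDERED_LIST_REGEX, "$1$2\\$3")
}

// replace() starts from lastIndex 0 every time, so repeated calls agree.
const firstCall: string = escapeLabelMarkdown("- a") // "\- a"
const secondCall: string = escapeLabelMarkdown("- a") // "\- a"

// test() on the same global regex advances lastIndex between calls.
const firstTest: boolean = LABEL_MARKER_REGEX.test("- a") // true
const secondTest: boolean = LABEL_MARKER_REGEX.test("- a") // false, search resumed at index 1
```

If the constants were ever reused with `test()` or `exec()`, dropping the `g` flag for those call sites or resetting `lastIndex` would avoid this kind of surprise.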

Verdict

APPROVED: Clean, well-tested bug fix that correctly escapes markdown syntax characters in widget labels to prevent them from being stripped, resolving #7359 without breaking existing functionality.

This is an automated AI review by opus-4.6-thinking.

cursor

✅ Bugbot reviewed your changes and found no new issues!


Copilot AI review requested due to automatic review settings February 10, 2026 16:34

Copilot started reviewing on behalf of lukasmasuch February 10, 2026 16:34

Copilot AI reviewed Feb 10, 2026

frontend/lib/src/components/shared/StreamlitMarkdown/StreamlitMarkdown.test.tsx (outdated, resolved)

lukasmasuch added the security-assessment-completed, change:feature, and impact:users labels Feb 10, 2026

Update frontend/lib/src/components/shared/StreamlitMarkdown/Streamlit…

8f74421

…Markdown.test.tsx Co-authored-by: Copilot <[email protected]>

lukasmasuch changed the title ~~Fix markdown syntax characters in widget labels~~ Auto-escape some markdown syntax characters in markdown labels Feb 10, 2026

lukasmasuch added the ai-review label Feb 10, 2026

github-actions bot removed the ai-review label Feb 10, 2026

cursor bot reviewed Feb 10, 2026


lukasmasuch changed the title ~~Auto-escape some markdown syntax characters in markdown labels~~ Auto-escape some markdown syntax in markdown labels Feb 10, 2026

mayagbarnes approved these changes Feb 11, 2026


lukasmasuch merged commit 9661d92 into develop Feb 11, 2026
56 of 57 checks passed

lukasmasuch deleted the lukasmasuch/fix-markdown-dash branch February 11, 2026 17:11

lukasmasuch mentioned this pull request Feb 23, 2026

[fix] Implement markdown escaping for labels in StreamlitMarkdown com… #12695

Closed

lukasmasuch merged 2 commits into develop from lukasmasuch/fix-markdown-dash
