Skip to content

[fix] Implement markdown escaping for labels in StreamlitMarkdown com…#12695

Closed
sea-turt1e wants to merge 14 commits intostreamlit:developfrom
sea-turt1e:fix/issue_7359_st_button_empty
Closed

[fix] Implement markdown escaping for labels in StreamlitMarkdown com…#12695
sea-turt1e wants to merge 14 commits intostreamlit:developfrom
sea-turt1e:fix/issue_7359_st_button_empty

Conversation

@sea-turt1e
Copy link
Copy Markdown

@sea-turt1e sea-turt1e commented Oct 5, 2025

Describe your changes

  • Escape single-character markdown tokens and list markers when rendering widget labels so that st.button("+") and similar inputs keep their visible text instead of being unwrapped.
  • Normalize the label-preprocessing helper to preserve surrounding whitespace while only escaping tokens that would otherwise trigger list, quote, or horizontal rule rendering.
  • Extend StreamlitMarkdown unit coverage to confirm literal tokens (+, -, >, 1., ***) now render verbatim without enabling disallowed markdown elements.
st.button("+")
st.button("-")
st.button("a")
スクリーンショット 2025-10-05 11 07 50

GitHub Issue Link (if applicable)

#7359

Testing Plan

  • Unit Tests (JS)
    yarn workspace @streamlit/lib test src/components/shared/StreamlitMarkdown/StreamlitMarkdown.test.tsx

Contribution License Agreement

By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.

@snyk-io
Copy link
Copy Markdown
Contributor

snyk-io bot commented Oct 5, 2025

Snyk checks have passed. No issues have been found so far.

Status Scanner Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

({ input, expected }) => {
render(<StreamlitMarkdown source={input} allowHTML={false} isLabel />)
const markdownText = screen.getByText(expected)
expect(markdownText).toBeInTheDocument()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the TypeScript Test Guide, using queries that throw when elements are not found (like getByText) combined with toBeInTheDocument() assertions is redundant and should be avoided. Since getByText already throws if the element is not found, the toBeInTheDocument() assertion is unnecessary. Either replace toBeInTheDocument() with toBeVisible() or remove the assertion entirely.

Suggested change
expect(markdownText).toBeInTheDocument()
expect(markdownText).toBeVisible()

Spotted by Diamond (based on custom rule: Typescript Unit Tests)

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

@sfc-gh-lwilby sfc-gh-lwilby added security-assessment-completed change:feature PR contains new feature or enhancement implementation impact:users PR changes affect end users labels Oct 6, 2025
@sfc-gh-lwilby sfc-gh-lwilby added the status:needs-product-approval PR requires product approval before merging label Oct 6, 2025
@sfc-gh-lwilby sfc-gh-lwilby requested a review from Copilot October 6, 2025 13:01
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements markdown escaping for widget labels in the StreamlitMarkdown component to prevent single-character tokens (like "+", "-", ">") and list markers from being interpreted as markdown syntax. This ensures that buttons and other widgets display their intended text literally instead of being rendered as markdown elements.

  • Added escaping functions to neutralize markdown syntax in label text while preserving whitespace
  • Updated the markdown processing logic to conditionally apply escaping when isLabel is true
  • Enhanced test coverage to verify literal rendering of markdown tokens in labels

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
StreamlitMarkdown.tsx Implements markdown escaping logic for labels and integrates it into the markdown processing pipeline
StreamlitMarkdown.test.tsx Adds test cases to verify literal rendering of markdown tokens and fixes assertion method

Comment on lines +675 to +679
// If the core is empty after trimming, return the original line (it may be all whitespace).
const trimmedCore = core.trim()
if (trimmedCore.length === 0) {
return line
}
Copy link

Copilot AI Oct 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 675 is misleading. The check core.length === 0 on line 672 already handles empty core content, so this check for trimmedCore.length === 0 handles cases where core contains only whitespace, not empty core. The comment should be updated to reflect this distinction.

Copilot uses AI. Check for mistakes.
if (SINGLE_CHARACTER_LABEL_TOKENS.has(trimmedCore)) {
return `${leadingWhitespace}\\${trimmedCore}${trailingWhitespace}`
}
// Match unordered list tokens (-, *, +) with optional trailing dot or parenthesis
Copy link

Copilot AI Oct 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 684 incorrectly describes the regex pattern. The regex matches ordered list tokens (numbers followed by dot or parenthesis), not unordered list tokens. The comment should read 'Match ordered list tokens (1., 2), etc.)' instead.

Suggested change
// Match unordered list tokens (-, *, +) with optional trailing dot or parenthesis
// Match ordered list tokens (1., 2), etc.)

Copilot uses AI. Check for mistakes.
@sfc-gh-lwilby
Copy link
Copy Markdown
Collaborator

@sea-turt1e I think it would be good to add an e2e test as well.

@sea-turt1e
Copy link
Copy Markdown
Author

@sfc-gh-lwilby
I added e2e test, and update by some github copilot revlew.

@jrieke jrieke added status:product-approved Community PR is approved by product team and removed status:needs-product-approval PR requires product approval before merging labels Nov 10, 2025
@jrieke
Copy link
Copy Markdown
Collaborator

jrieke commented Nov 10, 2025

Looks good from product side! @sfc-gh-lwilby do you want to finish the review for this?

@github-actions
Copy link
Copy Markdown
Contributor

Summary

This PR fixes issue #7359 where buttons with labels containing markdown syntax tokens like +, -, 1., *** were being stripped or rendered incorrectly by the markdown processor. The fix introduces a new escapeMarkdownForLabel() function in StreamlitMarkdown.tsx that escapes markdown syntax that would otherwise be interpreted as list markers, horizontal rules, or quote markers when isLabel is true.

Changes:

  • Added escapeMarkdownForLabel() and escapeMarkdownLineForLabel() functions to escape problematic markdown tokens in labels
  • Added SINGLE_CHARACTER_LABEL_TOKENS constant (+, -, *, >)
  • Modified processedSource to apply escaping when rendering labels
  • Added frontend unit tests for literal markdown tokens
  • Added E2E tests with buttons using literal markdown token labels

Code Quality

The implementation is well-structured and follows the codebase patterns:

Positives:

  1. The escaping logic in escapeMarkdownLineForLabel() is clear and well-commented
  2. Proper handling of edge cases:
    • Preserves leading/trailing whitespace
    • Handles empty strings and whitespace-only content
    • Handles both Unix (\n) and Windows (\r\n) line endings
    • Escapes backslashes first to avoid double-escaping
  3. Covers the key markdown tokens that cause issues:
    • Single character tokens: +, -, *, >
    • Ordered list tokens: 1., 2), etc.
    • Horizontal rules: ---, ***, ___, - - -, * * *, _ _ _
  4. The useMemo hook correctly includes isLabel in its dependency array

Minor observation:

  • Line 17 in StreamlitMarkdown.tsx removes React from the import, which is fine since React 17+ supports JSX without explicit React import. This is a minor cleanup, not a concern.

Test Coverage

Frontend Unit Tests (StreamlitMarkdown.test.tsx):

  • Added comprehensive test cases using test.each for parameterized testing
  • Tests cover +, -, >, 1., *** tokens
  • Tests verify the text is visible using toBeVisible() assertion (following best practices)
  • Properly cleans up between test cases with cleanup()

E2E Tests (st_button_test.py):

  • Added test test_literal_markdown_token_labels with clear docstring
  • Uses get_element_by_key for stable locators (following best practices)
  • Tests +, -, 1., *** tokens
  • Updated TOTAL_BUTTONS count from 27 to 31

Note: The > token is tested in unit tests but not in E2E tests. This is acceptable since blockquotes (>) are already in the LABEL_DISALLOWED_ELEMENTS list, so the escaping provides defense-in-depth but the element would be unwrapped anyway.

Backwards Compatibility

No breaking changes. This fix:

  • Only affects labels (isLabel=true)
  • Previously, tokens like +, - were being stripped/invisible
  • Now they render as expected
  • This is purely a bug fix that improves behavior for existing use cases

Security & Risk

No security concerns identified:

  • The change is purely presentation logic for markdown escaping
  • Input is already processed through React Markdown's sanitization
  • No new user input pathways or DOM manipulation introduced
  • The escaping logic is defensive and doesn't introduce injection vectors

Risk Assessment: Low

  • The change is isolated to the isLabel code path
  • Existing tests for non-label markdown rendering continue to pass
  • The escaping approach (backslash escaping) is standard markdown practice

Recommendations

The PR is well-implemented. A few optional improvements for consideration:

  1. Consider adding edge case unit tests (optional):

    • Multi-line labels with tokens (e.g., "Line 1\n+\nLine 3")
    • Labels with multiple consecutive tokens (e.g., "+ - *")
    • Labels mixing tokens with regular text (e.g., "Click + to add")
  2. Minor code organization (optional):
    The SINGLE_CHARACTER_LABEL_TOKENS constant could be defined closer to where escapeMarkdownLineForLabel() uses it (currently at line 741, used at line 683) for better code locality.

These are minor suggestions and not blockers for merging.

Verdict

APPROVED: This is a clean, well-tested fix for a legitimate bug affecting button labels with markdown tokens. The implementation is sound, test coverage is adequate, and there are no backwards compatibility or security concerns.


This is an automated AI review. Please verify the feedback and use your judgment.

@sfc-gh-lwilby
Copy link
Copy Markdown
Collaborator

Hi @sea-turt1e! I ran a readiness check on this PR to see what needs attention before we can move forward with review. Here's what I found:

Status: Not Ready for Review Yet

Good news: Your PR is well-written with comprehensive tests and already has product approval! However, there are a couple of CI failures that need to be fixed first.


What Needs to Be Fixed

1. Frontend Linter Failure ⚠️

The js-unit-tests check is failing at the linter step. Please run:

make frontend-lint

If there are any linting errors, you can auto-fix them with:

make frontend-format

2. E2E Test Failure ⚠️

The playwright-e2e-tests check is failing. This is likely a snapshot mismatch from your new tests. You can update the snapshots by running:

make update-snapshots

This command will fetch the snapshot differences from the CI run and update them locally.

Note: The E2E test you added looks good structurally! The failure is likely just missing snapshots or a minor formatting issue.


Infrastructure Issues (Not Your Fault)

Two checks show as "failed" but they're actually infrastructure issues posting comments to the PR:

  • check-size - The size comparison passed, but the workflow failed when trying to post a comment
  • py-coverage-report - The coverage report generated successfully, but the workflow failed when trying to post a comment

You don't need to fix these - they're workflow permission/API issues that a maintainer will need to investigate.


Your PR Quality Score: 72/100 (Grade: B-)

The code quality is strong! Just a couple minor CI issues to fix. Here's the breakdown:

  • Documentation: 25/30 ✅ - Great title, clear description, helpful screenshot
  • Technical: 30/40 ✅ - Comprehensive tests, good implementation, just needs formatting/snapshots
  • Responsiveness: 17/30 ✅ - You've been responsive to feedback

Once you fix the linter and snapshots, your score will jump to ~85+ (A-)!


Next Steps

  1. Fix the linter and E2E test failures (commands above)
  2. Push your fixes to this branch
  3. Request CI re-run: Tag me in a comment (@sfc-gh-lwilby) to re-run the CI checks
  4. Once CI is green: I'll move forward with the review!

Context for Your Understanding

This PR fixes a real bug (issue #7359) where buttons with markdown tokens like +, -, *** were being stripped or rendered incorrectly. Your implementation:

✅ Has comprehensive frontend unit tests
✅ Has E2E tests (as requested by @sfc-gh-lwilby)
✅ Already has product approval from @jrieke
✅ Received a positive automated code review
✅ Addresses the issue correctly with proper markdown escaping

The automated reviewer noted your fix is "clean, well-tested" with only optional suggestions for edge cases. Once CI is green, this should be a straightforward review!


Questions? Feel free to ask if you need help with any of the commands or run into issues. Thanks for your contribution! 🙏

@sfc-gh-lwilby sfc-gh-lwilby self-assigned this Jan 9, 2026
@sfc-gh-lwilby sfc-gh-lwilby added the not-ready-for-maintainer-review A preliminary assessment has been done on this PR and it is missing some key elements. label Jan 9, 2026
lukasmasuch added a commit that referenced this pull request Feb 11, 2026
## Describe your changes

This PR fixes issue #7359 where widget labels containing markdown syntax
characters (-, +, *, #, >, 1.) would render as empty labels because the
markdown parser was interpreting them as list markers, headings, or
blockquotes which are then stripped for labels.

The fix escapes these markdown syntax patterns when `isLabel` is true,
converting them to literal text by adding backslash escapes before
markdown is processed. The escaping only applies to patterns followed by
whitespace or end of line (e.g., "- item" is escaped but "not-a-list" is
not).

## GitHub Issue Link (if applicable)

Fixes #7359

## Testing Plan

- Unit Tests: 175 passing tests covering escaped patterns, non-escaped
patterns, edge cases with pre-escaped text, and emphasis markdown
- E2E Tests: Added test cases for "+" and "1. Something" labels to
verify they display correctly in buttons
- No additional manual testing needed beyond existing test coverage

---

Co-authored-by: sea-turt1e
<[email protected]>

- See original PR #12695

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.

---------

Co-authored-by: Copilot <[email protected]>
@lukasmasuch
Copy link
Copy Markdown
Collaborator

Closing this since it was fixed in this PR: #13887

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

change:feature PR contains new feature or enhancement implementation impact:users PR changes affect end users not-ready-for-maintainer-review A preliminary assessment has been done on this PR and it is missing some key elements. status:product-approved Community PR is approved by product team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants