Expose Event::HardBreak style #1034

blyxyas · 2025-04-24T12:12:03Z

Being that this information is already parsed into ItemBody and the user might want to behave differently depending on the style (either backslash or double spaced) of the hardbreak

I think it's the best option to expose it as a new enum.

This issue came up in Clippy, we has to do very weird things to get the hardbreak style, and results in performance issues. Sorry if this is controversial enough to need an issue.

Being that this information is already parsed into ItemBody and the user might want to behave differently depending on the style (either backslash or double spaced) of the hardbreak I think it's the best option to expose it as a new enum. This issue came up in Clippy, we has to do very weird things to get the hardbreak style, and results in performance issues. Sorry if this is controversial enough to need an issue.

ollpu

Thanks! I think this makes sense, and is in line with some other enums we already expose. Check fmt!

Those other enums tend to be named ..Kind, so maybe this should be the same for the sake of uniformity, HardBreakKind?

Strictly speaking, the double-space style/kind is "two or more spaces", so I wonder if there's a more accurate name. Spaces?

By the way, do you have a pointer to the poorly performing post-hoc implementation in Clippy? I'm curious because I imagine it to be a relatively simple check if you know the event spans (with OffsetIter)

README.md

guide/src/index.md

fuzz/src/lib.rs

Also, format

blyxyas · 2025-04-24T17:34:38Z

This line A calls Fragments::span(), which calls source_span_for_markdown_range, which is a fairly expensive function. That first case calls it just to check if we're on a line that terminates in \, and it's called about 700 times while analyzing tokio taking up about 10% of the Clippy runtime in that crate (which is unacceptable!).

blyxyas · 2025-04-25T18:17:39Z

Oh I didn't see the failling CI :/ GitHub seems to be funky with the CI interface in this PR.

Turns out that `doc_markdown` uses a non-cheap rustdoc function to convert from markdown ranges into source spans. And it was using it a lot (about once every 18 lines of documentation on `tokio`, which ends up being about 1800 times). This ended up being about 18% of the total Clippy runtime as discovered by lintcheck --perf in docs-heavy crates. This PR optimizes one of the cases in which Clippy calls the function, and a future PR once pulldown-cmark/pulldown-cmark#1034 is merged will be open. Note that not all crates were affected by this crate equally, those with more docs are affected far more than those light ones.

ollpu · 2025-04-25T19:37:52Z

This line A calls Fragments::span(), which calls source_span_for_markdown_range, which is a fairly expensive function. That first case calls it just to check if we're on a line that terminates in \, and it's called about 700 times while analyzing tokio taking up about 10% of the Clippy runtime in that crate (which is unacceptable!).

I see, but I suppose you don't have to go through that machinery to get the snippet and perform the check?

-                if let Some(span) = fragments.span(cx, range.clone())
-                    && !span.from_expansion()
-                    && let Some(snippet) = snippet_opt(cx, span)
-                    && !snippet.trim().starts_with('\\')
-                    && event == HardBreak {
-                    collected_breaks.push(span);
-                }
+                if event == HardBreak
+                    && !doc[range.clone()].trim().starts_with('\\')
+                    && let Some(span) = fragments.span(cx, range.clone())
+                    && !span.from_expansion() {
+                    collected_breaks.push(span);
+                }

(Edit: Also do the HardBreak check first to avoid string slicing, since softs are much more common)

CI now failing from missing derive(serde)

blyxyas · 2025-04-29T22:37:43Z

Honestly, I'm not sure how to proceed 😅 This PR is then a little bit redundant in our specific case. If the project would still appreciate it, I can do the necessary fixes for CI (which is proving tricky, I'll manually check .github/workflows).

If you think that it does not add any value, I have no problems closing it and using the patch you provided (with the appropriate authoring information for the commit)

ollpu · 2025-05-02T12:31:19Z

I'm not sure either. Clearly there is a use for distinguishing the kinds of hard break, but it can be recovered fairly easily with offsets. The downside to this change is it makes it more cumbersome to match on Event, or to generate synthetic ones, like in the README example.

From a princpled perspective, if we add this, then in some sense we're also welcoming other AST enrichments like:

ATX vs setext heading (# heading vs heading\n===)
Bullet list marker (-, + or *)
Ordered list delimiter (1. or 1))
Emphasis marker (_ or *)

Also related to the discussion in #892

Opinions? @Martin1887 @notriddle

Either way, you're welcome to incorporate my patch in Clippy :)

Martin1887 · 2025-05-05T07:58:19Z

I totally agree with your conclusions, we have to think about maintaining the current code or exposing more details, with a more cumbersome interface and breaking changes.

Thanks!

blyxyas · 2025-05-05T12:28:42Z

In that case, I'll close this PR. In case that in a distant future this feature is more needed in the project feel free to implement it without attribution hassle (it's a 20 line change)

Thanks for the feedback and the reviews, and have a good week!

) Turns out that `doc_markdown` uses a non-cheap rustdoc function to convert from markdown ranges into source spans. And it was using it a lot (about once every 17 lines of documentation on `tokio`, which ends up being about 2000 times). This ended up being about 18% of the total Clippy runtime as discovered by lintcheck --perf in docs-heavy crates. This PR optimizes one of the cases in which Clippy calls the function, and a future PR once pulldown-cmark/pulldown-cmark#1034 is merged will be opened. This PR lands the use of the function into the single-digit zone. Note that not all crates were affected by this crate equally, those with more docs are affected far more than those light ones. changelog:[`clippy::doc_markdown`] has been optimized by 50%

ollpu reviewed Apr 24, 2025

View reviewed changes

README.md Outdated Show resolved Hide resolved

guide/src/index.md Outdated Show resolved Hide resolved

fuzz/src/lib.rs Outdated Show resolved Hide resolved

Normalize HardBreaks and fix README.md

cc77981

Also, format

blyxyas force-pushed the expose-hardbreakstyle branch from 623d18e to cc77981 Compare April 24, 2025 17:01

blyxyas added 2 commits April 24, 2025 19:43

HardBreakStyle Renamed to HardBreakKind and more

39b04a5

Format

c6a198e

blyxyas mentioned this pull request Apr 25, 2025

[Perf] Optimize documentation lints **a lot** (1/2) (18% -> 10%) rust-lang/rust-clippy#14693

Merged

blyxyas closed this May 5, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Expose Event::HardBreak style #1034

Expose Event::HardBreak style #1034

Uh oh!

blyxyas commented Apr 24, 2025

Uh oh!

ollpu left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

blyxyas commented Apr 24, 2025 •

edited

Loading

Uh oh!

blyxyas commented Apr 25, 2025

Uh oh!

ollpu commented Apr 25, 2025 •

edited

Loading

Uh oh!

blyxyas commented Apr 29, 2025

Uh oh!

ollpu commented May 2, 2025

Uh oh!

Martin1887 commented May 5, 2025

Uh oh!

blyxyas commented May 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Expose Event::HardBreak style #1034

Expose Event::HardBreak style #1034

Uh oh!

Conversation

blyxyas commented Apr 24, 2025

Uh oh!

ollpu left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

blyxyas commented Apr 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

blyxyas commented Apr 25, 2025

Uh oh!

ollpu commented Apr 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

blyxyas commented Apr 29, 2025

Uh oh!

ollpu commented May 2, 2025

Uh oh!

Martin1887 commented May 5, 2025

Uh oh!

blyxyas commented May 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

blyxyas commented Apr 24, 2025 •

edited

Loading

ollpu commented Apr 25, 2025 •

edited

Loading