Fix parser bug where link label gets broken by ] in code span #643

Merged May 18, 2023 (2 commits)

@lukas-code lukas-code commented May 3, 2023

 [before `]` after]
 ^        ^       ^
 |        |       |
(A)      (B)     (C)

When the parser encounters the ] at (C), it tries to parse a link label starting from the previous [ at (A) until it encounters the first ] at (B). This is fine, because

> A link label begins with a left bracket ([) and ends with the first right bracket (]) that is not backslash-escaped.

However, due to

> Code span backticks have higher precedence than any other inline constructs except HTML tags and autolinks.

the code span `]` takes precedence over the link closing.

Since a link label must not contain a ] that is not immediately preceded by a \, the above must not be parsed as a link label.

This patch fixes #642, a parser bug in which the range (A)...(C) was parsed as a shortcut link with the label (A)...(B) whenever the text between (A) and (B) (here: "before `") was a valid link reference. The fix simply compares the found end (B) with the supposed end (C) to ensure that the link label closed by (C) actually ends at (C).
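
To make the end-position comparison concrete, here is a minimal, self-contained sketch. It is not pulldown-cmark's actual code: `scan_label_end` and `label_closed_by` are hypothetical helpers written for illustration, and the sketch ignores other label rules (length limit, nested brackets, code span handling). It only shows that a label opened at (A) stops at the first unescaped `]`, and that a closing bracket at (C) is accepted only if the scan really ends there.

```rust
/// Hypothetical helper (not part of pulldown-cmark): returns the byte index
/// of the first `]` after the `[` at `start` that is not backslash-escaped.
fn scan_label_end(text: &str, start: usize) -> Option<usize> {
    let bytes = text.as_bytes();
    if bytes.get(start) != Some(&b'[') {
        return None;
    }
    let mut i = start + 1;
    while i < bytes.len() {
        match bytes[i] {
            // A backslash escapes the next byte, so skip both.
            b'\\' => i += 2,
            // First unescaped right bracket: the label ends here.
            b']' => return Some(i),
            _ => i += 1,
        }
    }
    None
}

/// Hypothetical check mirroring the idea of the fix: the bracket at `close`
/// may only close the label opened at `open` if the scan ends exactly there.
fn label_closed_by(text: &str, open: usize, close: usize) -> bool {
    scan_label_end(text, open) == Some(close)
}

fn main() {
    let text = "[before `]` after]";
    let (a, b, c) = (0, 9, 17); // byte offsets of (A), (B), (C)

    // The scan from (A) stops at (B), not at (C) ...
    assert_eq!(scan_label_end(text, a), Some(b));
    // ... so the bracket at (C) must not be treated as closing this label.
    assert!(!label_closed_by(text, a, c));

    // Without the stray bracket, the label ends where the closer is.
    assert!(label_closed_by("[before after]", 0, 13));
}
```

In the example from this description, the scan from (A) ends at (B), so the `]` at (C) is rejected as the end of a shortcut link label, regardless of whether a reference for "before `" exists.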


As a bonus I've included a #[derive(Debug)] that helped me debug this and removed an unneeded &. But I can drop the unrelated changes if you don't want them.

@Martin1887 Martin1887 merged commit e896249 into pulldown-cmark:master May 18, 2023
@Martin1887
Collaborator

Martin1887 commented May 18, 2023

It looks fine, merged.

@lukas-code
Author

@Martin1887 would it be possible to release a 0.9.3 patch with this fix?

This bug currently causes rustdoc to crash on some inputs, see for example rust-lang/rust#111117 and rust-lang/docs.rs#2131. We have a new lint, rustdoc::unescaped_backticks, which is supposed to detect "broken" inline code. This happens a lot on keyboard layouts where ` is a dead key. However, the lint can only work correctly if "broken" markdown is actually parsed according to the spec.

Looking at the roadmap for 0.10, it seems like the next major release will still take some time, so I'd appreciate it if you could make a point release. I've prepared an example diff for a potential patch release here, if that helps: v0.9.2...lukas-code:pulldown-cmark:0.9.3

Alternatively, we can work around this bug on the rustdoc side, which would probably just mean disabling the lint for the time being.

@Martin1887
Collaborator

That seems right to me. I could create a new branch from the v0.9.2 tag, and then you could open a pull request against that branch from your 0.9.3 branch.

What do you think, @raphlinus?

@Martin1887
Collaborator

Martin1887 commented May 21, 2023

The branch branch_0.9.3 has just been created. You can now open the pull request from your branch against it, so that the 0.9.3 release can be published as soon as Raph gives his OK.

@lukas-code lukas-code mentioned this pull request May 21, 2023
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request May 25, 2023
…rd, r=GuillaumeGomez

update `pulldown-cmark` to `0.9.3`

This PR updates `pulldown-cmark` to version `0.9.3`, which does two main things:
* Pulls in pulldown-cmark/pulldown-cmark#643 to fix rust-lang#111117
* Allows parsing strikethrough with single tildes, e.g. `~foo~` -> ~foo~. This matches the [GFM spec](https://github.github.com/gfm/#strikethrough-extension-).

Full changelog: pulldown-cmark/pulldown-cmark#646
compiler-errors added a commit to compiler-errors/rust that referenced this pull request May 25, 2023
saethlin pushed a commit to saethlin/miri that referenced this pull request May 26, 2023
Successfully merging this pull request may close these issues.

] inside inline code inside link text gets parsed incorrectly if corresponding reference exists