[JEP 467] Add support for MarkdownComments #4899
Merged
jlerbsc merged 2 commits into javaparser:master on Nov 20, 2025
Codecov Report
Coverage diff (master vs. #4899):
Metric       master     #4899
Coverage     58.368%    58.368%
Complexity   2534       2534
Files        681        681
Lines        39302      39302
Branches     7134       7134
Hits         22940      22940
Misses       13448      13448
Partials     2914       2914
Originally opened as #4875, then split into #4885 and johannescoetzee#3 (the latter was opened and reviewed on my fork so the diff could be viewed against a local version of #4885).
Original description
This PR will replace #4875.
This PR adds support for Markdown comments as described in JEP 467. I ran into a few issues while implementing this, so the end result is a compromise between the ideal design and what is doable in a reasonable amount of time and without rewriting all of the comment handling (and requiring users to do the same).
MarkdownComment node
As it is, this PR adds the MarkdownComment node, which is distinct from JavadocComment. I considered merging these under a common class, but in my opinion the benefits of doing so are far outweighed by the complexity of the change required. The content field of the MarkdownComment contains the raw text, including the /// and leading spaces after the first line, which is somewhat consistent with how block comments are handled. Keeping this information is necessary for the pretty printer and the LPP (LexicalPreservingPrinter).
It also contains a method, getMarkdownContent, which strips the leading whitespace and the /// from each line and indents the remaining content consistently with the description provided in JEP 467. I suspect that this will be more useful to users than the raw content.
Parser changes
I initially tried updating the grammar to parse Markdown comments as a single token, but ran into some difficulties. From what I can see it should be possible, but it would require manually manipulating the tokenizer input stream, and I wasn't confident enough that this would work out to spend much time on it. Instead, the MarkdownComment consists of a range of tokens starting with the first SINGLE_LINE_COMMENT, including all leading and trailing whitespace for the body, and ending with the last SINGLE_LINE_COMMENT. All of this is handled in the CommonTokenAction method defined in java.jj, which is added to the GeneratedJavaParserTokenManager.
Pretty printer and LPP
Support in the pretty printer is mostly a copy-paste of how block comments are handled, with only minor tweaks to account for the lack of /** and */. Support for the LPP is more complicated, since the existing code assumed that a comment always consists of a single token. I've updated the code that handles adding, removing, and replacing comments to account for a comment that corresponds to multiple tokens.
Note on the implementation
I did consider implementing Markdown comment support as a PostProcessing pass as well, but thought it would end up being more complex. The main problem is that all but the last line comment before a method end up as orphan comments, so, to correctly reconstruct the Markdown comments, it would be necessary to connect these line comments by looking at their token ranges and the tokens between them. While separating this logic from the parser would bring some benefit, I think it would be harder to reason about than the approach implemented in this PR, where the token stream is processed directly. It would also only work if tokens are saved (as far as I understand), which is a significant downside.
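For readers who want a concrete picture of the two steps discussed in this description (connecting consecutive /// lines into one comment, and the stripping attributed to getMarkdownContent), here is a minimal sketch. It is not the code in this PR: it works on plain source lines rather than JavaParser tokens, the class and method names are hypothetical, and the rules (a non-/// line, including a blank one, ends a comment; the common leading indentation is removed after the ///) reflect my reading of JEP 467.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in JavaParser.
class MarkdownCommentSketch {

    // Groups runs of consecutive lines that start with "///" (after indentation).
    // Any other line, including a blank one, closes the current run.
    static List<List<String>> groupMarkdownLines(List<String> sourceLines) {
        List<List<String>> blocks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : sourceLines) {
            if (line.stripLeading().startsWith("///")) {
                current.add(line);
            } else if (!current.isEmpty()) {
                blocks.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            blocks.add(current);
        }
        return blocks;
    }

    // Precondition: every raw line starts with "///" after optional indentation.
    // Drops the indentation and the "///", then shifts all lines left by the
    // smallest indentation found on a non-blank line.
    static String stripMarkdownContent(List<String> rawLines) {
        List<String> bodies = new ArrayList<>();
        int minIndent = Integer.MAX_VALUE;
        for (String raw : rawLines) {
            String body = raw.stripLeading().substring(3);
            bodies.add(body);
            if (!body.isBlank()) {
                int indent = body.length() - body.stripLeading().length();
                minIndent = Math.min(minIndent, indent);
            }
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < bodies.size(); i++) {
            String body = bodies.get(i);
            if (i > 0) {
                out.append('\n');
            }
            // Simplification: blank lines are emitted as empty lines.
            out.append(body.isBlank() ? "" : body.substring(minIndent));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> source = List.of(
                "/// Returns the sum.",
                "///   - indented item",
                "",
                "/// A separate comment.",
                "int sum(int a, int b) { return a + b; }");
        for (List<String> block : groupMarkdownLines(source)) {
            System.out.println("[" + stripMarkdownContent(block) + "]");
        }
    }
}
```

A token-based version would have to make the same decisions using token kinds and ranges instead of line prefixes, which is where the extra complexity described above comes from.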