[JEP 467] Add support for MarkdownComments by johannescoetzee · Pull Request #4875 · javaparser/javaparser

johannescoetzee · 2025-10-16T15:57:15Z

This PR adds support for Markdown comments as described in JEP 467. I ran into a few issues while implementing this, so the end result is a compromise between the ideal design and what is doable in a reasonable amount of time and without rewriting all of the comment handling (and requiring users to do the same).

MarkdownComment node

As it is, this PR adds the MarkdownComment node, which is distinct from JavadocComment. I considered merging these under a common class, but in my opinion the benefits to doing so is far outweighed by the complexity of the change required. The content field of the MarkdownComment contains the raw text, including the /// and leading spaces after the first line, which is somewhat consistent with how block comments are handled. Keeping this information is necessary for the pretty printer and LPP.

It also contains a method getMarkdownContent which strips the leading whitespace, /// and indents the remaining content consistent with the description provided in JEP 467. I suspect that this will be more useful to users than using the raw content.

Parser changes

I initially tried updating the grammar to parse markdown comments as a single token, but ran into some difficulties doing this. From what I can see it looks like it should be possible, but would require manually manipulating the tokenizer input stream and I wasn't confident enough that this would work out to spend too much time on it. Instead, the MarkdownComment consists of a range of tokens starting with the first SINGLE_LINE_COMMENT, including all leading and trailing whitespace for the body and ending with the last SINGLE_LINE_COMMENT. The handling of all this is done in the CommonTokenAction method defined in java.jj which is added to the GeneratedJavaParserTokenManager.

Pretty printer and LPP

Support in the pretty printer is mostly a copy-paste of block comments are handled with only minor tweaks to account for the lack of /** and */. Support for the LPP is more complicated since the assumption was made that a comment will always consist of a single token. I've updated the code handling adding, removing, and replacing comments to account for the possibility of having multiple tokens corresponding to the comment.

Note on the implementation

The implementation and general commit history for this is a bit messy again. The comment handling logic was surprisingly tricky to extend to handle this case, so I tried to implement things in the least disruptive way. As I progressed, and especially when getting to the printers, I ran into issues with my initial implementation (especially the content field) and had to backtrack and change things a bit. Sorry about that.

codecov · 2025-10-16T16:18:17Z

Codecov Report

❌ Patch coverage is 60.30769% with 129 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.368%. Comparing base (51d5210) to head (9ac0726).
⚠️ Report is 25 commits behind head on master.

Files with missing lines	Patch %	Lines
.../lexicalpreservation/LexicalPreservingPrinter.java	63.736%	27 Missing and 6 partials ⚠️
.../github/javaparser/printer/PrettyPrintVisitor.java	0.000%	11 Missing ⚠️
...va/com/github/javaparser/ast/comments/Comment.java	0.000%	10 Missing ⚠️
...ithub/javaparser/ast/comments/MarkdownComment.java	83.928%	8 Missing and 1 partial ⚠️
...avaparser/GeneratedJavaParserTokenManagerBase.java	60.869%	4 Missing and 5 partials ⚠️
...parser/ast/comments/TraditionalJavadocComment.java	70.000%	6 Missing ⚠️
...om/github/javaparser/ast/visitor/CloneVisitor.java	25.000%	6 Missing ⚠️
...github/javaparser/ast/comments/JavadocComment.java	0.000%	5 Missing ⚠️
...aparser/ast/visitor/GenericListVisitorAdapter.java	0.000%	5 Missing ⚠️
.../javaparser/ast/visitor/GenericVisitorAdapter.java	0.000%	4 Missing ⚠️
... and 16 more

Additional details and impacted files

@@               Coverage Diff               @@
##              master     #4875       +/-   ##
===============================================
- Coverage     58.405%   58.368%   -0.037%     
+ Complexity      2537      2534        -3     
===============================================
  Files            677       681        +4     
  Lines          39046     39302      +256     
  Branches        7089      7134       +45     
===============================================
+ Hits           22805     22940      +135     
- Misses         13341     13448      +107     
- Partials        2900      2914       +14

Flag	Coverage Δ
AlsoSlowTests	`58.368% <60.307%> (-0.037%)`	⬇️
javaparser-core	`58.368% <60.307%> (-0.037%)`	⬇️
javaparser-symbol-solver	`58.368% <60.307%> (-0.037%)`	⬇️
jdk-10	`57.919% <60.493%> (-0.048%)`	⬇️
jdk-11	`57.936% <60.493%> (-0.033%)`	⬇️
jdk-12	`57.933% <60.493%> (-0.036%)`	⬇️
jdk-13	`57.936% <60.493%> (-0.033%)`	⬇️
jdk-14	`58.170% <60.493%> (-0.019%)`	⬇️
jdk-15	`58.173% <60.493%> (-0.035%)`	⬇️
jdk-16	`58.147% <60.493%> (-0.035%)`	⬇️
jdk-17	`58.300% <60.493%> (-0.036%)`	⬇️
jdk-18	`58.300% <60.493%> (-0.036%)`	⬇️
jdk-8	`57.937% <60.493%> (-0.033%)`	⬇️
jdk-9	`57.934% <60.493%> (-0.033%)`	⬇️
macos-latest	`58.360% <60.307%> (-0.037%)`	⬇️
ubuntu-latest	`58.355% <60.307%> (-0.037%)`	⬇️
windows-latest	`58.350% <60.307%> (-0.037%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
...n/java/com/github/javaparser/StaticJavaParser.java	`70.238% <100.000%> (ø)`
...ava/com/github/javaparser/ast/CompilationUnit.java	`74.524% <ø> (ø)`
...thub/javaparser/ast/visitor/NodeFinderVisitor.java	`4.788% <ø> (ø)`
...hub/javaparser/ast/visitor/VoidVisitorAdapter.java	`99.065% <100.000%> (+0.004%)`	⬆️
...thub/javaparser/metamodel/JavaParserMetaModel.java	`99.707% <100.000%> (+0.001%)`	⬆️
.../javaparser/metamodel/JavadocCommentMetaModel.java	`100.000% <100.000%> (ø)`
...javaparser/metamodel/MarkdownCommentMetaModel.java	`100.000% <100.000%> (ø)`
.../metamodel/TraditionalJavadocCommentMetaModel.java	`100.000% <100.000%> (ø)`
.../java/com/github/javaparser/JavaParserAdapter.java	`97.297% <0.000%> (ø)`
...parser/ast/visitor/GenericVisitorWithDefaults.java	`96.226% <0.000%> (-0.917%)`	⬇️
... and 24 more

... and 7 files with indirect coverage changes

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a111ba0...9ac0726. Read the comment docs.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

jlerbsc · 2025-10-20T12:17:20Z

I tried several markdown examples provided by the specification https://openjdk.org/jeps/467 without causing any parsing errors. Ultimately, why should we make changes in JP since it seems to work with the current version?

johannescoetzee · 2025-10-20T12:23:53Z

@jlerbsc I assumed you wanted this based on your comment in #4869. This PR treats markdown comments as a single comment instead of a series of line comments, which makes it easier for users to use them. With how line comments are handled, only the last line comment would be associated with the commented method. Any preceding comments would be orphans, so correctly handling those currently involves handling ranges manually.

In that sense, I still think the change here is beneficial, but if simply parsing the comments without errors or crashing is sufficient, then this PR isn't needed and can be closed without issue

jlerbsc · 2025-10-21T13:01:07Z

@johannescoetzee I agree with you that markdown comments are multi-line comments similar to javadoc comments. What I don't understand is that grouping these elements into a common class or using a class hierarchy makes change more complex. Could you elaborate on this point?

@ftomassetti , what is your opinion on integrating Markdown comments into the existing comment architecture? Do you agree with the suggestion made in this PR? Personally, I don't know JP well enough to have a strong opinion on this subject, but it seems to me that this PR introduces additional complexity into comment management.

johannescoetzee · 2025-10-23T09:08:42Z

@jlerbsc I've thought about your question/concern some more and I've changed my mind. I don't think this is really a problem anymore.

I had the idea that grouping them into a common class while I was still using the first representation I tried for this (seen in the earlier commits). In that representation, the content field contained what is now returned by the getMarkdownContent method. In that version, the usual reconstruction of comments comment.getHeader() + comment.getContent() + comment.getFooter() no longer made sense, and in my opinion changing that anywhere javadoc comments were handled (including in user code) was not worth the benefit of uniting those comments.

In the current version proposed by this PR, this is no longer an issue. I would therefore propose doing the following:

Rename JavadocComment to TraditionalJavadocComment, following the naming convention from https://docs.oracle.com/javase/specs/jls/se25/html/jls-3.html#jls-3.7 (where it's called TraditionalComment, but keeping the Javadoc in the name is useful in JavaParser to distinguish it from BlockComment
Create an abstract JavadocComment class that is extended by both TraditionalJavadocComment and MarkdownComment
Update the Javadoc and JavadocParser classes to handle MarkdownComments, which should now be a fairly small change.

Please let me know if you agree with this approach. Once I hear back from you, I'll update the PR with what we've discussed.

johannescoetzee · 2025-10-23T13:34:10Z

I've pushed changes starting with the Rename JavadocComment to TraditionalJavadocComment commit that implement the shared ancestor for TraditionalJavadocComment and MarkdownComment that also demonstrate the code changes required to handle this case. Overall the changes on the user side will be fairly minor. The majority of the manual changes I had to make were for javaparser internal components. It's possible I missed something, but I've covered the things I found and I would, of course, fix any issues related to this that come up.

johannescoetzee · 2025-10-23T13:50:11Z

It looks like the test failures are slow tests that aren't run with mvn test by default. I'm updating expectations for those now

Edit: Done

jlerbsc · 2025-10-24T13:50:42Z

@johannescoetzee Thank you for the valuable work you have done. For now, I will wait for @ftomassetti 's opinion to shed further light on how to proceed.

johannescoetzee · 2025-10-24T13:58:13Z

@jlerbsc Sounds good. Since this isn't required to parse Java 23 code, would you be open to accepting Java 25 contributions before this is finalised? I'm on vacation next week, but I've made good progress with handling module imports, so will likely have that ready soon after I'm back.

jlerbsc · 2025-10-30T09:29:51Z

Hi @johannescoetzee , I am going to ask you to modify your PR to simplify it and isolate the integration of Markdown comments from the current comment management. Would it be possible for you to separate the comment class hierarchy into a separate PR (this would involve preparing the code to support Markdown comments) and then create a new PR to manage Markdown comments?

jlerbsc · 2025-10-30T09:30:00Z

Do you think that a solution consisting, initially, of parsing Markdown comments as inline comments and then creating and transforming these comments into Markdown in post-processing would be easier to maintain than a solution consisting of parsing Markdown comments directly?

johannescoetzee · 2025-11-03T10:17:03Z

Do you think that a solution consisting, initially, of parsing Markdown comments as inline comments and then creating and transforming these comments into Markdown in post-processing would be easier to maintain than a solution consisting of parsing Markdown comments directly?

I did consider this approach as well, but thought it would end up being more complex. The main problem is that all but the last line comment before a method end up as orphan comments, so, to correctly reconstruct the markdown comments, it would necessary to connect these line comments by looking at token ranges and the tokens between them. While separating this logic from the parser would bring some benefit, I think it would be harder to reason about than the approach implemented in this PR where the token stream is being processed directly. It would also only work if tokens are saved (as far as I understand), which is a significant downside.

jlerbsc · 2025-11-05T09:01:07Z

Do you think that a solution consisting, initially, of parsing Markdown comments as inline comments and then creating and transforming these comments into Markdown in post-processing would be easier to maintain than a solution consisting of parsing Markdown comments directly?

I did consider this approach as well, but thought it would end up being more complex. The main problem is that all but the last line comment before a method end up as orphan comments, so, to correctly reconstruct the markdown comments, it would necessary to connect these line comments by looking at token ranges and the tokens between them. While separating this logic from the parser would bring some benefit, I think it would be harder to reason about than the approach implemented in this PR where the token stream is being processed directly. It would also only work if tokens are saved (as far as I understand), which is a significant downside.

Could you add this analysis, along with the other approaches you considered, to the description of the PR, in order to document in the JP project the approach that was chosen and those that were considered? Thank you.

jlerbsc · 2025-11-20T20:25:34Z

Can you close this PR if it is no longer relevant?

johannescoetzee · 2025-11-21T11:56:23Z

Closed in favour of #4885 and #4899

johannescoetzee changed the title ~~Johannes/markdown comments~~ [WIP] Add support for MarkdownComments Oct 16, 2025

johannescoetzee force-pushed the johannes/markdown-comments branch 2 times, most recently from 4814c13 to 96f311b Compare October 16, 2025 16:05

johannescoetzee changed the title ~~[WIP] Add support for MarkdownComments~~ Add support for MarkdownComments Oct 16, 2025

johannescoetzee changed the title ~~Add support for MarkdownComments~~ [JEP 467] Add support for MarkdownComments Oct 16, 2025

johannescoetzee force-pushed the johannes/markdown-comments branch from 96f311b to 1b1ce43 Compare October 23, 2025 09:22

johannescoetzee added 11 commits October 23, 2025 15:29

Add MarkdownComment skeleton and metamodel

3048fb5

Generate code

5d88e99

Add pretty print visitor implementations for MarkdownComment

0103766

Add comment test

b29cdf4

Add java.jj support and fix unit tests

c810f4c

Change MarkdownComment content to make printing easier/possible

68e0793

Add pretty printer tests and fix

111b2ab

Add LPP tests for Markdown comments and fix

5c32f76

Rename JavadocComment to TraditionalJavadocComment

c0f488a

Re-introduce JavadocComment as abstract common ancestor with codegen

8308161

Switch to using JavadocComment base class where relevant

4157ca4

johannescoetzee force-pushed the johannes/markdown-comments branch from 3adeb35 to 4157ca4 Compare October 23, 2025 13:29

Add test to verify that MarkdownComment is a Javadoc comment

1f108f3

Update slow test expectations

5705012

johannescoetzee force-pushed the johannes/markdown-comments branch from 50ff8f3 to 5705012 Compare October 23, 2025 14:05

kbuntrock mentioned this pull request Oct 26, 2025

Support Java25 markdown documentation kbuntrock/openapi-maven-plugin#253

Closed

johannescoetzee mentioned this pull request Nov 4, 2025

Refactor comment hierarchy in preparation for MarkdownComments #4885

Merged

johannescoetzee mentioned this pull request Nov 7, 2025

Add support for Markdown comments johannescoetzee/javaparser#3

Closed

Add some comments to explain markdown comment handling

9ac0726

johannescoetzee mentioned this pull request Nov 20, 2025

[JEP 467] Add support for MarkdownComments #4899

Merged

johannescoetzee closed this Nov 21, 2025

Uh oh!

Conversation

johannescoetzee commented Oct 16, 2025

MarkdownComment node

Parser changes

Pretty printer and LPP

Note on the implementation

Uh oh!

codecov bot commented Oct 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

jlerbsc commented Oct 20, 2025

Uh oh!

johannescoetzee commented Oct 20, 2025

Uh oh!

jlerbsc commented Oct 21, 2025

Uh oh!

johannescoetzee commented Oct 23, 2025

Uh oh!

johannescoetzee commented Oct 23, 2025

Uh oh!

johannescoetzee commented Oct 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jlerbsc commented Oct 24, 2025

Uh oh!

johannescoetzee commented Oct 24, 2025

Uh oh!

jlerbsc commented Oct 30, 2025

Uh oh!

jlerbsc commented Oct 30, 2025

Uh oh!

johannescoetzee commented Nov 3, 2025

Uh oh!

jlerbsc commented Nov 5, 2025

Uh oh!

jlerbsc commented Nov 20, 2025

Uh oh!

johannescoetzee commented Nov 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codecov bot commented Oct 16, 2025 •

edited

Loading

johannescoetzee commented Oct 23, 2025 •

edited

Loading