[JEP 467] Add support for MarkdownComments#4875
[JEP 467] Add support for MarkdownComments#4875johannescoetzee wants to merge 14 commits intojavaparser:masterfrom
Conversation
4814c13 to
96f311b
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #4875 +/- ##
===============================================
- Coverage 58.405% 58.368% -0.037%
+ Complexity 2537 2534 -3
===============================================
Files 677 681 +4
Lines 39046 39302 +256
Branches 7089 7134 +45
===============================================
+ Hits 22805 22940 +135
- Misses 13341 13448 +107
- Partials 2900 2914 +14
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 7 files with indirect coverage changes Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
|
I tried several markdown examples provided by the specification https://openjdk.org/jeps/467 without causing any parsing errors. Ultimately, why should we make changes in JP since it seems to work with the current version? |
|
@jlerbsc I assumed you wanted this based on your comment in #4869. This PR treats markdown comments as a single comment instead of a series of line comments, which makes it easier for users to use them. With how line comments are handled, only the last line comment would be associated with the commented method. Any preceding comments would be orphans, so correctly handling those currently involves handling ranges manually. In that sense, I still think the change here is beneficial, but if simply parsing the comments without errors or crashing is sufficient, then this PR isn't needed and can be closed without issue |
|
@johannescoetzee I agree with you that markdown comments are multi-line comments similar to javadoc comments. What I don't understand is that grouping these elements into a common class or using a class hierarchy makes change more complex. Could you elaborate on this point? @ftomassetti , what is your opinion on integrating Markdown comments into the existing comment architecture? Do you agree with the suggestion made in this PR? Personally, I don't know JP well enough to have a strong opinion on this subject, but it seems to me that this PR introduces additional complexity into comment management. |
|
@jlerbsc I've thought about your question/concern some more and I've changed my mind. I don't think this is really a problem anymore. I had the idea that grouping them into a common class while I was still using the first representation I tried for this (seen in the earlier commits). In that representation, the In the current version proposed by this PR, this is no longer an issue. I would therefore propose doing the following:
Please let me know if you agree with this approach. Once I hear back from you, I'll update the PR with what we've discussed. |
96f311b to
1b1ce43
Compare
3adeb35 to
4157ca4
Compare
|
I've pushed changes starting with the Rename JavadocComment to TraditionalJavadocComment commit that implement the shared ancestor for |
|
It looks like the test failures are slow tests that aren't run with Edit: Done |
50ff8f3 to
5705012
Compare
|
@johannescoetzee Thank you for the valuable work you have done. For now, I will wait for @ftomassetti 's opinion to shed further light on how to proceed. |
|
@jlerbsc Sounds good. Since this isn't required to parse Java 23 code, would you be open to accepting Java 25 contributions before this is finalised? I'm on vacation next week, but I've made good progress with handling module imports, so will likely have that ready soon after I'm back. |
|
Hi @johannescoetzee , I am going to ask you to modify your PR to simplify it and isolate the integration of Markdown comments from the current comment management. Would it be possible for you to separate the comment class hierarchy into a separate PR (this would involve preparing the code to support Markdown comments) and then create a new PR to manage Markdown comments? |
|
Do you think that a solution consisting, initially, of parsing Markdown comments as inline comments and then creating and transforming these comments into Markdown in post-processing would be easier to maintain than a solution consisting of parsing Markdown comments directly? |
I did consider this approach as well, but thought it would end up being more complex. The main problem is that all but the last line comment before a method end up as orphan comments, so, to correctly reconstruct the markdown comments, it would necessary to connect these line comments by looking at token ranges and the tokens between them. While separating this logic from the parser would bring some benefit, I think it would be harder to reason about than the approach implemented in this PR where the token stream is being processed directly. It would also only work if tokens are saved (as far as I understand), which is a significant downside. |
Could you add this analysis, along with the other approaches you considered, to the description of the PR, in order to document in the JP project the approach that was chosen and those that were considered? Thank you. |
|
Can you close this PR if it is no longer relevant? |
Fixes #4660.
This PR adds support for Markdown comments as described in JEP 467. I ran into a few issues while implementing this, so the end result is a compromise between the ideal design and what is doable in a reasonable amount of time and without rewriting all of the comment handling (and requiring users to do the same).
MarkdownComment node
As it is, this PR adds the
MarkdownCommentnode, which is distinct fromJavadocComment. I considered merging these under a common class, but in my opinion the benefits to doing so is far outweighed by the complexity of the change required. Thecontentfield of theMarkdownCommentcontains the raw text, including the///and leading spaces after the first line, which is somewhat consistent with how block comments are handled. Keeping this information is necessary for the pretty printer and LPP.It also contains a method
getMarkdownContentwhich strips the leading whitespace,///and indents the remaining content consistent with the description provided in JEP 467. I suspect that this will be more useful to users than using the raw content.Parser changes
I initially tried updating the grammar to parse markdown comments as a single token, but ran into some difficulties doing this. From what I can see it looks like it should be possible, but would require manually manipulating the tokenizer input stream and I wasn't confident enough that this would work out to spend too much time on it. Instead, the
MarkdownCommentconsists of a range of tokens starting with the firstSINGLE_LINE_COMMENT, including all leading and trailing whitespace for the body and ending with the lastSINGLE_LINE_COMMENT. The handling of all this is done in theCommonTokenActionmethod defined injava.jjwhich is added to theGeneratedJavaParserTokenManager.Pretty printer and LPP
Support in the pretty printer is mostly a copy-paste of block comments are handled with only minor tweaks to account for the lack of
/**and*/. Support for the LPP is more complicated since the assumption was made that a comment will always consist of a single token. I've updated the code handling adding, removing, and replacing comments to account for the possibility of having multiple tokens corresponding to the comment.Note on the implementation
The implementation and general commit history for this is a bit messy again. The comment handling logic was surprisingly tricky to extend to handle this case, so I tried to implement things in the least disruptive way. As I progressed, and especially when getting to the printers, I ran into issues with my initial implementation (especially the content field) and had to backtrack and change things a bit. Sorry about that.