Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse errors if @see spans multiple lines #16005

Closed
cowwoc opened this issue Dec 5, 2024 · 17 comments · Fixed by #16355
Closed

Parse errors if @see spans multiple lines #16005

cowwoc opened this issue Dec 5, 2024 · 17 comments · Fixed by #16355

Comments

@cowwoc
Copy link

cowwoc commented Dec 5, 2024

I have read check documentation: https://checkstyle.org/checks/javadoc/javadocmethod.html
I have downloaded the latest checkstyle from: https://checkstyle.org/cmdline.html#Download_and_Run
I have executed the cli and showed it below, as cli describes the problem better than 1,000 words

javac src\main\java\Testcase.java
# no errors/warnings

/var/tmp $ cat config.xml

<?xml version="1.0"?>
<!DOCTYPE module PUBLIC
        "-//Checkstyle//DTD Checkstyle Configuration 1.3//EN"
        "https://checkstyle.org/dtds/configuration_1_3.dtd">
<module name="Checker">
    <module name="TreeWalker">
        <module name="SuppressWarningsHolder"/>
        <module name="InvalidJavadocPosition"/>

        <module name="JavadocBlockTagLocation"/>
        <module name="JavadocMissingLeadingAsterisk"/>
        <module name="JavadocMissingWhitespaceAfterAsterisk"/>
        <module name="JavadocTagContinuationIndentation">
            <property name="offset" value="1"/>
        </module>
        <module name="NonEmptyAtclauseDescription"/>
        <module name="RequireEmptyLineBeforeBlockTagGroup"/>
        <module name="SingleLineJavadoc">
            <property name="ignoreInlineTags" value="false"/>
        </module>
    </module>
</module>

/var/tmp $ cat YOUR_FILE.java

public class Testcase {
    /**
     * The name of the ECDSA (Elliptic Curve Digital Signature Algorithm).
     *
     * @see <a
     * href="https://docs.oracle.com/en/java/javase/23/docs/specs/security/standard-names.html#keypairgenerator-algorithms">
     * KeyPairGenerator Algorithms</a>.
     */
    public static final String ELLIPTIC_CURVE_ALGORITHM = "EC";
}
set RUN_LOCALE="-Duser.language=en -Duser.country=US"
java %RUN_LOCALE% -jar checkstyle-10.20.2-all.jar -c config.xml Testcase.java
Starting audit...
[ERROR] C:\Users\Gili\Documents\3rdparty\checkstyle-javadoc-parsing\Testcase.java:7: Javadoc comment at column 38 has parse error. Details: mismatched input '.' expecting <EOF> while parsing JAVADOC [JavadocBlockTagLocation]
[ERROR] C:\Users\Gili\Documents\3rdparty\checkstyle-javadoc-parsing\Testcase.java:7: Javadoc comment at column 38 has parse error. Details: mismatched input '.' expecting <EOF> while parsing JAVADOC [JavadocMissingLeadingAsterisk]
[ERROR] C:\Users\Gili\Documents\3rdparty\checkstyle-javadoc-parsing\Testcase.java:7: Javadoc comment at column 38 has parse error. Details: mismatched input '.' expecting <EOF> while parsing JAVADOC [JavadocMissingWhitespaceAfterAsterisk]
[ERROR] C:\Users\Gili\Documents\3rdparty\checkstyle-javadoc-parsing\Testcase.java:7: Javadoc comment at column 38 has parse error. Details: mismatched input '.' expecting <EOF> while parsing JAVADOC [JavadocTagContinuationIndentation]
[ERROR] C:\Users\Gili\Documents\3rdparty\checkstyle-javadoc-parsing\Testcase.java:7: Javadoc comment at column 38 has parse error. Details: mismatched input '.' expecting <EOF> while parsing JAVADOC [NonEmptyAtclauseDescription]
[ERROR] C:\Users\Gili\Documents\3rdparty\checkstyle-javadoc-parsing\Testcase.java:7: Javadoc comment at column 38 has parse error. Details: mismatched input '.' expecting <EOF> while parsing JAVADOC [RequireEmptyLineBeforeBlockTagGroup]
[ERROR] C:\Users\Gili\Documents\3rdparty\checkstyle-javadoc-parsing\Testcase.java:7: Javadoc comment at column 38 has parse error. Details: mismatched input '.' expecting <EOF> while parsing JAVADOC [SingleLineJavadoc]
Audit done.
Checkstyle ends with 7 errors.

Describe what you expect in detail.

No parsing errors since the javadoc in JDK 23.0.1 tool does not report any parsing problems.

@romani romani changed the title False-positives if @see spans multiple lines Parse errors if @see spans multiple lines Dec 10, 2024
@brunodmartins
Copy link

I want to work on this issue! @romani could you assign it to me?

@romani
Copy link
Member

romani commented Dec 15, 2024

No assignments, just make comment "I am on it" and start working.

@chaitanyanivsarkar
Copy link

i am on it.

@romani
Copy link
Member

romani commented Jan 14, 2025

It is not easy issue, if you are new to project I recommend to do "good xxxxx issue" first.

@mohitsatr
Copy link
Contributor

mohitsatr commented Feb 11, 2025

error goes away on removing . after </a> tag

$ cat config.xml
<?xml version="1.0"?>
<!DOCTYPE module PUBLIC
        "-//Checkstyle//DTD Checkstyle Configuration 1.3//EN"
        "https://checkstyle.org/dtds/configuration_1_3.dtd">
<module name="Checker">
    <module name="TreeWalker">
        <module name="SuppressWarningsHolder"/>
        <module name="InvalidJavadocPosition"/>

        <module name="JavadocBlockTagLocation"/>
        <module name="JavadocMissingLeadingAsterisk"/> 
        <module name="JavadocMissingWhitespaceAfterAsterisk"/>
        <module name="JavadocTagContinuationIndentation">
            <property name="offset" value="1"/>
        </module>
        <module name="NonEmptyAtclauseDescription"/>
        <module name="SingleLineJavadoc">
            <property name="ignoreInlineTags" value="false"/>
        </module>
    </module>
</module>

$ cat Testcase.java
public class Testcase {
    /**
     * The sample.
     *
     * @see <a
     * href="https://docs.oracle.com/en/java/javase/23/docs/specs/security/standard-names.html#keypairgenerator-algorithms">
     * KeyPairGenerator Algorithms</a>    // no dot
     */
    public static final String ELLIPTIC_CURVE_ALGORITHM = "EC";
}

$ java -jar testing/checkstyle-10.21.0-all.jar -c config.xml Testcase.java
Starting audit...
Audit done.

@mohitsatr
Copy link
Contributor

@romani

@romani
Copy link
Member

romani commented Feb 11, 2025

ok, but users have right to put " ." anywhere they want.
So we need to fix this problem.

@mohitsatr
Copy link
Contributor

@romani I've made some changes in grammer . how do i make them reflect in JavadocParser.java. do i need to have antlr4 installed locally ?

@romani
Copy link
Member

romani commented Feb 16, 2025

as you changed a grammar, just run compilation, and it will be updated.
https://github.com/checkstyle/checkstyle/blob/master/docs/GRAMMAR_UPDATES.md and look adjacent documents in this folder.

@mahfouz72
Copy link
Member

I want to share my analysis here to make things clear at some points.

First, the issues still exist even if everything is on a single line, so "multiple lines" have nothing to do with the issue.

D:\CS\test
cat src/Test.java
class OuterClass {
    /**
     * The name of the ECDSA (Elliptic Curve Digital Signature Algorithm).
     *
     * @see <a href="https://docs.oracle.com/en/java"> KeyPairGenerator Algorithms</a>.
     */
    public static final String ELLIPTIC_CURVE_ALGORITHM = "EC";
}
D:\CS\test
java  -jar checkstyle-10.21.1-all.jar -c config.xml src/Test.java
Starting audit...
[ERROR] D:\CS\test\src\Test.java:5: Javadoc comment at column 87 has parse error. Details: mismatched input '.' expecting <EOF> 
...
Audit done.
Checkstyle ends with 7 errors.

Second, No parse error if there is a space after the tag

D:\CS\test
cat src/Test.java
class OuterClass {
    /**
     * The name of the ECDSA (Elliptic Curve Digital Signature Algorithm).
     *
     * @see <a href="https://docs.oracle.com/en/java"> KeyPairGenerator Algorithms</a> .
     */
    public static final String ELLIPTIC_CURVE_ALGORITHM = "EC";
}
D:\CS\test
java  -jar checkstyle-10.21.1-all.jar -c config.xml src/Test.java
Starting audit...
Audit done.

Currently, The parser does not allow a description to appear immediately after an HTML element in @see tags unless there is a space before it. However, JavaDoc permits this.

For example, this should be valid:

@see <a href="https://example.com">Example</a>This is a description.

Whereas for references or strings, there should be a space before the description:

@see java.lang.String This is a description.

We should update the parser to:

  • Allow a description to immediately follow an HTML element (no mandatory space).
  • Require at least one space before a description if the reference is an identifier or string.

Current Rule:

(reference | STRING | htmlElement) (WS | NEWLINE)* ((WS | NEWLINE) description)?

Proposed update:

(reference | STRING) (WS | NEWLINE)+ description?                // for reference and string space requires at least one space
     | htmlElement (WS | NEWLINE)* description?                      // for HTML element, zero or more whitespace

@romani
Copy link
Member

romani commented Feb 19, 2025

Everything that javadoc tool is passing through its execution as normal (no error) and it rendering in html, should be supported by checkstyle.

Our parser should be even a bit more relaxed and resilient than javadoc tool, and we can not demand that checkstyle be executed only after javadoc tool is passing.

Allow a description to immediately follow an HTML element (no mandatory space).

Ok.

Require at least one space before a description if the reference is an identifier or string.

Requirements should be done by Checks , not a parser. See explanation above.

mohitsatr added a commit to mohitsatr/checkstyle that referenced this issue Feb 22, 2025
mohitsatr added a commit to mohitsatr/checkstyle that referenced this issue Feb 22, 2025
mohitsatr added a commit to mohitsatr/checkstyle that referenced this issue Feb 23, 2025
mohitsatr added a commit to mohitsatr/checkstyle that referenced this issue Feb 24, 2025
mohitsatr added a commit to mohitsatr/checkstyle that referenced this issue Feb 24, 2025
mohitsatr added a commit to mohitsatr/checkstyle that referenced this issue Mar 4, 2025
mohitsatr added a commit to mohitsatr/checkstyle that referenced this issue Mar 8, 2025
@nrmancuso
Copy link
Member

nrmancuso commented Mar 8, 2025

Our parser should be even a bit more relaxed and resilient than javadoc tool, and we can not demand that checkstyle be executed only after javadoc tool is passing.

This makes no sense, because if we go with this approach, it is impossible to a) clearly delineate what we do and don’t support, and b) have any reasonable grammar implementation.

There is a major reason why our Java grammar performs well: we only operate on what the specification provides. To accept anything outside of this means greater ambiguity, which by definition tanks performance. We already have to take a hit on performance due to the fact that Javadoc grammar is not context-free, thus compounding the issue.

The faulty logic of your statement is also evident when considering that now this invalid Javadoc has to be dealt with correctly by checks, thus further complicating check implementations.

But - the bottom line is that why should anyone care about checking their Javadocs if they can’t even generate them?

@rnveach
Copy link
Contributor

rnveach commented Mar 8, 2025

My memory is also that we were always strict with Javadocs and it's grammar. This is why we always accepted bad javadocs generate exceptions. There is an open issue to make these exceptions easier to understand, not remove them.

mohitsatr added a commit to mohitsatr/checkstyle that referenced this issue Mar 12, 2025
@romani
Copy link
Member

romani commented Mar 12, 2025

we have two options:

  • become syntax validation tool for javadoc and provide users good error messages, same as any compiler is doing , not easy task, but doable with a lot of efforts.
  • be loose at parsing and let Check to decide what to do on this.

But - the bottom line is that why should anyone care about checking their Javadocs if they can’t even generate them?

timing of such validaation execution.
Checkstyle is at hard limitation to run after javac. But nobody runs javadoc tool in their builds right after javac. IDEs doesnot run javadoc after compilation also, so Checkstyle validation is right after javac in execution.
javadoc is some reporting phase at the very end of build, and some people some time fix warning from javadoc tool.
Syntax of javadoc is purely versioned implementation of javadoc tool. I think we can find issues in our tracker that we do not know how it works as behavior of javadoc tool changed.

@nrmancuso
Copy link
Member

timing of such validaation execution

We have no control over this in any case, java or javadoc validation.

we have two options

This is a fallacy, as there are many ways to handle this, and nothing needs to be absolute. As we find invalid grammar, we can eliminate it and improve our performance. “Loose” at parsing time just equates to bad performance and overcomplicated checks; this is a fact.

@romani
Copy link
Member

romani commented Mar 12, 2025

nothing needs to be absolute

Just suggest, I shared most obvious ways. Sure we can consider trade offs

@github-actions github-actions bot added this to the 10.22.0 milestone Mar 13, 2025
@romani romani added the bug label Mar 13, 2025
@romani
Copy link
Member

romani commented Mar 13, 2025

Fix is merged, discussion on strategy of how to make javadoc parsers can be continued

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants