Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JavadocParagraph: violate preceding P tag before block-level HTML tags #15503

Closed
Zopsss opened this issue Aug 14, 2024 · 13 comments · Fixed by #15710
Closed

JavadocParagraph: violate preceding P tag before block-level HTML tags #15503

Zopsss opened this issue Aug 14, 2024 · 13 comments · Fixed by #15710
Assignees
Labels
approved bug false negative issues where check should place violations on code, but does not google style
Milestone

Comments

@Zopsss
Copy link
Member

Zopsss commented Aug 14, 2024

Follow up of #15458

From https://github.com/checkstyle/checkstyle/pull/14901/files#r1622428372

https://checkstyle.org/checks/javadoc/javadocparagraph.html#JavadocParagraph

Investigated at :
#15011 (comment) (there are screenshots of javadoc with extra P tag)

7.1.2 Paragraphs rule of google java style guide states that there should not be preceding <p> tag for block tags:

HTML tags for other block-level elements, such as <ul> or <table>, are not preceded with <p>.

Whole rule:

7.1.2 Paragraphs
One blank line—that is, a line containing only the aligned leading asterisk (*)—appears between paragraphs, and before the group of block tags if present. Each paragraph except the first has <p> immediately before the first word, with no space after it. HTML tags for other block-level elements, such as <ul> or <table>, are not preceded with <p>.

We currently have no support to check this, there's also a message explaining this gap in our config at coverage table:

We don't have support to check for the extra preceding p tag before block tag; there is known issue to cover it fully: #15011

At #15458 (comment), we decided to upgrade JavadocParagraph's implementation and add support to check if a block-level HTML tag is preceded by <p>, if it is then give violation for it.


How it should work?

This is a list of block-level tags in HTML:

address
article
aside
blockquote
canvas
dd
div
dl
dt
fieldset
figcaption
figure
footer
form
h1
h2
h3
h4
h5
h6
header
hr
li
main
nav
noscript
ol
p
pre
section
table
tfoot
ul
video

source w3 schools: https://www.w3schools.com/html/html_blocks.asp

We should check if any of such block-level HTML tag is not preceded by <p>.

Example:

/**
 * Some summary.
 * 
 * <p><h1>Testing...</h1></p> // should give violation extra P tag
 */
public class InvalidParagraphTagPosition {
  /**
   * Some summary.
   * 
   *<p>  // should give violation - extra P tag
   *  <ul>
   *    <p>
   *      <li>1</li> // should NOT give violation as there is not empty line before
   *    </p>
   *  </ul>
   *</p>
   *
   * <p><b>testing</b></p> // ok, inline HTML tag. Not a block-level tag
   */
  public void foo() {}

  /**
   *  Some summary.
   *
   * <p> // should give violation: extra P tag. One more possible: after P should be no space
   *  <table> 
   *  </table>
   * </p>
   */
  public void fooo() {}
}

Config:

<?xml version="1.0"?>
<!DOCTYPE module PUBLIC
          "-//Checkstyle//DTD Checkstyle Configuration 1.3//EN"
          "https://checkstyle.org/dtds/configuration_1_3.dtd">

<module name="Checker">
  <module name="TreeWalker">
    <module name="JavadocParagraph" />
  </module>
</module>

Output:

$ java -jar .\checkstyle-10.17.0-all.jar -c .\config.xml .\InvalidParagraphTagPosition.java
Starting audit...
Audit done.

Expected output:

$ java -jar .\checkstyle-10.17.0-all.jar -c .\config.xml  .\InvalidParagraphTagPosition.java
Starting audit...
[ERROR] .\InvalidParagraphTagPosition.java:2:7:  Block-level HTML tag "<h1>" should not be preceded with "<p>"
[ERROR] .\InvalidParagraphTagPosition.java:7:7:  Block-level HTML tag "<ul>" should not be preceded with "<p>"
[ERROR] .\InvalidParagraphTagPosition.java:21:7:  Block-level HTML tag "<table>" should not be preceded with "<p>"
Audit done.
Checkstyle ends with 4 errors.

The warning is given for the block-level HTML tag, explaining the tag is preceded by <p> which is not allowed.


Additional context

The coverage table website is generated by GitHub, generate site. We have not released new version so the live website does not contain explaining message, you won't be able to see it there.

@Zopsss
Copy link
Member Author

Zopsss commented Aug 14, 2024

referring to: #15458 (comment)

/**
 * Some Javadoc.
 *
 * <ul>Some sontent.</ul>
 *
 * <b>Some content.</b>
 */
 // violation 2 lines above 'P tag or block-level HTML elements should start paragraph'

Here, we should not give violation for <b> because it is an inline HTML tag. We should give violation only if the HTML tag is a block-level tag and it is preceded by <p>

@romani
Copy link
Member

romani commented Aug 14, 2024

Here, we should not give violation for <b> because it is an inline HTML tag. We should give violation only if the HTML tag is a block-level tag and it is preceded by <p>

it should gives violation as it is a paragraph (there is empty line before it), and paragraph must start with P html tag or any other HTML tags that is block-level element (P tag is one of block-level element).

@romani romani changed the title Upgrade JavadocParagraph Check's implementation to check for preceding <p> tag for block-level HTML tags JavadocParagraph: preceding P tag before block-level HTML tags Aug 14, 2024
@romani
Copy link
Member

romani commented Aug 14, 2024

@rdiachenko , please review issue and place approval label if you agree.

@rdiachenko
Copy link
Contributor

rdiachenko commented Aug 14, 2024

   *<p>
   *  <ul> // should give violation - extra P tag
   *    <p>
   *      <li>1</li> // should NOT give violation as there is not empty line before
   *    </p>
   *  </ul>
   *</p>

<li> is a block element, I think we should give a violation here because of preceding <p>.

Apart from that, the issue looks good to me. We might single out block elements into a user-facing property later on, but I agree to keep the check without such an option for now to keep it simple.

@romani
Copy link
Member

romani commented Aug 14, 2024

li is a block element, I think we should give a violation here because of preceding p

This looks like general html Google practice, lets not do this for this Check. If someone cares so much on this, they will create special Check for this.
It is not what Google needs, this Check was created for Google style. Let's just do what they care is: after empty line there should be P tag or any other html block-level.

@romani romani moved this from For Approval to Todo in Refine Google Style Guide Aug 14, 2024
@romani
Copy link
Member

romani commented Sep 8, 2024

@Zopsss , as you start this issue, please create quick update on Inputs to declare most of cases that come to mind and declaring by violations comments place where violation is expected. Send this PR for review, CI will be red is ok, at least we will align one more time on expected behavior. You meanwhile will start coding and will simply make CI pass, ideally without changes to Inputs.

@romani romani changed the title JavadocParagraph: preceding P tag before block-level HTML tags JavadocParagraph: violate preceding P tag before block-level HTML tags Sep 8, 2024
@Zopsss
Copy link
Member Author

Zopsss commented Sep 17, 2024

@romani

From example given in issue description:

* <p>
* <table> // should give violation: extra P tag. One more possible: after P should be no space

"One more possible: after P should be no space" this means that there should be no space immediately after P tag. We have a property for this allowNewlineParagraph which by defaults allows new line after P tag, if we decide not allow for newlines then we need to disable it.

From: #15503 (comment)

Let's just do what they care is: after empty line there should be P tag or any other html block-level.

so do we need to disable allowNewlineParagraph or not as part of this issue?

@romani
Copy link
Member

romani commented Sep 17, 2024

This property introduced at #1332

Examples of Checks in website are weird and unlikely to be valid. But it is unrelated problem.

We should not change this property, it should continue to allow new line, as some sort of lube wrap, not actually extra space.
As far as I remember, in html single space is rendering as space, all sequential are not rendering. When javadoc content in taken from javadoc comment, all new lines are removed by javadoc tool, so they are not a concern.

@romani
Copy link
Member

romani commented Sep 19, 2024

@Zopsss , please send first PR with update for Inputs to define most use cases and behavior on them.
commit message supplemental: tests for #15503
lines with future violations should looks like /* no violation until #15503*/ (attention to usage of multi-line comment)

We will merge this PR quickly, update for logic most likely will be in review longer.

Zopsss added a commit to Zopsss/checkstyle that referenced this issue Sep 19, 2024
@Zopsss
Copy link
Member Author

Zopsss commented Sep 19, 2024

#15680 PR sent @romani

Zopsss added a commit to Zopsss/checkstyle that referenced this issue Sep 19, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Sep 19, 2024
rdiachenko pushed a commit that referenced this issue Sep 19, 2024
@Zopsss
Copy link
Member Author

Zopsss commented Sep 20, 2024

I was trying to solve this issue locally, I found some bugs in JavadocParagraph which I would like to report before proceeding further.

<?xml version="1.0"?>
<!DOCTYPE module PUBLIC
          "-//Checkstyle//DTD Checkstyle Configuration 1.3//EN"
          "https://checkstyle.org/dtds/configuration_1_3.dtd">

<module name="Checker">
  <module name="TreeWalker">
    <module name="JavadocParagraph"/>
  </module>
</module>
/**
 * Some summary.
 *
 * <p>  Some Javadoc.
 */
public class InvalidParagraphTagPosition {}
$ java -jar .\checkstyle-10.18.1-all.jar -c .\javadocparagraph_config.xml .\InvalidParagraphTagPosition.java
Starting audit...
[ERROR] C:\checkstyle testing\.\InvalidParagraphTagPosition.java:4: <p> tag should be placed immediately before the first word, with no space after. [JavadocParagraph]
Audit done.

We're facing a violation for * <p> Some Javadoc. which is correct but if I add </p> at the end of line then violation goes away

/**
 * Some summary.
 *
 * <p>  Some Javadoc.</p>
 */
public class InvalidParagraphTagPosition {}
$ java -jar .\checkstyle-10.18.1-all.jar -c .\javadocparagraph_config.xml .\InvalidParagraphTagPosition.java
Starting audit...
Audit done.

This is happening because of different parsing of P tag when it is not closed and when it is closed.

AST of above example when P tag isn't closed ( <p> Some Javadoc. ):

$ java -jar .\checkstyle-10.18.1-all.jar -J .\InvalidParagraphTagPosition.java
COMPILATION_UNIT -> COMPILATION_UNIT [6:0]
`--CLASS_DEF -> CLASS_DEF [6:0]
    |--MODIFIERS -> MODIFIERS [6:0]
    |   |--BLOCK_COMMENT_BEGIN -> /* [1:0]
    |   |   |--COMMENT_CONTENT -> *\r\n * Some summary.\r\n *\r\n * <p>  Some Javadoc.\r\n  [1:2]
    |   |   |   `--JAVADOC -> JAVADOC [1:3]
    |   |   |       |--NEWLINE -> \r\n [1:3]
    |   |   |       |--LEADING_ASTERISK ->  * [2:0]
    |   |   |       |--TEXT ->  Some summary. [2:2]
    |   |   |       |--NEWLINE -> \r\n [2:16]
    |   |   |       |--LEADING_ASTERISK ->  * [3:0]
    |   |   |       |--NEWLINE -> \r\n [3:2]
    |   |   |       |--LEADING_ASTERISK ->  * [4:0]
    |   |   |       |--TEXT ->   [4:2]
    |   |   |       |--HTML_ELEMENT -> HTML_ELEMENT [4:3]
    |   |   |       |   `--P_TAG_START -> P_TAG_START [4:3]
    |   |   |       |       |--START -> < [4:3]
    |   |   |       |       |--P_HTML_TAG_NAME -> p [4:4]
    |   |   |       |       `--END -> > [4:5]
    |   |   |       |--TEXT ->   Some Javadoc. [4:6]
    |   |   |       |--NEWLINE -> \r\n [4:21]
    |   |   |       |--TEXT ->   [5:0]
    |   |   |       `--EOF -> <EOF> [5:1]
    |   |   `--BLOCK_COMMENT_END -> */ [5:1]
    |   `--LITERAL_PUBLIC -> public [6:0]
    |--LITERAL_CLASS -> class [6:7]
    |--IDENT -> InvalidParagraphTagPosition [6:13]
    `--OBJBLOCK -> OBJBLOCK [6:41]
        |--LCURLY -> { [6:41]
        `--RCURLY -> } [6:42]

AST when P tag is closed ( <p> Some Javadoc.</p> ):

java -jar .\checkstyle-10.18.1-all.jar -J .\InvalidParagraphTagPosition.java
COMPILATION_UNIT -> COMPILATION_UNIT [6:0]
`--CLASS_DEF -> CLASS_DEF [6:0]
    |--MODIFIERS -> MODIFIERS [6:0]
    |   |--BLOCK_COMMENT_BEGIN -> /* [1:0]
    |   |   |--COMMENT_CONTENT -> *\r\n * Some summary.\r\n *\r\n * <p>  Some Javadoc.</p>\r\n  [1:2]
    |   |   |   `--JAVADOC -> JAVADOC [1:3]
    |   |   |       |--NEWLINE -> \r\n [1:3]
    |   |   |       |--LEADING_ASTERISK ->  * [2:0]
    |   |   |       |--TEXT ->  Some summary. [2:2]
    |   |   |       |--NEWLINE -> \r\n [2:16]
    |   |   |       |--LEADING_ASTERISK ->  * [3:0]
    |   |   |       |--NEWLINE -> \r\n [3:2]
    |   |   |       |--LEADING_ASTERISK ->  * [4:0]
    |   |   |       |--TEXT ->   [4:2]
    |   |   |       |--HTML_ELEMENT -> HTML_ELEMENT [4:3]
    |   |   |       |   `--PARAGRAPH -> PARAGRAPH [4:3]
    |   |   |       |       |--P_TAG_START -> P_TAG_START [4:3]
    |   |   |       |       |   |--START -> < [4:3]
    |   |   |       |       |   |--P_HTML_TAG_NAME -> p [4:4]
    |   |   |       |       |   `--END -> > [4:5]
    |   |   |       |       |--TEXT ->   Some Javadoc. [4:6]
    |   |   |       |       `--P_TAG_END -> P_TAG_END [4:21]
    |   |   |       |           |--START -> < [4:21]
    |   |   |       |           |--SLASH -> / [4:22]
    |   |   |       |           |--P_HTML_TAG_NAME -> p [4:23]
    |   |   |       |           `--END -> > [4:24]
    |   |   |       |--NEWLINE -> \r\n [4:25]
    |   |   |       |--TEXT ->   [5:0]
    |   |   |       `--EOF -> <EOF> [5:1]
    |   |   `--BLOCK_COMMENT_END -> */ [5:1]
    |   `--LITERAL_PUBLIC -> public [6:0]
    |--LITERAL_CLASS -> class [6:7]
    |--IDENT -> InvalidParagraphTagPosition [6:13]
    `--OBJBLOCK -> OBJBLOCK [6:41]
        |--LCURLY -> { [6:41]
        `--RCURLY -> } [6:42]

Notice AST when P isn't closed:

    |   |   |       |--HTML_ELEMENT -> HTML_ELEMENT [4:3]
    |   |   |       |   `--P_TAG_START -> P_TAG_START [4:3]

and when P is closed:

    |   |   |       |--HTML_ELEMENT -> HTML_ELEMENT [4:3]
    |   |   |       |   `--PARAGRAPH -> PARAGRAPH [4:3]
    |   |   |       |       |--P_TAG_START -> P_TAG_START [4:3]

when P is fully closed then generated AST contains --PARAGRAPH token.

What we have done in Check's implementation is:

@Override
public void visitJavadocToken(DetailNode ast) {
if (ast.getType() == JavadocTokenTypes.NEWLINE && isEmptyLine(ast)) {
checkEmptyLine(ast);
}
else if (ast.getType() == JavadocTokenTypes.HTML_ELEMENT
&& JavadocUtil.getFirstChild(ast).getType() == JavadocTokenTypes.P_TAG_START) {
checkParagraphTag(ast);
}
}

we're checking if after HTML_ELEMENT is there an opening P tag i.e P_TAG_START, if yes then move further.

This case is only valid when we don't close the P tag, if we close the P tag then a new token appears between HTML_ELEMENT & P_TAG_START so the check ignores that paragraph and we don't get the expected violation as shown in above example.

This is a critical bug IMO so my question is, how should we proceed with the current issue? should we just continue with this incorrect implementation or we should first fix the implementation and then move forward.

Fixing the implementation will take some time and it will be out of scope of this issue, we will need to open another issue for it. @romani @rdiachenko

EDIT: I haven't checked the other types of test cases yet so this might affect to broader types of violations.

@romani
Copy link
Member

romani commented Sep 20, 2024

this is new bug, please open issue on it.
I think it will be better to fix this new defect first, it should be easy fix, and it will simply further fix.

@romani romani added the false negative issues where check should place violations on code, but does not label Sep 20, 2024
@Zopsss
Copy link
Member Author

Zopsss commented Sep 21, 2024

Issue opened: #15685 @romani

Zopsss added a commit to Zopsss/checkstyle that referenced this issue Sep 26, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Sep 26, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Sep 27, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Sep 27, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Sep 27, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Sep 27, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Sep 27, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Sep 28, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Oct 1, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Oct 2, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Oct 2, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Oct 2, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Oct 2, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Oct 2, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Oct 3, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Oct 3, 2024
Zopsss added a commit to Zopsss/checkstyle that referenced this issue Oct 3, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Refine Google Style Guide Oct 3, 2024
@github-actions github-actions bot added this to the 10.18.3 milestone Oct 3, 2024
@romani romani added the bug label Oct 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved bug false negative issues where check should place violations on code, but does not google style
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants