Affects PMD Version: 7.23.0
Description:
We shouldn't just ignore errors: if there are parsing errors, we should report them.
It's better to tell the user that there was an error and that the static analysis results are garbage than to pretend everything is OK.
Output:
Example of the current behaviour with a syntax issue. The errors are printed to stderr only, without the file name:
line 5:13 extraneous input '{' expecting {<EOF>, '@', AT_PRE_WS, 'class', 'interface', 'fun', 'object', 'val', 'var', 'typealias', 'public', 'private', 'protected', 'internal', 'enum', 'sealed', 'annotation', 'data', 'inner', 'value', 'tailrec', 'operator', 'inline', 'infix', 'external', 'suspend', 'override', 'abstract', 'final', 'open', 'const', 'lateinit', 'vararg', 'noinline', 'crossinline', 'expect', 'actual'}
line 27:16 token recognition error at: '$'
line 27:17 token recognition error at: '$'
line 9:41 extraneous input '{' expecting {<EOF>, '@', AT_PRE_WS, 'class', 'interface', 'fun', 'object', 'val', 'var', 'typealias', 'public', 'private', 'protected', 'internal', 'enum', 'sealed', 'annotation', 'data', 'inner', 'value', 'tailrec', 'operator', 'inline', 'infix', 'external', 'suspend', 'override', 'abstract', 'final', 'open', 'const', 'lateinit', 'vararg', 'noinline', 'crossinline', 'expect', 'actual'}
line 29:1 extraneous input '<EOF>' expecting {TRIPLE_QUOTE_CLOSE, MultiLineStringQuote, MultiLineStrRef, MultiLineStrText, MultiLineStrExprStart}
src/main/kotlin/nl/stokpop/kotlin/Foo.kt:32: FunctionNameTooShort: Function names should have non-cryptic and clear names.
src/main/kotlin/nl/stokpop/kotlin/Foo.kt:33: FunctionNameTooShort: Function names should have non-cryptic and clear names.
Example after logging a WARN and throwing a ParseException from the parser instead. The offending file is now visible:
[WARN] Syntax error at src/main/kotlin/nl/stokpop/kotlin/Foo2.kt:5:13: extraneous input '{' expecting {<EOF>, '@', AT_PRE_WS, 'class', 'interface', 'fun', 'object', 'val', 'var', 'typealias', 'public', 'private', 'protected', 'internal', 'enum', 'sealed', 'annotation', 'data', 'inner', 'value', 'tailrec', 'operator', 'inline', 'infix', 'external', 'suspend', 'override', 'abstract', 'final', 'open', 'const', 'lateinit', 'vararg', 'noinline', 'crossinline', 'expect', 'actual'}
line 27:16 token recognition error at: '$'
line 27:17 token recognition error at: '$'
src/main/kotlin/nl/stokpop/kotlin/Foo.kt:32: FunctionNameTooShort: Function names should have non-cryptic and clear names.
src/main/kotlin/nl/stokpop/kotlin/Foo.kt:33: FunctionNameTooShort: Function names should have non-cryptic and clear names.
src/main/kotlin/nl/stokpop/kotlin/Foo2.kt - ParseException: Parse exception in file 'src/main/kotlin/nl/stokpop/kotlin/Foo2.kt' at line 5, column 13: extraneous input '{' expecting {<EOF>, '@', AT_PRE_WS, 'class', 'interface', 'fun', 'object', 'val', 'var', 'typealias', 'public', 'private', 'protected', 'internal', 'enum', 'sealed', 'annotation', 'data', 'inner', 'value', 'tailrec', 'operator', 'inline', 'infix', 'external', 'suspend', 'override', 'abstract', 'final', 'open', 'const', 'lateinit', 'vararg', 'noinline', 'crossinline', 'expect', 'actual'}
[INFO] An error occurred while executing PMD.
The lexer also reports to stderr, so the lexer was changed to log a WARN and throw a ParseException as well:
[WARN] Syntax error at src/main/kotlin/nl/stokpop/kotlin/Foo2.kt:5:13: extraneous input '{' expecting {<EOF>, '@', AT_PRE_WS, 'class', 'interface', 'fun', 'object', 'val', 'var', 'typealias', 'public', 'private', 'protected', 'internal', 'enum', 'sealed', 'annotation', 'data', 'inner', 'value', 'tailrec', 'operator', 'inline', 'infix', 'external', 'suspend', 'override', 'abstract', 'final', 'open', 'const', 'lateinit', 'vararg', 'noinline', 'crossinline', 'expect', 'actual'}
[WARN] Syntax error at src/main/kotlin/nl/stokpop/kotlin/Foo.kt:27:16: token recognition error at: '$'
src/main/kotlin/nl/stokpop/kotlin/Foo.kt - ParseException: Parse exception in file 'src/main/kotlin/nl/stokpop/kotlin/Foo.kt' at line 27, column 16: token recognition error at: '$'
src/main/kotlin/nl/stokpop/kotlin/Foo2.kt - ParseException: Parse exception in file 'src/main/kotlin/nl/stokpop/kotlin/Foo2.kt' at line 5, column 13: extraneous input '{' expecting {<EOF>, '@', AT_PRE_WS, 'class', 'interface', 'fun', 'object', 'val', 'var', 'typealias', 'public', 'private', 'protected', 'internal', 'enum', 'sealed', 'annotation', 'data', 'inner', 'value', 'tailrec', 'operator', 'inline', 'infix', 'external', 'suspend', 'override', 'abstract', 'final', 'open', 'const', 'lateinit', 'vararg', 'noinline', 'crossinline', 'expect', 'actual'}
[INFO] 2 errors occurred while executing PMD.
However, having the lexer throw a ParseException as well stops further parsing, so the lexer was changed to only log warnings.
Unlike the previous run, PMD now still finds the FunctionNameTooShort violations in Foo.kt:
[WARN] Syntax error at src/main/kotlin/nl/stokpop/kotlin/Foo2.kt:5:13: extraneous input '{' expecting {<EOF>, '@', AT_PRE_WS, 'class', 'interface', 'fun', 'object', 'val', 'var', 'typealias', 'public', 'private', 'protected', 'internal', 'enum', 'sealed', 'annotation', 'data', 'inner', 'value', 'tailrec', 'operator', 'inline', 'infix', 'external', 'suspend', 'override', 'abstract', 'final', 'open', 'const', 'lateinit', 'vararg', 'noinline', 'crossinline', 'expect', 'actual'}
[WARN] Syntax error at src/main/kotlin/nl/stokpop/kotlin/Foo.kt:27:16: token recognition error at: '$'
[WARN] Syntax error at src/main/kotlin/nl/stokpop/kotlin/Foo.kt:27:17: token recognition error at: '$'
src/main/kotlin/nl/stokpop/kotlin/Foo.kt:32: FunctionNameTooShort: Function names should have non-cryptic and clear names.
src/main/kotlin/nl/stokpop/kotlin/Foo.kt:33: FunctionNameTooShort: Function names should have non-cryptic and clear names.
src/main/kotlin/nl/stokpop/kotlin/Foo2.kt - ParseException: Parse exception in file 'src/main/kotlin/nl/stokpop/kotlin/Foo2.kt' at line 5, column 13: extraneous input '{' expecting {<EOF>, '@', AT_PRE_WS, 'class', 'interface', 'fun', 'object', 'val', 'var', 'typealias', 'public', 'private', 'protected', 'internal', 'enum', 'sealed', 'annotation', 'data', 'inner', 'value', 'tailrec', 'operator', 'inline', 'infix', 'external', 'suspend', 'override', 'abstract', 'final', 'open', 'const', 'lateinit', 'vararg', 'noinline', 'crossinline', 'expect', 'actual'}
Note: parse errors might explain why the stop token is sometimes null, so that a null check is needed in the toString()/getReportLocation() methods of ...antlr4.BaseAntlrInnerNode/...antlr4.BaseAntlrNode?
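To make the null-check idea concrete, here is a minimal, stdlib-only sketch. It does not use PMD's or ANTLR's real classes: `Pos` and `NodeLocationSketch` are hypothetical stand-ins, and the assumption is simply that a node whose stop token is missing can fall back to its start position when describing its location.

```java
/** Hypothetical stand-in for a token position; not an ANTLR or PMD type. */
final class Pos {
    final int line;
    final int column;

    Pos(int line, int column) {
        this.line = line;
        this.column = column;
    }
}

/** Hypothetical helper showing a null-safe location description. */
final class NodeLocationSketch {
    private NodeLocationSketch() {
    }

    /**
     * Describes a node range as "startLine:startCol-endLine:endCol".
     * Assumption: after a parse error the stop position may be null,
     * in which case the start position is used as the end as well.
     */
    static String describe(Pos start, Pos stop) {
        if (start == null) {
            return "unknown location";
        }
        Pos end = (stop != null) ? stop : start;
        return start.line + ":" + start.column + "-" + end.line + ":" + end.column;
    }
}
```

With this fallback, `NodeLocationSketch.describe(new Pos(5, 13), null)` yields a usable location string instead of a NullPointerException. Whether the real node classes should behave this way is a separate question for the maintainers.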
Code Sample demonstrating the issue:
See net.sourceforge.pmd.lang.kotlin.ast.PmdKotlinParser
    protected KtKotlinFile parse(final Lexer lexer, ParserTask task) {
        KotlinParser parser = new KotlinParser(new CommonTokenStream(lexer));
        return parser.kotlinFile().makeAstInfo(task);
    }
Swift uses:
    protected SwTopLevel parse(final Lexer lexer, ParserTask task) {
        SwiftParser parser = new SwiftParser(new CommonTokenStream(lexer));
        parser.removeErrorListeners();
        parser.addErrorListener(new BaseErrorListener() {
            @Override
            public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
                LOGGER.warn("Syntax error at {}:{}:{}: {}", task.getFileId().getOriginalPath(),
                    line, charPositionInLine, msg);
                // TODO: eventually we should throw a parse exception
                // throw new ParseException(msg).withLocation(FileLocation.caret(task.getFileId(), line, charPositionInLine));
            }
        });
        return parser.topLevel().makeAstInfo(task);
    }
So a ParseException can also be thrown, as shown in the comment, but that changes the behaviour of Kotlin/Swift parsing, so users should be made aware of it.
Example fix in PmdKotlinParser. Note that the error listeners are replaced on both the lexer and the parser to silence stderr:
    protected KtKotlinFile parse(final Lexer lexer, ParserTask task) {
        BaseErrorListener lexerErrorListener = new BaseErrorListener() {
            @Override
            public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
                // Warning only: the lexer can sometimes recover by skipping
                // unknown characters. Throwing here would prevent PMD from producing any
                // results for the file.
                LOGGER.warn("Syntax error at {}:{}:{}: {}", task.getFileId().getOriginalPath(), line, charPositionInLine, msg);
            }
        };
        BaseErrorListener parserErrorListener = new BaseErrorListener() {
            @Override
            public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e) {
                LOGGER.warn("Syntax error at {}:{}:{}: {}", task.getFileId().getOriginalPath(), line, charPositionInLine, msg);
                throw new ParseException(msg)
                    .withLocation(FileLocation.caret(task.getFileId(), line, charPositionInLine));
            }
        };
        // The lexer may emit "token recognition error at ..." messages via ANTLR's default
        // ConsoleErrorListener (stderr). Replace it so the error is logged as a PMD
        // warning that includes the file path.
        lexer.removeErrorListeners();
        lexer.addErrorListener(lexerErrorListener);
        KotlinParser parser = new KotlinParser(new CommonTokenStream(lexer));
        parser.removeErrorListeners();
        parser.addErrorListener(parserErrorListener);
        return parser.kotlinFile().makeAstInfo(task);
    }
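The difference between the two listener policies can also be shown in isolation. The following is a simplified, stdlib-only model rather than PMD or ANTLR code: `SyntaxErrorHandler`, `SketchParseException` and `ListenerPolicySketch` are hypothetical stand-ins for the ANTLR listener callback, PMD's ParseException and the two listeners. The message formats are copied from the output shown earlier.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the ANTLR error listener callback. */
interface SyntaxErrorHandler {
    void syntaxError(String file, int line, int column, String msg);
}

/** Hypothetical stand-in for PMD's ParseException. */
class SketchParseException extends RuntimeException {
    SketchParseException(String message) {
        super(message);
    }
}

class ListenerPolicySketch {
    /** Collected warnings, standing in for LOGGER.warn output. */
    final List<String> warnings = new ArrayList<>();

    /** Lexer policy: record a warning and return, so processing continues. */
    SyntaxErrorHandler warnOnly() {
        return (file, line, column, msg) ->
            warnings.add("Syntax error at " + file + ":" + line + ":" + column + ": " + msg);
    }

    /** Parser policy: record the same warning, then abort with an exception. */
    SyntaxErrorHandler warnAndThrow() {
        SyntaxErrorHandler warn = warnOnly();
        return (file, line, column, msg) -> {
            warn.syntaxError(file, line, column, msg);
            throw new SketchParseException("Parse exception in file '" + file
                + "' at line " + line + ", column " + column + ": " + msg);
        };
    }
}
```

Calling the `warnOnly()` handler for the `'$'` token error leaves the caller free to continue, which mirrors why Foo.kt still produces violations. Calling the `warnAndThrow()` handler logs the warning and then aborts, which mirrors Foo2.kt being reported as a ParseException.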
Note that for the Kotlin test cases the setup has already been changed in KotlinParsingHelper.parseImpl(): a ParseException is thrown when issues arise in tests. This can be removed once the above is in place.
Steps to reproduce:
- Run a PMD check on Kotlin files that contain syntax errors, or that use newer syntax than the current PMD Kotlin parser supports.
Running PMD through: [CLI | Maven | Gradle ]