Checkstyle fails with "Path contains invalid character" if path to config file contains non-ascii characters #13012

@reftel

If the path to the config file contains non-ASCII characters, Checkstyle fails:

% ls
checkstyle-10.9.3-all.jar	checkstyle.xml
% cat checkstyle.xml 
<?xml version="1.0"?>
<!DOCTYPE module PUBLIC
  "-//Puppy Crawl//DTD Check Configuration 1.3//EN"
  "configuration_1_3.dtd">
<module name="Checker">
</module>
% mkdir a
% cp checkstyle.xml a
% java -jar checkstyle-10.9.3-all.jar -c a/checkstyle.xml .
Starting audit...
Audit done.
% mkdir æ
% cp checkstyle.xml æ
% java -jar checkstyle-10.9.3-all.jar -c æ/checkstyle.xml .
com.puppycrawl.tools.checkstyle.api.CheckstyleException: unable to parse configuration stream
	at com.puppycrawl.tools.checkstyle.ConfigurationLoader.loadConfiguration(ConfigurationLoader.java:324)
	at com.puppycrawl.tools.checkstyle.ConfigurationLoader.loadConfiguration(ConfigurationLoader.java:266)
	at com.puppycrawl.tools.checkstyle.Main.runCheckstyle(Main.java:380)
	at com.puppycrawl.tools.checkstyle.Main.runCli(Main.java:338)
	at com.puppycrawl.tools.checkstyle.Main.execute(Main.java:195)
	at com.puppycrawl.tools.checkstyle.Main.main(Main.java:130)
Caused by: com.sun.org.apache.xerces.internal.util.URI$MalformedURIException: Path contains invalid character: æ
	at java.xml/com.sun.org.apache.xerces.internal.util.URI.initializePath(URI.java:1110)
	at java.xml/com.sun.org.apache.xerces.internal.util.URI.initialize(URI.java:583)
	at java.xml/com.sun.org.apache.xerces.internal.util.URI.<init>(URI.java:336)
	at java.xml/com.sun.org.apache.xerces.internal.util.URI.<init>(URI.java:299)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.expandSystemIdStrictOff1(XMLEntityManager.java:2393)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.expandSystemId(XMLEntityManager.java:2239)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLEntityManager.resolveEntityAsPerStax(XMLEntityManager.java:996)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1142)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1040)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:943)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605)
	at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:534)
	at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:888)
	at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:824)
	at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
	at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1216)
	at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:635)
	at com.puppycrawl.tools.checkstyle.XmlLoader.parseInputSource(XmlLoader.java:81)
	at com.puppycrawl.tools.checkstyle.ConfigurationLoader.parseInputSource(ConfigurationLoader.java:196)
	at com.puppycrawl.tools.checkstyle.ConfigurationLoader.loadConfiguration(ConfigurationLoader.java:315)
	... 5 more
Checkstyle ends with 1 errors.

I would expect Checkstyle to behave the same regardless of which characters appear in the path.

See also https://issues.apache.org/jira/browse/MCHECKSTYLE-425, where Michael Osipov suggests it's due to the use of URI#toString where URI#toASCIIString should have been used.
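
To illustrate the difference between the two methods, here is a minimal sketch that is independent of Checkstyle's code. The class name UriAsciiDemo, the helper fileUri and the path /tmp/æ/checkstyle.xml are made up for the example. It assumes the URI is built the way java.io.File#toURI builds it, with the multi-argument java.net.URI constructor, which leaves non-ASCII characters unescaped.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriAsciiDemo {

    // Hypothetical helper: builds a file: URI from an absolute path using the
    // multi-argument constructor, which does not escape non-ASCII characters.
    static URI fileUri(String absolutePath) {
        try {
            return new URI("file", null, absolutePath, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // "\u00e6" is the letter æ, written as an escape so the source file
        // compiles the same way regardless of its encoding.
        URI uri = fileUri("/tmp/\u00e6/checkstyle.xml");

        // toString() keeps the raw non-ASCII character in the path.
        System.out.println(uri.toString());

        // toASCIIString() percent-encodes it as UTF-8 octets:
        // file:/tmp/%C3%A6/checkstyle.xml
        System.out.println(uri.toASCIIString());
    }
}
```

A parser that only accepts US-ASCII in a system identifier, as the Xerces URI class in the stack trace above appears to, would reject the first form and accept the second.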


From plugin issue:

While I can reproduce it, it is not a bug in the Maven plugin. The issue is with Checkstyle itself. Unfortunately, Sun, back then, decided to print garbage output with URI#toString() instead of doing the right thing. #toString() produces invalid URIs. One must ALWAYS use #toASCIIString(). Never rely on #toString(). I checked the Checkstyle code and it unfortunately does so in many places. File an issue there and have it fixed.

Affected files:

Name                        Line  Text                  Path
AbstractAutomaticBean.java  24    import java.net.URI;  D:\Entwicklung\Projekte\checkstyle\src\main\java\com\puppycrawl\tools\checkstyle
AbstractHeaderCheck.java    28    import java.net.URI;  D:\Entwicklung\Projekte\checkstyle\src\main\java\com\puppycrawl\tools\checkstyle\checks\header
ImportControlCheck.java     22    import java.net.URI;  D:\Entwicklung\Projekte\checkstyle\src\main\java\com\puppycrawl\tools\checkstyle\checks\imports
ImportControlLoader.java    25    import java.net.URI;  D:\Entwicklung\Projekte\checkstyle\src\main\java\com\puppycrawl\tools\checkstyle\checks\imports
ConfigurationLoader.java    23    import java.net.URI;  D:\Entwicklung\Projekte\checkstyle\src\main\java\com\puppycrawl\tools\checkstyle
SuppressionsLoader.java     24    import java.net.URI;  D:\Entwicklung\Projekte\checkstyle\src\main\java\com\puppycrawl\tools\checkstyle\filters
PropertyCacheFile.java      29    import java.net.URI;  D:\Entwicklung\Projekte\checkstyle\src\main\java\com\puppycrawl\tools\checkstyle
CommonUtil.java             28    import java.net.URI;  D:\Entwicklung\Projekte\checkstyle\src\main\java\com\puppycrawl\tools\checkstyle\utils

See: apache/maven@8e0efaa
