fix(core): correct xml schema validation handling without needing external access#8272
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes XML schema validation so that documents can be validated without triggering outbound network access, while preserving support for namespace-based validation and schemaLocation hints.
Changes:
- Reworks secure SAX parsing/validation to use a validating
XMLReaderplus an entity resolver that maps known Dependency-Check schema URLs to local bundled XSD resources (nojavax.xml.accessExternalSchemamutation). - Adds a new
AutoCloseableInputSourceabstraction and expands unit tests/resources to cover validating behavior (valid/invalid docs, schemaLocation variants, resolver behavior). - Refactors core parsers/tests to use the new
XmlUtilsAPIs and removes Engine-level system property reset logic.
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| utils/src/test/resources/schema-validation/simpledoc-valid-schemaloc-https.xml | Adds a schemaLocation-based validation fixture using the canonical HTTPS schema URL. |
| utils/src/test/resources/schema-validation/simpledoc-valid-schemaloc-https-legacy.xml | Adds a schemaLocation-based validation fixture using the legacy HTTPS schema URL. |
| utils/src/test/resources/schema-validation/simpledoc-valid-schemaloc-ambiguous-uri.xml | Adds a validation fixture for schemaLocation using an ambiguous/relative schema reference. |
| utils/src/test/resources/schema-validation/simpledoc-valid-no-schemaloc.xml | Adds a fixture validating by namespace without schemaLocation hints. |
| utils/src/test/resources/schema-validation/simpledoc-invalid-schemaloc-https-notfound.xml | Adds a negative fixture for disallowed/unknown HTTPS schemaLocation. |
| utils/src/test/resources/schema-validation/simpledoc-invalid-schemaloc-https-badprefix.xml | Adds a negative fixture for an untrusted HTTPS schemaLocation prefix. |
| utils/src/test/resources/schema-validation/simpledoc-invalid-schemaloc-file.xml | Adds a negative fixture for a file:// schemaLocation under restricted external access. |
| utils/src/test/resources/schema-validation/simpledoc-invalid-schemaloc-ambiguous-uri.xml | Adds a negative fixture for an unresolvable ambiguous schemaLocation. |
| utils/src/test/resources/schema-validation/simpledoc-invalid-no-schemaloc-bad-content.xml | Adds a negative fixture for invalid XML content under a known schema. |
| utils/src/test/resources/schema-validation/simple.xsd | Adds a simple schema used by the XML validation tests. |
| utils/src/test/resources/schema-validation/irrelevant.xsd | Adds a non-matching schema used to verify “no relevant schema” behavior. |
| utils/src/test/java/org/owasp/dependencycheck/utils/XmlUtilsTest.java | Adds validation tests for XmlUtils.buildSecureValidatingXmlReader and the new resolver behavior. |
| utils/src/test/java/org/owasp/dependencycheck/utils/AutoCloseableInputSourceTest.java | Adds tests ensuring AutoCloseableInputSource closes underlying streams and sets systemId. |
| utils/src/main/java/org/owasp/dependencycheck/utils/XmlUtils.java | Introduces secure validating/non-validating XMLReader builders and a resolver that maps known schema URLs to local resources. |
| utils/src/main/java/org/owasp/dependencycheck/utils/FileUtils.java | Adjusts resource stream API annotations (resource stream is non-null / throws). |
| utils/src/main/java/org/owasp/dependencycheck/utils/AutoCloseableInputSource.java | Adds an InputSource wrapper that is AutoCloseable and provides fromResource(...). |
| core/src/test/java/org/owasp/dependencycheck/xml/suppression/SuppressionHandlerTest.java | Updates suppression XML parsing test to use the secure validating XMLReader. |
| core/src/test/java/org/owasp/dependencycheck/xml/hints/HintHandlerTest.java | Updates hint XML parsing test to use the secure validating XMLReader. |
| core/src/test/java/org/owasp/dependencycheck/xml/assembly/GrokHandlerTest.java | Updates Grok assembly XML parsing test to use the secure validating XMLReader. |
| core/src/test/java/org/owasp/dependencycheck/resources/DependencyCheckBaseSuppressionTest.java | Removes the old DOM-based base suppression “base attribute” test (replaced by newer coverage elsewhere). |
| core/src/test/java/org/owasp/dependencycheck/analyzer/HintAnalyzerTest.java | Adds coverage ensuring core packaged hints are loadable via analyzer initialization/prepare. |
| core/src/test/java/org/owasp/dependencycheck/analyzer/AbstractSuppressionAnalyzerTest.java | Adds coverage ensuring base/hosted snapshot suppressions are marked as base and load correctly. |
| core/src/main/java/org/owasp/dependencycheck/xml/suppression/SuppressionRule.java | Adds equals/hashCode support for suppression rule comparisons (used in new tests). |
| core/src/main/java/org/owasp/dependencycheck/xml/suppression/SuppressionParser.java | Switches suppression parsing to local XSD InputSources + secure validating XMLReader and removes the old resolver. |
| core/src/main/java/org/owasp/dependencycheck/xml/pom/PomParser.java | Switches to the new non-validating secure XMLReader helper. |
| core/src/main/java/org/owasp/dependencycheck/xml/hints/HintParser.java | Switches hint parsing to local XSD InputSources + secure validating XMLReader. |
| core/src/main/java/org/owasp/dependencycheck/xml/assembly/GrokParser.java | Switches grok parsing to local XSD InputSource + secure validating XMLReader. |
| core/src/main/java/org/owasp/dependencycheck/reporting/ReportGenerator.java | Uses secure XMLReader for XML pretty printing and adjusts template loading logic. |
| core/src/main/java/org/owasp/dependencycheck/data/cache/DataCacheFactory.java | Removes redundant null check for cache properties resource stream. |
| core/src/main/java/org/owasp/dependencycheck/analyzer/HintAnalyzer.java | Removes redundant null checks and exposes hint arrays for testing via @VisibleForTesting. |
| core/src/main/java/org/owasp/dependencycheck/analyzer/AssemblyAnalyzer.java | Removes redundant null check for embedded GrokAssembly zip stream. |
| core/src/main/java/org/owasp/dependencycheck/analyzer/AbstractSuppressionAnalyzer.java | Tightens hosted suppression URL filename handling and simplifies suppression file loading logic. |
| core/src/main/java/org/owasp/dependencycheck/Engine.java | Removes storing/restoring javax.xml.accessExternalSchema system property (no longer mutated). |
Comments suppressed due to low confidence (1)
core/src/main/java/org/owasp/dependencycheck/analyzer/AssemblyAnalyzer.java:413
- If
GrokAssembly.zipis missing from the classpath,FileUtils.getResourceAsStreamwill throwFileNotFoundExceptionwhich is currently caught by the genericIOExceptionhandler and rethrown as "Unable to create temp directory for GrokAssembly". CatchFileNotFoundExceptionseparately (before creating the temp dir) and throw anInitializationExceptionthat accurately reports the missing embedded resource.
try (InputStream in = FileUtils.getResourceAsStream("GrokAssembly.zip")) {
location = FileUtils.createTempDirectory(getSettings().getTempDirectory());
ExtractionUtil.extractFiles(in, location);
} catch (ExtractionException ex) {
throw new InitializationException("Unable to extract GrokAssembly.dll", ex);
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
eb901c8 to
71fe68f
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 33 out of 33 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
71fe68f to
510038e
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 33 out of 33 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
For some time now, these streams have always been non-null Signed-off-by: Chad Wilson <[email protected]>
…g consistently Uses the actual analyzers to load them and ensure they can be parsed correctly Signed-off-by: Chad Wilson <[email protected]>
…hints, suppression rules and grok documents when used with schema locations Signed-off-by: Chad Wilson <[email protected]>
510038e to
a834dbf
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 33 out of 33 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
core/src/main/java/org/owasp/dependencycheck/analyzer/AssemblyAnalyzer.java:416
extractGrokAssembly()will wrap a missingGrokAssembly.zipresource (viaFileUtils.getResourceAsStream) as “Unable to create temp directory for GrokAssembly”, becauseFileNotFoundExceptionis caught by the genericIOExceptionhandler. CatchFileNotFoundExceptionseparately (or check resource existence up-front) and throw anInitializationExceptionthat accurately reports the missing GrokAssembly resource.
try (InputStream in = FileUtils.getResourceAsStream("GrokAssembly.zip")) {
location = FileUtils.createTempDirectory(getSettings().getTempDirectory());
ExtractionUtil.extractFiles(in, location);
} catch (ExtractionException ex) {
throw new InitializationException("Unable to extract GrokAssembly.dll", ex);
} catch (IOException ex) {
throw new InitializationException("Unable to create temp directory for GrokAssembly", ex);
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Oh wow that was quick @jeremylong thanks! i was expecting more questions given how I managed to screw this up in the first place 😅 This exercise was a good reminder about how much I have always hated XML in Java, and that only WSDL was worse 😅 |
|
The APIs really are terrible by modern standards, and Xerces/JAXP seems dead/dormant :( |
Description of Change
The changes in #8254 unfortunately exposed a few issues which this PR corrects which I alluded to at #8254 (comment)
schemaLocationhints I added actually currently cause the default SAX parsers to try and load from the external source, viahttps://:-(javax.xml.accessExternalSchematofile, httpswhich allowed thisschemaLocationattribute would do so without external permissions; because thejavax.xml.accessExternalSchemasetting was being set after the SaxParserFactory was created. In practice, this means on master the hints schema would fail loading with the new schema location, since it is the first Analyzer prepared. (Caused by: org.xml.sax.SAXException: Line=5, Column=104: schema_reference: Failed to read schema document 'dependency-hint.1.4.xsd', because 'https' access is not allowed due to restriction set by the accessExternalSchema property.)https://dependency-check.github.ioURLs - and also was only applied to suppression xsds.Testing challenges
This change
javax.xml.accessExternalSchemasystem property mutation, by replacing with a mechanism to load https and file references from local resources by "convention".schemaLocation(alongside an EntityResolver made available for suppression documents)EntityResolverwill inspect any attempts to resolve an entity for our own domain prefixes, and match against the the same file name as a passed resource.Related issues
Have test cases been added to cover the new functionality?
yes