-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[core] NodeStream API #1622
Merged
Merged
[core] NodeStream API #1622
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Generated by 🚫 Danger |
use iterator as main implementation
14 tasks
5 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
an:enhancement
An improvement on existing features / rules
in:ast
About the AST structure or API, the parsing step
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Context
With the removal of Jaxen for 7.0.0, the method
Node.findChildrenWithXPath
will be broken (it throws a checked JaxenException). It's usually only used when the path at which the node looked for is too deep to be roamed in a type-safe, null-safe manner without making the code size explode, e.g. in ForLoopCanBeForeach:pmd/pmd-java/src/main/java/net/sourceforge/pmd/lang/java/rule/bestpractices/ForLoopCanBeForeachRule.java
Lines 171 to 178 in 8a208b7
Doing the same check in Java without XPath would probably involve around 30 lines of clumsy null-check and type checks/casts. Changing the grammar to flatten the AST will simplify this partially, but it doesn't remove the problem.
An alternative that Java 8 enables is node stream processing. Basically we can design an API where we can stream the descendants of a node, allowing to filter, flatmap to children of a certain type, flatmap to descendants, etc, in a completely type safe and null safe manner.
Saxon itself has an API to write XPath-like queries in Java, leveraging the standard Stream API. From the Saxon HE doc:
That API is nice, but it's not typesafe, uses strings everywhere, and using it directly would force us to wrap the nodes in a Saxon adapter.
This PR
This PR implements a feature-complete, self-contained API to stream nodes. NodeStreams
See the javadoc on the NodeStream interface for more details.
The PR also refactors ForLoopCanBeForeachRule to use that new API. You can see how it blends well with Optional and regular Streams. What's more interesting is that I observed that the rule runs at least 30% faster!
I would argue that with that new API in 7.0.0, we could deprecate the methods getFirstDescendantOfType, getFirstChildOfType, findFirstParentOfType, etc. until 8.0.0. Their functionality is superseded by the new API, and they're not lazy.
What's missing yet:
descendantStream()
. I would think the current implementation is the fastest, but it could also be implemented with a lazy tree visit.descendantStream()
on find boundaries. This can be added later, because it needs theStream#takeWhile
function, which was introduced in JDK 9, so we'll have to implement it ourselves.